Redo and Undo Logs in MySQL Transactions

Transactions have four well-known properties: atomicity, consistency, isolation and durability (ACID). The operations in a transaction are either all executed or not executed at all, which is the whole point of a transaction. Isolation is achieved through the locking mechanism, while atomicity, consistency and durability are guaranteed by the transaction's redo log and undo log. This article discusses several questions about redo and undo in transactions:

What are redo logs and undo logs?

How does redo ensure transaction durability?

Is undo log an inverse process of redo log?

redo log

Types of Redo

The redo log is used to ensure the durability of transactions, that is, the D in ACID. It can be divided into the following two types:

Physical Redo Log

Logical Redo Log

In the InnoDB storage engine, redo is in most cases a physical log that records the physical changes made to data pages. A logical redo log does not record the actual modification of a page; instead it records a kind of operation on pages, for example the creation of a new data page. For logical redo logs, it is enough to remember that redo is a physical log in most cases and that every modification of a page by DML must be recorded in redo.

Role of Redo

The main function of the redo log is database crash recovery.

Composition of Redo

The redo log can be divided into two parts:

First, the redo log buffer, which lives in memory and is volatile.

Second, the redo log files, which are persistent and stored on disk.

When do I write Redo?

The figure above gives a simplified view of how redo is written. In more detail, the timing is as follows:

After a data page has been modified, the redo log is written before the dirty page is flushed to disk. Note the order: modify the data first, then write the log.

Redo logs are written to disk before the corresponding data pages.

Modifications to clustered indexes, secondary indexes and undo pages all need to be recorded in the redo log.

Redo’s overall process

Taking an update transaction as an example, the overall flow of the redo log is shown in the following figure:

Step 1: Read the original data from disk into memory and modify the in-memory copy of the data.

Step 2: Generate a redo log record, which contains the modified values of the data, and write it to the redo log buffer.

Step 3: When the transaction commits, flush the contents of the redo log buffer to the redo log file, which is written in append mode.

Step 4: Periodically flush the modified data in memory to disk.
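The four steps can be sketched as a small in-memory model. This is only an illustration: the class `MiniInnoDB` and its method names are hypothetical and do not correspond to InnoDB's real data structures, and a "page" is reduced to a single value.

```python
class MiniInnoDB:
    """Toy model of the four steps; not InnoDB's real implementation."""

    def __init__(self, disk_pages):
        self.disk_pages = dict(disk_pages)  # data files on disk
        self.buffer_pool = {}               # in-memory copies of pages
        self.dirty = set()                  # pages modified in memory, not yet flushed
        self.redo_buffer = []               # volatile redo log buffer
        self.redo_file = []                 # persistent, append-only redo log file

    def update(self, page_id, new_value):
        # Step 1: read the page into memory if needed, then modify the copy.
        if page_id not in self.buffer_pool:
            self.buffer_pool[page_id] = self.disk_pages[page_id]
        self.buffer_pool[page_id] = new_value
        self.dirty.add(page_id)
        # Step 2: record the modified value in the redo log buffer.
        self.redo_buffer.append((page_id, new_value))

    def commit(self):
        # Step 3: append the buffered redo records to the redo log file.
        self.redo_file.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def flush_dirty_pages(self):
        # Step 4: write modified pages back to disk (done periodically).
        for page_id in self.dirty:
            self.disk_pages[page_id] = self.buffer_pool[page_id]
        self.dirty.clear()

    def crash_and_recover(self):
        # A crash loses everything in memory; redo file and data files survive.
        self.buffer_pool.clear()
        self.dirty.clear()
        self.redo_buffer.clear()
        # Recovery replays the redo file against the on-disk pages.
        for page_id, value in self.redo_file:
            self.disk_pages[page_id] = value


db = MiniInnoDB({"page1": "old"})
db.update("page1", "new")
db.commit()
db.crash_and_recover()  # the crash happens before step 4 ever ran
print(db.disk_pages["page1"])  # prints "new"
```

The committed change survives the crash even though the dirty page was never flushed, because the redo record reached the log file at commit. A change that was never committed leaves no record in the redo file and is not restored.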

How does redo ensure transaction durability?

InnoDB is a transactional storage engine. It achieves transaction durability through the Force Log at Commit mechanism: when a transaction commits, the redo log buffer is written to the redo log file and persisted, and the commit is considered complete only after this write has finished. This goes hand in hand with Write-Ahead Logging (WAL): before a data page is persisted, the corresponding log pages in memory must be persisted first.

To ensure that every log record reaches the redo log file, by default the InnoDB storage engine calls fsync each time the redo log buffer is written to the redo log file. Because the redo log file is not opened with the O_DIRECT option, redo is first written to the file system cache, so an fsync is required to make sure it actually reaches the disk. fsync is a system call whose efficiency depends on disk performance, so disk performance also determines the performance of transaction commits and therefore of the database.
(O_DIRECT is an option in Linux. With it, file I/O bypasses the file system cache and goes directly to disk.)

The Force Log at Commit mechanism described above is controlled by the InnoDB parameter `innodb_flush_log_at_trx_commit`, which sets the strategy for flushing the redo log to disk. The parameter also lets users choose settings that are not fully durable:

When the parameter is set to 1 (the default), an `fsync` must be performed each time a transaction commits. This is the safest configuration and guarantees durability.

When the parameter is set to 2, only a `write` is performed when the transaction commits. This guarantees only that the redo log buffer has been written to the operating system's page cache; no fsync is performed at commit. If only the MySQL process goes down, no transactions are lost, but if the operating system goes down, transactions may be lost.

When the parameter is set to 0, the redo log is not written when the transaction commits. Writing is left to the master thread, which performs an fsync of the redo log once per second, so an instance crash loses at most about one second of transactions. (The master thread is responsible for asynchronously flushing data from the buffer pool to disk to ensure data consistency.)
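The difference between the three settings can be illustrated with a toy model of where a committed transaction's redo sits at the moment of a crash. The class `RedoLog`, the helper `committed_survives` and the crash labels "mysql" and "os" are hypothetical names made up for this sketch; only the parameter values 0, 1 and 2 come from InnoDB.

```python
class RedoLog:
    """Toy model of the three innodb_flush_log_at_trx_commit settings."""

    def __init__(self, policy):
        if policy not in (0, 1, 2):
            raise ValueError("policy must be 0, 1 or 2")
        self.policy = policy
        self.log_buffer = []  # inside the MySQL process; lost if MySQL crashes
        self.page_cache = []  # inside the OS; lost if the OS crashes
        self.disk = []        # survives both kinds of crash

    def _write(self):
        # Models write(): move data from the process into the OS page cache.
        self.page_cache.extend(self.log_buffer)
        self.log_buffer.clear()

    def _fsync(self):
        # Models fsync(): force the page cache contents onto the disk.
        self.disk.extend(self.page_cache)
        self.page_cache.clear()

    def commit(self, txn):
        self.log_buffer.append(txn)
        if self.policy == 1:
            self._write()
            self._fsync()
        elif self.policy == 2:
            self._write()
        # policy 0: nothing happens at commit time

    def background_tick(self):
        # Models the once-per-second background write and fsync.
        self._write()
        self._fsync()

    def survivors(self, crash):
        # Which committed transactions are still recoverable after the crash?
        if crash == "mysql":
            return self.disk + self.page_cache
        if crash == "os":
            return list(self.disk)
        raise ValueError(crash)


def committed_survives(policy, crash):
    """Commit one transaction, crash immediately, report whether it survived."""
    log = RedoLog(policy)
    log.commit("t1")
    return "t1" in log.survivors(crash)


for policy in (1, 2, 0):
    print(policy, committed_survives(policy, "mysql"), committed_survives(policy, "os"))
```

In this model, setting 1 survives both kinds of crash, setting 2 survives a MySQL crash but not an operating system crash, and setting 0 survives neither until the background flush has run.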

`fsync` and `write` are system calls used in many persistence scenarios, for example AOF persistence in Redis. `fsync` submits the data to the disk, forces synchronization, and returns only after the data has been written to the disk, which makes it a performance bottleneck. `write` returns as soon as the data has been written to the operating system's page cache; the operating system's scheduling mechanism later flushes the cached data to disk. The data therefore moves in the order user buffer > page cache > disk.
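For readers who want to see the two calls side by side, here is a minimal sketch using Python's `os.write` and `os.fsync`, which wrap the system calls of the same names. The function `append_log`, the file name `redo.log` and the record contents are made up for illustration and have nothing to do with InnoDB's actual file format.

```python
import os
import tempfile


def append_log(path, record, durable):
    """Append one record to a log file; call fsync only when durable is True."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # write() returns once the data is in the OS page cache.
        os.write(fd, record)
        if durable:
            # fsync() blocks until the OS reports the data is on the device.
            os.fsync(fd)
    finally:
        os.close(fd)


log_path = os.path.join(tempfile.mkdtemp(), "redo.log")
append_log(log_path, b"txn-1 committed\n", durable=True)   # write + fsync
append_log(log_path, b"txn-2 committed\n", durable=False)  # write only
```

Both records are readable from the file afterwards, since reads are served from the page cache. The difference only shows up after an operating system crash or power loss, when the record written without fsync may be missing.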

In addition to the Force Log at Commit mechanism mentioned above, which guarantees transaction durability, the implementation of redo logs also depends on mini-transactions.

How does Redo work in InnoDB? Connection with mini-transaction?

The implementation of redo is closely tied to the mini-transaction, a mechanism used internally by InnoDB. Mini-transactions ensure the consistency of data in data pages under concurrent transaction operations and when the database fails, but a mini-transaction is not itself a transaction.

To ensure data consistency in data pages, a mini-transaction must follow three protocols:

The FIX Rules

Write-Ahead Log

Force-log-at-commit

The FIX Rules

When modifying a data page, the x-latch (exclusive latch) of the page must be obtained. When reading a data page, the s-latch (shared or read latch) or the x-latch of the page is required. The latch is held until the operation that modifies or accesses the page is complete.
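The compatibility between s-latches and x-latches can be sketched with a small non-blocking latch. The class `PageLatch` and its `try_*` methods are hypothetical and single-threaded; a real latch blocks the caller instead of returning False.

```python
class PageLatch:
    """Toy shared/exclusive latch for one page (non-blocking, for illustration)."""

    def __init__(self):
        self.readers = 0     # number of s-latch holders
        self.writer = False  # True while an x-latch is held

    def try_s_latch(self):
        # Shared access is allowed unless someone holds the x-latch.
        if self.writer:
            return False
        self.readers += 1
        return True

    def try_x_latch(self):
        # Exclusive access requires that nobody else holds any latch.
        if self.writer or self.readers > 0:
            return False
        self.writer = True
        return True

    def release_s(self):
        if self.readers == 0:
            raise RuntimeError("no s-latch held")
        self.readers -= 1

    def release_x(self):
        if not self.writer:
            raise RuntimeError("no x-latch held")
        self.writer = False


latch = PageLatch()
reader_ok = latch.try_s_latch()       # a reader gets the s-latch
writer_blocked = latch.try_x_latch()  # a writer cannot get the x-latch yet
latch.release_s()
writer_ok = latch.try_x_latch()       # now the writer succeeds
```

Several readers can hold the s-latch at once, while a writer has to wait until every other holder has released the page.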

Write-Ahead Log

Write-Ahead Logging was mentioned above: before a data page is persisted, the corresponding log pages in memory must be persisted first. Each page has an LSN (log sequence number), an 8-byte, monotonically increasing value. Before a data page is written to the persistent device, all in-memory log records with an LSN up to that page's LSN must be written to the persistent device first.

So why must the log be written first? Could data be written directly to disk without a log? In principle yes, but it causes problems. Modifying data pages produces random I/O, whereas the log is written sequentially in append mode, a serial pattern that makes full use of disk performance.
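The LSN rule described in this section can be sketched as follows. The class `WalEngine` and its fields are hypothetical, and LSNs are simple counters here rather than byte offsets, but the check in `flush_page` is the Write-Ahead Log rule itself: the log is flushed up to the page's LSN before the page is written.

```python
class WalEngine:
    """Toy model of the write-ahead rule; names are illustrative only."""

    def __init__(self):
        self.next_lsn = 1
        self.log_buffer = []   # (lsn, page_id, value) records not yet durable
        self.flushed_lsn = 0   # highest LSN already in the redo log file
        self.pages = {}        # page_id -> (page_lsn, value) in memory
        self.disk_pages = {}   # page_id -> (page_lsn, value) on disk

    def modify(self, page_id, value):
        # Every modification gets the next LSN and stamps it on the page.
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_buffer.append((lsn, page_id, value))
        self.pages[page_id] = (lsn, value)
        return lsn

    def flush_log(self, up_to_lsn):
        # Persist all buffered log records with lsn <= up_to_lsn.
        remaining = []
        for record in self.log_buffer:
            if record[0] <= up_to_lsn:
                self.flushed_lsn = max(self.flushed_lsn, record[0])
            else:
                remaining.append(record)
        self.log_buffer = remaining

    def flush_page(self, page_id):
        page_lsn, value = self.pages[page_id]
        # Write-ahead rule: the log must be durable up to the page's LSN first.
        if self.flushed_lsn < page_lsn:
            self.flush_log(page_lsn)
        self.disk_pages[page_id] = (page_lsn, value)


engine = WalEngine()
engine.modify("p1", "a")         # LSN 1
lsn2 = engine.modify("p2", "b")  # LSN 2
engine.flush_page("p2")          # forces the log to be flushed up to LSN 2
```

Flushing page p2 drags the log records for LSN 1 and LSN 2 to disk with it, even though page p1 itself is still only in memory.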

Force-log-at-commit

This is the durability guarantee described earlier, summarized here again. A transaction can modify multiple pages. Write-Ahead Logging can guarantee the consistency of a single data page, but by itself it cannot guarantee the durability of the whole transaction. Force-log-at-commit requires that when a transaction commits, all the mini-transaction logs it generated are flushed to disk. Once the log has been flushed, even if the database goes down before the pages in the buffer pool are flushed to the persistent storage device, the integrity of the data can be restored from the log when the database restarts.

Writing process of redo log

The chart above shows the writing process of the redo log. Each mini-transaction corresponds to one DML operation; an update statement, for example, is protected by one mini-transaction. After the data is modified, redo1 is generated and first written into the mini-transaction's private buffer. When the update statement finishes, redo1 is copied from the private buffer to the public log buffer. When the enclosing transaction commits, the redo log buffer is flushed to the redo log file.
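The three stages of this path (private buffer, public log buffer, redo log file) can be sketched as below. The classes `LogSystem` and `MiniTransaction` are hypothetical simplifications, and redo records are plain strings.

```python
class LogSystem:
    """Toy public log buffer plus redo log file."""

    def __init__(self):
        self.log_buffer = []  # public redo log buffer shared by all mini-transactions
        self.redo_file = []   # redo log file on disk

    def flush(self):
        # Called when the enclosing transaction commits.
        self.redo_file.extend(self.log_buffer)
        self.log_buffer.clear()


class MiniTransaction:
    """Toy mini-transaction that collects redo in a private buffer."""

    def __init__(self, log_system):
        self.log_system = log_system
        self.private_buffer = []

    def log(self, record):
        # Redo generated while the statement runs stays private.
        self.private_buffer.append(record)

    def commit(self):
        # When the statement finishes, copy the records to the public buffer.
        self.log_system.log_buffer.extend(self.private_buffer)
        self.private_buffer.clear()


log_system = LogSystem()
mtr = MiniTransaction(log_system)
mtr.log("redo1")
stage1 = (list(mtr.private_buffer), list(log_system.log_buffer), list(log_system.redo_file))
mtr.commit()
stage2 = (list(mtr.private_buffer), list(log_system.log_buffer), list(log_system.redo_file))
log_system.flush()
stage3 = (list(mtr.private_buffer), list(log_system.log_buffer), list(log_system.redo_file))
```

Keeping the records private until the statement finishes means the public buffer only ever receives a mini-transaction's redo as one complete group.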

undo log

Definition of undo log

The undo log mainly records logical changes to data. To be able to roll back earlier operations when an error occurs, all of those operations must be recorded beforehand so that they can be reversed.

The role of undo log

Undo is a logical log with two functions:

Rollback for transactions

MVCC

This article focuses on undo log for transaction rollback.

The undo log only restores the database logically to its original state. A rollback actually performs the opposite work: for each INSERT it executes a corresponding DELETE, and for each UPDATE it executes an opposite UPDATE that puts back the row as it was before the modification. Undo logs are used to roll back transactions and thereby ensure the atomicity of transactions.
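A minimal sketch of this logical rollback follows. The class `Transaction` is hypothetical and the table is a plain dictionary keyed by primary key; each operation first records its inverse in the undo log and then changes the data.

```python
class Transaction:
    """Toy transaction that records logical inverse operations."""

    def __init__(self, table):
        self.table = table  # dict: primary key -> row value
        self.undo_log = []  # inverse operations, oldest first

    def insert(self, key, value):
        # The inverse of an INSERT is a DELETE of the same key.
        self.undo_log.append(("delete", key, None))
        self.table[key] = value

    def update(self, key, value):
        # The inverse of an UPDATE is an UPDATE back to the old value.
        self.undo_log.append(("update", key, self.table[key]))
        self.table[key] = value

    def delete(self, key):
        # The inverse of a DELETE is an INSERT of the old row.
        self.undo_log.append(("insert", key, self.table[key]))
        del self.table[key]

    def rollback(self):
        # Apply the inverse operations in reverse order.
        while self.undo_log:
            op, key, old_value = self.undo_log.pop()
            if op == "delete":
                del self.table[key]
            else:
                self.table[key] = old_value


table = {1: "a", 2: "b"}
txn = Transaction(table)
txn.insert(3, "c")
txn.update(1, "z")
txn.delete(2)
txn.rollback()
print(table)  # prints {1: 'a', 2: 'b'}
```

The rollback does not restore any page image. It simply runs the opposite operations, which is why undo is called a logical log.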

Writing timing of undo log

The undo log is recorded before a DML operation modifies the clustered index.

Modifications to secondary index records do not generate undo logs.

Note that changes to undo pages themselves also need to be recorded in the redo log.

Storage location of undo

In the InnoDB storage engine, undo is stored in rollback segments. Each rollback segment records 1024 undo log segments, and undo pages are allocated within each undo log segment. Before MySQL 5.6, rollback segments lived in the shared tablespace. From 5.6.3 onward, undo can be placed in separate undo tablespaces, configured through innodb_undo_tablespaces (with innodb_undo_directory setting their location).

Types of undo

In InnoDB storage engine, undo log is divided into:

insert undo log

update undo log

The insert undo log is the undo log generated by insert operations. Because a record created by an insert is visible only to the transaction itself and not to other transactions, this undo log can be deleted directly after the transaction commits, without a purge operation.

The update undo log is the undo log generated by delete and update operations. Because it may be needed by the MVCC mechanism, it cannot be deleted when the transaction commits. On commit it is placed on the undo log list, where it waits for the purge thread to delete it.

Supplementary: the purge thread has two main functions: cleaning up undo pages and removing rows that carry the Delete_Bit flag in data pages. In InnoDB, a delete operation in a transaction does not really delete the record. It is only a delete mark operation that sets Delete_Bit on the record. This is a kind of "false deletion", just a mark; the real deletion is done by the background purge thread.
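The delete mark and the later purge can be sketched like this. The class `Table` and the field name `delete_mark` are hypothetical, and the sketch deliberately ignores MVCC: a real purge removes a row only once no transaction can still need the old version.

```python
class Table:
    """Toy table where DELETE only marks rows and purge removes them later."""

    def __init__(self):
        self.rows = {}  # key -> {"value": ..., "delete_mark": bool}

    def insert(self, key, value):
        self.rows[key] = {"value": value, "delete_mark": False}

    def delete(self, key):
        # "False deletion": only set the mark, keep the record in the page.
        self.rows[key]["delete_mark"] = True

    def visible_keys(self):
        # Marked rows are hidden from normal reads.
        return sorted(k for k, row in self.rows.items() if not row["delete_mark"])

    def purge(self):
        # Background work: physically remove the marked rows.
        doomed = [k for k, row in self.rows.items() if row["delete_mark"]]
        for key in doomed:
            del self.rows[key]
        return len(doomed)


t = Table()
t.insert(1, "a")
t.insert(2, "b")
t.delete(1)
stored_before_purge = len(t.rows)  # the marked row is still physically stored
purged = t.purge()
stored_after_purge = len(t.rows)
```

Between the delete and the purge, the row is invisible to readers but still occupies space in the table.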

Is undo log an inverse process of redo log?

Is the undo log an inverse process of the redo log? The answer follows from the preceding sections. The undo log is a logical log: when a transaction is rolled back, it only restores the database logically to its original state. The redo log is a physical log that records the physical changes to data pages. So the undo log is clearly not the reverse process of the redo log.

Summary of redo & undo

The following is a simplified process combining the redo log and the undo log, which makes the two logging processes easier to understand:

Assume there are two data items, A and B, with values 1 and 2 respectively.
1. Transaction starts
2. Record A = 1 to the undo log
3. Modify A = 3
4. Record A = 3 to the redo log
5. Record B = 2 to the undo log
6. Modify B = 4
7. Record B = 4 to the redo log
8. Write the redo log to disk
9. Transaction commits
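The nine steps can be replayed in a few lines of code. The function names `run_transaction`, `recover` and `rollback` are made up for this sketch, and the logs are plain lists of (key, value) pairs.

```python
def run_transaction(data):
    """Replay steps 1 to 9 on the in-memory data; return (undo_log, redo_on_disk)."""
    undo_log, redo_buffer, redo_on_disk = [], [], []
    for key, new_value in (("A", 3), ("B", 4)):
        undo_log.append((key, data[key]))     # steps 2 and 5: old value to undo log
        data[key] = new_value                 # steps 3 and 6: modify in memory
        redo_buffer.append((key, new_value))  # steps 4 and 7: new value to redo log
    redo_on_disk.extend(redo_buffer)          # step 8: write the redo log to disk
    return undo_log, redo_on_disk             # step 9: the transaction commits


def recover(disk_data, redo_on_disk):
    """After a crash, reapply the redo records to the stale on-disk data."""
    recovered = dict(disk_data)
    for key, value in redo_on_disk:
        recovered[key] = value
    return recovered


def rollback(data, undo_log):
    """Undo the changes by restoring old values in reverse order."""
    for key, old_value in reversed(undo_log):
        data[key] = old_value


disk = {"A": 1, "B": 2}  # data files: never flushed in this example
memory = dict(disk)      # buffer pool copy that the transaction modifies
undo, redo = run_transaction(memory)
after_crash = recover(disk, redo)  # crash after commit: redo brings back A=3, B=4
rollback(memory, undo)             # alternatively, undo restores A=1, B=2
```

Redo carries the new values forward after a crash, while undo carries the old values back for a rollback.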

In practice, for insert, update and delete operations, redo and undo record different contents and different amounts of data. Inside InnoDB, the general order in memory is as follows:

Write redo for the undo change
Write undo
Modify data pages
Write redo

Summary

This article has analyzed the redo and undo logs in transactions, drawing on several reference books. Some explanations may still be unclear; if you find any mistakes, corrections are welcome.


Kiên Bùi
