Aries recovery algorithm in dbms: Recovery System

What is ARIES? What is the data structure of ARIES? How it can be used as a recovery algorithm? If you want to find out all the answers of these questions, sit back, relax and read this post till the end and I guarantee you that you will gain some knowledge through this. So, without wasting any more time, let's begin-

What is ARIES?

ARIES is one of the best method to illustrate the state of art in recovery method. The recovery technique, Recovery Algorithm, along with the logical undo logging techniques described in early lock release and logical undo operations, is modeled after ARIES, but has been simplified significantly to bring out key concepts and make it easier to understand.

In contrast, ARIES uses a number of techniques to reduce the time taken for recovery, and to reduce the overhead of checkpointing. In particular, ARIES is able to void redoing many logged operations that have already been applied and to reduce the amount of information logged. The price paid is greater complexity; the benefits are worth the price.

The major differences between ARIES and the recovery algorithm presented earlier are that ARIES:

1. Uses a log sequence number (LSN) to identify log records, and stores LSNs in database pages to identify which operations have been applied to a database page.

2. Supports physiological redo operations, which are physical in that the affected page is physically identified, but can be logical within the page.For instance, the deflection of a record from a page may result in many other records in the page being shifted, if a slotted page structure is used. With physical redo logging, all bytes of the page affected by the shifting of records must be logged. With physiological logging, the deletion operation can be logged, resulting in a much smaller log record. Redo of the deletion operation would delete the record and shift other records as required.

3. Uses a dirty page table to minimize unmecessary redos during recovery. As mentioned earlier, dirty pages are those that have been updated in memory, and the disk version is not up-to-date.

4. Uses a fuzzy-checkpointing scheme that records only information about dirty pages and associated information and does not even require writing of dirty pages to disk. It flushes dirty pages in the background, continuously, instead of writing them during checkpoints.

In the rest of this post, I will provide an overview of ARIES. The bibliographical notes list references that provide a complete description of ARIES.

What are the Data Structures used in ARIES?

Aries recovery algorithm in dbms: Recovery System
Data Structures used in ARIES

Each log record in ARIES has a long sequence number (LSN) that uniquely identifies the record. The number is conceptually just a logical identifier whose value is greater for log records that occur later in the log. In practice, the LSN is generated in such a way that it can also be used to locate the log record on disk. Typically, ARIES splits a log into multiple log files, each of which has a file number.

When a log file grows to some limit, ARIES appends further log records to a new log file; the new log file has a file number that is higher by 1 than the previous log file. The LSN then consists of a file number and an offset within the file.

Each page also maintains an identifier called the PageLSN. Whenever an update operation (whether  physical or physiological) occurs on a page, the operation stores the LSN of it's log record in the PageLSN field of the page. During the redo phase of recovery, any log records with LSN less than or equal to the PageLSNs of a page should not be executed on the page, since their actions are already reflected on the page.

In combination with a scheme for recording PageLSNs as part of checkpointing, which we present later, ARIES can avoid even reading many pages for which logged operations are already reflected on disk. Thereby, recovery time is reduced significantly. The PageLSN is essential for ensuring idempotence in the presence of physiological redo operations, since reapplying a physiological redo that has already been applied to a page could cause incorrect changes to a page.

Pages should not be flushed to disk while an update is in progress, since physiological operations cannot be redone on the partially updated state of the page on disk. Therefore, ARIES uses laches on buffer pages to prevent them from being written on the disk while they are being updated. It releases the buffer page latch only after the update is completed, and the log record for the update has been written to the log.

Each log record also contains the LSN of the previous log record of the same transaction. This value, stored in the PrevLSN field, permits log records of a transaction ton be fetched backward, without reading the whole log. There are special redo-only log records generated during transaction rollback, called compensation log records (CLRs) in ARIES.

These serve the same purpose as the redo-only log records in our earlier recovery scheme. In addition CLRs serve the role of the operation-abort log records in our scheme. The CLRs have an extra field, called the UndoNextLSN, that records the LSN of the log that needs to be undone next, when the transaction is being rolled back.

This field serves the same purpose as the operation identifier in the operation-abort log record in our earlier recovery scheme, which helps to skip over log records that have already been rolled back.

How ARIES can be used in Recovery Algorithm?

Aries recovery algorithm in dbms: Recovery System
Recovery Algorithm in ARIES

ARIES recovers from a system crash in three passes.

  • Analysis Pass: This pass determines which transactions to undo, which pages where dirty at the time of the crash, and the LSN from which the redo pass should start.
  • Redo Pass: This pass starts from a position determined during analysis, and performs a redo, repeating history, to bring the database to a state it was in before the crash.
  • Undo Pass: This pass rolls back all transactions that were incomplete at the time of crash.

Analysis Pass

The analysis pass finds the last complete checkpoint log record, and reads in the DirtyPageTable from this record. It then sets RedoLSN to the minimum of the RecLSNs of the pages in the DirtyPageTable. If there are no dirty pages, it sets RedoLSN to the LSN of the checkpoint log record. The redo pass starts its scan of the log from RedoLSN. All the log records earlier than this point have already been applied to the database pages on the disk.

The analysis pass initially sets the list of transactions to be undone, undo-list, to the list of transactions in the checkpoint log record. The analysis pass also reads from the checkpoint log record the LSNs of the last log record for each transaction in undo-list.

The analysis pass continues scanning forward from the checkpoint. Whenever it finds a log record for a transaction not in the undo-list, it adds the transaction to undo-list. Whenever it finds a transaction end log record, it deletes the transaction from undo-list. All transactions left in undo-list at the end of analysis have to be rolled back later, in the undo pass.

Redo Pass

The redo pass repeats history by replaying every action that is not already reflected in the page on the disk. The redo pass scans the log forward from RedoLSN. Whenever it finds an update log record, it takes the action:

1. If the page is not in DirtyPageTable or the LSN of the update log record is less than the RecLSN of the page in DirtyPageTable, then the redo pass skips the log record.

2. Otherwise the redo pass fetches the page from disk, and if the PageLSN is less than the LSN of the log record, it redoes the log record.

Note that if either of the tests is negative, then the effects of the log record have already appeared on the page; otherwise the effects of the log record are not reflected on the page. Since ARIES allows non-idempotent physiological log records, a log record should not be redone if its effect is already reflected on the page.

Undo Pass

The undo pass is relatively straightforward. It performs a single backward scan of the log, undoing all the transactions in undo-list. The undo pass examines only log records of transactions in undo-list; the last LSN recorded during the analysis pass is used to find the last log record for each transaction in undo-list.

Whenever an update log is found, it is used to perform an undo (whether for transaction rollback during normal processing, or during the restart undo pass). The undo pass generates a CLR containing the undo action performed (which must be physiological). It sets the UndoNextLSN of the CLR to the PrevLSN value of the update log record.

If a CLR is found, its UndoNextLSN value indicates the LSN of the next log record to be undone for the transaction; later log records for that transaction have already been rolled back. For log records other than CLRs, the PrevLSN field of the log record indicates the LSN of the log record to be undone for that transaction. The next log record to be processed at each stop in the undo pass is the maximum, across all transactions in undo-list, of next log record.

If want to learn about Buffer Management in DBMS click here.

If you enjoyed and got some knowledge through this post, please share it with your friends and family members and if you don't want to miss my articles, you can subscribe us by sharing your email id through FeedBurner(click on the subscribe button in the right ). Comment below if you got any query related to this article. 

Next Post »

No comments:

Post a Comment

Please do not enter any spam links in comment box.