Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Fix typos in the atomiccommit.html document. Update the proposed release date for 3.7.16. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
b17469386db3bd138dad5b0ed2668796 |
User & Date: | drh 2013-03-11 20:22:24.400 |
Context
2013-03-13
| ||
17:11 | Documentation updates and typo fixes in preparation for the next release. (check-in: d29bab5808 user: drh tags: trunk) | |
2013-03-11
| ||
20:22 | Fix typos in the atomiccommit.html document. Update the proposed release date for 3.7.16. (check-in: b17469386d user: drh tags: trunk) | |
2013-03-09
| ||
15:12 | Fix various documentation typos. (check-in: 8a8b9f4250 user: drh tags: trunk) | |
Changes
Changes to pages/atomiccommit.in.
︙ | ︙ | |||
28 29 30 31 32 33 34 | <p>This article describes the techniques used by SQLite to create the illusion of atomic commit.</p> <p>The information in this article applies only when SQLite is operating in "rollback mode", or in other words when SQLite is not using a [write-ahead log]. SQLite still supports atomic commit when write-ahead logging is enabled, but it accomplishes atomic commit by | | | 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | <p>This article describes the techniques used by SQLite to create the illusion of atomic commit.</p> <p>The information in this article applies only when SQLite is operating in "rollback mode", or in other words when SQLite is not using a [write-ahead log]. SQLite still supports atomic commit when write-ahead logging is enabled, but it accomplishes atomic commit by a different mechanism from the one described in this article. See the [WAL | write-ahead log documentation] for additional information on how SQLite supports atomic commit in that context.</p> <tcl>hd_fragment hardware</tcl> <h2>2.0 Hardware Assumptions</h2> <p>Throughout this article, we will call the mass storage device "disk" |
︙ | ︙ | |||
158 159 160 161 162 163 164 | SQLite does not add any redundancy to the database file for the purpose of detecting corruption or I/O errors. SQLite assumes that the data it reads is exactly the same data that it previously wrote.</p> <p>By default, SQLite assumes that an operating system call to write a range of bytes will not damage or alter any bytes outside of that range | | | 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 | SQLite does not add any redundancy to the database file for the purpose of detecting corruption or I/O errors. SQLite assumes that the data it reads is exactly the same data that it previously wrote.</p> <p>By default, SQLite assumes that an operating system call to write a range of bytes will not damage or alter any bytes outside of that range even if a power loss or OS crash occurs during that write. We call this the "[PSOW | powersafe overwrite]" property. Prior to version 3.7.9, SQLite did not assume powersafe overwrite. But with the standard sector size increasing from 512 to 4096 bytes on most disk drives, it has become necessary to assume powersafe overwrite in order to maintain historical performance levels and so powersafe overwrite is assumed by default in recent versions of SQLite. The assumption of powersafe overwrite property can be disabled at compile-time or a run-time if |
︙ | ︙ | |||
222 223 224 225 226 227 228 | lock allows two or more database connections to read from the database file at the same time. But a shared lock prevents another database connection from writing to the database file while we are reading it. This is necessary because if another database connection were writing to the database file at the same time we are reading from the database file, we might read some data before the change and other data after the change. | | | 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 | lock allows two or more database connections to read from the database file at the same time. But a shared lock prevents another database connection from writing to the database file while we are reading it. This is necessary because if another database connection were writing to the database file at the same time we are reading from the database file, we might read some data before the change and other data after the change. This would make it appear as if the change made by the other process is not atomic.</p> <p>Notice that the shared lock is on the operating system disk cache, not on the disk itself. File locks really are just flags within the operating system kernel, usually. (The details depend on the specific OS layer interface.) Hence, the lock will instantly vanish if the |
︙ | ︙ | |||
273 274 275 276 277 278 279 | and shared lock allow other processes to read from the database file. A single reserve lock can coexist with multiple shared locks from other processes. However, there can only be a single reserved lock on the database file. Hence only a single process can be attempting to write to the database at one time.</p> | | | 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 | and shared lock allow other processes to read from the database file. A single reserve lock can coexist with multiple shared locks from other processes. However, there can only be a single reserved lock on the database file. Hence only a single process can be attempting to write to the database at one time.</p> <p>The idea behind a reserved lock is that it signals that a process intends to modify the database file in the near future but has not yet started to make the modifications. And because the modifications have not yet started, other processes can continue to read from the database. However, no other process should also begin trying to write to the database.</p> |
︙ | ︙ | |||
402 403 404 405 406 407 408 | <p>Another flush must occur to make sure that all the database changes are written into nonvolatile storage. This is a critical step to ensure that the database will survive a power loss without damage. However, because of the inherent slowness of writing to disk or flash memory, this step together with the rollback journal file flush in section | | | | 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 | <p>Another flush must occur to make sure that all the database changes are written into nonvolatile storage. This is a critical step to ensure that the database will survive a power loss without damage. However, because of the inherent slowness of writing to disk or flash memory, this step together with the rollback journal file flush in section 3.7 above takes up most of the time required to complete a transaction commit in SQLite.</p> <br clear="both"> <a name="section_3_11"></a> <h3>3.11 Deleting The Rollback Journal</h3> <img src="images/ac/commit-A.gif" align="right" hspace="15"> <p>After the database changes are all safely on the mass storage device, the rollback journal file is deleted. This is the instant where the transaction commits. If a power failure or system crash occurs prior to this point, then recovery processes to be described later make it appear as if no changes were ever made to the database file. If a power failure or system crash occurs after the rollback journal is deleted, then it appears as if all changes have been written to disk. Thus, SQLite gives the appearance of having made no changes to the database file or having made the complete set of changes to the database file depending on whether or not the rollback journal file exists.</p> |
︙ | ︙ | |||
499 500 501 502 503 504 505 | have to "rollback" any partial changes and restore the database to the state it was in prior to the beginning of the transaction.</p> <tcl>hd_fragment crisis</tcl> <h3>4.1 When Something Goes Wrong...</h3> <img src="images/ac/rollback-0.gif" align="right" hspace="15"> | | > > | > | | | 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 | have to "rollback" any partial changes and restore the database to the state it was in prior to the beginning of the transaction.</p> <tcl>hd_fragment crisis</tcl> <h3>4.1 When Something Goes Wrong...</h3> <img src="images/ac/rollback-0.gif" align="right" hspace="15"> <p>Suppose the power loss occurred during <a href="#section_3_10">step 3.10</a> above, while the database changes were being written to disk. After power is restored, the situation might be something like what is shown to the right. We were trying to change three pages of the database file but only one page was successfully written. Another page was partially written and a third page was not written at all.</p> <p>The rollback journal is complete and intact on disk when the power is restored. This is a key point. The reason for the flush operation in <a href="#section_3_7">step 3.7</a> is to make absolutely sure that all of the rollback journal is safely on nonvolatile storage prior to making any changes to the database file itself.</p> <br clear="both"> <a name="section_4_2"></a> <h3>4.2 Hot Rollback Journals</h3> <img src="images/ac/rollback-1.gif" align="right" hspace="15"> <p>The first time that any SQLite process attempts to access the database file, it obtains a shared lock as described in <a href="section_3_2">section 3.2</a> above. But then it notices that there is a rollback journal file present. SQLite then checks to see if the rollback journal is a "hot journal". A hot journal is a rollback journal that needs to be played back in order to restore the database to a sane state. A hot journal only exists when an earlier process was in the middle of committing a transaction when it crashed or lost power.</p> <p>A rollback journal is a "hot" journal if all of the following are true:</p> <ul> <li>The rollback journal exists. <li>The rollback journal is not an empty file. <li>There is no reserved lock on the main database file. <li>The header of the rollback journal is well-formed and in particular has not been zeroed out. <li>The rollback journal does not contain the name of a master journal file (see <a href="#section_5_5">section 5.5</a> below) or if does |
︙ | ︙ | |||
569 570 571 572 573 574 575 | <a name="section_4_4"></a> <h3>4.4 Rolling Back Incomplete Changes</h3> <img src="images/ac/rollback-3.gif" align="right" hspace="15"> <p>Once a process obtains an exclusive lock, it is permitted to write to the database file. It then proceeds to read the original content of pages out of the rollback journal and write | | | 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 | <a name="section_4_4"></a> <h3>4.4 Rolling Back Incomplete Changes</h3> <img src="images/ac/rollback-3.gif" align="right" hspace="15"> <p>Once a process obtains an exclusive lock, it is permitted to write to the database file. It then proceeds to read the original content of pages out of the rollback journal and write that content back to where it came from in the database file. Recall that the header of the rollback journal records the original size of the database file prior to the start of the aborted transaction. SQLite uses this information to truncate the database file back to its original size in cases where the incomplete transaction caused the database to grow. At the end of this step, the database should be the same size and contain the same information as it did before the start of |
︙ | ︙ | |||
592 593 594 595 596 597 598 | played back into the database file (and flushed to disk in case we encounter yet another power failure), the hot rollback journal can be deleted.</p> <p>As in <a href="#section_3_11">section 3.11</a>, the journal file might be truncated to zero length or its header might be overwritten with zeros as an optimization on systems where | | | | 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 | played back into the database file (and flushed to disk in case we encounter yet another power failure), the hot rollback journal can be deleted.</p> <p>As in <a href="#section_3_11">section 3.11</a>, the journal file might be truncated to zero length or its header might be overwritten with zeros as an optimization on systems where deleting a file is expensive. Either way, the journal is no longer hot after this step.</p> <br clear="both"> <tcl>hd_fragment cont</tcl> <h3>4.6 Continue As If The Uncompleted Writes Had Never Happened</h3> <img src="images/ac/rollback-5.gif" align="right" hspace="15"> <p>The final recovery step is to reduce the exclusive lock back |
︙ | ︙ | |||
948 949 950 951 952 953 954 | <a name="section_7_2"></a> <h3>7.2 Exclusive Access Mode</h3> <p>SQLite version 3.3.14 adds the concept of "Exclusive Access Mode". In exclusive access mode, SQLite retains the exclusive database lock at the conclusion of each transaction. This prevents | | | 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 | <a name="section_7_2"></a> <h3>7.2 Exclusive Access Mode</h3> <p>SQLite version 3.3.14 adds the concept of "Exclusive Access Mode". In exclusive access mode, SQLite retains the exclusive database lock at the conclusion of each transaction. This prevents other processes from accessing the database, but in many deployments only a single process is using a database so this is not a serious problem. The advantage of exclusive access mode is that disk I/O can be reduced in three ways:</p> <ol> <li><p>It is not necessary to increment the change counter in the database header for transactions after the first transaction. This |
︙ | ︙ | |||
1030 1031 1032 1033 1034 1035 1036 | writes. When power is lost, the drive uses energy stored in capacitors and/or the angular momentum of the disk platter to provide power to complete any operation in progress. Nevertheless, there are so many layers in between the write system call and the on-board disk drive electronics that we take the safe approach in both Unix and w32 VFS implementations and assume that sector writes are not atomic. On the other hand, device | | | 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 | writes. When power is lost, the drive uses energy stored in capacitors and/or the angular momentum of the disk platter to provide power to complete any operation in progress. Nevertheless, there are so many layers in between the write system call and the on-board disk drive electronics that we take the safe approach in both Unix and w32 VFS implementations and assume that sector writes are not atomic. On the other hand, device manufacturers with more control over their filesystems might want to consider enabling the atomic write property of xDeviceCharacteristics if their hardware really does do atomic writes.</p> <p>When sector writes are atomic and the page size of a database is the same as a sector size, and when there is a database change that only touches a single database page, then SQLite skips the whole journaling and syncing process and simply writes the modified page |
︙ | ︙ | |||
1278 1279 1280 1281 1282 1283 1284 | <tcl>hd_fragment mvhotjrnl</tcl> <h3>9.5 Deleting Or Renaming A Hot Journal</h3> <p>If a crash or power loss does occur and a hot journal is left on the disk, it is essential that the original database file and the hot journal remain on disk with their original names until the database file is opened by another SQLite process and rolled back. | | | 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 | <tcl>hd_fragment mvhotjrnl</tcl> <h3>9.5 Deleting Or Renaming A Hot Journal</h3> <p>If a crash or power loss does occur and a hot journal is left on the disk, it is essential that the original database file and the hot journal remain on disk with their original names until the database file is opened by another SQLite process and rolled back. During recovery at <a href="#section_4_2">step 4.2</a> SQLite locates the hot journal by looking for a file in the same directory as the database being opened and whose name is derived from the name of the file being opened. If either the original database file or the hot journal have been moved or renamed, then the hot journal will not be seen and the database will not be rolled back.</p> <p>We suspect that a common failure mode for SQLite recovery happens |
︙ | ︙ |
Changes to pages/changes.in.
︙ | ︙ | |||
38 39 40 41 42 43 44 | http://www.sqlite.org/src/timeline</a>.</p> } hd_close_aux hd_enable_main 1 } } | | | 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | http://www.sqlite.org/src/timeline</a>.</p> } hd_close_aux hd_enable_main 1 } } chng {2013-03-18 (3.7.16)} { <li>Added the [PRAGMA foreign_key_check] command. <li>Added new extended error codes for all SQLITE_CONSTRAINT errors <li>Added SQL functions [unicode(A)] and [char(X1,...,XN)]. <li>Performance improvements for [PRAGMA incremental_vacuum], especially in cases where the number of free pages is greater than what will fit on a single trunk page of the freelist. <li>Improved optimization of queries containing aggregate min() or max(). |
︙ | ︙ |