Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Updates to the lockingv3.html document. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
c354e084499b83e6949fd2b8007d7bab |
User & Date: | drh 2010-07-13 13:57:53.000 |
Context
2010-07-13
| ||
14:50 | Fix a typo on the temp_store_directory pragma documentation. (check-in: 40fa307f4e user: drh tags: trunk) | |
13:57 | Updates to the lockingv3.html document. (check-in: c354e08449 user: drh tags: trunk) | |
2010-07-08
| ||
18:12 | Update PRAGMA journal_mode documentation. Fix typos in the wal.html document. (check-in: b1e171c029 user: drh tags: trunk) | |
Changes
Changes to pages/lockingv3.in.
︙ | ︙ | |||
27 28 29 30 31 32 33 | set pnum(6) 0 set pnum(7) 0 set pnum(8) 0 HEADING 1 {File Locking And Concurrency In SQLite Version 3} </tcl> | | | > > > > > > > > > > | 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 | set pnum(6) 0 set pnum(7) 0 set pnum(8) 0 HEADING 1 {File Locking And Concurrency In SQLite Version 3} </tcl> <p>SQLite [Version 3.0.0] introduced a new locking and journaling mechanism designed to improve concurrency over SQLite version 2 and to reduce the writer starvation problem. The new mechanism also allows atomic commits of transactions involving multiple database files. This document describes the new locking mechanism. The intended audience is programmers who want to understand and/or modify the pager code and reviewers working to verify the design of SQLite version 3. </p> <p>This document was originally created in early 2004 when SQLite version 2 was still in widesprad use and was written to introduce the new concepts of SQLite version 3 to readers who were already familiar with SQLite version 2. But these days, most readers of this document have probably never seen SQLite version 2 and are only familiar with SQLite version 3. Nevertheless, this document continues to serve as an authoritative reference to how database file locking works in SQLite version 3.</p> <tcl>HEADING 1 {Overview} overview</tcl> <p> Locking and concurrency control are handled by the the <a href="http://www.sqlite.org/src/finfo?name=src/pager.c"> pager module</a>. |
︙ | ︙ | |||
146 147 148 149 150 151 152 | A PENDING lock is always just a temporary stepping stone on the path to an EXCLUSIVE lock and so the pager module does not track PENDING locks. </p> <tcl>HEADING 1 {The Rollback Journal} rollback {rollback journal}</tcl> | | > > | > | | | < < < | | > | | 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 | A PENDING lock is always just a temporary stepping stone on the path to an EXCLUSIVE lock and so the pager module does not track PENDING locks. </p> <tcl>HEADING 1 {The Rollback Journal} rollback {rollback journal}</tcl> <p>When a process wants to change a database file (and it is not in [WAL] mode), it first records the original unchanged database content in a <em>rollback journal</em>. The rollback journal is an ordinary disk file that is always located in the same directory or folder as the database file and has the same name as the database file with the addition of a <tt>-journal</tt> suffix. The rollback journal also records the initial size of the database so that if the database file grows it can be truncated back to its original size on a rollback.</p> <p>If SQLite is working with multiple databases at the same time (using the [ATTACH] command) then each database has its own rollback journal. But there is also a separate aggregate journal called the <em>master journal</em>. The master journal does not contain page data used for rolling back changes. Instead the master journal contains the names of the individual database rollback journals for each of the ATTACHed databases. Each of the individual database rollback journals also contain the name of the master journal. If there are no ATTACHed databases (or if none of the ATTACHed database is participating in the current transaction) no master journal is created and the normal rollback journal contains an empty string in the place normally reserved for recording the name of the master journal.</p> <tcl>hd_fragment hotjrnl {hot journal}</tcl> |
︙ | ︙ | |||
187 188 189 190 191 192 193 | If everything is working correctly (that is, if there are no crashes or power failures) you will never get a hot journal. </p> <p> If no master journal is involved, then | | > > > | 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 | If everything is working correctly (that is, if there are no crashes or power failures) you will never get a hot journal. </p> <p> If no master journal is involved, then a journal is hot if it exists and has a non-zero header and its corresponding database file does not have a RESERVED lock. If a master journal is named in the file journal, then the file journal is hot if its master journal exists and there is no RESERVED lock on the corresponding database file. It is important to understand when a journal is hot so the preceding rules will be repeated in bullets: </p> <ul> <li>A journal is hot if... <ul> <li>It exists, and</li> <li>Its size is greater than 512 bytes, and</li> <li>The journal header is non-zero and well-formed, and</li> <li>Its master journal exists or the master journal name is an empty string, and</li> <li>There is no RESERVED lock on the corresponding database file.</li> </ul> </li> </ul> |
︙ | ︙ | |||
235 236 237 238 239 240 241 | (Note: Do not acquire a RESERVED lock because that would make other processes think the journal was no longer hot.) If we fail to acquire these locks it means another process is already trying to do the rollback. In that case, drop all locks, close the database, and return SQLITE_BUSY. </li> <li>Read the journal file and roll back the changes.</li> <li>Wait for the rolled back changes to be written onto | | | > > > | 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 | (Note: Do not acquire a RESERVED lock because that would make other processes think the journal was no longer hot.) If we fail to acquire these locks it means another process is already trying to do the rollback. In that case, drop all locks, close the database, and return SQLITE_BUSY. </li> <li>Read the journal file and roll back the changes.</li> <li>Wait for the rolled back changes to be written onto persistent storage. This protects the integrity of the database in case another power failure or crash occurs.</li> <li>Delete the journal file (or truncate the journal to zero bytes in length if [PRAGMA journal_mode | PRAGMA journal_mode=TRUNCATE] is set, or zero the journal header if [PRAGMA journal_mode | PRAGMA journal_mode=PERSIST] is set).</li> <li>Delete the master journal file if it is safe to do so. This step is optional. It is here only to prevent stale master journals from cluttering up the disk drive. See the discussion below for details.</li> <li>Drop the EXCLUSIVE and PENDING locks but retain the SHARED lock.</li> </ol> |
︙ | ︙ | |||
338 339 340 341 342 343 344 | <ol> <li value="4"> Obtain an EXCLUSIVE lock on the database file and make sure all memory changes have been written to the database file using the algorithm of steps 1-3 above.</li> <li>Flush all database file changes to the disk. Wait for those changes to actually be written onto the disk surface.</li> | > > | | | 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 | <ol> <li value="4"> Obtain an EXCLUSIVE lock on the database file and make sure all memory changes have been written to the database file using the algorithm of steps 1-3 above.</li> <li>Flush all database file changes to the disk. Wait for those changes to actually be written onto the disk surface.</li> <li>Delete the journal file. (Or if the [PRAGMA journal_mode] is TRUNCATE or PERSIST, truncate the journal file or zero the header of the journal file, respectively.) This is the instant when the changes are committed. Prior to deleting the journal file, if a power failure or crash occurs, the next process to open the database will see that it has a hot journal and will roll the changes back. After the journal is deleted, there will no longer be a hot journal and the changes will persist. </li> <li>Drop the EXCLUSIVE and PENDING locks from the database file. </li> </ol> <p>As soon as the PENDING lock is released from the database file, other processes can begin reading the database again. In the current implementation, the RESERVED lock is also released, but that is not essential. Future versions of SQLite might provide a "CHECKPOINT" SQL command that will commit all changes made so far within a transaction but retain the RESERVED lock so that additional changes can be made without given any other process an opportunity to write.</p> |
︙ | ︙ | |||
409 410 411 412 413 414 415 | will prevent new readers from coming in. Assuming existing readers do eventually complete, all SHARED locks will eventually clear and the writer will be given a chance to make its changes.</p> <tcl>HEADING 1 {How To Corrupt Your Database Files} how_to_corrupt \ {How To Corrupt Your Database Files}</tcl> | | | < | 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 | will prevent new readers from coming in. Assuming existing readers do eventually complete, all SHARED locks will eventually clear and the writer will be given a chance to make its changes.</p> <tcl>HEADING 1 {How To Corrupt Your Database Files} how_to_corrupt \ {How To Corrupt Your Database Files}</tcl> <p>The pager module is very robust but it can be subverted. This section attempts to identify and explain the risks. (See also the [Things That Can Go Wrong] section of the article on [Atomic Commit].</p> <p> Clearly, a hardware or operating system fault that introduces incorrect data into the middle of the database file or journal will cause problems. Likewise, |
︙ | ︙ | |||
452 453 454 455 456 457 458 459 460 461 462 463 464 465 | in volatile cache memory in the disk drive electronics. There are also reports that Windows sometimes chooses to ignore FlushFileBuffers() for unspecified reasons. The author cannot verify any of these reports. But if they are true, it means that database corruption is a possibility following an unexpected power loss. These are hardware and/or operating system bugs that SQLite is unable to defend against. </p> <p> If a crash or power failure occurs and results in a hot journal but that journal is deleted, the next process to open the database will not know that it contains changes that need to be rolled back. The rollback will not occur and the database will be left in an inconsistent state. Rollback journals might be deleted for any number of reasons: | > > > > > > > > > > > | 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 | in volatile cache memory in the disk drive electronics. There are also reports that Windows sometimes chooses to ignore FlushFileBuffers() for unspecified reasons. The author cannot verify any of these reports. But if they are true, it means that database corruption is a possibility following an unexpected power loss. These are hardware and/or operating system bugs that SQLite is unable to defend against. </p> <p>In the Linux <a href="http://en.wikipedia.org/wiki/Ext3">ext3</a> file system, if the filesystem is not mounted with the "barrier=1" option in the <a href="http://en.wikipedia.org/wiki/fstab">/etc/fstab</a> then filesystem corruption can occur following a power loss or OS crash. We are told that most Linux distributions do not use barrier=1 and so most Linux distributions are vulnerable to this problem. Note that this is an operating system issue and that there is nothing that SQLite can do to work around it. <a href="http://ozlabs.org/~rusty/index.cgi/tech/2009-10-20.html"> Other database engines</a> have also run into this same problem.</p> <p> If a crash or power failure occurs and results in a hot journal but that journal is deleted, the next process to open the database will not know that it contains changes that need to be rolled back. The rollback will not occur and the database will be left in an inconsistent state. Rollback journals might be deleted for any number of reasons: |
︙ | ︙ |