Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Tweaks to the wal.html document. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
34cbc3f99191f8e20c7a82b8a718e367 |
User & Date: | drh 2010-05-24 22:39:48.000 |
Context
2010-05-25
| ||
23:40 | Update appearance with new logo and color scheme. (check-in: 225857b614 user: drh tags: trunk) | |
2010-05-24
| ||
22:39 | Tweaks to the wal.html document. (check-in: 34cbc3f991 user: drh tags: trunk) | |
2010-05-08
| ||
14:05 | Fix typos in the testing document. Change "insure" to "ensure" throughout the website. Ticket [6afbaac77a52917fc]. (check-in: 8c1658c1e2 user: drh tags: trunk) | |
Changes
Changes to pages/wal.in.
︙ | ︙ | |||
8 9 10 11 12 13 14 | Beginning with [version 3.7.0], a new "Write-Ahead Log" option (hereafter referred to as "WAL") is available.</p> <p>There are advantages and disadvantages to using WAL instead of a rollback journal. Advantages include:</p> <ol> | | | | | | | 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 | Beginning with [version 3.7.0], a new "Write-Ahead Log" option (hereafter referred to as "WAL") is available.</p> <p>There are advantages and disadvantages to using WAL instead of a rollback journal. Advantages include:</p> <ol> <li>WAL is significantly faster in the common case of many small transactions. <li>WAL provides more concurrency as readers do not block writers and are not blocked by a writer. Reading and writing can proceed concurrently. <li>Disk I/O operations tends to be more sequential using WAL. <li>WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken. </ol> <p>But there are also disadvantages:</p> <ol> <li>WAL is only available when the [sqlite3_vfs | VFS] supports shared-memory primitives. The built-in unix and windows VFSes support this but third-party extension VFSes for custom operating systems might not. <li>All processes using a database must be on the same host computer; WAL does not work over a network filesystem. <li>Transactions that involve changes against multiple [ATTACH | ATTACHed] databases are atomic for each individual database, but are not atomic across all databases as a set. <li>With WAL, it is not possible to change the database page sizes using [VACUUM] or when restoring from a backup using the [backup API]. <li>WAL can be slower than the traditional rollback-journal approach for large transactions (such as when using the [VACUUM] command). <li>There is the extra operation of [checkpointing] which, though automatic by default, is still something that application developers need to be mindful of. </ol> <h2>How WAL Works</h2> <p>The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. In the event of a crash or [ROLLBACK], the original content contained in the rollback journal is played back into the database file to revert the database file to its original state. The [COMMIT] occurs when the rollback journal is deleted.</p> <p>The WAL approach inverts this. The original content is preserved in the database file and the changes are appended into a separate WAL file. A [COMMIT] occurs when a special record indicating a commit is appended to the WAL. Thus a COMMIT can happen without ever writing to the original database, which allows reader to continue operating from the original unaltered database while changes are simultaneously committed into the WAL. Multiple transactions can be appended to the end of a single WAL file.</p> <tcl>hd_fragment ckpt checkpoint checkpointed checkpointing</tcl> <h3>Checkpointing</h3> <p>Of course, one wants to eventually transfer all the transactions that are appended in the WAL file back into the original database. Moving |
︙ | ︙ | |||
81 82 83 84 85 86 87 | <p>When a read operation begins on a WAL-mode database, it first remembers the location of the last valid commit record in the WAL. Call this point the "end mark". Because the WAL can be growing and adding new commit records while various readers connect to the database, each reader can potentially have its own end mark. But for any particular reader, the end mark is unchanged for the duration of the | | | | | | | | | 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 | <p>When a read operation begins on a WAL-mode database, it first remembers the location of the last valid commit record in the WAL. Call this point the "end mark". Because the WAL can be growing and adding new commit records while various readers connect to the database, each reader can potentially have its own end mark. But for any particular reader, the end mark is unchanged for the duration of the transaction, thus ensuring that a single read transaction only sees the database content as it existed at a single point in time.</p> <p>When a reader needs a page of content, it first checks the WAL to see if that page appears there, and if so it pulls in the last copy of the page that occurs in the WAL prior to the reader's end mark. If no copy of the page exists in the WAL prior to the reader's end mark, then the page is read from the original database file. Readers can exist in separate processes, so to avoid forcing every reader to scan the entire WAL looking for pages (the WAL file can grow to multiple megabytes, depending on how often checkpoints are run), a data structure called the "wal-index" is maintained in shared memory which helps readers locate pages in the WAL quickly and with a minimum of I/O. The wal-index greatly improves the performance of readers, but the use of shared memory means that all readers must exist on the same machine. This is why the write-ahead log implementation will not work on a network filesystem.</p> <p>Writers merely append new content to the end of the WAL file. Because writers do nothing that would interfere with the actions of readers, writers and readers can run at the same time. However, since there is only one WAL file, there can only be one writer at a time.</p> <p>A checkpoint operation takes all the content from the WAL file and transfers it back into the original database file. As long as all readers are looking at the entire WAL file (in other words, as long as no reader is using an end mark that is earlier than the actual end of the WAL) then it will be OK for the checkpoint to run in parallel with the readers. The changes to the database file that the checkpointer copies over from the WAL will not interfere with the readers, because the modified database pages all also exist in the WAL and so the readers will never access them. Thus, a checkpoint will never block a reader and readers will not block a checkpoint as long as all readers are looking at the entire WAL. But, older readers that only see a prefix of the WAL will block a checkpoint. And writers will block a checkpoint. And a checkpoint will block a writer.</p> <tcl>hd_fragment fast</tcl> <h3>Performance Considerations</h3> |
︙ | ︙ | |||
154 155 156 157 158 159 160 | write transactions.</p> <p>The default strategy is to allow successive write transactions to grow the WAL until the WAL becomes about 1000 pages in size, then to run a single checkpoint operation. By default, the checkpoint will be run automatically by the same thread that does the COMMIT that pushes the WAL over its size limit. This has the effect of causing most | | | | | | 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 | write transactions.</p> <p>The default strategy is to allow successive write transactions to grow the WAL until the WAL becomes about 1000 pages in size, then to run a single checkpoint operation. By default, the checkpoint will be run automatically by the same thread that does the COMMIT that pushes the WAL over its size limit. This has the effect of causing most COMMIT operations to be very fast but an occasional COMMIT (those that trigger a checkpoint) to be much slower. If that effect is undesirable, then the application can disable automatic checkpointing and run the periodic checkpoints in a separate thread, or separate process.</p> <p>Notice that there is a tradeoff between average read performance and average write performance. To maximize the read performance, one wants to keep the WAL as small as possible and hence run checkpoints frequently, perhaps as often as every COMMIT. To maximize write performance, one wants to amortize the cost of each checkpoint over as many writes as possible, meaning that one wants to run checkpoints infrequently and let the WAL grow as large as possible before each checkpoint. The decision of how often to run checkpoints may therefore vary from one application to another depending on the relative read and write performance requirements of the application. The default strategy is to run a checkpoint once the WAL reaches 1000 pages and this strategy seems to work well in test applications on workstations, but other strategies might work better on different platforms or for different workloads.</p> <h2>Activating And Configuring WAL Mode</h2> <p>An SQLite database connection defaults to [journal_mode | journal_mode=DELETE]. To convert to WAL mode, use the |
︙ | ︙ | |||
192 193 194 195 196 197 198 | On success, the pragma will return the string "<tt>wal</tt>". If the conversion to WAL could not be completed (for example, if the VFS does not support the necessary shared-memory primitives) then the journaling mode will be unchanged and the string returned from the primitive will be the prior journaling mode (for example "<tt>delete</tt>"). <p>By default, SQLite will automatically checkpoint whenever the WAL | | | > > | > | | | | | | 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 | On success, the pragma will return the string "<tt>wal</tt>". If the conversion to WAL could not be completed (for example, if the VFS does not support the necessary shared-memory primitives) then the journaling mode will be unchanged and the string returned from the primitive will be the prior journaling mode (for example "<tt>delete</tt>"). <p>By default, SQLite will automatically checkpoint whenever the WAL file reaches or exceeds 1000 pages, or when the last database connection on a database file closes. The default configuration is intended to work well for most applications. But programs that want more control can force a checkpoint using the [wal_checkpoint pragma] or by calling the [sqlite3_wal_checkpoint()] C interface. The automatic checkpoint threshold can be changed or automatic checkpointing can be completely disabled using the [wal_autocheckpoint pragma] or by calling the [sqlite3_wal_autocheckpoint()] C interface. A program can also use [sqlite3_wal_hook()] to register a callback to be invoked whenever any transaction commits to the WAL. This callback can then invoke [sqlite3_wal_checkpoint()] to for a checkpoint based on whatever criteria it thinks is appropriate. (The automatic checkpoint mechanism is implemented as a simple wrapper around [sqlite3_wal_hook()].)</p> <h2>Backwards Compatibility</h2> <p>The database file format is unchanged for WAL mode. However, the WAL file (located in the same directory or folder a the original database file and with the same name as the original database with "<tt>-wal</tt>" appended) is a new concept and so older versions of SQLite will not know how to recover a crashed SQLite database that was operating in WAL mode when the crash occurred. To prevent older versions of SQLite from trying to recover a WAL-mode database (and making matters worse) the database file format version numbers (bytes 18 and 19 in the [database header]) are increased to two in WAL mode. Thus, if an older version of SQLite attempts to connect to an SQLite database that is operating in WAL mode, it will report an error along the lines of "file is encrypted or is not a database".</p> <p>One can explicitly change out of WAL mode using a pragma such as this:</p> <blockquote><pre> PRAGMA journal_mode=DELETE; </pre></blockquote> <p>Deliberately changing out of WAL mode changes the database file format version numbers back to one so that older versions of SQLite can again access the database file.</p> |