Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Update the file format documentation for the new 64K page size. Add a caution to the WAL document. Omit annoying echos in the script that removes requirement marks from the documentation. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
0ec9f2cb8fcea0f08637f35fa94e496c |
User & Date: | drh 2010-08-12 14:38:03.000 |
Context
2010-08-12
| ||
16:26 | Work on the queryplanner.html document. (check-in: e8152063fb user: drh tags: trunk) | |
14:38 | Update the file format documentation for the new 64K page size. Add a caution to the WAL document. Omit annoying echos in the script that removes requirement marks from the documentation. (check-in: 0ec9f2cb8f user: drh tags: trunk) | |
2010-08-10
| ||
05:58 | Fix typos reported on the mailing list. (check-in: a85aaa0b62 user: drh tags: trunk) | |
Changes
Changes to pages/fileformat.in.
︙ | ︙ | |||
826 827 828 829 830 831 832 | [h3 "Pages and Page Types" "pages_and_page_types"] <p> The entire database file is divided into pages, each page consisting of <i>page-size</i> bytes, where <i>page-size</i> is the 2-byte integer value stored at offset 16 of the database header (see above). The <i>page-size</i> is always a power of two between 512 | | > > > | 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 | [h3 "Pages and Page Types" "pages_and_page_types"] <p> The entire database file is divided into pages, each page consisting of <i>page-size</i> bytes, where <i>page-size</i> is the 2-byte integer value stored at offset 16 of the database header (see above). The <i>page-size</i> is always a power of two between 512 (2<sup>9</sup>) and 32768 (2<sup>15</sup>) or the value 1 used to represent a 65536-byte page. This field can equivalently be viewed as a little-endian number which is page size divided by 256. SQLite database files always consist of an exact number of pages. <p> Pages are numbered beginning from 1, not 0. Page 1 consists of the first <i>page-size</i> bytes of the database file. The database header described in the previous section consumes the first 100 bytes of page 1. <p> |
︙ | ︙ | |||
1374 1375 1376 1377 1378 1379 1380 | [Tr]<td>1..2 <td>2<td>Byte offset of first block of free space on this page. If there are no free blocks on this page, this field is set to 0. [Tr]<td>3..4 <td>2<td>Number of cells (entries) on this page. [Tr]<td>5..6 <td>2<td>Byte offset of the first byte of the cell content area (see figure <cite>figure_indexpage</cite>), relative to the | | > | 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 | [Tr]<td>1..2 <td>2<td>Byte offset of first block of free space on this page. If there are no free blocks on this page, this field is set to 0. [Tr]<td>3..4 <td>2<td>Number of cells (entries) on this page. [Tr]<td>5..6 <td>2<td>Byte offset of the first byte of the cell content area (see figure <cite>figure_indexpage</cite>), relative to the start of the page. If this value is zero, then it should be interpreted as 65536. [Tr]<td>7 <td>1<td>Number of fragmented free bytes on page. [Tr]<td>8..11 <td>4<td>Page number of rightmost child-page (the child-page that heads the sub-tree in which all records are larger than all records stored on this page). This field is not present for leaf node pages. </table> |
︙ | ︙ |
Changes to pages/fileformat2.in.
︙ | ︙ | |||
34 35 36 37 38 39 40 | on the main database file.</p> <h3>1.1 Pages</h3> <p>The main database file consists of one or more pages. The size of a page is a power of two between 512 and 32768 inclusive. All pages within the same database are the same size. The page size for a database file | | | 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | on the main database file.</p> <h3>1.1 Pages</h3> <p>The main database file consists of one or more pages. The size of a page is a power of two between 512 and 32768 inclusive. All pages within the same database are the same size. The page size for a database file is determined by the 2-byte integer located at an offset of 16 bytes from the beginning of the database file.</p> <p>Pages are numbered beginning with 1. The maximum page number is 2147483646 (2<sup><small>31</small></sup> - 2). The minimum size SQLite database is a single 512-byte page. The maximum size database would be 2147483646 pages at 32768 bytes per page or 70,368,744,112,128 bytes (about 70 terabytes). Usually SQLite will |
︙ | ︙ | |||
100 101 102 103 104 105 106 | <i>Database Header Format</i><br> <table width="80%" border=1> <tr><th>Offset<th>Size<th>Description <tr><td valign=top align=center>0<td valign=top align=center>16<td align=left> The header string: "SQLite format 3\000" <tr><td valign=top align=center>16<td valign=top align=center>2<td align=left> The database page size in bytes. Must be a power of two between 512 | | | 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 | <i>Database Header Format</i><br> <table width="80%" border=1> <tr><th>Offset<th>Size<th>Description <tr><td valign=top align=center>0<td valign=top align=center>16<td align=left> The header string: "SQLite format 3\000" <tr><td valign=top align=center>16<td valign=top align=center>2<td align=left> The database page size in bytes. Must be a power of two between 512 and 32768 inclusive, or the value 1 representing a page size of 65536. <tr><td valign=top align=center>18<td valign=top align=center>1<td align=left> File format write version. 1 for legacy; 2 for [WAL]. <tr><td valign=top align=center>19<td valign=top align=center>1<td align=left> File format read version. 1 for legacy; 2 for [WAL]. <tr><td valign=top align=center>20<td valign=top align=center>1<td align=left> Bytes of unused "reserved" space at the end of each page. Usually 0. <tr><td valign=top align=center>21<td valign=top align=center>1<td align=left> |
︙ | ︙ | |||
152 153 154 155 156 157 158 | <h4>1.2.1 Magic Header String</h4> <p>Every SQLite database file begins with the following 16 bytes (in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00. This byte sequence corresponds to the UTF-8 string "SQLite format 3" including the nul terminator character at the end.</p> | > > > > > > > > > > > > > > | | | | | | 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 | <h4>1.2.1 Magic Header String</h4> <p>Every SQLite database file begins with the following 16 bytes (in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00. This byte sequence corresponds to the UTF-8 string "SQLite format 3" including the nul terminator character at the end.</p> <h4>1.2.2 Page Size</h4> <p>The two-byte value beginning at offset 16 determines the page size of the database. For SQLite versions 3.7.0.1 and earlier, this value is interpreted as a big-endian integer and must be a power of two between 512 and 32768, inclusive. Beginning with SQLite version 3.7.1, a page size of 65536 bytes is supported. The value 65536 will not fit in a two-byte integer, so to specify a 65536-byte page size, the value is at offset 16 is 0x00 0x01. This value can be interpreted as a big-endian 1 and thought of is as a magic number to represent the 65536 page size. Or one can view the two-byte field as a little endian number and say that it represents the page size divided by 256.</p> <h4>1.2.3 File format version numbers</h4> <p>The file format write version and file format read version at offsets 18 and 19 are intended to allow for enhancements of the file format in future versions of SQLite. In current versions of SQLite, both of these values are 1 for rollback journalling modes and 2 for [WAL] journalling mode. If a version of SQLite coded to the current file format specification encounters a database file where the read version is 1 or 2 but the write version is greater than 2, then the database file must be treated as read-only. If a database file with a read version greater than 2 is encounter, then that database cannot be read or written.</p> <h4>1.2.4 Reserved bytes per page</h4> <p>SQLite has the ability to set aside a small number of extra bytes at the end of every page for use by extensions. These extra bytes are used, for example, by the SQLite Encryption Extension to store a nonce and/or cryptographic checksum associated with each page. The "reserved space" size in the 1-byte integer at offset 20 is the number of bytes of space at the end of each page to reserve for extensions. This value is usually 0. The value can be odd.</p> <tcl>hd_fragment usable_size {usable size}</tcl> <p>The "usable size" of a database page is the page size specify by the 2-byte integer at offset 16 in the header less the "reserved" space size recorded in the 1-byte integer at offset 20 in the header. The usable size of a page might be an odd number. However, the usable size is not allowed to be less than 480. In other words, if the page size is 512, then the reserved space size cannot exceed 32.</p> <h4>1.2.5 Payload fractions</h4> <p>The maximum and minimum embedded payload fractions and the leaf payload fraction values must be 64, 32, and 32. These values were originally intended to as tunable parameters that could be used to modify the storage format of the b-tree algorithm. However, that functionality is not supported and there are no current plans to add support in the future. Hence, these three bytes are fixed at the values specified.</p> <h4>1.2.6 File change counter</h4> <tcl>hd_fragment chngctr {change counter}</tcl> <p>The file change counter is a 4-byte big-endian integer which is incremented whenever the database file is changed in rollback mode. When two or more processes are reading the same database file, each process can detect database changes from other processes by monitoring the change counter. A process will normally want to flush its database page cache when another process modified the database, since the cache has become stale. The file change counter facilitates this.</p> <p>In WAL mode, changes to the database are detected using the wal-index and so the change counter is not needed. Hence, the change counter might not be incremented on each transaction in WAL mode.</p> <h4>1.2.7 In-header database size</h4> <tcl>hd_fragment filesize {in-header database size}</tcl> <p>The 4-byte big-endian integer at offset 28 into the header stores the size of the database file in pages. If this in-header datasize size is not valid (see the next paragraph), then the database size is computed by looking at the actual size of the database file. Older versions of SQLite |
︙ | ︙ | |||
235 236 237 238 239 240 241 | know to update the in-header database size and so the in-header database size could be incorrect. But legacy versions of SQLite will also leave the version-valid-for number at offset 92 unchanged so it will not match the change-counter. Hence, invalid in-header database sizes can be detected (and ignored) by observing when the change-counter does not match the version-valid-for number.</p> | | | | | 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 | know to update the in-header database size and so the in-header database size could be incorrect. But legacy versions of SQLite will also leave the version-valid-for number at offset 92 unchanged so it will not match the change-counter. Hence, invalid in-header database sizes can be detected (and ignored) by observing when the change-counter does not match the version-valid-for number.</p> <h4>1.2.8 Free page list</h4> <p>Unused pages in the database file are stored on a freelist. The 4-byte big-endian integer at offset 32 stores the page number of the first page of the freelist, or zero if the freelist is empty. The 4-byte big-endian integer at offset 36 stores stores the total number of pages on the freelist.</p> <h4>1.2.9 Schema cookie</h4> <p>The schema cookie is a 4-byte big-endian integer at offset 40 that is incremented whenever the database schema changes. A prepared statement is compiled against a specific version of the database schema. Whenever the database schema changes, the statement must be reprepared. Whenever a prepared statement runs, it first checks the schema cookie to make sure the value is the same as when the statement was prepared and if not it aborts to force the statement to be reprepared.</p> <h4>1.2.10 Schema format number</h4> <p>The schema format number is a 4-byte big-endian integer at offset 44. The schema format number is similar to the file format read and write version numbers at offsets 18 and 19 except that the schema format number refers to the high-level SQL formatting rather than the low-level b-tree formatting. Four schema format numbers are currently defined:</p> |
︙ | ︙ | |||
288 289 290 291 292 293 294 | <p>New database files created by SQLite use format 1 by default, so that database files created by newer versions of SQLite can still be read by older versions of SQLite. The [legacy_file_format pragma] can be used to cause SQLite to create new database files using format 4. Future versions of SQLite may begin to create files using format 4 by default.</p> | | | | | | | | 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 | <p>New database files created by SQLite use format 1 by default, so that database files created by newer versions of SQLite can still be read by older versions of SQLite. The [legacy_file_format pragma] can be used to cause SQLite to create new database files using format 4. Future versions of SQLite may begin to create files using format 4 by default.</p> <h4>1.2.11 Suggested cache size</h4> <p>The 4-byte big-endian signed integer at offset 48 is the suggest cache size in pages for the database file. The value is a suggestion only and SQLite is under no obligation to honor it. The absolute value of the integer is used as the suggested size. The suggested cache size can be set using the [default_cache_size pragma].</p> <h4>1.2.12 Incremental vacuum settings</h4> <p>The two 4-byte big-endian integers at offsets 52 and 64 are used to manage the [auto_vacuum] and [incremental_vacuum] modes. If the integer at offset 52 is zero then pointer-map (ptrmap) pages are omitted from the database file and neither auto_vacuum nor incremental_vacuum are supported. If the integer at offset 52 is non-zero then it is the page number of the largest root page in the database file, the database file contain ptrmap pages, and the mode must be either auto_vacuum or incremental_vacuum. In this latter case, the integer at offset 64 is true for incremental_vacuum and false for auto_vacuum. If the integer at offset 52 is zero then the integer at offset 64 must also be zero.</p> <h4>1.2.13 Text encoding</h4> <p>The 4-byte big-endian integer at offset 56 determines the encoding used for all text strings stored in the database. A value of 1 means UTF-8. A value of 2 means UTF-16le. A value of 3 means UTF-16be. No other values are allowed.</p> <h4>1.2.14 User version number</h4> <p>The 4-byte big-endian integer at offset 60 is the user version which is set and queried by the [user_version pragma]. The user version is not used by SQLite.</p> <tcl>hd_fragment validfor {version-valid-for number}</tcl> <h4>1.2.15 Write library version number and version-valid-for number</h4> <p>The 4-byte big-endian integer at offset 96 stores the [SQLITE_VERSION_NUMBER] value. The 4-byte big-ending integer at offset 92 is the value of the [change counter] when the version number was stored. The integer at offset 92 indicates which transaction the version number is valid for and is sometimes called the "version-valid-for number". <h4>1.2.16 Header space reserved for expansion</h4> <p>All other bytes of the database file header are reserved for future expansion and must be set to zero.</p> <h3>1.3 The Lock-Byte Page</h3> <p>The lock-byte page is the single page of the database file |
︙ | ︙ | |||
509 510 511 512 513 514 515 | A value of 13 means the page is a leaf table b-tree page. Any other value is an error. <tr><td align=center valign=top>1<td align=center valign=top>2<td align=left> Byte offset into the page of the first freeblock <tr><td align=center valign=top>3<td align=center valign=top>2<td align=left> Number of cells on this page <tr><td align=center valign=top>5<td align=center valign=top>2<td align=left> | | > > > > > > > > | 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 | A value of 13 means the page is a leaf table b-tree page. Any other value is an error. <tr><td align=center valign=top>1<td align=center valign=top>2<td align=left> Byte offset into the page of the first freeblock <tr><td align=center valign=top>3<td align=center valign=top>2<td align=left> Number of cells on this page <tr><td align=center valign=top>5<td align=center valign=top>2<td align=left> Offset to the first byte of the cell content area. A zero value is used to represent an offset of 65536, which occurs on an empty root page when using a 65536-byte page size. <tr><td align=center valign=top>7<td align=center valign=top>1<td align=left> Number of fragmented free bytes within the cell content area <tr><td align=center valign=top>8<td align=center valign=top>4<td align=left> The right-most pointer (interior b-tree pages only) </table></blockquote></center> <p>The cell pointer array of a b-tree page immediately follows the b-tree page header. Let K be the number of cells on the btree. The cell pointer array consists of K 2-byte integer offsets to the cell contents. The cell pointers are arranged in key order with left-most cell (the cell with the smallest key) first and the right-most cell (the cell with the largest key) last.</p> <p>Cell content is stored in the cell content region of the b-tree page. SQLite strives to place cells as far toward the end of the b-tree page as it can, in order to leave space for future growth of the cell pointer array. The area in between the last cell pointer array entry and the beginning of the first cell is the unallocated region. </p> <p>If a page contains no cells (which is only possible for a root page of a table that contains no rows) then the offset to the cell content area will equal the page size minus the bytes of reserved space. If the database uses a 65536-byte page size and the reserved space is zero (the usual value for reserved space) then the cell content offset would want to be 65536. However, that integer is too large to be stored in a 2-byte unsigned integer, so a value of 0 is used in its place. <p>A freeblock is a structure used to identify unallocated space within a b-tree page. Freeblocks are organized on a chain. The first 2 bytes of a freeblock are a big-endian integer which is the offset in the b-tree page of the next freeblock in the chain, or zero if the freeblock is the last on the chain. The third and fourth bytes of each freeblock form a big-endian integer which is the size of the freeblock in bytes, including |
︙ | ︙ |
Changes to pages/wal.in.
︙ | ︙ | |||
48 49 50 51 52 53 54 55 56 57 58 59 60 61 | <li>There is an additional quasi-persistent "<tt>-wal</tt>" file and "<tt>-shm</tt> shared memory file associated with each database, which can make SQLite less appealing for use as an [application file-format]. <li>There is the extra operation of [checkpointing] which, though automatic by default, is still something that application developers need to be mindful of. </ol> <h2>How WAL Works</h2> <p>The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. In the | > > > > > | 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 | <li>There is an additional quasi-persistent "<tt>-wal</tt>" file and "<tt>-shm</tt> shared memory file associated with each database, which can make SQLite less appealing for use as an [application file-format]. <li>There is the extra operation of [checkpointing] which, though automatic by default, is still something that application developers need to be mindful of. <li>WAL works best with smaller transactions. WAL does not work as well as tranditional rollback journal modes when used on exceedingly large transactions (transactions where the size of the change to the database file reaches into the gigabyte range). </ol> <h2>How WAL Works</h2> <p>The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. In the |
︙ | ︙ |
Changes to remove_carets.sh.
︙ | ︙ | |||
8 9 10 11 12 13 14 | # certain character sequences from those files. Character sequences # removed are: # # ^( # )^ # ^ # | | < < | 8 9 10 11 12 13 14 15 16 17 18 19 20 | # certain character sequences from those files. Character sequences # removed are: # # ^( # )^ # ^ # echo 'Removing ^ characters ' find $1 -name '*.html' -print | grep -v matrix | while read file do mv $file x.html sed -e 's/\^(//g' -e 's/)^//g' -e 's/\^//g' x.html >$file done |