Documentation Source Text

Check-in [0ec9f2cb8f]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update the file format documentation for the new 64K page size. Add a caution to the WAL document. Omit annoying echos in the script that removes requirement marks from the documentation.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 0ec9f2cb8fcea0f08637f35fa94e496ca33d7f0d
User & Date: drh 2010-08-12 14:38:03
Context
2010-08-12
16:26
Work on the queryplanner.html document. check-in: e8152063fb user: drh tags: trunk
14:38
Update the file format documentation for the new 64K page size. Add a caution to the WAL document. Omit annoying echos in the script that removes requirement marks from the documentation. check-in: 0ec9f2cb8f user: drh tags: trunk
2010-08-10
05:58
Fix typos reported on the mailing list. check-in: a85aaa0b62 user: drh tags: trunk
Changes
Hide Diffs Side-by-Side Diffs Ignore Whitespace Patch

Changes to pages/fileformat.in.

   826    826   
   827    827       [h3 "Pages and Page Types" "pages_and_page_types"]
   828    828         <p>
   829    829           The entire database file is divided into pages, each page consisting
   830    830           of <i>page-size</i> bytes, where <i>page-size</i> is the 2-byte 
   831    831           integer value stored at offset 16 of the database header (see above).
   832    832           The <i>page-size</i> is always a power of two between 512 
   833         -        (2<sup>9</sup>) and 32768 (2<sup>15</sup>). SQLite database files
          833  +        (2<sup>9</sup>) and 32768 (2<sup>15</sup>) or the value 1 used to 
          834  +        represent a 65536-byte page. This field can equivalently be viewed
          835  +        as a little-endian number which is page size divided by 256.
          836  +        SQLite database files
   834    837           always consist of an exact number of pages.
   835    838         <p>
   836    839           Pages are numbered beginning from 1, not 0. Page 1 consists of
   837    840           the first <i>page-size</i> bytes of the database file. 
   838    841           The database header described in the previous section consumes
   839    842           the first 100 bytes of page 1.
   840    843         <p>
................................................................................
  1374   1377           [Tr]<td>1..2  <td>2<td>Byte offset of first block of free space on 
  1375   1378                                  this page. If there are no free blocks on this
  1376   1379                                  page, this field is set to 0.
  1377   1380           [Tr]<td>3..4  <td>2<td>Number of cells (entries) on this page.
  1378   1381           [Tr]<td>5..6  <td>2<td>Byte offset of the first byte of the cell
  1379   1382                                  content area (see figure 
  1380   1383                                  <cite>figure_indexpage</cite>), relative to the 
  1381         -                               start of the page.
         1384  +                               start of the page.  If this value is zero, then
         1385  +                               it should be interpreted as 65536.
  1382   1386           [Tr]<td>7     <td>1<td>Number of fragmented free bytes on page.
  1383   1387           [Tr]<td>8..11 <td>4<td>Page number of rightmost child-page (the
  1384   1388                                  child-page that heads the sub-tree in which all
  1385   1389                                  records are larger than all records stored on
  1386   1390                                  this page). This field is not present for leaf
  1387   1391                                  node pages.
  1388   1392         </table>

Changes to pages/fileformat2.in.

    34     34   on the main database file.</p>
    35     35   
    36     36   <h3>1.1 Pages</h3>
    37     37   
    38     38   <p>The main database file consists of one or more pages.  The size of a
    39     39   page is a power of two between 512 and 32768 inclusive.  All pages within
    40     40   the same database are the same size.  The page size for a database file
    41         -is determined by the 2-byte big-endian integer located at an offset of
           41  +is determined by the 2-byte integer located at an offset of
    42     42   16 bytes from the beginning of the database file.</p>
    43     43   
    44     44   <p>Pages are numbered beginning with 1.  The maximum page number is
    45     45   2147483646 (2<sup><small>31</small></sup> - 2).  The minimum size
    46     46   SQLite database is a single 512-byte page.
    47     47   The maximum size database would be 2147483646 pages at 32768 bytes per
    48     48   page or 70,368,744,112,128 bytes (about 70 terabytes).  Usually SQLite will
................................................................................
   100    100   <i>Database Header Format</i><br>
   101    101   <table width="80%" border=1>
   102    102   <tr><th>Offset<th>Size<th>Description
   103    103   <tr><td valign=top align=center>0<td valign=top align=center>16<td align=left>
   104    104   The header string: "SQLite format 3\000"
   105    105   <tr><td valign=top align=center>16<td valign=top align=center>2<td align=left>
   106    106   The database page size in bytes.  Must be a power of two between 512
   107         -and 32768 inclusive.
          107  +and 32768 inclusive, or the value 1 representing a page size of 65536.
   108    108   <tr><td valign=top align=center>18<td valign=top align=center>1<td align=left>
   109    109   File format write version.  1 for legacy; 2 for [WAL].
   110    110   <tr><td valign=top align=center>19<td valign=top align=center>1<td align=left>
   111    111   File format read version.  1 for legacy; 2 for [WAL].
   112    112   <tr><td valign=top align=center>20<td valign=top align=center>1<td align=left>
   113    113   Bytes of unused "reserved" space at the end of each page.  Usually 0.
   114    114   <tr><td valign=top align=center>21<td valign=top align=center>1<td align=left>
................................................................................
   152    152   <h4>1.2.1 Magic Header String</h4>
   153    153   
   154    154   <p>Every SQLite database file begins with the following 16 bytes (in hex):
   155    155   53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
   156    156   corresponds to the UTF-8 string "SQLite format 3" including the nul
   157    157   terminator character at the end.</p>
   158    158   
   159         -<h4>1.2.2 File format version numbers</h4>
          159  +<h4>1.2.2 Page Size</h4>
          160  +
          161  +<p>The two-byte value beginning at offset 16 determines the page size of 
          162  +the database.  For SQLite versions 3.7.0.1 and earlier, this value is 
          163  +interpreted as a big-endian integer and must be a power of two between
          164  +512 and 32768, inclusive.  Beginning with SQLite version 3.7.1, a page
          165  +size of 65536 bytes is supported.  The value 65536 will not fit in a
          166  +two-byte integer, so to specify a 65536-byte page size, the value is
          167  +at offset 16 is 0x00 0x01.
          168  +This value can be interpreted as a big-endian
          169  +1 and thought of is as a magic number to represent the 65536 page size.
          170  +Or one can view the two-byte field as a little endian number and say
          171  +that it represents the page size divided by 256.</p>
          172  +
          173  +<h4>1.2.3 File format version numbers</h4>
   160    174   
   161    175   <p>The file format write version and file format read version at offsets
   162    176   18 and 19 are intended to allow for enhancements of the file format
   163    177   in future versions of SQLite.  In current versions of SQLite, both of
   164    178   these values are 1 for rollback journalling modes and 2 for [WAL]
   165    179   journalling mode.  If a version of SQLite coded to the current
   166    180   file format specification encounters a database file where the read
   167    181   version is 1 or 2 but the write version is greater than 2, then the database
   168    182   file must be treated as read-only.  If a database file with a read version
   169    183   greater than 2 is encounter, then that database cannot be read or written.</p>
   170    184   
   171         -<h4>1.2.3 Reserved bytes per page</h4>
          185  +<h4>1.2.4 Reserved bytes per page</h4>
   172    186   
   173    187   <p>SQLite has the ability to set aside a small number of extra bytes at
   174    188   the end of every page for use by extensions.  These extra bytes are
   175    189   used, for example, by the SQLite Encryption Extension to store a nonce
   176    190   and/or cryptographic checksum associated with each page.  The 
   177    191   "reserved space" size in the 1-byte integer at offset 20 is the number
   178    192   of bytes of space at the end of each page to reserve for extensions.
................................................................................
   182    196   <p>The "usable size" of a database page is the page size specify by the
   183    197   2-byte integer at offset 16 in the header less the "reserved" space size
   184    198   recorded in the 1-byte integer at offset 20 in the header.  The usable
   185    199   size of a page might be an odd number.  However, the usable size is not
   186    200   allowed to be less than 480.  In other words, if the page size is 512,
   187    201   then the reserved space size cannot exceed 32.</p>
   188    202   
   189         -<h4>1.2.4 Payload fractions</h4>
          203  +<h4>1.2.5 Payload fractions</h4>
   190    204   
   191    205   <p>The maximum and minimum embedded payload fractions and the leaf
   192    206   payload fraction values must be 64, 32, and 32.  These values were
   193    207   originally intended to as tunable parameters that could be used to
   194    208   modify the storage format of the b-tree algorithm.  However, that
   195    209   functionality is not supported and there are no current plans to add
   196    210   support in the future.  Hence, these three bytes are fixed at the
   197    211   values specified.</p>
   198    212   
   199         -<h4>1.2.5 File change counter</h4>
          213  +<h4>1.2.6 File change counter</h4>
   200    214   
   201    215   <tcl>hd_fragment chngctr {change counter}</tcl>
   202    216   <p>The file change counter is a 4-byte big-endian integer which is
   203    217   incremented whenever the database file is changed in rollback mode.  
   204    218   When two or more processes are reading the same database file, each 
   205    219   process can detect database changes from other processes by monitoring 
   206    220   the change counter.
................................................................................
   208    222   another process modified the database, since the cache has become stale.
   209    223   The file change counter facilitates this.</p>
   210    224   
   211    225   <p>In WAL mode, changes to the database are detected using the wal-index
   212    226   and so the change counter is not needed.  Hence, the change counter might
   213    227   not be incremented on each transaction in WAL mode.</p>
   214    228   
   215         -<h4>1.2.6 In-header database size</h4>
          229  +<h4>1.2.7 In-header database size</h4>
   216    230   
   217    231   <tcl>hd_fragment filesize {in-header database size}</tcl>
   218    232   <p>The 4-byte big-endian integer at offset 28 into the header 
   219    233   stores the size of the database file in pages.  If this in-header
   220    234   datasize size is not valid (see the next paragraph), then the database 
   221    235   size is computed by looking
   222    236   at the actual size of the database file. Older versions of SQLite
................................................................................
   235    249   know to update the in-header database size and so the in-header
   236    250   database size could be incorrect.  But legacy versions of SQLite
   237    251   will also leave the version-valid-for number at offset 92 unchanged
   238    252   so it will not match the change-counter.  Hence, invalid in-header
   239    253   database sizes can be detected (and ignored) by observing when
   240    254   the change-counter does not match the version-valid-for number.</p>
   241    255   
   242         -<h4>1.2.7 Free page list</h4>
          256  +<h4>1.2.8 Free page list</h4>
   243    257   
   244    258   <p>Unused pages in the database file are stored on a freelist.  The
   245    259   4-byte big-endian integer at offset 32 stores the page number of
   246    260   the first page of the freelist, or zero if the freelist is empty.
   247    261   The 4-byte big-endian integer at offset 36 stores stores the total 
   248    262   number of pages on the freelist.</p>
   249    263   
   250         -<h4>1.2.8 Schema cookie</h4>
          264  +<h4>1.2.9 Schema cookie</h4>
   251    265   
   252    266   <p>The schema cookie is a 4-byte big-endian integer at offset 40
   253    267   that is incremented whenever the database schema changes.  A 
   254    268   prepared statement is compiled against a specific version of the
   255    269   database schema.  Whenever the database schema changes, the statement
   256    270   must be reprepared.  Whenever a prepared statement runs, it first checks
   257    271   the schema cookie to make sure the value is the same as when the statement
   258    272   was prepared and if not it aborts to force the statement to be reprepared.</p>
   259    273   
   260         -<h4>1.2.9 Schema format number</h4>
          274  +<h4>1.2.10 Schema format number</h4>
   261    275   
   262    276   <p>The schema format number is a 4-byte big-endian integer at offset 44.
   263    277   The schema format number is similar to the file format read and write
   264    278   version numbers at offsets 18 and 19 except that the schema format number
   265    279   refers to the high-level SQL formatting rather than the low-level b-tree
   266    280   formatting.  Four schema format numbers are currently defined:</p>
   267    281   
................................................................................
   288    302   <p>New database files created by SQLite use format 1 by default, so
   289    303   that database files created by newer versions of SQLite can still
   290    304   be read by older versions of SQLite.
   291    305   The [legacy_file_format pragma] can be used to cause SQLite
   292    306   to create new database files using format 4.  Future versions of 
   293    307   SQLite may begin to create files using format 4 by default.</p>
   294    308   
   295         -<h4>1.2.10 Suggested cache size</h4>
          309  +<h4>1.2.11 Suggested cache size</h4>
   296    310   
   297    311   <p>The 4-byte big-endian signed integer at offset 48 is the suggest
   298    312   cache size in pages for the database file.  The value is a suggestion
   299    313   only and SQLite is under no obligation to honor it.  The absolute value
   300    314   of the integer is used as the suggested size.  The suggested cache size
   301    315   can be set using the [default_cache_size pragma].</p>
   302    316   
   303         -<h4>1.2.11 Incremental vacuum settings</h4>
          317  +<h4>1.2.12 Incremental vacuum settings</h4>
   304    318   
   305    319   <p>The two 4-byte big-endian integers at offsets 52 and 64 are used
   306    320   to manage the [auto_vacuum] and [incremental_vacuum] modes.  If
   307    321   the integer at offset 52 is zero then pointer-map (ptrmap) pages are
   308    322   omitted from the database file and neither auto_vacuum nor
   309    323   incremental_vacuum are supported.  If the integer at offset 52 is
   310    324   non-zero then it is the page number of the largest root page in the
   311    325   database file, the database file contain ptrmap pages, and the
   312    326   mode must be either auto_vacuum or incremental_vacuum.  In this latter
   313    327   case, the integer at offset 64 is true for incremental_vacuum and
   314    328   false for auto_vacuum.  If the integer at offset 52 is zero then
   315    329   the integer at offset 64 must also be zero.</p>
   316    330   
   317         -<h4>1.2.12 Text encoding</h4>
          331  +<h4>1.2.13 Text encoding</h4>
   318    332   
   319    333   <p>The 4-byte big-endian integer at offset 56 determines the encoding
   320    334   used for all text strings stored in the database.  A value of 1 means
   321    335   UTF-8.  A value of 2 means UTF-16le.  A value of 3 means UTF-16be.
   322    336   No other values are allowed.</p>
   323    337   
   324         -<h4>1.2.13 User version number</h4>
          338  +<h4>1.2.14 User version number</h4>
   325    339   
   326    340   <p>The 4-byte big-endian integer at offset 60 is the user version which
   327    341   is set and queried by the [user_version pragma].  The user version is
   328    342   not used by SQLite.</p>
   329    343   
   330    344   <tcl>hd_fragment validfor {version-valid-for number}</tcl>
   331         -<h4>1.2.14 Write library version number and version-valid-for number</h4>
          345  +<h4>1.2.15 Write library version number and version-valid-for number</h4>
   332    346   
   333    347   <p>The 4-byte big-endian integer at offset 96 stores the 
   334    348   [SQLITE_VERSION_NUMBER] value.  The 4-byte big-ending integer at
   335    349   offset 92 is the value of the [change counter] when the version number
   336    350   was stored.  The integer at offset 92 indicates which transaction
   337    351   the version number is valid for and is sometimes called the
   338    352   "version-valid-for number".
   339    353   
   340         -<h4>1.2.15 Header space reserved for expansion</h4>
          354  +<h4>1.2.16 Header space reserved for expansion</h4>
   341    355   
   342    356   <p>All other bytes of the database file header are reserved for
   343    357   future expansion and must be set to zero.</p>
   344    358   
   345    359   <h3>1.3 The Lock-Byte Page</h3>
   346    360   
   347    361   <p>The lock-byte page is the single page of the database file
................................................................................
   509    523   A value of 13 means the page is a leaf table b-tree page.
   510    524   Any other value is an error.
   511    525   <tr><td align=center valign=top>1<td align=center valign=top>2<td align=left>
   512    526   Byte offset into the page of the first freeblock
   513    527   <tr><td align=center valign=top>3<td align=center valign=top>2<td align=left>
   514    528   Number of cells on this page
   515    529   <tr><td align=center valign=top>5<td align=center valign=top>2<td align=left>
   516         -Offset to the first byte of the cell content area
          530  +Offset to the first byte of the cell content area.  A zero value is used to represent an offset of 65536, which occurs on an empty root page when using a 65536-byte page size.
   517    531   <tr><td align=center valign=top>7<td align=center valign=top>1<td align=left>
   518    532   Number of fragmented free bytes within the cell content area
   519    533   <tr><td align=center valign=top>8<td align=center valign=top>4<td align=left>
   520    534   The right-most pointer (interior b-tree pages only)
   521    535   </table></blockquote></center>
   522    536   
   523    537   <p>The cell pointer array of a b-tree page immediately follows the b-tree
................................................................................
   529    543   
   530    544   <p>Cell content is stored in the cell content region of the b-tree page.
   531    545   SQLite strives to place cells as far toward the end of the b-tree page as
   532    546   it can, in order to leave space for future growth of the cell pointer array.
   533    547   The area in between the last cell pointer array entry and the beginning of
   534    548   the first cell is the unallocated region.
   535    549   </p>
          550  +
          551  +<p>If a page contains no cells (which is only possible for a root page
          552  +of a table that contains no rows) then the offset to the cell content
          553  +area will equal the page size minus the bytes of reserved space.  If
          554  +the database uses a 65536-byte page size and the reserved space is zero
          555  +(the usual value for reserved space) then the cell content offset would
          556  +want to be 65536.  However, that integer is too large to be stored in a
          557  +2-byte unsigned integer, so a value of 0 is used in its place.
   536    558   
   537    559   <p>A freeblock is a structure used to identify unallocated space within
   538    560   a b-tree page.  Freeblocks are organized on a chain.  The first 2 bytes of
   539    561   a freeblock are a big-endian integer which is the offset in the b-tree page
   540    562   of the next freeblock in the chain, or zero if the freeblock is the last on
   541    563   the chain.  The third and fourth bytes of each freeblock form
   542    564   a big-endian integer which is the size of the freeblock in bytes, including

Changes to pages/wal.in.

    48     48   <li>There is an additional quasi-persistent "<tt>-wal</tt>" file and
    49     49       "<tt>-shm</tt> shared memory file associated with each
    50     50       database, which can make SQLite less appealing for use as an 
    51     51       [application file-format].
    52     52   <li>There is the extra operation of [checkpointing] which, though automatic
    53     53       by default, is still something that application developers need to
    54     54       be mindful of.
           55  +<li>WAL works best with smaller transactions.  WAL does
           56  +    not work as well as tranditional rollback journal modes when used on
           57  +    exceedingly large transactions (transactions where the size of the
           58  +    change to the database file reaches into the gigabyte range).
           59  +    
    55     60   </ol>
    56     61   
    57     62   <h2>How WAL Works</h2>
    58     63   
    59     64   <p>The traditional rollback journal works by writing a copy of the
    60     65   original unchanged database content into a separate rollback journal file
    61     66   and then writing changes directly into the database file.  In the

Changes to remove_carets.sh.

     8      8   # certain character sequences from those files.  Character sequences
     9      9   # removed are:
    10     10   #
    11     11   #     ^(
    12     12   #     )^
    13     13   #     ^
    14     14   #
    15         -echo -n 'Removing ^ characters '
           15  +echo 'Removing ^ characters '
    16     16   find $1 -name '*.html' -print | grep -v matrix | while read file
    17     17   do
    18         -  echo -n .
    19     18     mv $file x.html
    20     19     sed -e 's/\^(//g' -e 's/)^//g' -e 's/\^//g' x.html >$file
    21     20   done
    22         -echo ' done'