Documentation Source Text

Check-in [3b1bef8768]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Readability improvements to the file format documentation.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 3b1bef87684702cbd5779c866b8da9c5a6a9bc5e3ddd8bd29338d834d7f00fce
User & Date: drh 2017-08-31 00:54:35
Context
2017-09-01
20:53
Updates to the size and performance chart check-in: 5e4cd244d1 user: drh tags: trunk
2017-08-31
00:54
Readability improvements to the file format documentation. check-in: 3b1bef8768 user: drh tags: trunk
2017-08-30
04:50
Update the speed-and-size spreadsheet. check-in: 402ce78c69 user: drh tags: trunk
Changes
Hide Diffs Side-by-Side Diffs Ignore Whitespace Patch

Changes to pages/fileformat2.in.

     1      1   <title>Database File Format</title>
     2      2   <alt-title>On-Disk Format</alt-title>
     3      3   <tcl>hd_keywords {file format} {second edition file format document}</tcl>
     4      4   
     5      5   <table_of_contents>
     6      6   
     7      7   <p>This document describes and defines the on-disk database file
     8         -format used by SQLite.</p>
            8  +format used by all releases of SQLite since 
            9  +version 3.0.0 ([dateof:3.0.0]).</p>
     9     10   
    10     11   <h1>The Database File</h1>
    11     12   
    12     13   <p>The complete state of an SQLite database is usually
    13     14   contained in a single file on disk called the "main database file".</p>
    14     15   
    15     16   <p>During a transaction, SQLite stores additional information 
................................................................................
    40     41   SQLite database is a single 512-byte page.
    41     42   The maximum size database would be 2147483646 pages at 65536 bytes per
    42     43   page or 140,737,488,224,256 bytes (about 140 terabytes).  Usually SQLite will
    43     44   hit the maximum file size limit of the underlying filesystem or disk
    44     45   hardware long before it hits its own internal size limit.</p>
    45     46   
    46     47   <p>In common use, SQLite databases tend to range in size from a few kilobytes
    47         -to a few gigabytes.</p>
           48  +to a few gigabytes, though terabyte-size SQLite databases are known to exist
           49  +in production.</p>
    48     50   
    49     51   <p>At any point in time, every page in the main database has a single
    50     52   use which is one of the following:
    51     53   <ul>
    52     54   <li>The lock-byte page
    53     55   <li>A freelist page
    54     56   <ul>
................................................................................
   151    153   (in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
   152    154   corresponds to the UTF-8 string "SQLite format 3" including the nul
   153    155   terminator character at the end.</p>
   154    156   
   155    157   <h3>Page Size</h3>
   156    158   
   157    159   <p>The two-byte value beginning at offset 16 determines the page size of 
   158         -the database.  For SQLite versions 3.7.0.1 and earlier, this value is 
          160  +the database.  For SQLite versions 3.7.0.1 ([dateof:3.7.0.1])
          161  +and earlier, this value is 
   159    162   interpreted as a big-endian integer and must be a power of two between
   160    163   512 and 32768, inclusive.  Beginning with SQLite [version 3.7.1]
   161    164   ([dateof:3.7.1]), a page
   162    165   size of 65536 bytes is supported.  The value 65536 will not fit in a
   163    166   two-byte integer, so to specify a 65536-byte page size, the value
   164    167   at offset 16 is 0x00 0x01.
   165    168   This value can be interpreted as a big-endian
................................................................................
   238    241   size if it is available but fall back to the actual file size if
   239    242   the in-header database size is not valid.</p>
   240    243   
   241    244   <p>^The in-header database size is only considered to be valid if
   242    245   it is non-zero and if the 4-byte [change counter] at offset 24
   243    246   exactly matches the 4-byte [version-valid-for number] at offset 92.
   244    247   ^(The in-header database size is always valid 
   245         -when the database is only modified using recent versions of SQLite
   246         -(versions 3.7.0 and later).)^
          248  +when the database is only modified using recent versions of SQLite,
          249  +versions 3.7.0 ([dateof:3.7.0]) and later.)^
   247    250   If a legacy version of SQLite writes to the database, it will not
   248    251   know to update the in-header database size and so the in-header
   249    252   database size could be incorrect.  But legacy versions of SQLite
   250    253   will also leave the version-valid-for number at offset 92 unchanged
   251    254   so it will not match the change-counter.  Hence, invalid in-header
   252    255   database sizes can be detected (and ignored) by observing when
   253    256   the change-counter does not match the version-valid-for number.</p>
................................................................................
   397    400   implementations may choose to read or write bytes on the lock-byte 
   398    401   page according to the 
   399    402   needs and proclivities of the underlying system.  The unix and win32
   400    403   [VFS] implementations that come built into SQLite do not write to the
   401    404   lock-byte page, but third-party VFS implementations for
   402    405   other operating systems might.</p>
   403    406   
   404         -<p>The lock-byte page arose from the need to support Win95 which had
   405         -only mandatory file locking.  All modern operating systems that we know of
   406         -support advisory file locking, and so the lock-byte page is not really
   407         -needed any more, but is retained for backwards compatibility.</p>
          407  +<p>The lock-byte page arose from the need to support Win95 which was the
          408  +predominant operating system when this file format was designed and which 
          409  +only supported mandatory file locking.  All modern operating systems that
          410  +we know of support advisory file locking, and so the lock-byte page is
          411  +not really needed any more, but is retained for backwards compatibility.</p>
   408    412   
   409    413   <tcl>hd_fragment {freelist} {freelist} {free-page list}</tcl>
   410    414   <h2>The Freelist</h2>
   411    415   
   412    416   <p>A database file might contain one or more pages that are not in
   413    417   active use.  Unused pages can come about, for example, when information
   414    418   is deleted from the database.  Unused pages are stored on the freelist
................................................................................
   428    432   ^(Call the second integer on a freelist trunk page L.
   429    433   If L is greater than zero then integers with array indexes between 2 and
   430    434   L+1 inclusive contain page numbers for freelist leaf pages.)^</p>
   431    435   
   432    436   <p>Freelist leaf pages contain no information.  ^SQLite avoids reading or
   433    437   writing freelist leaf pages in order to reduce disk I/O.</p>
   434    438   
   435         -<p>A bug in SQLite versions prior to 3.6.0 caused the database to be
          439  +<p>A bug in SQLite versions prior to 3.6.0 ([dateof:3.6.0])
          440  +caused the database to be
   436    441   reported as corrupt if any of the last 6 entries in the freelist trunk page 
   437    442   array contained non-zero values.  Newer versions of SQLite do not have
   438    443   this problem.  ^However, newer versions of SQLite still avoid using the 
   439    444   last six entries in the freelist trunk page array in order that database
   440    445   files created by newer versions of SQLite can be read by older versions
   441    446   of SQLite.</p>
   442    447   
................................................................................
  1359   1364   WITHOUT ROWID table.)^
  1360   1365   
  1361   1366   <tcl>hd_fragment stat2tab {sqlite_stat2}</tcl>
  1362   1367   <h3>The sqlite_stat2 table</h3>
  1363   1368   
  1364   1369   <p>The sqlite_stat2 is only created and is only used if SQLite is compiled
  1365   1370   with SQLITE_ENABLE_STAT2 and if the SQLite version number is between
  1366         -3.6.18 and 3.7.8.  The sqlite_stat2 table is neither read nor written by any
         1371  +3.6.18 ([dateof:3.6.18]) and 3.7.8 ([dateof:3.7.8]).
         1372  +The sqlite_stat2 table is neither read nor written by any
  1367   1373   version of SQLite before 3.6.18 nor after 3.7.8.
  1368   1374   The sqlite_stat2 table contains additional information
  1369   1375   about the distribution of keys within an index.
  1370   1376   The schema of the sqlite_stat2 table is as follows:
  1371   1377   
  1372   1378   <blockquote><pre>
  1373   1379   CREATE TABLE sqlite_stat2(tbl,idx,sampleno,sample);
................................................................................
  1397   1403   sqlite_stat2 table, if is exists, is simply ignored.
  1398   1404   
  1399   1405   <tcl>hd_fragment stat3tab {sqlite_stat3} SQLITE_STAT3</tcl>
  1400   1406   <h3>The sqlite_stat3 table</h3>
  1401   1407   
  1402   1408   <p>The sqlite_stat3 is only used if SQLite is compiled
  1403   1409   with [SQLITE_ENABLE_STAT3] or [SQLITE_ENABLE_STAT4]
  1404         -and if the SQLite version number is 3.7.9 or greater.
         1410  +and if the SQLite version number is 3.7.9 ([dateof:3.7.9]) or greater.
  1405   1411   The sqlite_stat3 table is neither read nor written by any
  1406   1412   version of SQLite before 3.7.9.
  1407   1413   If the [SQLITE_ENABLE_STAT4] compile-time option is used and the
  1408         -SQLite version number is 3.8.1 or greater, then sqlite_stat3 might
  1409         -be read but not written.
         1414  +SQLite version number is 3.8.1 ([dateof:3.8.1]) or greater,
         1415  +then sqlite_stat3 might be read but not written.
  1410   1416   The sqlite_stat3 table contains additional information
  1411   1417   about the distribution of keys within an index, information that the
  1412   1418   query planner can use to devise better and faster query algorithms.
  1413   1419   ^(The schema of the sqlite_stat3 table is as follows:
  1414   1420   
  1415   1421   <blockquote><pre>
  1416   1422   CREATE TABLE sqlite_stat3(tbl,idx,nEq,nLt,nDLt,sample);
................................................................................
  1441   1447   sample S1 must have a smaller rowid than sample S2.)^
  1442   1448   
  1443   1449   <tcl>hd_fragment stat4tab {sqlite_stat4} SQLITE_STAT4</tcl>
  1444   1450   <h3>The sqlite_stat4 table</h3>
  1445   1451   
  1446   1452   <p>The sqlite_stat4 is only created and is only used if SQLite is compiled
  1447   1453   with [SQLITE_ENABLE_STAT4] and if the SQLite version number is
  1448         -3.8.1 or greater.  The sqlite_stat4 table is neither read nor written by any
         1454  +3.8.1 ([dateof:3.8.1]) or greater.
         1455  +The sqlite_stat4 table is neither read nor written by any
  1449   1456   version of SQLite before 3.8.1.
  1450   1457   The sqlite_stat4 table contains additional information
  1451   1458   about the distribution of keys within an index or the distribution of
  1452   1459   keys in the primary key of a [WITHOUT ROWID] table.
  1453   1460   The query planner can sometimes use the additional information in
  1454   1461   the sqlite_stat4 table to devise better and faster query algorithms.
  1455   1462   ^(The schema of the sqlite_stat4 table is as follows:

Changes to pages/formatchng.in.

     1      1   <title>File Format Changes in SQLite</title>
     2      2   
     3      3   <h2>File Format Changes in SQLite</h2>
     4      4   
     5      5   <p>
     6      6   The [file format|underlying file format] for SQLite databases does not
     7         -change in incompatible ways.  There are literally tens of billions of
            7  +change in incompatible ways.  There are literally hundredss of billions,
            8  +perhaps trillions, of
     8      9   SQLite database files in circulation and the SQLite developers are
     9     10   committing to supporting those files for decades into the future.
    10     11   </p>
    11     12   
    12     13   <p>
    13     14   This document describes incompatibilities that have occurred in
    14     15   SQLite prior to 2004.  Since 2004, there have been enhancements to
................................................................................
    33     34   <tr>
    34     35     <th>Version Change</th>
    35     36     <th>Approx. Date</th>
    36     37     <th>Description Of File Format Change</th>
    37     38   </tr>
    38     39   <tr>
    39     40     <td valign="top">1.0.32 to 2.0.0</td>
    40         -  <td valign="top">2001-Sep-20</td>
           41  +  <td valign="top">2001-09-20</td>
    41     42     <td>Version 1.0.X of SQLite used the GDBM library as its backend
    42     43     interface to the disk.  Beginning in version 2.0.0, GDBM was replaced
    43     44     by a custom B-Tree library written especially for SQLite.  The new
    44     45     B-Tree backend is twice as fast as GDBM, supports atomic commits and
    45     46     rollback, and stores an entire database in a single disk file instead
    46     47     using a separate file for each table as GDBM does.  The two
    47     48     file formats are not even remotely similar.</td>
    48     49   </tr>
    49     50   <tr>
    50     51     <td valign="top">2.0.8 to 2.1.0</td>
    51         -  <td valign="top">2001-Nov-12</td>
           52  +  <td valign="top">2001-10-12</td>
    52     53     <td>The same basic B-Tree format is used but the details of the 
    53     54     index keys were changed in order to provide better query 
    54     55     optimization opportunities.  Some of the headers were also changed in order
    55     56     to increase the maximum size of a row from 64KB to 24MB.<p>
    56     57   
    57     58     This change is an exception to the version number rule described above
    58     59     in that it is neither forwards or backwards compatible.  A complete
    59     60     reload of the database is required.  This is the only exception.</td>
    60     61   </tr>
    61     62   <tr>
    62     63     <td valign="top">2.1.7 to 2.2.0</td>
    63         -  <td valign="top">2001-Dec-21</td>
           64  +  <td valign="top">2001-12-21</td>
    64     65     <td>Beginning with version 2.2.0, SQLite no longer builds an index for
    65     66     an INTEGER PRIMARY KEY column.  Instead, it uses that column as the actual
    66     67     B-Tree key for the main table.<p>Version 2.2.0 and later of the library
    67     68     will automatically detect when it is reading a 2.1.x database and will
    68     69     disable the new INTEGER PRIMARY KEY feature.   In other words, version
    69     70     2.2.x is backwards compatible to version 2.1.x.  But version 2.1.x is not
    70     71     forward compatible with version 2.2.x. If you try to open
................................................................................
    72     73     an INTEGER PRIMARY KEY, you will likely get a coredump.  If the database
    73     74     schema does not contain any INTEGER PRIMARY KEYs, then the version 2.1.x
    74     75     and version 2.2.x database files will be identical and completely
    75     76     interchangeable.</p>
    76     77   </tr>
    77     78   <tr>
    78     79     <td valign="top">2.2.5 to 2.3.0</td>
    79         -  <td valign="top">2002-Jan-30</td>
           80  +  <td valign="top">2002-01-30</td>
    80     81     <td>Beginning with version 2.3.0, SQLite supports some additional syntax
    81     82     (the "ON CONFLICT" clause) in the CREATE TABLE and CREATE INDEX statements
    82     83     that are stored in the SQLITE_MASTER table.  If you create a database that
    83     84     contains this new syntax, then try to read that database using version 2.2.5
    84     85     or earlier, the parser will not understand the new syntax and you will get
    85     86     an error.  Otherwise, databases for 2.2.x and 2.3.x are interchangeable.</td>
    86     87   </tr>
    87     88   <tr>
    88     89     <td valign="top">2.3.3 to 2.4.0</td>
    89         -  <td valign="top">2002-Mar-10</td>
           90  +  <td valign="top">2002-03-10</td>
    90     91     <td>Beginning with version 2.4.0, SQLite added support for views. 
    91     92     Information about views is stored in the SQLITE_MASTER table.  If an older
    92     93     version of SQLite attempts to read a database that contains VIEW information
    93     94     in the SQLITE_MASTER table, the parser will not understand the new syntax
    94     95     and initialization will fail.  Also, the
    95     96     way SQLite keeps track of unused disk blocks in the database file
    96     97     changed slightly.
    97     98     If an older version of SQLite attempts to write a database that
    98     99     was previously written by version 2.4.0 or later, then it may leak disk
    99    100     blocks.</td>
   100    101   </tr>
   101    102   <tr>
   102    103     <td valign="top">2.4.12 to 2.5.0</td>
   103         -  <td valign="top">2002-Jun-17</td>
          104  +  <td valign="top">2002-06-17</td>
   104    105     <td>Beginning with version 2.5.0, SQLite added support for triggers. 
   105    106     Information about triggers is stored in the SQLITE_MASTER table.  If an older
   106    107     version of SQLite attempts to read a database that contains a CREATE TRIGGER
   107    108     in the SQLITE_MASTER table, the parser will not understand the new syntax
   108    109     and initialization will fail.
   109    110     </td>
   110    111   </tr>
   111    112   <tr>
   112    113     <td valign="top">2.5.6 to 2.6.0</td>
   113         -  <td valign="top">2002-July-17</td>
          114  +  <td valign="top">2002-07-17</td>
   114    115     <td>A design flaw in the layout of indices required a file format change
   115    116     to correct.  This change appeared in version 2.6.0.<p>
   116    117   
   117    118     If you use version 2.6.0 or later of the library to open a database file
   118    119     that was originally created by version 2.5.6 or earlier, an attempt to
   119    120     rebuild the database into the new format will occur automatically.
   120    121     This can take some time for a large database.  (Allow 1 or 2 seconds
................................................................................
   127    128     Version 2.6.0 or later of the library cannot open read-only database
   128    129     files from version 2.5.6 or earlier, since read-only files cannot be
   129    130     upgraded to the new format.</p>
   130    131     </td>
   131    132   </tr>
   132    133   <tr>
   133    134     <td valign="top">2.6.3 to 2.7.0</td>
   134         -  <td valign="top">2002-Aug-13</td>
          135  +  <td valign="top">2002-08-13</td>
   135    136     <td><p>Beginning with version 2.7.0, SQLite understands two different
   136    137     datatypes: text and numeric.  Text data sorts in memcmp() order.
   137    138     Numeric data sorts in numerical order if it looks like a number,
   138    139     or in memcmp() order if it does not.</p>
   139    140   
   140    141     <p>When SQLite version 2.7.0 or later opens a 2.6.3 or earlier database,
   141    142     it assumes all columns of all tables have type "numeric".  For 2.7.0
................................................................................
   148    149     2.7.0 and later database.  Hence version 2.6.3 and earlier of SQLite 
   149    150     will be unable to read a 2.7.0 or later database.  But version 2.7.0
   150    151     and later of SQLite will read earlier databases.</p>
   151    152     </td>
   152    153   </tr>
   153    154   <tr>
   154    155     <td valign="top">2.7.6 to 2.8.0</td>
   155         -  <td valign="top">2003-Feb-14</td>
          156  +  <td valign="top">2003-02-14</td>
   156    157     <td><p>Version 2.8.0 introduces a change to the format of the rollback
   157    158     journal file.  The main database file format is unchanged.  Versions
   158    159     2.7.6 and earlier can read and write 2.8.0 databases and vice versa.
   159    160     Version 2.8.0 can rollback a transaction that was started by version
   160    161     2.7.6 and earlier.  But version 2.7.6 and earlier cannot rollback a
   161    162     transaction started by version 2.8.0 or later.</p>
   162    163   
................................................................................
   166    167     earlier.  The 2.7.6 code will not be able to read the journal file
   167    168     and thus will not be able to rollback the incomplete transaction
   168    169     to restore the database.</p>
   169    170     </td>
   170    171   </tr>
   171    172   <tr>
   172    173     <td valign="top">2.8.14 to 3.0.0</td>
   173         -  <td valign="top">2004-Jun-18</td>
          174  +  <td valign="top">2004-06-18</td>
   174    175     <td><p>Version 3.0.0 is a major upgrade for SQLite that incorporates
   175    176     support for UTF-16, BLOBs, and a more compact encoding that results
   176    177     in database files that are typically 25% to 50% smaller.  The new file
   177    178     format is very different and is completely incompatible with the
   178    179     version 2 file format.</p>
   179    180     </td>
   180    181   </tr>
   181    182   </table>
   182    183   </blockquote>