Documentation Source Text

Check-in [9a9955c381]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Modify some requirements in fileformat.html to make them more testable.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 9a9955c381353c1f05b8ce23ffe87da3f8f68c69
User & Date: dan 2009-05-26 04:32:49
Context
2009-05-26
15:13
Add a "todo" note to fix the schema file-format range. Also fix another todo in fileformat.html. check-in: 14e677d1dc user: dan tags: trunk
04:32
Modify some requirements in fileformat.html to make them more testable. check-in: 9a9955c381 user: dan tags: trunk
2009-05-25
12:47
Enhancements to the news.html text for the 3.6.14.2 release. check-in: f286448352 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fileformat.in.

209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
...
247
248
249
250
251
252
253
254
255
256
257
258

259
260
261
262
263
264
265
...
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
...
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
...
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
...
450
451
452
453
454
455
456






457
458
459
460
461
462
463
...
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531





532
533
534
535
536
537
538







539
540
541
542
543
544
545
...
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
...
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
...
639
640
641
642
643
644
645
646
647
648
649
650
651
652

653
654
655
656
657
658
659
...
661
662
663
664
665
666
667

668
669
670
671
672
673
674
...
675
676
677
678
679
680
681

682
683
684
685
686
687
688
689

690
691
692
693
694
695
696

697
698
699
700
701
702
703
704

705
706
707
708
709
710
711
712

713
714
715
716
717
718
719
...
722
723
724
725
726
727
728

729
730
731
732

733
734
735
736

737
738
739
740
741
742

743
744
745
746
747
748
749
750
751
752
753
754
755
...
759
760
761
762
763
764
765
766
767
768
769
770

771
772
773
774
775
776
777

778
779
780
781
782
783

784
785
786
787

788
789
790
791
792
793
794
795
796
797
798

799
800
801
802
803
804
805
...
832
833
834
835
836
837
838
839
840

841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859






860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
...
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
....
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
....
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
....
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
....
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
....
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
....
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
....
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
....
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
....
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
....
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
....
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
....
2322
2323
2324
2325
2326
2327
2328
2329
2330
2331
2332
2333
2334
2335
2336
          readers and writers.
    </ul>

  [h2 "Glossary"]
    <table id=glossary>
      <tr><td>Auto-vacuum last root-page<td>
        A page number stored as 32-bit integer at byte offset 52 of the
        database file header (see section <cite>database_header</cite>). In
        an auto-vacuum database, this is the numerically largest 
        <i>root-page</i> number in the database. Additionally, all pages that
        occur before this page in the database are either B-Tree <i>root
        pages</i>, <i>pointer-map pages</i> or the <i>locking page</i>.

      <tr><td>Auto-vacuum database      <td>
        Each database is either an auto-vacuum database or a non auto-vacuum
        database. Auto-vacuum databases feature pointer-map pages (section
        <cite>pointer_map_pages</cite>) and have a non-zero value stored
        as a 4-byte big-endian integer at offset 52 of the file header (section
        <cite>database_header</cite>).
      <tr><td>B-Tree                    <td>
        A B-Tree is a tree structure optimized for offline storage. The table
        and index data in an SQLite database file is stored in B-Tree
        structures.

      <tr><td>B-Tree cell               <td>
................................................................................
      <tr><td>Cell content area         <td>
        The area within a B-Tree page in which the B-Tree cells are stored.

      <tr><td>(Database) text encoding  <td>
        The text encoding used for all text values in the database file. One
        of UTF-8, big-endian UTF-16 and little-endian UTF-16. The database
        text encoding is defined by a 4 byte field stored at byte offset
        56 of the database file header (see section <cite>database_header</cite>).

      <tr><td>(Database) file header    <td>
        The first 100 bytes of an SQLite database file constitute the
        database file header.


      <tr><td>(Database) page size      <td>
        An SQLite database file is divided into one or more pages of
        page-size bytes each.

      <tr><td>Database record           <td>
        A database record is a blob of data containing the serialized
................................................................................
        A 32-bit integer field stored at byte offset 48 of the database file
        header (see section <cite>database_header</cite>).

      <tr><td style="white-space:nowrap">(Database) usable page size <td>
        The number of bytes of each database page that is usable. This
        is the page-size less the number of bytes left unused at the end
        of each page. The number of bytes left unused is governed by the
        value stored at offset 20 of the file header (see section
        <cite>database_header</cite>).

      <tr><td>File format read version  <td>
        Single byte field stored at byte offset 20 of the database file header
        (see section <cite>database_header</cite>).

      <tr><td>File format write version  <td>
        Single byte field stored at byte offset 19 of the database file header
        (see section <cite>database_header</cite>).

      <tr><td>File change counter       <td>
        A 32-bit integer field stored at byte offset 24 of the database file
        header (see section <cite>database_header</cite>). Normally, SQLite
        increments this value each time it commits a transaction.

................................................................................
        database properties that may be set by the user (auto-vacuum,
        page-size, user-cookie value etc.),

      <tr><td>Non-auto-vacuum database  <td>
        Any database that is not an auto-vacuum database. A non-auto-vacuum
        database contains no pointer-map pages and has a zero value stored
        in the 4-byte big-endian integer field at offset 52 of the database
        file header (section <cite>database_header</cite>).

      <tr><td>Overflow chain             <td>
        A linked list of overflow pages across which a single (large)
        database record is stored (see section 
        <cite>overflow_page_chains</cite>).

      <tr><td>Overflow page             <td>
................................................................................

      <tr><td>Root page                 <td>
        A root page is a database page used to store the root node of a
        B-Tree data structure.

      <tr><td>Schema layer file format  <td>
        An integer between 1 and 4 stored as a 4 byte big-endian integer at
        offset 44 of the file header (section <cite>database_header</cite>).
        Certain file format constructions may only be present in databases
        with a certain minimum schema layer file format value.

      <tr><td>Schema table              <td>
        The table B-Tree with root-page 1 used to store database records
        describing the database schema. Accessible as the "sqlite_master" 
        table from within SQLite.
................................................................................
	a journal file. It contains a full file-system path identifying 
	a master-journal file.
      }]
      [Glossary "Database File-System Representation" {
        A file or files within the file-system used to store an SQLite 
        database image.
      }]







    </table>

<!--
h1 "SQLite Database Files" sqlite_database_files
 
  <p>
................................................................................
    <li>The database schema: The set of database tables, virtual tables, 
        indexes, triggers and views stored in the database.

    <li>The database contents: The set of tuples (rows) stored in
	each database table.

    <li>Other database properties, as follows:
      <ul>
	<li>The page-size of the database.
	<li>The text-encoding of the database.
	<li>A flag indicating whether or not the database is an auto-vacuum 
	    database.
	<li>The value of the database user-cookie.
	<li>If the database is an auto-vacuum database, a flag indicating 
	    whether or not the database is in incremental vacuum mode or not.
	<li>The default page cache size in pages to use with the database (an 
	    integer field).
      </ul>
  </ul>






  <p class=todo>
    The concept of a logical database and its contents should be defined
    properly in some requirements document so that it can be referenced from
    here and other places. The definition will be something like the list of
    bullet points above.








  <p>
    A well-formed SQLite database image is defined as an image for which
    all of the statements itemized as requirements within this section
    are true. <span class=todo>mention the requirements numbering scheme
    here.</span> A software system that wishes to interoperate with other
    systems using the SQLite database image format should only ever
    output well-formed SQLite databases. In the case of SQLite itself,
................................................................................
      for a single key or range of keys. As implemented in SQLite, each
      entry may be associated with a blob of data that is not part of the
      key. For the canonical introduction to the B-Tree and its variants, 
      refer to reference <cite>ref_comer_btree</cite>. The B-Tree 
      implementation in SQLite also adopts some of the enhancements 
      suggested in <cite>ref_knuth_btree</cite>.
    <p>
      An SQLite database file contains one or more B-Tree structures. Each
      B-Tree structure stores the data for a single database table or 
      index. Hence each database file contains a single B-Tree to store
      the contents of the <i>sqlite_master</i> table, and one B-Tree
      for each database table or index created by the user. If the database
      uses auto-increment integer primary keys, then the database file
      also contains a B-Tree to store the contents of the automatically 
      created <i>sqlite_sequence</i> table.
................................................................................
      The following sections and sub-sections describe precisely the format
      used to serialize the B-Tree structures within an SQLite database image.

  [h2 "Global Structure"]

    [h3 "Database Header" "database_header"]
      <p>
        An SQLite database image begins with a 100-byte database header. The header
        file consists of a well known 16-byte sequence followed by a series of
        1, 2 and 4 byte unsigned integers. All integers in the file header (as
        well as the rest of the database file) are stored in big-endian format.
        
      <p>
        The well known 16-byte sequence that begins every SQLite database file
        is:
      <pre>
          0x53 0x51 0x4c 0x69 0x74 0x65 0x20 0x66 0x6f 0x72 0x6d 0x61 0x74 0x20 0x33 0x00</pre>
................................................................................
        Interpreted as UTF-8 encoded text, this byte sequence corresponds 
        to the string "SQLite format 3" followed by a nul-terminator byte.

          [fileformat_import_requirement2 H30030]

      <p>
        The 1, 2 and 4 byte unsigned integers that make up the rest of the
        database file header are described in the following table.

      [Table]
        [Tr]<th>Byte Range <th>Byte Size <th width=100%>Description
	[Tr]<td>16..17 <td>2<td>
            Database page size in bytes. See section 
            <cite>pages_and_page_types</cite> for details.


        [Tr]<td>18     <td>1<td>
            <p style="margin-top:0">
            File-format "write version". Currently, this field
            is always set to 1. If a value greater than 1 is read by SQLite,
            then the library will only open the file for read-only access.

................................................................................
            This field and the next one are intended to be used for 
            forwards compatibility, should the need ever arise. If in the
            future a version of SQLite is created that uses a file format
            that may be safely read but not written by older versions of
            SQLite, then this field will be set to a value greater than 1
            to prevent older SQLite versions from writing to a file that
            uses the new format. 


        [Tr]<td>19     <td>1<td>
            <p style="margin-top:0">
             File-format "read version". Currently, this 
            field is always set to 1. If a value greater than 1 is read 
            by SQLite, then the library will refuse to open the database 

................................................................................
            <p style="margin-bottom:0">
            Like the "write version" described above, this field exists
            to facilitate some degree of forwards compatibility, in case
            it is ever required. If a version of SQLite created in the 
            future uses a file format that may not be safely read by older
            SQLite versions, then this field will be set to a value greater
            than 1.


        [Tr]<td>20     <td>1<td>
            Number of bytes of unused space at the end of each database
            page. Usually this field is set to 0. If it is non-zero, then 
            it contains the number of bytes that are left unused at the
            end of every database page (see section
            <cite>pages_and_page_types</cite> for a description of a
            database page).


        [Tr]<td>21     <td>1<td>
             Maximum fraction of an index tree page to use for 
            embedded content. This value is used to determine the maximum
            size of a B-Tree cell to store as embedded content on a
            page that is part of an index B-Tree. Refer to section 
            <cite>index_btree_cell_format</cite> for details.


        [Tr]<td>22     <td>1<td>
            Minimum fraction of an index B-Tree page to use for
            embedded content when an entry uses one or more overflow pages.
            This value is used to determine the portion of a B-Tree cell 
            that requires one or more overflow pages to store as embedded
            content on a page that is part of an index B-Tree. Refer to
            section <cite>index_btree_cell_format</cite> for details.


        [Tr]<td>23     <td>1<td>
            Minimum fraction of an table B-Tree leaf page to use for
            embedded content when an entry uses one or more overflow pages.
            This value is used to determine the portion of a B-Tree cell 
            that requires one or more overflow pages to store as embedded
            content on a page that is a leaf of a table B-Tree. Refer to
            section <cite>table_btree_cell_format</cite> for details.


        [Tr]<td>24..27 <td>4<td>
            <p style="margin-top:0">
            The file change counter. Each time a database transaction is
            committed, the value of the 32-bit unsigned integer stored in
            this field is incremented.
            <p style="margin-bottom:0">
................................................................................
            a portion of the file cached in memory. However, since the file
            is unlocked, another process may use SQLite to modify the 
            contents of the file, invalidating the internal cache of the
            first process. When the file is relocked, the first process can
            check if the value of the file change counter has been modified
            since the file was unlocked. If it has not, then the internal
            cache may be assumed to be valid and may be reused.


        [Tr]<td>32..35 <td>4<td>
            Page number of first freelist trunk page. 
            For more details, refer to section <cite>free_page_list</cite>.


        [Tr]<td>36..39 <td>4<td>
            Number of free pages in the database file.
            For more details, refer to section <cite>free_page_list</cite>.


        [Tr]<td>40..43 <td>4<td>
            The schema version. Each time the database schema is modified (by
            creating or deleting a database table, index, trigger or view)
            the value of the 32-bit unsigned integer stored in this field
            is incremented.


        [Tr]<td>44..47 <td>4<td>
            <p style="margin-top:0">
            Schema layer file-format. This value is similar to the
            "read-version" and "write-version" fields at offsets 18 and 19
            of the database file header. If SQLite encounters a database
            with a schema layer file-format value greater than the file-format
            that it understands (currently 4), then SQLite will refuse to
            access the database.
            <p>
            Usually, this value is set to 1. However, if any of the following
            file-format features are used, then the schema layer file-format
            must be set to the corresponding value or greater:
................................................................................
              <li> Implicit default (non-NULL) values at the end of table
                   records (see section <cite>table_btree_content</cite>).
              <li> Descending indexes (see section
                   <cite>index_btree_compare_func</cite>) and Boolean values
                   in database records (see section <cite>record_format</cite>,
                   serial types 8 and 9).
            </ol>
            

        [Tr]<td>48..51 <td>4<td>
            Default pager cache size. This field is used by SQLite to store
            the recommended pager cache size to use for the database.


        [Tr]<td>52..55 <td>4<td>
            For auto-vacuum capable databases, the numerically largest 
            root-page number in the database. Since page 1 is always the
            root-page of the schema table (section <cite>schema_table</cite>),
            this value is always non-zero for auto-vacuum databases. For
            non-auto-vacuum databases, this value is always zero.


        [Tr]<td>56..59 <td>4<td>
            (constant) Database text encoding. A value of 1 means all 
            text values are stored using UTF-8 encoding. 2 indicates
            little-endian UTF-16 text. A value of 3 means that the database
            contains big-endian UTF-16 text.  


        [Tr]<td>60..63 <td>4<td>
            The user-cookie value. A 32-bit integer value available to the
            user for read/write access.


        [Tr]<td>64..67 <td>4<td>
            The incremental-vacuum flag. In non-auto-vacuum databases this
            value is always zero. In auto-vacuum databases, this field is
            set to 1 if "incremental vacuum" mode is enabled. If incremental
            vacuum mode is not enabled, then the database file is reorganized
            so that it contains no free pages (section
            <cite>free_page_list</cite>) at the end of each database
            transaction. If incremental vacuum mode is enabled, then the
            reorganization is not performed until explicitly requested
            by the user.


      </table>

      <p>
        The four byte block beginning at offset 28 is unused. As is the
        32 byte block beginning at offset 68.
      </p>
................................................................................
      </ul>

      <p>
        The following requirement encompasses all of the above.

          [fileformat_import_requirement2 H30040]


      <p>

        Following the 16 byte magic string in the file header is the
        <i>page size</i>, a 2-byte field. See section
        <cite>pages_and_page_types</cite> for details.

          [fileformat_import_requirement2 H30100]

      <p>
        Following the <i>file change counter</i> in the database header are
        two 4-byte fields related to the database file <i>free page list</i>.
        See section <cite>free_page_list</cite> for details.

          [fileformat_import_requirement2 H30110]
          [fileformat_import_requirement2 H30120]
          [fileformat_import_requirement2 H30130]
          [fileformat_import_requirement2 H30140]
          [fileformat_import_requirement2 H30150]
          [fileformat_import_requirement2 H30160]
          [fileformat_import_requirement2 H30170]
          [fileformat_import_requirement2 H30180]







    [h3 "Pages and Page Types" "pages_and_page_types"]
      <p>
        The entire database file is divided into pages, each page consisting
        of <i>page-size</i> bytes, where <i>page-size</i> is the 2-byte 
        integer value stored at offset 16 of the file header (see above).
        The <i>page-size</i> is always a power of two between 512 
        (2<sup>9</sup>) and 32768 (2<sup>15</sup>). SQLite database files
        always consist of an exact number of pages.
      <p>
        Pages are numbered beginning from 1, not 0. Page 1 consists of
        the first <i>page-size</i> bytes of the database file. The file header
        described in the previous section consumes the first 100 bytes of page
        1.
      <p>
        Each page of the database file is one of the following:
      <ul>
        <li><b>A B-Tree page</b>. B-Tree pages are part of the tree 
            structures used to store database tables and indexes.
        <li><b>An overflow page</b>. Overflow pages are used by particularly
            large database records that do not fit on a single B-Tree page.
        <li><b>A free page</b>. Free pages are pages within the database file
            that are not being used to store meaningful data.
        <li><b>A "pointer-map" page</b>. In auto-vacuum capable databases
            (databases for which the 4 byte big-endian integer stored at
            byte offset 52 of the file header is non-zero) some pages are
            permanently designated "pointer-map" pages. See section 
            <cite>pointer_map_pages</cite> for details.
        <li><b>The locking page</b>. The database page that starts at
            byte offset 2<sup>30</sup>, if it is large enough to contain
            such a page, is always left unused.
      </ul>

          [fileformat_import_requirement2 H30190]
          [fileformat_import_requirement2 H30200]
          [fileformat_import_requirement2 H30210]
          [fileformat_import_requirement2 H30220]
        

    [h3 "The Schema Table" schema_table]
      <p>
................................................................................
        be used as pointer-map pages (see section
        <cite>pointer_map_pages</cite>).
      <p>
        As noted in section <cite>database_header</cite>, in an auto-vacuum
        database the page number of the page immediately following the
        final root page in the contiguous set of root pages is stored
        as a 4 byte big-endian integer at byte offset 52 of the database
        file header. Unless that page is itself a pointer-map page, in which
        case the page number of the page following it is stored instead.

      <p>
        For example, if the schema of a logical database is created using
        the following SQL statements:
      <pre>
          CREATE TABLE abc(a, b, c);
................................................................................
            <td>An SQL real value (type SQLITE_FLOAT), stored as an
                8-byte IEEE floating point value.
        [Tr]<td>8
            <td>The literal SQL integer 0 (type SQLITE_INTEGER). The value 
                consumes zero bytes of space in the record's data area.
                Values of this type are only present in databases with
                a schema file format (the 32-bit integer at byte offset 44
                of the database file header) value of 4 or greater.

        [Tr]<td>9
            <td>The literal SQL integer 1 (type SQLITE_INTEGER). The value
                consumes zero bytes of space in the record's data area.
                Values of this type are only present in databases with
                a schema file format (the 32-bit integer at byte offset 44
                of the database file header) value of 4 or greater.

        [Tr]<td style="white-space:nowrap"><i>bytes</i> * 2 + 12
            <td>Even values greater than 12 are used to signify a blob of
                data (type SQLITE_BLOB) (<i>n</i>-12)/2 bytes in length, where
                <i>n</i> is the integer value stored in the record header.
                
        [Tr]<td style="white-space:nowrap"><i>bytes</i> * 2 + 13
................................................................................
        in the records data area. If the corresponding integer type value
        in the record header is 0 (NULL), 8 (integer value 0) or 9 (integer
        value 1), then the blob of data is zero bytes in length. Otherwise,
        the length of the data field is as described in the table above.
      <p>
        The data field associated with a string value contains the string
        encoded using the database encoding, as defined in the database
        file header (see section <cite>database_header</cite>). No 
        nul-terminator character is stored in the database.

          [fileformat_import_requirement2 H30560]
          [fileformat_import_requirement2 H30570]
          [fileformat_import_requirement2 H30580]
          [fileformat_import_requirement2 H30590]
          [fileformat_import_requirement2 H30600]
................................................................................
          the cell if it consists of less than:
        <pre>
            <i>max-local</i> := (<i>usable-size</i> - 12) * <i>max-embedded-fraction</i> / 255 - 23
</pre>
        <p>
          bytes. In the formula above, <i>usable-size</i> is the page-size
          in bytes less the number of unused bytes left at the end of every
          page (as read from byte offset 20 of the file header), and
          <i>max-embedded-fraction</i> is the value read from byte offset 
          21 of the file header.
        [Figure indexshortrecord.gif figure_indexshortrecord "Small Record Index B-Tree Cell"]
        <p>
          If the cell record is larger than the maximum size identified by
          the formula above, then only the first part of the record is stored
          within the cell. The remainder is stored in an overflow-chain (see
          section <cite>overflow_page_chains</cite> for details). Following 
          the part of the record stored within the cell is the page number 
................................................................................
            <i>local-size</i> := <i>min-local</i> + (<i>record-size</i> - <i>min-local</i>) % (<i>usable-size</i> - 4)
            if( <i>local-size</i> &gt; <i>max-local</i> )
                <i>local-size</i> := <i>min-local</i>
</pre>
        <p>
          In the formula above, <i>usable-size</i> is the page-size
          in bytes less the number of unused bytes left at the end of every
          page (as read from byte offset 20 of the file header), and
          <i>max-embedded-fraction</i> and <i>min-embedded-fraction</i> are
          the values read from byte offsets 21 and 22 of the file header,
          respectively.
        [Figure indexlongrecord.gif figure_indexlongrecord "Large Record Index B-Tree Cell"]

          [fileformat_import_requirement2 H30970]
          [fileformat_import_requirement2 H30980]
          [fileformat_import_requirement2 H30990]
          [fileformat_import_requirement2 H31000]
................................................................................

      <p>
        Requirements H31010 and H30990 are similar to the algorithms 
        presented in the text above. However instead of 
        <i>min-embedded-fraction</i> and <i>max-embedded-fraction</i> the
        requirements use the constant values 32 and 64, as well-formed 
        database files are required by H30080 and H30070 to store these 
        values in the relevant database file header fields.

    [h3 "Table B-Trees" table_btrees]
      <p>
        As noted in section <cite>fileformat_overview</cite>, table B-Trees
        store a set of unique 64-bit signed integer keys. Associated with
        each key is a database record. As with index B-Trees, the database
        file pages that make up a table B-Tree are organized into a tree
................................................................................
          <li>The first byte of the page-header, the "flags" field, is set to 
              0x05 for internal tree node pages, and 0x0D for leaf pages.
          <li>The content and format of the B-Tree cells is different. See
              section <cite>table_btree_cell_format</cite> for details.
          <li>The format of page 1 is the same as any other table B-Tree,
              except that 100 bytes less than usual is available for content.
              The first 100 bytes of page 1 is consumed by the database
              file header.
        </ul>

          [fileformat_import_requirement2 H31130]
          [fileformat_import_requirement2 H31140]
        
      <p>
        Most of the requirements specified in section 
................................................................................
          sufficiently small is defined as less than or equal to:
        <pre>
          max-local := <i>usable-size</i> - 35
</pre>
        <p>
          bytes. Where <i>usable-size</i> is defined as the page-size
          in bytes less the number of unused bytes left at the end of every
          page (as read from byte offset 20 of the file header). 
          This scenario, where the entire record is
          stored within the B-Tree cell, is depicted in figure
          <cite>figure_tableshortrecord</cite>.
        [Figure tableshortrecord.gif figure_tableshortrecord "Table B-Tree Small Record Leaf Node Cell"]

        <p>
          If the record is too large to be stored entirely within the B-Tree
................................................................................
            <i>max-local</i> := <i>usable-size</i> - 35
            <i>local-size</i> := <i>min-local</i> + (<i>record-size</i> - <i>min-local</i>) % (<i>usable-size</i> - 4)
            if( <i>local-size</i> &gt; <i>max-local</i> )
                <i>local-size</i> := <i>min-local</i>
</pre>
        <p>
          In this case, <i>min-embedded-fraction</i> is the value read from
          byte offset 22 of the file header. The layout of the cell in this
          case, when an overflow-chain is required, is shown in figure
          <cite>figure_tablelongrecord</cite>.

        [Figure tablelongrecord.gif figure_tablelongrecord "Table B-Tree Large Record Leaf Node Cell"]

        <p>
          If the leaf page is page 1, then the value of <i>usable-size</i> is
................................................................................
        page is:
      <pre>
        <i>available-space</i> := <i>usable-size</i> - 4
</pre>
      <p>
        Where <i>usable-size</i> is defined as the page-size in bytes less the
        number of unused bytes left at the end of every page (as read from 
        byte offset 20 of the file header).
      <p>
        Each overflow page except for the last one in the linked list 
        contains <i>available-space</i> bytes of record data. The last
        page in the list contains the remaining data, starting at byte
        offset 4. The value of the "next page" field on the last page
        in an overflow chain is undefined.

................................................................................
      Each free-list trunk page contains up to:
    <pre>
        <i>max-leaf-pointers</i> := (<i>usable-size</i> - 8) / 4
</pre>
    <p>
      pointers, where <i>usable-size</i> is defined as the page-size in bytes
      less the number of unused bytes left at the end of every page (as read
      from byte offset 20 of the file header).

      [Figure freelistpage.gif figure_freelistpage "Free List Trunk Page Format"]
    <p>
      All trunk pages in the free-list except for the first contain the 
      maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number
      of the first page in the linked list of free-list trunk pages is 
      stored as a 4-byte big-endian unsigned integer at offset 32 of the
      file header (section <cite>database_header</cite>).

          [fileformat_import_requirement2 H31240]
          [fileformat_import_requirement2 H31250]
          [fileformat_import_requirement2 H31260]
          [fileformat_import_requirement2 H31270]
          [fileformat_import_requirement2 H31280]
          [fileformat_import_requirement2 H31290]
          [fileformat_import_requirement2 H31300]

    <p>The following statements govern the two 4-byte big-endian integers
       associated with the <i>free page list</i> structure in the database
       file header.

          [fileformat_import_requirement2 H31310]
          [fileformat_import_requirement2 H31320]
  

  [h2 "Pointer Map Pages" pointer_map_pages]
    <p>
................................................................................
      pointer-map page may therefore contain:
    <pre>
        <i>num-entries</i> := <i>usable-size</i> / 5
</pre>
    <p>
      entries, where <i>usable-size</i> is defined as the page-size in bytes
      less the number of unused bytes left at the end of every page (as read
      from byte offset 20 of the file header).
    <p>
      Assuming the database is auto-vacuum capable, page 2 is always a 
      pointer map page. It contains the pointer map lookup table entries for
      pages 3 through (2 + <i>num-entries</i>), inclusive. The first 5 bytes
      of page 2 contain the pointer map lookup table entry for page 3. Bytes
      5 through 9, inclusive, contain the pointer map lookup table entry
      for page 4, and so on.
................................................................................

[h2 "Reading an SQLite Database" reading_from_files]

  <p>
    As described in section <cite>pages_and_page_types</cite> of this document,
    an SQLite database image is a set of contiguously numbered fixed size 
    pages. The numbering starts at 1, not 0. Page 1 contains the 
    <i>database file header</i> and the root page of the <i>schema table</i>, 
    and all other pages within the database image are somehow referenced
    by number, either directly or indirectly, from page 1, either directly 
    or indirectly. In order to be able to read the database image from within
    the file-system, a database reader needs to be able to ascertain:

  <ol>
    <li> The <i>page-size</i> used by the database image,







|









|







 







|

|
|
|
>







 







|



|



|







 







|







 







|







 







>
>
>
>
>
>







 







|









|

>
>
>
>
>







>
>
>
>
>
>
>







 







|







 







|
|
|







 







|


|



>







 







>







 







>








>







>








>








>







 







>




>




>






>





|







 







|




>







>






>




>











>







 







<

>
|
|
<

|
<
<
<
<
<
<
|
|
|
|
|
|
|
|
>
>
>
>
>
>





|





|













|







<







 







|







 







|






|







 







|







 







|

|







 







|

|







 







|







 







|







 







|







 







|







 







|







 







|







|











|







 







|







 







|







209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
...
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
...
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
...
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
...
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
...
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
...
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
...
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
...
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
...
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
...
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
...
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
...
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
...
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
...
867
868
869
870
871
872
873

874
875
876
877

878
879






880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926

927
928
929
930
931
932
933
....
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
....
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
....
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
....
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
....
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
....
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
....
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
....
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
....
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
....
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
....
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
....
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
....
2355
2356
2357
2358
2359
2360
2361
2362
2363
2364
2365
2366
2367
2368
2369
          readers and writers.
    </ul>

  [h2 "Glossary"]
    <table id=glossary>
      <tr><td>Auto-vacuum last root-page<td>
        A page number stored as 32-bit integer at byte offset 52 of the
        database header (see section <cite>database_header</cite>). In
        an auto-vacuum database, this is the numerically largest 
        <i>root-page</i> number in the database. Additionally, all pages that
        occur before this page in the database are either B-Tree <i>root
        pages</i>, <i>pointer-map pages</i> or the <i>locking page</i>.

      <tr><td>Auto-vacuum database      <td>
        Each database is either an auto-vacuum database or a non auto-vacuum
        database. Auto-vacuum databases feature pointer-map pages (section
        <cite>pointer_map_pages</cite>) and have a non-zero value stored
        as a 4-byte big-endian integer at offset 52 of the database header (section
        <cite>database_header</cite>).
      <tr><td>B-Tree                    <td>
        A B-Tree is a tree structure optimized for offline storage. The table
        and index data in an SQLite database file is stored in B-Tree
        structures.

      <tr><td>B-Tree cell               <td>
................................................................................
      <tr><td>Cell content area         <td>
        The area within a B-Tree page in which the B-Tree cells are stored.

      <tr><td>(Database) text encoding  <td>
        The text encoding used for all text values in the database file. One
        of UTF-8, big-endian UTF-16 and little-endian UTF-16. The database
        text encoding is defined by a 4 byte field stored at byte offset
        56 of the database header (see section <cite>database_header</cite>).

      [Glossary "Database header" {
        The first 100 bytes of an SQLite database image constitute the
	database header. See section <cite>database_header</cite> for details.
      }]

      <tr><td>(Database) page size      <td>
        An SQLite database file is divided into one or more pages of
        page-size bytes each.

      <tr><td>Database record           <td>
        A database record is a blob of data containing the serialized
................................................................................
        A 32-bit integer field stored at byte offset 48 of the database file
        header (see section <cite>database_header</cite>).

      <tr><td style="white-space:nowrap">(Database) usable page size <td>
        The number of bytes of each database page that is usable. This
        is the page-size less the number of bytes left unused at the end
        of each page. The number of bytes left unused is governed by the
        value stored at offset 20 of the database header (see section
        <cite>database_header</cite>).

      <tr><td>File format read version  <td>
        Single byte field stored at byte offset 20 of the database header
        (see section <cite>database_header</cite>).

      <tr><td>File format write version  <td>
        Single byte field stored at byte offset 19 of the database header
        (see section <cite>database_header</cite>).

      <tr><td>File change counter       <td>
        A 32-bit integer field stored at byte offset 24 of the database file
        header (see section <cite>database_header</cite>). Normally, SQLite
        increments this value each time it commits a transaction.

................................................................................
        database properties that may be set by the user (auto-vacuum,
        page-size, user-cookie value etc.),

      <tr><td>Non-auto-vacuum database  <td>
        Any database that is not an auto-vacuum database. A non-auto-vacuum
        database contains no pointer-map pages and has a zero value stored
        in the 4-byte big-endian integer field at offset 52 of the database
        database header (section <cite>database_header</cite>).

      <tr><td>Overflow chain             <td>
        A linked list of overflow pages across which a single (large)
        database record is stored (see section 
        <cite>overflow_page_chains</cite>).

      <tr><td>Overflow page             <td>
................................................................................

      <tr><td>Root page                 <td>
        A root page is a database page used to store the root node of a
        B-Tree data structure.

      <tr><td>Schema layer file format  <td>
        An integer between 1 and 4 stored as a 4 byte big-endian integer at
        offset 44 of the database header (section <cite>database_header</cite>).
        Certain file format constructions may only be present in databases
        with a certain minimum schema layer file format value.

      <tr><td>Schema table              <td>
        The table B-Tree with root-page 1 used to store database records
        describing the database schema. Accessible as the "sqlite_master" 
        table from within SQLite.
................................................................................
	a journal file. It contains a full file-system path identifying 
	a master-journal file.
      }]
      [Glossary "Database File-System Representation" {
        A file or files within the file-system used to store an SQLite 
        database image.
      }]

      [Glossary "Database user-cookie" {
        An SQLite database contains a single 32-bit signed integer field known
	as the database user-cookie. Applications may read and write this field
	for any purpose.
      }]

    </table>

<!--
h1 "SQLite Database Files" sqlite_database_files
 
  <p>
................................................................................
    <li>The database schema: The set of database tables, virtual tables, 
        indexes, triggers and views stored in the database.

    <li>The database contents: The set of tuples (rows) stored in
	each database table.

    <li>Other database properties, as follows:
      <ol>
	<li>The page-size of the database.
	<li>The text-encoding of the database.
	<li>A flag indicating whether or not the database is an auto-vacuum 
	    database.
	<li>The value of the database user-cookie.
	<li>If the database is an auto-vacuum database, a flag indicating 
	    whether or not the database is in incremental vacuum mode or not.
	<li>The default page cache size in pages to use with the database (an 
	    integer field).
      </ol>
  </ul>

  <p class=todo>
    Say something about items 5 and 6 being advisory only. SQLite places an
    interpretation on each field, but other applications may safely ignore
    them.

  <p class=todo>
    The concept of a logical database and its contents should be defined
    properly in some requirements document so that it can be referenced from
    here and other places. The definition will be something like the list of
    bullet points above.

  <p>
    Many of the numbered requirements in the following sub-sections describe 
    the relationship between the contents of the logical database, as itemized
    above, and the contents of the serialized database image. Others describe
    the relationships between various database image substructures, invariants
    that are true for all well-formed database images.

  <p>
    A well-formed SQLite database image is defined as an image for which
    all of the statements itemized as requirements within this section
    are true. <span class=todo>mention the requirements numbering scheme
    here.</span> A software system that wishes to interoperate with other
    systems using the SQLite database image format should only ever
    output well-formed SQLite databases. In the case of SQLite itself,
................................................................................
      for a single key or range of keys. As implemented in SQLite, each
      entry may be associated with a blob of data that is not part of the
      key. For the canonical introduction to the B-Tree and its variants, 
      refer to reference <cite>ref_comer_btree</cite>. The B-Tree 
      implementation in SQLite also adopts some of the enhancements 
      suggested in <cite>ref_knuth_btree</cite>.
    <p>
      An SQLite database image contains one or more B-Tree structures. Each
      B-Tree structure stores the data for a single database table or 
      index. Hence each database file contains a single B-Tree to store
      the contents of the <i>sqlite_master</i> table, and one B-Tree
      for each database table or index created by the user. If the database
      uses auto-increment integer primary keys, then the database file
      also contains a B-Tree to store the contents of the automatically 
      created <i>sqlite_sequence</i> table.
................................................................................
      The following sections and sub-sections describe precisely the format
      used to serialize the B-Tree structures within an SQLite database image.

  [h2 "Global Structure"]

    [h3 "Database Header" "database_header"]
      <p>
        An SQLite database image begins with a 100-byte database header. The database 
        header consists of a well known 16-byte sequence followed by a series of
        1, 2 and 4 byte unsigned integers. All integers in the database header (as
        well as the rest of the database file) are stored in big-endian format.
        
      <p>
        The well known 16-byte sequence that begins every SQLite database file
        is:
      <pre>
          0x53 0x51 0x4c 0x69 0x74 0x65 0x20 0x66 0x6f 0x72 0x6d 0x61 0x74 0x20 0x33 0x00</pre>
................................................................................
        Interpreted as UTF-8 encoded text, this byte sequence corresponds 
        to the string "SQLite format 3" followed by a nul-terminator byte.

          [fileformat_import_requirement2 H30030]

      <p>
        The 1, 2 and 4 byte unsigned integers that make up the rest of the
        database header are described in the following table.

      [Table]
        [Tr]<th>Byte Range <th>Byte Size <th width=100%>Description <th>Reqs
	[Tr]<td>16..17 <td>2<td>
            Database page size in bytes. See section 
            <cite>pages_and_page_types</cite> for details.
            <td>H30190

        [Tr]<td>18     <td>1<td>
            <p style="margin-top:0">
            File-format "write version". Currently, this field
            is always set to 1. If a value greater than 1 is read by SQLite,
            then the library will only open the file for read-only access.

................................................................................
            This field and the next one are intended to be used for 
            forwards compatibility, should the need ever arise. If in the
            future a version of SQLite is created that uses a file format
            that may be safely read but not written by older versions of
            SQLite, then this field will be set to a value greater than 1
            to prevent older SQLite versions from writing to a file that
            uses the new format. 
            <td>H30040

        [Tr]<td>19     <td>1<td>
            <p style="margin-top:0">
             File-format "read version". Currently, this 
            field is always set to 1. If a value greater than 1 is read 
            by SQLite, then the library will refuse to open the database 

................................................................................
            <p style="margin-bottom:0">
            Like the "write version" described above, this field exists
            to facilitate some degree of forwards compatibility, in case
            it is ever required. If a version of SQLite created in the 
            future uses a file format that may not be safely read by older
            SQLite versions, then this field will be set to a value greater
            than 1.
            <td>H30040

        [Tr]<td>20     <td>1<td>
            Number of bytes of unused space at the end of each database
            page. Usually this field is set to 0. If it is non-zero, then 
            it contains the number of bytes that are left unused at the
            end of every database page (see section
            <cite>pages_and_page_types</cite> for a description of a
            database page).
            <td>H30040

        [Tr]<td>21     <td>1<td>
             Maximum fraction of an index tree page to use for 
            embedded content. This value is used to determine the maximum
            size of a B-Tree cell to store as embedded content on a
            page that is part of an index B-Tree. Refer to section 
            <cite>index_btree_cell_format</cite> for details.
            <td>H30040

        [Tr]<td>22     <td>1<td>
            Minimum fraction of an index B-Tree page to use for
            embedded content when an entry uses one or more overflow pages.
            This value is used to determine the portion of a B-Tree cell 
            that requires one or more overflow pages to store as embedded
            content on a page that is part of an index B-Tree. Refer to
            section <cite>index_btree_cell_format</cite> for details.
            <td>H30040

        [Tr]<td>23     <td>1<td>
            Minimum fraction of an table B-Tree leaf page to use for
            embedded content when an entry uses one or more overflow pages.
            This value is used to determine the portion of a B-Tree cell 
            that requires one or more overflow pages to store as embedded
            content on a page that is a leaf of a table B-Tree. Refer to
            section <cite>table_btree_cell_format</cite> for details.
            <td>H30040

        [Tr]<td>24..27 <td>4<td>
            <p style="margin-top:0">
            The file change counter. Each time a database transaction is
            committed, the value of the 32-bit unsigned integer stored in
            this field is incremented.
            <p style="margin-bottom:0">
................................................................................
            a portion of the file cached in memory. However, since the file
            is unlocked, another process may use SQLite to modify the 
            contents of the file, invalidating the internal cache of the
            first process. When the file is relocked, the first process can
            check if the value of the file change counter has been modified
            since the file was unlocked. If it has not, then the internal
            cache may be assumed to be valid and may be reused.
            <td>H33040

        [Tr]<td>32..35 <td>4<td>
            Page number of first freelist trunk page. 
            For more details, refer to section <cite>free_page_list</cite>.
            <td>H31320

        [Tr]<td>36..39 <td>4<td>
            Number of free pages in the database file.
            For more details, refer to section <cite>free_page_list</cite>.
            <td>H31310

        [Tr]<td>40..43 <td>4<td>
            The schema version. Each time the database schema is modified (by
            creating or deleting a database table, index, trigger or view)
            the value of the 32-bit unsigned integer stored in this field
            is incremented.
            <td>H33050

        [Tr]<td>44..47 <td>4<td>
            <p style="margin-top:0">
            Schema layer file-format. This value is similar to the
            "read-version" and "write-version" fields at offsets 18 and 19
            of the database header. If SQLite encounters a database
            with a schema layer file-format value greater than the file-format
            that it understands (currently 4), then SQLite will refuse to
            access the database.
            <p>
            Usually, this value is set to 1. However, if any of the following
            file-format features are used, then the schema layer file-format
            must be set to the corresponding value or greater:
................................................................................
              <li> Implicit default (non-NULL) values at the end of table
                   records (see section <cite>table_btree_content</cite>).
              <li> Descending indexes (see section
                   <cite>index_btree_compare_func</cite>) and Boolean values
                   in database records (see section <cite>record_format</cite>,
                   serial types 8 and 9).
            </ol>
            <td>H30120

        [Tr]<td>48..51 <td>4<td>
            Default pager cache size. This field is used by SQLite to store
            the recommended pager cache size to use for the database.
            <td>H30130

        [Tr]<td>52..55 <td>4<td>
            For auto-vacuum capable databases, the numerically largest 
            root-page number in the database. Since page 1 is always the
            root-page of the schema table (section <cite>schema_table</cite>),
            this value is always non-zero for auto-vacuum databases. For
            non-auto-vacuum databases, this value is always zero.
            <td>H30140, H30141

        [Tr]<td>56..59 <td>4<td>
            (constant) Database text encoding. A value of 1 means all 
            text values are stored using UTF-8 encoding. 2 indicates
            little-endian UTF-16 text. A value of 3 means that the database
            contains big-endian UTF-16 text.  
            <td>H30150

        [Tr]<td>60..63 <td>4<td>
            The user-cookie value. A 32-bit integer value available to the
            user for read/write access.
            <td>H30160

        [Tr]<td>64..67 <td>4<td>
            The incremental-vacuum flag. In non-auto-vacuum databases this
            value is always zero. In auto-vacuum databases, this field is
            set to 1 if "incremental vacuum" mode is enabled. If incremental
            vacuum mode is not enabled, then the database file is reorganized
            so that it contains no free pages (section
            <cite>free_page_list</cite>) at the end of each database
            transaction. If incremental vacuum mode is enabled, then the
            reorganization is not performed until explicitly requested
            by the user.
            <td>H30171

      </table>

      <p>
        The four byte block beginning at offset 28 is unused. As is the
        32 byte block beginning at offset 68.
      </p>
................................................................................
      </ul>

      <p>
        The following requirement encompasses all of the above.

          [fileformat_import_requirement2 H30040]


      <p>
        Section <cite>database_file_format</cite> identifies six persistent
        user-visible properties of an SQLite database. The following 
        requirements describe the way in which these properties are stored.


          [fileformat_import_requirement2 H30190]






          [fileformat_import_requirement2 H30191]
          [fileformat_import_requirement2 H30150]
          [fileformat_import_requirement2 H30140]
          [fileformat_import_requirement2 H30141]
          [fileformat_import_requirement2 H30160]
          [fileformat_import_requirement2 H30170]
          [fileformat_import_requirement2 H30171]
          [fileformat_import_requirement2 H30130]

      <p>
        The following requirement describes the valid range of values for the
        schema layer file format field.

          [fileformat_import_requirement2 H30120]

    [h3 "Pages and Page Types" "pages_and_page_types"]
      <p>
        The entire database file is divided into pages, each page consisting
        of <i>page-size</i> bytes, where <i>page-size</i> is the 2-byte 
        integer value stored at offset 16 of the database header (see above).
        The <i>page-size</i> is always a power of two between 512 
        (2<sup>9</sup>) and 32768 (2<sup>15</sup>). SQLite database files
        always consist of an exact number of pages.
      <p>
        Pages are numbered beginning from 1, not 0. Page 1 consists of
        the first <i>page-size</i> bytes of the database file. The database header
        described in the previous section consumes the first 100 bytes of page
        1.
      <p>
        Each page of the database file is one of the following:
      <ul>
        <li><b>A B-Tree page</b>. B-Tree pages are part of the tree 
            structures used to store database tables and indexes.
        <li><b>An overflow page</b>. Overflow pages are used by particularly
            large database records that do not fit on a single B-Tree page.
        <li><b>A free page</b>. Free pages are pages within the database file
            that are not being used to store meaningful data.
        <li><b>A "pointer-map" page</b>. In auto-vacuum capable databases
            (databases for which the 4 byte big-endian integer stored at
            byte offset 52 of the database header is non-zero) some pages are
            permanently designated "pointer-map" pages. See section 
            <cite>pointer_map_pages</cite> for details.
        <li><b>The locking page</b>. The database page that starts at
            byte offset 2<sup>30</sup>, if it is large enough to contain
            such a page, is always left unused.
      </ul>


          [fileformat_import_requirement2 H30200]
          [fileformat_import_requirement2 H30210]
          [fileformat_import_requirement2 H30220]
        

    [h3 "The Schema Table" schema_table]
      <p>
................................................................................
        be used as pointer-map pages (see section
        <cite>pointer_map_pages</cite>).
      <p>
        As noted in section <cite>database_header</cite>, in an auto-vacuum
        database the page number of the page immediately following the
        final root page in the contiguous set of root pages is stored
        as a 4 byte big-endian integer at byte offset 52 of the database
        header. Unless that page is itself a pointer-map page, in which
        case the page number of the page following it is stored instead.

      <p>
        For example, if the schema of a logical database is created using
        the following SQL statements:
      <pre>
          CREATE TABLE abc(a, b, c);
................................................................................
            <td>An SQL real value (type SQLITE_FLOAT), stored as an
                8-byte IEEE floating point value.
        [Tr]<td>8
            <td>The literal SQL integer 0 (type SQLITE_INTEGER). The value 
                consumes zero bytes of space in the record's data area.
                Values of this type are only present in databases with
                a schema file format (the 32-bit integer at byte offset 44
                of the database header) value of 4 or greater.

        [Tr]<td>9
            <td>The literal SQL integer 1 (type SQLITE_INTEGER). The value
                consumes zero bytes of space in the record's data area.
                Values of this type are only present in databases with
                a schema file format (the 32-bit integer at byte offset 44
                of the database header) value of 4 or greater.

        [Tr]<td style="white-space:nowrap"><i>bytes</i> * 2 + 12
            <td>Even values greater than 12 are used to signify a blob of
                data (type SQLITE_BLOB) (<i>n</i>-12)/2 bytes in length, where
                <i>n</i> is the integer value stored in the record header.
                
        [Tr]<td style="white-space:nowrap"><i>bytes</i> * 2 + 13
................................................................................
        in the records data area. If the corresponding integer type value
        in the record header is 0 (NULL), 8 (integer value 0) or 9 (integer
        value 1), then the blob of data is zero bytes in length. Otherwise,
        the length of the data field is as described in the table above.
      <p>
        The data field associated with a string value contains the string
        encoded using the database encoding, as defined in the database
        header (see section <cite>database_header</cite>). No 
        nul-terminator character is stored in the database.

          [fileformat_import_requirement2 H30560]
          [fileformat_import_requirement2 H30570]
          [fileformat_import_requirement2 H30580]
          [fileformat_import_requirement2 H30590]
          [fileformat_import_requirement2 H30600]
................................................................................
          the cell if it consists of less than:
        <pre>
            <i>max-local</i> := (<i>usable-size</i> - 12) * <i>max-embedded-fraction</i> / 255 - 23
</pre>
        <p>
          bytes. In the formula above, <i>usable-size</i> is the page-size
          in bytes less the number of unused bytes left at the end of every
          page (as read from byte offset 20 of the database header), and
          <i>max-embedded-fraction</i> is the value read from byte offset 
          21 of the database header.
        [Figure indexshortrecord.gif figure_indexshortrecord "Small Record Index B-Tree Cell"]
        <p>
          If the cell record is larger than the maximum size identified by
          the formula above, then only the first part of the record is stored
          within the cell. The remainder is stored in an overflow-chain (see
          section <cite>overflow_page_chains</cite> for details). Following 
          the part of the record stored within the cell is the page number 
................................................................................
            <i>local-size</i> := <i>min-local</i> + (<i>record-size</i> - <i>min-local</i>) % (<i>usable-size</i> - 4)
            if( <i>local-size</i> &gt; <i>max-local</i> )
                <i>local-size</i> := <i>min-local</i>
</pre>
        <p>
          In the formula above, <i>usable-size</i> is the page-size
          in bytes less the number of unused bytes left at the end of every
          page (as read from byte offset 20 of the database header), and
          <i>max-embedded-fraction</i> and <i>min-embedded-fraction</i> are
          the values read from byte offsets 21 and 22 of the database header,
          respectively.
        [Figure indexlongrecord.gif figure_indexlongrecord "Large Record Index B-Tree Cell"]

          [fileformat_import_requirement2 H30970]
          [fileformat_import_requirement2 H30980]
          [fileformat_import_requirement2 H30990]
          [fileformat_import_requirement2 H31000]
................................................................................

      <p>
        Requirements H31010 and H30990 are similar to the algorithms 
        presented in the text above. However instead of 
        <i>min-embedded-fraction</i> and <i>max-embedded-fraction</i> the
        requirements use the constant values 32 and 64, as well-formed 
        database files are required by H30080 and H30070 to store these 
        values in the relevant database database header fields.

    [h3 "Table B-Trees" table_btrees]
      <p>
        As noted in section <cite>fileformat_overview</cite>, table B-Trees
        store a set of unique 64-bit signed integer keys. Associated with
        each key is a database record. As with index B-Trees, the database
        file pages that make up a table B-Tree are organized into a tree
................................................................................
          <li>The first byte of the page-header, the "flags" field, is set to 
              0x05 for internal tree node pages, and 0x0D for leaf pages.
          <li>The content and format of the B-Tree cells is different. See
              section <cite>table_btree_cell_format</cite> for details.
          <li>The format of page 1 is the same as any other table B-Tree,
              except that 100 bytes less than usual is available for content.
              The first 100 bytes of page 1 is consumed by the database
              header.
        </ul>

          [fileformat_import_requirement2 H31130]
          [fileformat_import_requirement2 H31140]
        
      <p>
        Most of the requirements specified in section 
................................................................................
          sufficiently small is defined as less than or equal to:
        <pre>
          max-local := <i>usable-size</i> - 35
</pre>
        <p>
          bytes. Where <i>usable-size</i> is defined as the page-size
          in bytes less the number of unused bytes left at the end of every
          page (as read from byte offset 20 of the database header). 
          This scenario, where the entire record is
          stored within the B-Tree cell, is depicted in figure
          <cite>figure_tableshortrecord</cite>.
        [Figure tableshortrecord.gif figure_tableshortrecord "Table B-Tree Small Record Leaf Node Cell"]

        <p>
          If the record is too large to be stored entirely within the B-Tree
................................................................................
            <i>max-local</i> := <i>usable-size</i> - 35
            <i>local-size</i> := <i>min-local</i> + (<i>record-size</i> - <i>min-local</i>) % (<i>usable-size</i> - 4)
            if( <i>local-size</i> &gt; <i>max-local</i> )
                <i>local-size</i> := <i>min-local</i>
</pre>
        <p>
          In this case, <i>min-embedded-fraction</i> is the value read from
          byte offset 22 of the database header. The layout of the cell in this
          case, when an overflow-chain is required, is shown in figure
          <cite>figure_tablelongrecord</cite>.

        [Figure tablelongrecord.gif figure_tablelongrecord "Table B-Tree Large Record Leaf Node Cell"]

        <p>
          If the leaf page is page 1, then the value of <i>usable-size</i> is
................................................................................
        page is:
      <pre>
        <i>available-space</i> := <i>usable-size</i> - 4
</pre>
      <p>
        Where <i>usable-size</i> is defined as the page-size in bytes less the
        number of unused bytes left at the end of every page (as read from 
        byte offset 20 of the database header).
      <p>
        Each overflow page except for the last one in the linked list 
        contains <i>available-space</i> bytes of record data. The last
        page in the list contains the remaining data, starting at byte
        offset 4. The value of the "next page" field on the last page
        in an overflow chain is undefined.

................................................................................
      Each free-list trunk page contains up to:
    <pre>
        <i>max-leaf-pointers</i> := (<i>usable-size</i> - 8) / 4
</pre>
    <p>
      pointers, where <i>usable-size</i> is defined as the page-size in bytes
      less the number of unused bytes left at the end of every page (as read
      from byte offset 20 of the database header).

      [Figure freelistpage.gif figure_freelistpage "Free List Trunk Page Format"]
    <p>
      All trunk pages in the free-list except for the first contain the 
      maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number
      of the first page in the linked list of free-list trunk pages is 
      stored as a 4-byte big-endian unsigned integer at offset 32 of the
      database header (section <cite>database_header</cite>).

          [fileformat_import_requirement2 H31240]
          [fileformat_import_requirement2 H31250]
          [fileformat_import_requirement2 H31260]
          [fileformat_import_requirement2 H31270]
          [fileformat_import_requirement2 H31280]
          [fileformat_import_requirement2 H31290]
          [fileformat_import_requirement2 H31300]

    <p>The following statements govern the two 4-byte big-endian integers
       associated with the <i>free page list</i> structure in the database
       header.

          [fileformat_import_requirement2 H31310]
          [fileformat_import_requirement2 H31320]
  

  [h2 "Pointer Map Pages" pointer_map_pages]
    <p>
................................................................................
      pointer-map page may therefore contain:
    <pre>
        <i>num-entries</i> := <i>usable-size</i> / 5
</pre>
    <p>
      entries, where <i>usable-size</i> is defined as the page-size in bytes
      less the number of unused bytes left at the end of every page (as read
      from byte offset 20 of the database header).
    <p>
      Assuming the database is auto-vacuum capable, page 2 is always a 
      pointer map page. It contains the pointer map lookup table entries for
      pages 3 through (2 + <i>num-entries</i>), inclusive. The first 5 bytes
      of page 2 contain the pointer map lookup table entry for page 3. Bytes
      5 through 9, inclusive, contain the pointer map lookup table entry
      for page 4, and so on.
................................................................................

[h2 "Reading an SQLite Database" reading_from_files]

  <p>
    As described in section <cite>pages_and_page_types</cite> of this document,
    an SQLite database image is a set of contiguously numbered fixed size 
    pages. The numbering starts at 1, not 0. Page 1 contains the 
    database header and the root page of the <i>schema table</i>, 
    and all other pages within the database image are somehow referenced
    by number, either directly or indirectly, from page 1, either directly 
    or indirectly. In order to be able to read the database image from within
    the file-system, a database reader needs to be able to ascertain:

  <ol>
    <li> The <i>page-size</i> used by the database image,

Changes to req/hlr30000.txt.

17
18
19
20
21
22
23
24
25
26
27
28


29
30
31
32
33


34
35

36
37
38
39
40

41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58







59


60
61
62
63
64


65
66
67
68
69
70
71
72
73
74
75
76
77
78

79
80
81
82

83
84
85
86
87
88
89
90
91
92
93
94
95
....
1089
1090
1091
1092
1093
1094
1095

1096
1097
1098
1099
1100

1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
single nul-terminator byte.

HLR H30040
The 6 bytes beginning at byte offset 18 of a well-formed database 
image shall contain the values 0x01, 0x01, 0x00, 0x40, 0x20 and 
0x20, respectively.


HLR H30100
The 4 byte block starting at byte offset 24 of a well-formed
database file contains the <i>file change counter</i> formatted
as a 4-byte big-endian integer.



HLR H30110
The 4 byte block starting at byte offset 40 of a well-formed
database file contains the <i>schema version</i> formatted
as a 4-byte big-endian integer.



HLR H30120

The 4 byte block starting at byte offset 44 of a well-formed
database file, the <i>schema layer file format</i>, contains a
big-endian integer value between 1 and 4, inclusive.

HLR H30130

The 4 byte block starting at byte offset 48 of a well-formed
database file contains the <i>default pager cache size</i> formatted
as a 4-byte big-endian integer.

HLR H30140
The 4 byte block starting at byte offset 52 of a well-formed
database file contains the <i>auto-vacuum last root-page</i>
formatted as a 4-byte big-endian integer. If this value is non-zero,
the database is said to be an <i>auto-vacuum database</i>.

HLR H30150
The 4 byte block starting at byte offset 56 of a well-formed
database file, the <i>text encoding</i> contains a big-endian integer
value between 1 and 3, inclusive.

HLR H30160
The 4 byte block starting at byte offset 60 of a well-formed
database file contains the <i>user cookie</i> formatted







as a 4-byte big-endian integer.



HLR H30170
The 4 byte block starting at byte offset 64 of a well-formed
database file, the <i>incremental vaccum flag</i> contains a big-endian
integer value between 0 and 1, inclusive.



HLR H30180
In a well-formed non-autovacuum database (one with a zero stored
in the 4-byte big-endian integer value beginning at byte offset
52 of the database file header, the incremental vacuum flag is
set to 0.

HLR H30190
The <i>database page size</i> of a well-formed database, stored as a
2-byte big-endian unsigned integer at byte offset 16 of the file,
shall be an integer power of 2 between 512 and 32768, inclusive.

HLR H30200
The size of a <i>well formed database file</i> shall be an integer

multiple of the <i>database page size</i>.

HLR H30210
Each page of a <i>well formed database file</i> is exactly one of a

<i>B-Tree page</i>, an <i>overflow page</i>, a <i>free page</i>, a
<i>pointer-map page</i> or the <i>locking page</i>.

HLR H30220
The database page that starts at byte offset 2<sup>30</sup>, the
<i>locking page</i>, is never used for any purpose.

HLR H30230
In a <i>well-formed database file</i>, the portion of the first
database page not consumed by the database file-header (all but the
first 100 bytes) contains the root node of a table B-Tree,
the <i>schema table</i>.

................................................................................
If another database client holds either a RESERVED or PENDING lock on the
database file-system representation, then any journal file that exists within
the file system shall be considered invalid.


HLR H33040
A database writer shall increment the value of the database header change

counter field (H30100) as part of the first database image modification 
that it performs after obtaining an EXCLUSIVE lock.

HLR H33050
A database writer shall increment the value of the database schema version 

field (H30110) as part of the first database image modification that includes
a schema change that it performs after obtaining an EXCLUSIVE lock.

HLR H33070
If a database writer is required by either H33050 or H33040 to increment a
database header field, and that header field already contains the maximum
value possible (0xFFFFFFFF, or 4294967295 for 32-bit unsigned integer 
fields), "incrementing" the field shall be interpreted to mean setting it to 
zero.









<
|
<
<
|
>
>

|
<
<
|
>
>

|
>
|
|
<

|
>
|
|
|

|
|
|
|
|

|
|
|
|

|
|
|
>
>
>
>
>
>
>
|
>
>

|
<
<
<
>
>







<
<
<
<


<
>
|


<
>





|







 







>
|




>
|
|









17
18
19
20
21
22
23

24


25
26
27
28
29


30
31
32
33
34
35
36
37

38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70



71
72
73
74
75
76
77
78
79




80
81

82
83
84
85

86
87
88
89
90
91
92
93
94
95
96
97
98
99
....
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
single nul-terminator byte.

HLR H30040
The 6 bytes beginning at byte offset 18 of a well-formed database 
image shall contain the values 0x01, 0x01, 0x00, 0x40, 0x20 and 
0x20, respectively.


HLR H30120


The 4-byte big-endian signed integer field at byte offset 44 of a 
well-formed database image, the schema layer file format field, 
shall be set to an integer value between 1 and 4, inclusive. 

HLR H30130


The 4-byte big-endian unsigned integer field at byte offset 48 of a 
well-formed database image shall be set to the value of the 
database default page-cache size.

HLR H30140
If the database is not an auto-vacuum capable database, then the 4 byte 
big-endian unsigned integer field at byte offset 52 of a well-formed 
database image shall contain the value 0.


HLR H30141
If the database is an auto-vacuum capable database, then the 4 byte 
big-endian unsigned integer field at byte offset 52 of a well-formed 
database image shall contain the numerically largest root-page number 
of any table or index B-Tree within the database image.

HLR H30150
The 4 byte big-endian unsigned integer field at byte offset 56 of a 
well-formed database image shall be set to 1 if the database text-encoding
is UTF-8, 2 if the database text-encoding is little-endian UTF-16, and 3
if the database text-encoding is big-endian UTF-16.

HLR H30160
The 4-byte big-endian unsigned integer field at byte offset 60 of a 
well-formed database image shall be set to the value of the 
database user-cookie.

HLR H30190
The 2-byte big-endian unsigned integer field at byte offset 16 of a
well-formed database image shall be set to the value of the database
page-size.

HLR H30191
The page-size of an SQLite database in bytes shall be an integer power 
of 2 between 512 and 32768, inclusive.

HLR H30170
The 4-byte big-endian unsigned integer field at byte offset 64 of a
well-formed database image shall be set to the value of the database
incremental-vacuum flag.

HLR H30171



The value of the incremental-vacuum flag of an SQLite database shall be
either 0 or 1.

HLR H30180
In a well-formed non-autovacuum database (one with a zero stored
in the 4-byte big-endian integer value beginning at byte offset
52 of the database file header, the incremental vacuum flag is
set to 0.






HLR H30200

The size of a well formed database image shall be an integer
multiple of the database page size.

HLR H30210

Each page of a well formed database image shall be exactly one of a
<i>B-Tree page</i>, an <i>overflow page</i>, a <i>free page</i>, a
<i>pointer-map page</i> or the <i>locking page</i>.

HLR H30220
The database page that starts at byte offset 2<sup>30</sup>, the
<i>locking page</i>, shall never be used for any purpose.

HLR H30230
In a <i>well-formed database file</i>, the portion of the first
database page not consumed by the database file-header (all but the
first 100 bytes) contains the root node of a table B-Tree,
the <i>schema table</i>.

................................................................................
If another database client holds either a RESERVED or PENDING lock on the
database file-system representation, then any journal file that exists within
the file system shall be considered invalid.


HLR H33040
A database writer shall increment the value of the database header change
counter field, a 4-byte big-endian unsigned integer field stored at byte offset 24
of the database header, as part of the first database image modification 
that it performs after obtaining an EXCLUSIVE lock.

HLR H33050
A database writer shall increment the value of the database schema version 
field, a 4-byte big-endian unsigned integer field stored at byte offset 40
of the database header, as part of the first database image modification that
includes a schema change that it performs after obtaining an EXCLUSIVE lock.

HLR H33070
If a database writer is required by either H33050 or H33040 to increment a
database header field, and that header field already contains the maximum
value possible (0xFFFFFFFF, or 4294967295 for 32-bit unsigned integer 
fields), "incrementing" the field shall be interpreted to mean setting it to 
zero.