Documentation Source Text

Check-in [9c0dadcbb7]

Overview
Comment:Fill in some missing definitions and combine some requirements in fileformat.html.
SHA1:9c0dadcbb7caba1a4644103db90bf0149efaee01
User & Date: dan 2009-05-24 23:20:55
Context
2009-05-25 12:34 Preparations for the 3.6.14.2 release. check-in: 57157e0e9e user: drh tags: trunk
2009-05-24 23:20 Fill in some missing definitions and combine some requirements in fileformat.html. check-in: 9c0dadcbb7 user: dan tags: trunk
2009-05-24 22:00 Remove requirement H16124. If the 3rd parameter to sqlite3_create_function is outside the range -1...127 then the behavior is undefined. check-in: 4f36464caa user: drh tags: trunk
Changes

Changes to pages/fileformat.in.

          readers and writers.
    </ul>

  [h2 "Glossary"]
    <table id=glossary>
      <tr><td>Auto-vacuum last root-page<td>
        A page number stored as 32-bit integer at byte offset 52 of the
        database file header (see section <cite>file_header</cite>). In
        an auto-vacuum database, this is the numerically largest 
        <i>root-page</i> number in the database. Additionally, all pages that
        occur before this page in the database are either B-Tree <i>root
        pages</i>, <i>pointer-map pages</i> or the <i>locking page</i>.

      <tr><td>Auto-vacuum database      <td>
        Each database is either an auto-vacuum database or a non auto-vacuum
        database. Auto-vacuum databases feature pointer-map pages (section
        <cite>pointer_map_pages</cite>) and have a non-zero value stored
        as a 4-byte big-endian integer at offset 52 of the file header (section
        <cite>file_header</cite>).
      <tr><td>B-Tree                    <td>
        A B-Tree is a tree structure optimized for offline storage. The table
        and index data in an SQLite database file is stored in B-Tree
        structures.

      <tr><td>B-Tree cell               <td>
        Each database page that is part of a B-Tree structure contains zero
................................................................................
      <tr><td>Cell content area         <td>
        The area within a B-Tree page in which the B-Tree cells are stored.

      <tr><td>(Database) text encoding  <td>
        The text encoding used for all text values in the database file. One
        of UTF-8, big-endian UTF-16 and little-endian UTF-16. The database
        text encoding is defined by a 4 byte field stored at byte offset
        56 of the database file header (see section <cite>file_header</cite>).

      <tr><td>(Database) file header    <td>
        The first 100 bytes of an SQLite database file constitute the
        database file header.

      <tr><td>(Database) page size      <td>
        An SQLite database file is divided into one or more pages of
................................................................................
        Following the database record header in each database record is
        the database record data area. It contains the actual data (string
        content, numeric value etc.) of all values in the record 
        (see section <cite>record_format</cite>).

      <tr><td>Default pager cache size  <td>
        A 32-bit integer field stored at byte offset 48 of the database file
        header (see section <cite>file_header</cite>).

      <tr><td style="white-space:nowrap">(Database) usable page size <td>
        The number of bytes of each database page that is usable. This
        is the page-size less the number of bytes left unused at the end
        of each page. The number of bytes left unused is governed by the
        value stored at offset 20 of the file header (see section
        <cite>file_header</cite>).

      <tr><td>File format read version  <td>
        Single byte field stored at byte offset 19 of the database file header
        (see section <cite>file_header</cite>).

      <tr><td>File format write version  <td>
        Single byte field stored at byte offset 18 of the database file header
        (see section <cite>file_header</cite>).

      <tr><td>File change counter       <td>
        A 32-bit integer field stored at byte offset 24 of the database file
        header (see section <cite>file_header</cite>). Normally, SQLite
        increments this value each time it commits a transaction.

      <tr><td>Fragment                  <td>
        A block of 3 or fewer bytes of unused space within the cell content
        area of a B-Tree page.

      <tr><td>Free block                <td>
................................................................................
      <tr><td>Index B-Tree              <td>
        One of two variants on the B-Tree data structure used within SQLite
        database files. An index B-Tree (section <cite>index_btrees</cite>)
        uses database records as keys.

      <tr><td>Incremental Vacuum flag   <td>
        A 32-bit integer field stored at byte offset 64 of the database file
        header (see section <cite>file_header</cite>). In auto-vacuum 
        databases, if this field is non-zero then the database is not
        automatically compacted at the end of each transaction.

      <tr><td>Locking page              <td>
        The database page that begins at the 1GB (2<sup>30</sup> byte)
        boundary. This page is always left unused.

................................................................................
        database properties that may be set by the user (auto-vacuum,
        page-size, user-cookie value etc.),

      <tr><td>Non-auto-vacuum database  <td>
        Any database that is not an auto-vacuum database. A non-auto-vacuum
        database contains no pointer-map pages and has a zero value stored
        in the 4-byte big-endian integer field at offset 52 of the database
        file header (section <cite>file_header</cite>).

      <tr><td>Overflow chain             <td>
        A linked list of overflow pages across which a single (large)
        database record is stored (see section 
        <cite>overflow_page_chains</cite>).

      <tr><td>Overflow page             <td>
................................................................................

      <tr><td>Root page                 <td>
        A root page is a database page used to store the root node of a
        B-Tree data structure.

      <tr><td>Schema layer file format  <td>
        An integer between 1 and 4 stored as a 4 byte big-endian integer at
        offset 44 of the file header (section <cite>file_header</cite>).
        Certain file format constructions may only be present in databases
        with a certain minimum schema layer file format value.

      <tr><td>Schema table              <td>
        The table B-Tree with root-page 1 used to store database records
        describing the database schema. Accessible as the "sqlite_master" 
        table from within SQLite.

      <tr><td>Schema version            <td>
        A 32-bit integer field stored at byte offset 40 of the database file
        header (see section <cite>file_header</cite>). Normally, SQLite
        increments this value each time it modifies the database schema.

      <tr><td>Table B-Tree              <td>
        One of two variants on the B-Tree data structure used within SQLite
        database files. A table B-Tree (section <cite>table_btrees</cite>)
        uses 64 bit integers as key values and stores an associated database
        record along with each key value.

      <tr><td>User cookie               <td>
        A 32-bit integer field stored at byte offset 60 of the database file
        header (see section <cite>file_header</cite>). This value is reserved
        for use by the application; SQLite does not modify it of its own accord.

      <tr><td>Variable Length Integer   <td>
        A format used for storing 64-bit signed integer values in SQLite 
        database files. Consumes between 1 and 9 bytes of space, depending
        on the precise value being stored.

................................................................................
        An SQLite database file that meets all the criteria laid out in
        section <cite>database_file_format</cite> of this document.

      [Glossary "Database image" {
        A serialized blob of data representing an SQLite database. The
        contents of a database file are usually a valid database image.
      }]
      [Glossary "Database file" {<span class=todo>This.</span>}]



      [Glossary "Journal file" {<span class=todo>This.</span>}]





      [Glossary "Page size" {<span class=todo>This.</span>}]



      [Glossary "Sector size" {<span class=todo>This.</span>}]







      [Glossary "Journal Section" {<span class=todo>This.</span>}]



      [Glossary "Journal Header" {<span class=todo>This.</span>}]



      [Glossary "Journal Record" {<span class=todo>This.</span>}]




      [Glossary "Master Journal Pointer" {<span class=todo>This.</span>}]




      [Glossary "Database File-System Representation" {<span class=todo>This.</span>}]




    </table>

<!--
h1 "SQLite Database Files" sqlite_database_files
 
  <p>
................................................................................

    <p>
      The following sections and sub-sections describe precisely the format
      used to serialize the B-Tree structures within an SQLite database image.

  [h2 "Global Structure"]

    [h3 "File Header" "file_header"]
      <p>
        Each SQLite database file begins with a 100-byte header. The file
        header consists of a well known 16-byte sequence followed by a series of
        1, 2 and 4 byte unsigned integers. All integers in the file header (as
        well as the rest of the database file) are stored in big-endian format.
        
      <p>
        The well known 16-byte sequence that begins every SQLite database file
        is:
................................................................................
      <pre>
          0x53 0x51 0x4c 0x69 0x74 0x65 0x20 0x66 0x6f 0x72 0x6d 0x61 0x74 0x20 0x33 0x00</pre>

      <p>
        Interpreted as UTF-8 encoded text, this byte sequence corresponds 
        to the string "SQLite format 3" followed by a nul-terminator byte.



      <p>
        The 1, 2 and 4 byte unsigned integers that make up the rest of the
        database file header are described in the following table.

      [Table]
        [Tr]<th>Byte Range <th>Byte Size <th width=100%>Description
        [Tr]<td>16..17 <td>2<td>
            Database page size in bytes. See section 
            <cite>pages_and_page_types</cite> for details.

        [Tr]<td>18     <td>1<td>
            <p style="margin-top:0">
            File-format "write version". Currently, this field
            is always set to 1. If a value greater than 1 is read by SQLite,
................................................................................

      <p>
        The four byte block beginning at offset 28 is unused, as is the
        32 byte block beginning at offset 68.
      </p>

      <p>
        Some of the following requirements state that certain database header
        fields must contain defined constant values, even though the sqlite 
        database file format is designed to allow various values. This is
        done to artificially constrain the definition of a 
        <i>well-formed database</i> in order to make implementation and 
        testing more practical.























          [fileformat_import_requirement2 H30030]


      <p>
        Following the 16 byte magic string in the file header is the
        <i>page size</i>, a 2-byte field. See section
        <cite>pages_and_page_types</cite> for details.

          [fileformat_import_requirement2 H30040]
          [fileformat_import_requirement2 H30050]
          [fileformat_import_requirement2 H30060]
          [fileformat_import_requirement2 H30070]
          [fileformat_import_requirement2 H30080]
          [fileformat_import_requirement2 H30090]
          [fileformat_import_requirement2 H30100]
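
      <p>
        As an illustration only, the following Python sketch reads the
        2-byte big-endian <i>page size</i> field at byte offset 16 and uses
        it to compute the byte offset of a page and the number of pages in a
        file. The function names (read_page_size, page_offset, page_count)
        and the sample header are hypothetical and are not part of SQLite;
        the sketch assumes the database image is stored wholly within the
        database file.

```python
import struct

PAGE_SIZE_OFFSET = 16  # 2-byte big-endian page size field


def read_page_size(header):
    """Return the raw page size field from a database header."""
    if len(header) < PAGE_SIZE_OFFSET + 2:
        raise ValueError("header too short to contain the page size field")
    (page_size,) = struct.unpack_from(">H", header, PAGE_SIZE_OFFSET)
    return page_size


def page_offset(page_number, page_size):
    """Byte offset of a page; page numbers start at 1."""
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    return (page_number - 1) * page_size


def page_count(file_size, page_size):
    """Number of pages, given the size of the database file in bytes."""
    return file_size // page_size


# Hypothetical 100-byte header with a page size of 1024 bytes.
sample_header = bytearray(100)
struct.pack_into(">H", sample_header, PAGE_SIZE_OFFSET, 1024)

size = read_page_size(bytes(sample_header))
print(size, page_offset(3, size), page_count(10240, size))
```

      <p>
        The sketch returns the stored value unchanged; it does not check
        that the value is a permitted page size.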

      <p>
        Following the <i>file change counter</i> in the database header are
        two 4-byte fields related to the database file <i>free page list</i>.
        See section <cite>free_page_list</cite> for details.

................................................................................
        In a non-auto-vacuum database, the B-Tree root pages may be stored
        anywhere within the database file. For an auto-vacuum database, all
        B-Tree root pages must at all times form a contiguous set starting
        at page 3 of the database file, skipping any pages that are required to
        be used as pointer-map pages (see section
        <cite>pointer_map_pages</cite>).
      <p>
        As noted in section <cite>file_header</cite>, in an auto-vacuum
        database the page number of the page immediately following the
        final root page in the contiguous set of root pages is stored
        as a 4 byte big-endian integer at byte offset 52 of the database
        file header, unless that page is itself a pointer-map page, in which
        case the page number of the page following it is stored instead.

      <p>
................................................................................
        in the records data area. If the corresponding integer type value
        in the record header is 0 (NULL), 8 (integer value 0) or 9 (integer
        value 1), then the blob of data is zero bytes in length. Otherwise,
        the length of the data field is as described in the table above.
      <p>
        The data field associated with a string value contains the string
        encoded using the database encoding, as defined in the database
        file header (see section <cite>file_header</cite>). No 
        nul-terminator character is stored in the database.

          [fileformat_import_requirement2 H30560]
          [fileformat_import_requirement2 H30570]
          [fileformat_import_requirement2 H30580]
          [fileformat_import_requirement2 H30590]
          [fileformat_import_requirement2 H30600]
................................................................................

      [Figure freelistpage.gif figure_freelistpage "Free List Trunk Page Format"]
    <p>
      All trunk pages in the free-list except for the first contain the 
      maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number
      of the first page in the linked list of free-list trunk pages is 
      stored as a 4-byte big-endian unsigned integer at offset 32 of the
      file header (section <cite>file_header</cite>).

          [fileformat_import_requirement2 H31240]
          [fileformat_import_requirement2 H31250]
          [fileformat_import_requirement2 H31260]
          [fileformat_import_requirement2 H31270]
          [fileformat_import_requirement2 H31280]
          [fileformat_import_requirement2 H31290]
................................................................................
  </ol>

  <p>
    Usually, the database image is simply the contents of the database file. 
    In this case, reading the database image is straightforward. The
    page-size used by the database image can be read from the 2-byte
    big-endian integer field stored at byte offset 16 of
    the database file (see section <cite>file_header</cite>). The number of
    pages in the database image can be determined by querying the size of
    the database file in bytes and then dividing by the <i>page-size</i>.
    Reading the contents of a <i>database page</i> is a simple matter of 
    reading a block of <i>page-size</i> bytes from an offset calculated from
    the page-number of the required page:
    <pre>
        <i>offset</i> := (<i>page-number</i> - 1) * page-size
................................................................................

    <li> <p><b>Locking Requirements</b>. Section <cite>locking_protocol</cite>
         contains a description of the file-system locks that must be obtained
         on the database file, and how locks placed by other database clients 
         should be interpreted.

    <li> <p><b>Header Cookie Requirements</b>. An SQLite database image header 
         (see section <cite>file_header</cite>) contains two "cookie" values
         that must sometimes be incremented when the database image stored in
         the file-system is updated. Section 
         <cite>database_header_cookies_protocol</cite> contains requirements
         identifying exactly when the cookie values must be incremented, and
         how they can be used by a database client to determine if cached
         data is valid or not.
  </ul> 
................................................................................
    </ul>

    <p>
      Similar mechanisms are used to support cache validation for each class
      of data. If a database writer changes the database schema in any way, it
      is also required to increment the value stored in the database schema
      version field of the database image header (see section 
      <cite>file_header</cite>). This way, when a database reader establishes
      a SHARED lock on a database file-system representation, it may validate
      any cached schema data by checking if the value of the database schema 
      version field has changed since the data was cached. If the value has not
      changed, then the cached schema data may be retained and reused. 
      Otherwise, if the value of the database schema version field is not the
      same as it was when the schema data was last cached, then the reader
      can deduce that some other database client has modified the database
      schema in some way and it must be reparsed.

    <p>
      Each time a database image stored within a database file-system 
      representation is modified, the database writer is required to increment
      the value stored in the change counter field of the database image header
      (see section <cite>file_header</cite>). This allows database readers to
      validate any cache of raw database image page content that may be present
      when a database reader establishes a SHARED (or other) lock on the 
      database file-system representation. If the value stored in the change
      counter field of the database image has not changed since the cached
      data was read, then it may be safely reused. Otherwise, if the change
      counter value has changed, then any cached page content data must be
      deemed untrustworthy and discarded.
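
    <p>
      The two cache validation checks described above might be sketched in
      Python as follows. This is a minimal sketch, not SQLite's
      implementation: the function names are hypothetical, the header bytes
      are assumed to have been read after the SHARED lock was obtained, and
      the offsets (24 for the change counter, 40 for the schema version) are
      those given in the glossary.

```python
import struct

CHANGE_COUNTER_OFFSET = 24   # file change counter, 4-byte big-endian
SCHEMA_VERSION_OFFSET = 40   # schema version, 4-byte big-endian


def read_cookies(header):
    """Return (change_counter, schema_version) from a 100-byte header."""
    if len(header) < 100:
        raise ValueError("database header must be at least 100 bytes")
    (change_counter,) = struct.unpack_from(">I", header, CHANGE_COUNTER_OFFSET)
    (schema_version,) = struct.unpack_from(">I", header, SCHEMA_VERSION_OFFSET)
    return change_counter, schema_version


def caches_still_valid(cached, current):
    """Return (page_cache_ok, schema_cache_ok) for two cookie pairs."""
    page_cache_ok = cached[0] == current[0]
    schema_cache_ok = cached[1] == current[1]
    return page_cache_ok, schema_cache_ok


def make_header(change_counter, schema_version):
    """Build a hypothetical header containing only the two cookie fields."""
    header = bytearray(100)
    struct.pack_into(">I", header, CHANGE_COUNTER_OFFSET, change_counter)
    struct.pack_into(">I", header, SCHEMA_VERSION_OFFSET, schema_version)
    return bytes(header)


before = read_cookies(make_header(7, 3))
after = read_cookies(make_header(8, 3))   # another client committed a change
print(caches_still_valid(before, after))
```

    <p>
      In this example the change counter differs while the schema version
      does not, so cached page content would be discarded and cached schema
      data could be kept.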







          readers and writers.
    </ul>

  [h2 "Glossary"]
    <table id=glossary>
      <tr><td>Auto-vacuum last root-page<td>
        A page number stored as 32-bit integer at byte offset 52 of the
        database file header (see section <cite>database_header</cite>). In
        an auto-vacuum database, this is the numerically largest 
        <i>root-page</i> number in the database. Additionally, all pages that
        occur before this page in the database are either B-Tree <i>root
        pages</i>, <i>pointer-map pages</i> or the <i>locking page</i>.

      <tr><td>Auto-vacuum database      <td>
        Each database is either an auto-vacuum database or a non auto-vacuum
        database. Auto-vacuum databases feature pointer-map pages (section
        <cite>pointer_map_pages</cite>) and have a non-zero value stored
        as a 4-byte big-endian integer at offset 52 of the file header (section
        <cite>database_header</cite>).
      <tr><td>B-Tree                    <td>
        A B-Tree is a tree structure optimized for offline storage. The table
        and index data in an SQLite database file is stored in B-Tree
        structures.

      <tr><td>B-Tree cell               <td>
        Each database page that is part of a B-Tree structure contains zero
................................................................................
      <tr><td>Cell content area         <td>
        The area within a B-Tree page in which the B-Tree cells are stored.

      <tr><td>(Database) text encoding  <td>
        The text encoding used for all text values in the database file. One
        of UTF-8, big-endian UTF-16 and little-endian UTF-16. The database
        text encoding is defined by a 4 byte field stored at byte offset
        56 of the database file header (see section <cite>database_header</cite>).

      <tr><td>(Database) file header    <td>
        The first 100 bytes of an SQLite database file constitute the
        database file header.

      <tr><td>(Database) page size      <td>
        An SQLite database file is divided into one or more pages of
................................................................................
        Following the database record header in each database record is
        the database record data area. It contains the actual data (string
        content, numeric value etc.) of all values in the record 
        (see section <cite>record_format</cite>).

      <tr><td>Default pager cache size  <td>
        A 32-bit integer field stored at byte offset 48 of the database file
        header (see section <cite>database_header</cite>).

      <tr><td style="white-space:nowrap">(Database) usable page size <td>
        The number of bytes of each database page that is usable. This
        is the page-size less the number of bytes left unused at the end
        of each page. The number of bytes left unused is governed by the
        value stored at offset 20 of the file header (see section
        <cite>database_header</cite>).

      <tr><td>File format read version  <td>
        Single byte field stored at byte offset 19 of the database file header
        (see section <cite>database_header</cite>).

      <tr><td>File format write version  <td>
        Single byte field stored at byte offset 18 of the database file header
        (see section <cite>database_header</cite>).

      <tr><td>File change counter       <td>
        A 32-bit integer field stored at byte offset 24 of the database file
        header (see section <cite>database_header</cite>). Normally, SQLite
        increments this value each time it commits a transaction.

      <tr><td>Fragment                  <td>
        A block of 3 or fewer bytes of unused space within the cell content
        area of a B-Tree page.

      <tr><td>Free block                <td>
................................................................................
      <tr><td>Index B-Tree              <td>
        One of two variants on the B-Tree data structure used within SQLite
        database files. An index B-Tree (section <cite>index_btrees</cite>)
        uses database records as keys.

      <tr><td>Incremental Vacuum flag   <td>
        A 32-bit integer field stored at byte offset 64 of the database file
        header (see section <cite>database_header</cite>). In auto-vacuum 
        databases, if this field is non-zero then the database is not
        automatically compacted at the end of each transaction.

      <tr><td>Locking page              <td>
        The database page that begins at the 1GB (2<sup>30</sup> byte)
        boundary. This page is always left unused.

................................................................................
        database properties that may be set by the user (auto-vacuum,
        page-size, user-cookie value etc.),

      <tr><td>Non-auto-vacuum database  <td>
        Any database that is not an auto-vacuum database. A non-auto-vacuum
        database contains no pointer-map pages and has a zero value stored
        in the 4-byte big-endian integer field at offset 52 of the database
        file header (section <cite>database_header</cite>).

      <tr><td>Overflow chain             <td>
        A linked list of overflow pages across which a single (large)
        database record is stored (see section 
        <cite>overflow_page_chains</cite>).

      <tr><td>Overflow page             <td>
................................................................................

      <tr><td>Root page                 <td>
        A root page is a database page used to store the root node of a
        B-Tree data structure.

      <tr><td>Schema layer file format  <td>
        An integer between 1 and 4 stored as a 4 byte big-endian integer at
        offset 44 of the file header (section <cite>database_header</cite>).
        Certain file format constructions may only be present in databases
        with a certain minimum schema layer file format value.

      <tr><td>Schema table              <td>
        The table B-Tree with root-page 1 used to store database records
        describing the database schema. Accessible as the "sqlite_master" 
        table from within SQLite.

      <tr><td>Schema version            <td>
        A 32-bit integer field stored at byte offset 40 of the database file
        header (see section <cite>database_header</cite>). Normally, SQLite
        increments this value each time it modifies the database schema.

      <tr><td>Table B-Tree              <td>
        One of two variants on the B-Tree data structure used within SQLite
        database files. A table B-Tree (section <cite>table_btrees</cite>)
        uses 64 bit integers as key values and stores an associated database
        record along with each key value.

      <tr><td>User cookie               <td>
        A 32-bit integer field stored at byte offset 60 of the database file
        header (see section <cite>database_header</cite>). This value is
        reserved for use by the application; SQLite does not modify it of its
        own accord.

      <tr><td>Variable Length Integer   <td>
        A format used for storing 64-bit signed integer values in SQLite 
        database files. Consumes between 1 and 9 bytes of space, depending
        on the precise value being stored.

................................................................................
        An SQLite database file that meets all the criteria laid out in
        section <cite>database_file_format</cite> of this document.

      [Glossary "Database image" {
        A serialized blob of data representing an SQLite database. The
        contents of a database file are usually a valid database image.
      }]
      [Glossary "Database file" {
        A database file is a file on disk that usually, but not always,
        contains a well-formed database image.
      }]
      [Glossary "Journal file" {
        For each database file, there may exist an associated journal file
	stored in the same file-system directory. Under some circumstances,
	the database image may be distributed between the database and journal
	files (instead of being stored wholly within the database file).
      }]
      [Glossary "Page size" {
        An SQLite database image is divided into fixed size pages, each 
        "page size" bytes in size.
      }]
      [Glossary "Sector size" {
        In this document, the term "sector size" refers to a field in a
        journal header which determines some aspects of the layout of the
        journal file. It is set by SQLite (or a compatible application)
        based on the properties of the underlying file-system that the journal
        file is being written to.
      }]
      [Glossary "Journal Section" {
	A journal file may contain multiple journal sections. A journal section
	consists of a journal header followed by zero or more journal records.
      }]
      [Glossary "Journal Header" {
	A journal header is a control block sector-size bytes in size that
	appears at the start of each journal section within a journal file.
      }]
      [Glossary "Journal Record" {
	A journal record is a structure used to store data for a single
	database page within a journal file. A single journal file may contain
	many journal records.
      }]
      [Glossary "Master Journal Pointer" {
        A master journal pointer is a structure that may appear at the end of
	a journal file. It contains a full file-system path identifying 
	a master-journal file.
      }]
      [Glossary "Database File-System Representation" {
        A file or files within the file-system used to store an SQLite 
        database image.
      }]

    </table>

<!--
h1 "SQLite Database Files" sqlite_database_files
 
  <p>
................................................................................

    <p>
      The following sections and sub-sections describe precisely the format
      used to serialize the B-Tree structures within an SQLite database image.

  [h2 "Global Structure"]

    [h3 "Database Header" "database_header"]
      <p>
        An SQLite database image begins with a 100-byte database header. The
        header consists of a well-known 16-byte sequence followed by a series of
        1, 2 and 4 byte unsigned integers. All integers in the file header (as
        well as the rest of the database file) are stored in big-endian format.
        
      <p>
        The well known 16-byte sequence that begins every SQLite database file
        is:
................................................................................
      <pre>
          0x53 0x51 0x4c 0x69 0x74 0x65 0x20 0x66 0x6f 0x72 0x6d 0x61 0x74 0x20 0x33 0x00</pre>

      <p>
        Interpreted as UTF-8 encoded text, this byte sequence corresponds 
        to the string "SQLite format 3" followed by a nul-terminator byte.

          [fileformat_import_requirement2 H30030]
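      <p>
        As a minimal sketch of the check implied by the requirement above,
        the following compares the first 16 bytes of a buffer against the
        well-known sequence. The names <code>SQLITE_MAGIC</code> and
        <code>has_sqlite_magic</code> are hypothetical and exist only for
        illustration; they are not part of SQLite.

```python
# The well-known 16-byte sequence, copied byte for byte from the text above.
SQLITE_MAGIC = bytes([
    0x53, 0x51, 0x4c, 0x69, 0x74, 0x65, 0x20, 0x66,
    0x6f, 0x72, 0x6d, 0x61, 0x74, 0x20, 0x33, 0x00,
])


def has_sqlite_magic(header: bytes) -> bool:
    """Return True if the buffer begins with the 16-byte sequence.

    Hypothetical helper: a buffer shorter than 16 bytes never matches.
    """
    return header[:16] == SQLITE_MAGIC
```

      <p>
        A caller would typically pass the first 100 bytes read from the
        database file; only the first 16 are examined.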

      <p>
        The 1, 2 and 4 byte unsigned integers that make up the rest of the
        database file header are described in the following table.

      [Table]
        [Tr]<th>Byte Range <th>Byte Size <th width=100%>Description
	[Tr]<td>16..17 <td>2<td>
            Database page size in bytes. See section 
            <cite>pages_and_page_types</cite> for details.

        [Tr]<td>18     <td>1<td>
            <p style="margin-top:0">
            File-format "write version". Currently, this field
            is always set to 1. If a value greater than 1 is read by SQLite,
................................................................................

      <p>
        The four-byte block beginning at offset 28 is unused, as is the
        32-byte block beginning at offset 68.
      </p>

      <p>
        The following requirements state that certain database header
        fields must contain defined constant values, even though the SQLite 
        database file format is designed to allow various values. These fields
        were intended to be flexible when the SQLite database image format
        was designed, but it has since been determined that it is faster and
        safer to require these parameters to be populated with well-known 
        values. Specifically, in a well-formed database, the following header
        fields are always set to well-known values:

      <ul>
        <li> The file-format write version (single byte field, byte offset 18), 
             is always set to 0x01.
        <li> The file-format read version (single byte field, byte offset 19), 
             is always set to 0x01.
        <li> The number of unused bytes on each page (single byte field, byte 
             offset 20), is always set to 0x00.
        <li> The maximum fraction of an index B-Tree page to use for embedded content 
             (single byte field, byte offset 21), is always set to 0x40.
        <li> The minimum fraction of an index B-Tree page to use for embedded
             content when using overflow pages (single byte field, byte 
             offset 22), is always set to 0x20.
	<li> The minimum fraction of a table B-Tree page to use for embedded
	     content when using overflow pages (single byte field, byte offset 23),
	     is always set to 0x20.
      </ul>

      <p>
        The following requirement encompasses all of the above.

          [fileformat_import_requirement2 H30040]
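      <p>
        The combined requirement can be checked with a single slice
        comparison. The sketch below assumes the six values listed in
        requirement H30040 (0x01, 0x01, 0x00, 0x40, 0x20 and 0x20 at byte
        offsets 18 through 23); the names
        <code>EXPECTED_FIXED_FIELDS</code> and <code>fixed_fields_ok</code>
        are hypothetical.

```python
# Values required at byte offsets 18..23 of a well-formed database image,
# as listed in requirement H30040.
EXPECTED_FIXED_FIELDS = bytes([0x01, 0x01, 0x00, 0x40, 0x20, 0x20])


def fixed_fields_ok(header: bytes) -> bool:
    """Return True if bytes 18..23 hold the six required constants.

    Hypothetical helper: a buffer too short to contain the fields fails.
    """
    return len(header) >= 24 and header[18:24] == EXPECTED_FIXED_FIELDS
```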


      <p>
        Following the 16 byte magic string in the file header is the
        <i>page size</i>, a 2-byte field. See section
        <cite>pages_and_page_types</cite> for details.







          [fileformat_import_requirement2 H30100]

      <p>
        Following the <i>file change counter</i> in the database header are
        two 4-byte fields related to the database file <i>free page list</i>.
        See section <cite>free_page_list</cite> for details.

................................................................................
        In a non-auto-vacuum database, the B-Tree root pages may be stored
        anywhere within the database file. For an auto-vacuum database, all
        B-Tree root pages must at all times form a contiguous set starting
        at page 3 of the database file, skipping any pages that are required to
        be used as pointer-map pages (see section
        <cite>pointer_map_pages</cite>).
      <p>
        As noted in section <cite>database_header</cite>, in an auto-vacuum
        database the page number of the page immediately following the
        final root page in the contiguous set of root pages is stored
        as a 4 byte big-endian integer at byte offset 52 of the database
        file header, unless that page is itself a pointer-map page, in which
        case the page number of the page following it is stored instead.

      <p>
................................................................................
        in the record's data area. If the corresponding integer type value
        in the record header is 0 (NULL), 8 (integer value 0) or 9 (integer
        value 1), then the blob of data is zero bytes in length. Otherwise,
        the length of the data field is as described in the table above.
      <p>
        The data field associated with a string value contains the string
        encoded using the database encoding, as defined in the database
        file header (see section <cite>database_header</cite>). No 
        nul-terminator character is stored in the database.
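      <p>
        The mapping from integer type value to data field length can be
        sketched as follows. The zero-length cases (0, 8 and 9) come from
        the paragraph above; the remaining lengths are an assumption taken
        from the serial type table of the published SQLite file format,
        because the table itself is not reproduced in this excerpt. The
        function name <code>serial_type_data_length</code> is hypothetical.

```python
def serial_type_data_length(serial_type: int) -> int:
    """Return the number of bytes a value occupies in the record data area.

    Hypothetical helper. Types 0 (NULL), 8 (integer 0) and 9 (integer 1)
    occupy zero bytes, as stated in the text. The other lengths are assumed
    from the published SQLite serial type table.
    """
    fixed_lengths = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 6, 6: 8, 7: 8, 8: 0, 9: 0}
    if serial_type in fixed_lengths:
        return fixed_lengths[serial_type]
    if serial_type < 0 or serial_type in (10, 11):
        # 10 and 11 are reserved; negative values are never valid.
        raise ValueError(f"invalid serial type: {serial_type}")
    if serial_type % 2 == 0:
        # Even values of 12 or more: blob of (N - 12) / 2 bytes.
        return (serial_type - 12) // 2
    # Odd values of 13 or more: string of (N - 13) / 2 bytes, stored in the
    # database encoding with no nul-terminator.
    return (serial_type - 13) // 2
```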

          [fileformat_import_requirement2 H30560]
          [fileformat_import_requirement2 H30570]
          [fileformat_import_requirement2 H30580]
          [fileformat_import_requirement2 H30590]
          [fileformat_import_requirement2 H30600]
................................................................................

      [Figure freelistpage.gif figure_freelistpage "Free List Trunk Page Format"]
    <p>
      All trunk pages in the free-list except for the first contain the 
      maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number
      of the first page in the linked list of free-list trunk pages is 
      stored as a 4-byte big-endian unsigned integer at offset 32 of the
      file header (section <cite>database_header</cite>).
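    <p>
      Reading that field is a matter of decoding four big-endian bytes. The
      sketch below assumes the caller already holds at least the first 36
      bytes of the database image; the function name
      <code>first_freelist_trunk_page</code> is hypothetical.

```python
import struct

# Byte offset of the first free-list trunk page number, per the text above.
FREELIST_TRUNK_OFFSET = 32


def first_freelist_trunk_page(header: bytes) -> int:
    """Decode the 4-byte big-endian unsigned integer at offset 32.

    Hypothetical helper; struct.unpack_from raises struct.error if the
    buffer is too short to contain the field.
    """
    (page_number,) = struct.unpack_from(">I", header, FREELIST_TRUNK_OFFSET)
    return page_number
```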

          [fileformat_import_requirement2 H31240]
          [fileformat_import_requirement2 H31250]
          [fileformat_import_requirement2 H31260]
          [fileformat_import_requirement2 H31270]
          [fileformat_import_requirement2 H31280]
          [fileformat_import_requirement2 H31290]
................................................................................
  </ol>

  <p>
    Usually, the database image is simply the contents of the database file. 
    In this case, reading the database image is straightforward. The
    page-size used by the database image can be read from the 2-byte
    big-endian integer field stored at byte offset 16 of
    the database file (see section <cite>database_header</cite>). The number of
    pages in the database image can be determined by querying the size of
    the database file in bytes and then dividing by the <i>page-size</i>.
    Reading the contents of a <i>database page</i> is a simple matter of 
    reading a block of <i>page-size</i> bytes from an offset calculated from
    the page-number of the required page:
    <pre>
        <i>offset</i> := (<i>page-number</i> - 1) * page-size
................................................................................

    <li> <p><b>Locking Requirements</b>. Section <cite>locking_protocol</cite>
         contains a description of the file-system locks that must be obtained
         on the database file, and how locks placed by other database clients 
         should be interpreted.

    <li> <p><b>Header Cookie Requirements</b>. An SQLite database image header 
         (see section <cite>database_header</cite>) contains two "cookie" values
         that must sometimes be incremented when the database image stored in
         the file-system is updated. Section 
         <cite>database_header_cookies_protocol</cite> contains requirements
         identifying exactly when the cookie values must be incremented, and
         how they can be used by a database client to determine if cached
         data is valid or not.
  </ul> 
................................................................................
    </ul>

    <p>
      Similar mechanisms are used to support cache validation for each class
      of data. If a database writer changes the database schema in any way, it
      is also required to increment the value stored in the database schema
      version field of the database image header (see section 
      <cite>database_header</cite>). This way, when a database reader establishes
      a SHARED lock on a database file-system representation, it may validate
      any cached schema data by checking if the value of the database schema 
      version field has changed since the data was cached. If the value has not
      changed, then the cached schema data may be retained and reused. 
      Otherwise, if the value of the database schema version field is not the
      same as it was when the schema data was last cached, then the reader
      can deduce that some other database client has modified the database
      schema in some way and it must be reparsed.

    <p>
      Each time a database image stored within a database file-system 
      representation is modified, the database writer is required to increment
      the value stored in the change counter field of the database image header
      (see section <cite>database_header</cite>). This allows database readers to
      validate any cache of raw database image page content that may be present
      when a database reader establishes a SHARED (or other) lock on the 
      database file-system representation. If the value stored in the change
      counter field of the database image has not changed since the cached
      data was read, then it may be safely reused. Otherwise, if the change
      counter value has changed, then any cached page content data must be
      deemed untrustworthy and discarded.
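    <p>
      The change counter check described above can be sketched as a small
      cache wrapper. It assumes the change counter is the 4-byte big-endian
      integer at byte offset 24 (requirement H30100) and that the caller
      re-reads the header after obtaining its lock. The class
      <code>PageCache</code> and its methods are hypothetical and do not
      correspond to any SQLite interface.

```python
import struct

# Byte offset of the file change counter, per requirement H30100.
CHANGE_COUNTER_OFFSET = 24


def read_change_counter(header: bytes) -> int:
    """Decode the 4-byte big-endian change counter from a header buffer."""
    return struct.unpack_from(">I", header, CHANGE_COUNTER_OFFSET)[0]


class PageCache:
    """Hypothetical page cache keyed by page number."""

    def __init__(self) -> None:
        self.change_counter = None  # counter value seen when pages were cached
        self.pages = {}             # page number -> raw page bytes

    def revalidate(self, header: bytes) -> bool:
        """Keep cached pages if the counter is unchanged, else discard them.

        Returns True when the existing cache contents were retained.
        """
        current = read_change_counter(header)
        if self.change_counter is not None and current == self.change_counter:
            return True
        # Counter changed (or nothing was cached yet): drop everything.
        self.pages.clear()
        self.change_counter = current
        return False
```

    <p>
      The same pattern would apply to cached schema data, using the database
      schema version field in place of the change counter.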

Changes to req/hlr30000.txt.

a <i>well-formed SQLite database file</i>.

HLR H30020
The system shall ensure that at the successful conclusion of a
database transaction the contents of the database file are a valid
serialization of the contents of the logical SQL database produced
by the transaction.



HLR H30030
The first 16 bytes of a well-formed database file contain the UTF-8
encoding of the string "SQLite format 3" followed by a single
nul-terminator byte.

HLR H30040
The 19th byte (byte offset 18), the <i>file-format write version</i>,
of a well-formed database file contains the value 0x01.

HLR H30050
The 20th byte (byte offset 19), the <i>file-format read version</i>,
of a well-formed database file contains the value 0x01.

HLR H30060
The 21st byte (byte offset 20), the number of unused bytes on each
page, of a well-formed database file shall contain the value 0x00.

HLR H30070
The 22nd byte (byte offset 21), the maximum fraction of an index
B-Tree page to use for embedded content, of a well-formed database
file shall contain the value 0x40.


HLR H30080
The 23rd byte (byte offset 22), the minimum fraction of an index
B-Tree page to use for embedded content when using overflow pages,
of a well-formed database file contains the value 0x20.

HLR H30090
The 24th byte (byte offset 23), the minimum fraction of a table
B-Tree page to use for embedded content when using overflow pages,
of a well-formed database file contains the value 0x20.

HLR H30100
The 4 byte block starting at byte offset 24 of a well-formed
database file contains the <i>file change counter</i> formatted
as a 4-byte big-endian integer.

HLR H30110








a <i>well-formed SQLite database file</i>.

HLR H30020
The system shall ensure that at the successful conclusion of a
database transaction the contents of the database file are a valid
serialization of the contents of the logical SQL database produced
by the transaction.



HLR H30030
The first 16 bytes of a well-formed database file shall contain 
the UTF-8 encoding of the string "SQLite format 3" followed by a 
single nul-terminator byte.

HLR H30040
The 6 bytes beginning at byte offset 18 of a well-formed database 
image shall contain the values 0x01, 0x01, 0x00, 0x40, 0x20 and 
0x20, respectively.

HLR H30100
The 4 byte block starting at byte offset 24 of a well-formed
database file contains the <i>file change counter</i> formatted
as a 4-byte big-endian integer.

HLR H30110