Documentation Source Text

Check-in [e9007b6030]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Also add a table of contents to fileformat2.html.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: e9007b6030f64e13c3b0f887195dc6d69798a8bb
User & Date: dan 2016-04-26 19:37:32
Context
2016-05-02
11:03
Updates to the change log for 3.13.0. check-in: 53e8efc79e user: drh tags: trunk
2016-04-26
19:37
Also add a table of contents to fileformat2.html. check-in: e9007b6030 user: dan tags: trunk
17:59
Add automatically generated tables of contents to rbu.html, cli.html and datatype3.html. check-in: bd775e82aa user: dan tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fileformat2.in.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
..
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
..
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
...
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
...
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
...
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
...
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
...
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
...
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
...
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
...
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
...
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
...
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
...
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
...
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
...
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
...
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
...
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
...
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
....
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
....
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
....
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
....
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
....
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
....
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
....
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
....
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
....
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
....
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
....
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
....
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
....
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
....
1535
1536
1537
1538
1539
1540
1541

1542
1543
1544
1545
1546
1547
1548
....
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
....
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
....
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
....
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
....
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
<title>File Format For SQLite Databases</title>
<tcl>hd_keywords {file format} {second edition file format document}</tcl>

<h1 align=center>
The SQLite Database File Format
</h1>

<p>This document describes and defines the on-disk database file
format used by SQLite.</p>

<h2>1.0 The Database File</h2>

<p>The complete state of an SQLite database is usually
contained a single file on disk called the "main database file".</p>

<p>During a transaction, SQLite stores additional information 
in a second file called the "rollback journal", or if SQLite is in
[WAL mode], a write-ahead log file.
................................................................................
the state of the database, they are called a "hot journal" or "hot WAL file".
Hot journals and WAL files are only a factor during error recovery
scenarios and so are uncommon, but they are part of the state of an SQLite
database and so cannot be ignored.  This document defines the format
of a rollback journal and the write-ahead log file, but the focus is
on the main database file.</p>

<h3>1.1 Pages</h3>

<p>The main database file consists of one or more pages.  ^The size of a
page is a power of two between 512 and 65536 inclusive.  All pages within
the same database are the same size.  ^The page size for a database file
is determined by the 2-byte integer located at an offset of
16 bytes from the beginning of the database file.</p>

................................................................................
rolled back, the rollback journal can then be used to restore the
database to its original state.  ^Freelist leaf pages bear no
information that would need to be restored on a rollback and so they
are not written to the journal prior to modification, in order to
reduce disk I/O.</p>

<tcl>hd_fragment database_header {database header}</tcl>
<h3>1.2 The Database Header</h3>

<p>The first 100 bytes of the database file comprise the database file 
header.  The database file header is divided into fields as shown by
the table below.  All multibyte fields in the database file header are
stored with the most significant byte first (big-endian).</p>

<center>
................................................................................
Reserved for expansion.  Must be zero.
<tr><td valign=top align=center>92<td valign=top align=center>4<td align=left>
The [version-valid-for number].
<tr><td valign=top align=center>96<td valign=top align=center>4<td align=left>
[SQLITE_VERSION_NUMBER]
</table></center>

<h4>1.2.1 Magic Header String</h4>

<p>^Every valid SQLite database file begins with the following 16 bytes 
(in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
corresponds to the UTF-8 string "SQLite format 3" including the nul
terminator character at the end.</p>

<h4>1.2.2 Page Size</h4>

<p>The two-byte value beginning at offset 16 determines the page size of 
the database.  For SQLite versions 3.7.0.1 and earlier, this value is 
interpreted as a big-endian integer and must be a power of two between
512 and 32768, inclusive.  Beginning with SQLite version 3.7.1, a page
size of 65536 bytes is supported.  The value 65536 will not fit in a
two-byte integer, so to specify a 65536-byte page size, the value
................................................................................
at offset 16 is 0x00 0x01.
This value can be interpreted as a big-endian
1 and thought of is as a magic number to represent the 65536 page size.
Or one can view the two-byte field as a little endian number and say
that it represents the page size divided by 256.  These two 
interpretations of the page-size field are equivalent.</p>

<h4>1.2.3 File format version numbers</h4>

<p>The file format write version and file format read version at offsets
18 and 19 are intended to allow for enhancements of the file format
in future versions of SQLite.  In current versions of SQLite, both of
these values are 1 for rollback journalling modes and 2 for [WAL]
journalling mode.  If a version of SQLite coded to the current
file format specification encounters a database file where the read
version is 1 or 2 but the write version is greater than 2, then the database
file must be treated as read-only.  If a database file with a read version
greater than 2 is encountered, then that database cannot be read or written.</p>

<h4>1.2.4 Reserved bytes per page</h4>

<p>SQLite has the ability to set aside a small number of extra bytes at
the end of every page for use by extensions.  These extra bytes are
used, for example, by the SQLite Encryption Extension to store a nonce
and/or cryptographic checksum associated with each page.  ^The 
"reserved space" size in the 1-byte integer at offset 20 is the number
of bytes of space at the end of each page to reserve for extensions.
................................................................................
<p>The "usable size" of a database page is the page size specify by the
2-byte integer at offset 16 in the header less the "reserved" space size
recorded in the 1-byte integer at offset 20 in the header.  The usable
size of a page might be an odd number.  ^(However, the usable size is not
allowed to be less than 480.  In other words, if the page size is 512,
then the reserved space size cannot exceed 32.)^</p>

<h4>1.2.5 Payload fractions</h4>

<p>^The maximum and minimum embedded payload fractions and the leaf
payload fraction values must be 64, 32, and 32.  These values were
originally intended to be tunable parameters that could be used to
modify the storage format of the b-tree algorithm.  However, that
functionality is not supported and there are no current plans to add
support in the future.  Hence, these three bytes are fixed at the
values specified.</p>

<h4>1.2.6 File change counter</h4>

<tcl>hd_fragment chngctr {change counter}</tcl>
<p>^The file change counter is a 4-byte big-endian integer at
offset 24 that is incremented whenever the database file is unlocked
after having been modified.
When two or more processes are reading the same database file, each 
process can detect database changes from other processes by monitoring 
................................................................................
another process modified the database, since the cache has become stale.
The file change counter facilitates this.</p>

<p>In WAL mode, changes to the database are detected using the wal-index
and so the change counter is not needed.  Hence, the change counter might
not be incremented on each transaction in WAL mode.</p>

<h4>1.2.7 In-header database size</h4>

<tcl>hd_fragment filesize {in-header database size}</tcl>
<p>^The 4-byte big-endian integer at offset 28 into the header 
stores the size of the database file in pages.  ^If this in-header
datasize size is not valid (see the next paragraph), then the database 
size is computed by looking
at the actual size of the database file. Older versions of SQLite
................................................................................
know to update the in-header database size and so the in-header
database size could be incorrect.  But legacy versions of SQLite
will also leave the version-valid-for number at offset 92 unchanged
so it will not match the change-counter.  Hence, invalid in-header
database sizes can be detected (and ignored) by observing when
the change-counter does not match the version-valid-for number.</p>

<h4>1.2.8 Free page list</h4>

<p>Unused pages in the database file are stored on a freelist.  ^The
4-byte big-endian integer at offset 32 stores the page number of
the first page of the freelist, or zero if the freelist is empty.
^The 4-byte big-endian integer at offset 36 stores stores the total 
number of pages on the freelist.</p>

<h4>1.2.9 Schema cookie</h4>

<p>^The schema cookie is a 4-byte big-endian integer at offset 40
that is incremented whenever the database schema changes.  A 
prepared statement is compiled against a specific version of the
database schema.  When the database schema changes, the statement
must be reprepared.  ^When a prepared statement runs, it first checks
the schema cookie to ensure the value is the same as when the statement
was prepared and if the schema cookie has changed, the statement either
automatically reprepares and reruns or it aborts with an [SQLITE_SCHEMA] 
error.</p>

<tcl>hd_fragment {schemaformat} {schema format} {schema format number}</tcl>
<h4>1.2.10 Schema format number</h4>

<p>The schema format number is a 4-byte big-endian integer at offset 44.
The schema format number is similar to the file format read and write
version numbers at offsets 18 and 19 except that the schema format number
refers to the high-level SQL formatting rather than the low-level b-tree
formatting.  Four schema format numbers are currently defined:</p>

................................................................................
<p>^New database files created by SQLite use format 4 by default.
^The [legacy_file_format pragma] can be used to cause SQLite
to create new database files using format 1.
The format version number can be made to default to 1 instead of 4 by
setting [SQLITE_DEFAULT_FILE_FORMAT]=1 at compile-time.
</p>

<h4>1.2.11 Suggested cache size</h4>

<p>The 4-byte big-endian signed integer at offset 48 is the suggested
cache size in pages for the database file.  The value is a suggestion
only and SQLite is under no obligation to honor it.  The absolute value
of the integer is used as the suggested size.  The suggested cache size
can be set using the [default_cache_size pragma].</p>

<h4>1.2.12 Incremental vacuum settings</h4>

<p>The two 4-byte big-endian integers at offsets 52 and 64 are used
to manage the [auto_vacuum] and [incremental_vacuum] modes.  ^If
the integer at offset 52 is zero then pointer-map (ptrmap) pages are
omitted from the database file and neither auto_vacuum nor
incremental_vacuum are supported.  ^If the integer at offset 52 is
non-zero then it is the page number of the largest root page in the
................................................................................
database file, the database file will contain ptrmap pages, and the
mode must be either auto_vacuum or incremental_vacuum.  ^In this latter
case, the integer at offset 64 is true for incremental_vacuum and
false for auto_vacuum.  ^If the integer at offset 52 is zero then
the integer at offset 64 must also be zero.</p>

<tcl>hd_fragment enc {text encoding}</tcl>
<h4>1.2.13 Text encoding</h4>

<p>^The 4-byte big-endian integer at offset 56 determines the encoding
used for all text strings stored in the database.  
^A value of 1 means UTF-8.
^A value of 2 means UTF-16le.
^A value of 3 means UTF-16be.
No other values are allowed.
^(The sqlite3.h header file defines C-preprocessor macros SQLITE_UTF8 as 1,
SQLITE_UTF16LE as 2, and SQLITE_UTF16BE as 3, to use in place of
the numeric codes for the text encoding.)^</p>

<h4>1.2.14 User version number</h4>

<p>^The 4-byte big-endian integer at offset 60 is the user version which
is set and queried by the [user_version pragma].  The user version is
not used by SQLite.</p>

<tcl>hd_fragment appid {Application ID}</tcl>
<h4>1.2.15 Application ID</h4>

<p>^The 4-byte big-endian integer at offset 68 is an "Application ID" that
can be set by the [PRAGMA application_id] command in order to identify the
database as belonging to or associated with a particular application.
The application ID is intended for database files used as an
[application file-format].  The application ID can be used by utilities 
such as [http://www.darwinsys.com/file/ | file(1)] to determine the specific
file type rather than just reporting "SQLite3 Database".  A list of
assigned application IDs can be seen by consulting the
[http://www.sqlite.org/src/artifact?ci=trunk&filename=magic.txt|magic.txt]
file in the SQLite source repository.</p>

<tcl>hd_fragment validfor {version-valid-for number}</tcl>
<h4>1.2.16 Write library version number and version-valid-for number</h4>

<p>^The 4-byte big-endian integer at offset 96 stores the 
[SQLITE_VERSION_NUMBER] value for the SQLite library that most
recently modified the database file.  ^The 4-byte big-endian integer at
offset 92 is the value of the [change counter] when the version number
was stored.  The integer at offset 92 indicates which transaction
the version number is valid for and is sometimes called the
"version-valid-for number".

<h4>1.2.16 Header space reserved for expansion</h4>

<p>All other bytes of the database file header are reserved for
future expansion and must be set to zero.</p>

<h3>1.3 The Lock-Byte Page</h3>

<p>The lock-byte page is the single page of the database file
that contains the bytes at offsets between 1073741824 and 1073742335,
inclusive.  A database file that is less than or equal to 1073741824 bytes 
in size contains no lock-byte page.  A database file larger than
1073741824 contains exactly one lock-byte page.
</p>
................................................................................
page according to the 
needs and proclivities of the underlying system.  The unix and win32
[VFS] implementations that come built into SQLite do not write to the
lock-byte page, but third-party VFS implementations for
other operating systems might.</p>

<tcl>hd_fragment {freelist} {freelist} {free-page list}</tcl>
<h3>1.4 The Freelist</h3>

<p>A database file might contain one or more pages that are not in
active use.  Unused pages can come about, for example, when information
is deleted from the database.  Unused pages are stored on the freelist
and are reused when additional pages are required.</p>

<p>The freelist is organized as a linked list of freelist trunk pages
................................................................................
<p>^The number of freelist pages is stored as a 4-byte big-endian integer
in the database header at an offset of 36 from the beginning of the file.
^The database header also stores the page number of the first freelist trunk
page as a 4-byte big-endian integer at an offset of 32 from the beginning
of the file.</p>

<tcl>hd_fragment btree {B*-Trees} {B-tree}</tcl>
<h3>1.5 B-tree Pages</h3>

<p>The b-tree algorithm provides key/data storage with unique and
ordered keys on page-oriented storage devices.
For background information on b-trees, see
Knuth, <u>The Art Of Computer Programming</u>, Volume 3 "Sorting
and Searching", pages 471-479.  Two kinds of b-trees are used by
SQLite.  The algorithm that Knuth calls "B*-Tree" stores all data
................................................................................
<tr><td align=center valign=top>7<td align=center valign=top>1<td align=left>
^The one-byte integer at offset 7 gives the number of fragmented free
bytes within the cell content area.
<tr><td align=center valign=top>8<td align=center valign=top>4<td align=left>
^(The four-byte page number at offset 8 is the right-most pointer.  This
value appears in the header of interior b-tree pages only and is omitted from
all other pages.)^
</table></blockquote></center>

<p>^The cell pointer array of a b-tree page immediately follows the b-tree
page header.  Let K be the number of cells on the btree.  ^The cell pointer
array consists of K 2-byte integer offsets to the cell contents.  ^The
cell pointers are arranged in key order with left-most cell (the cell with the
smallest key) first and the right-most cell (the cell with the largest
key) last.</p>
................................................................................
The lower seven bits of each of the first eight bytes and all 8 bits of
the ninth byte are used to reconstruct the 64-bit twos-complement integer.
Varints are big-endian: bits taken from the earlier byte of the varint
are the more significant than bits taken from the later bytes. </p>

<p>The format of a cell depends on which kind of b-tree page the cell
appears on.  The following table shows the elements of a cell, in
order of appearance, for the various b-tree page types.</p>

<blockquote><dl>
<dt><p>Table B-Tree Leaf Cell (header 0x0d):</p></dt>
<dd><p><ul>
<li>A varint which is the total number of bytes of payload, including any
overflow
<li>A varint which is the integer key, a.k.a. "[rowid]"
<li>The initial portion of the payload that does not spill to overflow
pages.
................................................................................
<li>A varint which is the total number of bytes of key payload, including any
overflow
<li>The initial portion of the payload that does not spill to overflow
pages.
<li>A 4-byte big-endian integer page number for the first page of the
overflow page list - omitted if all payload fits on the b-tree page.
</ul></p></dd>
</dl></blockquote>

<p>The information above can be recast into a table format as follows:</p>

<tcl>hd_fragment cellformat {cell format summary}</tcl>
<center>
<i>B-tree Cell Format</i>
<table border=1 width="80%">
................................................................................
    <td align=center valign=top>&nbsp;
    <td align=center valign=top>&#x2714;
    <td align=center valign=top>&#x2714;
    <td align=left>Page number of first overflow page
</table></center>



<tr><td align=center valign=top>

<p>The amount of payload that spills onto overflow pages also depends on
the page type.  For the following computations, let U be the usable size
of a database page, the total page size less the reserved space at the
end of each page.  And let P be the payload size.  In the following,
symbol X represents the maximum amount of payload that can be stored directly
on the b-tree page without spilling onto an overflow page and symbol M
represents the minimum amount of payload that must be stored on the btree
page before spilling is allowed.</p>

<blockquote><dl>
<dt>Table B-Tree Leaf Cell:</dt>
<dd><p>
^(Let X be U-35.  If the payload size P is less than or equal to X then
the entire payload is stored on the b-tree leaf page.)^
^(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X
then the number of bytes stored on the table b-tree leaf page is K
if K is less or equal to X or M otherwise.)^
^(Note that number of bytes stored on the leaf page is never less than M.)^
</p></dd>

<dt>Table B-Tree Interior Cell:</dt>
<dd><p>
Interior pages of table b-trees have no payload and so there is never
any payload to spill.
</p></dd>

<dt>Index B-Tree Leaf Or Interior Cell:</dt>
<dd><p>
^(Let X be ((U-12)*64/255)-23).  If the payload size P is less than
or equal to X then the entire payload is stored on the b-tree page.)^
^(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X then the number
of bytes stored on the index b-tree page is K if K is less than or
equal to X or M otherwise.)^
^(Note that number of bytes stored on the index page is never less than M.)^
</p></dd>
</dl></blockquote>

<p>Here is an alternative description of the same computation:

<ul>
<li>X is U-35 for table btree leaf pages or
    ((U-12)*64/255)-23 for index pages.
<li>M is always ((U-12)*32/255)-23.
<li>Let K be M+((P-M)%(U-4)).
<li>^If P<=X then all P bytes of payload are stored directly on the 
    btree page without overflow.
<li>^If P>X and K<=X then the first K bytes of P are stored on the 
    btree page and the remaining P-K bytes are stored on overflow pages.
<li>^If P>X and K>X then the first M bytes of P are stored on the
    btree page and the remaining P-M bytes are stored on overflow pages.
</ul>

<p>The overflow thresholds are designed to give a minimum fanout of
4 for index b-trees and to make sure enough of the payload
................................................................................
without consulting an overflow page.  In hindsight, the designers of
the SQLite b-tree logic realize that these thresholds could have been
made much simpler.  However, the computations cannot be changed
without resulting in an incompatible file format.  And the current computations
work well, even if they are a little complex.</p>

<tcl>hd_fragment ovflpgs {overflow page} {overflow pages}</tcl>
<h3>1.6 Cell Payload Overflow Pages</h3>

<p>^When the payload of a b-tree cell is too large for the b-tree page,
the surplus is spilled onto overflow pages.  ^Overflow pages form a linked
list.  ^The first four bytes of each overflow page are a big-endian
integer which is the page number of the next page in the chain, or zero
for the final page in the chain.  ^The fifth byte through the last usable
byte are used to hold overflow content.</p>

<h3>1.7 Pointer Map or Ptrmap Pages</h3>

<p>Pointer map or ptrmap pages are extra pages inserted into the database
to make the operation of [auto_vacuum] and [incremental_vacuum] modes
more efficient.  Other page types in the database typically have pointers
from parent to child.  For example, an interior b-tree page contains pointers
to its child b-tree pages and an overflow chain has a pointer
from earlier to later links in the chain.  A ptrmap page contains linkage
................................................................................
logic does not know how to update the root_page field of the sqlite_master
table and so it is necessary to prevent root pages from being moved
during an auto-vacuum in order to preserve the integrity of the
sqlite_master table.  ^Root pages are moved to the beginning of the
database file by the CREATE TABLE, CREATE INDEX, DROP TABLE, and
DROP INDEX operations.</p>

<h2>2.0 Schema Layer</h2>

<p>The foregoing text describes low-level aspects of the SQLite file
format.  The b-tree mechanism provides a powerful and efficient means of
accessing a large data set.  This section will describe how the
low-level b-tree layer is used to implement higher-level SQL
capabilities.</p>

<tcl>hd_fragment record_format {record format}</tcl>
<h3>2.1 Record Format</h3>

<p>The data for a table b-tree leaf page and the key
of an index b-tree page was characterized above
as an arbitrary sequence of bytes.
The prior discussion mentioned one key being less than another, but
did not define what "less than" meant.  The current section will address
these omissions.</p>
................................................................................
The varint format is very efficient at coding the record header.</p>

<p>^The values for each column in the record immediately follow the header.
^(Note that for serial types 0, 8, 9, 12, and 13, the value is zero bytes in
length.  If all columns are of these types then the body section of the
record is empty.)^</p>

<h3>2.2 Record Sort Order</h3>

<p>The order of keys in an index b-tree is determined by the sort order of
the records that the keys represent.  Record comparison progresses column
by column.  Columns of a record are examined from left to right.  The
first pair of columns that are not equal determines the relative order
of the two records.  The sort order of individual columns is as
follows:</p>
................................................................................
^Alternative collating functions for table columns can be specified in the
[CREATE TABLE] statement using the COLLATE clause on the [column definition].
^When a column is indexed, the same collating function specified in the
[CREATE TABLE] statement is used for the column in the index, by default,
though this can be overridden using a COLLATE clause in the 
[CREATE INDEX] statement.

<h3>2.3 Representation Of SQL Tables</h3>

<p> ^Each ordinary SQL table in the database schema is represented on-disk
by a table b-tree.  ^Each entry in the table b-tree corresponds to a row
of the SQL table.  ^The [rowid] of the SQL table is the 64-bit signed
integer key for each entry in the table b-tree.</p>

<p> ^The content of each SQL table row is stored in the database file by
................................................................................
<p> ^If the [affinity] of a column is REAL and that column contains a
value that can be converted to an integer without loss of information
(if the value contains no fractional part and is not too large to be
represented as an integer) then the column may be stored in the record
as an integer.  ^SQLite will convert the value back to floating
point when extracting it from the record.</p>

<h3>2.4 Representation of WITHOUT ROWID Tables</h3>

<p>^If an SQL table is created using the "WITHOUT ROWID" clause at the
end of its CREATE TABLE statement, then that table is a [WITHOUT ROWID]
table and uses a different on-disk representation.  ^A WITHOUT ROWID
table uses an index b-tree rather than a table b-tree for storage.
^The key for each entry in the WITHOUT ROWID b-tree is a record composed
of the columns of the PRIMARY KEY followed by all remaining columns of
................................................................................
as the content encoding for an ordinary rowid table, except that the
order of the columns is rearranged so that PRIMARY KEY columns come
first, and the content is used as the key in an index b-tree rather
than as the data in a table b-tree.
^The special encoding rules for columns with REAL affinity
apply to WITHOUT ROWID tables the same as they do with rowid tables.

<h3>2.5 Representation Of SQL Indices</h3>

<p>^Each SQL index, whether explicitly declared via a [CREATE INDEX] statement
or implied by a UNIQUE or PRIMARY KEY constraint, corresponds to an 
index b-tree in the database file.
^Each entry in the index b-tree corresponds to a single row in the 
associated SQL table.
^The key to an index b-tree is
................................................................................
table and entries in each index associated with that table.
^However, in a [partial index], the index b-tree only contains entries
corresponding to table rows for which the WHERE clause expression on the
CREATE INDEX statement is true.
^Corresponding rows in the index and table b-trees share the same rowid
or primary key values and contain the same value for all indexed columns.</p>

<h4>2.5.1 Suppression of redundant columns in WITHOUT ROWID secondary indexed
</h4>

<p> ^In an index on a WITHOUT ROWID table, if one or more of the columns
of the table PRIMARY KEY are also columns of the index, then the
indexed column is not repeated in the table-key suffix on the end of
the index record.  ^(As an example, consider the following SQL:

<blockquote><pre>
................................................................................

<p> ^The suppression of redundant columns in the key suffix of an index
entry only occurs in WITHOUT ROWID tables.  ^In an ordinary rowid table,
the index entry always ends with the rowid even if the [INTEGER PRIMARY KEY]
column is one of the columns being indexed.</p>

<tcl>hd_fragment sqlite_master {sqlite_master} {sqlite_master table}</tcl>
<h3>2.6 Storage Of The SQL Database Schema</h3>

<p>^Page 1 of a database file is the root page of a table b-tree that
holds a special table named "sqlite_master" (or "sqlite_temp_master" in
the case of a TEMP database) which stores the complete
database schema.  ^(The structure of the sqlite_master table is as
if it had been created using the following SQL:</p>

................................................................................


<p>^(The sqlite_master.type column will be one
of the following text strings:  'table', 'index', 'view', or 'trigger'
according to the type of object defined.  The 'table' string is used
for both ordinary and [virtual tables].)^</p>

</p>^(The sqlite_master.name column will hold the name of the object.)^
^([UNIQUE] and [PRIMARY KEY] constraints on tables cause SQLite to create
[internal indexes] with names of the form "sqlite_autoindex_TABLE_N"
where TABLE is replaced by the name of the table that contains the
constraint and N is an integer beginning with 1 and increasing by one
with each constraint seen in the table definition.)^
^(In a [WITHOUT ROWID] table, there is no sqlite_master entry for the
PRIMARY KEY, but the "sqlite_autoindex_TABLE_N" name is set aside
................................................................................
^(The sqlite_master.sql is NULL for the [internal indexes] that are
automatically created by [UNIQUE] or [PRIMARY KEY] constraints.)^</p>


<tcl>hd_fragment intschema {internal schema objects} \
{internal schema object} {internal index} {internal indexes} \
{internal table} {internal tables}</tcl>
<h4>2.6.1 Internal Schema Objects</h4>

<p>^In addition to the tables, indexes, views, and triggers created by
the application and/or the developer using CREATE statements SQL, the
sqlite_master table may contain zero or more entries for 
<i>internal schema objects</i> that are created by SQLite for its 
own internal use.  ^The names of internal schema objects
always begin with "sqlite_" and any table, index, view, or trigger
................................................................................
       algorithm to use for each query.
</ul>

<p>New internal schema objects names, always beginning with "sqlite_",
may be added to the SQLite file format in future releases.

<tcl>hd_fragment seqtab {sqlite_sequence}</tcl>
<h4>2.6.2 The sqlite_sequence table</h4>

<p>^The sqlite_sequence table is an internal table used to help implement
[AUTOINCREMENT].  ^The sqlite_sequence table is created automatically
whenever any ordinary table with an AUTOINCREMENT integer primary
key is created.  ^Once created, the sqlite_sequence table exists in the
sqlite_master table forever; it cannot be dropped.
^(The schema for the sqlite_sequence table is:
................................................................................
<p>^Application code is allowed to modify the sqlite_sequence table, to add
new rows, to delete rows, or to modify existing rows.  ^However, application
code cannot create the sqlite_sequence table if it does not already exist.
^Application code can delete all entries from the sqlite_sequence table,
but application code cannot drop the sqlite_sequence table.

<tcl>hd_fragment stat1tab {sqlite_stat1} SQLITE_STAT1 </tcl>
<h4>2.6.3 The sqlite_stat1 table</h4>

<p>^The sqlite_stat1 is an internal table created by the [ANALYZE] command
and used to hold supplemental information about tables and indexes that the
query planner can use to help it find better ways of performing queries.
^Applications can update, delete from, insert into or drop the sqlite_stat1
table, but may not create or alter the sqlite_stat1 table.
^(The schema of the sqlite_stat1 table is as follows:
................................................................................
of the stat column are silently ignored.

<p>^(If the sqlite_stat1.idx column is NULL, then the sqlite_stat1.stat
column contains a single integer which is the approximate number of
rows in the table identified by sqlite_stat1.tbl.)^

<tcl>hd_fragment stat2tab {sqlite_stat2}</tcl>
<h4>2.6.4 The sqlite_stat2 table</h4>

<p>The sqlite_stat2 is only created and is only used if SQLite is compiled
with SQLITE_ENABLE_STAT2 and if the SQLite version number is between
3.6.18 and 3.7.8.  The sqlite_stat2 table is neither read nor written by any
version of SQLite before 3.6.18 nor after 3.7.8.
The sqlite_stat2 table contains additional information
about the distribution of keys within an index.
................................................................................
10 uniform buckets and the samples are the middle row from each bucket.

<p>The format for sqlite_stat2 is recorded here for legacy reference.  
Recent versions of SQLite no longer support sqlite_stat2 and the
sqlite_stat2 table, it is exists, is simply ignored.

<tcl>hd_fragment stat3tab {sqlite_stat3} SQLITE_STAT3</tcl>
<h4>2.6.5 The sqlite_stat3 table</h4>

<p>The sqlite_stat3 is only used if SQLite is compiled
with [SQLITE_ENABLE_STAT3] or [SQLITE_ENABLE_STAT4]
and if the SQLite version number is 3.7.9 or greater.
The sqlite_stat3 table is neither read nor written by any
version of SQLite before 3.7.9.
If the [SQLITE_ENABLE_STAT4] compile-time option is used and the
................................................................................
index must appear in the same order that they occur in the index.  
In other words, if the entry with left-most column S1 is earlier in
the index b-tree than the
entry with left-most column S2, then in the sqlite_stat3 table, 
sample S1 must have a smaller rowid than sample S2.)^

<tcl>hd_fragment stat4tab {sqlite_stat4} SQLITE_STAT4</tcl>
<h4>2.6.6 The sqlite_stat4 table</h4>

<p>The sqlite_stat4 is only created and is only used if SQLite is compiled
with [SQLITE_ENABLE_STAT4] and if the SQLite version number is
3.8.1 or greater.  The sqlite_stat4 table is neither read nor written by any
version of SQLite before 3.8.1.
The sqlite_stat4 table contains additional information
about the distribution of keys within an index or the distribution of
................................................................................
<p>^(In a well-formed sqlite_stat4 table, the samples for any single
index must appear in the same order that they occur in the index.  
In other words, if entry S1 is earlier in the index b-tree than 
entry S2, then in the sqlite_stat4 table, sample S1 must have a
smaller rowid than sample S2.)^

<tcl>hd_fragment rollbackjournal {rollback journal format}</tcl>
<h2>3.0 The Rollback Journal</h2>

<p>The rollback journal is a file associated with each SQLite database
file that hold information used to restore the database file to its initial
state during the course of a transaction.
^The rollback journal file is always located in the same 
directory as the database
file and has the same name as the database file but with the string
................................................................................
of three ways:
<ol>
<li>^(The rollback journal file can be deleted)^,
<li>^(The rollback journal file can be truncated to zero length)^, or
<li>^(The header of the rollback journal can be overwritten with
invalid header text (for example, all zeros).)^
</ol>

^These three ways of committing a transaction correspond to the DELETE,
TRUNCATE, and PERSIST settings, respectively, of the [journal_mode pragma].
</p>


<p>A valid rollback journal begins with a header in the following format:</p>

................................................................................
journal must contain the same database page size and sector size.</p>

<p>^If M is -1 in the initial journal header, then the number of page records
that follow is computed by computing how many page records will fit in
the available space of the remainder of the journal file.</p>

<tcl>hd_fragment walformat {WAL format}</tcl>
<h2>4.0 The Write-Ahead Log</h2>

<p>Beginning with [version 3.7.0], SQLite supports a new transaction
control mechanism called "[WAL | write-ahead log]" or "[WAL]".
^When a database is in WAL mode, all connections to that database must
use the WAL.  ^A particular database will use either a rollback journal
or a WAL, but not both at the same time.
^The WAL is always located in the same directory as the database
file and has the same name as the database file but with the string
"<tt>-wal</tt>" appended.</p>

<h3>4.1 WAL File Format</h3>

<p>A WAL file consists of a header followed by zero or more "frames".
Each frame records the revised content of a single page from the
database file.  All changes to the database are recorded by writing
frames into the WAL.  Transactions commit when a frame is written that
contains a commit marker.  ^A single WAL can and usually does record 
multiple transactions.  Periodically, the content of the WAL is
................................................................................
       exactly match the checksum computed consecutively on the
       first 24 bytes of the WAL header and the first 8 bytes and
       the content of all frames
       up to and including the current frame.</p></li></li>
</ol>)^

<tcl>hd_fragment walcksm {WAL checksum algorithm}</tcl>
<h3>4.2 Checksum Algorithm</h3>

<p>The checksum is computed by interpreting the input as
an even number of unsigned 32-bit integers: x(0) through x(N).
^The 32-bit integers are big-endian if the
magic number in the first 4 bytes of the WAL header is 0x377f0683 and
the integers are little-endian if the magic number is 0x377f0682.
^The checksum values are always stored in the frame header in a
................................................................................
</pre></blockquote>)^

<p>^The outputs s0 and s1 are both weighted checksums using Fibonacci weights
in reverse order.  (^The largest Fibonacci weight occurs on the first element
of the sequence being summed.)  ^The s1 value spans all 32-bit integer
terms of the sequence whereas s0 omits the final term.</p>

<h3>4.3 Checkpoint Algorithm</h3>

<p>^On a [checkpoint], the WAL is first flushed to persistent storage using
the xSync method of the [sqlite3_io_methods | VFS]. 
^Then valid content of the WAL is transferred into the database file.
^Finally, the database is flushed to persistent storage using another
xSync method call.
The xSync operations serve as write barriers - all writes launched
................................................................................
the WAL file from the beginning.  ^At the start of the first new
write transaction, the WAL header salt-1 value is incremented
and the salt-2 value is randomized.  These changes to the salts invalidate
old frames in the WAL that have already been checkpointed but not yet
overwritten, and prevent them from being checkpointed again.</p>

<tcl>hd_fragment walread {WAL read algorithm}</tcl>
<h3>4.4 Reader Algorithm</h3>

<p>^(To read a page from the database (call it page number P), a reader
first checks the WAL to see if it contains page P.  If so, then the
last valid instance of page P that is followed by a commit frame
or is a commit frame itself becomes the value read.)^  ^If the WAL
contains no copies of page P that are valid and which are a commit
frame or are followed by a commit frame, then page P is read from
................................................................................
reader has to scan the entire WAL looking for page P frames.  If the
WAL is large (multiple megabytes is typical) that scan can be slow,
and read performance suffers.  ^To overcome this problem, a separate
data structure called the wal-index is maintained to expedite the
search for frames of a particular page.</p>

<tcl>hd_fragment walindexformat {wal-index} {WAL-index format}</tcl>
<h3>4.5 WAL-Index Format</h3>

<p>Conceptually, the wal-index is shared memory, though the current
VFS implementations use a mmapped file for the wal-index.  ^The mmapped
file is in the same directory as the database and has the same name
as the database with a "<tt>-shm</tt>" suffix appended.  Because
the wal-index is shared memory, SQLite does not support 
[PRAGMA journal_mode | journal_mode=WAL] 



|
<
<




|







 







|







 







|







 







|






|







 







|











|







 







|









|







 







|







 







|







|












|







 







|







|







 







|











|






|













|









|




|







 







|







 







|







 







|







 







|

|







 







|







 







<
<
<







|

|
|










|





|









|








|

|







 







|








|







 







|








|







 







|







 







|







 







|







 







|







 







|
|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







>







 







|










|







 







|







 







|







 







|







 







|







1
2
3
4


5
6
7
8
9
10
11
12
13
14
15
16
..
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
..
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
...
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
...
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
...
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
...
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
...
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
...
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
...
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
...
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
...
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
...
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
...
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
...
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
...
737
738
739
740
741
742
743



744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
...
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
...
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
...
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
....
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
....
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
....
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
....
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
....
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
....
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
....
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
....
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
....
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
....
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
....
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
....
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
....
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
....
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
....
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
....
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
....
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
....
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
....
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
<title>File Format For SQLite Databases</title>
<tcl>hd_keywords {file format} {second edition file format document}</tcl>

<table_of_contents>



<p>This document describes and defines the on-disk database file
format used by SQLite.</p>

<h1>The Database File</h1>

<p>The complete state of an SQLite database is usually
contained a single file on disk called the "main database file".</p>

<p>During a transaction, SQLite stores additional information 
in a second file called the "rollback journal", or if SQLite is in
[WAL mode], a write-ahead log file.
................................................................................
the state of the database, they are called a "hot journal" or "hot WAL file".
Hot journals and WAL files are only a factor during error recovery
scenarios and so are uncommon, but they are part of the state of an SQLite
database and so cannot be ignored.  This document defines the format
of a rollback journal and the write-ahead log file, but the focus is
on the main database file.</p>

<h2>Pages</h2>

<p>The main database file consists of one or more pages.  ^The size of a
page is a power of two between 512 and 65536 inclusive.  All pages within
the same database are the same size.  ^The page size for a database file
is determined by the 2-byte integer located at an offset of
16 bytes from the beginning of the database file.</p>

................................................................................
rolled back, the rollback journal can then be used to restore the
database to its original state.  ^Freelist leaf pages bear no
information that would need to be restored on a rollback and so they
are not written to the journal prior to modification, in order to
reduce disk I/O.</p>

<tcl>hd_fragment database_header {database header}</tcl>
<h2>The Database Header</h2>

<p>The first 100 bytes of the database file comprise the database file 
header.  The database file header is divided into fields as shown by
the table below.  All multibyte fields in the database file header are
stored with the most significant byte first (big-endian).</p>

<center>
................................................................................
Reserved for expansion.  Must be zero.
<tr><td valign=top align=center>92<td valign=top align=center>4<td align=left>
The [version-valid-for number].
<tr><td valign=top align=center>96<td valign=top align=center>4<td align=left>
[SQLITE_VERSION_NUMBER]
</table></center>

<h3>Magic Header String</h3>

<p>^Every valid SQLite database file begins with the following 16 bytes 
(in hex): 53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
corresponds to the UTF-8 string "SQLite format 3" including the nul
terminator character at the end.</p>

<h3>Page Size</h3>

<p>The two-byte value beginning at offset 16 determines the page size of 
the database.  For SQLite versions 3.7.0.1 and earlier, this value is 
interpreted as a big-endian integer and must be a power of two between
512 and 32768, inclusive.  Beginning with SQLite version 3.7.1, a page
size of 65536 bytes is supported.  The value 65536 will not fit in a
two-byte integer, so to specify a 65536-byte page size, the value
................................................................................
at offset 16 is 0x00 0x01.
This value can be interpreted as a big-endian
1 and thought of is as a magic number to represent the 65536 page size.
Or one can view the two-byte field as a little endian number and say
that it represents the page size divided by 256.  These two 
interpretations of the page-size field are equivalent.</p>

<h3>File format version numbers</h3>

<p>The file format write version and file format read version at offsets
18 and 19 are intended to allow for enhancements of the file format
in future versions of SQLite.  In current versions of SQLite, both of
these values are 1 for rollback journalling modes and 2 for [WAL]
journalling mode.  If a version of SQLite coded to the current
file format specification encounters a database file where the read
version is 1 or 2 but the write version is greater than 2, then the database
file must be treated as read-only.  If a database file with a read version
greater than 2 is encountered, then that database cannot be read or written.</p>

<h3>Reserved bytes per page</h3>

<p>SQLite has the ability to set aside a small number of extra bytes at
the end of every page for use by extensions.  These extra bytes are
used, for example, by the SQLite Encryption Extension to store a nonce
and/or cryptographic checksum associated with each page.  ^The 
"reserved space" size in the 1-byte integer at offset 20 is the number
of bytes of space at the end of each page to reserve for extensions.
................................................................................
<p>The "usable size" of a database page is the page size specify by the
2-byte integer at offset 16 in the header less the "reserved" space size
recorded in the 1-byte integer at offset 20 in the header.  The usable
size of a page might be an odd number.  ^(However, the usable size is not
allowed to be less than 480.  In other words, if the page size is 512,
then the reserved space size cannot exceed 32.)^</p>

<h3>Payload fractions</h3>

<p>^The maximum and minimum embedded payload fractions and the leaf
payload fraction values must be 64, 32, and 32.  These values were
originally intended to be tunable parameters that could be used to
modify the storage format of the b-tree algorithm.  However, that
functionality is not supported and there are no current plans to add
support in the future.  Hence, these three bytes are fixed at the
values specified.</p>

<h3>File change counter</h3>

<tcl>hd_fragment chngctr {change counter}</tcl>
<p>^The file change counter is a 4-byte big-endian integer at
offset 24 that is incremented whenever the database file is unlocked
after having been modified.
When two or more processes are reading the same database file, each 
process can detect database changes from other processes by monitoring 
................................................................................
another process modified the database, since the cache has become stale.
The file change counter facilitates this.</p>

<p>In WAL mode, changes to the database are detected using the wal-index
and so the change counter is not needed.  Hence, the change counter might
not be incremented on each transaction in WAL mode.</p>

<h3>In-header database size</h3>

<tcl>hd_fragment filesize {in-header database size}</tcl>
<p>^The 4-byte big-endian integer at offset 28 into the header 
stores the size of the database file in pages.  ^If this in-header
datasize size is not valid (see the next paragraph), then the database 
size is computed by looking
at the actual size of the database file. Older versions of SQLite
................................................................................
know to update the in-header database size and so the in-header
database size could be incorrect.  But legacy versions of SQLite
will also leave the version-valid-for number at offset 92 unchanged
so it will not match the change-counter.  Hence, invalid in-header
database sizes can be detected (and ignored) by observing when
the change-counter does not match the version-valid-for number.</p>

<h3>Free page list</h3>

<p>Unused pages in the database file are stored on a freelist.  ^The
4-byte big-endian integer at offset 32 stores the page number of
the first page of the freelist, or zero if the freelist is empty.
^The 4-byte big-endian integer at offset 36 stores stores the total 
number of pages on the freelist.</p>

<h3>Schema cookie</h3>

<p>^The schema cookie is a 4-byte big-endian integer at offset 40
that is incremented whenever the database schema changes.  A 
prepared statement is compiled against a specific version of the
database schema.  When the database schema changes, the statement
must be reprepared.  ^When a prepared statement runs, it first checks
the schema cookie to ensure the value is the same as when the statement
was prepared and if the schema cookie has changed, the statement either
automatically reprepares and reruns or it aborts with an [SQLITE_SCHEMA] 
error.</p>

<tcl>hd_fragment {schemaformat} {schema format} {schema format number}</tcl>
<h3>Schema format number</h3>

<p>The schema format number is a 4-byte big-endian integer at offset 44.
The schema format number is similar to the file format read and write
version numbers at offsets 18 and 19 except that the schema format number
refers to the high-level SQL formatting rather than the low-level b-tree
formatting.  Four schema format numbers are currently defined:</p>

................................................................................
<p>^New database files created by SQLite use format 4 by default.
^The [legacy_file_format pragma] can be used to cause SQLite
to create new database files using format 1.
The format version number can be made to default to 1 instead of 4 by
setting [SQLITE_DEFAULT_FILE_FORMAT]=1 at compile-time.
</p>

<h3>Suggested cache size</h3>

<p>The 4-byte big-endian signed integer at offset 48 is the suggested
cache size in pages for the database file.  The value is a suggestion
only and SQLite is under no obligation to honor it.  The absolute value
of the integer is used as the suggested size.  The suggested cache size
can be set using the [default_cache_size pragma].</p>

<h3>Incremental vacuum settings</h3>

<p>The two 4-byte big-endian integers at offsets 52 and 64 are used
to manage the [auto_vacuum] and [incremental_vacuum] modes.  ^If
the integer at offset 52 is zero then pointer-map (ptrmap) pages are
omitted from the database file and neither auto_vacuum nor
incremental_vacuum are supported.  ^If the integer at offset 52 is
non-zero then it is the page number of the largest root page in the
................................................................................
database file, the database file will contain ptrmap pages, and the
mode must be either auto_vacuum or incremental_vacuum.  ^In this latter
case, the integer at offset 64 is true for incremental_vacuum and
false for auto_vacuum.  ^If the integer at offset 52 is zero then
the integer at offset 64 must also be zero.</p>

<tcl>hd_fragment enc {text encoding}</tcl>
<h3>Text encoding</h3>

<p>^The 4-byte big-endian integer at offset 56 determines the encoding
used for all text strings stored in the database.  
^A value of 1 means UTF-8.
^A value of 2 means UTF-16le.
^A value of 3 means UTF-16be.
No other values are allowed.
^(The sqlite3.h header file defines C-preprocessor macros SQLITE_UTF8 as 1,
SQLITE_UTF16LE as 2, and SQLITE_UTF16BE as 3, to use in place of
the numeric codes for the text encoding.)^</p>

<h3>User version number</h3>

<p>^The 4-byte big-endian integer at offset 60 is the user version which
is set and queried by the [user_version pragma].  The user version is
not used by SQLite.</p>

<tcl>hd_fragment appid {Application ID}</tcl>
<h3>Application ID</h3>

<p>^The 4-byte big-endian integer at offset 68 is an "Application ID" that
can be set by the [PRAGMA application_id] command in order to identify the
database as belonging to or associated with a particular application.
The application ID is intended for database files used as an
[application file-format].  The application ID can be used by utilities 
such as [http://www.darwinsys.com/file/ | file(1)] to determine the specific
file type rather than just reporting "SQLite3 Database".  A list of
assigned application IDs can be seen by consulting the
[http://www.sqlite.org/src/artifact?ci=trunk&filename=magic.txt|magic.txt]
file in the SQLite source repository.</p>

<tcl>hd_fragment validfor {version-valid-for number}</tcl>
<h3>Write library version number and version-valid-for number</h3>

<p>^The 4-byte big-endian integer at offset 96 stores the 
[SQLITE_VERSION_NUMBER] value for the SQLite library that most
recently modified the database file.  ^The 4-byte big-endian integer at
offset 92 is the value of the [change counter] when the version number
was stored.  The integer at offset 92 indicates which transaction
the version number is valid for and is sometimes called the
"version-valid-for number".

<h3>Header space reserved for expansion</h3>

<p>All other bytes of the database file header are reserved for
future expansion and must be set to zero.</p>

<h2>The Lock-Byte Page</h2>

<p>The lock-byte page is the single page of the database file
that contains the bytes at offsets between 1073741824 and 1073742335,
inclusive.  A database file that is less than or equal to 1073741824 bytes 
in size contains no lock-byte page.  A database file larger than
1073741824 contains exactly one lock-byte page.
</p>
................................................................................
page according to the 
needs and proclivities of the underlying system.  The unix and win32
[VFS] implementations that come built into SQLite do not write to the
lock-byte page, but third-party VFS implementations for
other operating systems might.</p>

<tcl>hd_fragment {freelist} {freelist} {free-page list}</tcl>
<h2>The Freelist</h2>

<p>A database file might contain one or more pages that are not in
active use.  Unused pages can come about, for example, when information
is deleted from the database.  Unused pages are stored on the freelist
and are reused when additional pages are required.</p>

<p>The freelist is organized as a linked list of freelist trunk pages
................................................................................
<p>^The number of freelist pages is stored as a 4-byte big-endian integer
in the database header at an offset of 36 from the beginning of the file.
^The database header also stores the page number of the first freelist trunk
page as a 4-byte big-endian integer at an offset of 32 from the beginning
of the file.</p>

<tcl>hd_fragment btree {B*-Trees} {B-tree}</tcl>
<h2>B-tree Pages</h2>

<p>The b-tree algorithm provides key/data storage with unique and
ordered keys on page-oriented storage devices.
For background information on b-trees, see
Knuth, <u>The Art Of Computer Programming</u>, Volume 3 "Sorting
and Searching", pages 471-479.  Two kinds of b-trees are used by
SQLite.  The algorithm that Knuth calls "B*-Tree" stores all data
................................................................................
<tr><td align=center valign=top>7<td align=center valign=top>1<td align=left>
^The one-byte integer at offset 7 gives the number of fragmented free
bytes within the cell content area.
<tr><td align=center valign=top>8<td align=center valign=top>4<td align=left>
^(The four-byte page number at offset 8 is the right-most pointer.  This
value appears in the header of interior b-tree pages only and is omitted from
all other pages.)^
</table></center>

<p>^The cell pointer array of a b-tree page immediately follows the b-tree
page header.  Let K be the number of cells on the btree.  ^The cell pointer
array consists of K 2-byte integer offsets to the cell contents.  ^The
cell pointers are arranged in key order with left-most cell (the cell with the
smallest key) first and the right-most cell (the cell with the largest
key) last.</p>
................................................................................
The lower seven bits of each of the first eight bytes and all 8 bits of
the ninth byte are used to reconstruct the 64-bit twos-complement integer.
Varints are big-endian: bits taken from the earlier byte of the varint
are the more significant than bits taken from the later bytes. </p>

<p>The format of a cell depends on which kind of b-tree page the cell
appears on.  The following table shows the elements of a cell, in
order of appearance, for the various b-tree page types.

<dl>
<dt><p>Table B-Tree Leaf Cell (header 0x0d):</p></dt>
<dd><p><ul>
<li>A varint which is the total number of bytes of payload, including any
overflow
<li>A varint which is the integer key, a.k.a. "[rowid]"
<li>The initial portion of the payload that does not spill to overflow
pages.
................................................................................
<li>A varint which is the total number of bytes of key payload, including any
overflow
<li>The initial portion of the payload that does not spill to overflow
pages.
<li>A 4-byte big-endian integer page number for the first page of the
overflow page list - omitted if all payload fits on the b-tree page.
</ul></p></dd>
</dl>

<p>The information above can be recast into a table format as follows:</p>

<tcl>hd_fragment cellformat {cell format summary}</tcl>
<center>
<i>B-tree Cell Format</i>
<table border=1 width="80%">
................................................................................
    <td align=center valign=top>&nbsp;
    <td align=center valign=top>&#x2714;
    <td align=center valign=top>&#x2714;
    <td align=left>Page number of first overflow page
</table></center>





<p>The amount of payload that spills onto overflow pages also depends on
the page type.  For the following computations, let U be the usable size
of a database page, the total page size less the reserved space at the
end of each page.  And let P be the payload size.  In the following,
symbol X represents the maximum amount of payload that can be stored directly
on the b-tree page without spilling onto an overflow page and symbol M
represents the minimum amount of payload that must be stored on the btree
page before spilling is allowed.

<dl>
<dt><p>Table B-Tree Leaf Cell:</dt>
<dd><p>
^(Let X be U-35.  If the payload size P is less than or equal to X then
the entire payload is stored on the b-tree leaf page.)^
^(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X
then the number of bytes stored on the table b-tree leaf page is K
if K is less or equal to X or M otherwise.)^
^(Note that number of bytes stored on the leaf page is never less than M.)^
</p></dd>

<dt><p>Table B-Tree Interior Cell:</dt>
<dd><p>
Interior pages of table b-trees have no payload and so there is never
any payload to spill.
</p></dd>

<dt><p>Index B-Tree Leaf Or Interior Cell:</dt>
<dd><p>
^(Let X be ((U-12)*64/255)-23).  If the payload size P is less than
or equal to X then the entire payload is stored on the b-tree page.)^
^(Let M be ((U-12)*32/255)-23 and let K be M+((P-M)%(U-4)).
If P is greater than X then the number
of bytes stored on the index b-tree page is K if K is less than or
equal to X or M otherwise.)^
^(Note that number of bytes stored on the index page is never less than M.)^
</p></dd>
</dl>

<p>Here is an alternative description of the same computation:

<ul>
<li>X is U-35 for table btree leaf pages or
    ((U-12)*64/255)-23 for index pages.
<li>M is always ((U-12)*32/255)-23.
<li>Let K be M+((P-M)%(U-4)).
<li>^If P&lt;=X then all P bytes of payload are stored directly on the 
    btree page without overflow.
<li>^If P>X and K&lt;=X then the first K bytes of P are stored on the 
    btree page and the remaining P-K bytes are stored on overflow pages.
<li>^If P>X and K>X then the first M bytes of P are stored on the
    btree page and the remaining P-M bytes are stored on overflow pages.
</ul>

<p>The overflow thresholds are designed to give a minimum fanout of
4 for index b-trees and to make sure enough of the payload
................................................................................
without consulting an overflow page.  In hindsight, the designers of
the SQLite b-tree logic realize that these thresholds could have been
made much simpler.  However, the computations cannot be changed
without resulting in an incompatible file format.  And the current computations
work well, even if they are a little complex.</p>

<tcl>hd_fragment ovflpgs {overflow page} {overflow pages}</tcl>
<h2>Cell Payload Overflow Pages</h2>

<p>^When the payload of a b-tree cell is too large for the b-tree page,
the surplus is spilled onto overflow pages.  ^Overflow pages form a linked
list.  ^The first four bytes of each overflow page are a big-endian
integer which is the page number of the next page in the chain, or zero
for the final page in the chain.  ^The fifth byte through the last usable
byte are used to hold overflow content.</p>

<h2>Pointer Map or Ptrmap Pages</h2>

<p>Pointer map or ptrmap pages are extra pages inserted into the database
to make the operation of [auto_vacuum] and [incremental_vacuum] modes
more efficient.  Other page types in the database typically have pointers
from parent to child.  For example, an interior b-tree page contains pointers
to its child b-tree pages and an overflow chain has a pointer
from earlier to later links in the chain.  A ptrmap page contains linkage
................................................................................
logic does not know how to update the root_page field of the sqlite_master
table and so it is necessary to prevent root pages from being moved
during an auto-vacuum in order to preserve the integrity of the
sqlite_master table.  ^Root pages are moved to the beginning of the
database file by the CREATE TABLE, CREATE INDEX, DROP TABLE, and
DROP INDEX operations.</p>

<h1>Schema Layer</h1>

<p>The foregoing text describes low-level aspects of the SQLite file
format.  The b-tree mechanism provides a powerful and efficient means of
accessing a large data set.  This section will describe how the
low-level b-tree layer is used to implement higher-level SQL
capabilities.</p>

<tcl>hd_fragment record_format {record format}</tcl>
<h2>Record Format</h2>

<p>The data for a table b-tree leaf page and the key
of an index b-tree page was characterized above
as an arbitrary sequence of bytes.
The prior discussion mentioned one key being less than another, but
did not define what "less than" meant.  The current section will address
these omissions.</p>
................................................................................
The varint format is very efficient at coding the record header.</p>

<p>^The values for each column in the record immediately follow the header.
^(Note that for serial types 0, 8, 9, 12, and 13, the value is zero bytes in
length.  If all columns are of these types then the body section of the
record is empty.)^</p>

<h2>Record Sort Order</h2>

<p>The order of keys in an index b-tree is determined by the sort order of
the records that the keys represent.  Record comparison progresses column
by column.  Columns of a record are examined from left to right.  The
first pair of columns that are not equal determines the relative order
of the two records.  The sort order of individual columns is as
follows:</p>
................................................................................
^Alternative collating functions for table columns can be specified in the
[CREATE TABLE] statement using the COLLATE clause on the [column definition].
^When a column is indexed, the same collating function specified in the
[CREATE TABLE] statement is used for the column in the index, by default,
though this can be overridden using a COLLATE clause in the 
[CREATE INDEX] statement.

<h2>Representation Of SQL Tables</h2>

<p> ^Each ordinary SQL table in the database schema is represented on-disk
by a table b-tree.  ^Each entry in the table b-tree corresponds to a row
of the SQL table.  ^The [rowid] of the SQL table is the 64-bit signed
integer key for each entry in the table b-tree.</p>

<p> ^The content of each SQL table row is stored in the database file by
................................................................................
<p> ^If the [affinity] of a column is REAL and that column contains a
value that can be converted to an integer without loss of information
(if the value contains no fractional part and is not too large to be
represented as an integer) then the column may be stored in the record
as an integer.  ^SQLite will convert the value back to floating
point when extracting it from the record.</p>

<h2>Representation of WITHOUT ROWID Tables</h2>

<p>^If an SQL table is created using the "WITHOUT ROWID" clause at the
end of its CREATE TABLE statement, then that table is a [WITHOUT ROWID]
table and uses a different on-disk representation.  ^A WITHOUT ROWID
table uses an index b-tree rather than a table b-tree for storage.
^The key for each entry in the WITHOUT ROWID b-tree is a record composed
of the columns of the PRIMARY KEY followed by all remaining columns of
................................................................................
as the content encoding for an ordinary rowid table, except that the
order of the columns is rearranged so that PRIMARY KEY columns come
first, and the content is used as the key in an index b-tree rather
than as the data in a table b-tree.
^The special encoding rules for columns with REAL affinity
apply to WITHOUT ROWID tables the same as they do with rowid tables.

<h2>Representation Of SQL Indices</h2>

<p>^Each SQL index, whether explicitly declared via a [CREATE INDEX] statement
or implied by a UNIQUE or PRIMARY KEY constraint, corresponds to an 
index b-tree in the database file.
^Each entry in the index b-tree corresponds to a single row in the 
associated SQL table.
^The key to an index b-tree is
................................................................................
table and entries in each index associated with that table.
^However, in a [partial index], the index b-tree only contains entries
corresponding to table rows for which the WHERE clause expression on the
CREATE INDEX statement is true.
^Corresponding rows in the index and table b-trees share the same rowid
or primary key values and contain the same value for all indexed columns.</p>

<h3>Suppression of redundant columns in WITHOUT ROWID secondary indexed
</h3>

<p> ^In an index on a WITHOUT ROWID table, if one or more of the columns
of the table PRIMARY KEY are also columns of the index, then the
indexed column is not repeated in the table-key suffix on the end of
the index record.  ^(As an example, consider the following SQL:

<blockquote><pre>
................................................................................

<p> ^The suppression of redundant columns in the key suffix of an index
entry only occurs in WITHOUT ROWID tables.  ^In an ordinary rowid table,
the index entry always ends with the rowid even if the [INTEGER PRIMARY KEY]
column is one of the columns being indexed.</p>

<tcl>hd_fragment sqlite_master {sqlite_master} {sqlite_master table}</tcl>
<h2>Storage Of The SQL Database Schema</h2>

<p>^Page 1 of a database file is the root page of a table b-tree that
holds a special table named "sqlite_master" (or "sqlite_temp_master" in
the case of a TEMP database) which stores the complete
database schema.  ^(The structure of the sqlite_master table is as
if it had been created using the following SQL:</p>

................................................................................


<p>^(The sqlite_master.type column will be one
of the following text strings:  'table', 'index', 'view', or 'trigger'
according to the type of object defined.  The 'table' string is used
for both ordinary and [virtual tables].)^</p>

<p>^(The sqlite_master.name column will hold the name of the object.)^
^([UNIQUE] and [PRIMARY KEY] constraints on tables cause SQLite to create
[internal indexes] with names of the form "sqlite_autoindex_TABLE_N"
where TABLE is replaced by the name of the table that contains the
constraint and N is an integer beginning with 1 and increasing by one
with each constraint seen in the table definition.)^
^(In a [WITHOUT ROWID] table, there is no sqlite_master entry for the
PRIMARY KEY, but the "sqlite_autoindex_TABLE_N" name is set aside
................................................................................
^(The sqlite_master.sql is NULL for the [internal indexes] that are
automatically created by [UNIQUE] or [PRIMARY KEY] constraints.)^</p>


<tcl>hd_fragment intschema {internal schema objects} \
{internal schema object} {internal index} {internal indexes} \
{internal table} {internal tables}</tcl>
<h3>Internal Schema Objects</h3>

<p>^In addition to the tables, indexes, views, and triggers created by
the application and/or the developer using CREATE statements SQL, the
sqlite_master table may contain zero or more entries for 
<i>internal schema objects</i> that are created by SQLite for its 
own internal use.  ^The names of internal schema objects
always begin with "sqlite_" and any table, index, view, or trigger
................................................................................
       algorithm to use for each query.
</ul>

<p>New internal schema objects names, always beginning with "sqlite_",
may be added to the SQLite file format in future releases.

<tcl>hd_fragment seqtab {sqlite_sequence}</tcl>
<h3>The sqlite_sequence table</h3>

<p>^The sqlite_sequence table is an internal table used to help implement
[AUTOINCREMENT].  ^The sqlite_sequence table is created automatically
whenever any ordinary table with an AUTOINCREMENT integer primary
key is created.  ^Once created, the sqlite_sequence table exists in the
sqlite_master table forever; it cannot be dropped.
^(The schema for the sqlite_sequence table is:
................................................................................
<p>^Application code is allowed to modify the sqlite_sequence table, to add
new rows, to delete rows, or to modify existing rows.  ^However, application
code cannot create the sqlite_sequence table if it does not already exist.
^Application code can delete all entries from the sqlite_sequence table,
but application code cannot drop the sqlite_sequence table.

<tcl>hd_fragment stat1tab {sqlite_stat1} SQLITE_STAT1 </tcl>
<h3>The sqlite_stat1 table</h3>

<p>^The sqlite_stat1 is an internal table created by the [ANALYZE] command
and used to hold supplemental information about tables and indexes that the
query planner can use to help it find better ways of performing queries.
^Applications can update, delete from, insert into or drop the sqlite_stat1
table, but may not create or alter the sqlite_stat1 table.
^(The schema of the sqlite_stat1 table is as follows:
................................................................................
of the stat column are silently ignored.

<p>^(If the sqlite_stat1.idx column is NULL, then the sqlite_stat1.stat
column contains a single integer which is the approximate number of
rows in the table identified by sqlite_stat1.tbl.)^

<tcl>hd_fragment stat2tab {sqlite_stat2}</tcl>
<h3>The sqlite_stat2 table</h3>

<p>The sqlite_stat2 is only created and is only used if SQLite is compiled
with SQLITE_ENABLE_STAT2 and if the SQLite version number is between
3.6.18 and 3.7.8.  The sqlite_stat2 table is neither read nor written by any
version of SQLite before 3.6.18 nor after 3.7.8.
The sqlite_stat2 table contains additional information
about the distribution of keys within an index.
................................................................................
10 uniform buckets and the samples are the middle row from each bucket.

<p>The format for sqlite_stat2 is recorded here for legacy reference.  
Recent versions of SQLite no longer support sqlite_stat2 and the
sqlite_stat2 table, it is exists, is simply ignored.

<tcl>hd_fragment stat3tab {sqlite_stat3} SQLITE_STAT3</tcl>
<h3>The sqlite_stat3 table</h3>

<p>The sqlite_stat3 is only used if SQLite is compiled
with [SQLITE_ENABLE_STAT3] or [SQLITE_ENABLE_STAT4]
and if the SQLite version number is 3.7.9 or greater.
The sqlite_stat3 table is neither read nor written by any
version of SQLite before 3.7.9.
If the [SQLITE_ENABLE_STAT4] compile-time option is used and the
................................................................................
index must appear in the same order that they occur in the index.  
In other words, if the entry with left-most column S1 is earlier in
the index b-tree than the
entry with left-most column S2, then in the sqlite_stat3 table, 
sample S1 must have a smaller rowid than sample S2.)^

<tcl>hd_fragment stat4tab {sqlite_stat4} SQLITE_STAT4</tcl>
<h3>The sqlite_stat4 table</h3>

<p>The sqlite_stat4 is only created and is only used if SQLite is compiled
with [SQLITE_ENABLE_STAT4] and if the SQLite version number is
3.8.1 or greater.  The sqlite_stat4 table is neither read nor written by any
version of SQLite before 3.8.1.
The sqlite_stat4 table contains additional information
about the distribution of keys within an index or the distribution of
................................................................................
<p>^(In a well-formed sqlite_stat4 table, the samples for any single
index must appear in the same order that they occur in the index.  
In other words, if entry S1 is earlier in the index b-tree than 
entry S2, then in the sqlite_stat4 table, sample S1 must have a
smaller rowid than sample S2.)^

<tcl>hd_fragment rollbackjournal {rollback journal format}</tcl>
<h1>The Rollback Journal</h1>

<p>The rollback journal is a file associated with each SQLite database
file that hold information used to restore the database file to its initial
state during the course of a transaction.
^The rollback journal file is always located in the same 
directory as the database
file and has the same name as the database file but with the string
................................................................................
of three ways:
<ol>
<li>^(The rollback journal file can be deleted)^,
<li>^(The rollback journal file can be truncated to zero length)^, or
<li>^(The header of the rollback journal can be overwritten with
invalid header text (for example, all zeros).)^
</ol>
<p>
^These three ways of committing a transaction correspond to the DELETE,
TRUNCATE, and PERSIST settings, respectively, of the [journal_mode pragma].
</p>


<p>A valid rollback journal begins with a header in the following format:</p>

................................................................................
journal must contain the same database page size and sector size.</p>

<p>^If M is -1 in the initial journal header, then the number of page records
that follow is computed by computing how many page records will fit in
the available space of the remainder of the journal file.</p>

<tcl>hd_fragment walformat {WAL format}</tcl>
<h1>The Write-Ahead Log</h1>

<p>Beginning with [version 3.7.0], SQLite supports a new transaction
control mechanism called "[WAL | write-ahead log]" or "[WAL]".
^When a database is in WAL mode, all connections to that database must
use the WAL.  ^A particular database will use either a rollback journal
or a WAL, but not both at the same time.
^The WAL is always located in the same directory as the database
file and has the same name as the database file but with the string
"<tt>-wal</tt>" appended.</p>

<h2>WAL File Format</h2>

<p>A WAL file consists of a header followed by zero or more "frames".
Each frame records the revised content of a single page from the
database file.  All changes to the database are recorded by writing
frames into the WAL.  Transactions commit when a frame is written that
contains a commit marker.  ^A single WAL can and usually does record 
multiple transactions.  Periodically, the content of the WAL is
................................................................................
       exactly match the checksum computed consecutively on the
       first 24 bytes of the WAL header and the first 8 bytes and
       the content of all frames
       up to and including the current frame.</p></li></li>
</ol>)^

<tcl>hd_fragment walcksm {WAL checksum algorithm}</tcl>
<h2>Checksum Algorithm</h2>

<p>The checksum is computed by interpreting the input as
an even number of unsigned 32-bit integers: x(0) through x(N).
^The 32-bit integers are big-endian if the
magic number in the first 4 bytes of the WAL header is 0x377f0683 and
the integers are little-endian if the magic number is 0x377f0682.
^The checksum values are always stored in the frame header in a
................................................................................
</pre></blockquote>)^

<p>^The outputs s0 and s1 are both weighted checksums using Fibonacci weights
in reverse order.  (^The largest Fibonacci weight occurs on the first element
of the sequence being summed.)  ^The s1 value spans all 32-bit integer
terms of the sequence whereas s0 omits the final term.</p>

<h2>Checkpoint Algorithm</h2>

<p>^On a [checkpoint], the WAL is first flushed to persistent storage using
the xSync method of the [sqlite3_io_methods | VFS]. 
^Then valid content of the WAL is transferred into the database file.
^Finally, the database is flushed to persistent storage using another
xSync method call.
The xSync operations serve as write barriers - all writes launched
................................................................................
the WAL file from the beginning.  ^At the start of the first new
write transaction, the WAL header salt-1 value is incremented
and the salt-2 value is randomized.  These changes to the salts invalidate
old frames in the WAL that have already been checkpointed but not yet
overwritten, and prevent them from being checkpointed again.</p>

<tcl>hd_fragment walread {WAL read algorithm}</tcl>
<h2>Reader Algorithm</h2>

<p>^(To read a page from the database (call it page number P), a reader
first checks the WAL to see if it contains page P.  If so, then the
last valid instance of page P that is followed by a commit frame
or is a commit frame itself becomes the value read.)^  ^If the WAL
contains no copies of page P that are valid and which are a commit
frame or are followed by a commit frame, then page P is read from
................................................................................
reader has to scan the entire WAL looking for page P frames.  If the
WAL is large (multiple megabytes is typical) that scan can be slow,
and read performance suffers.  ^To overcome this problem, a separate
data structure called the wal-index is maintained to expedite the
search for frames of a particular page.</p>

<tcl>hd_fragment walindexformat {wal-index} {WAL-index format}</tcl>
<h2>WAL-Index Format</h2>

<p>Conceptually, the wal-index is shared memory, though the current
VFS implementations use a mmapped file for the wal-index.  ^The mmapped
file is in the same directory as the database and has the same name
as the database with a "<tt>-shm</tt>" suffix appended.  Because
the wal-index is shared memory, SQLite does not support 
[PRAGMA journal_mode | journal_mode=WAL] 

Changes to wrap.tcl.

450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466

467
468
469
470
471
472
473
    /* Things for "fancyformat" documents start here. */
    .fancy img+p {font-style:italic}
    .fancy .codeblock i { color: darkblue; }
    .fancy h1,.fancy h2,.fancy h3,.fancy h4 {font-weight:normal;color:#044a64}
    .fancy h2 { margin-left: 10px }
    .fancy h3 { margin-left: 20px }
    .fancy h4 { margin-left: 30px }
    .fancy th {white-space:nowrap;text-align:left;border-bottom:solid 1px #444}
    .fancy th, .fancy td {padding: 0.2em 1ex; vertical-align:top}
    .fancy #toc a        { color: darkblue ; text-decoration: none }
    .fancy .todo         { color: #AA3333 ; font-style : italic }
    .fancy .todo:before  { content: 'TODO:' }
    .fancy p.todo        { border: solid #AA3333 1px; padding: 1ex }
    .fancy img { display:block; }
    .fancy :link:hover, .fancy :visited:hover { background: wheat }
    .fancy p,.fancy ul,.fancy ol,.fancy dl { margin: 1em 5ex }
    .fancy li p { margin: 1em 0 }

    /* End of "fancyformat" specific rules. */

    .yyterm {
      background: #fff;
      border: 1px solid #000;
      border-radius: 11px;
      padding-left: 4px;







|









>







450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
    /* Things for "fancyformat" documents start here. */
    .fancy img+p {font-style:italic}
    .fancy .codeblock i { color: darkblue; }
    .fancy h1,.fancy h2,.fancy h3,.fancy h4 {font-weight:normal;color:#044a64}
    .fancy h2 { margin-left: 10px }
    .fancy h3 { margin-left: 20px }
    .fancy h4 { margin-left: 30px }
    .fancy th {white-space:xnowrap;text-align:left;border-bottom:solid 1px #444}
    .fancy th, .fancy td {padding: 0.2em 1ex; vertical-align:top}
    .fancy #toc a        { color: darkblue ; text-decoration: none }
    .fancy .todo         { color: #AA3333 ; font-style : italic }
    .fancy .todo:before  { content: 'TODO:' }
    .fancy p.todo        { border: solid #AA3333 1px; padding: 1ex }
    .fancy img { display:block; }
    .fancy :link:hover, .fancy :visited:hover { background: wheat }
    .fancy p,.fancy ul,.fancy ol,.fancy dl { margin: 1em 5ex }
    .fancy li p { margin: 1em 0 }
    .fancy blockquote { margin-left : 10ex }
    /* End of "fancyformat" specific rules. */

    .yyterm {
      background: #fff;
      border: 1px solid #000;
      border-radius: 11px;
      padding-left: 4px;