Documentation Source Text

Check-in [df46867c0f]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Additional clarification of the file format.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: df46867c0f90008ce78b593f2dcf6198bb6f2a70
User & Date: drh 2016-06-20 11:12:22.615
Context
2016-06-23
16:59
Enhancements to the WITHOUT ROWID virtual table documentation. (check-in: ad981b591f user: drh tags: trunk)
2016-06-20
11:12
Additional clarification of the file format. (check-in: df46867c0f user: drh tags: trunk)
10:51
Enhance the file format document to make it clear that records might contain fewer values than there are columns in the table schema. (check-in: 149d28ec0e user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/fileformat2.in.
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
contained a single file on disk called the "main database file".</p>

<p>During a transaction, SQLite stores additional information 
in a second file called the "rollback journal", or if SQLite is in
[WAL mode], a write-ahead log file.
If the application or
host computer crashes before the transaction completes, then the rollback
journal or write-ahead log contains critical state information needed 
to restore the main database file to a consistent state.  When a rollback 
journal or write-ahead log contains information necessary for recovering 
the state of the database, they are called a "hot journal" or "hot WAL file".
Hot journals and WAL files are only a factor during error recovery
scenarios and so are uncommon, but they are part of the state of an SQLite
database and so cannot be ignored.  This document defines the format
of a rollback journal and the write-ahead log file, but the focus is







|







12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
contained a single file on disk called the "main database file".</p>

<p>During a transaction, SQLite stores additional information 
in a second file called the "rollback journal", or if SQLite is in
[WAL mode], a write-ahead log file.
If the application or
host computer crashes before the transaction completes, then the rollback
journal or write-ahead log contains information needed 
to restore the main database file to a consistent state.  When a rollback 
journal or write-ahead log contains information necessary for recovering 
the state of the database, they are called a "hot journal" or "hot WAL file".
Hot journals and WAL files are only a factor during error recovery
scenarios and so are uncommon, but they are part of the state of an SQLite
database and so cannot be ignored.  This document defines the format
of a rollback journal and the write-ahead log file, but the focus is
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50

<p>Pages are numbered beginning with 1.  The maximum page number is
2147483646 (2<sup><small>31</small></sup> - 2).  The minimum size
SQLite database is a single 512-byte page.
The maximum size database would be 2147483646 pages at 65536 bytes per
page or 140,737,488,224,256 bytes (about 140 terabytes).  Usually SQLite will
hit the maximum file size limit of the underlying filesystem or disk
hardware size limit long before it hits its own internal size limit.</p>

<p>In common use, SQLite databases tend to range in size from a few kilobytes
to a few gigabytes.</p>

<p>At any point in time, every page in the main database has a single
use which is one of the following:
<ul>







|







36
37
38
39
40
41
42
43
44
45
46
47
48
49
50

<p>Pages are numbered beginning with 1.  The maximum page number is
2147483646 (2<sup><small>31</small></sup> - 2).  The minimum size
SQLite database is a single 512-byte page.
The maximum size database would be 2147483646 pages at 65536 bytes per
page or 140,737,488,224,256 bytes (about 140 terabytes).  Usually SQLite will
hit the maximum file size limit of the underlying filesystem or disk
hardware long before it hits its own internal size limit.</p>

<p>In common use, SQLite databases tend to range in size from a few kilobytes
to a few gigabytes.</p>

<p>At any point in time, every page in the main database has a single
use which is one of the following:
<ul>
394
395
396
397
398
399
400





401
402
403
404
405
406
407
though operating-system specific [VFS] 
implementations may choose to read or write bytes on the lock-byte 
page according to the 
needs and proclivities of the underlying system.  The unix and win32
[VFS] implementations that come built into SQLite do not write to the
lock-byte page, but third-party VFS implementations for
other operating systems might.</p>






<tcl>hd_fragment {freelist} {freelist} {free-page list}</tcl>
<h2>The Freelist</h2>

<p>A database file might contain one or more pages that are not in
active use.  Unused pages can come about, for example, when information
is deleted from the database.  Unused pages are stored on the freelist







>
>
>
>
>







394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
though operating-system specific [VFS] 
implementations may choose to read or write bytes on the lock-byte 
page according to the 
needs and proclivities of the underlying system.  The unix and win32
[VFS] implementations that come built into SQLite do not write to the
lock-byte page, but third-party VFS implementations for
other operating systems might.</p>

<p>The lock-byte page arose from the need to support Win95 which had
only manditory file locking.  All modern operating systems that we know of
support advisory file locking, and so the lock-byte page is not really
needed any more, but is retained for backwards compatibility.</p>

<tcl>hd_fragment {freelist} {freelist} {free-page list}</tcl>
<h2>The Freelist</h2>

<p>A database file might contain one or more pages that are not in
active use.  Unused pages can come about, for example, when information
is deleted from the database.  Unused pages are stored on the freelist
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
<li>^If P&gt;X and K&gt;X then the first M bytes of P are stored on the
    btree page and the remaining P-M bytes are stored on overflow pages.
</ul>

<p>The overflow thresholds are designed to give a minimum fanout of
4 for index b-trees and to make sure enough of the payload
is on the b-tree page that the record header can usually be accessed
without consulting an overflow page.  In hindsight, the designers of
the SQLite b-tree logic realize that these thresholds could have been
made much simpler.  However, the computations cannot be changed
without resulting in an incompatible file format.  And the current computations
work well, even if they are a little complex.</p>

<tcl>hd_fragment ovflpgs {overflow page} {overflow pages}</tcl>
<h2>Cell Payload Overflow Pages</h2>








|
|







799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
<li>^If P&gt;X and K&gt;X then the first M bytes of P are stored on the
    btree page and the remaining P-M bytes are stored on overflow pages.
</ul>

<p>The overflow thresholds are designed to give a minimum fanout of
4 for index b-trees and to make sure enough of the payload
is on the b-tree page that the record header can usually be accessed
without consulting an overflow page.  In hindsight, the designer of
the SQLite b-tree logic realized that these thresholds could have been
made much simpler.  However, the computations cannot be changed
without resulting in an incompatible file format.  And the current computations
work well, even if they are a little complex.</p>

<tcl>hd_fragment ovflpgs {overflow page} {overflow pages}</tcl>
<h2>Cell Payload Overflow Pages</h2>