Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Updates to the file format documentation. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
8f5cfd48d2c2bd135cd0b8e972494ae7 |
User & Date: | drh 2010-10-06 13:47:26.000 |
Context
2010-10-08
| ||
02:37 | Changes to the website for version 3.7.3. (check-in: c800d719f4 user: drh tags: trunk) | |
2010-10-06
| ||
13:47 | Updates to the file format documentation. (check-in: 8f5cfd48d2 user: drh tags: trunk) | |
12:23 | Update the change documentation for the 3.7.3 release. Fixes to the description of the index_info pragma. (check-in: d009514359 user: drh tags: trunk) | |
Changes
Changes to pages/fileformat2.in.
︙ | ︙ | |||
9 10 11 12 13 14 15 | format used by SQLite.</p> <h2>1.0 The Database File</h2> <p>Most of the time the complete state of an SQLite database is contained a single file on disk called the "main database file".</p> | | | > > | | | 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | format used by SQLite.</p> <h2>1.0 The Database File</h2> <p>Most of the time the complete state of an SQLite database is contained a single file on disk called the "main database file".</p> <p>While performing a transaction, the default behavior is to stores some temporary information in a second file called the "rollback journal". (The alternative is to use a [write-ahead log] - described separately.) If the application or host computer crashes before completing the transaction, then the rollback journal contains critical state information needed to restore the content of the main database file. When a rollback journal contains information necessary for recovering the state of the database, we say that it is a "hot journal". Hot journals only come up in an error recovery scenario and so are uncommon, but they are part of the state of an SQLite database and so should not be ignored. This document defines the format of a rollback journal (and the write-ahead log file), but main focus is on the main database file.</p> <h3>1.1 Pages</h3> <p>The main database file consists of one or more pages. ^The size of a page is a power of two between 512 and 65536 inclusive. All pages within the same database are the same size. ^The page size for a database file |
︙ | ︙ | |||
359 360 361 362 363 364 365 | inclusive. A database file that is less than or equal to 1073741824 bytes in size contains no lock-byte page. A database file larger than 1073741824 contains exactly one lock-byte page. </p> <p>The lock-byte page is set aside for use by the operating-system specific VFS implementation in implementing the database file locking primitives. | | | | > | | | 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 | inclusive. A database file that is less than or equal to 1073741824 bytes in size contains no lock-byte page. A database file larger than 1073741824 contains exactly one lock-byte page. </p> <p>The lock-byte page is set aside for use by the operating-system specific VFS implementation in implementing the database file locking primitives. ^SQLite does not use the lock-byte page. ^The SQLite core will never read or write, though operating-system specific VFS implementations may choose to read or write bytes on the lock-byte page according to the needs and proclivities of the underlying system. The unix and win32 VFS implementations that come built into SQLite do not write to the lock-byte page, but third-party VFS implementations for other operating systems might.</p> <tcl>hd_fragment {freelist} {freelist} {free-page list}</tcl> <h3>1.4 The Freelist</h3> <p>A database file might contain one or more pages that are not in active use. Unused pages can come about, for example, when information is deleted from the database. Unused pages are stored on the freelist |
︙ | ︙ | |||
792 793 794 795 796 797 798 | page number is the parent b-tree page. </ol> <p>^In any database file that contains ptrmap pages, all b-tree root pages must come before any non-root b-tree page, cell payload overflow page, or freelist page. This restriction ensures that a root page will never be moved during an auto-vacuum or incremental-vacuum. The auto-vacuum | | | 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 | page number is the parent b-tree page. </ol> <p>^In any database file that contains ptrmap pages, all b-tree root pages must come before any non-root b-tree page, cell payload overflow page, or freelist page. This restriction ensures that a root page will never be moved during an auto-vacuum or incremental-vacuum. The auto-vacuum logic does not know how to update the root_page field of the sqlite_master table and so it is necessary to prevent root pages from being moved during an auto-vacuum in order to preserve the integrity of the sqlite_master table. ^Root pages are moved to the beginning of the database file by the CREATE TABLE, CREATE INDEX, DROP TABLE, and DROP INDEX operations.</p> <h2>2.0 Schema Layer</h2> |
︙ | ︙ | |||
817 818 819 820 821 822 823 | <p>The content of a table b-tree leaf page and the key of any index b-tree page was characterized above as an arbitrary sequence of bytes. The prior discussion mentioned one key being less than another, but did not define what "less than" meant. The current section will address these omissions.</p> | | | | | 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 | <p>The content of a table b-tree leaf page and the key of any index b-tree page was characterized above as an arbitrary sequence of bytes. The prior discussion mentioned one key being less than another, but did not define what "less than" meant. The current section will address these omissions.</p> <p>Payload, either table content or index keys, is always in the "record format". The record format defines a sequence of values corresponding to columns in a table or index. The record format specifies the number of columns, the datatype of each column, and the content of each column.</p> <p>The record format makes extensive use of the [variable-length integer] or [varint] representation of 64-bit signed integers defined above.</p> <tcl>hd_fragment serialtype {serial type} {serial types}</tcl> <p>A record contains a header and a body, in that order. ^(The header begins with a single varint which determines the total number of bytes in the header. The varint value is the size of the header in bytes including the size varint itself.)^ ^Following the size varint are one or more additional varints, one per column. These additional varints are called "serial type" numbers and determine the datatype of each column, according to the following chart:</p> <center> ^(<i>Serial Type Codes Of The Record Format</i><br> <table width="80%" border=1> <tr><th>Serial Type<th>Content Size<th>Meaning <tr><td valign=top align=center>0<td valign=top align=center>0<td align=left> NULL |
︙ | ︙ | |||
885 886 887 888 889 890 891 | ^(Note that for serial types 0, 8, 9, 12, and 13, the value is zero bytes in length. If all columns are of these types then the body section of the record is empty.)^</p> <h3>2.2 Record Sort Order</h3> <p>The order of keys in an index b-tree is determined by the sort order of | | | 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 | ^(Note that for serial types 0, 8, 9, 12, and 13, the value is zero bytes in length. If all columns are of these types then the body section of the record is empty.)^</p> <h3>2.2 Record Sort Order</h3> <p>The order of keys in an index b-tree is determined by the sort order of the records that the keys represent. Record comparison progresses column by column. Columns of a record are examined from left to right. The first pair of columns that are not equal determines the relative order of the two records. The sort order of individual columns is as follows:</p> <ol> <li>^NULL values (serial type 0) sort first |
︙ | ︙ | |||
1050 1051 1052 1053 1054 1055 1056 | PRIMARY KEY constraints, the sqlite_master.sql field is NULL.)^</p> <tcl>hd_fragment rollbackjournal {rollback journal format}</tcl> <h2>3.0 The Rollback Journal</h2> <p>The rollback journal is a file associated with each SQLite database | | | > | 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 | PRIMARY KEY constraints, the sqlite_master.sql field is NULL.)^</p> <tcl>hd_fragment rollbackjournal {rollback journal format}</tcl> <h2>3.0 The Rollback Journal</h2> <p>The rollback journal is a file associated with each SQLite database file that hold information used to restore the database file to its initial state during the course of a transaction. ^The rollback journal file is always located in the same directory as the database file and has the same name as the database file but with the string "<tt>-journal</tt>" appended. There can only be a single rollback journal associated with a give database and hence there can only be one write transaction open against a single database at one time.</p> <p>If a transaction is aborted due to an application crash, an operating system crash, or a hardware power failure or crash, then the database may |
︙ | ︙ | |||
1187 1188 1189 1190 1191 1192 1193 | <h2>4.0 The Write-Ahead Log</h2> <p>Beginning with [version 3.7.0], SQLite supports a new transaction control mechanism called "[WAL | write-ahead log]" or "[WAL]". ^When a database is in WAL mode, all connections to that database must use the WAL. ^A particular database will use either a rollback journal or a WAL, but not both at the same time. | | | 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 | <h2>4.0 The Write-Ahead Log</h2> <p>Beginning with [version 3.7.0], SQLite supports a new transaction control mechanism called "[WAL | write-ahead log]" or "[WAL]". ^When a database is in WAL mode, all connections to that database must use the WAL. ^A particular database will use either a rollback journal or a WAL, but not both at the same time. ^The WAL is always located in the same directory as the database file and has the same name as the database file but with the string "<tt>-wal</tt>" appended.</p> <h3>4.1 WAL File Format</h4> <p>A WAL file consists of a header followed by zero or more "frames". Each frame records the revised content of a single page from the |
︙ | ︙ |