Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Updates to fileformat.html. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
712d82fbd7e044730695f55a289a32c0 |
User & Date: | dan 2009-04-27 11:59:29.000 |
Context
2009-04-27
| ||
18:20 | Further updates to fileformat.html. (check-in: 47c1a4a97e user: dan tags: trunk) | |
11:59 | Updates to fileformat.html. (check-in: 712d82fbd7 user: dan tags: trunk) | |
2009-04-21
| ||
16:14 | Added a new FAQ to explain how the syntax diagrams are created. (check-in: 74c109fae7 user: drh tags: trunk) | |
Changes
Changes to pages/fileformat.in.
︙ | ︙ | |||
184 185 186 187 188 189 190 | journal file and the rules for correctly reading a database image from the combination of a database file and journal file. The same section also includes guidelines and requirements describing the intended method for atomically updating the database image within the file-system without risking database corruption. <p> | | | 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 | journal file and the rules for correctly reading a database image from the combination of a database file and journal file. The same section also includes guidelines and requirements describing the intended method for atomically updating the database image within the file-system without risking database corruption. <p> <b>Section <cite>interoperability_requirements</cite></b> contains a description of and software requirements related to: <ul> <li>The locking protocol used by SQLite to manage read and write access to the database and journal files within the file-system, and <li>the change-counter and schema-cookie protocols that must be followed by all database writers to facilitate the implementation of |
︙ | ︙ | |||
208 209 210 211 212 213 214 | is required to interpret or create SQLite database images may be found in section <cite>database_file_format</cite>. Requirements and guidelines for software required to safely read and write database images to disk using an SQLite compatible system failure recovery mechanism to prevent database corruption or data loss are found in section <cite>file_system_usage</cite>. Finally, rules and requirements for software systems required to read and write live SQLite databases within a system that | | | | 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 | is required to interpret or create SQLite database images may be found in section <cite>database_file_format</cite>. Requirements and guidelines for software required to safely read and write database images to disk using an SQLite compatible system failure recovery mechanism to prevent database corruption or data loss are found in section <cite>file_system_usage</cite>. Finally, rules and requirements for software systems required to read and write live SQLite databases within a system that includes real SQLite clients that may also read and write databases are presented in section <cite>interoperability_requirements</cite>. [h2 "Glossary"] <table id=glossary> <tr><td>Auto-vacuum last root-page<td> A page number stored as 32-bit integer at byte offset 52 of the database file header (see section <cite>file_header</cite>). In an auto-vacuum database, this is the numerically largest |
︙ | ︙ | |||
433 434 435 436 437 438 439 | h1 "SQLite Database Files" sqlite_database_files <p> The bulk of this document, section <cite>database_file_format</cite>, contains the definition of a <i>well-formed SQLite database file</i>. SQLite is required to create database files that meet this definition. | < | < | | 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 | h1 "SQLite Database Files" sqlite_database_files <p> The bulk of this document, section <cite>database_file_format</cite>, contains the definition of a <i>well-formed SQLite database file</i>. SQLite is required to create database files that meet this definition. [fileformat_import_requirement2 H30010] <p> Additionally, the database file should contain a serialized version of the logical database produced by the transaction. For all but the most trivial logical databases, there are many possible serial representations. [fileformat_import_requirement2 H30020] --> <!-- <p> Section <cite>database_file_manipulation</cite> contains requirements describing in more detail the way in which SQLite manipulates the fields and data structures described in section |
︙ | ︙ | |||
760 761 762 763 764 765 766 | Some of the following requirements state that certain database header fields must contain defined constant values, even though the sqlite database file format is designed to allow various values. This is done to artificially constrain the definition of a <i>well-formed database</i> in order to make implementation and testing more practical. | < | < | < | < < | < < | < | < | < | < | < < | < < | < < | < < | < < | < < | < < | | 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 | Some of the following requirements state that certain database header fields must contain defined constant values, even though the sqlite database file format is designed to allow various values. This is done to artificially constrain the definition of a <i>well-formed database</i> in order to make implementation and testing more practical. [fileformat_import_requirement2 H30030] <p> Following the 16 byte magic string in the file header is the <i>page size</i>, a 2-byte field. See section <cite>pages_and_page_types</cite> for details. [fileformat_import_requirement2 H30040] [fileformat_import_requirement2 H30050] [fileformat_import_requirement2 H30060] [fileformat_import_requirement2 H30070] [fileformat_import_requirement2 H30080] [fileformat_import_requirement2 H30090] [fileformat_import_requirement2 H30100] <p> Following the <i>file change counter</i> in the database header are two 4-byte fields related to the database file <i>free page list</i>. See section <cite>free_page_list</cite> for details. [fileformat_import_requirement2 H30110] [fileformat_import_requirement2 H30120] [fileformat_import_requirement2 H30130] [fileformat_import_requirement2 H30140] [fileformat_import_requirement2 H30150] [fileformat_import_requirement2 H30160] [fileformat_import_requirement2 H30170] [fileformat_import_requirement2 H30180] [h3 "Pages and Page Types" "pages_and_page_types"] <p> The entire database file is divided into pages, each page consisting of <i>page-size</i> bytes, where <i>page-size</i> is the 2-byte integer value stored at offset 16 of the file header (see above). The <i>page-size</i> is always a power of two between 512 |
︙ | ︙ | |||
846 847 848 849 850 851 852 | permanently designated "pointer-map" pages. See section <cite>pointer_map_pages</cite> for details. <li><b>The locking page</b>. The database page that starts at byte offset 2<sup>30</sup>, if it is large enough to contain such a page, is always left unused. </ul> | < | < | < | < | | 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 | permanently designated "pointer-map" pages. See section <cite>pointer_map_pages</cite> for details. <li><b>The locking page</b>. The database page that starts at byte offset 2<sup>30</sup>, if it is large enough to contain such a page, is always left unused. </ul> [fileformat_import_requirement2 H30190] [fileformat_import_requirement2 H30200] [fileformat_import_requirement2 H30210] [fileformat_import_requirement2 H30220] [h3 "The Schema Table" schema_table] <p> Apart from being the page that contains the file-header, page 1 of the database file is special because it is the root page of the B-Tree structure that contains the schema table data. From the SQL |
︙ | ︙ | |||
962 963 964 965 966 967 968 | [Tr]<td>index <td>i1 <td>abc <td>3 <td>CREATE INDEX i1 ON abc(b, c) [Tr]<td>table <td>def <td>def <td>4 <td>CREATE TABLE def(a PRIMARY KEY, b, c, UNIQUE(b, c)) [Tr]<td>index <td>sqlite_autoindex_def_1 <td>def <td>5 <td> [Tr]<td>index <td>sqlite_autoindex_def_2 <td>def <td>6 <td> [Tr]<td>view <td>v1 <td>v1 <td>0 <td>CREATE VIEW v1 AS SELECT * FROM abc </table> | < | < | < | < < | < < | < < | < < | < < | < < | < | < < | < | < | < | < | < | < | < | < < | < < | < < | < | < < | < < | < < | < | < < | < | 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 | [Tr]<td>index <td>i1 <td>abc <td>3 <td>CREATE INDEX i1 ON abc(b, c) [Tr]<td>table <td>def <td>def <td>4 <td>CREATE TABLE def(a PRIMARY KEY, b, c, UNIQUE(b, c)) [Tr]<td>index <td>sqlite_autoindex_def_1 <td>def <td>5 <td> [Tr]<td>index <td>sqlite_autoindex_def_2 <td>def <td>6 <td> [Tr]<td>view <td>v1 <td>v1 <td>0 <td>CREATE VIEW v1 AS SELECT * FROM abc </table> [fileformat_import_requirement2 H30230] [fileformat_import_requirement2 H30240] <p>The following requirements describe "table" records. [fileformat_import_requirement2 H30250] [fileformat_import_requirement2 H30260] [fileformat_import_requirement2 H30270] [fileformat_import_requirement2 H30280] [fileformat_import_requirement2 H30290] [fileformat_import_requirement2 H30300] [fileformat_import_requirement2 H30310] <p>The following requirements describe "implicit index" records. [fileformat_import_requirement2 H30320] [fileformat_import_requirement2 H30330] [fileformat_import_requirement2 H30340] [fileformat_import_requirement2 H30350] <p>The following requirements describe "explicit index" records. [fileformat_import_requirement2 H30360] [fileformat_import_requirement2 H30370] [fileformat_import_requirement2 H30380] [fileformat_import_requirement2 H30390] <p>The following requirements describe "view" records. [fileformat_import_requirement2 H30400] [fileformat_import_requirement2 H30410] [fileformat_import_requirement2 H30420] [fileformat_import_requirement2 H30430] <p>The following requirements describe "trigger" records. [fileformat_import_requirement2 H30440] [fileformat_import_requirement2 H30450] [fileformat_import_requirement2 H30460] [fileformat_import_requirement2 H30470] <p>The following requirements describe the placement of B-Tree root pages in auto-vacuum databases. [fileformat_import_requirement2 H30480] [fileformat_import_requirement2 H30490] [h2 "B-Tree Structures" "btree_structures"] <p> A large part of any SQLite database file is given over to one or more B-Tree structures. A single B-Tree structure is stored using one or more database pages. Each page contains a single B-Tree node. |
︙ | ︙ | |||
1074 1075 1076 1077 1078 1079 1080 | B-Tree structures are described in detail in section <cite>table_btrees</cite>. <li>The <b>index B-Tree</b>, which uses database records as keys. Index B-Tree structures are described in detail in section <cite>index_btrees</cite>. </ul> | < | < | | 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 | B-Tree structures are described in detail in section <cite>table_btrees</cite>. <li>The <b>index B-Tree</b>, which uses database records as keys. Index B-Tree structures are described in detail in section <cite>index_btrees</cite>. </ul> [fileformat_import_requirement2 H30500] [fileformat_import_requirement2 H30510] [h3 "Variable Length Integer Format" "varint_format"] <p> In several parts of the B-Tree structure, 64-bit twos-complement signed integer values are stored in the "variable length integer format" described here. <p> |
︙ | ︙ | |||
1138 1139 1140 1141 1142 1143 1144 | [Tr]<td>200815 <td>0x000000000003106F <td>0x8C 0xA0 0x6F [Tr]<td>-1 <td>0xFFFFFFFFFFFFFFFF <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF [Tr]<td>-78056 <td>0xFFFFFFFFFFFECD56 <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFD 0xCD 0x56 </table> | < | < < | < < | < < | | 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 | [Tr]<td>200815 <td>0x000000000003106F <td>0x8C 0xA0 0x6F [Tr]<td>-1 <td>0xFFFFFFFFFFFFFFFF <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF [Tr]<td>-78056 <td>0xFFFFFFFFFFFECD56 <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFD 0xCD 0x56 </table> [fileformat_import_requirement2 H30520] [fileformat_import_requirement2 H30530] [fileformat_import_requirement2 H30540] [fileformat_import_requirement2 H30550] [h3 "Database Record Format" "record_format"] <p> A database record is a blob of data that represents an ordered list of one or more SQL values. Database records are used in two places in SQLite database files - as the associated data for entries |
︙ | ︙ | |||
1240 1241 1242 1243 1244 1245 1246 | the length of the data field is as described in the table above. <p> The data field associated with a string value contains the string encoded using the database encoding, as defined in the database file header (see section <cite>file_header</cite>). No nul-terminator character is stored in the database. | < | < < | < < | < < | < < | < | < | < | < | < | < < | < < | < < | < < | < < | < | < | | 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 | the length of the data field is as described in the table above. <p> The data field associated with a string value contains the string encoded using the database encoding, as defined in the database file header (see section <cite>file_header</cite>). No nul-terminator character is stored in the database. [fileformat_import_requirement2 H30560] [fileformat_import_requirement2 H30570] [fileformat_import_requirement2 H30580] [fileformat_import_requirement2 H30590] [fileformat_import_requirement2 H30600] [fileformat_import_requirement2 H30610] [fileformat_import_requirement2 H30620] [fileformat_import_requirement2 H30630] [fileformat_import_requirement2 H30640] [fileformat_import_requirement2 H30650] [fileformat_import_requirement2 H30660] [fileformat_import_requirement2 H30670] [fileformat_import_requirement2 H30680] [fileformat_import_requirement2 H30690] [fileformat_import_requirement2 H30700] <p> The following database file properties define restrictions on the integer values that may be stored within a <i>database record header</i>. [fileformat_import_requirement2 H30710] [fileformat_import_requirement2 H30720] [h3 "Index B-Trees" index_btrees] <p> As specified in section <cite>fileformat_overview</cite>, index B-Tree structures store a unique set of the database records described in the previous section. While in some cases, when there are very few entries in the B-Tree, the entire structure may fit on a single |
︙ | ︙ | |||
1334 1335 1336 1337 1338 1339 1340 | Figure <cite>figure_indextree</cite> depicts one possible record distribution for an index B-Tree containing records R1 to R26, assuming that for all values of N, <i>R(N+1)>R(N)</i>. In total the B-Tree structure uses 11 database file pages. Internal tree nodes contain database records and references to child node pages. Leaf nodes contain database records only. | < | < < | < < | < < | < | < < | < < | < < | | 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 | Figure <cite>figure_indextree</cite> depicts one possible record distribution for an index B-Tree containing records R1 to R26, assuming that for all values of N, <i>R(N+1)>R(N)</i>. In total the B-Tree structure uses 11 database file pages. Internal tree nodes contain database records and references to child node pages. Leaf nodes contain database records only. [fileformat_import_requirement2 H30730] [fileformat_import_requirement2 H30740] [fileformat_import_requirement2 H30750] [fileformat_import_requirement2 H30760] <p> The precise way in which index B-Tree pages and cells are formatted is described in subsequent sections. [h4 "Index B-Tree Content"] <p> The database file contains one index B-Tree for each database index in the logical database, including those created by UNIQUE or PRIMARY KEY clauses in table declarations. Each record stored in an index B-Tree contains the same number of fields, the number of indexed columns in the database index declaration plus one. <p> An index B-Tree contains an entry for each row in its associated database table. The fields of the record used as the index B-Tree key are copies of each of the indexed columns of the associated database row, in order, followed by the rowid value of the same row. See figure <cite>figure_examplepop</cite> for an example. [fileformat_import_requirement2 H30770] [fileformat_import_requirement2 H30780] [fileformat_import_requirement2 H30790] [fileformat_import_requirement2 H30800] [h4 "Record Sort Order" "index_btree_compare_func"] <p> This section defines the comparison function used when database records are used as B-Tree keys for index B-Trees. The comparison function is only defined when both database records contain the same number of fields. |
︙ | ︙ | |||
1518 1519 1520 1521 1522 1523 1524 | the page) of the next block in the list stored as a big-endian unsigned integer. The first two bytes of the final block in the list are set to zero. The third and fourth bytes of each free block contain the total size of the free block in bytes, stored as a 2 byte big-endian unsigned integer. </ul> | < | < | < | < < | < < | < < | < | < < | < < | < < | < < | < | < < | < < | < < | < < < | | 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 | the page) of the next block in the list stored as a big-endian unsigned integer. The first two bytes of the final block in the list are set to zero. The third and fourth bytes of each free block contain the total size of the free block in bytes, stored as a 2 byte big-endian unsigned integer. </ul> [fileformat_import_requirement2 H30810] [fileformat_import_requirement2 H30820] <p> The following requirements describe the <i>B-Tree page header</i> present at the start of both index and table B-Tree pages. [fileformat_import_requirement2 H30830] [fileformat_import_requirement2 H30840] [fileformat_import_requirement2 H30850] [fileformat_import_requirement2 H30860] <p> This requirement describes the cell content offset array. It applies to both B-Tree variants. [fileformat_import_requirement2 H30870] [fileformat_import_requirement2 H30880] [fileformat_import_requirement2 H30890] [fileformat_import_requirement2 H30900] [fileformat_import_requirement2 H30910] <p> The following requirements govern management of free-space within the page content area (both table and index B-Tree pages). [fileformat_import_requirement2 H30920] [fileformat_import_requirement2 H30930] [fileformat_import_requirement2 H30940] [fileformat_import_requirement2 H30950] [fileformat_import_requirement2 H30960] [h4 "Index B-Tree Cell Format" index_btree_cell_format] <p> For index B-Tree internal tree node pages, each B-Tree cell begins with a child page-number, stored as a 4-byte big-endian unsigned integer. This field is omitted for leaf pages, which have no children. |
︙ | ︙ | |||
1629 1630 1631 1632 1633 1634 1635 | in bytes less the number of unused bytes left at the end of every page (as read from byte offset 20 of the file header), and <i>max-embedded-fraction</i> and <i>min-embedded-fraction</i> are the values read from byte offsets 21 and 22 of the file header, respectively. [Figure indexlongrecord.gif figure_indexlongrecord "Large Record Index B-Tree Cell"] | < | < < | < < | < < | < < | | 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 | in bytes less the number of unused bytes left at the end of every page (as read from byte offset 20 of the file header), and <i>max-embedded-fraction</i> and <i>min-embedded-fraction</i> are the values read from byte offsets 21 and 22 of the file header, respectively. [Figure indexlongrecord.gif figure_indexlongrecord "Large Record Index B-Tree Cell"] [fileformat_import_requirement2 H30970] [fileformat_import_requirement2 H30980] [fileformat_import_requirement2 H30990] [fileformat_import_requirement2 H31000] [fileformat_import_requirement2 H31010] <p> Requirements H31010 and H30990 are similar to the algorithms presented in the text above. However instead of <i>min-embedded-fraction</i> and <i>max-embedded-fraction</i> the requirements use the constant values 32 and 64, as well-formed database files are required by H30080 and H30070 to store these |
︙ | ︙ | |||
1687 1688 1689 1690 1691 1692 1693 | Figure <cite>figure_tabletree</cite> depicts a table B-Tree containing a contiguous set of 14 integer keys starting with 1. Each key <i>n</i> has an associated database record R<i>n</i>. All the keys and their associated records are stored in the leaf pages. The internal node pages contain no database data, their only purpose is to provide a way to navigate the tree structure. | < | < < | < < | < < | | 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 | Figure <cite>figure_tabletree</cite> depicts a table B-Tree containing a contiguous set of 14 integer keys starting with 1. Each key <i>n</i> has an associated database record R<i>n</i>. All the keys and their associated records are stored in the leaf pages. The internal node pages contain no database data, their only purpose is to provide a way to navigate the tree structure. [fileformat_import_requirement2 H31020] [fileformat_import_requirement2 H31030] [fileformat_import_requirement2 H31040] [fileformat_import_requirement2 H31050] <p> The precise way in which table B-Tree pages and cells are formatted is described in subsequent sections. [h4 "Table B-Tree Content" table_btree_content] <p> |
︙ | ︙ | |||
1737 1738 1739 1740 1741 1742 1743 | are SQL NULL. If the schema layer file-format is greater than 2, then the values associated with the "missing" fields are determined by the default value of the associated database table columns. <span class=todo>Reference to CREATE TABLE syntax. How are default values determined?</span> | < | < < | < < | < < | < | < < | < < | < | < | | 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 | are SQL NULL. If the schema layer file-format is greater than 2, then the values associated with the "missing" fields are determined by the default value of the associated database table columns. <span class=todo>Reference to CREATE TABLE syntax. How are default values determined?</span> [fileformat_import_requirement2 H31060] [fileformat_import_requirement2 H31070] [fileformat_import_requirement2 H31080] [fileformat_import_requirement2 H31090] <p>The following database properties discuss table B-Tree records with implicit (default) values. [fileformat_import_requirement2 H31100] [fileformat_import_requirement2 H31110] [fileformat_import_requirement2 H31120] [h4 "Table B-Tree Page Format"] <p> Table B-Tree structures use the same page format as index B-Tree structures, described in section <cite>index_btree_page_format</cite>, with the following differences: <ul> <li>The first byte of the page-header, the "flags" field, is set to 0x05 for internal tree node pages, and 0x0D for leaf pages. <li>The content and format of the B-Tree cells is different. See section <cite>table_btree_cell_format</cite> for details. <li>The format of page 1 is the same as any other table B-Tree, except that 100 bytes less than usual is available for content. The first 100 bytes of page 1 is consumed by the database file header. </ul> [fileformat_import_requirement2 H31130] [fileformat_import_requirement2 H31140] <p> Most of the requirements specified in section <cite>index_btree_page_format</cite> also apply to table B-Tree pages. The wording of the requirements make it clear when this is the case, either by refering to generic "B-Tree pages" or by explicitly stating that the statement applies to both "table and |
︙ | ︙ | |||
1858 1859 1860 1861 1862 1863 1864 | header). <p> The following requirements describe the format of table B-Tree cells, and the distribution thereof between B-Tree and overflow pages. | < | < < | < < | < < | < < | | 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 | header). <p> The following requirements describe the format of table B-Tree cells, and the distribution thereof between B-Tree and overflow pages. [fileformat_import_requirement2 H31150] [fileformat_import_requirement2 H31160] [fileformat_import_requirement2 H31170] [fileformat_import_requirement2 H31180] [fileformat_import_requirement2 H31190] <p> Requirement H31190 is very similar to the algorithm presented in the text above. Instead of <i>min-embedded-fraction</i>, it uses the constant value 32, as well-formed database files are required by H30090 to store this value in the relevant database file header field. |
︙ | ︙ | |||
1918 1919 1920 1921 1922 1923 1924 | <p> Each overflow page except for the last one in the linked list contains <i>available-space</i> bytes of record data. The last page in the list contains the remaining data, starting at byte offset 4. The value of the "next page" field on the last page in an overflow chain is undefined. | < | < < | < < | < < | | 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 | <p> Each overflow page except for the last one in the linked list contains <i>available-space</i> bytes of record data. The last page in the list contains the remaining data, starting at byte offset 4. The value of the "next page" field on the last page in an overflow chain is undefined. [fileformat_import_requirement2 H31200] [fileformat_import_requirement2 H31210] [fileformat_import_requirement2 H31220] [fileformat_import_requirement2 H31230] [h2 "The Free Page List" free_page_list] <p> Sometimes, after deleting data from the database, SQLite removes pages from B-Tree structures. If these pages are not immediately required for some other purpose, they are placed on the free page list. The free page list contains those pages that are not currently being |
︙ | ︙ | |||
1971 1972 1973 1974 1975 1976 1977 | <p> All trunk pages in the free-list except for the first contain the maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number of the first page in the linked list of free-list trunk pages is stored as a 4-byte big-endian unsigned integer at offset 32 of the file header (section <cite>file_header</cite>). | < | < < | < < | < < | < < | < < | < < | < | < < | | 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 | <p> All trunk pages in the free-list except for the first contain the maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number of the first page in the linked list of free-list trunk pages is stored as a 4-byte big-endian unsigned integer at offset 32 of the file header (section <cite>file_header</cite>). [fileformat_import_requirement2 H31240] [fileformat_import_requirement2 H31250] [fileformat_import_requirement2 H31260] [fileformat_import_requirement2 H31270] [fileformat_import_requirement2 H31280] [fileformat_import_requirement2 H31290] [fileformat_import_requirement2 H31300] <p>The following statements govern the two 4-byte big-endian integers associated with the <i>free page list</i> structure in the database file header. [fileformat_import_requirement2 H31310] [fileformat_import_requirement2 H31320] [h2 "Pointer Map Pages" pointer_map_pages] <p> Pointer map pages are only present in auto-vacuum capable databases. A database is an auto-vacuum capable database if the value stored at byte offset 52 of the file-header is non-zero. |
︙ | ︙ | |||
2076 2077 2078 2079 2080 2081 2082 | table entries for the <i>num-entries</i> pages that follow it in the database file: <pre> <i>pointer-map-page-number</i> := 2 + <i>n</i> * <i>num-entries</i> </pre> | < | < < | < < | < < | < < | < | < | < | < | < | | | | < | | | | | < | | 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 | table entries for the <i>num-entries</i> pages that follow it in the database file: <pre> <i>pointer-map-page-number</i> := 2 + <i>n</i> * <i>num-entries</i> </pre> [fileformat_import_requirement2 H31330] [fileformat_import_requirement2 H31340] [fileformat_import_requirement2 H31350] [fileformat_import_requirement2 H31360] [fileformat_import_requirement2 H31370] <p> The following requirements govern the content of pointer-map entries. [fileformat_import_requirement2 H31380] [fileformat_import_requirement2 H31390] [fileformat_import_requirement2 H31400] [fileformat_import_requirement2 H31410] [fileformat_import_requirement2 H31420] [h1 "Database File-System Representation" file_system_usage] <p> The previous section, section <cite>database_file_format</cite> describes the format of an SQLite database image. A database image is the serialized form of a logical SQLite database. Normally, a database image is stored within the file-system in a single file, a database file. In this case no other data is stored within the database file. The first byte of the <i>database file</i> is the first byte of the database image, and the last byte of the database file is the last byte of the <i>database image</i>. For this reason, SQLite is often described as a "single-file database system". However, an SQLite database image is not always stored in a single file within the file-system. It is also possible for it to be distributed between the database file and a journal file. A third file, a <i>master-journal file</i> may also be part of the file-system representation. Although a <i>master-journal file</i> never contains any part of the <i>database image</i>, it can contain meta-data that helps determine which parts of the database image are stored within the database file, and which parts are stored within the journal file. <p> In other words, the file-system representation of an SQLite database consists of the following: <ul> <li> <p>A main <b>database file</b>. The database file is |
︙ | ︙ | |||
2165 2166 2167 2168 2169 2170 2171 | the database image to commit a database transaction. In practice, a database reader only encounters such a configuration if a previous attempt to modify the database image on disk was interrupted by an application, OS or power failure. The most practical approach (and that taken by SQLite) is to extract the subset of the database image currently stored within the journal file and write it into the database file, thus restoring the system to a state where the database file | | | | | < > | 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 | the database image to commit a database transaction. In practice, a database reader only encounters such a configuration if a previous attempt to modify the database image on disk was interrupted by an application, OS or power failure. The most practical approach (and that taken by SQLite) is to extract the subset of the database image currently stored within the journal file and write it into the database file, thus restoring the system to a state where the database file contains the entire database image. Other SQLite documentation, and the comments in the SQLite source code, identify this process as <i>hot journal rollback</i>. Instead of focusing on the <i>hot journal rollback</i> process, this document describes how journal and master-journal files must be interpreted in order to extract the current database image from the file-system representation in the general case. <p> Sub-section <cite>journal_file_formats</cite> describes the formats used by <i>journal</i> and <i>master-journal</i> files. <p> Sub-section <cite>reading_from_files</cite> contains a precise |
︙ | ︙ | |||
2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 | <p> A <i>master-journal file</i> contains the full paths to two or more <i>journal files</i>, each encoded using UTF-8 encoding and terminated by a single nul character (byte value 0x00). There is no padding between the journal paths, each UTF-8 encoded path begins immediately after the nul character that terminates the previous one. [h2 "Reading an SQLite Database" reading_from_files] <p> As described in section <cite>pages_and_page_types</cite> of this document, an SQLite database image is a set of contiguously numbered fixed size pages. The numbering starts at 1, not 0. Page 1 contains the <i>database file header</i> and the root page of the <i>schema table</i>, | > > > > > | | | | | | | | | | | > > > | 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 | <p> A <i>master-journal file</i> contains the full paths to two or more <i>journal files</i>, each encoded using UTF-8 encoding and terminated by a single nul character (byte value 0x00). There is no padding between the journal paths, each UTF-8 encoded path begins immediately after the nul character that terminates the previous one. <p class=todo> Note that the contents of a master-journal is not really all that important, and is not required at all to read the database image. Used for cleanup only. [h2 "Reading an SQLite Database" reading_from_files] <p> As described in section <cite>pages_and_page_types</cite> of this document, an SQLite database image is a set of contiguously numbered fixed size pages. The numbering starts at 1, not 0. Page 1 contains the <i>database file header</i> and the root page of the <i>schema table</i>, and all other pages within the database image are somehow referenced by number, either directly or indirectly, from page 1, either directly or indirectly. In order to be able to read the database image from within the file-system, a database reader needs to be able to ascertain: <ol> <li> The <i>page-size</i> used by the database image, <li> The number of pages in the database image, and <li> The content of each database page. </ol> <p> Usually, the database image is simply the contents of the database file. In this case, reading the database image is straightforward. The page-size used by the database image can be read from the 2-byte big-endian integer field stored at byte offset 16 of the database file (see section <cite>file_header</cite>). The number of pages in the database image can be determined by querying the size of the database file in bytes and then dividing by the <i>page-size</i>. Reading the contents of a <i>database page</i> is a simple matter of reading a block of <i>page-size</i> bytes from an offset calculated from the page-number of the required page: <pre> <i>offset</i> := (<i>page-number</i> - 1) * page-size </pre> <p> However, if there is a valid journal file corresponding to the database file present within the file-system then the situation is more complicated. The file-system is considered to contain a valid journal file if each of the following conditions are met: |
︙ | ︙ | |||
2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 | for each of these is stored in the database file. Even though the contents of the file-system is quite different in representation 2 as in representation 1, the stored database image is the same in each case: 4 pages of page-size bytes each, content A, B, C and D respectively. [Figure filesystem1.gif figure_filesystem1 "Two ways to store the same database image"] <p> These requirements describe the way a database reader must determine whether or not there is a valid journal file within the file-system. [fileformat_import_requirement2 H32000] [fileformat_import_requirement2 H32010] | > > > > > > > | 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 | for each of these is stored in the database file. Even though the contents of the file-system is quite different in representation 2 as in representation 1, the stored database image is the same in each case: 4 pages of page-size bytes each, content A, B, C and D respectively. [Figure filesystem1.gif figure_filesystem1 "Two ways to store the same database image"] <p class=todo> The requirements that follow talk about "well-formed" journal sections, records and master-journal-pointers. There should be some kind of reference back to the definitions of these things. Either in the requirements themselves (refer to other requirements by number) or in the surrounding text (point to document sections). Or, better, both. <p> These requirements describe the way a database reader must determine whether or not there is a valid journal file within the file-system. [fileformat_import_requirement2 H32000] [fileformat_import_requirement2 H32010] |
︙ | ︙ | |||
2627 2628 2629 2630 2631 2632 2633 | [fileformat_import_requirement2 H32080] [h2 "Writing to an SQLite Database" writing_to_files] <p> When an SQLite user commits a transaction that modifies the contents of the database, the database representation on disk must be modified | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 | [fileformat_import_requirement2 H32080] [h2 "Writing to an SQLite Database" writing_to_files] <p> When an SQLite user commits a transaction that modifies the contents of the database, the database representation on disk must be modified to reflect the new contents of the database image. In doing so, SQLite is required to ensure that if an application, operating system or power failure occurs while updating the database file-system representation, the database image stored within the file-system is found to be in a valid state following recovery. If the transaction being committed to the database file replaces database image A with database image B, then all database images equivalent to A or B are considered valid database image states. <p> Two database images are considered to be equivalent if each of the following are true: <ul> <li> <p> The two database images have the same page-size. <li> <p> The two database images have the same number of pages. <li> <p> The content of each page in the first database image that is not a free-list leaf page is identical to the corresponding page in the second database image. </ul> <p> The exception for free-list leaf pages (see section <cite>free_page_list</cite>) in the third bullet point above is made because free-list leaf pages contain no valid data and are never read by SQLite database readers. Since the blob of data stored on such a page is never read for any purpose, two database images may have a different blob stored on a free-list leaf page and still be considered equivalent. This concept can sometimes be exploited to more efficiently update an SQLite database file-system representation. [fileformat_import_requirement2 H32290] <p> The following two requirements constrain the way in which a database file-system representation may be updated. In many ways, they are equivalent to "do not corrupt the database file-system representation". [fileformat_import_requirement2 H32300] [fileformat_import_requirement2 H32310] <p> The following two sections, sections <cite>rollback_journal_method</cite> and <cite>rollforward_journal_method</cite> are somewhat advisory in nature. They contain descriptions of two different methods that could be used to modify a database image within a database file-system representation in accordance with the above requirements. These are not the only methods which could be used. So long as the above requirements (and, if applicable, those in section <cite>interoperability_requirements</cite>) are honoured, any method may be used by an SQLite database writer to update the database file-system representation. SQLite itself uses the "rollback-journal method" described in section <cite>rollback_journal_method</cite>. For this reason, section <cite>rollback_journal_method</cite> contains numbered requirements but section <cite>rollforward_journal_method</cite> does not. [h3 "The Rollback-Journal Method" rollback_journal_method] <p> SQLite is required to do make all modifications associated with the transaction such that the database image is modified atomically. If an application, OS or power failure occurs while SQLite is updating the database, upon recovery the contents of the database must reflect either that all modifications associated with the database transaction were successfully applied, or that none of the modifications were applied and the contents of the database are as they were before the failed attempt to modify the database. <p> Therefore, when modifying the file-system representation of a database image so as to commit a transaction that modifies the database image from state A to state B, it must be ensured that the file-system at all times contains a database image in either state A or state B. And that if an OS or power failure occurs before, during or after any IO operation, following recovery the file-system must contain a database image in either state A or state B. <p class=todo> Should introduce requirements here - that the file-system be modified such that the file is always in state A or state B. <p> Some operations on a file-system may be considered atomic. For example deleting a file, or on some systems writing to a single disk sector. However, in general there exists no atomic file-system operation that may be used to update an SQLite database file with the effects of an arbitrary database transaction, which may remove, modify or add multiple database rows, tables or indexes. Therefore, a two stage approach to writing an SQLite database (or indeed, modifying the logical contents of any on-disk database) is required: <ol> <li> The file-system representation of the database is manipulated to a state where a single atomic operation may be used to transform the logical contents of the database from its initial state to the required final state. <li> The required atomic operation is applied. </ol> <p class=todo> The paragraph below is not quite accurate. Each interim state must correspond to either state A or state B. Not necessarily state A. And the above is not completely general either, for the same reason. <p> Step 1 of the above must be accomplished such that all interim states of the file-system correspond to the logical contents of the database as they were before the procedure began. This way, if an application, OS or power failure occurs during step 1, upon recovery the database contents remain unchanged. It is not possible for such a failure to occur "during" step 2, as step 2 consists of a single atomic operation. |
︙ | ︙ | |||
2678 2679 2680 2681 2682 2683 2684 | operation can be used to effect all required changes to the logical database contents. <p> The following two sub-sections describe the specific ways in which SQLite achieves this for single and multiple database transactions. | | | 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 | operation can be used to effect all required changes to the logical database contents. <p> The following two sub-sections describe the specific ways in which SQLite achieves this for single and multiple database transactions. [h4 "Single Database Transactions" single_db_transactions] <p> In order to atomically modify the database image stored in the file-system from database image A to database image B, the file-system must first be manipulated to a state where it contains the database image A, but can by a single atomic operation be modified to contain database image B. A file-system state that has the following properties satisfies |
︙ | ︙ | |||
2732 2733 2734 2735 2736 2737 2738 | [Figure filesystem2.gif figure_filesystem2 "Interim file-system state used to atomically overwrite database image ABCD with AEC"] <p class=todo> The exception for free-list leaves. | | > > > > > > > | > > > > > > > > > > > > > > > > > > > | > > | | < < | 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 | [Figure filesystem2.gif figure_filesystem2 "Interim file-system state used to atomically overwrite database image ABCD with AEC"] <p class=todo> The exception for free-list leaves. [h4 "Multiple Database Transactions" multi_db_transactions] <p class=todo> Deleting the master-journal is used as the atomic operation. [h3 "The Rollforward-Journal Method" rollforward_journal_method] <p class=todo> Describe the how the journal file can also be used as a roll forward journal or "transaction log". This section does not contain requirements. [h1 "SQLite Interoperabilty Requirements" interoperability_requirements] [h2 "SQLite Locking Protocol" locking_protocol] <p> Basic rules: [fileformat_import_requirement2 H33000] [fileformat_import_requirement2 H33020] <p> Special requirement for RESERVED locks: [fileformat_import_requirement2 H33010] [fileformat_import_requirement2 H33030] [h2 "SQLite Database Header Cookie Protocol" database_header_cookies_protocol] <p class=todo> The following need to take into account (a) integer overflow and (b) exclusive-locking mode. [fileformat_import_requirement2 H33040] [fileformat_import_requirement2 H33050] [h1 References] <table id="refs" style="width:auto; margin: 1em 5ex"> <tr><td style="width:5ex" id="ref_comer_btree">\[1\]<td> Douglas Comer, <u>Ubiquitous B-Tree</u>, ACM Computing Surveys (CSUR), v.11 n.2, pages 121-137, June 1979. |
︙ | ︙ |
Changes to req/hlr30000.txt.
︙ | ︙ | |||
978 979 980 981 982 983 984 | as an 8-bit unsigned integer, starting at offset 4 of the buffer and continuing until offset (buffer-size - 16) (the 17th last byte of the buffer). HLR H32210 A buffer shall be considered to contain a well-formed journal section | | | | 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 | as an 8-bit unsigned integer, starting at offset 4 of the buffer and continuing until offset (buffer-size - 16) (the 17th last byte of the buffer). HLR H32210 A buffer shall be considered to contain a well-formed journal section if it is not excluded from this category by requirements H32220, H32230 or H32240. HLR H32220 A buffer shall only be considered to contain a well-formed journal section if the first 28 bytes of it contain a well-formed journal header. HLR H32230 A buffer shall only be considered to contain a well-formed journal section |
︙ | ︙ | |||
1016 1017 1018 1019 1020 1021 1022 | A journal record shall only be considered a valid journal record if the journal section to which it belongs begins with a well-formed journal header. HLR H32280 A journal record shall only be considered a valid journal record if all journal sections that occur before the journal section containing the journal record are well-formed journal sections. | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 | A journal record shall only be considered a valid journal record if the journal section to which it belongs begins with a well-formed journal header. HLR H32280 A journal record shall only be considered a valid journal record if all journal sections that occur before the journal section containing the journal record are well-formed journal sections. HLR H32290 Two database images shall be considered to be equivalent if they (a) have the same page size, (b) contain the same number of pages and (c) the content of each page of the first database image that is not a free-list leaf page is the same as the content of the corresponding page in the second database image. HLR H32300 When writing to an SQLite database file-system representation in order to replace database image A with database image B, the file-system representation shall at all times contain a database image equivalent to either A or B. HLR H32310 If, while writing to an SQLite database file-system representation in order to replace database image A with database image B, an operating system or power failure should occur, then following recovery the database file-system representation shall contain a database image equivalent to either A or B. HLR H33000 Before reading from a database file or journal file, a database reader shall establish a SHARED or greater lock on the database file. HLR H33010 Before writing to a journal file, a database writer shall establish a RESERVED or greater lock on the database file. HLR H33020 Before writing to a database file, a database writer shall establish an EXCLUSIVE lock on the database file. HLR H33030 Before establishing a RESERVED or PENDING lock on a database file, a database writer shall ensure that the database file contains a valid database image. HLR H33040 When updating a database image stored within a file-system, a database writer shall ensure that the database header change-counter field in the updated database image is larger than the same value in the original database image. HLR H33050 When updating a database image stored within a file-system such that the contents of the schema table is changed, a database writer shall ensure that the database header schema-cookie field in the updated database image is larger than the same value in the original database image. |