Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Change the requirement numbers used in fileformat.in and fileio.in to start with "H2". |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
c1bf78dd4f8a8b475a81a9f55b28140e |
User & Date: | dan 2008-07-22 17:40:38.000 |
Context
2008-07-22
| ||
18:46 | Updates to support the requirements derivation matrix. (check-in: 273baf47db user: drh tags: trunk) | |
17:40 | Change the requirement numbers used in fileformat.in and fileio.in to start with "H2". (check-in: c1bf78dd4f user: dan tags: trunk) | |
17:24 | Move fileio.html and fileformat.html from the other repository to this one. (check-in: 861079d8dc user: dan tags: trunk) | |
Changes
Changes to pages/fileformat.in.
︙ | ︙ | |||
267 268 269 270 271 272 273 | <h1 id=sqlite_database_files>SQLite Database Files</h1> <p> The bulk of this document, section <cite>database_file_format</cite>, contains the definition of a <i>well-formed SQLite database file</i>. SQLite is required to create database files that meet this definition. | | | | 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 | <h1 id=sqlite_database_files>SQLite Database Files</h1> <p> The bulk of this document, section <cite>database_file_format</cite>, contains the definition of a <i>well-formed SQLite database file</i>. SQLite is required to create database files that meet this definition. <p class=req id=H20001> The system shall ensure that at the successful conclusion of a database transaction the contents of the database file constitute a <i>well-formed SQLite database file</i>. <p> Additionally, the database file should contain a serialized version of the logical database produced by the transaction. For all but the most trivial logical databases, there are many possible serial representations. <p class=req id=H20002> The system shall ensure that at the successful conclusion of a database transaction the contents of the database file are a valid serialization of the contents of the logical SQL database produced by the transaction. <p> Section <cite>database_file_manipulation</cite> contains requirements |
︙ | ︙ | |||
596 597 598 599 600 601 602 | Some of the following requirements state that certain database header fields must contain defined constant values, even though the sqlite database file format is designed to allow various values. This is done to artificially constrain the definition of a <i>well-formed database</i> in order to make implementation and testing more practical. | | | | | | | | | | | | | | | | | | 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 | Some of the following requirements state that certain database header fields must contain defined constant values, even though the sqlite database file format is designed to allow various values. This is done to artificially constrain the definition of a <i>well-formed database</i> in order to make implementation and testing more practical. <p class=req id=H20003> The first 16 bytes of a well-formed database file contain the UTF-8 encoding of the string "SQLite format 3" followed by a single nul-terminator byte. <p> Following the 16 byte magic string in the file header is the <i>page size</i>, a 2-byte field. See section <cite>pages_and_page_types</cite> for details. <p class=req id=H20004> The 19th byte (byte offset 18), the <i>file-format write version</i>, of a well-formed database file contains the value 0x01. <p class=req id=H20005> The 20th byte (byte offset 19), the <i>file-format read version</i>, of a well-formed database file contains the value 0x01. <p class=req id=H20006> The 21st byte (byte offset 20), the number of unused bytes on each page, of a well-formed database file shall contain the value 0x00. <p class=req id=H20007> The 22nd byte (byte offset 21), the maximum fraction of an index B-Tree page to use for embedded content, of a well-formed database file shall contain the value 0x40. <p class=req id=H20008> The 23rd byte (byte offset 22), the minimum fraction of an index B-Tree page to use for embedded content when using overflow pages, of a well-formed database file contains the value 0x20. <p class=req id=H20009> The 24th byte (byte offset 23), the minimum fraction of a table B-Tree page to use for embedded content when using overflow pages, of a well-formed database file contains the value 0x20. <p class=req id=H20010> The 4 byte block starting at byte offset 24 of a well-formed database file contains the <i>file change counter</i> formatted as a 4-byte big-endian integer. <p> Following the <i>file change counter</i> in the database header are two 4-byte fields related to the database file <i>free page list</i>. See section <cite>free_page_list</cite> for details. <p class=req id=H20011> The 4 byte block starting at byte offset 40 of a well-formed database file contains the <i>schema version</i> formatted as a 4-byte big-endian integer. <p class=req id=H20012> The 4 byte block starting at byte offset 44 of a well-formed database file, the <i>schema layer file format</i>, contains a big-endian integer value between 1 and 4, inclusive. <p class=req id=H20013> The 4 byte block starting at byte offset 48 of a well-formed database file contains the <i>default pager cache size</i> formatted as a 4-byte big-endian integer. <p class=req id=H20014> The 4 byte block starting at byte offset 52 of a well-formed database file contains the <i>auto-vacuum last root-page</i> formatted as a 4-byte big-endian integer. If this value is non-zero, the database is said to be an <i>auto-vacuum database</i>. <p class=req id=H20015> The 4 byte block starting at byte offset 56 of a well-formed database file, the <i>text encoding</i> contains a big-endian integer value between 1 and 3, inclusive. <p class=req id=H20016> The 4 byte block starting at byte offset 60 of a well-formed database file contains the <i>user cookie</i> formatted as a 4-byte big-endian integer. <p class=req id=H20017> The 4 byte block starting at byte offset 64 of a well-formed database file, the <i>incremental vaccum flag</i> contains a big-endian integer value between 0 and 1, inclusive. <p class=req id=H20018> In a well-formed non-autovacuum database (one with a zero stored in the 4-byte big-endian integer value beginning at byte offset 52 of the database file header, the incremental vacuum flag is set to 0. <h3 id="pages_and_page_types">Pages and Page Types</h3> <p> |
︙ | ︙ | |||
713 714 715 716 717 718 719 | permanently designated "pointer-map" pages. See section <cite>pointer_map_pages</cite> for details. <li><b>The locking page</b>. The database page that starts at byte offset 2<sup>30</sup>, if it is large enough to contain such a page, is always left unused. </ul> | | | | | | 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 | permanently designated "pointer-map" pages. See section <cite>pointer_map_pages</cite> for details. <li><b>The locking page</b>. The database page that starts at byte offset 2<sup>30</sup>, if it is large enough to contain such a page, is always left unused. </ul> <p class=req id=H20019> The <i>database page size</i> of a well-formed database, stored as a 2-byte big-endian unsigned integer at byte offset 16 of the file, shall be an integer power of 2 between 512 and 32768, inclusive. <p class=req id=H20020> The size of a <i>well formed database file</i> shall be an integer multiple of the <i>database page size</i>. <p class=req id=H20021> Each page of a <i>well formed database file</i> is exactly one of a <i>B-Tree page</i>, an <i>overflow page</i>, a <i>free page</i>, a <i>pointer-map page</i> or the <i>locking page</i>. <p class=req id=H20022> The database page that starts at byte offset 2<sup>30</sup>, the <i>locking page</i>, is never used for any purpose. <h3 id=schema_table>The Schema Table</h3> <p> Apart from being the page that contains the file-header, page 1 of |
︙ | ︙ | |||
835 836 837 838 839 840 841 | <tr><td>index <td>i1 <td>abc <td>3 <td>CREATE INDEX i1 ON abc(b, c) <tr><td>table <td>def <td>def <td>4 <td>CREATE TABLE def(a PRIMARY KEY, b, c, UNIQUE(b, c)) <tr><td>index <td>sqlite_autoindex_def_1 <td>def <td>5 <td> <tr><td>index <td>sqlite_autoindex_def_2 <td>def <td>6 <td> <tr><td>view <td>v1 <td>v1 <td>0 <td>CREATE VIEW v1 AS SELECT * FROM abc </table> | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 | <tr><td>index <td>i1 <td>abc <td>3 <td>CREATE INDEX i1 ON abc(b, c) <tr><td>table <td>def <td>def <td>4 <td>CREATE TABLE def(a PRIMARY KEY, b, c, UNIQUE(b, c)) <tr><td>index <td>sqlite_autoindex_def_1 <td>def <td>5 <td> <tr><td>index <td>sqlite_autoindex_def_2 <td>def <td>6 <td> <tr><td>view <td>v1 <td>v1 <td>0 <td>CREATE VIEW v1 AS SELECT * FROM abc </table> <p class=req id=H20023> In a <i>well-formed database file</i>, the portion of the first database page not consumed by the database file-header (all but the first 100 bytes) contains the root node of a table B-Tree, the <i>schema table</i>. <p class=req id=H20024> All records stored in the <i>schema table</i> contain exactly five fields. <p>The following requirements describe "table" records. <p class=req id=H20025> For each SQL table in the database apart from itself ("sqlite_master"), the <i>schema table</i> of a <i>well-formed database file</i> contains an associated record. <p class=req id=H20026> The first field of each <i>schema table</i> record associated with an SQL table shall be the text value "table". <p class=req id=H20027> The second field of each <i>schema table</i> record associated with an SQL table shall be a text value set to the name of the SQL table. <p class=req id=H20028> In a <i>well-formed database file</i>, the third field of all <i>schema table</i> records associated with SQL tables shall contain the same value as the second field. <p class=req id=H20029> In a <i>well-formed database file</i>, the fourth field of all <i>schema table</i> records associated with SQL tables that are not virtual tables contains the page number (an integer value) of the root page of the associated <i>table B-Tree</i> structure within the database file. <p class=req id=H20030> If the associated database table is a virtual table, the fourth field of the <i>schema table</i> record shall contain an SQL NULL value. <p class=req id=H20031> In a well-formed database, the fifth field of all <i>schema table</i> records associated with SQL tables shall contain a "CREATE TABLE" or "CREATE VIRTUAL TABLE" statment (a text value). The details of the statement shall be such that executing the statement would create a table of precisely the same name and schema as the existing database table. <p>The following requirements describe "implicit index" records. <p class=req id=H20032> For each PRIMARY KEY or UNIQUE constraint present in the definition of each SQL table in the database, the schema table of a well-formed database shall contain a record with the first field set to the text value "index", and the second field set to a text value containing a string of the form "sqlite_autoindex_<name>_<idx>", where <name> is the name of the SQL table and <idx> is an integer value. <p class=req id=H20033> In a well-formed database, the third field of all schema table records associated with SQL PRIMARY KEY or UNIQUE constraints shall contain the name of the table to which the constraint applies (a text value). <p class=req id=H20034> In a well-formed database, the fourth field of all schema table records associated with SQL PRIMARY KEY or UNIQUE constraints shall contain the page number (an integer value) of the root page of the associated index B-Tree structure. <p class=req id=H20035> In a well-formed database, the fifth field of all schema table records associated with SQL PRIMARY KEY or UNIQUE constraints shall contain an SQL NULL value. <p>The following requirements describe "explicit index" records. <p class=req id=H20036> For each SQL index in the database, the schema table of a well-formed database shall contain a record with the first field set to the text value "index" and the second field set to a text value containing the name of the SQL index. <p class=req id=H20037> In a well-formed database, the third field of all schema table records associated with SQL indexes shall contain the name of the SQL table that the index applies to. <p class=req id=H20038> In a well-formed database, the fourth field of all schema table records associated with SQL indexes shall contain the page number (an integer value) of the root page of the associated index B-Tree structure. <p class=req id=H20039> In a well-formed database, the fifth field of all schema table records associated with SQL indexes shall contain an SQL "CREATE INDEX" statement (a text value). The details of the statement shall be such that executing the statement would create an index of precisely the same name and content as the existing database index. <p>The following requirements describe "view" records. <p class=req id=H20040> For each SQL view in the database, the schema table of a well-formed database shall contain a record with the first field set to the text value "view" and the second field set to a text value containing the name of the SQL view. <p class=req id=H20041> In a well-formed database, the third field of all schema table records associated with SQL views shall contain the same value as the second field. <p class=req id=H20042> In a well-formed database, the third field of all schema table records associated with SQL views shall contain the integer value 0. <p class=req id=H20043> In a well-formed database, the fifth field of all schema table records associated with SQL indexes shall contain an SQL "CREATE VIEW" statement (a text value). The details of the statement shall be such that executing the statement would create a view of precisely the same name and definition as the existing database view. <p>The following requirements describe "trigger" records. <p class=req id=H20044> For each SQL trigger in the database, the schema table of a well-formed database shall contain a record with the first field set to the text value "trigger" and the second field set to a text value containing the name of the SQL trigger. <p class=req id=H20045> In a well-formed database, the third field of all schema table records associated with SQL triggers shall contain the name of the database table or view to which the trigger applies. <p class=req id=H20046> In a well-formed database, the third field of all schema table records associated with SQL triggers shall contain the integer value 0. <p class=req id=H20047> In a well-formed database, the fifth field of all schema table records associated with SQL indexes shall contain an SQL "CREATE TRIGGER" statement (a text value). The details of the statement shall be such that executing the statement would create a trigger of precisely the same name and definition as the existing database trigger. <p>The following requirements describe the placement of B-Tree root pages in auto-vacuum databases. <p class=req id=H20048> In an auto-vacuum database, all pages that occur before the page number stored in the <i>auto-vacuum last root-page</i> field of the database file header (see H20014) must be either B-Tree <i>root pages</i>, <i>pointer-map pages</i> or the <i>locking page</i>. <p class=req id=H20049> In an auto-vacuum database, no B-Tree <i>root pages</i> may occur on or after the page number stored in the <i>auto-vacuum last root-page</i> field of the database file header (see H20014) must be either B-Tree <i>root pages</i>, <i>pointer-map pages</i> or the <i>locking page</i>. <h2 id="btree_structures">B-Tree Structures</h2> <p> A large part of any SQLite database file is given over to one or more |
︙ | ︙ | |||
1020 1021 1022 1023 1024 1025 1026 | B-Tree structures are described in detail in section <cite>table_btrees</cite>. <li>The <b>index B-Tree</b>, which uses database records as keys. Index B-Tree structures are described in detail in section <cite>index_btrees</cite>. </ul> | | | | 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 | B-Tree structures are described in detail in section <cite>table_btrees</cite>. <li>The <b>index B-Tree</b>, which uses database records as keys. Index B-Tree structures are described in detail in section <cite>index_btrees</cite>. </ul> <p class=req id=H20050> As well as the <i>schema table</i>, a <i>well-formed database file</i> contains <i>N</i> table B-Tree structures, where <i>N</i> is the number of non-virtual tables in the logical database, excluding the sqlite_master table but including sqlite_sequence and other system tables. <p class=req id=H20051> A well-formed database file contains <i>N</i> table B-Tree structures, where <i>N</i> is the number of indexes in the logical database, including indexes created by UNIQUE or PRIMARY KEY clauses in the declaration of SQL tables. <h3 id="varint_format">Variable Length Integer Format</h3> <p> |
︙ | ︙ | |||
1091 1092 1093 1094 1095 1096 1097 | <tr><td>200815 <td>0x000000000003106F <td>0x8C 0xA0 0x6F <tr><td>-1 <td>0xFFFFFFFFFFFFFFFF <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF <tr><td>-78056 <td>0xFFFFFFFFFFFECD56 <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFD 0xCD 0x56 </table> | | | | | | 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 | <tr><td>200815 <td>0x000000000003106F <td>0x8C 0xA0 0x6F <tr><td>-1 <td>0xFFFFFFFFFFFFFFFF <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF <tr><td>-78056 <td>0xFFFFFFFFFFFECD56 <td>0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFD 0xCD 0x56 </table> <p class=req id=H20052> A 64-bit signed integer value stored in <i>variable length integer</i> format consumes from 1 to 9 bytes of space. <p class=req id=H20053> The most significant bit of all bytes except the last in a serialized <i>variable length integer</i> is always set. Unless the serialized form consumes the maximum 9 bytes available, then the most significant bit of the final byte of the representation is always cleared. <p class=req id=H20054> The eight least significant bytes of the 64-bit twos-compliment representation of a value stored in a 9 byte <i>variable length integer</i> are stored in the final byte (byte offset 8) of the serialized <i>variable length integer</i>. The other 56 bits are stored in the 7 least significant bits of each of the first 8 bytes of the serialized <i>variable length integer</i>, in order from most significant to least significant. <p class=req id=H20055> A <i>variable length integer</i> that consumes less than 9 bytes of space contains a value represented as an <i>N</i>-bit unsigned integer, where <i>N</i> is equal to the number of bytes consumed by the serial representation (between 1 and 8) multiplied by 7. The <i>N</i> bits are stored in the 7 least significant bits of each byte of the serial representation, from most to least significant. |
︙ | ︙ | |||
1209 1210 1211 1212 1213 1214 1215 | the length of the data field is as described in the table above. <p> The data field associated with a string value contains the string encoded using the database encoding, as defined in the database file header (see section <cite>file_header</cite>). No nul-terminator character is stored in the database. | | | | | | | | | | | | | | | | | | | 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 | the length of the data field is as described in the table above. <p> The data field associated with a string value contains the string encoded using the database encoding, as defined in the database file header (see section <cite>file_header</cite>). No nul-terminator character is stored in the database. <p class=req id=H20056> A <i>database record</i> consists of a <i>database record header</i>, followed by <i>database record data</i>. The first part of the <i>database record header</i> is a <i>variable length integer</i> containing the total size (including itself) of the header in bytes. <p class=req id=H20057> Following the length field, the remainder of the <i>database record header</i> is populated with <i>N</i> <i>variable length integer</i> fields, where <i>N</i> is the number of database values stored in the record. <p class=req id=H20058> Following the <i>database record header</i>, the <i>database record data</i> is made up of <i>N</i> variable length blobs of data, where <i>N</i> is again the number of database values stored in the record. The <i>n</i> blob contains the data for the <i>n</i>th value in the database record. The size and format of each blob of data is encoded in the corresponding <i>variable length integer</i> field in the <i>database record header</i>. <p class=req id=H20059> A value of 0 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL NULL. In this case the blob of data in the data area is 0 bytes in size. <p class=req id=H20060> A value of 1 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer. In this case the blob of data contains the integer value, formatted as a 1-byte big-endian signed integer. <p class=req id=H20061> A value of 2 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer. In this case the blob of data contains the integer value, formatted as a 2-byte big-endian signed integer. <p class=req id=H20062> A value of 3 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer. In this case the blob of data contains the integer value, formatted as a 3-byte big-endian signed integer. <p class=req id=H20063> A value of 4 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer. In this case the blob of data contains the integer value, formatted as a 4-byte big-endian signed integer. <p class=req id=H20064> A value of 5 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer. In this case the blob of data contains the integer value, formatted as a 6-byte big-endian signed integer. <p class=req id=H20065> A value of 6 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer. In this case the blob of data contains the integer value, formatted as a 8-byte big-endian signed integer. <p class=req id=H20066> A value of 7 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL real (floating point number). In this case the blob of data contains an 8-byte IEEE floating point number, stored in big-endian byte order. <p class=req id=H20067> A value of 8 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer, value 0. In this case the blob of data in the data area is 0 bytes in size. <p class=req id=H20068> A value of 9 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL integer, value 1. In this case the blob of data in the data area is 0 bytes in size. <p class=req id=H20069> An even value greater than or equal to 12 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL blob field. The blob of data contains the value data. The blob of data is exactly (<i>n</i>-12)/2 bytes in size, where <i>n</i> is the integer value stored in the <i>database record header</i>. <p class=req id=H20070> An odd value greater than or equal to 13 stored within the <i>database record header</i> indicates that the corresponding database value is an SQL text field. The blob of data contains the value text stored using the <i>database encoding</i>, with no nul-terminator. The blob of data is exactly (<i>n</i>-12)/2 bytes in size, where <i>n</i> is the integer value stored in the <i>database record header</i>. <p> The following database file properties define restrictions on the integer values that may be stored within a <i>database record header</i>. <p class=req id=H20071> In a well-formed database file, if the values 8 or 9 appear within any <i>database record header</i> within the database, then the <i>schema-layer file format</i> (stored at byte offset 44 of the database file header) must be set to 4. <p class=req id=H20072> In a well-formed database file, the values 10 and 11, and all negative values may not appear within any <i>database record header</i> in the database. <h3 id=index_btrees>Index B-Trees</h3> <p> As specified in section <cite>fileformat_overview</cite>, index |
︙ | ︙ | |||
1358 1359 1360 1361 1362 1363 1364 | Figure <cite>figure_indextree</cite> depicts one possible record distribution for an index B-Tree containing records R1 to R26, assuming that for all values of N, <i>R(N+1)>R(N)</i>. In total the B-Tree structure uses 11 database file pages. Internal tree nodes contain database records and references to child node pages. Leaf nodes contain database records only. | | | | | | 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 | Figure <cite>figure_indextree</cite> depicts one possible record distribution for an index B-Tree containing records R1 to R26, assuming that for all values of N, <i>R(N+1)>R(N)</i>. In total the B-Tree structure uses 11 database file pages. Internal tree nodes contain database records and references to child node pages. Leaf nodes contain database records only. <p class=req id=H20073> The pages in an index B-Tree structures are arranged into a tree structure such that all leaf pages are at the same depth. <p class=req id=H20074> Each leaf node page in an index B-Tree contains one or more B-Tree cells, where each cell contains a database record. <p class=req id=H20075> Each internal node page in an index B-Tree contains one or more B-Tree cells, where each cell contains a child page number, <i>C</i>, and a database record <i>R</i>. All database records stored within the sub-tree headed by page <i>C</i> are smaller than record <i>R</i>, according to the index sort order (see below). Additionally, unless <i>R</i> is the smallest database record stored on the internal node page, all integer keys within the sub-tree headed by <i>C</i> are greater than <i>R<sub>-1</sub></i>, where <i>R<sub>-1</sub></i> is the largest database record on the internal node page that is smaller than <i>R</i>. <p class=req id=H20076> As well as child page numbers associated with B-Tree cells, each internal node page in an index B-Tree contains the page number of an extra child page, the <i>right-child page</i>. All database records stored in all B-Tree cells within the sub-tree headed by the <i>right-child page</i> are greater than all database records stored within B-Tree cells on the internal node page. |
︙ | ︙ | |||
1405 1406 1407 1408 1409 1410 1411 | <p> An index B-Tree contains an entry for each row in its associated database table. The fields of the record used as the index B-Tree key are copies of each of the indexed columns of the associated database row, in order, followed by the rowid value of the same row. See figure <cite>figure_examplepop</cite> for an example. | | | | | | 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 | <p> An index B-Tree contains an entry for each row in its associated database table. The fields of the record used as the index B-Tree key are copies of each of the indexed columns of the associated database row, in order, followed by the rowid value of the same row. See figure <cite>figure_examplepop</cite> for an example. <p class=req id=H20077> In a well-formed database, each index B-Tree contains a single entry for each row in the indexed logical database table. <p class=req id=H20078> Each <i>database record</i> (key) stored by an index B-Tree in a well-formed database contains the same number of values, the number of indexed columns plus one. <p class=req id=H20079> The final value in each <i>database record</i> (key) stored by an index B-Tree in a well-formed database contains the rowid (an integer value) of the corresponding logical database row. <p class=req id=H20080> The first <i>N</i> values in each <i>database record</i> (key) stored in an index B-Tree where <i>N</i> is the number of indexed columns, contain the values of the indexed columns from the corresponding logical database row, in the order specified for the index. <h4 id="index_btree_compare_func">Record Sort Order</h4> |
︙ | ︙ | |||
1569 1570 1571 1572 1573 1574 1575 | the page) of the next block in the list stored as a big-endian unsigned integer. The first two bytes of the final block in the list are set to zero. The third and fourth bytes of each free block contain the total size of the free block in bytes, stored as a 2 byte big-endian unsigned integer. </ul> | | | | | | | | | | | | | | | | | | | | | 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 | the page) of the next block in the list stored as a big-endian unsigned integer. The first two bytes of the final block in the list are set to zero. The third and fourth bytes of each free block contain the total size of the free block in bytes, stored as a 2 byte big-endian unsigned integer. </ul> <p class=req id=H20081> The <i>b-tree page flags</i> field (the first byte) of each database page used as an internal node of an index B-Tree structure is set to 0x02. <p class=req id=H20082> The <i>b-tree page flags</i> field (the first byte) of each database page used as a leaf node of an index B-Tree structure is set to 0x0A. <p> The following requirements describe the <i>B-Tree page header</i> present at the start of both index and table B-Tree pages. <p class=req id=H20083> The first byte of each database page used as a B-Tree page contains the <i>b-tree page flags</i> field. On page 1, the <i>b-tree page flags</i> field is stored directly after the 100 byte file header at byte offset 100. <p class=req id=H20084> The number of B-Tree cells stored on a B-Tree page is stored as a 2-byte big-endian integer starting at byte offset 3 of the B-Tree page. On page 1, this field is stored at byte offset 103. <p class=req id=H20085> The 2-byte big-endian integer starting at byte offset 5 of each B-Tree page contains the byte-offset from the start of the page to the start of the <i>cell content area</i>, which consumes all space from this offset to the end of the usable region of the page. On page 1, this field is stored at byte offset 105. All B-Tree cells on the page are stored within the cell-content area. <p class=req id=H20086> On each page used as an internal node a of B-Tree structures, the page number of the rightmost child node in the B-Tree structure is stored as a 4-byte big-endian unsigned integer beginning at byte offset 8 of the database page, or byte offset 108 on page 1. <p> This requirement describes the cell content offset array. It applies to both B-Tree variants. <p class=req id=H20087> Immediately following the <i>page header</i> on each B-Tree page is the <i>cell offset array</i>, consisting of <i>N</i> 2-byte big-endian unsigned integers, where <i>N</i> is the number of cells stored on the B-Tree page (H20084). On an internal node B-Tree page, the cell offset array begins at byte offset 12, or on a leaf page, byte offset 8. For the B-Tree node on page 1, these offsets are 112 and 108, respectively. <p class=req id=H20088> The <i>cell offset array</i> and the <i>cell content area</i> (H20085) may not overlap. <p class=req id=H20089> Each value stored in the <i>cell offset array</i> must be greater than or equal to the offset to the <i>cell content area</i> (H20085), and less than the database <i>page size</i>. <p class=req id=H20090> The <i>N</i> values stored within the <i>cell offset array</i> are the byte offsets from the start of the B-Tree page to the beginning of each of the <i>N</i> cells stored on the page. <p class=req id=H20091> No two B-Tree cells may overlap. <p> The following requirements govern management of free-space within the page content area (both table and index B-Tree pages). <p class=req id=H20092> Within the <i>cell content area</i>, all blocks of contiguous free-space (space not used by B-Tree cells) greater than 3 bytes in size are linked together into a linked list, the <i>free block list</i>. Such blocks of free space are known as <i>free blocks</i>. <p class=req id=H20093> The first two bytes of each <i>free block</i> contain the offset of the next <i>free block</i> in the <i>free block list</i> formatted as a 2-byte big-endian integer, relative to the start of the database page. If there is no next <i>free block</i>, then the first two bytes are set to 0x00. <p class=req id=H20094> The second two bytes (byte offsets 2 and 3) of each <i>free block</i> contain the total size of the <i>free block</i>, formatted as a 2-byte big-endian integer. <p class=req id=H20095> On all B-Tree pages, the offset of the first <i>free block</i> in the <i>free block list</i>, relative to the start of the database page, is stored as a 2-byte big-endian integer starting at byte offset 1 of the database page. If there is no first <i>free block</i> (because the <i>free block list</i> is empty), then the two bytes at offsets 1 and 2 of the database page are set to 0x00. On page 1, this field is stored at byte offset 101 of the page. <p class=req id=H20096> Within the cell-content area, all blocks of contiguous free-space (space not used by B-Tree cells) less than or equal to 3 bytes in size are known as <i>fragments</i>. The total size of all <i>fragments</i> on a B-Tree page is stored as a 1-byte unsigned integer at byte offset 7 of the database page. On page 1, this field is stored at byte offset 107. |
︙ | ︙ | |||
1732 1733 1734 1735 1736 1737 1738 | the values read from byte offsets 21 and 22 of the file header, respectively. <center><img src="images/fileformat/indexlongrecord.gif"> <p><i>Figure <span class=fig id=figure_indexlongrecord></span> - Large Record Index B-Tree Cell.</i> </center> | | | | | | | | | | 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 | the values read from byte offsets 21 and 22 of the file header, respectively. <center><img src="images/fileformat/indexlongrecord.gif"> <p><i>Figure <span class=fig id=figure_indexlongrecord></span> - Large Record Index B-Tree Cell.</i> </center> <p class=req id=H20097> Each B-Tree cell belonging to an internal node page of an index B-Tree consists of a 4-byte big-endian unsigned integer, the <i>child page number</i>, followed by a <i>variable length integer</i> field, followed by a <i>database record</i>. The <i>variable length integer</i> field contains the length of the database record in bytes. <p class=req id=H20098> Each B-Tree cell belonging to an leaf page of an index B-Tree consists of a <i>variable length integer</i> field, followed by a <i>database record</i>. The <i>variable length integer</i> field contains the length of the database record in bytes. <p class=req id=H20099> If the database record stored in an index B-Tree page is sufficiently small, then the entire cell is stored within the index B-Tree page. Sufficiently small is defined as equal to or less than <i>max-local</i>, where: <code> <i>max-local</i> := (<i>usable-size</i> - 12) * 64 / 255 - 23</code> <p class=req id=H20100> If the database record stored as part of an index B-Tree cell is too large to be stored entirely within the B-Tree page (as defined by H20052), then only a prefix of the <i>database record</i> is stored within the B-Tree page and the remainder stored in an <i>overflow chain</i>. In this case, the database record prefix is immediately followed by the page number of the first page of the <i>overflow chain</i>, formatted as a 4-byte big-endian unsigned integer. <p class=req id=H20101> When a <i>database record</i> belonging to a table B-Tree cell is stored partially within an <i>overflow page chain</i>, the size of the prefix stored within the index B-Tree page is <i>N</i> bytes, where <i>N</i> is calculated using the following algorithm: <code> <i>min-local</i> := (<i>usable-size</i> - 12) * 32 / 255 - 23 <i>max-local</i> := (<i>usable-size</i> - 12) * 64 / 255 - 23 <i>N</i> := <i>min-local</i> + ((<i>record-size</i> - <i>min-local</i>) % (<i>usable-size</i> - 4)) if( <i>N</i> > <i>max-local</i> ) <i>N</i> := <i>min-local</i></code> <p> Requirements H20101 and H20099 are similar to the algorithms presented in the text above. However instead of <i>min-embedded-fraction</i> and <i>max-embedded-fraction</i> the requirements use the constant values 32 and 64, as well-formed database files are required by H20008 and H20007 to store these values in the relevant database file header fields. <h3 id=table_btrees>Table B-Trees</h3> <p> As noted in section <cite>fileformat_overview</cite>, table B-Trees store a set of unique 64-bit signed integer keys. Associated with each key is a database record. As with index B-Trees, the database |
︙ | ︙ | |||
1818 1819 1820 1821 1822 1823 1824 | Figure <cite>figure_tabletree</cite> depicts a table B-Tree containing a contiguous set of 14 integer keys starting with 1. Each key <i>n</i> has an associated database record R<i>n</i>. All the keys and their associated records are stored in the leaf pages. The internal node pages contain no database data, their only purpose is to provide a way to navigate the tree structure. | | | | | | 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 | Figure <cite>figure_tabletree</cite> depicts a table B-Tree containing a contiguous set of 14 integer keys starting with 1. Each key <i>n</i> has an associated database record R<i>n</i>. All the keys and their associated records are stored in the leaf pages. The internal node pages contain no database data, their only purpose is to provide a way to navigate the tree structure. <p class=req id=H20102> The pages in a table B-Tree structures are arranged into a tree structure such that all leaf pages are at the same depth. <p class=req id=H20103> Each leaf page in a table B-Tree structure contains one or more B-Tree cells, where each cell contains a 64-bit signed integer key value and a database record. <p class=req id=H20104> Each internal node page in a table B-Tree structure contains one or more B-Tree cells, where each cell contains a 64-bit signed integer key value, <i>K</i>, and a child page number, <i>C</i>. All integer key values in all B-Tree cells within the sub-tree headed by page <i>C</i> are less than or equal to <i>K</i>. Additionally, unless <i>K</i> is the smallest integer key value stored on the internal node page, all integer keys within the sub-tree headed by <i>C</i> are greater than <i>K<sub>-1</sub></i>, where <i>K<sub>-1</sub></i> is the largest integer key on the internal node page that is smaller than <i>K</i>. <p class=req id=H20105> As well as child page numbers associated with B-Tree cells, each internal node page in a table B-Tree contains the page number of an extra child page, the <i>right-child page</i>. All key values in all B-Tree cells within the sub-tree headed by the <i>right-child page</i> are greater than all key values stored within B-Tree cells on the internal node page. |
︙ | ︙ | |||
1884 1885 1886 1887 1888 1889 1890 | are SQL NULL. If the schema layer file-format is greater than 2, then the values associated with the "missing" fields are determined by the default value of the associated database table columns. <span class=todo>Reference to CREATE TABLE syntax. How are default values determined?</span> | | | | | | | | | 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 | are SQL NULL. If the schema layer file-format is greater than 2, then the values associated with the "missing" fields are determined by the default value of the associated database table columns. <span class=todo>Reference to CREATE TABLE syntax. How are default values determined?</span> <p class=req id=H20106> In a well-formed database, each table B-Tree contains a single entry for each row in the corresponding logical database table. <p class=req id=H20107> The key value (a 64-bit signed integer) for each B-Tree entry is the same as the value of the rowid field of the corresponding logical database row. <p class=req id=H20108> The SQL values serialized to make up each <i>database record</i> stored as ancillary data in a table B-Tree shall be the equal to the values taken by the <i>N</i> leftmost columns of the corresponding logical database row, where <i>N</i> is the number of values in the database record. <p class=req id=H20109> If a logical database table column is declared as an "INTEGER PRIMARY KEY", then instead of its integer value, an SQL NULL shall be stored in its place in any database records used as ancillary data in a table B-Tree. <p>The following database properties discuss table B-Tree records with implicit (default) values. <p class=req id=H20110> If the database <i>schema layer file-format</i> (the value stored as a 4-byte integer at byte offset 44 of the file header) is 1, then all database records stored as ancillary data in a table B-Tree structure have the same number of fields as there are columns in the corresponding logical database table. <p class=req id=H20111> If the database <i>schema layer file-format</i> value is two or greater and the rightmost <i>M</i> columns of a row contain SQL NULL values, then the corresponding record stored as ancillary data in the table B-Tree has between <i>N</i>-<i>M</i> and <i>N</i> fields, where <i>N</i> is the number of columns in the logical database table. <p class=req id=H20112> If the database <i>schema layer file-format</i> value is three or greater and the rightmost <i>M</i> columns of a row contain their default values according to the logical table declaration, then the corresponding record stored as ancillary data in the table B-Tree may have as few as <i>N</i>-<i>M</i> fields, where <i>N</i> is the number of columns in the logical database table. |
︙ | ︙ | |||
1948 1949 1950 1951 1952 1953 1954 | section <cite>table_btree_cell_format</cite> for details. <li>The format of page 1 is the same as any other table B-Tree, except that 100 bytes less than usual is available for content. The first 100 bytes of page 1 is consumed by the database file header. </ul> | | | | 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 | section <cite>table_btree_cell_format</cite> for details. <li>The format of page 1 is the same as any other table B-Tree, except that 100 bytes less than usual is available for content. The first 100 bytes of page 1 is consumed by the database file header. </ul> <p class=req id=H20113> In a <i>well-formed database file</i>, the first byte of each page used as an internal node of a table B-Tree structure is set to 0x05. <p class=req id=H20114> In a <i>well-formed database file</i>, the first byte of each page used as a leaf node of a table B-Tree structure is set to 0x0D. <p> Most of the requirements specified in section <cite>index_btree_page_format</cite> also apply to table B-Tree pages. The wording of the requirements make it clear when this is |
︙ | ︙ | |||
2038 2039 2040 2041 2042 2043 2044 | header). <p> The following requirements describe the format of table B-Tree cells, and the distribution thereof between B-Tree and overflow pages. | | | | | | | | | | | | 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 | header). <p> The following requirements describe the format of table B-Tree cells, and the distribution thereof between B-Tree and overflow pages. <p class=req id=H20115> B-Tree cells belonging to table B-Tree internal node pages consist of exactly two fields, a 4-byte big-endian unsigned integer immediately followed by a <i>variable length integer</i>. These fields contain the child page number and key value respectively (see H20103). <p class=req id=H20116> B-Tree cells belonging to table B-Tree leaf node pages consist of three fields, two <i>variable length integer</i> values followed by a database record. The size of the database record in bytes is stored in the first of the two <i>variable length integer</i> fields. The second of the two <i>variable length integer</i> fields contains the 64-bit signed integer key (see H20103). <p class=req id=H20117> If the size of the record stored in a table B-Tree leaf page cell is less than or equal to (<i>usable page size</i>-35) bytes, then the entire cell is stored on the B-Tree leaf page. In a well-formed database, <i>usable page size</i> is the same as the database <i>page size</i>. <p class=req id=H20118> If a table B-Tree cell is too large to be stored entirely on a leaf page (as defined by H20117), then a prefix of the cell is stored on the leaf page, and the remainder stored in an <i>overflow page chain</i>. In this case the cell prefix stored on the B-Tree leaf page is immediately followed by a 4-byte big-endian unsigned integer containing the page number of the first overflow page in the chain. <p class=req id=H20119> When a table B-Tree cell is stored partially in an <i>overflow page chain</i>, the prefix stored on the B-Tree leaf page consists of the two <i>variable length integer</i> fields, followed by the first <i>N</i> bytes of the database record, where <i>N</i> is determined by the following algorithm: <code> <i>min-local</i> := (<i>usable-size</i> - 12) * 255 / 32 - 23 <i>max-local</i> := (<i>usable-size</i> - 35) <i>N</i> := <i>min-local</i> + (<i>record-size</i> - <i>min-local</i>) % (<i>usable-size</i> - 4) if( <i>N</i> > <i>max-local</i> ) N := <i>min-local</i> </code> <p> Requirement H20119 is very similar to the algorithm presented in the text above. Instead of <i>min-embedded-fraction</i>, it uses the constant value 32, as well-formed database files are required by H20009 to store this value in the relevant database file header field. <h3 id="overflow_page_chains">Overflow Page Chains</h3> <p> Sometimes, a database record stored in either an index or table B-Trees is too large to fit entirely within a B-Tree cell. In this case part of the record is stored within the B-Tree cell and the |
︙ | ︙ | |||
2128 2129 2130 2131 2132 2133 2134 | <p> Each overflow page except for the last one in the linked list contains <i>available-space</i> bytes of record data. The last page in the list contains the remaining data, starting at byte offset 4. The value of the "next page" field on the last page in an overflow chain is undefined. | | | | | | | | | 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 | <p> Each overflow page except for the last one in the linked list contains <i>available-space</i> bytes of record data. The last page in the list contains the remaining data, starting at byte offset 4. The value of the "next page" field on the last page in an overflow chain is undefined. <p class=req id=H20120> A single <i>overflow page</i> may store up to <i>available-space</i> bytes of database record data, where <i>available-space</i> is equal to (<i>usable-size</i> - 4). <p class=req id=H20121> When a database record is too large to store within a B-Tree page (see H20117 and H20100), a prefix of the record is stored within the B-Tree page and the remainder stored across <i>N</i> overflow pages. In this case <i>N</i> is the minimum number of pages required to store the portion of the record not stored on the B-Tree page, given the maximum payload per overflow page defined by H20120. <p class=req id=H20122> The list of overflow pages used to store a single database record are linked together in a singly linked list known as an <i>overflow chain</i>. The first four bytes of each page except the last in an <i>overflow chain</i> are used to store the page number of the next page in the linked list, formatted as an unsigned big-endian integer. The first four bytes of the last page in an <i>overflow chain</i> are set to 0x00. <p class=req id=H20123> Each overflow page except the last in an <i>overflow chain</i> contains <i>N</i> bytes of record data starting at byte offset 4 of the page, where <i>N</i> is the maximum payload per overflow page, as defined by H20120. The final page in an <i>overflow chain</i> contains the remaining data, also starting at byte offset 4. <h2 id=free_page_list>The Free Page List</h2> <p> Sometimes, after deleting data from the database, SQLite removes pages from B-Tree structures. If these pages are not immediately required for some other purpose, they are placed on the free page list. The |
︙ | ︙ | |||
2199 2200 2201 2202 2203 2204 2205 | <p> All trunk pages in the free-list except for the first contain the maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number of the first page in the linked list of free-list trunk pages is stored as a 4-byte big-endian unsigned integer at offset 32 of the file header (section <cite>file_header</cite>). | | | | | | | | | | | | 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 | <p> All trunk pages in the free-list except for the first contain the maximum possible number of references to leaf pages. <span class=todo>Is this actually true in an auto-vacuum capable database?</span> The page number of the first page in the linked list of free-list trunk pages is stored as a 4-byte big-endian unsigned integer at offset 32 of the file header (section <cite>file_header</cite>). <p class=req id=H20124> All <i>free pages</i> in a <i>well-formed database file</i> are part of the database <i>free page list</i>. <p class=req id=H20125> Each free page is either a <i>free list trunk</i> page or a <i>free list leaf</i> page. <p class=req id=H20126> All <i>free list trunk</i> pages are linked together into a singly linked list. The first 4 bytes of each page in the linked list contains the page number of the next page in the list, formatted as an unsigned big-endian integer. The first 4 bytes of the last page in the linked list are set to 0x00. <p class=req id=H20127> The second 4 bytes of each <i>free list trunk</i> page contains the number of </i>free list leaf</i> page numbers stored on the free list trunk page, formatted as an unsigned big-endian integer. <p class=req id=H20128> Beginning at byte offset 8 of each <i>free list trunk</i> page are <i>N</i> page numbers, each formatted as a 4-byte unsigned big-endian integers, where <i>N</i> is the value described in requirement H20127. <p class=req id=H20129> All page numbers stored on all <i>free list trunk</i> pages refer to database pages that are <i>free list leaves</i>. <p class=req id=H20130> The page number of each <i>free list leaf</i> page in a well-formed database file appears exactly once within the set of pages numbers stored on <i>free list trunk</i> pages. <p>The following statements govern the two 4-byte big-endian integers associated with the <i>free page list</i> structure in the database file header. <p class=req id=H20131> The total number of pages in the free list, including all <i>free list trunk</i> and <i>free list leaf</i> pages, is stored as a 4-byte unsigned big-endian integer at offset 36 of the database file header. <p class=req id=H20132> The page number of the first page in the linked list of <i>free list trunk</i> pages is stored as a 4-byte big-endian unsigned integer at offset 32 of the database file header. If there are no <i>free list trunk</i> pages in the database file, then the value stored at offset 32 of the database file header is 0. |
︙ | ︙ | |||
2328 2329 2330 2331 2332 2333 2334 | table entries for the <i>num-entries</i> pages that follow it in the database file: <pre> <i>pointer-map-page-number</i> := 2 + <i>n</i> * <i>num-entries</i> </pre> | | | | | | | | | | | | | 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 | table entries for the <i>num-entries</i> pages that follow it in the database file: <pre> <i>pointer-map-page-number</i> := 2 + <i>n</i> * <i>num-entries</i> </pre> <p class=req id=H20133> Non auto-vacuum databases do not contain pointer map pages. <p class=req id=H20134> In an auto-vacuum database file, every <i>(num-entries + 1)</i>th page beginning with page 2 is designated a pointer-map page, where <i>num-entries</i> is calculated as: <code> <i>num-entries</i> := <i>database-usable-page-size</i> / 5 </code> <p class=req id=H20135> In an auto-vacuum database file, each pointer-map page contains a pointer map entry for each of the <i>num-entries</i> (defined by H20134) pages that follow it, if they exist. <p class=req id=H20136> Each pointer-map page entry consists of a 1-byte page type and a 4-byte page parent number, 5 bytes in total. <p class=req id=H20137> Pointer-map entries are packed into the pointer-map page in order, starting at offset 0. The entry associated with the database page that immediately follows the pointer-map page is located at offset 0. The entry for the following page at offset 5 etc. <p> The following requirements govern the content of pointer-map entries. <p class=req id=H20138> For each page except page 1 in an auto-vacuum database file that is the root page of a B-Tree structure, the page type of the corresponding pointer-map entry is set to the value 0x01 and the parent page number is zero. <p class=req id=H20139> For each page that is a part of an auto-vacuum database file free-list, the page type of the corresponding pointer-map entry is set to the value 0x02 and the parent page number is zero. <p class=req id=H20140> For each page in a well-formed auto-vacuum database that is the first page in an overflow chain, the page type of the corresponding pointer-map entry is set to 0x03 and the parent page number field is set to the page number of the B-Tree page that contains the start of the B-Tree cell stored in the overflow-chain. <p class=req id=H20141> For each page that is the second or a subsequent page in an overflow chain, the page type of the corresponding pointer-map entry is set to 0x04 and the parent page number field is set to the page number of the preceding page in the overflow chain. <p class=req id=H20142> For each page that is not a root page but is a part of a B-Tree tree structure (not part of an overflow chain), the page type of the corresponding pointer-map entry is set to the value 0x05 and the parent page number field is set to the page number of the parent node in the B-Tree structure. |
︙ | ︙ |
Changes to pages/fileio.in.
︙ | ︙ | |||
243 244 245 246 247 248 249 | is to determine the <i>page-size</i> used by the database file. Because it is not possible to be certain as to the <i>page-size</i> without holding at least a <i>shared lock</i> on the database file (because some other <i>database connection</i> might have changed it since the <i>database file header</i> was read), the value read from the <i>database file header</i> is known as the <i>expected page size</i>. | | | | | | 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 | is to determine the <i>page-size</i> used by the database file. Because it is not possible to be certain as to the <i>page-size</i> without holding at least a <i>shared lock</i> on the database file (because some other <i>database connection</i> might have changed it since the <i>database file header</i> was read), the value read from the <i>database file header</i> is known as the <i>expected page size</i>. <p class=req id=H21006> When a new <i>database connection</i> is required, SQLite shall attempt to open a file-handle on the database file. If the attempt fails, then no new <i>database connection</i> is created and an error returned. <p class=req id=H21007> When a new <i>database connection</i> is required, after opening the new file-handle, SQLite shall attempt to read the first 100 bytes of the database file. If the attempt fails for any other reason than that the opened file is less than 100 bytes in size, then the file-handle is closed, no new <i>database connection</i> is created and an error returned instead. <p class=req id=H21008> If the <i>database file header</i> is successfully read from a newly opened database file, the connections <i>expected page-size</i> shall be set to the value stored in the <i>page-size field</i> of the database header. <p class=req id=H21009> If the <i>database file header</i> cannot be read from a newly opened database file (because the file is less than 100 bytes in size), the connections <i>expected page-size</i> shall be set to the compile time value of the SQLITE_DEFAULT_PAGESIZE option. <h2>Closing a Connection</h2> |
︙ | ︙ | |||
318 319 320 321 322 323 324 | pages are read one at a time. SQLite never reads partial pages and always uses a single call to xRead() for each required page. After reading the data for a database page, SQLite adds it to the connections <i>page cache</i> so that it does not have to be read if required again. Refer to section <cite>page_cache_algorithms</cite> for a description of how this affects the IO performed by SQLite. | | | | | | 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 | pages are read one at a time. SQLite never reads partial pages and always uses a single call to xRead() for each required page. After reading the data for a database page, SQLite adds it to the connections <i>page cache</i> so that it does not have to be read if required again. Refer to section <cite>page_cache_algorithms</cite> for a description of how this affects the IO performed by SQLite. <p class=req id=H21001> Except for the read operation required by H21007 and those reads made as part of opening a read-only transaction, SQLite shall only read data from a <i>database connection</i> while the <i>database connection</i> has an open read-only or read/write transaction. <p> In the above requirement, reading data from a database connection includes retrieving data from the connections <i>page cache</i>. <p class=req id=H21002> Aside from those read operations described by H21007 and H21XXX, SQLite shall read data from the database in aligned blocks of <i>page-size</i> bytes, where <i>page-size</i> is the database page size used by the database file. <h2 id=open_read_only_trans>Opening a Read-Only Transaction</h2> <p> Before data may be read from a <i>database connection</i>, a |
︙ | ︙ | |||
381 382 383 384 385 386 387 | enumerated above. If this happens, then the <i>shared-lock</i> is released (if it was obtained) and an error returned to the user. Step 2 of the procedure above is described in more detail in section <cite>hot_journal_detection</cite>. Section <cite>cache_validation</cite> describes the process identified by step 3 above. Further detail on step 4 may be found in section <cite>read_page_one</cite>. | | | | | | | | 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 | enumerated above. If this happens, then the <i>shared-lock</i> is released (if it was obtained) and an error returned to the user. Step 2 of the procedure above is described in more detail in section <cite>hot_journal_detection</cite>. Section <cite>cache_validation</cite> describes the process identified by step 3 above. Further detail on step 4 may be found in section <cite>read_page_one</cite>. <p class=req id=H21010> When required to open a <i>read-only transaction</i> using a <i>database connection</i>, SQLite shall first attempt to obtain a <i>shared-lock</i> on the file-handle open on the database file. <p class=req id=H21011> If, while opening a <i>read-only transaction</i>, SQLite fails to obtain the <i>shared-lock</i> on the database file, then the process is abandoned, no transaction is opened and an error returned to the user. <p> The most common reason an attempt to obtain a <i>shared-lock</i> may fail is that some other connection is holding an <i>exclusive</i> or <i>pending lock</i>. However it may also fail because some other error (e.g. IO, comms related) occurs within the call to the xLock() method. <p class=req id=H21003> While opening a <i>read-only transaction</i>, after successfully obtaining a <i>shared lock</i> on the database file, SQLite shall attempt to detect and roll back a <i>hot journal file</i> associated with the same database file. <p class=req id=H21012> If, while opening a <i>read-only transaction</i>, SQLite encounters an error while attempting to detect or roll back a <i>hot journal file</i>, then the <i>shared-lock</i> on the database file is released, no transaction is opened and an error returned to the user. <p> Section <cite>hot_journal_detection</cite> contains a description of and requirements governing the detection of a hot-journal file refered to in the above requirements. <p class=req id=H21004> Assuming no errors have occured, then after attempting to detect and roll back a <i>hot journal file</i>, if the connections <i>page cache</i> is not empty, then SQLite shall validate the contents of the <i>page cache</i> by testing the <i>file change counter</i>. This procedure is known as <i>cache validiation</i>. <p class=req id=H21005> If the contents of the <i>page cache</i> are found to be invalid by the check prescribed by F20040, SQLite shall discard the cache contents before continuing. <h3 id=hot_journal_detection>Hot Journal Detection</h3> <p> |
︙ | ︙ | |||
495 496 497 498 499 500 501 | <cite>rollback</cite>). </ol> <p> The following requirements describe step 1 of the above procedure in more detail. | | | | | | | | | 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 | <cite>rollback</cite>). </ol> <p> The following requirements describe step 1 of the above procedure in more detail. <p class=req id=H21014> When required to attempt to detect a <i>hot-journal file</i>, SQLite shall first use the xAccess() method of the VFS layer to check if a journal file exists in the file-system. <p class=req id=H21015> When required to attempt to detect a <i>hot-journal file</i>, if the call to xAccess() required by H21014 indicates that a journal file does not exist, then the attempt to detect a <i>hot-journal file</i> is finished. A <i>hot-journal file</i> was not detected. <p> The following requirements describe step 2 of the above procedure in more detail. <p class=req id=H21016> When required to attempt to detect a <i>hot-journal file</i>, if the call to xAccess() required by H21014 indicates that a journal file is present, then the xCheckReservedLock() method of the database file file-handle is invoked to determine whether or not some other process is holding a <i>reserved</i> or greater lock on the database file. <p class=req id=H21017> If the call to xCheckReservedLock() required by H21016 indicates that some other <i>database connection</i> is holding a <i>reserved</i> or greater lock on the database file, <p class=todo> Finish this section. <h3 id=cache_validation>Cache Validation</h3> |
︙ | ︙ | |||
551 552 553 554 555 556 557 | opening a new <i>read-only transaction</i>, SQLite checks the value of the <i>file change counter</i> stored in the database file. If the value has not changed since the database file was unlocked, then the contents of the <i>page cache</i> can be trusted. If the value has changed, then the <i>page cache</i> cannot be trusted and all data is discarded. | | | | | | | | 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 | opening a new <i>read-only transaction</i>, SQLite checks the value of the <i>file change counter</i> stored in the database file. If the value has not changed since the database file was unlocked, then the contents of the <i>page cache</i> can be trusted. If the value has changed, then the <i>page cache</i> cannot be trusted and all data is discarded. <p class=req id=H21018> When a file-handle open on a database file is unlocked, if the <i>page cache</i> belonging to the associated <i>database connection</i> is not empty, SQLite shall store the value of the <i>file change counter</i> internally. <p class=req id=H21019> When required to perform <i>cache validation</i> as part of opening a <i>read transaction</i>, SQLite shall read a 16 byte block starting at byte offset 24 of the <i>database file</i> using the xRead() method of the <i>database connections</i> file handle. <p class=todo> Why a 16 byte block? Why not 4? (something to do with encrypted databases). <p class=req id=H21020> While performing <i>cache validation</i>, after loading the 16 byte block as required by H21019, SQLite shall compare the 32-bit big-endian integer stored in the first 4 bytes of the block to the most recently stored value of the <i>file change counter</i> (see H21018). If the values are not the same, then SQLite shall conclude that the contents of the cache are invalid. <p> Requirement H21005 (section <cite>open_read_only_trans</cite>) specifies the action SQLite is required to take upon determining that the cache contents are invalid. <h3 id=read_page_one>Page 1 and the Expected Page Size</h3> <p> As the last step in opening a <i>read transaction</i> on a database |
︙ | ︙ | |||
599 600 601 602 603 604 605 | the connections <i>expected page size</i>. The <i>expected page size</i> is the value of the <i>page-size</i> field read from the <i>database file header</i> while opening the database connection (see section <cite>open_new_connection</cite>), or the <i>page-size</i> stored of the database file when the most <i>read transaction</i> was concluded. | | | | | | | | | | 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 | the connections <i>expected page size</i>. The <i>expected page size</i> is the value of the <i>page-size</i> field read from the <i>database file header</i> while opening the database connection (see section <cite>open_new_connection</cite>), or the <i>page-size</i> stored of the database file when the most <i>read transaction</i> was concluded. <p class=req id=H21021> During the conclusing of a <i>read transaction</i>, before unlocking the database file, SQLite shall set the connections <i>expected page size</i> to the current database <i>page-size</i>. <p class=req id=H21022> As part of opening a new <i>read transaction</i>, immediately after performing <i>cache validation</i>, if there is no data for database page 1 in the <i>page cache</i>, SQLite shall read <i>N</i> bytes from the start of the database file using the xRead() method of the connections file handle, where <i>N</i> is the connections current <i>expected page size</i> value. <p class=req id=H21023> If page 1 data is read as required by H21023, then the value of the <i>page-size</i> field that appears in the database file header that consumes the first 100 bytes of the read block is not the same as the connections current <i>expected page size</i>, then the <i>expected page size</i> is set to this value, the database file is unlocked and the entire procedure to open a <i>read transaction</i> is repeated. <p class=req id=H21024> If page 1 data is read as required by H21023, then the value of the <i>page-size</i> field that appears in the database file header that consumes the first 100 bytes of the read block is the same as the connections current <i>expected page size</i>, then the block of data read is added to the connections <i>page cache</i> as page 1. <h2>Ending a Read-only Transaction</h2> <p> To end a <i>read-only transaction</i>, SQLite simply relinquishes the <i>shared lock</i> on the file-handle open on the database file. No other action is required. <p class=req id=H21013> When required to end a <i>read-only transaction</i>, SQLite shall relinquish the <i>shared lock</i> held on the database file by calling the xUnlock() method of the file-handle. <p> See also requirements H21018 and H21021 above. <h1 id=writing_data>Writing Data</h1> <p> Safely writing data to a database file is also a complex procedure. The database file must be updated in such a way that if a power failure, operating system crash or application fault occurs while SQLite is midway through writing to the database file the database contents are still |
︙ | ︙ | |||
714 715 716 717 718 719 720 | started may be required. This may occur if the user explicitly requests transaction rollback (i.e. by issuing a "ROLLBACK" command), or automatically, as a result of encountering an SQL constraint (see <cite>sql_sqlitert_requirements</cite>). For this reason, the original page content is stored in the <i>journal file</i> before the page is even modified within the <i>page cache</i>. | | | | 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 | started may be required. This may occur if the user explicitly requests transaction rollback (i.e. by issuing a "ROLLBACK" command), or automatically, as a result of encountering an SQL constraint (see <cite>sql_sqlitert_requirements</cite>). For this reason, the original page content is stored in the <i>journal file</i> before the page is even modified within the <i>page cache</i>. <p class=req id=H21025> Before modifying or adding any in-memory <i>page cache</i> pages in preparation for writing to the <i>database file</i>, the <i>database connection</i> shall open a <i>write transaction</i> on the database file. <p class=req id=H21026> Before modifying the <i>page cache</i> image of a database page that existed and was not a <i>free-list leaf</i> page when the current <i>write transaction</i> began, SQLite shall ensure that the original page content has been written to the journal file (<i>journalled</i>). <p class=todo> If the sector size is larger than the page-size, coresident pages must |
︙ | ︙ | |||
835 836 837 838 839 840 841 | the VFS xOpen method), and a <i>journal file header</i> written to the start of it using a single call to the file handles xWrite method. </ol> <p> Requirements describing step 1 of the above procedure in detail: | | | | | | | | 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 | the VFS xOpen method), and a <i>journal file header</i> written to the start of it using a single call to the file handles xWrite method. </ol> <p> Requirements describing step 1 of the above procedure in detail: <p class=req id=H21035> When required to open a <i>write transaction</i> on the database, SQLite shall first open a <i>read transaction</i>, if the <i>database connection</i> in question has not already opened one. <p class=req id=H21036> When required to open a <i>write transaction</i> on the database, after ensuring a <i>read transaction</i> has already been opened, SQLite shall obtain a <i>reserved lock</i> on the database file by calling the xLock method of the file-handle open on the database file. <p> Requirements describing step 2 of the above procedure in detail: <p class=req id=H21037> When required to open a <i>write transaction</i> on the database, after obtaining a <i>reserved lock</i> on the database file, SQLite shall open a read/write file-handle on the corresponding <i>journal file</i>. <p class=req id=H21038> When required to open a <i>write transaction</i> on the database, after opening a file-handle on the <i>journal file</i>, SQLite shall write a <i>journal header</i> into the first <i>sector-size</i> bytes of the journal file, using single call to the xWrite method of the recently opened file-handle. <p> Requirements describing the <i>journal header</i> written to the <i>journal file</i>: <p class=req id=H21039> The first 8 bytes of the <i>journal header</i> required to be written by H21038 shall be: <p class=todo> Reqirements describing the details of opening a <i>write transaction</i>. <p class=todo> Reqirement for error handling? |
︙ | ︙ | |||
890 891 892 893 894 895 896 | back. <p> A page is journalled by adding a <i>journal record</i> to the <i> journal file</i>. The format of a <i>journal record</i> is described in section <cite>journal_record_format</cite>. | | | | | | | | | 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 | back. <p> A page is journalled by adding a <i>journal record</i> to the <i> journal file</i>. The format of a <i>journal record</i> is described in section <cite>journal_record_format</cite>. <p class=req id=H21027> When required to <i>journal a database page</i>, SQLite shall first append the <i>page number</i> of the page being journalled to the <i>journal file</i>, formatted as a 4-byte big-endian unsigned integer, using a single call to the xWrite method of the file-handle opened on the journal file. <p class=req id=H21028> When required to <i>journal a database page</i>, if the attempt to append the <i>page number</i> to the journal file is successful, then the current page data (<i>page-size</i> bytes) shall be appended to the journal file, using a single call to the xWrite method of the file-handle opened on the journal file. <p class=req id=H21029> When required to <i>journal a database page</i>, if the attempt to append the current page data to the journal file is successful, then SQLite shall append a 4-byte big-endian integer checksum value to the to the journal file, using a single call to the xWrite method of the file-handle opened on the journal file. <p> The checksum value written to the <i>journal file</i> immediately after the page data (requirement H21029), is a function of both the page data and the <i>checksum initializer</i> field stored in the <i>journal header</i> (see section <cite>journal_header_format</cite>). Specifically, it is the sum of the <i>checksum initializer</i> and the value of every 200th byte of page data interpreted as an 8-bit unsigned integer, starting with the (<i>page-size</i> % 200)'th byte of page data. For example, if the <i>page-size</i> is 1024 bytes, then a checksum is calculated by adding the values of the bytes at offsets 23, 223, 423, 623, 823 and 1023 (the last byte of the page) together with the value of the <i>checksum initializer</i>. <p class=req id=H21030> The checksum value written to the <i>journal file</i> by the write required by H21029 shall be equal to the sum of the <i>checksum initializer</i> field stored in the <i>journal header</i> (H21XXX) and every 200th byte of the page data, beginning with the (<i>page-size</i> % 200)th byte. <p> The '%' character is used in the two paragraphs to represent the modulo operator, just as it is in programming languages such as C, Java and Javascript. |
︙ | ︙ | |||
1024 1025 1026 1027 1028 1029 1030 | prevent any other <i>database connection</i> from reading the database after a subset of the modifications that have been or will be made by a <i>write transaction</i> have been written into the database file. <p class=todo> Journal header operations? | | | | | | | 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 | prevent any other <i>database connection</i> from reading the database after a subset of the modifications that have been or will be made by a <i>write transaction</i> have been written into the database file. <p class=todo> Journal header operations? <p class=req id=H21031> Unless a <i>pending</i> or <i>exclusive</i> lock has already been obtained, when SQLite is required to <i>write out a page cache</i>, it shall first upgrade the lock on the database file to a <i>pending lock</i> using a call to the xLock method of the file-handle open on the database file. <p class=req id=H21032> Unless one has already been obtained, when SQLite is required to <i>write out a page cache</i>, after successfully obtaining a <i>pending lock</i> it shall upgrade the lock on the database file to an <i>exclusive lock</i> using a call to the xLock method of the file-handle open on the database file. <p class=todo> If obtaining the lock fails? <p class=req id=H21033> When SQLite is required to <i>write out a page cache</i>, if the required <i>exclusive lock</i> is already held or successfully obtained, SQLite shall copy the contents of all pages that have been modified within the <i>page cache</i> to the database file, using a single write of <i>page-size</i> bytes for each. <p class=req id=H21034> When the modified contents of a <i>page cache</i> is copied into the database file, as required by H21033, the write operations shall occur in <i>page number</i> order, from lowest to highest. <p> The above requirement to write data to the database file in the order in which it occurs in the file is added to improve performance. On many systems, sorting the regions of the file to be written before writing to them allows the storage hardware to operate more efficiently. |
︙ | ︙ |