Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Update the documentation for the new journal format to be introduced in version 2.8.0. (CVS 865) |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
e05a7a552f694158ee449d8682f5c137 |
User & Date: | drh 2003-02-13 02:54:03.000 |
Context
2003-02-13
| ||
02:54 | Update the documentation for the new journal format to be introduced in version 2.8.0. (CVS 1729) (check-in: dc53d92141 user: drh tags: trunk) | |
02:54 | Update the documentation for the new journal format to be introduced in version 2.8.0. (CVS 865) (check-in: e05a7a552f user: drh tags: trunk) | |
01:58 | Fix a bug in the rollback logic for the new journal format. (CVS 864) (check-in: 7c22aa3f81 user: drh tags: trunk) | |
Changes
Changes to www/changes.tcl.
︙ | ︙ | |||
20 21 22 23 24 25 26 27 28 29 30 31 32 33 | } proc chng {date desc} { puts "<DT><B>$date</B></DT>" puts "<DD><P><UL>$desc</UL></P></DD>" } chng {2003 Jan 25 (2.7.6)} { <li>Performance improvements. The library is now much faster.</li> <li>Added the <b>sqlite_set_authorizer()</b> API. Formal documentation has not been written - see the source code comments for instructions on how to use this function.</li> <li>Fix a bug in the GLOB operator that was preventing it from working | > > > > > > | 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | } proc chng {date desc} { puts "<DT><B>$date</B></DT>" puts "<DD><P><UL>$desc</UL></P></DD>" } chng {2003 Feb 14 (2.8.0)} { <li>Modified the journal file format to make it more resistant to corruption that can occur after an OS crash or power failure.</li> <li>Added a new C/C++ API that does not use callback for returning data.</li> } chng {2003 Jan 25 (2.7.6)} { <li>Performance improvements. The library is now much faster.</li> <li>Added the <b>sqlite_set_authorizer()</b> API. Formal documentation has not been written - see the source code comments for instructions on how to use this function.</li> <li>Fix a bug in the GLOB operator that was preventing it from working |
︙ | ︙ |
Changes to www/fileformat.tcl.
1 2 3 | # # Run this script to generated a fileformat.html output file # | | | 1 2 3 4 5 6 7 8 9 10 11 | # # Run this script to generated a fileformat.html output file # set rcsid {$Id: fileformat.tcl,v 1.8 2003/02/13 02:54:04 drh Exp $} puts {<html> <head> <title>SQLite Database File Format</title> </head> <body bgcolor="white"> <h1 align="center"> |
︙ | ︙ | |||
68 69 70 71 72 73 74 | main database file. As far as the pager is concerned, each page contains 1024 bytes of arbitrary data. But there is structure to the journal file. </p> <p> A journal file begins with 8 bytes as follows: | | | | > > > > > | > > > | | > > > | > > > > > | < | | > | > > > > > > > > > > > > > > > > > | > > | | > | 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 | main database file. As far as the pager is concerned, each page contains 1024 bytes of arbitrary data. But there is structure to the journal file. </p> <p> A journal file begins with 8 bytes as follows: 0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, and 0xd6. Processes that are attempting to rollback a journal use these 8 bytes as a sanity check to make sure the file they think is a journal really is a valid journal. Prior version of SQLite used different journal file formats. The magic numbers for these prior formats is differ so that if a new version of the library attempts to rollback a journal created by an earlier version, it can detect that the journal uses an obsolete format and make the necessary adjustments. This article describes only the newest journal format - supported as of version 2.8.0. </p> <p> Following the 8 byte prefix is a three 4-byte integers that tell us the number of pages that have been committed to the journal, a magic number used for sanity checking each page, and the original size of the main database file before the transaction was started. The number of committed pages is used to limit how far into the journal to read. The use of the checksum magic number is described below. The original size of the database is used to restore the database file back to its original size. The size is expressed in pages (1024 bytes per page). </p> <p> All three integers in the journal header and all other multi-byte numbers used in the journal file are big-endian. That means that the most significant byte occurs first. That way, a journal file that is originally created on one machine can be rolled back by another machine that uses a different byte order. So, for example, a transaction that failed to complete on your big-endian SparcStation can still be rolled back on your little-endian Linux box. </p> <p> After the 8-byte prefix and the three 4-byte integers, the journal file consists of zero or more page records. Each page record is a 4-byte (big-endian) page number followed by 1024 bytes of data and a 4-byte checksum. The data is the original content of the database page before the transaction was started. So to roll back the transaction, the data is simply written into the corresponding page of the main database file. Pages can appear in the journal in any order, but they are guaranteed to appear only once. All page numbers will be between 1 and the maximum specified by the page size integer that appeared at the beginning of the journal. </p> <p> The so-called checksum at the end of each record is not really a checksum - it is the sum of the page number and the magic number which was the second integer in the journal header. The purpose of this value is to try to detect journal corruption that might have occurred because of a power loss or OS crash that occurred which the journal file was being written to disk. It could have been the case that the meta-data for the journal file, specifically the size of the file, had been written to the disk so that when the machine reboots it appears that file is large enough to hold the current record. But even though the file size has changed, the data for the file might not have made it to the disk surface at the time of the OS crash or power loss. This means that after reboot, the end of the journal file will contain quasi-random garbage data. The checksum is an attempt to detect such corruption. If the checksum does not match, that page of the journal is not rolled back. </p> <p> Here is a summary of the journal file format: </p> <ul> <li>8 byte prefix: 0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd6</li> <li>4 byte number of records in journal</li> <li>4 byte magic number used for page checksums</li> <li>4 byte initial database page count</li> <li>Zero or more instances of the following: <ul> <li>4 byte page number</li> <li>1024 bytes of original data for the page</li> <li>4 byte checksum</li> </ul> </li> </ul> <h2>3.0 The B-Tree Layer</h2> <p> |
︙ | ︙ | |||
231 232 233 234 235 236 237 | </p> <ul> <li>48 byte header string</li> <li>4 byte integer used to determine the byte-order</li> <li>4 byte integer which is the first page of the freelist</li> <li>4 byte integer which is the number of pages on the freelist</li> | | | | 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 | </p> <ul> <li>48 byte header string</li> <li>4 byte integer used to determine the byte-order</li> <li>4 byte integer which is the first page of the freelist</li> <li>4 byte integer which is the number of pages on the freelist</li> <li>36 bytes of meta-data arranged as nine 4-byte integers</li> <li>928 bytes of unused space</li> </ul> <h3>3.2 Structure Of A Single B-Tree Page</h3> <p> Conceptually, a b-tree page contains N database entries and N+1 pointers to other b-tree pages. |
︙ | ︙ | |||
737 738 739 740 741 742 743 | by the DEFAULT_CACHE_SIZE pragma. If the value is positive it means that synchronous behavior is enable (via the DEFAULT_SYNCHRONOUS pragma) and if negative it means that synchronous behavior is disabled. </p> <p> | | > > > > > > | 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 | by the DEFAULT_CACHE_SIZE pragma. If the value is positive it means that synchronous behavior is enable (via the DEFAULT_SYNCHRONOUS pragma) and if negative it means that synchronous behavior is disabled. </p> <p> The fourth meta-value is safety level added in version 2.8.0. A value of 1 corresponds to a SYNCHRONOUS setting of OFF. In other words, SQLite does not pause to wait for journal data to reach the disk surface before overwriting pages of the database. A value of 2 corresponds to a SYNCHRONOUS setting of NORMAL. A value of 3 corresponds to a SYNCHRONOUS setting of FULL. If the value is 0, that means it has not been initialized so the default synchronous setting of NORMAL is used. </p> } puts { <p><hr /></p> <p><a href="index.html"><img src="/goback.jpg" border=0 /> Back to the SQLite Home Page</a> </p> </body></html>} |
Changes to www/formatchng.tcl.
1 2 3 | # # Run this Tcl script to generate the formatchng.html file. # | | | 1 2 3 4 5 6 7 8 9 10 11 | # # Run this Tcl script to generate the formatchng.html file. # set rcsid {$Id: formatchng.tcl,v 1.8 2003/02/13 02:54:04 drh Exp $ } puts {<html> <head> <title>File Format Changes in SQLite</title> </head> <body bgcolor=white> <h1 align=center> |
︙ | ︙ | |||
153 154 155 156 157 158 159 160 161 162 163 164 165 166 | <p>Because "text" columns have a different sort order from numeric, indices on "text" columns occur in a different order for version 2.7.0 and later database. Hence version 2.6.3 and earlier of SQLite will be unable to read a 2.7.0 or later database. But version 2.7.0 and later of SQLite will read earlier database version.</p> </td> </tr> </table> </blockquote> <p> To perform a database reload, have ready versions of the <b>sqlite</b> command-line utility for both the old and new version of SQLite. Call these two executables "<b>sqlite-old</b>" | > > > > > > > > > > > > > > > > > > | 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 | <p>Because "text" columns have a different sort order from numeric, indices on "text" columns occur in a different order for version 2.7.0 and later database. Hence version 2.6.3 and earlier of SQLite will be unable to read a 2.7.0 or later database. But version 2.7.0 and later of SQLite will read earlier database version.</p> </td> </tr> <tr> <td valign="top">2.7.6 to 2.8.0</td> <td valign="top">2003-Feb-14</td> <td><p>Version 2.8.0 introduces a change to the format of the rollback journal file. The main database file format is unchanged. Versions 2.7.6 and earlier can read and write 2.8.0 databases and vice versa. Version 2.8.0 can rollback a transation that was started by version 2.7.6 and earlier. But version 2.7.6 and earlier cannot rollback a transaction started by version 2.8.0 or later.</p> <p>The only time this would ever be an issue is when you have a program using version 2.8.0 or later that crashes with an incomplete transaction, then you try to examine the database using version 2.7.6 or earlier. The 2.7.6 code will not be able to read the journal file and thus will not be able to rollback the incomplete transaction to restore the database.</p> </td> </tr> </table> </blockquote> <p> To perform a database reload, have ready versions of the <b>sqlite</b> command-line utility for both the old and new version of SQLite. Call these two executables "<b>sqlite-old</b>" |
︙ | ︙ |
Changes to www/lang.tcl.
1 2 3 | # # Run this Tcl script to generate the sqlite.html file. # | | | 1 2 3 4 5 6 7 8 9 10 11 | # # Run this Tcl script to generate the sqlite.html file. # set rcsid {$Id: lang.tcl,v 1.50 2003/02/13 02:54:04 drh Exp $} puts {<html> <head> <title>Query Language Understood By SQLite</title> </head> <body bgcolor=white> <h1 align=center> |
︙ | ︙ | |||
1076 1077 1078 1079 1080 1081 1082 | will hold in memory at once. Each page uses about 1.5K of memory. This pragma works like the <b>cache_size</b> pragma with the addition feature that it changes the cache size persistently. With this pragma, you can set the cache size once and that setting is retained and reused everytime you reopen the database.</p></li> <li><p><b>PRAGMA default_synchronous; | | > | | | > | > | < < | > > > > > | | | > | | | 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 | will hold in memory at once. Each page uses about 1.5K of memory. This pragma works like the <b>cache_size</b> pragma with the addition feature that it changes the cache size persistently. With this pragma, you can set the cache size once and that setting is retained and reused everytime you reopen the database.</p></li> <li><p><b>PRAGMA default_synchronous; <br>PRAGMA default_synchronous = FULL; <br>PRAGMA default_synchronous = NORMAL; <br>PRAGMA default_synchronous = OFF;</b></p> <p>Query or change the setting of the "synchronous" flag in the database. When synchronous is FULL, the SQLite database engine will pause at critical moments to make sure that data has actually been written to the disk surface before continuing. This ensures that if the operating system crashes or if there is a power failure, the database will be uncorrupted after rebooting. FULL synchronous is very safe, but it is also slow. When synchronous is NORMAL (the default), the SQLite database engine will still pause at the most critical moments, but less often than in FULL mode. There is a very small (though non-zero) chance that a power failure at just the wrong time could corrupt the database in NORMAL mode. But in practice, you are more likely to suffer a catastrophic disk failure or some other unrecoverable hardware fault. So NORMAL is the default mode. With synchronous OFF, SQLite continues without pausing as soon as it has handed data off to the operating system. If the application running SQLite crashes, the data will be safe, but the database might become corrupted if the operating system crashes or the computer loses power before that data has been written to the disk surface. On the other hand, some operations are as much as 50 or more times faster with synchronous OFF. </p> <p>This pragma changes the synchronous mode persistently. Once changed, the mode stays as set even if the database is closed and reopened. The <b>synchronous</b> pragma does the same thing but only applies the setting to the current session.</p> <a name="pragma_empty_result_callbacks"> |
︙ | ︙ | |||
1175 1176 1177 1178 1179 1180 1181 | azCol[3] = "TEXT";<br> azCol[4] = "NUMERIC";<br> azCol[5] = "TEXT";<br> azCol[6] = 0; </td></table></blockquote></li> <li><p><b>PRAGMA synchronous; | | > | 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 | azCol[3] = "TEXT";<br> azCol[4] = "NUMERIC";<br> azCol[5] = "TEXT";<br> azCol[6] = 0; </td></table></blockquote></li> <li><p><b>PRAGMA synchronous; <br>PRAGMA synchronous = FULL; <br>PRAGMA synchronous = NORMAL; <br>PRAGMA synchronous = OFF;</b></p> <p>Query or change the setting of the "synchronous" flag in the database for the duration of the current database connect. The synchronous flag reverts to its default value when the database is closed and reopened. For additional information on the synchronous flag, see the description of the <b>default_synchronous</b> pragma.</p> </li> |
︙ | ︙ |