Documentation Source Text

Check-in [2a5bc288a1]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Updates to the fileformat2 document.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 2a5bc288a18e1b40d4588989d0d5419928934496
User & Date: drh 2010-06-19 10:39:45
Context
2010-06-19
10:40
Add the oracle logo. check-in: cddbaba37a user: drh tags: trunk
10:39
Updates to the fileformat2 document. check-in: 2a5bc288a1 user: drh tags: trunk
2010-06-17
15:21
Minor edits to the WAL document. check-in: f84c08f7bf user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fileformat2.in.

116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145




146


147
148
149
150
151


152
153
154
155

156
157
158
159
160


161
162
163
164
165
166
167
...
170
171
172
173
174
175
176


177
178
179
180
181
182
183
184
185



186
187
188
189

190
191
192

































193
194
195
196
197
198


199
200
201
202
203
204
205
206


207
208
209
210
211
212
213
...
233
234
235
236
237
238
239


240
241
242
243
244
245


246
247
248
249
250
251
252
253
254
255
256
257


258
259
260
261
262


263
264
265
266












267
268
269
270
271
272
273
<tr><td valign=top align=center>22<td valign=top align=center>1<td align=left>
Minimum embedded payload fraction.  Must be 32.
<tr><td valign=top align=center>23<td valign=top align=center>1<td align=left>
Leaf payload fraction.  Must be 32.
<tr><td valign=top align=center>24<td valign=top align=center>4<td align=left>
File change counter.
<tr><td valign=top align=center>28<td valign=top align=center>4<td align=left>
Size of the database file in pages.
<tr><td valign=top align=center>32<td valign=top align=center>4<td align=left>
Page number of the first freelist trunk page.
<tr><td valign=top align=center>36<td valign=top align=center>4<td align=left>
Total number of freelist pages.
<tr><td valign=top align=center>40<td valign=top align=center>4<td align=left>
The schema cookie.
<tr><td valign=top align=center>44<td valign=top align=center>4<td align=left>
The schema format number.  Currently support schema formats are 1, 2, 3, and 4.
<tr><td valign=top align=center>48<td valign=top align=center>4<td align=left>
Default page cache size.
<tr><td valign=top align=center>52<td valign=top align=center>4<td align=left>
The page number of the largest root b-tree page when in auto-vacuum or
incremental-vacuum modes, or zero otherwise.
<tr><td valign=top align=center>56<td valign=top align=center>4<td align=left>
The database text encoding.  A value of 1 means UTF-8.  A value of 2
means UTF-16le.  A value of 3 means UTF-16be.
<tr><td valign=top align=center>60<td valign=top align=center>4<td align=left>
The "user version" as read and set by the [user_version pragma].
<tr><td valign=top align=center>64<td valign=top align=center>4<td align=left>
True (non-zero) for incremental-vacuum mode.  False (zero) otherwise.
<tr><td valign=top align=center>68<td valign=top align=center>32<td align=left>
Reserved for expansion.  Must be zero.




</table></center>



<p>Every SQLite database file begins with the following 16 bytes (in hex):
53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
corresponds to the UTF-8 string "SQLite format 3" including the nul
terminator character at the end.</p>



<p>The file format write version and file format read version at offsets
18 and 19 are intended to allow for enhancements of the file format
in future versions of SQLite.  In current versions of SQLite, both of

these values must be one.  If a version of SQLite coded to the current
file format specification encounters a database file where the read
version is 1 but the write version is greater than one, then the database
file must be treated as read-only.  If a database file with a read version
greater than 1 is encounter, then that database cannot be read or written.</p>



<p>SQLite has the ability to set aside a small number of extra bytes at
the end of every page for use by extensions.  These extra bytes are
used, for example, by the SQLite Encryption Extension to store a nonce
and/or cryptographic checksum associated with each page.  The 
"reserved space" size in the 1-byte integer at offset 20 is the number
of bytes of space at the end of each page to reserve for extensions.
................................................................................
<tcl>hd_fragment usable_size {usable size}</tcl>
<p>The "usable size" of a database page is the page size specify by the
2-byte integer at offset 16 in the header less the "reserved" space size
recorded in the 1-byte integer at offset 20 in the header.  The usable
size of a page might be an odd number.  However, the usable size is not
allowed to be less than 480.  In other words, if the page size is 512,
then the reserved space size cannot exceed 32.</p>



<p>The maximum and minimum embedded payload fractions and the leaf
payload fraction values must be 64, 32, and 32.  These values were
orginally intended to as tunable parameters that could be used to
modify the storage format of the b-tree algorithm.  However, that
functionality is not supported and there are no current plans to add
support in the future.  Hence, these three bytes are fixed at the
values specified.</p>




<p>The file change counter is a 4-byte big-endian integer which is
incremented whenever the database file is changed.  When two or more
processes are reading the same database file, each process can detect
database changes from other processes by monitoring the change counter.

A process will normally want to flush its database page cache when
another process modified the database, since the cache has become stale.
The file change counter facilitates this.</p>


































<p>Unused pages in the database file are stored on a freelist.  The
4-byte big-endian integer at offset 32 stores the page number of
the first page of the freelist, or zero if the freelist is empty.
The 4-byte big-endian integer at offset 36 stores stores the total 
number of pages on the freelist.</p>



<p>The schema cookie is a 4-byte big-endian integer at offset 40
that is incremented whenever the database schema changes.  A 
prepared statement is compiled against a specific version of the
database schema.  Whenever the database schema changes, the statement
must be reprepared.  Whenever a prepared statement runs, it first checks
the schema cookie to make sure the value is the same as when the statement
was prepared and if not it aborts to force the statement to be reprepared.</p>



<p>The schema format number is a 4-byte big-endian integer at offset 44.
The schema format number is similar to the file format read and write
version numbers at offsets 18 and 19 except that the schema format number
refers to the high-level SQL formatting rather than the low-level b-tree
formatting.  Four schema format numbers are currently defined:</p>

................................................................................

<p>New database files created by SQLite use format 1 by default, so
that database files created by newer versions of SQLite can still
be read by older versions of SQLite.
The [legacy_file_format pragma] can be used to cause SQLite
to create new database files using format 4.  Future versions of 
SQLite may begin to create files using format 4 by default.</p>



<p>The 4-byte big-endian signed integer at offset 48 is the suggest
cache size in pages for the database file.  The value is a suggestion
only and SQLite is under no obligation to honor it.  The absolute value
of the integer is used as the suggested size.  The suggested cache size
can be set using the [default_cache_size pragma].</p>



<p>The two 4-byte big-endian integers at offsets 52 and 64 are used
to manage the [auto_vacuum] and [incremental_vacuum] modes.  If
the integer at offset 52 is zero then pointer-map (ptrmap) pages are
omitted from the database file and neither auto_vacuum nor
incremental_vacuum are supported.  If the integer at offset 52 is
non-zero then it is the page number of the largest root page in the
database file, the database file contain ptrmap pages, and the
mode must be either auto_vacuum or incremental_vacuum.  In this latter
case, the integer at offset 64 is true for incremental_vacuum and
false for auto_vacuum.  If the integer at offset 52 is zero then
the integer at offset 64 must also be zero.</p>



<p>The 4-byte big-endian integer at offset 56 determines the encoding
used for all text strings stored in the database.  A value of 1 means
UTF-8.  A value of 2 means UTF-16le.  A value of 3 means UTF-16be.
No other values are allowed.</p>



<p>The 4-byte big-endian integer at offset 60 is the user version which
is set and queried by the [user_version pragma].  The user version is
not used by SQLite.</p>













<p>All other bytes of the database file header are reserved for
future expansion and must be set to zero.</p>

<h3>1.3 The Lock-Byte Page</h3>

<p>The lock-byte page is the single page of the database file







|







|












|

>
>
>
>

>
>





>
>




>
|

|

|
>
>







 







>
>









>
>
>

|
|
|
>



>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>






>
>








>
>







 







>
>






>
>












>
>





>
>




>
>
>
>
>
>
>
>
>
>
>
>







116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
...
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
...
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
<tr><td valign=top align=center>22<td valign=top align=center>1<td align=left>
Minimum embedded payload fraction.  Must be 32.
<tr><td valign=top align=center>23<td valign=top align=center>1<td align=left>
Leaf payload fraction.  Must be 32.
<tr><td valign=top align=center>24<td valign=top align=center>4<td align=left>
File change counter.
<tr><td valign=top align=center>28<td valign=top align=center>4<td align=left>
Size of the database file in pages.  The "in-header database size".
<tr><td valign=top align=center>32<td valign=top align=center>4<td align=left>
Page number of the first freelist trunk page.
<tr><td valign=top align=center>36<td valign=top align=center>4<td align=left>
Total number of freelist pages.
<tr><td valign=top align=center>40<td valign=top align=center>4<td align=left>
The schema cookie.
<tr><td valign=top align=center>44<td valign=top align=center>4<td align=left>
The schema format number.  Supported schema formats are 1, 2, 3, and 4.
<tr><td valign=top align=center>48<td valign=top align=center>4<td align=left>
Default page cache size.
<tr><td valign=top align=center>52<td valign=top align=center>4<td align=left>
The page number of the largest root b-tree page when in auto-vacuum or
incremental-vacuum modes, or zero otherwise.
<tr><td valign=top align=center>56<td valign=top align=center>4<td align=left>
The database text encoding.  A value of 1 means UTF-8.  A value of 2
means UTF-16le.  A value of 3 means UTF-16be.
<tr><td valign=top align=center>60<td valign=top align=center>4<td align=left>
The "user version" as read and set by the [user_version pragma].
<tr><td valign=top align=center>64<td valign=top align=center>4<td align=left>
True (non-zero) for incremental-vacuum mode.  False (zero) otherwise.
<tr><td valign=top align=center>68<td valign=top align=center>24<td align=left>
Reserved for expansion.  Must be zero.
<tr><td valign=top align=center>92<td valign=top align=center>4<td align=left>
The "version-valid-for" integer.
<tr><td valign=top align=center>96<td valign=top align=center>4<td align=left>
[SQLITE_VERSION_NUMBER]
</table></center>

<h4>1.2.1 Magic Header String</h4>

<p>Every SQLite database file begins with the following 16 bytes (in hex):
53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00.  This byte sequence
corresponds to the UTF-8 string "SQLite format 3" including the nul
terminator character at the end.</p>

<h4>1.2.2 File format version numbers</h4>

<p>The file format write version and file format read version at offsets
18 and 19 are intended to allow for enhancements of the file format
in future versions of SQLite.  In current versions of SQLite, both of
these values are 1 for rollback journalling modes and 2 for [WAL]
journalling mode.  If a version of SQLite coded to the current
file format specification encounters a database file where the read
version is 1 or 2 but the write version is greater than 2, then the database
file must be treated as read-only.  If a database file with a read version
greater than 2 is encounter, then that database cannot be read or written.</p>

<h4>1.2.3 Reserved bytes per page</h4>

<p>SQLite has the ability to set aside a small number of extra bytes at
the end of every page for use by extensions.  These extra bytes are
used, for example, by the SQLite Encryption Extension to store a nonce
and/or cryptographic checksum associated with each page.  The 
"reserved space" size in the 1-byte integer at offset 20 is the number
of bytes of space at the end of each page to reserve for extensions.
................................................................................
<tcl>hd_fragment usable_size {usable size}</tcl>
<p>The "usable size" of a database page is the page size specify by the
2-byte integer at offset 16 in the header less the "reserved" space size
recorded in the 1-byte integer at offset 20 in the header.  The usable
size of a page might be an odd number.  However, the usable size is not
allowed to be less than 480.  In other words, if the page size is 512,
then the reserved space size cannot exceed 32.</p>

<h4>1.2.4 Payload fractions</h4>

<p>The maximum and minimum embedded payload fractions and the leaf
payload fraction values must be 64, 32, and 32.  These values were
orginally intended to as tunable parameters that could be used to
modify the storage format of the b-tree algorithm.  However, that
functionality is not supported and there are no current plans to add
support in the future.  Hence, these three bytes are fixed at the
values specified.</p>

<h4>1.2.5 File change counter</h4>

<tcl>hd_fragment chngctr {change counter}</tcl>
<p>The file change counter is a 4-byte big-endian integer which is
incremented whenever the database file is changed in rollback mode.  
When two or more processes are reading the same database file, each 
process can detect database changes from other processes by monitoring 
the change counter.
A process will normally want to flush its database page cache when
another process modified the database, since the cache has become stale.
The file change counter facilitates this.</p>

<p>In WAL mode, changes to the database are detected using the wal-index
and so the change counter is not needed.  Hence, the change counter might
not be incremented on each transaction in WAL mode.</p>

<h4>1.2.6 In-header database size</h4>

<tcl>hd_fragment filesize {in-header database size}</tcl>
<p>The 4-byte big-endian integer at offset 28 into the header 
stores the size of the database file in pages.  If this in-header
datasize size is not valid (see the next paragraph), then the database 
size is computed by looking
at the actual size of the database file. Older versions of SQLite
ignored the in-header database size and used the actual file size
exclusively.  Newer versions of SQLite use the in-header database
size if it is available but fall back to the actual file size if
the in-header database size is not valid.</p>

<p>The in-header database size is only considered to be valid if
it is non-zero and if the 4-byte [change counter] at offset 24
exactly matches the 4-byte [version-valid-for value] at offset 92.
The in-header database size should always be valid 
when the database is only modified using recent versions of SQLite
(versions 3.7.0 and later).
If a legacy version of SQLite writes to the database, it will not
know to update the in-header database size and so the in-header
database size could be incorrect.  But legacy versions of SQLite
will also leave the version-valid-for value at offset 92 unchanged
so it will not match the change-counter.  Hence, invalid in-header
database sizes can be detected (and ignored) by observing when
the change-counter does not match the version-valid-for integer.</p>

<h4>1.2.7 Free page list</h4>

<p>Unused pages in the database file are stored on a freelist.  The
4-byte big-endian integer at offset 32 stores the page number of
the first page of the freelist, or zero if the freelist is empty.
The 4-byte big-endian integer at offset 36 stores stores the total 
number of pages on the freelist.</p>

<h4>1.2.8 Schema cookie</h4>

<p>The schema cookie is a 4-byte big-endian integer at offset 40
that is incremented whenever the database schema changes.  A 
prepared statement is compiled against a specific version of the
database schema.  Whenever the database schema changes, the statement
must be reprepared.  Whenever a prepared statement runs, it first checks
the schema cookie to make sure the value is the same as when the statement
was prepared and if not it aborts to force the statement to be reprepared.</p>

<h4>1.2.9 Schema format number</h4>

<p>The schema format number is a 4-byte big-endian integer at offset 44.
The schema format number is similar to the file format read and write
version numbers at offsets 18 and 19 except that the schema format number
refers to the high-level SQL formatting rather than the low-level b-tree
formatting.  Four schema format numbers are currently defined:</p>

................................................................................

<p>New database files created by SQLite use format 1 by default, so
that database files created by newer versions of SQLite can still
be read by older versions of SQLite.
The [legacy_file_format pragma] can be used to cause SQLite
to create new database files using format 4.  Future versions of 
SQLite may begin to create files using format 4 by default.</p>

<h4>1.2.10 Suggested cache size</h4>

<p>The 4-byte big-endian signed integer at offset 48 is the suggest
cache size in pages for the database file.  The value is a suggestion
only and SQLite is under no obligation to honor it.  The absolute value
of the integer is used as the suggested size.  The suggested cache size
can be set using the [default_cache_size pragma].</p>

<h4>1.2.11 Incremental vacuum settings</h4>

<p>The two 4-byte big-endian integers at offsets 52 and 64 are used
to manage the [auto_vacuum] and [incremental_vacuum] modes.  If
the integer at offset 52 is zero then pointer-map (ptrmap) pages are
omitted from the database file and neither auto_vacuum nor
incremental_vacuum are supported.  If the integer at offset 52 is
non-zero then it is the page number of the largest root page in the
database file, the database file contain ptrmap pages, and the
mode must be either auto_vacuum or incremental_vacuum.  In this latter
case, the integer at offset 64 is true for incremental_vacuum and
false for auto_vacuum.  If the integer at offset 52 is zero then
the integer at offset 64 must also be zero.</p>

<h4>1.2.12 Text encoding</h4>

<p>The 4-byte big-endian integer at offset 56 determines the encoding
used for all text strings stored in the database.  A value of 1 means
UTF-8.  A value of 2 means UTF-16le.  A value of 3 means UTF-16be.
No other values are allowed.</p>

<h4>1.2.13 User version number</h4>

<p>The 4-byte big-endian integer at offset 60 is the user version which
is set and queried by the [user_version pragma].  The user version is
not used by SQLite.</p>

<tcl>hd_fragment validfor {version-valid-for number}</tcl>
<h4>1.2.14 Write library version number and version-valid-for number</h4>

<p>The 4-byte big-endian integer at offset 96 stores the 
[SQLITE_VERSION_NUMBER] value.  The 4-byte big-ending integer at
offset 92 is the value of the [change counter] when the version number
was stored.  The integer at offset 92 indicates which transaction
the version number is valid for and is sometimes calle the "version-valid-for"
integer.

<h4>1.2.15 Header space reserved for expansion</h4>

<p>All other bytes of the database file header are reserved for
future expansion and must be set to zero.</p>

<h3>1.3 The Lock-Byte Page</h3>

<p>The lock-byte page is the single page of the database file