SQLite

Check-in [e372a60be8]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Documentation updates. (CVS 720)
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: e372a60be8de3332ab6e45d82f21227b8b9acf82
User & Date: drh 2002-08-18 19:09:23.000
Context
2002-08-18
20:28
Fix for ticket #110: return an error if trying to start a transaction within a transaction or when attempting to commit or rollback outside of a transaction. (CVS 721) (check-in: df51cb166b user: drh tags: trunk)
19:09
Documentation updates. (CVS 720) (check-in: e372a60be8 user: drh tags: trunk)
2002-08-15
13:50
Fix for ticket #135: Remove unused variables from three files. (CVS 719) (check-in: 8cf17f2a24 user: drh tags: trunk)
Changes
Unified Diff Show Whitespace Changes Patch
Changes to main.mk.
301
302
303
304
305
306
307



308
309
310
311
312
313
314

datatypes.html:	$(TOP)/www/datatypes.tcl
	tclsh $(TOP)/www/datatypes.tcl >datatypes.html

quickstart.html:	$(TOP)/www/quickstart.tcl
	tclsh $(TOP)/www/quickstart.tcl >quickstart.html





# Files to be published on the website.
#
DOC = \
  index.html \
  sqlite.html \
  changes.html \







>
>
>







301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317

datatypes.html:	$(TOP)/www/datatypes.tcl
	tclsh $(TOP)/www/datatypes.tcl >datatypes.html

quickstart.html:	$(TOP)/www/quickstart.tcl
	tclsh $(TOP)/www/quickstart.tcl >quickstart.html

fileformat.html:	$(TOP)/www/fileformat.tcl
	tclsh $(TOP)/www/fileformat.tcl >fileformat.html


# Files to be published on the website.
#
DOC = \
  index.html \
  sqlite.html \
  changes.html \
324
325
326
327
328
329
330
331

332
333
334
335
336
337
338
  download.html \
  speed.html \
  faq.html \
  formatchng.html \
  conflict.html \
  omitted.html \
  datatypes.html \
  quickstart.html


doc:	$(DOC)
	mkdir -p doc
	mv $(DOC) doc

install:	sqlite libsqlite.a sqlite.h
	mv sqlite /usr/bin







|
>







327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
  download.html \
  speed.html \
  faq.html \
  formatchng.html \
  conflict.html \
  omitted.html \
  datatypes.html \
  quickstart.html \
  fileformat.html

doc:	$(DOC)
	mkdir -p doc
	mv $(DOC) doc

install:	sqlite libsqlite.a sqlite.h
	mv sqlite /usr/bin
Changes to www/changes.tcl.
25
26
27
28
29
30
31


32
33
34
35
36
37
38
  puts "<DD><P><UL>$desc</UL></P></DD>"
}

chng {2002 Aug ?? (2.7.0)} {
<li>Make a distinction between numeric and text values when sorting.
    Text values sort according to memcmp().  Numeric values sort in
    numeric order.</li>


}

chng {2002 Aug 12 (2.6.3)} {
<li>Add the ability to read both little-endian and big-endian databases.
    So database created under SunOS or MacOSX can be read and written
    under Linux or Windows and vice versa.</li>
<li>Convert to the new website: http://www.sqlite.org/</li>







>
>







25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
  puts "<DD><P><UL>$desc</UL></P></DD>"
}

chng {2002 Aug ?? (2.7.0)} {
<li>Make a distinction between numeric and text values when sorting.
    Text values sort according to memcmp().  Numeric values sort in
    numeric order.</li>
<li>Allow multiple simulataneous readers under windows by simulating
    the reader/writers locks that are missing from Win95/98/ME.</li>
}

chng {2002 Aug 12 (2.6.3)} {
<li>Add the ability to read both little-endian and big-endian databases.
    So database created under SunOS or MacOSX can be read and written
    under Linux or Windows and vice versa.</li>
<li>Convert to the new website: http://www.sqlite.org/</li>
Changes to www/datatypes.tcl.
1
2
3
4
5
6
7
8
9
10
11
#
# Run this script to generated a datatypes.html output file
#
set rcsid {$Id: datatypes.tcl,v 1.4 2002/08/15 13:45:17 drh Exp $}

puts {<html>
<head>
  <title>Datatypes In SQLite</title>
</head>
<body bgcolor="white">
<h1 align="center">



|







1
2
3
4
5
6
7
8
9
10
11
#
# Run this script to generated a datatypes.html output file
#
set rcsid {$Id: datatypes.tcl,v 1.5 2002/08/18 19:09:24 drh Exp $}

puts {<html>
<head>
  <title>Datatypes In SQLite</title>
</head>
<body bgcolor="white">
<h1 align="center">
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
</p>

<p>
If data is of type <b>text</b> then the comparison is determined by
the standard C data comparison functions <b>memcmp()</b> or
<b>strcmp()</b>.  The comparison looks at bytes from two inputs one
by one and returns the first non-zero difference.
String are '\000' terminated so shorter
strings sort before longer strings, as you would expect.
</p>

<p>
For numeric data, this situation is more complex.  If both strings
being compared look like well-formed numbers, then they are converted
into floating point values using <b>atof()</b> and compared numerically.
If one input is not a well-formed number but the other is, then the
number is considered to be less than the non-number.  If neither inputs
is a well-formed number, then <b>strcmp()</b> is used to do the
comparison.
</p>








|




|
|







113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
</p>

<p>
If data is of type <b>text</b> then the comparison is determined by
the standard C data comparison functions <b>memcmp()</b> or
<b>strcmp()</b>.  The comparison looks at bytes from two inputs one
by one and returns the first non-zero difference.
Strings are '\000' terminated so shorter
strings sort before longer strings, as you would expect.
</p>

<p>
For numeric data, this situation is more complex.  If both inputs
look like well-formed numbers, then they are converted
into floating point values using <b>atof()</b> and compared numerically.
If one input is not a well-formed number but the other is, then the
number is considered to be less than the non-number.  If neither inputs
is a well-formed number, then <b>strcmp()</b> is used to do the
comparison.
</p>

210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
inserted into a numeric column but 0==0.0 which violates the uniqueness
constraint.  However, the second insert in the right-hand sequence works.  In
this case, the constants 0 and 0.0 are treated a strings which means that
they are distinct.</p>

<p>SQLite always converts numbers into double-precision (64-bit) floats
for comparison purposes.  This means that a long sequence of digits that
differ only in digits of far to the right will compare equal if they
are in a numeric column but will compare unequal if they are in a text
column.  We have:</p>

<blockquote><pre>
INSERT INTO t1                            INSERT INTO t2
   VALUES('12345678901234567890');           VALUES(12345678901234567890);
INSERT INTO t1                            INSERT INTO t2







|







210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
inserted into a numeric column but 0==0.0 which violates the uniqueness
constraint.  However, the second insert in the right-hand sequence works.  In
this case, the constants 0 and 0.0 are treated a strings which means that
they are distinct.</p>

<p>SQLite always converts numbers into double-precision (64-bit) floats
for comparison purposes.  This means that a long sequence of digits that
differ only in insignificant digits will compare equal if they
are in a numeric column but will compare unequal if they are in a text
column.  We have:</p>

<blockquote><pre>
INSERT INTO t1                            INSERT INTO t2
   VALUES('12345678901234567890');           VALUES(12345678901234567890);
INSERT INTO t1                            INSERT INTO t2
Changes to www/faq.tcl.
1
2
3
4
5
6
7
8
9
10
11
#
# Run this script to generated a faq.html output file
#
set rcsid {$Id: faq.tcl,v 1.17 2002/08/15 11:48:14 drh Exp $}

puts {<html>
<head>
  <title>SQLite Frequently Asked Questions</title>
</head>
<body bgcolor="white">
<h1 align="center">Frequently Asked Questions</h1>



|







1
2
3
4
5
6
7
8
9
10
11
#
# Run this script to generated a faq.html output file
#
set rcsid {$Id: faq.tcl,v 1.18 2002/08/18 19:09:24 drh Exp $}

puts {<html>
<head>
  <title>SQLite Frequently Asked Questions</title>
</head>
<body bgcolor="white">
<h1 align="center">Frequently Asked Questions</h1>
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
} {
  <p>You can declare a table column to be of type "BLOB" but it will still
  only store null-terminated strings.  This is because the only way to 
  insert information into an SQLite database is using an INSERT SQL statement,
  and you can not include binary data in the middle of the ASCII text string
  of an INSERT statement.</p>

  <p>SQLite is 8-bit clean with regard to the data is stores as long as
  the data does not contain any NUL characters.  If you want to store binary
  data, consider encoding your data in such a way that it contains no NUL
  characters and inserting it that way.  You might use URL-style encoding:
  encode NUL as "%00" and "%" as "%25".  Or, you might consider encoding your
  binary data using base-64.  There is a source file named 
  "<b>src/encode.c</b>" in the SQLite distribution that contains
  implementations of functions named "<b>sqlite_encode_binary()</b>
  and <b>sqlite_decode_binary()</b> that can be used for converting







|
|







350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
} {
  <p>You can declare a table column to be of type "BLOB" but it will still
  only store null-terminated strings.  This is because the only way to 
  insert information into an SQLite database is using an INSERT SQL statement,
  and you can not include binary data in the middle of the ASCII text string
  of an INSERT statement.</p>

  <p>SQLite is 8-bit clean with regard to the data it stores as long as
  the data does not contain any '\000' characters.  If you want to store binary
  data, consider encoding your data in such a way that it contains no NUL
  characters and inserting it that way.  You might use URL-style encoding:
  encode NUL as "%00" and "%" as "%25".  Or, you might consider encoding your
  binary data using base-64.  There is a source file named 
  "<b>src/encode.c</b>" in the SQLite distribution that contains
  implementations of functions named "<b>sqlite_encode_binary()</b>
  and <b>sqlite_decode_binary()</b> that can be used for converting
Added www/fileformat.tcl.




































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
#
# Run this script to generated a fileformat.html output file
#
set rcsid {$Id: fileformat.tcl,v 1.6 2002/08/18 19:09:24 drh Exp $}

puts {<html>
<head>
  <title>SQLite Database File Format</title>
</head>
<body bgcolor="white">
<h1 align="center">
SQLite Database File Format
</h1>
}
puts "<p align=center>
(This page was last modified on [lrange $rcsid 3 4] UTC)
</p>"

puts {
<h2>1.0 &nbsp; Layers</h2>

<p>
SQLite is implemented in layers.
(See the <a href="arch.html">architecture description</a>.)
The format of database files is determined by three different
layers in the architecture.
</p>

<ul>
<li>The <b>schema</b> layer implemented by the VDBE.</li>
<li>The <b>b-tree</b> layer implemented by btree.c</li>
<li>The <b>pager</b> layer implemented by pager.c</li>
</ul>

<p>
We wil describe each layer beginning with the bottom (pager)
layer and working upwards.
</p>

<h2>2.0 &nbsp; The Pager Layer</h2>

<p>
An SQLite database consists of
"pages" of data.  Each page is 1024 bytes in size.
Pages are numbered beginning with 1.
A page number of 0 is used to indicate "no such page" in the
B-Tree and Schema layers.
</p>

<p>
The pager layer is responsible for implementing transactions
with atomic commit and rollback.  It does this using a separate
journal file.  Whenever a new transaction is started, a journal
file is created that records the original state of the database.
If the program terminates before completing the transaction, the next
process to open the database can use the journal file to restore
the database to its original state.
</p>

<p>
The journal file is located in the same directory as the database
file and has the same name as the database file but with the
characters "<tt>-journal</tt>" appended.
</p>

<p>
The pager layer does not impose any content restrictions on the
main database file.  As far as the pager is concerned, each page
contains 1024 bytes of arbitrary data.  But there is structure to
the journal file.
</p>

<p>
A journal file begins with 8 bytes as follows:
0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, and 0xd5.
Processes that are attempting to rollback a journal use these 8 bytes
as a sanity check to make sure the file they think is a journal really
is a valid journal.  There is no significance to the choice of
bytes here - the values were obtained from /dev/random. 
</p>

<p>
Following the 8 byte prefix is a single 4-byte integer that is the
original size of the main database file before the transaction was
started.  The main database file is truncated back to this size
as part of the rollback process.
The size is expressed in pages (1024 bytes per page) and is
a big-endian number.  That means that the most significant byte
occurs first.  All multi-byte integers in the journal file are
written as big-endian numbers.  That way, a journal file that is
originally created on one machine can be rolled back by another
machine that uses a different byte order.  So, for example, a
transaction that failed to complete on your big-endian SparcStation
can still be rolled back on your little-endian Linux box.
</p>

<p>
After the 8-byte prefix and the 4-byte initial database size, the
journal file consists of zero or more page records.  Each page
record is a 4-byte (big-endian) page number followed by 1024 bytes
of data.  The data is the original content of the database page
before the transaction was started.  So to roll back the transaction,
the data is simply written into the corresponding page of the
main database file.  Pages can appear in the journal in any order,
but they are guaranteed to appear only once. All page numbers will be
between 1 and the maximum specified by the page size integer that
appeared at the beginning of the journal.
</p>

<p>
Here is a summary of the journal file format:
</p>

<ul>
<li>8 byte prefix: 0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, x0d5</li>
<li>4 byte initial database page count, big-endian.</li>
<li>Zero or more instances of the following:
   <ul>
   <li>4 byte page number - big-endian</li>
   <li>1024 bytes of original data for the page</li>
   </ul>
</li>
</ul>

<h2>3.0 &nbsp; The B-Tree Layer</h2>

<p>
The B-Tree layer builds on top of the pager layer to implement
one or more separate b-trees all in the same disk file.  The
algorithms used are taken from Knuth's <i>The Art Of Computer
Programming.</i></p>

<p>
Page 1 of a database contains a header string used for sanity
checking, a few 32-bit words of configuration data, and a pointer
to the beginning of a list of unused pages in the database.
All other pages in the
database are either pages of a b-tree, overflow pages, or unused
pages on the freelist.
</p>

<p>
Each b-tree page contains zero or more database entries.
Each entry has an unique key of one or more bytes and data of
zero or more bytes.
Both the key and data are arbitrary byte sequences.  The combination
of key and data are collectively known as "payload".  The current
implementation limits the amount of payload in a single entry to
1048576 bytes.  This limit can be raised to 16777216 by adjusting
a single #define in the source code and recompiling.  But most entries
contain less than a hundred bytes of payload so a megabyte limit seems
more than enough.
</p>

<p>
Up to 238 bytes of payload for an entry can be held directly on
a b-tree page.  Any additional payload is contained on a linked list
of overflow pages.  This limit on the amount of payload held directly
on b-tree pages guarantees that each b-tree page can hold at least
4 entries.  In practice, most entries are smaller than 238 bytes and
thus most pages can hold more than 4 entries.
</p>

<p>
A single database file can hold any number of separate, independent b-trees.
Each b-tree is identified by its root page, which never changes.
Child pages of the b-tree may change as entries are added and removed
and pages split and combine.  But the root page always stays the same.
The b-tree itself does not record which pages are root pages and which
are not.  That information is handled entirely at the schema layer.
</p>

<h3>3.1 &nbsp; B-Tree Page 1 Details</h3>

<p>
Page 1 begins with the following 48-byte string:
</p>

<blockquote><pre>
** This file contains an SQLite 2.1 database **
</pre></blockquote>

<p>
If you count the number of characters in the string above, you will
see that there are only 47.  A '\000' terminator byte is added to
bring the total to 48.
</p>

<p>
A frequent question is why the string says version 2.1 when (as
of this writing) we are up to version 2.7.0 of SQLite and any
change to the second digit of the version is suppose to represent
a database format change.  The answer to this is that the B-tree
layer has not changed any since version 2.1.  There have been
database format changes since version 2.1 but those changes have
all been in the schema layer.  Because the format of the b-tree
layer is unchanged since version 2.1.0, the header string still
says version 2.1.
</p>

<p>
After the format string is a 4-byte integer used to determine the
byte-order of the database.  The integer has a value of
0xdae37528.  If this number is expressed as 0xda, 0xe3, 0x75, 0x28, then
the database is in a big-endian format and all 16 and 32-bit integers
elsewhere in the b-tree layer are also big-endian.  If the number is
expressed as 0x28, 0x75, 0xe3, and 0xda, then the database is in a
little-endian format and all other multi-byte numbers in the b-tree 
layer are also little-endian.  
Prior to version 2.6.3, the SQLite engine was only able to read databases
that used the same byte order as the processor they were running on.
But beginning with 2.6.3, SQLite can read or write databases in any
byte order.
</p>

<p>
After the byte-order code are six 4-byte integers.  Each integer is in the
byte order determined by the byte-order code.  The first integer is the
page number for the first page of the freelist.  If there are no unused
pages in the database, then this integer is 0.  The second integer is
the number of unused pages in the database.  The last 4 integers are
not used by the b-tree layer.  These are the so-called "meta" values that
are passed up to the schema layer
and used there for configuration and format version information.
All bytes of page 1 past beyond the meta-value integers are unused 
and are initialized to zero.
</p>

<p>
Here is a summary of the information contained on page 1 in the b-tree layer:
</p>

<ul>
<li>48 byte header string</li>
<li>4 byte integer used to determine the byte-order</li>
<li>4 byte integer which is the first page of the freelist</li>
<li>4 byte integer which is the number of pages on the freelist</li>
<li>16 bytes of meta-data arranged as four 4-byte integers</li>
<li>948 bytes of unused space</li>
</ul>

<h3>3.2 &nbsp; Structure Of A Single B-Tree Page</h3>

<p>
Conceptually, a b-tree page contains N database entries and N+1 pointers
to other b-tree pages.
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td align="center">Ptr<br>0</td>
<td align="center">Entry<br>0</td>
<td align="center">Ptr<br>1</td>
<td align="center">Entry<br>1</td>
<td align="center"><b>...</b></td>
<td align="center">Ptr<br>N-1</td>
<td align="center">Entry<br>N-1</td>
<td align="center">Ptr<br>N</td>
</tr>
</table>
</blockquote>

<p>
The entries are arranged in increasing order.  That is, the key to
Entry 0 is less than the key to Entry 1, and the key to Entry 1 is
less than the key of Entry 2, and so forth.  The pointers point to
pages containing additional entries that have keys in between the
entries on either side.  So Ptr 0 points to another b-tree page that
contains entries that all have keys less than Key 0, and Ptr 1
points to a b-tree pages where all entries have keys greater than Key 0
but less than Key 1, and so forth.
</p>

<p>
Each b-tree page in SQLite consists of a header, zero or more "cells"
each holding a single entry and pointer, and zero or more "free blocks"
that represent unused space on the page.
</p>

<p>
The header on a b-tree page is the first 8 bytes of the page.
The header contains the value
of the right-most pointer (Ptr N) and the byte offset into the page
of the first cell and the first free block.  The pointer is a 32-bit
value and the offsets are each 16-bit values.  We have:
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td align="center" width=30>0</td>
<td align="center" width=30>1</td>
<td align="center" width=30>2</td>
<td align="center" width=30>3</td>
<td align="center" width=30>4</td>
<td align="center" width=30>5</td>
<td align="center" width=30>6</td>
<td align="center" width=30>7</td>
</tr>
<tr>
<td align="center" colspan=4>Ptr N</td>
<td align="center" colspan=2>Cell 0</td>
<td align="center" colspan=2>Freeblock 0</td>
</tr>
</table>
</blockquote>

<p>
The 1016 bytes of a b-tree page that come after the header contain
cells and freeblocks.  All 1016 bytes are covered by either a cell
or a freeblock.
</p>

<p>
The cells are connected in a linked list.  Cell 0 contains Ptr 0 and
Entry 0.  Bytes 4 and 5 of the header point to Cell 0.  Cell 0 then
points to Cell 1 which contains Ptr 1 and Entry 1.  And so forth.
Cells vary in size.  Every cell has a 12-byte header and at least 4
bytes of payload space.  Space is allocated to payload in increments
of 4 bytes.  Thus the minimum size of a cell is 16 bytes and up to
63 cells can fit on a single page.  The size of a cell is always a multiple
of 4 bytes.
A cell can have up to 238 byte of payload space.  If
the payload is more than 238 bytes, then an addition 4 point page
number is appended to the cell which is the page number of the first
overflow page containing the additional payload.  The maximum size
of a cell is thus 254 bytes, meaning that a least 4 cells can fit into
the 1016 bytes of space available on a b-tree page.
An average cell is usually around 52 to 100 bytes in size with about
10 or 20 cells to a page.
</p>

<p>
The data layout of a cell looks like this:
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td align="center" width=20>0</td>
<td align="center" width=20>1</td>
<td align="center" width=20>2</td>
<td align="center" width=20>3</td>
<td align="center" width=20>4</td>
<td align="center" width=20>5</td>
<td align="center" width=20>6</td>
<td align="center" width=20>7</td>
<td align="center" width=20>8</td>
<td align="center" width=20>9</td>
<td align="center" width=20>10</td>
<td align="center" width=20>11</td>
<td align="center" width=100>12 ... 249</td>
<td align="center" width=20>250</td>
<td align="center" width=20>251</td>
<td align="center" width=20>252</td>
<td align="center" width=20>253</td>
</tr>
<tr>
<td align="center" colspan=4>Ptr</td>
<td align="center" colspan=2>Keysize<br>(low)</td>
<td align="center" colspan=2>Next</td>
<td align="center" colspan=1>Ksz<br>(hi)</td>
<td align="center" colspan=1>Dsz<br>(hi)</td>
<td align="center" colspan=2>Datasize<br>(low)</td>
<td align="center" colspan=1>Payload</td>
<td align="center" colspan=4>Overflow<br>Pointer</td>
</tr>
</table>
</blockquote>

<p>
The first four bytes are the pointer.  The size of the key is a 24-bit
where the upper 8 bits are taken from by 8 and the lower 16 bits are
token from bytes 4 and 5 (or bytes 5 and 4 on little-endian machines.)
The size of the data is another 24-bit value where the upper 8 bits
are taken from byte 9 and the lower 16 bits are taken from bytes 10 and
11 or 11 and 10, depending on the byte order.  Bytes 6 and 7 are the
offset to the next cell in the linked list of all cells on the current
page.  This offset is 0 for the last cell on the page.
</p>

<p>
The payload itself can be any number of bytes between 1 and 1048576.
But space to hold the payload is allocated in 4-byte chunks up to
238 bytes.  If the entry contains more than 238 bytes of payload, then
additional payload data is stored on a linked list of overflow pages.
A 4 byte page number is appended to the cell that contains the first
page of this linked list.
</p>

<p>
Each overflow page begins with a 4-byte value which is the
page number of the next overflow page in the list.   This value is
0 for the last page in the list.  The remaining
1020 bytes of the overflow page are available for storing payload.
Note that a full page is allocated regardless of the number of overflow
bytes stored.  Thus, if the total payload for an entry is 239 bytes,
the first 238 are stored in the cell and the overflow page stores just
one byte.
</p>

<p>
The structure of an overflow page looks like this:
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td align="center" width=20>0</td>
<td align="center" width=20>1</td>
<td align="center" width=20>2</td>
<td align="center" width=20>3</td>
<td align="center" width=200>4 ... 1023</td>
</tr>
<tr>
<td align="center" colspan=4>Next Page</td>
<td align="center" colspan=1>Overflow Data</td>
</tr>
</table>
</blockquote>

<p>
All space on a b-tree page which is not used by the header or by cells
is filled by freeblocks.  Freeblocks, like cells, are variable in size.
The size of a freeblock is at least 4 bytes and is always a multiple of
4 bytes.
The first 4 bytes contain a header and the remaining bytes
are unused.  The structure of the freeblock is as follows:
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td align="center" width=20>0</td>
<td align="center" width=20>1</td>
<td align="center" width=20>2</td>
<td align="center" width=20>3</td>
<td align="center" width=200>4 ... 1015</td>
</tr>
<tr>
<td align="center" colspan=2>Size</td>
<td align="center" colspan=2>Next</td>
<td align="center" colspan=1>Unused</td>
</tr>
</table>
</blockquote>

<p>
Freeblocks are stored in a linked list in increasing order.  That is
to say, the first freeblock occurs at a lower index into the page than
the second free block, and so forth.  The first 2 bytes of the header
are an integer which is the total number of bytes in the freeblock.
The second 2 bytes are the index into the page of the next freeblock
in the list.  The last freeblock has a Next value of 0.
</p>

<p>
When a new b-tree is created in a database, the root page of the b-tree
consist of a header and a single 1016 byte freeblock.  As entries are
added, space is carved off of that freeblock and used to make cells.
When b-tree entries are deleted, the space used by their cells is converted
into freeblocks.  Adjacent freeblocks are merged, but the page can still
become fragmented.  The b-tree code will occasionally try to defragment
the page by moving all cells to the beginning and constructing a single
freeblock at the end to take up all remaining space.
</p>

<h3>3.3 &nbsp; The B-Tree Free Page List</h3>

<p>
When information is removed from an SQLite database such that one or
more pages are no longer needed, those pages are added to a list of
free pages so that they can be reused later when new information is
added.  This subsection describes the structure of this freelist.
</p>

<p>
The 32-bit integer beginning at byte-offset 52 in page 1 of the database
contains the address of the first page in a linked list of free pages.
If there are no free pages available, this integer has a value of 0.
The 32-bit integer at byte-offset 56 in page 1 contains the number of
free pages on the freelist.
</p>

<p>
The freelist contains a trunk and many branches.  The trunk of
the freelist is composed of overflow pages.  That is to say, each page
contains a single 32-bit integer at byte offset 0 which
is the page number of the next page on the freelist trunk.
The payload area
of each trunk page is used to record pointers to branch pages. 
The first 32-bit integer in the payload area of a trunk page
is the number of branch pages to follow (between 0 and 254)
and each subsequent 32-bit integer is a page number for a branch page.
The following diagram shows the structure of a trunk freelist page:
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td align="center" width=20>0</td>
<td align="center" width=20>1</td>
<td align="center" width=20>2</td>
<td align="center" width=20>3</td>
<td align="center" width=20>4</td>
<td align="center" width=20>5</td>
<td align="center" width=20>6</td>
<td align="center" width=20>7</td>
<td align="center" width=200>8 ... 1023</td>
</tr>
<tr>
<td align="center" colspan=4>Next trunk page</td>
<td align="center" colspan=4># of branch pages</td>
<td align="center" colspan=1>Page numbers for branch pages</td>
</tr>
</table>
</blockquote>

<p>
It is important to note that only the pages on the trunk of the freelist
contain pointers to other pages.  The branch pages contain no
data whatsoever.  The fact that the branch pages are completely
blank allows for an important optimization in the paging layer.  When
a branch page is removed from the freelist to be reused, it is not
necessary to write the original content of that page into the rollback
journal.  The branch page contained no data to begin with, so there is
no need to restore the page in the event of a rollback.  Similarly,
when a page is not longer needed and is added to the freelist as a branch
page, it is not necessary to write the content of that page
into the database file.
Again, the page contains no real data so it is not necessary to record the
content of that page.  By reducing the amount of disk I/O required,
these two optimizations allow some database operations
to go four to six times faster than they would otherwise.
</p>

<h2>4.0 &nbsp; The Schema Layer</h2>

<p>
The schema layer implements an SQL database on top of one or more
b-trees and keeps track of the root page numbers for all b-trees.
Where the b-tree layer provides only unformatted data storage with
a unique key, the schema layer allows each entry to contain multiple
columns.  The schema layer also allows indices and non-unique key values.
</p>

<p>
The schema layer implements two separate data storage abstractions:
tables and indices.  Each table and each index uses its own b-tree
but they use the b-tree capabilities in different ways.  For a table,
the b-tree key is a unique 4-byte integer and the b-tree data is the
content of the table row, encoded so that columns can be separately
extracted.  For indices, the b-tree key varies in size depending on the
size of the fields being indexed and the b-tree data is empty.
</p>

<h3>4.1 &nbsp; SQL Table Implementation Details</h3>

<p>Each row of an SQL table is stored in a single b-tree entry.
The b-tree key is a 4-byte big-endian integer that is the ROWID
or INTEGER PRIMARY KEY for that table row.
The key is stored in a big-endian format so
that keys will sort in numerical order using memcmp() function.</p>

<p>The content of a table row is stored in the data portion of
the corresponding b-tree table.  The content is encoded to allow
individual columns of the row to be extracted as necessary.  Assuming
that the table has N columns, the content is encoded as N+1 offsets
followed by N column values, as follows:
</p>

<blockquote>
<table border=1 cellspacing=0 cellpadding=5>
<tr>
<td>offset 0</td>
<td>offset 1</td>
<td><b>...</b></td>
<td>offset N-1</td>
<td>offset N</td>
<td>value 0</td>
<td>value 1</td>
<td><b>...</b></td>
<td>value N-1</td>
</tr>
</table>
</blockquote>

<p>
The offsets can be either 8-bit, 16-bit, or 24-bit integers depending
on how much data is to be stored.  If the total size of the content
is less than 256 bytes then 8-bit offsets are used.  If the total size
of the b-tree data is less than 65536 then 16-bit offsets are used.
24-bit offsets are used otherwise.  Offsets are always little-endian,
which means that the least significant byte occurs first.
</p>

<p>
Data is stored as a nul-terminated string.  Any empty string consists
of just the nul terminator.  A NULL value is an empty string with no
nul-terminator.  Thus a NULL value occupies zero bytes and an empty string
occupies 1 byte.
</p>

<p>
Column values are stored in the order that they appear in the CREATE TABLE
statement.  The offsets at the beginning of the record contain the
byte index of the corresponding column value.  Thus, Offset 0 contains
the byte index for Value 0, Offset 1 contains the byte offset
of Value 1, and so forth.  The number of bytes in a column value can
always be found by subtracting offsets.  This allows NULLs to be
recovered from the record unabiguously.
</p>

<p>
Most columns are stored in the b-tree data as described above.
The one exception is column that has type INTEGER PRIMARY KEY.
INTEGER PRIMARY KEY columns correspond to the 4-byte b-tree key.
When an SQL statement attempts to read the INTEGER PRIMARY KEY,
the 4-byte b-tree key is read rather than information out of the
b-tree data.  But there is still an Offset associated with the
INTEGER PRIMARY KEY, just like any other column.  But the Value
associated with that offset is always NULL.
</p>

<h3>4.2 &nbsp; SQL Index Implementation Details</h3>

<p>
SQL indices are implement using a b-tree in which the key is used
but the data is always empty.  The purpose of an index is to map
one or more column values into the ROWID for the table entry that
contains those column values.
</p>

<p>
Each b-tree in an index consists of one or more column values followed
by a 4-byte ROWID.  Each column value is nul-terminated (even NULL values)
and begins with a single character that indicates the datatype for that
column value.  Only three datatypes are supported: NULL, Number, and
Text.  NULL values are encoded as the character 'a' followed by the
nul terminator.  Numbers are encoded as the character 'b' followed by
a string that has been crafted so that sorting the string using memcmp()
will sort the corresponding numbers in numerical order.  (See the
sqliteRealToSortable() function in util.c of the SQLite sources for
additional information on this encoding.)  Numbers are also nul-terminated.
Text values consists of the character 'c' followed by a copy of the
text string and a nul-terminator.  These encoding rules result in
NULLs being sorted first, followed by numerical values in numerical
order, followed by text values in lexigraphical order.
</p>

<h3>4.4 &nbsp; SQL Schema Storage And Root B-Tree Page Numbers</h3>

<p>
The database schema is stored in the database in a special tabled named
"sqlite_master" and which always has a root b-tree page number of 2.
This table contains the original CREATE TABLE,
CREATE INDEX, CREATE VIEW, and CREATE TRIGGER statements used to define
the database to begin with.  Whenever an SQLite database is opened,
the sqlite_master table is scanned from beginning to end and 
all the original CREATE statements are played back through the parser
in order to reconstruct an in-memory representation of the database
schema for use in subsequent command parsing.  For each CREATE TABLE
and CREATE INDEX statement, the root page number for the corresponding
b-tree is also recorded in the sqlite_master table so that SQLite will
know where to look for the appropriate b-tree.
</p>

<p>
SQLite users can query the sqlite_master table just like any other table
in the database.  But the sqlite_master table cannot be directly written.
The sqlite_master table is automatically updated in response to CREATE
and DROP statements but it cannot be changed using INSERT, UPDATE, or
DELETE statements as that would risk corrupting the database.
</p>

<p>
SQLite stores temporary tables and indices in a separate
file from the main database file.  The temporary table database file
is the same structure as the main database file.  The schema table
for the temporary tables is stored on page 2 just as in the main
database.  But the schema table for the temporary database named
"sqlite_temp_master" instead of "sqlite_master".  Other than the
name change, it works exactly the same.
</p>

<h3>4.4 &nbsp; Schema Version Numbering And Other Meta-Information</h3>

<p>
The four 32-bit integers that are stored beginning at byte offset
60 of Page 1 in the b-tree layer are passed up into the schema layer
and used for versioning and configuration information.  The meaning
of these four integers is as follows:
</p>

<ol>
<li>The schema version number</li>
<li>The format version number</li>
<li>The recommended pager cache size</li>
<li>Unused</li>
</ol>

<p>
The first meta-value, the schema version number, is used to detect when
the schema of the database is changed by a CREATE or DROP statement.
Recall that when a database is first opened the sqlite_master table is
scanned and an internal representation of the tables, indices, views,
and triggers for the database is built in memory.  This internal
representation is used for all subsequent SQL command parsing and
execution.  But what if another process were to change the schema
by adding or removing a table, index, view, or trigger?  If the original
process were to continue using the old schema, it could potentially
corrupt the database by writing to a table that no longer exists.
To avoid this problem, the schema version number is changed whenever
a CREATE or DROP statement is executed.  Before each command is
executed, the current schema version number for the database file
is compared against the schema version number from when the sqlite_master
table was last read.  If those numbers are different, the internal
schema representation is erased and the sqlite_master table is reread
to reconstruct the internal schema representation.
(Calls to sqlite_exec() generally return SQLITE_SCHEMA when this happens.)
</p>

<p>
The second meta-value is the schema format version number.  This
number tells what version of the schema layer should be used to
interpret the file.  There have been changes to the schema layer
over time and this number is used to detect when an older database
file is being processed by a newer version of the library.
As of this writing (SQLite version 2.7.0) the current format version
is "4".
</p>

<p>
The third meta-value is the recommended pager cache size as set 
by the DEFAULT_CACHE_SIZE pragma.  If the value is positive it
means that synchronous behavior is enable (via the DEFAULT_SYNCHRONOUS
pragma) and if negative it means that synchronous behavior is
disabled.
</p>

<p>
The fourth meta-value is currently unused.
</p>

}

puts {
<p><hr /></p>
<p><a href="index.html"><img src="/goback.jpg" border=0 />
Back to the SQLite Home Page</a>
</p>

</body></html>}
Changes to www/index.tcl.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#
# Run this TCL script to generate HTML for the index.html file.
#
set rcsid {$Id: index.tcl,v 1.66 2002/08/15 13:45:17 drh Exp $}

puts {<html>
<head><title>SQLite: An SQL Database Engine In A C Library</title></head>
<body bgcolor=white>
<h1 align=center>SQLite: An SQL Database Engine In A C Library</h1>
<p align=center>}
puts "This page was last modified on [lrange $rcsid 3 4] UTC<br>"
set vers [lindex $argv 0]
puts "The latest SQLite version is <b>$vers</b>"
puts " created on [exec cat last_change] UTC"
puts {</p>}




|


|

|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#
# Run this TCL script to generate HTML for the index.html file.
#
set rcsid {$Id: index.tcl,v 1.67 2002/08/18 19:09:24 drh Exp $}

puts {<html>
<head><title>SQLite: An Embeddable SQL Database Engine</title></head>
<body bgcolor=white>
<h1 align=center>SQLite<br>An Embeddable SQL Database Engine</h1>
<p align=center>}
puts "This page was last modified on [lrange $rcsid 3 4] UTC<br>"
set vers [lindex $argv 0]
puts "The latest SQLite version is <b>$vers</b>"
puts " created on [exec cat last_change] UTC"
puts {</p>}

100
101
102
103
104
105
106

107
108
109
110
111
112
113
114
115
116
117
118
119
So if you have trouble accessing the CVS server, it could be because the
IP address has recently changed.  Try again in a few days.
</p>

<p>
Whenever either of the first two digits in the version number
for SQLite change, it means that the underlying file format

has changed.  See <a href="formatchng.html">formatchng.html</a>
for additional information.
</p>
}

puts {<h2>Database File Format Change - Version 2.6.0 - 2002 July 17</h2>

<p>Beginning with version 2.6.0, the SQLite database file format changed
in an incompatible way.  If you open a database file from version 2.5.6
or earlier with version 2.6.0 or later of the library, then the file format
will be converted automatically.  This is an irreversible operation.  Once
the conversion occurs, you will no longer be able to access the database
file from older versions of the library.  If the database is large, the







>
|




|







100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
So if you have trouble accessing the CVS server, it could be because the
IP address has recently changed.  Try again in a few days.
</p>

<p>
Whenever either of the first two digits in the version number
for SQLite change, it means that the underlying file format
has changed.  Usually these changes are backwards compatible.
See <a href="formatchng.html">formatchng.html</a>
for additional information.
</p>
}

puts {<h2>Major Database File Format Change - Version 2.6.0 - 2002 July 17</h2>

<p>Beginning with version 2.6.0, the SQLite database file format changed
in an incompatible way.  If you open a database file from version 2.5.6
or earlier with version 2.6.0 or later of the library, then the file format
will be converted automatically.  This is an irreversible operation.  Once
the conversion occurs, you will no longer be able to access the database
file from older versions of the library.  If the database is large, the
139
140
141
142
143
144
145


146
147
148
149
150
151
152
153
154
155
156
157
158
<li>The <a href="lang.html">SQL Language</a> subset understood by SQLite.</li>
<li>The <a href="c_interface.html">C/C++ Interface</a>.</li>
<li>The <a href="tclsqlite.html">Tcl Binding</a> to SQLite.</li>
<li>The <a href="arch.html">Architecture of the SQLite Library</a> describes
    how the library is put together.</li>
<li>A description of the <a href="opcode.html">virtual machine</a> that
    SQLite uses to access the database.</li>


<li>A <a href="speed.html">speed comparison</a> between SQLite, PostgreSQL,
    and MySQL.</li>
</ul>
</p>

<p>The SQLite source code is 35% comment.  These comments are
another important source of information. </p>

}

puts {
<table align="right">
<tr><td align="center">







>
>





|







140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
<li>The <a href="lang.html">SQL Language</a> subset understood by SQLite.</li>
<li>The <a href="c_interface.html">C/C++ Interface</a>.</li>
<li>The <a href="tclsqlite.html">Tcl Binding</a> to SQLite.</li>
<li>The <a href="arch.html">Architecture of the SQLite Library</a> describes
    how the library is put together.</li>
<li>A description of the <a href="opcode.html">virtual machine</a> that
    SQLite uses to access the database.</li>
<li>A description of the 
    <a href="fileformat.html">database file format</a> used by SQLite.
<li>A <a href="speed.html">speed comparison</a> between SQLite, PostgreSQL,
    and MySQL.</li>
</ul>
</p>

<p>The SQLite source code is 30% comment.  These comments are
another important source of information. </p>

}

puts {
<table align="right">
<tr><td align="center">
195
196
197
198
199
200
201







202
203
204
205
206
207
208
$ tar xzf sqlite.tar.gz      <i> Unpacks into directory named "sqlite" </i>
$ mkdir bld                  <i> Create a separate build directory </i>
$ cd bld
$ ../sqlite/configure
$ make                       <i> Builds "sqlite" and "libsqlite.a" </i>
$ make test                  <i> Optional: run regression tests </i>
</pre></blockquote>







}

puts {<h2>Related Sites</h2>

<p>
For information bindings of SQLite to other programming languages
(Perl, Python, Ruby, PHP, etc.) and for a list of programs currently







>
>
>
>
>
>
>







198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
$ tar xzf sqlite.tar.gz      <i> Unpacks into directory named "sqlite" </i>
$ mkdir bld                  <i> Create a separate build directory </i>
$ cd bld
$ ../sqlite/configure
$ make                       <i> Builds "sqlite" and "libsqlite.a" </i>
$ make test                  <i> Optional: run regression tests </i>
</pre></blockquote>

<p>If you prefer, you can also build by making whatever modifications
you desire to the file "Makefile.linux-gcc" and then executing that
makefile.  Tha latter method is used for all official development
and testing of SQLite and for building the precompiled
binaries found on this website.  Windows binaries are generated by
cross-compiling from Linux using <a href="www.mingw.org">MinGW</a></p>
}

puts {<h2>Related Sites</h2>

<p>
For information bindings of SQLite to other programming languages
(Perl, Python, Ruby, PHP, etc.) and for a list of programs currently
Changes to www/lang.tcl.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

20
21
22
23
24
25
26
27
#
# Run this Tcl script to generate the sqlite.html file.
#
set rcsid {$Id: lang.tcl,v 1.43 2002/08/15 11:48:14 drh Exp $}

puts {<html>
<head>
  <title>Query Language Understood By SQLite</title>
</head>
<body bgcolor=white>
<h1 align=center>
SQL As Understood By SQLite
</h1>}
puts "<p align=center>
(This page was last modified on [lrange $rcsid 3 4] UTC)
</p>"

puts {
<p>The SQLite library understands most of the standard SQL

language.  But it does omit some features while at the same time
adding a few features of its own.  This document attempts to
describe percisely what parts of the SQL language SQLite does
and does not support.</p>

<p>In all of the syntax diagrams that follow, literal text is shown in
bold blue.  Non-terminal symbols are shown in italic red.  Operators
that are part of the syntactic markup itself are shown in black roman.</p>



|















>
|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#
# Run this Tcl script to generate the sqlite.html file.
#
set rcsid {$Id: lang.tcl,v 1.44 2002/08/18 19:09:24 drh Exp $}

puts {<html>
<head>
  <title>Query Language Understood By SQLite</title>
</head>
<body bgcolor=white>
<h1 align=center>
SQL As Understood By SQLite
</h1>}
puts "<p align=center>
(This page was last modified on [lrange $rcsid 3 4] UTC)
</p>"

puts {
<p>The SQLite library understands most of the standard SQL
language.  But it does <a href="omitted.html">omit some features</a>
while at the same time
adding a few features of its own.  This document attempts to
describe percisely what parts of the SQL language SQLite does
and does not support.</p>

<p>In all of the syntax diagrams that follow, literal text is shown in
bold blue.  Non-terminal symbols are shown in italic red.  Operators
that are part of the syntactic markup itself are shown in black roman.</p>
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
<p>Beginning in version 2.0, SQLite supports transactions with
rollback and atomic commit.</p>

<p>
No changes can be made to the database except within a transaction.
Any command that changes the database (basically, any SQL command
other than SELECT) will automatically starts a transaction if
one is not already in effect.  Automatically stared transactions
are committed at the conclusion of the command.
</p>

<p>
Transactions can be started manually using the BEGIN
command. Such transactions usually persist until the next
COMMIT or ROLLBACK command. But a transaction will also 







|







127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
<p>Beginning in version 2.0, SQLite supports transactions with
rollback and atomic commit.</p>

<p>
No changes can be made to the database except within a transaction.
Any command that changes the database (basically, any SQL command
other than SELECT) will automatically starts a transaction if
one is not already in effect.  Automatically started transactions
are committed at the conclusion of the command.
</p>

<p>
Transactions can be started manually using the BEGIN
command. Such transactions usually persist until the next
COMMIT or ROLLBACK command. But a transaction will also 
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
implementation.</p>

<p>There are no arbitrary limits on the number of indices that can be
attached to a single table, nor on the number of columns in an index.</p>

<p>If the UNIQUE keyword appears between CREATE and INDEX then duplicate
index entries are not allowed.  Any attempt to insert a duplicate entry
will result in a rollback and an error message.</p>

<p>The optional conflict-clause allows the specification of al alternative
default constraint conflict resolution algorithm for this index.
This only makes sense if the UNIQUE keyword is used since otherwise
there are not constraints on the index.  The default algorithm is
ABORT.  If a COPY, INSERT, or UPDATE statement specifies a particular
conflict resolution algorithm, that algorithm is used in place of
the default algorithm specified here.
See the section titled







|

|







219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
implementation.</p>

<p>There are no arbitrary limits on the number of indices that can be
attached to a single table, nor on the number of columns in an index.</p>

<p>If the UNIQUE keyword appears between CREATE and INDEX then duplicate
index entries are not allowed.  Any attempt to insert a duplicate entry
will result in an error.</p>

<p>The optional conflict-clause allows the specification of an alternative
default constraint conflict resolution algorithm for this index.
This only makes sense if the UNIQUE keyword is used since otherwise
there are not constraints on the index.  The default algorithm is
ABORT.  If a COPY, INSERT, or UPDATE statement specifies a particular
conflict resolution algorithm, that algorithm is used in place of
the default algorithm specified here.
See the section titled
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
<font color="#2c2cf0"><big>-    +    !    ~</big></font>
</pre></blockquote>

<p>Any SQLite value can be used as part of an expression.  
For arithmetic operations, integers are treated as integers.
Strings are first converted to real numbers using <b>atof()</b>.
For comparison operators, numbers compare as numbers and strings
compare as strings.  For string comparisons, case is significant
but is only used to break a tie.
Note that there are two variations of the equals and not equals
operators.  Equals can be either}
puts "[Operator =] or [Operator ==].
The non-equals operator can be either
[Operator !=] or [Operator {&lt;&gt;}].
The [Operator ||] operator is \"concatenate\" - it joins together
the two strings of its operands.</p>"







|
<







650
651
652
653
654
655
656
657

658
659
660
661
662
663
664
<font color="#2c2cf0"><big>-    +    !    ~</big></font>
</pre></blockquote>

<p>Any SQLite value can be used as part of an expression.  
For arithmetic operations, integers are treated as integers.
Strings are first converted to real numbers using <b>atof()</b>.
For comparison operators, numbers compare as numbers and strings
compare using the <b>strcmp()</b> function.

Note that there are two variations of the equals and not equals
operators.  Equals can be either}
puts "[Operator =] or [Operator ==].
The non-equals operator can be either
[Operator !=] or [Operator {&lt;&gt;}].
The [Operator ||] operator is \"concatenate\" - it joins together
the two strings of its operands.</p>"
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
of a row key in an UPDATE or INSERT statement.
"SELECT * ..." does not return the row key.</p>

<p>SELECT statements can appear in expressions as either the
right-hand operand of the IN operator or as a scalar quantity.
In both cases, the SELECT should have only a single column in its
result.  Compound SELECTs (connected with keywords like UNION or
EXCEPT) are allowed.  Any ORDER BY clause on the select is ignored.
A SELECT in an expression is evaluated once before any other processing
is performed, so none of the expressions within the select itself can
refer to quantities in the containing expression.</p>

<p>When a SELECT is the right operand of the IN operator, the IN
operator returns TRUE if the result of the left operand is any of
the values generated by the select.  The IN operator may be preceded







|







698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
of a row key in an UPDATE or INSERT statement.
"SELECT * ..." does not return the row key.</p>

<p>SELECT statements can appear in expressions as either the
right-hand operand of the IN operator or as a scalar quantity.
In both cases, the SELECT should have only a single column in its
result.  Compound SELECTs (connected with keywords like UNION or
EXCEPT) are allowed.
A SELECT in an expression is evaluated once before any other processing
is performed, so none of the expressions within the select itself can
refer to quantities in the containing expression.</p>

<p>When a SELECT is the right operand of the IN operator, the IN
operator returns TRUE if the result of the left operand is any of
the values generated by the select.  The IN operator may be preceded
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
Syntax {sql-statement} {
PRAGMA <name> = <value> |
PRAGMA <function>(<arg>)
}

puts {
<p>The PRAGMA command is used to modify the operation of the SQLite library.
The pragma command is experimental and specific pragma statements may
removed or added in future releases of SQLite.  Use this command
with caution.</p>

<p>The current implementation supports the following pragmas:</p>

<ul>
<li><p><b>PRAGMA cache_size;







|







985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
Syntax {sql-statement} {
PRAGMA <name> = <value> |
PRAGMA <function>(<arg>)
}

puts {
<p>The PRAGMA command is used to modify the operation of the SQLite library.
The pragma command is experimental and specific pragma statements may be
removed or added in future releases of SQLite.  Use this command
with caution.</p>

<p>The current implementation supports the following pragmas:</p>

<ul>
<li><p><b>PRAGMA cache_size;
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
has a fixed number of columns.  The number of columns in the
result is specified by the expression list in between the
SELECT and FROM keywords.  Any arbitrary expression can be used
as a result.  If a result expression is }
puts "[Operator *] then all columns of all tables are substituted"
puts {for that one expression.</p>

<p>The query is executed again one or more tables specified after
the FROM keyword.  If multiple tables names are separated by commas,
then the query is against the cross join of the various tables.
The full SQL-92 join syntax can also be used to specify joins.
A sub-query
in parentheses may be substituted for any table name in the FROM clause.
The entire FROM clause may be omitted, in which case the result is a
single row consisting of the values of the expression list.
</p>

<p>The WHERE clause can be used to limit the number of rows over
which the query operates.  In the current implementation,
indices will only be used to
optimize the query if WHERE expression contains equality comparisons
connected by the AND operator.</p>

<p>The GROUP BY clauses causes one or more rows of the result to
be combined into a single row of output.  This is especially useful
when the result contains aggregate functions.  The expressions in
the GROUP BY clause do <em>not</em> have to be expressions that
appear in the result.  The HAVING clause is similar to WHERE except
that HAVING applies after grouping has occurred.  The HAVING expression







|










|
<
<
<







1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188



1189
1190
1191
1192
1193
1194
1195
has a fixed number of columns.  The number of columns in the
result is specified by the expression list in between the
SELECT and FROM keywords.  Any arbitrary expression can be used
as a result.  If a result expression is }
puts "[Operator *] then all columns of all tables are substituted"
puts {for that one expression.</p>

<p>The query is executed against one or more tables specified after
the FROM keyword.  If multiple tables names are separated by commas,
then the query is against the cross join of the various tables.
The full SQL-92 join syntax can also be used to specify joins.
A sub-query
in parentheses may be substituted for any table name in the FROM clause.
The entire FROM clause may be omitted, in which case the result is a
single row consisting of the values of the expression list.
</p>

<p>The WHERE clause can be used to limit the number of rows over
which the query operates.</p>




<p>The GROUP BY clauses causes one or more rows of the result to
be combined into a single row of output.  This is especially useful
when the result contains aggregate functions.  The expressions in
the GROUP BY clause do <em>not</em> have to be expressions that
appear in the result.  The HAVING clause is similar to WHERE except
that HAVING applies after grouping has occurred.  The HAVING expression
Changes to www/quickstart.tcl.
1
2
3
4
5
6
7
8
9
10
11
#
# Run this TCL script to generate HTML for the quickstart.html file.
#
set rcsid {$Id: quickstart.tcl,v 1.1 2002/08/15 13:45:17 drh Exp $}

puts {<html>
<head><title>SQLite In 5 Minutes Or Less</title></head>
<body bgcolor=white>
<h1 align=center>SQLite In 5 Minutes Or Less</h1>}

puts {



|







1
2
3
4
5
6
7
8
9
10
11
#
# Run this TCL script to generate HTML for the quickstart.html file.
#
set rcsid {$Id: quickstart.tcl,v 1.2 2002/08/18 19:09:24 drh Exp $}

puts {<html>
<head><title>SQLite In 5 Minutes Or Less</title></head>
<body bgcolor=white>
<h1 align=center>SQLite In 5 Minutes Or Less</h1>}

puts {
32
33
34
35
36
37
38
39





40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62




63
64
65
66
67
68
69

<h2>Write Programs That Use SQLite</h2>

<ul>
<li><p>Below is a simple TCL program that demonstrates how to use
the TCL interface to SQLite.  The program executes the SQL statements
given as the second argument on the database defined by the first
argument.</p>






<blockquote><pre>
#!/usr/bin/tclsh
if {$argc!=2} {
  puts stderr "Usage: %s DATABASE SQL-STATEMENT"
  exit 1
}
load /usr/lib/tclsqlite.so Sqlite
sqlite db [lindex $argv 0]
db eval [lindex $argv 1] x {
  foreach v $x(*) {
    puts "$v = $x($v)"
  }
  puts ""
}
db close
</pre></blockquote>
</li>

<li><p>Below is a simple C program that demonstrates how to use
the C/C++ interface to SQLite.  The name of a database is given by
the first argument and the second argument is one or more SQL statements
to execute against the database.</p>





<blockquote><pre>
#include &lt;stdio.h&gt;
#include &lt;sqlite.h&gt;

static int callback(void *NotUsed, int argc, char **argv, char **azColName){
  int i;







|
>
>
>
>
>








|
|





|






|
>
>
>
>







32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78

<h2>Write Programs That Use SQLite</h2>

<ul>
<li><p>Below is a simple TCL program that demonstrates how to use
the TCL interface to SQLite.  The program executes the SQL statements
given as the second argument on the database defined by the first
argument.  The commands to watch for are the <b>sqlite</b> command
on line 7 which opens an SQLite database and creates
a new TCL command named "<b>db</b>" to access that database, the
invocation of the <b>db</b> command on line 8 to execute
SQL commands against the database, and the closing of the database connection
on the last line of the script.</p>

<blockquote><pre>
#!/usr/bin/tclsh
if {$argc!=2} {
  puts stderr "Usage: %s DATABASE SQL-STATEMENT"
  exit 1
}
load /usr/lib/tclsqlite.so Sqlite
<b>sqlite</b> db [lindex $argv 0]
<b>db</b> eval [lindex $argv 1] x {
  foreach v $x(*) {
    puts "$v = $x($v)"
  }
  puts ""
}
<b>db</b> close
</pre></blockquote>
</li>

<li><p>Below is a simple C program that demonstrates how to use
the C/C++ interface to SQLite.  The name of a database is given by
the first argument and the second argument is one or more SQL statements
to execute against the database.  The function calls to pay attention
to here are the call to <b>sqlite_open()</b> on line 22 which opens
the database, <b>sqlite_exec()</b> on line 27 that executes SQL
command against the database, and <b>sqlite_close()</b> on line 31
that closes the database connection.</p>

<blockquote><pre>
#include &lt;stdio.h&gt;
#include &lt;sqlite.h&gt;

static int callback(void *NotUsed, int argc, char **argv, char **azColName){
  int i;
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
  char *zErrMsg = 0;
  int rc;

  if( argc!=3 ){
    fprintf(stderr, "Usage: %s DATABASE SQL-STATEMENT\n", argv[0]);
    exit(1);
  }
  db = sqlite_open(argv[1], 0, &zErrMsg);
  if( db==0 ){
    fprintf(stderr, "Can't open database: %s\n", &zErrMsg);
    exit(1);
  }
  rc = sqlite_exec(db, argv[2], callback, 0, &zErrMsg);
  if( rc!=SQLITE_OK ){
    fprintf(stderr, "SQL error: %s\n", zErrMsg);
  }
  sqlite_close(db);
  return 0;
}
</pre></blockquote>
</li>
</ul>
}








|




|



|







88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
  char *zErrMsg = 0;
  int rc;

  if( argc!=3 ){
    fprintf(stderr, "Usage: %s DATABASE SQL-STATEMENT\n", argv[0]);
    exit(1);
  }
  db = <b>sqlite_open</b>(argv[1], 0, &zErrMsg);
  if( db==0 ){
    fprintf(stderr, "Can't open database: %s\n", &zErrMsg);
    exit(1);
  }
  rc = <b>sqlite_exec</b>(db, argv[2], callback, 0, &zErrMsg);
  if( rc!=SQLITE_OK ){
    fprintf(stderr, "SQL error: %s\n", zErrMsg);
  }
  <b>sqlite_close</b>(db);
  return 0;
}
</pre></blockquote>
</li>
</ul>
}