Documentation Source Text

Check-in [066e5931ce]

Overview
Comment: Add <fancy_format> or <table_of_contents> markup to a few more documents. To ensure that there are enough anchors in the longer documents for the search script to use.
SHA1: 066e5931ce18a98e3b8c778adb807d8ea25fe2f0
User & Date: dan 2016-08-25 12:18:04.821
Context
2016-08-25 17:29  Change the way sub-sections are selected by the search script. (check-in: c7628dcb37 user: dan tags: experimental)
2016-08-25 12:18  Add <fancy_format> or <table_of_contents> markup to a few more documents. To ensure that there are enough anchors in the longer documents for the search script to use. (check-in: 066e5931ce user: dan tags: experimental)
2016-08-24 19:26  Further updates to search database and script. (check-in: cc51dec17e user: dan tags: experimental)
Changes
Changes to pages/atomiccommit.in.
@@ -1,15 +1,12 @@
 <title>Atomic Commit In SQLite</title>
 <tcl>hd_keywords {atomic commit} {*Atomic Commit}</tcl>
-
-<h1 align="center">
-Atomic Commit In SQLite
-</h1>
+<table_of_contents>
 
-<h2>1.0 Introduction</h2>
+<h1> Introduction</h1>
 
 <p>An important feature of transactional databases like SQLite
 is "atomic commit".
 Atomic commit means that either all database changes within a single
 transaction occur or none of them occur.  With atomic commit, it
 is as if many different writes to different sections of the database
 file occur instantaneously and simultaneously.

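The guarantee described in this introduction is what an application sees when it groups statements into a transaction. A minimal sketch using the public C API (the accounts table and the transfer scenario are hypothetical illustrations, not part of the check-in):

    /* Either both UPDATEs reach the database file or neither does. */
    #include <stdio.h>
    #include <sqlite3.h>

    int transfer(sqlite3 *db){
      char *zErr = 0;
      int rc = sqlite3_exec(db,
          "BEGIN;"
          "UPDATE accounts SET balance = balance - 100 WHERE id = 1;"
          "UPDATE accounts SET balance = balance + 100 WHERE id = 2;"
          "COMMIT;", 0, 0, &zErr);
      if( rc!=SQLITE_OK ){
        fprintf(stderr, "transfer failed: %s\n", zErr ? zErr : "?");
        sqlite3_free(zErr);
        sqlite3_exec(db, "ROLLBACK;", 0, 0, 0);  /* discard any partial work */
      }
      return rc;
    }
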
@@ -33,15 +30,15 @@
 using a [write-ahead log].  SQLite still supports atomic commit when
 write-ahead logging is enabled, but it accomplishes atomic commit by
 a different mechanism from the one described in this article.  See
 the [WAL | write-ahead log documentation] for additional information on how
 SQLite supports atomic commit in that context.</p>
 
 <tcl>hd_fragment hardware</tcl>
-<h2>2.0 Hardware Assumptions</h2>
+<h1> Hardware Assumptions</h1>
 
 <p>Throughout this article, we will call the mass storage device "disk"
 even though the mass storage device might really be flash memory.</p>
 
 <p>We assume that disk is written in chunks which we call a "sector".
 It is not possible to modify any part of the disk smaller than a sector.
 To change a part of the disk smaller than a sector, you have to read in

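For completeness, switching a connection to the write-ahead log mentioned in this hunk is a single documented pragma; a sketch (error handling omitted, not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: opt into the WAL mechanism instead of the rollback
    ** journal described in this document. */
    int use_wal(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA journal_mode=WAL;", 0, 0, 0);
    }
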
@@ -171,24 +168,24 @@
 default in recent versions of SQLite.  The assumption of the powersafe
 overwrite property can be disabled at compile-time or at run-time if
 desired.  See the [PSOW | powersafe overwrite documentation] for further
 details.
 
 
 <a name="section_3_0"></a>
-<h2>3.0 Single File Commit</h2>
+<h1> Single File Commit</h1>
 
 <p>We begin with an overview of the steps SQLite takes in order to
 perform an atomic commit of a transaction against a single database
 file.  The details of file formats used to guard against damage from
 power failures and techniques for performing an atomic commit across
 multiple databases are discussed in later sections.</p>
 
 <tcl>hd_fragment initstate</tcl>
-<h3>3.1 Initial State</h3>
+<h2> Initial State</h2>
 
 <img src="images/ac/commit-0.gif" align="right" hspace="15">
 
 <p>The state of the computer when a database connection is
 first opened is shown conceptually by the diagram at the
 right.
 The area of the diagram on the extreme right (labeled "Disk") represents

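The run-time override mentioned above can be expressed through the documented "psow" URI query parameter; the compile-time default comes from -DSQLITE_POWERSAFE_OVERWRITE. A sketch with a hypothetical filename (an illustration, not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: open a database with the powersafe-overwrite
    ** assumption turned off via the "psow" URI parameter. */
    int open_without_psow(sqlite3 **pDb){
      return sqlite3_open_v2("file:test.db?psow=0", pDb,
                             SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                             SQLITE_OPEN_URI, 0);
    }
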
@@ -202,15 +199,15 @@
 the process that is using SQLite.  The database connection has
 just been opened and no information has been read yet, so the
 user space is empty.
 </p>
 <br clear="both">
 
 <tcl>hd_fragment rdlck</tcl>
-<h3>3.2 Acquiring A Read Lock</h3>
+<h2> Acquiring A Read Lock</h2>
 
 <img src="images/ac/commit-1.gif" align="right" hspace="15">
 
 <p>Before SQLite can write to a database, it must first read
 the database to see what is there already.  Even if it is just
 appending new data, SQLite still has to read in the database
 schema from the <b>sqlite_master</b> table so that it can know

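The schema that SQLite reads at this step is visible to applications as the ordinary sqlite_master table; a sketch of querying it (not part of the check-in):

    #include <stdio.h>
    #include <sqlite3.h>

    /* Sketch: list the tables recorded in the schema. */
    void print_tables(sqlite3 *db){
      sqlite3_stmt *pStmt;
      if( sqlite3_prepare_v2(db,
            "SELECT name FROM sqlite_master WHERE type='table'",
            -1, &pStmt, 0)!=SQLITE_OK ) return;
      while( sqlite3_step(pStmt)==SQLITE_ROW ){
        printf("%s\n", (const char*)sqlite3_column_text(pStmt, 0));
      }
      sqlite3_finalize(pStmt);
    }
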
@@ -237,15 +234,15 @@
 operating system crashes or if there is a power loss.  It
 is usually also the case that the lock will vanish if the
 process that created the lock exits.</p>
 
 <br clear="both">
 
 <a name="section_3_3"></a>
-<h3>3.3 Reading Information Out Of The Database</h3>
+<h2> Reading Information Out Of The Database</h2>
 
 <img src="images/ac/commit-2.gif" align="right" hspace="15">
 
 <p>After the shared lock is acquired, we can begin reading
 information from the database file.  In this scenario, we
 are assuming a cold cache, so information must first be
 read from mass storage into the operating system cache then

@@ -259,15 +256,15 @@
 pages out of eight being read.  In a typical application, a
 database will have thousands of pages and a query will normally
 only touch a small percentage of those pages.</p>
 
 <br clear="both">
 
 <tcl>hd_fragment rsvdlock</tcl>
-<h3>3.4 Obtaining A Reserved Lock</h3>
+<h2> Obtaining A Reserved Lock</h2>
 
 <img src="images/ac/commit-3.gif" align="right" hspace="15">
 
 <p>Before making changes to the database, SQLite first
 obtains a "reserved" lock on the database file.  A reserved
 lock is similar to a shared lock in that both a reserved lock
 and shared lock allow other processes to read from the database

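From SQL, an application can request the reserved lock up front with the documented BEGIN IMMEDIATE statement instead of waiting for its first write; a sketch (not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: acquire the reserved lock immediately; SQLITE_BUSY is
    ** returned if another connection already holds a reserved (or
    ** stronger) lock. */
    int start_write_txn(sqlite3 *db){
      return sqlite3_exec(db, "BEGIN IMMEDIATE;", 0, 0, 0);
    }
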
@@ -283,15 +280,15 @@
 And because the modifications have not yet started, other
 processes can continue to read from the database.  However,
 no other process should also begin trying to write to the
 database.</p>
 
 <br clear="both">
 <a name="section_3_5"></a>
-<h3>3.5 Creating A Rollback Journal File</h3>
+<h2> Creating A Rollback Journal File</h2>
 <img src="images/ac/commit-4.gif" align="right" hspace="15">
 
 <p>Prior to making any changes to the database file, SQLite first
 creates a separate rollback journal file and writes into the
 rollback journal the original
 content of the database pages that are to be altered.
 The idea behind the rollback journal is that it contains

@@ -315,30 +312,30 @@
   is possible when doing real disk I/O.  We illustrate this idea in
   the diagram to the right by showing that the new rollback journal
   appears in the operating system disk cache only and not on the
   disk itself.</p>
 
 <br clear="both">
 <a name="section_3_6"></a>
-<h3>3.6 Changing Database Pages In User Space</h3>
+<h2> Changing Database Pages In User Space</h2>
 <img src="images/ac/commit-5.gif" align="right" hspace="15">
 
 <p>After the original page content has been saved in the rollback
 journal, the pages can be modified in user memory.  Each database
 connection has its own private copy of user space, so the changes
 that are made in user space are only visible to the database connection
 that is making the changes.  Other database connections still see
 the information in operating system disk cache buffers which have
 not yet been changed.  And so even though one process is busy
 modifying the database, other processes can continue to read their
 own copies of the original database content.</p>
 
 <br clear="both">
 <a name="section_3_7"></a>
-<h3>3.7 Flushing The Rollback Journal File To Mass Storage</h3>
+<h2> Flushing The Rollback Journal File To Mass Storage</h2>
 <img src="images/ac/commit-6.gif" align="right" hspace="15">
 
 <p>The next step is to flush the content of the rollback journal
 file to nonvolatile storage.
 As we will see later,
 this is a critical step in insuring that the database can survive
 an unexpected power loss.

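On unix-style systems the flush described here is what SQLite's VFS xSync method performs, typically with fsync(); applications never call it directly. A sketch of the underlying primitive, subject to the hardware caveats of the Hardware Assumptions section (an illustration, not part of the check-in):

    #include <unistd.h>

    /* Sketch: block until the kernel reports the file's data has
    ** reached nonvolatile storage. */
    int flush_file(int fd){
      return fsync(fd);
    }
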
@@ -352,15 +349,15 @@
 rollback journal is modified to show the number of pages in the
 rollback journal.  Then the header is flushed to disk.  The details
 on why we do this header modification and extra flush are provided
 in a later section of this paper.</p>
 
 <br clear="both">
 <a name="section_3_8"></a>
-<h3>3.8 Obtaining An Exclusive Lock</h3>
+<h2> Obtaining An Exclusive Lock</h2>
 <img src="images/ac/commit-7.gif" align="right" hspace="15">
 
 <p>Prior to making changes to the database file itself, we must
 obtain an exclusive lock on the database file.  Obtaining an
 exclusive lock is really a two-step process.  First SQLite obtains
 a "pending" lock.  Then it escalates the pending lock to an
 exclusive lock.</p>

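While one connection escalates toward the exclusive lock, other connections see SQLITE_BUSY; the documented busy-timeout interface makes SQLite retry instead of failing at once. A sketch (not part of the check-in; the 2000 ms figure is arbitrary):

    #include <sqlite3.h>

    /* Sketch: retry busy lock attempts for up to two seconds. */
    void set_busy_wait(sqlite3 *db){
      sqlite3_busy_timeout(db, 2000);
    }
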
@@ -382,40 +379,40 @@
 that cycle by allowing existing shared locks to proceed but
 blocking new shared locks from being established.  Eventually
 all shared locks will clear and the pending lock will then be
 able to escalate into an exclusive lock.</p>
 
 <br clear="both">
 <a name="section_3_9"></a>
-<h3>3.9 Writing Changes To The Database File</h3>
+<h2> Writing Changes To The Database File</h2>
 <img src="images/ac/commit-8.gif" align="right" hspace="15">
 
 <p>Once an exclusive lock is held, we know that no other
 processes are reading from the database file and it is
 safe to write changes into the database file.  Usually
 those changes only go as far as the operating system's disk
 cache and do not make it all the way to mass storage.</p>
 
 <br clear="both">
 <a name="section_3_10"></a>
-<h3>3.10 Flushing Changes To Mass Storage</h3>
+<h2>0 Flushing Changes To Mass Storage</h2>
 <img src="images/ac/commit-9.gif" align="right" hspace="15">
 
 <p>Another flush must occur to make sure that all the
 database changes are written into nonvolatile storage.
 This is a critical step to ensure that the database will
 survive a power loss without damage.  However, because
 of the inherent slowness of writing to disk or flash memory,
 this step together with the rollback journal file flush in section
 3.7 above takes up most of the time required to complete a
 transaction commit in SQLite.</p>
 
 <br clear="both">
 <a name="section_3_11"></a>
-<h3>3.11 Deleting The Rollback Journal</h3>
+<h2>1 Deleting The Rollback Journal</h2>
 <img src="images/ac/commit-A.gif" align="right" hspace="15">
 
 <p>After the database changes are all safely on the mass
 storage device, the rollback journal file is deleted.
 This is the instant where the transaction commits.
 If a power failure or system crash occurs prior to this
 point, then recovery processes to be described later make

@@ -457,15 +454,15 @@
 part of the header is malformed the journal will not roll back.
 Hence, one can say that the commit occurs as soon as the header
 is sufficiently changed to make it invalid.  Typically this happens
 as soon as the first byte of the header is zeroed.</p>
 
 <br clear="both">
 <a name="section_3_12"></a>
-<h3>3.12 Releasing The Lock</h3>
+<h2>2 Releasing The Lock</h2>
 <img src="images/ac/commit-B.gif" align="right" hspace="15">
 
 <p>The last step in the commit process is to release the
 exclusive lock so that other processes can once again
 start accessing the database file.</p>
 
 <p>In the diagram at the right, we show that the information

@@ -485,47 +482,47 @@
 database by checking that counter.  If the database was modified,
 then the user space cache must be cleared and reread.  But it is
 commonly the case that no changes have been made and the user
 space cache can be reused for a significant performance savings.</p>
 
 <br clear="both">
 <tcl>hd_fragment rollback</tcl>
-<h2>4.0 Rollback</h2>
+<h1> Rollback</h1>
 
 <p>An atomic commit is supposed to happen instantaneously.  But the processing
 described above clearly takes a finite amount of time.
 Suppose the power to the computer were cut
 part way through the commit operation described above.  In order
 to maintain the illusion that the changes were instantaneous, we
 have to "roll back" any partial changes and restore the database to
 the state it was in prior to the beginning of the transaction.</p>
 
 <tcl>hd_fragment crisis</tcl>
-<h3>4.1 When Something Goes Wrong...</h3>
+<h2> When Something Goes Wrong...</h2>
 <img src="images/ac/rollback-0.gif" align="right" hspace="15">
 
 <p>Suppose the power loss occurred
 during <a href="#section_3_10">step 3.10</a> above,
 while the database changes were being written to disk.
 After power is restored, the situation might be something
 like what is shown to the right.  We were trying to change
 three pages of the database file but only one page was
 successfully written.  Another page was partially written
 and a third page was not written at all.</p>
 
 <p>The rollback journal is complete and intact on disk when
 the power is restored.  This is a key point.  The reason for
 the flush operation in <a href="#section_3_7">step 3.7</a>
 is to make absolutely sure that
 all of the rollback journal is safely on nonvolatile storage
 prior to making any changes to the database file itself.</p>
 
 <br clear="both">
 <a name="section_4_2"></a>
-<h3>4.2 Hot Rollback Journals</h3>
+<h2> Hot Rollback Journals</h2>
 <img src="images/ac/rollback-1.gif" align="right" hspace="15">
 
 <p>The first time that any SQLite process attempts to access
 the database file, it obtains a shared lock as described in
 <a href="#section_3_2">section 3.2</a> above.
 But then it notices that there is a
 rollback journal file present.  SQLite then checks to see if

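The counter checked at the top of this hunk is the file change counter, which the documented database file format stores at byte offset 24 of the header as a 4-byte big-endian integer. A sketch that reads it directly, for illustration only since applications normally never parse the file (not part of the check-in):

    #include <stdio.h>

    /* Sketch: read the file change counter from a database header. */
    unsigned read_change_counter(const char *zPath){
      unsigned char h[28];
      unsigned n = 0;
      FILE *f = fopen(zPath, "rb");
      if( f ){
        if( fread(h, 1, sizeof(h), f)==sizeof(h) ){
          n = ((unsigned)h[24]<<24) | ((unsigned)h[25]<<16)
            | ((unsigned)h[26]<<8)  |  (unsigned)h[27];
        }
        fclose(f);
      }
      return n;
    }
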
@@ -556,86 +553,86 @@
 it aborted for some reason prior to the completion of the
 commit.  A hot journal means that
 the database file is in an inconsistent state and needs to
 be repaired (by rollback) prior to being used.</p>
 
 <br clear="both">
 <tcl>hd_fragment exlock</tcl>
-<h3>4.3 Obtaining An Exclusive Lock On The Database</h3>
+<h2> Obtaining An Exclusive Lock On The Database</h2>
 <img src="images/ac/rollback-2.gif" align="right" hspace="15">
 
 <p>The first step toward dealing with a hot journal is to
 obtain an exclusive lock on the database file.  This prevents two
 or more processes from trying to roll back the same hot journal
 at the same time.</p>
 
 <br clear="both">
 <a name="section_4_4"></a>
-<h3>4.4 Rolling Back Incomplete Changes</h3>
+<h2> Rolling Back Incomplete Changes</h2>
 <img src="images/ac/rollback-3.gif" align="right" hspace="15">
 
 <p>Once a process obtains an exclusive lock, it is permitted
 to write to the database file.  It then proceeds to read the
 original content of pages out of the rollback journal and write
 that content back to where it came from in the database file.
 Recall that the header of the rollback journal records the original
 size of the database file prior to the start of the aborted
 transaction.  SQLite uses this information to truncate the
 database file back to its original size in cases where the
 incomplete transaction caused the database to grow.  At the
 end of this step, the database should be the same size and
 contain the same information as it did before the start of
 the aborted transaction.</p>
 
 <br clear="both">
 <tcl>hd_fragment delhotjrnl</tcl>
-<h3>4.5 Deleting The Hot Journal</h3>
+<h2> Deleting The Hot Journal</h2>
 <img src="images/ac/rollback-4.gif" align="right" hspace="15">
 
 <p>After all information in the rollback journal has been
 played back into the database file (and flushed to disk in case
 we encounter yet another power failure), the hot rollback journal
 can be deleted.</p>
 
 <p>As in <a href="#section_3_11">section 3.11</a>, the journal
 file might be truncated to zero length or its header might
 be overwritten with zeros as an optimization on systems where
 deleting a file is expensive.  Either way, the journal is no
 longer hot after this step.</p>
 
 <br clear="both">
 <tcl>hd_fragment cont</tcl>
-<h3>4.6 Continue As If The Uncompleted Writes Had Never Happened</h3>
+<h2> Continue As If The Uncompleted Writes Had Never Happened</h2>
 <img src="images/ac/rollback-5.gif" align="right" hspace="15">
 
 <p>The final recovery step is to reduce the exclusive lock back
 to a shared lock.  Once this happens, the database is back in the
 state that it would have been if the aborted transaction had never
 started.  Since all of this recovery activity happens completely
 automatically and transparently, it appears to the program using
 SQLite as if the aborted transaction had never begun.</p>
 
 <br clear="both">
 <tcl>hd_fragment multicommit</tcl>
-<h2>5.0 Multi-file Commit</h2>
+<h1> Multi-file Commit</h1>
 
 <p>SQLite allows a single
 <a href="c3ref/sqlite3.html">database connection</a> to talk to
 two or more database files simultaneously through the use of
 the <a href="lang_attach.html">ATTACH DATABASE</a> command.
 When multiple database files are modified within a single
 transaction, all files are updated atomically.
 In other words, either all of the database files are updated or
 else none of them are.
 Achieving an atomic commit across multiple database files is
 more complex than doing so for a single file.  This section
 describes how SQLite works that bit of magic.</p>
 
 <tcl>hd_fragment multijrnl</tcl>
-<h3>5.1 Separate Rollback Journals For Each Database</h3>
+<h2> Separate Rollback Journals For Each Database</h2>
 <img src="images/ac/multi-0.gif" align="right" hspace="15">
 
 <p>When multiple database files are involved in a transaction,
 each database has its own rollback journal and each database
 is locked separately.  The diagram at the right shows a scenario
 where three different database files have been modified within
 one transaction.  The situation at this step is analogous to

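A sketch of the scenario this section covers, from the application side; the second database file and the log tables are hypothetical names (not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: a transaction spanning two database files.  SQLite
    ** commits both atomically using the master-journal mechanism
    ** described in this section. */
    int two_file_commit(sqlite3 *db){
      return sqlite3_exec(db,
          "ATTACH DATABASE 'second.db' AS aux;"
          "BEGIN;"
          "INSERT INTO main.log VALUES('hello');"
          "INSERT INTO aux.log VALUES('hello');"
          "COMMIT;", 0, 0, 0);
    }
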
@@ -656,15 +653,15 @@
 operating system cache and information that is on disk.  All of
 these factors still apply in a multi-file commit scenario.  They
 just take up a lot of space in the diagrams and they do not add
 any new information, so they are omitted here.</p>
 
 <br clear="both">
 <tcl>hd_fragment masterjrnl</tcl>
-<h3>5.2 The Master Journal File</h3>
+<h2> The Master Journal File</h2>
 <img src="images/ac/multi-1.gif" align="right" hspace="15">
 
 <p>The next step in a multi-file commit is the creation of a
 "master journal" file.  The name of the master journal file is
 the same name as the original database filename (the database
 that was opened using the
 <a href="c3ref/open.html">sqlite3_open()</a> interface,

@@ -693,75 +690,75 @@
 transactions are atomic across a power-loss.  But if the database files
 have other settings that compromise integrity across a power-loss event
 (such as [PRAGMA synchronous=OFF] or [PRAGMA journal_mode=MEMORY]) then
 the creation of the master journal is omitted, as an optimization.
 
 <br clear="both">
 <tcl>hd_fragment multijrnlupdate</tcl>
-<h3>5.3 Updating Rollback Journal Headers</h3>
+<h2> Updating Rollback Journal Headers</h2>
 <img src="images/ac/multi-2.gif" align="right" hspace="15">
 
 <p>The next step is to record the full pathname of the master journal file
 in the header of every rollback journal.  Space to hold the master
 journal filename was reserved at the beginning of each rollback journal
 as the rollback journals were created.</p>
 
 <p>The content of each rollback journal is flushed to disk both before
 and after the master journal filename is written into the rollback
 journal header.  It is important to do both of these flushes.  Fortunately,
 the second flush is usually inexpensive since typically only a single
 page of the journal file (the first page) has changed.</p>
 
 <p>This step is analogous to
 <a href="#section_3_7">step 3.7</a> in the single-file commit
 scenario described above.</p>
 
 <br clear="both">
 <tcl>hd_fragment multidbupdate</tcl>
-<h3>5.4 Updating The Database Files</h3>
+<h2> Updating The Database Files</h2>
 <img src="images/ac/multi-3.gif" align="right" hspace="15">
 
 <p>Once all rollback journal files have been flushed to disk, it
 is safe to begin updating database files.  We have to obtain an
 exclusive lock on all database files before writing the changes.
 After all the changes are written, it is important to flush the
 changes to disk so that they will be preserved in the event of
 a power failure or operating system crash.</p>
 
 <p>This step corresponds to steps
 <a href="#section_3_8">3.8</a>,
 <a href="#section_3_9">3.9</a>, and
 <a href="#section_3_10">3.10</a> in the single-file commit
 scenario described previously.</p>
 
 
 <br clear="both">
 <a name="section_5_5"></a>
-<h3>5.5 Delete The Master Journal File</h3>
+<h2> Delete The Master Journal File</h2>
 <img src="images/ac/multi-4.gif" align="right" hspace="15">
 
 <p>The next step is to delete the master journal file.
 This is the point where the multi-file transaction commits.
 This step corresponds to
 <a href="#section_3_11">step 3.11</a> in the single-file
 commit scenario where the rollback journal is deleted.</p>
 
 <p>If a power failure or operating system crash occurs at this
 point, the transaction will not roll back when the system reboots
 even though there are rollback journals present.  The
 difference is the master journal pathname in the header of the
 rollback journal.  Upon restart, SQLite only considers a journal
 to be hot and will only play back the journal if there is no
 master journal filename in the header (which is the case for
 a single-file commit) or if the master journal file still
 exists on disk.</p>
 
 <br clear="both">
 <tcl>hd_fragment cleanup</tcl>
-<h3>5.6 Clean Up The Rollback Journals</h3>
+<h2> Clean Up The Rollback Journals</h2>
 <img src="images/ac/multi-5.gif" align="right" hspace="15">
 
 <p>The final step in a multi-file commit is to delete the
 individual rollback journals and drop the exclusive locks on
 the database files so that other processes can see the changes.
 This corresponds to
 <a href="#section_3_12">step 3.12</a> in the single-file

@@ -775,23 +772,23 @@
 this so that all rollback journals are deleted before any database
 files are unlocked.  As long as the rollback journal is deleted before
 its corresponding database file is unlocked it does not matter in what
 order the rollback journals are deleted or the database files are
 unlocked.</p>
 
 <tcl>hd_fragment moredetail</tcl>
-<h2>6.0 Additional Details Of The Commit Process</h2>
+<h1> Additional Details Of The Commit Process</h1>
 
 <p><a href="#section_3_0">Section 3.0</a> above provides an overview of
 how atomic commit works in SQLite.  But it glosses over a number of
 important details.  The following subsections will attempt to fill
 in the gaps.</p>
 
 <tcl>hd_fragment completesectors</tcl>
-<h3>6.1 Always Journal Complete Sectors</h3>
+<h2> Always Journal Complete Sectors</h2>
 
 <p>When the original content of a database page is written into
 the rollback journal (as shown in <a href="#section_3_5">section 3.5</a>),
 SQLite always writes a complete sector of data, even if the
 page size of the database is smaller than the sector size.
 Historically, the sector size in SQLite has been hard coded to 512
 bytes and since the minimum page size is also 512 bytes, this has never

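The page size discussed here is adjustable through the documented page_size pragma, which takes effect if issued before the database file is initialized; a sketch (the 4096 figure is an arbitrary example, not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: choose a page size larger than the historical
    ** 512-byte minimum before the database is created. */
    int set_page_size(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA page_size=4096;", 0, 0, 0);
    }
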
@@ -809,15 +806,15 @@
 content of pages 1, 3, and 4 since the hardware must write the complete
 sector.  If this write operation is interrupted by a power outage,
 one or more of the pages 1, 3, or 4 might be left with incorrect data.
 Hence, to avoid lasting corruption to the database, the original content
 of all of those pages must be contained in the rollback journal.</p>
 
 <tcl>hd_fragment journalgarbage</tcl>
-<h3>6.2 Dealing With Garbage Written Into Journal Files</h3>
+<h2> Dealing With Garbage Written Into Journal Files</h2>
 
 <p>When data is appended to the end of the rollback journal,
 SQLite normally makes the pessimistic assumption that the file
 is first extended with invalid "garbage" data and that afterwards
 the correct data replaces the garbage.  In other words, SQLite assumes
 that the file size is increased first and then afterwards the content
 is written into the file.  If a power failure occurs after the file

@@ -874,15 +871,15 @@
 <p>Note that the checksums in the rollback journal are not necessary
 if the synchronous setting is FULL.  We only depend on the checksums
 when synchronous is lowered to NORMAL.  Nevertheless, the checksums
 never hurt and so they are included in the rollback journal regardless
 of the synchronous setting.</p>
 
 <tcl>hd_fragment cachespill</tcl>
-<h3>6.3 Cache Spill Prior To Commit</h3>
+<h2> Cache Spill Prior To Commit</h2>
 
 <p>The commit process shown in <a href="#section_3_0">section 3.0</a>
 assumes that all database changes fit in memory until it is time to
 commit.  This is the common case.  But sometimes a larger change will
 overflow the user-space cache prior to transaction commit.  In those
 cases, the cache must spill to the database before the transaction
 is complete.</p>

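The synchronous setting referred to at the top of this hunk is the documented synchronous pragma; FULL is the usual default in rollback mode, and lowering it to NORMAL is what makes the journal checksums load-bearing. A sketch (not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: trade one flush per commit for reliance on the
    ** rollback-journal checksums described above. */
    int relax_sync(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA synchronous=NORMAL;", 0, 0, 0);
    }
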
@@ -909,25 +906,25 @@
 escalate from reserved to exclusive.  This reduces concurrency.
 A cache spill also causes extra disk flush or fsync operations to
 occur and these operations are slow, hence a cache spill can
 seriously reduce performance.
 For these reasons a cache spill is avoided whenever possible.</p>
 
 <tcl>hd_fragment opts</tcl>
-<h2>7.0 Optimizations</h2>
+<h1> Optimizations</h1>
 
 <p>Profiling indicates that for most systems and in most circumstances
 SQLite spends most of its time doing disk I/O.  It follows then that
 anything we can do to reduce the amount of disk I/O will likely have a
 large positive impact on the performance of SQLite.  This section
 describes some of the techniques used by SQLite to try to reduce the
 amount of disk I/O to a minimum while still preserving atomic commit.</p>
 
 <tcl>hd_fragment keepcache</tcl>
-<h3>7.1 Cache Retained Between Transactions</h3>
+<h2> Cache Retained Between Transactions</h2>
 
 <p><a href="#section_3_12">Step 3.12</a> of the commit process shows
 that once the shared lock has been released, all user-space cache
 images of database content must be discarded.  This is done because
 without a shared lock, other processes are free to modify the database
 file content and so any user-space image of that content might become
 obsolete.  Consequently, each new transaction would begin by rereading

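How soon a large transaction spills, as described at the top of this hunk, depends on the size of the user-space page cache, which the documented cache_size pragma adjusts; a sketch (the 8000 KiB figure is arbitrary, not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: enlarge the page cache to delay cache spills; a
    ** negative value is interpreted as kibibytes of cache. */
    int grow_cache(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA cache_size=-8000;", 0, 0, 0);
    }
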
@@ -952,15 +949,15 @@
 which is incremented during every change operation.  SQLite saves a copy
 of this counter prior to releasing its database lock.  Then after
 acquiring the next database lock it compares the saved counter value
 against the current counter value and erases the cache if the values
 are different, or reuses the cache if they are the same.</p>
 
 <a name="section_7_2"></a>
-<h3>7.2 Exclusive Access Mode</h3>
+<h2> Exclusive Access Mode</h2>
 
 <p>SQLite version 3.3.14 adds the concept of "Exclusive Access Mode".
 In exclusive access mode, SQLite retains the exclusive
 database lock at the conclusion of each transaction.  This prevents
 other processes from accessing the database, but in many deployments
 only a single process is using a database so this is not a
 serious problem.  The advantage of exclusive access mode is that

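Exclusive access mode is selected with the documented locking_mode pragma; a sketch (not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: keep the database lock across transactions, per the
    ** exclusive access mode described in section 7.2. */
    int use_exclusive_mode(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA locking_mode=EXCLUSIVE;", 0, 0, 0);
    }
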
@@ -989,15 +986,15 @@
 deleting the rollback journal file,
 does not depend on holding an exclusive lock at all times.
 This optimization can be set independently of exclusive lock mode
 using the [journal_mode pragma]
 as described in <a href="#section_7_6">section 7.6</a> below.</p>
 
 <tcl>hd_fragment freelistjrnl</tcl>
-<h3>7.3 Do Not Journal Freelist Pages</h3>
+<h2> Do Not Journal Freelist Pages</h2>
 
 <p>When information is deleted from an SQLite database, the pages used
 to hold the deleted information are added to a "[freelist]".  Subsequent
 inserts will draw pages off of this freelist rather than expanding the
 database file.</p>
 
 <p>Some freelist pages contain critical data; specifically the locations

@@ -1014,15 +1011,15 @@
 Similarly, the content of a new freelist page is never written back
 into the database at <a href="#section_3_9">step 3.9</a> nor
 read from the database at <a href="#section_3_3">step 3.3</a>.
 These optimizations can greatly reduce the amount of I/O that occurs
 when making changes to a database file that contains free space.</p>
 
 <tcl>hd_fragment atomicsector</tcl>
-<h3>7.4 Single Page Updates And Atomic Sector Writes</h3>
+<h2> Single Page Updates And Atomic Sector Writes</h2>
 
 <p>Beginning in SQLite version 3.5.0, the new Virtual File System (VFS)
 interface contains a method named xDeviceCharacteristics which reports
 on special properties that the underlying mass storage device
 might have.  Among the special properties that
 xDeviceCharacteristics might report is the ability to do an
 atomic sector write.</p>

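A custom VFS reports such properties by returning SQLITE_IOCAP_* bits from the method named above; both bits in the sketch below are documented constants, though the surrounding VFS scaffolding is omitted here (an illustration, not part of the check-in):

    #include <sqlite3.h>

    /* Sketch: advertise 512-byte atomic writes and safe-append
    ** semantics for files opened through this VFS. */
    static int myDeviceCharacteristics(sqlite3_file *pFile){
      (void)pFile;  /* every file on this hypothetical device qualifies */
      return SQLITE_IOCAP_ATOMIC512 | SQLITE_IOCAP_SAFE_APPEND;
    }
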
@@ -1052,15 +1049,15 @@
 only touches a single database page, then SQLite skips the whole
 journaling and syncing process and simply writes the modified page
 directly into the database file.  The change counter in the first
 page of the database file is modified separately since no harm is
 done if power is lost before the change counter can be updated.</p>
 
 <tcl>hd_fragment safeappend</tcl>
-<h3>7.5 Filesystems With Safe Append Semantics</h3>
+<h2> Filesystems With Safe Append Semantics</h2>
 
 <p>Another optimization introduced in SQLite version 3.5.0 makes
 use of "safe append" behavior of the underlying disk.
 Recall that SQLite assumes that when data is appended to a file
 (specifically to the rollback journal) that the size of the file
 is increased first and that the content is written second.  So
 if power is lost after the file size is increased but before the

@@ -1080,15 +1077,15 @@
 occurs, we save a single flush operation and a sector write of
 the first page of the journal file.  Furthermore, when a cache
 spill occurs we no longer need to append a new journal header
 to the end of the journal; we can simply continue appending
 new pages to the end of the existing journal.</p>

 <a name="section_7_6"></a>
-<h3>7.6 Persistent Rollback Journals</h3>
+<h2> Persistent Rollback Journals</h2>

 <p>Deleting a file is an expensive operation on many systems.
 So as an optimization, SQLite can be configured to avoid the
 delete operation of <a href="#section_3_11">section 3.11</a>.
 Instead of deleting the journal file in order to commit a transaction,
 the file is either truncated to zero bytes in length or its
 header is overwritten with zeros.  Truncating the file to zero
@@ -1156,15 +1153,15 @@
 in slower behavior than PERSIST.  The commit operation is the same speed.
 But subsequent transactions are slower following a TRUNCATE because it is
 faster to overwrite existing content than to append to the end of a file.
 New journal file entries will always be appended following a TRUNCATE but
 will usually overwrite with PERSIST.</p>

 <tcl>hd_fragment  testing</tcl>
-<h2>8.0 Testing Atomic Commit Behavior</h2>
+<h1> Testing Atomic Commit Behavior</h1>

 <p>The developers of SQLite are confident that it is robust
 in the face of power failures and system crashes because the
 automatic test procedures do extensive checks on
 the ability of SQLite to recover from simulated power loss.
 We call these the "crash tests".</p>

@@ -1187,25 +1184,25 @@
 using only code inspection and analysis techniques.  From this
 experience, the developers of SQLite feel confident that any other
 database system that does not use a similar crash test system
 likely contains undetected bugs that will lead to database
 corruption following a system crash or power failure.</p>

 <tcl>hd_fragment {sect_9_0} {Things That Can Go Wrong}</tcl>
-<h2>9.0 Things That Can Go Wrong</h2>
+<h1> Things That Can Go Wrong</h1>

 <p>The atomic commit mechanism in SQLite has proven to be robust,
 but it can be circumvented by a sufficiently creative
 adversary or a sufficiently broken operating system implementation.
 This section describes a few of the ways in which an SQLite database
 might be corrupted by a power failure or system crash.
 (See also: [how to corrupt | How To Corrupt Your Database Files].)</p>

 <tcl>hd_fragment brokenlocks</tcl>
-<h3>9.1 Broken Locking Implementations</h3>
+<h2> Broken Locking Implementations</h2>

 <p>SQLite uses filesystem locks to make sure that only one
 process and database connection is trying to modify the database
 at a time.  The filesystem locking mechanism is implemented
 in the VFS layer and is different for every operating system.
 SQLite depends on this implementation being correct.  If something
 goes wrong and two or more processes are able to write the same
@@ -1232,15 +1229,15 @@
 mechanisms do not exclude one another, so if one process is
 accessing a file using (for example) AFP locking and another
 process (perhaps on a different machine) is using dot-file locks,
 the two processes might collide because AFP locks do not exclude
 dot-file locks or vice versa.</p>

 <tcl>hd_fragment fsync</tcl>
-<h3>9.2 Incomplete Disk Flushes</h3>
+<h2> Incomplete Disk Flushes</h2>

 <p>SQLite uses the fsync() system call on Unix and the FlushFileBuffers()
 system call on w32 in order to sync the file system buffers onto disk
 oxide as shown in <a href="#section_3_7">step 3.7</a> and
 <a href="#section_3_10">step 3.10</a>.  Unfortunately, we have received
 reports that neither of these interfaces works as advertised on many
 systems.  We hear that FlushFileBuffers() can be completely disabled
@@ -1260,36 +1257,36 @@
 <p>Setting fullfsync on a Mac will guarantee that data really does
 get pushed out to the disk platter on a flush.  But the implementation
 of fullfsync involves resetting the disk controller.  And so not only
 is it profoundly slow, it also slows down other unrelated disk I/O.
 So its use is not recommended.</p>

 <tcl>hd_fragment filedel</tcl>
-<h3>9.3 Partial File Deletions</h3>
+<h2> Partial File Deletions</h2>

 <p>SQLite assumes that file deletion is an atomic operation from the
 point of view of a user process.  If power fails in the middle of
 a file deletion, then after power is restored SQLite expects to see
 either the entire file with all of its original data intact, or it
 expects not to find the file at all.  Transactions may not be atomic
 on systems that do not work this way.</p>

 <tcl>hd_fragment filegarbage</tcl>
-<h3>9.4 Garbage Written Into Files</h3>
+<h2> Garbage Written Into Files</h2>

 <p>SQLite database files are ordinary disk files that can be
 opened and written by ordinary user processes.  A rogue process
 can open an SQLite database and fill it with corrupt data.  
 Corrupt data might also be introduced into an SQLite database
 by bugs in the operating system or disk controller; especially
 bugs triggered by a power failure.  There is nothing SQLite can
 do to defend against these kinds of problems.</p>

 <tcl>hd_fragment mvhotjrnl</tcl>
-<h3>9.5 Deleting Or Renaming A Hot Journal</h3>
+<h2> Deleting Or Renaming A Hot Journal</h2>

 <p>If a crash or power loss does occur and a hot journal is left on
 the disk, it is essential that the original database file and the hot
 journal remain on disk with their original names until the database
 file is opened by another SQLite process and rolled back.  
 During recovery at <a href="#section_4_2">step 4.2</a> SQLite locates
 the hot journal by looking for a file in the same directory as the
@@ -1326,15 +1323,15 @@
 that SQLite can do to prevent it.  If you are running on a system that
 is vulnerable to this kind of filesystem namespace corruption (most
 modern journalling filesystems are immune, we believe) then you might
 want to consider putting each SQLite database file in its own private
 subdirectory.</p>

 <tcl>hd_fragment future</tcl>
-<h2>10.0 Future Directions And Conclusion</h2>
+<h1>10.0 Future Directions And Conclusion</h1>

 <p>Every now and then someone discovers a new failure mode for
 the atomic commit mechanism in SQLite and the developers have to
 put in a patch.  This is happening less and less and the
 failure modes are becoming more and more obscure.  But it would
 still be foolish to suppose that the atomic commit logic of
 SQLite is entirely bug-free.  The developers are committed to fixing
Changes to pages/cintro.in.
@@ -1,17 +1,10 @@
 <title>An Introduction To The SQLite C/C++ Interface</title>
+<table_of_contents>
 <tcl>
-proc CODE {text} {
-  hd_puts "<blockquote><pre>"
-  hd_puts $text
-  hd_puts "</pre></blockquote>"
-}
-proc PARAGRAPH {text} {
-  hd_resolve <p>$text</p>\n
-}
 set level(0) 0
 set level(1) 0
 proc HEADING {n name {tag {}}} {
   if {$tag!=""} {
     hd_fragment $tag
   }
   global level
@@ -31,17 +24,17 @@
     }
   }
   incr n 1
   hd_puts "<h$n>$num $name</h$n>"
 }

 hd_keywords {*cintro}
-HEADING 0 {An Introduction To The SQLite C/C++ Interface}
-HEADING 1 {Executive Summary}
 </tcl>

+<h1>Summary</h1>
+
 <p>The following two objects and eight methods comprise the essential
 elements of the SQLite interface:

 <table border="0" cellpadding="0">
 <tr>
 <td valign="top"><b>[sqlite3]</b></td><td>&nbsp;&nbsp;&nbsp;</td>
@@ -98,15 +91,15 @@
 <td valign="top"><b>[sqlite3_exec()]</td><td>
 <td>A wrapper function that does [sqlite3_prepare()], [sqlite3_step()],
 [sqlite3_column_int|sqlite3_column()], and [sqlite3_finalize()] for
 a string of one or more SQL statements.</td>
 </tr>
 </table>

-<tcl>HEADING 1 {Introduction}</tcl>
+<h1>Introduction</h1>

 <p>
   SQLite currently has over 200 distinct APIs.
   This can be overwhelming to a new programmer.
   Fortunately, most of the interfaces are very specialized
   and need not be considered by beginners.
   The core API is small, simple, and easy to learn.
@@ -120,15 +113,15 @@
   the reader
   understands the basic principles of operation for SQLite, 
   [capi3ref | that document] should be used as a reference
   guide.  This article is intended as introduction only and is neither a
   complete nor authoritative reference for the SQLite API.
 </p>

-<tcl>HEADING 1 {Core Objects And Interfaces}</tcl>
+<h1>Core Objects And Interfaces</h1>

 <p>
   The principal task of an SQL database engine is to evaluate statements
   of SQL.  To accomplish this, the developer needs two objects:
 </p>

 <p><ul>
@@ -265,15 +258,15 @@
   to [sqlite3_open()].  All [prepared statements] associated with the
   connection should be [sqlite3_finalize | finalized] prior to closing the
   connection.
 </td>

 </table>

-<tcl>HEADING 2 {Typical Usage Of Core Routines And Objects}</tcl>
+<h1>Typical Usage Of Core Routines And Objects</h1>

 <p>
   An application will typically use
   [sqlite3_open()] to create a single [database connection]
   during initialization.
   Note that [sqlite3_open()] can be used to either open existing database
   files or to create and open new database files.
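(Illustrative aside, not part of this check-in: the open/prepare/step/finalize lifecycle this section describes might be sketched in C roughly as follows; the filename and query are placeholders and error checking is omitted.)

<blockquote><pre>
#include "sqlite3.h"

/* A minimal sketch of the typical lifecycle: open a connection,
** compile a statement, step through result rows, then clean up. */
int demo(void){
  sqlite3 *db;
  sqlite3_stmt *pStmt;
  sqlite3_open("test.db", &db);                        /* open or create */
  sqlite3_prepare_v2(db, "SELECT 1;", -1, &pStmt, 0);  /* compile SQL */
  while( sqlite3_step(pStmt)==SQLITE_ROW ){            /* run row by row */
    int v = sqlite3_column_int(pStmt, 0);              /* read column 0 */
    (void)v;                                           /* use the value */
  }
  sqlite3_finalize(pStmt);                             /* destroy statement */
  sqlite3_close(db);                                   /* close connection */
  return 0;
}
</pre></blockquote>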
@@ -311,35 +304,35 @@
 </ol></p>

 <p>
   The foregoing is all one really needs to know in order to use SQLite
   effectively.  All the rest is optimization and detail.
 </p>

-<tcl>HEADING 1 {Convenience Wrappers Around Core Routines}</tcl>
+<h1>Convenience Wrappers Around Core Routines</h1>

 <p>
   The [sqlite3_exec()] interface is a convenience wrapper that carries out
   all four of the above steps with a single function call.  A callback
   function passed into [sqlite3_exec()] is used to process each row of
   the result set.  The [sqlite3_get_table()] is another convenience wrapper
   that does all four of the above steps.  The [sqlite3_get_table()] interface
   differs from [sqlite3_exec()] in that it stores the results of queries
   in heap memory rather than invoking a callback.
 </p>

 <p>
   It is important to realize that neither [sqlite3_exec()] nor
   [sqlite3_get_table()] do anything that cannot be accomplished using
   the core routines.  In fact, these wrappers are implemented purely in
   terms of the core routines.
 </p>


-<tcl>HEADING 1 {Binding Parameters and Reusing Prepared Statements}</tcl>
+<h1>Binding Parameters and Reusing Prepared Statements</h1>

 <p>
   In prior discussion, it was assumed that each SQL statement is prepared
   once, evaluated, then destroyed.  However, SQLite allows the same
   [prepared statement] to be evaluated multiple times.  This is accomplished
   using the following routines:
 </p>
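(Illustrative aside, not part of this check-in: with sqlite3_reset() and the sqlite3_bind_*() family, the reuse pattern described above might look like the sketch below; the connection db, the table kv(k,v), and the aKey/aVal arrays are hypothetical, and error checking is omitted.)

<blockquote><pre>
/* Reuse one prepared statement for many INSERTs by binding new
** values and resetting between runs. */
sqlite3_stmt *pStmt;
int i;
sqlite3_prepare_v2(db, "INSERT INTO kv(k,v) VALUES(?1,?2)", -1, &pStmt, 0);
for(i=0; i&lt;nPair; i++){
  sqlite3_bind_text(pStmt, 1, aKey[i], -1, SQLITE_STATIC);  /* bind ?1 */
  sqlite3_bind_text(pStmt, 2, aVal[i], -1, SQLITE_STATIC);  /* bind ?2 */
  sqlite3_step(pStmt);     /* run the INSERT */
  sqlite3_reset(pStmt);    /* make the statement runnable again */
}
sqlite3_finalize(pStmt);   /* destroy when done */
</pre></blockquote>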
@@ -412,15 +405,15 @@
   create all of the [prepared statements] they will ever need.  Other
   applications keep a cache of the most recently used [prepared statements]
   and then reuse [prepared statements] out of the cache when available.
   Another approach is to only reuse [prepared statements] when they are
   inside of a loop.
 </p>

-<tcl>HEADING 1 {Configuring SQLite}</tcl>
+<h1>Configuring SQLite</h1>

 <p>
   The default configuration for SQLite works great for most applications.
   But sometimes developers want to tweak the setup to try to squeeze out
   a little more performance, or take advantage of some obscure feature.
 <p>
   The [sqlite3_config()] interface is used to make global, process-wide
@@ -438,15 +431,15 @@
     application-defined mutex system.
 </ul> 
 <p>
   After process-wide configuration is complete and [database connections]
   have been created, individual database connections can be configured using
   calls to [sqlite3_limit()] and [sqlite3_db_config()].

-<tcl>HEADING 1 {Extending SQLite}</tcl>
+<h1>Extending SQLite</h1>

 <p>
   SQLite includes interfaces that can be used to extend its functionality.
   Such routines include:
 </p>

 <p><ul>
@@ -485,20 +478,20 @@
   [http://www.sqlite.org/src/doc/trunk/src/func.c | func.c] source files
   for examples.
 </p>

 <p>
   Shared libraries or DLLs can be used as [loadable extensions] to SQLite.

-<tcl>HEADING 1 {Other Interfaces}</tcl>
+<h1>Other Interfaces</h1>

 <p>
   This article only mentions the most important and most commonly
   used SQLite interfaces.
   The SQLite library includes many other APIs implementing useful
   features that are not described here.  
   A [capi3ref_funclist | complete list of functions] that form the SQLite
   application programming interface is found at the
   [capi3ref | C/C++ Interface Specification].
   Refer to that document for complete and authoritative information about
   all SQLite interfaces.
 </p>
Changes to pages/compile.in.
@@ -1,11 +1,13 @@
 <title>Compilation Options For SQLite</title>
 <tcl>hd_keywords {compile-time options}</tcl>

-<h1>1.0 Compilation Options For SQLite</h1>
+<table_of_contents>
+
+<h1>Overview</h1>

 <p>
 For most purposes, SQLite can be built just fine using the default
 compilation options. However, if required, the compile-time options
 documented below can be used to 
 <a href="#omitfeatures">omit SQLite features</a> (resulting in
 a [relfootprint | smaller compiled library size]) or to change the
@@ -33,15 +35,15 @@
   hd_puts <p><b>$name</b></p>
   regsub -all "\n\\s*\n" $text "</p>\n\n<p>" text
   hd_resolve <blockquote><p>$text</p></blockquote>
 }
 </tcl>

 <a name="osconfig"></a>
-<h2>1.1 Platform Configuration</h2>
+<h1> Platform Configuration</h1>

 <tcl>
 COMPILE_OPTION {_HAVE_SQLITE_CONFIG_H} {
   If the _HAVE_SQLITE_CONFIG_H macro is defined
   then the SQLite source code will attempt to #include a file named "config.h".
   The "config.h" file usually contains other configuration options, especially
   "HAVE_<i>INTERFACE</i>" type options generated by autoconf scripts.
@@ -120,15 +122,15 @@
   If the HAVE_UTIME option is true, then the built-in but non-standard
   "unix-dotfile" VFS will use the utime() system call, instead of utimes(),
   to set the last access time on the lock file.
 }
 </tcl>

 <a name="defaults"></a>
-<h2>1.2 Options To Set Default Parameter Values</h2>
+<h1> Options To Set Default Parameter Values</h1>

 <tcl>

 COMPILE_OPTION {SQLITE_DEFAULT_AUTOMATIC_INDEX=<i>&lt;0 or 1&gt;</i>} {
   This macro determines the initial setting for [PRAGMA automatic_index]
   for newly opened [database connections].
   For all versions of SQLite through 3.7.17,
@@ -413,15 +415,15 @@
   Developers whose applications contain SQL statements that 
   need more than 100 LALR(1) stack entries should seriously
   consider refactoring their SQL as it is likely to be well beyond
   the ability of any human to comprehend.
 }
 </tcl>

-<h2>1.3 Options To Set Size Limits</h2>
+<h1> Options To Set Size Limits</h1>

 <p>There are compile-time options that will set upper bounds
 on the sizes of various structures in SQLite.  The compile-time
 options normally set a hard upper bound that can be changed
 at run-time on individual [database connections] using the
 [sqlite3_limit()] interface.</p>

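(Illustrative aside, not part of this check-in: lowering one such limit at run-time, assuming an open database connection db.)

<blockquote><pre>
/* Cap string and BLOB sizes for this connection only.  The new value
** may be lowered at will but can never exceed the compile-time
** SQLITE_MAX_LENGTH hard upper bound. */
sqlite3_limit(db, SQLITE_LIMIT_LENGTH, 1000000);
</pre></blockquote>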
@@ -439,15 +441,15 @@
 <li> [SQLITE_MAX_LIKE_PATTERN_LENGTH]  </li>
 <li> [SQLITE_MAX_PAGE_COUNT]  </li>
 <li> [SQLITE_MAX_SQL_LENGTH]  </li>
 <li> [SQLITE_MAX_VARIABLE_NUMBER]  </li>
 </ul>

 <a name="controlfeatures"></a>
-<h2>1.4 Options To Control Operating Characteristics</h2>
+<h1> Options To Control Operating Characteristics</h1>

 <tcl>
 COMPILE_OPTION {SQLITE_4_BYTE_ALIGNED_MALLOC} {
   On most systems, the malloc() system call returns a buffer that is
   aligned to an 8-byte boundary.  But on some systems (ex: windows) malloc()
   returns a 4-byte aligned pointer.  This compile-time option must be used
   on systems that return 4-byte aligned pointers from malloc().
@@ -572,15 +574,15 @@
   This option causes the [URI filename] process logic to be enabled by 
   default.  
 }

 </tcl>

 <a name="enablefeatures"></a>
-<h2>1.5 Options To Enable Features Normally Turned Off</h2>
+<h1> Options To Enable Features Normally Turned Off</h1>

 <tcl>
 COMPILE_OPTION {SQLITE_ALLOW_URI_AUTHORITY} {
   [URI filenames] normally throw an error if the authority section is
   not either empty or "localhost".  However, if SQLite is compiled with
   the SQLITE_ALLOW_URI_AUTHORITY compile-time option, then the URI is
   converted into a Uniform Naming Convention (UNC) filename and passed
@@ -948,15 +950,15 @@
   (determined at compile-time using the [YYSTACKDEPTH] options).
   This option can be used to help determine if an application is
   getting close to exceeding the maximum LALR(1) stack depth.
 }
 </tcl>

 <a name="disablefeatures"></a>
-<h2>1.6 Options To Disable Features Normally Turned On</h2>
+<h1> Options To Disable Features Normally Turned On</h1>

 <tcl>
 COMPILE_OPTION {SQLITE_DISABLE_LFS} {
   If this C-preprocessor macro is defined, large file support
   is disabled.
 }

@@ -982,15 +984,15 @@
 }
 </tcl>

 <tcl>
   hd_fragment "omitfeatures"
   hd_keywords "omitfeatures"
 </tcl>
-<h2>1.7 Options To Omit Features</h2>
+<h1> Options To Omit Features</h1>

 <p>The following options can be used to 
 [relfootprint | reduce the size of the compiled library]
 by omitting unused features. This is probably only useful
 in embedded systems where space is especially tight, as even with all
 features included the SQLite library is relatively small. Don't forget
 to tell your compiler to optimize for binary size! (the -Os option if
@@ -1456,15 +1458,15 @@
   So the net effect of this compile-time option is that it allows SQLite
   to be compiled and linked against a system library that does not support
   malloc(), free(), and/or realloc().
 }

 </tcl>
 <a name="debugoptions"></a>
-<h2>1.8 Analysis and Debugging Options</h2>
+<h1> Analysis and Debugging Options</h1>
 <tcl>

 COMPILE_OPTION {SQLITE_DEBUG} {
   The SQLite source code contains literally thousands of assert() statements
   used to verify internal assumptions and subroutine preconditions and
   postconditions.  These assert() statements are normally turned off
   (they generate no code) since turning them on makes SQLite run approximately
@@ -1484,31 +1486,31 @@
   writing off the ends of a memory allocation, freeing memory not previously
   obtained from the memory allocator, or failing to initialize newly
   allocated memory.
 }

 </tcl>
 <a name="win32options"></a>
-<h2>1.9 Windows-Specific Options</h2>
+<h1> Windows-Specific Options</h1>
 <tcl>

 COMPILE_OPTION {SQLITE_WIN32_HEAP_CREATE} {
   This option forces the Win32 native memory allocator, when enabled, to
   create a private heap to hold all memory allocations.
 }

 COMPILE_OPTION {SQLITE_WIN32_MALLOC_VALIDATE} {
   This option forces the Win32 native memory allocator, when enabled, to
   make strategic calls into the HeapValidate() function if assert() is also
   enabled.
 }

 </tcl>
 <a name="linkage"></a>
-<h2>1.10 Compiler Linkage Control</h2>
+<h1>Compiler Linkage Control</h1>

 <p>The following macros specify
 interface linkage for certain kinds of SQLite builds.  The Makefiles will normally
 handle setting these macros automatically.  Application developers should
 not need to worry about these macros.  The following documentation about these 
 macros is included for completeness.</p>

Changes to pages/dbstat.in.
@@ -1,25 +1,25 @@
 <title>The DBSTAT Virtual Table</title>
 <tcl>hd_keywords dbstat {dbstat virtual table}</tcl>
 <fancy_format>

+<h1>Overview</h1>
+
 <p>
 The DBSTAT virtual table is a read-only [eponymous virtual table] that returns
 information about which pages of the database files are used by which
 tables and indexes in the schema.
 The DBSTAT virtual table is used to implement the [sqlite3_analyzer.exe]
 utility program, and to help compute the 
 [https://www.sqlite.org/src/repo-tabsize|table size pie-chart] in
 the [https://www.fossil-scm.org/|Fossil-implemented] version control system
 for SQLite.
 </p>

-<h1>Overview</h1>
-
 <p>
 ^The <b>dbstat</b> virtual table is available on all 
 [database connections] when SQLite is built using the
 [SQLITE_ENABLE_DBSTAT_VTAB] compile-time option.
 ^The dbstat virtual table provides low-level information 
 about btree and overflow pages in a database file.
Changes to pages/errlog.in.
@@ -1,21 +1,22 @@
 <title>The Error And Warning Log</title>
 <tcl>hd_keywords {errlog} {error log}</tcl>

-<h1 align="center">The Error And Warning Log</h1>
+<table_of_contents>
+<h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> 

 <p>SQLite can be configured to invoke a callback function containing
 an error code and a terse error message whenever anomalies occur.
 This mechanism is very helpful in tracking obscure problems that
 occur rarely and in the field.  Application developers are encouraged
 to take advantage of the error logging facility of SQLite in their
 products, as it has very low CPU and memory cost but can be a
 huge aid for debugging.</p>

-<h2>Setting Up The Error Logging Callback</h2>
+<h1>Setting Up The Error Logging Callback</h1>

 <p>There can only be a single error logging callback per process.
 The error logging callback is registered at start-time using C-code
 similar to the following:

 <blockquote><pre>
 [sqlite3_config]([SQLITE_CONFIG_LOG], errorLogCallback, pData);
@@ -41,15 +42,15 @@
 <p>Do not misunderstand: There is nothing technically wrong with displaying 
 the error logger messages to end users.  The messages do not contain
 sensitive or private information that must be protected from unauthorized
 viewing.  Rather the messages are technical in nature and are not useful
 or meaningful to the typical end user.  The messages coming from the
 error logger are intended for database geeks.  Display them accordingly.</p>

-<h2>Interface Details</h2>
+<h1>Interface Details</h1>

 <p>The third argument to the [sqlite3_config]([SQLITE_CONFIG_LOG],...) 
 interface (the "pData" argument in the example above) is a pointer to arbitrary
 data.  SQLite passes this pointer through to the first argument of the
 error logger callback.  The pointer can be used to pass application-specific 
 setup or state information, if desired.  Or it can simply be a NULL 
 pointer which is ignored by the callback.</p>
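(Illustrative aside, not part of this check-in: one plausible shape for the callback named in the registration example above; the stderr destination is an application choice.)

<blockquote><pre>
#include <stdio.h>
#include "sqlite3.h"

/* A minimal error logger.  pData is the pointer given to
** sqlite3_config(); iErrCode is the error code; zMsg is the text.
** Keep it simple: no memory allocation and no SQLite calls here. */
static void errorLogCallback(void *pData, int iErrCode, const char *zMsg){
  fprintf(stderr, "(%d) %s\n", iErrCode, zMsg);
}
</pre></blockquote>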
@@ -71,15 +72,15 @@
 to try to allocate memory inside the error logger.  Do not even think
 about trying to store the error message in another SQLite database.</p>

 <p>Applications can use the [sqlite3_log(E,F,..)] API to send new messages
 to the log, if desired, but this is discouraged.  The [sqlite3_log()]
 interface is intended for use by extensions only, not by applications.</p>

-<h2>Variety of Error Messages</h2>
+<h1>Variety of Error Messages</h1>

 <p>The error messages that might be sent to the error logger and their
 exact format are subject to change from one release to the next.  So
 applications should not depend on any particular error message text formats or
 error codes.  Things do not change capriciously, but they do sometimes
 change.</p>

@@ -142,21 +143,21 @@
 to the error logger when there really is something wrong.  Applications
 might further cull the error message traffic 
 by deliberately ignoring certain classes of error
 messages that they do not care about.  For example, an application that
 makes frequent database schema changes might want to ignore all
 SQLITE_SCHEMA errors.</p>

-<h2>Summary</h2>
+<h1>Summary</h1>

 <p>The use of the error logger callback is highly recommended.
 The debugging information that the error logger provides has proven
 very useful in tracking down obscure problems that occur with applications
 after they get into the field.  The error logger callback has also 
 proven useful in catching occasional errors that the application
 misses because of inconsistent checking of API return codes.
 Developers are encouraged to implement an error logger callback early
 in the development cycle in order to spot unexpected behavior quickly,
 and to leave the error logger callback turned on through deployment.
 If the error logger never finds a problem, then no harm is done.  
 But failure to set up an appropriate error logger might compromise
 diagnostic capabilities later on.</p>
Changes to pages/fts5.in.
@@ -1,15 +1,13 @@

 <tcl>hd_keywords *fts5 FTS5</tcl>
 <title>SQLite FTS5 Extension</title>

 <table_of_contents>

-<h2 style="margin-left:1.0em" notoc> Overview</h2> 
-
 <h1>Overview of FTS5</h1>

 <p>FTS5 is an SQLite [virtual table module] that provides 
 <a href=http://en.wikipedia.org/wiki/Full_text_search>full-text search</a>
 functionality to database applications. In their most elementary form, 
 full-text search engines allow the user to efficiently search a large 
 collection of documents for the subset that contain one or more instances of a
Changes to pages/howtocompile.in.
@@ -1,30 +1,31 @@
           <title>How To Compile SQLite</title>
 <title>How To Compile SQLite</title>
 <tcl>hd_keywords {how to compile} {How To Compile SQLite}</tcl>

-<h1 align=center>How To Compile SQLite</h1>
+<table_of_contents>
+<h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> 

 <p>
 SQLite is ANSI-C source code.
 It must be compiled into machine code before it is useful.
 This article is a guide to the various ways of compiling SQLite.
 </p>

 <p>This article does not contain a step-by-step recipe for compiling
 SQLite.  That would be difficult since each development situation
 is different.
 Rather, this article describes and illustrates the principles behind the
 compilation of SQLite.  Typical compilation commands are provided as examples
 with the expectation that application developers can use these examples
 as guidance for developing their own custom compilation procedures.
 In other words, this article provides ideas and insights, not turnkey
 solutions.</p>

-<h2>Amalgamation Versus Individual Source Files</h2> 
+<h1>Amalgamation Versus Individual Source Files</h1> 

 <p>SQLite is built from over one hundred files of C code and script
 spread across multiple directories.  The implementation of SQLite is pure
 ANSI-C, but many of the C-language source code files are either
 generated or transformed by auxiliary C programs and AWK, SED, and TCL 
 scripts prior to being incorporated into the finished SQLite library.
 Building the necessary C programs and transforming and/or creating the
@@ -54,15 +55,15 @@
 For those situations, it is recommended that a customized amalgamation be
 built (as described [building the amalgamation | below])
 and used.  In other words, even if a project requires building SQLite 
 beginning with individual source files, it is still recommended that an
 amalgamation source file be used as an intermediate step.</p>

 <tcl>hd_fragment {cli} {compiling the CLI}</tcl>
-<h2>Compiling The Command-Line Interface</h2>
+<h1>Compiling The Command-Line Interface</h1>

 <p>A build of the [CLI | command-line interface] requires three source
 files:</p>

 <ul>
 <li><b>sqlite3.c</b>: The SQLite amalgamation source file
 <li><b>sqlite3.h</b>: The header file that accompanies sqlite3.c and 
@@ -135,15 +136,15 @@
 <p>The key point is this:  Building the CLI consists of compiling 
 together two C-language files.   The <b>shell.c</b> file contains the
 definition of the entry point and the user input loop and the
 SQLite amalgamation <b>sqlite3.c</b> contains the complete implementation
 of the SQLite library.</p>

 <tcl>hd_fragment {tcl} {compiling the TCL interface}</tcl>
-<h2>Compiling The TCL Interface</h2>
+<h1>Compiling The TCL Interface</h1>

 <p>The TCL interface for SQLite is a small module that is added into
 the regular amalgamation.  The result is a new amalgamated source
 file called "<b>tclsqlite3.c</b>".  This single source file is all that
 is needed to generate a shared library that can be loaded into a
 standard 
 [http://wiki.tcl-lang.org/2541 | tclsh] or 
@@ -175,15 +176,15 @@
 SQLite includes a <b>main()</b> procedure that initializes a TCL interpreter
 and enters a command-line loop when it is compiled with -DTCLSH=1.  The
 command above works on both Linux and Mac OS X, though one may need to adjust
 the library options depending on the platform and which version of TCL one
 is linking against.</p>

 <tcl>hd_fragment {amal} {building the amalgamation}</tcl>
-<h2>Building The Amalgamation</h2>
+<h1>Building The Amalgamation</h1>

 <p>The versions of the SQLite amalgamation that are supplied on the
 [download page] are normally adequate for most users.  However, some
 projects may want or need to build their own amalgamations.  A common
 reason for building a custom amalgamation is in order to use certain
 [compile-time options] to customize the SQLite library.  Recall that
 the SQLite amalgamation contains a lot of C-code that is generated by
@@ -244,15 +245,15 @@
 "<b>sqlite3.c</b>" amalgamation source file, its header file
 "<b>sqlite3.h</b>", and the "<b>tclsqlite3.c</b>" amalgamation source
 file that includes the TCL interface.
 Afterwards, the needed files can be copied into project directories and
 compiled according to the procedures outlined above.</p>

 <tcl>hd_fragment {dll} {building a DLL}</tcl>
-<h2>Building A Windows DLL</h2>
+<h1>Building A Windows DLL</h1>

 <p>To build a DLL of SQLite for use in Windows, first acquire the
 appropriate amalgamated source code files, sqlite3.c and sqlite3.h.
 These can either
 be downloaded from the [http://www.sqlite.org/download.html | SQLite website]
 or custom generated from sources as shown above.</p>
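A sketch of consuming such a DLL at runtime rather than import-linking
against it (assumes sqlite3.dll is on the DLL search path; error handling
trimmed):

    /* loaddll.c -- load sqlite3.dll at runtime and query its version. */
    #include <windows.h>
    #include <stdio.h>

    typedef const char *(*libversion_t)(void);

    int main(void){
      HMODULE h = LoadLibraryA("sqlite3.dll");
      if( h==NULL ) return 1;
      libversion_t ver = (libversion_t)GetProcAddress(h, "sqlite3_libversion");
      if( ver ) printf("DLL reports SQLite %s\n", ver());
      FreeLibrary(h);
      return 0;
    }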
Changes to pages/howtocorrupt.in.
@@ -1,33 +1,34 @@
 <title>How To Corrupt An SQLite Database File</title>
 <tcl>hd_keywords {how to corrupt}</tcl>

-<h1 align=center>How To Corrupt An SQLite Database File</h1>
+<table_of_contents>
+<h2 style="margin-left:1.0em" notoc id=overview> Overview</h2>

 <p>An SQLite database is highly resistant to corruption.
 If an application crash, or an operating-system crash, or even
 a power failure occurs in the middle of a transaction, the partially
 written transaction should be automatically rolled back the next time
 the database file is accessed.  The recovery process is fully
 automatic and does not require any action on the part of the user
 or the application.
 </p>

 <p>Though SQLite is resistant to database corruption, it is not immune.
 This document describes the various ways that an SQLite database might
 go corrupt.</p>

-<h2>1.0 File overwrite by a rogue thread or process</h2>
+<h1> File overwrite by a rogue thread or process</h1>

 <p>SQLite database files are ordinary disk files.
 That means that any process can open the file and
 overwrite it with garbage.  There is nothing that the SQLite
 library can do to defend against this.</p>

-<h3>1.1 Continuing to use a file descriptor after it has been closed</h3>
+<h2> Continuing to use a file descriptor after it has been closed</h2>

 <p>We have seen multiple cases where a file descriptor was open on a file,
 then that file descriptor was closed and reopened on an SQLite database.
 Later, some other thread continued to write into the
 old file descriptor, not realizing that the original file had been closed
 already.  But because the file descriptor had been reopened by SQLite,
 the information that was intended to go into the original file ended up
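
A hypothetical sketch of the descriptor-reuse bug described above (POSIX;
error checking omitted; the file names are invented):

    #include <fcntl.h>
    #include <unistd.h>
    #include "sqlite3.h"

    void buggy(void){
      int fd = open("app.log", O_WRONLY|O_CREAT, 0644); /* suppose: fd==5 */
      close(fd);                         /* descriptor 5 is now free      */
      sqlite3 *db;
      sqlite3_open("app.db", &db);       /* may be handed fd 5 internally */
      write(fd, "log entry\n", 10);      /* BUG: stale number now points  */
    }                                    /* at the database file          */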
@@ -48,68 +49,68 @@
 for database files.
 (See [SQLITE_MINIMUM_FILE_DESCRIPTOR] for additional information.)</p>

 <p>Another example of corruption caused by using a closed file
 descriptor was
 [https://code.facebook.com/posts/313033472212144/debugging-file-corruption-on-ios/|reported by facebook engineers] in a blog post on 2014-08-12.</p>

-<h3>1.2 Backup or restore while a transaction is active</h3>
+<h2> Backup or restore while a transaction is active</h2>

 <p>Systems that run automatic backups in the background might try to
 make a backup copy of an SQLite database file while it is in the middle
 of a transaction.  The backup copy then might contain some old and some
 new content, and thus be corrupt.</p>

 <p>The best approach for making reliable backup copies of an SQLite database
 is to make use of the [backup API] that is part of the SQLite library.
 Failing that, it is safe to make a copy of an SQLite database file as long
 as there are no transactions in progress by any process.  If the previous
 transaction failed, then it is important that any rollback journal
 (the <tt>*-journal</tt> file) or write-ahead log (the <tt>*-wal</tt> file)
 be copied together with the database file itself.</p>

 <tcl>hd_fragment delhotjrnl {deleting a hot journal}</tcl>
-<h3>1.3 Deleting a hot journal</h3>
+<h2> Deleting a hot journal</h2>

 <p>SQLite normally stores all content in a single disk file.  However,
 while performing a transaction, information necessary to roll back that
 transaction following a crash or power failure is stored in auxiliary
 journal files.  These journal files have the same name as the
 original database file with the addition
 of a <tt>-journal</tt> or <tt>-wal</tt> suffix.</p>

 <p>SQLite must see the journal files in order to recover from a crash
 or power failure.  If the journal files are moved, deleted, or renamed
 after a crash or power failure, then automatic recovery will not work
 and the database may go corrupt.</p>

 <p>Another manifestation of this problem is
 [database corruption caused by inconsistent use of 8+3 filenames].</p>

-<h2>2.0 File locking problems</h2>
+<h1> File locking problems</h1>

 <p>SQLite uses file locks on the database file, and on the
 [write-ahead log] or [WAL] file, to coordinate access between concurrent
 processes.  Without coordination, two threads or processes might try
 to make incompatible changes to a database file at the same time,
 resulting in database corruption.</p>

-<h3>2.1 Filesystems with broken or missing lock implementations</h3>
+<h2> Filesystems with broken or missing lock implementations</h2>

 <p>SQLite depends on the underlying filesystem to do locking as the
 documentation says it will.  But some filesystems contain bugs in their
 locking logic such that the locks do not always behave as advertised.
 This is especially true of network filesystems and NFS in particular.
 If SQLite is used on a filesystem where the locking primitives contain
 bugs, and if two or more threads or processes try to access the same
 database at the same time, then database corruption might result.</p>

 <tcl>hd_fragment posix_close_bug</tcl>
-<h3>2.2 Posix advisory locks canceled by a separate thread doing close()</h3>
+<h2> Posix advisory locks canceled by a separate thread doing close()</h2>

 <p>The default locking mechanism used by SQLite on unix platforms is
 POSIX advisory locking.  Unfortunately, POSIX advisory locking has design
 quirks that make it prone to misuse and failure. In particular, any
 thread in the same process with a file descriptor that is holding a POSIX
 advisory lock can override that lock using a different file descriptor.
 One particularly pernicious problem is that the <tt>close()</tt> system
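
A minimal sketch of the [backup API] usage recommended above (error
handling trimmed; "backup.db" is an illustrative destination name):

    #include "sqlite3.h"

    /* Copy a live database into backup.db using the online backup API,
    ** which remains safe even while other connections are writing. */
    int backup_db(sqlite3 *src){
      sqlite3 *dst;
      int rc = sqlite3_open("backup.db", &dst);
      if( rc==SQLITE_OK ){
        sqlite3_backup *b = sqlite3_backup_init(dst, "main", src, "main");
        if( b ){
          sqlite3_backup_step(b, -1);   /* copy every page in one pass */
          sqlite3_backup_finish(b);
        }
        rc = sqlite3_errcode(dst);
      }
      sqlite3_close(dst);
      return rc;
    }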
@@ -134,15 +135,15 @@

 <p>Note that it is perfectly safe for two or more threads to access the
 same SQLite database file using the SQLite library.  The unix drivers for
 SQLite know about the POSIX advisory locking quirks and work around them.
 This problem only arises when a thread tries to bypass the SQLite library
 and read the database file directly.</p>

-<h4>2.2.1 Multiple copies of SQLite linked into the same application</h4>
+<h3>Multiple copies of SQLite linked into the same application</h3>

 <p>As pointed out in the previous paragraph, SQLite takes steps to work
 around the quirks of POSIX advisory locking.  Part of that work-around involves
 keeping a global list (mutex protected) of open SQLite database files.
 But, if multiple copies of SQLite are linked into the same application,
 then there will be multiple instances of this global list.
 Database connections opened using one copy of the SQLite library
@@ -158,33 +159,33 @@
 with exactly this bug.  The vendor came to the SQLite developers seeking
 help in tracking down some infrequent database corruption issues they were
 seeing on Linux and Mac.  The problem was eventually traced to the
 fact that the application was linking against two separate copies of SQLite.
 The solution was to change the application build procedures to link against
 just one copy of SQLite instead of two.</p>

-<h3>2.3 Two processes using different locking protocols</h3>
+<h2> Two processes using different locking protocols</h2>

 <p>The default locking mechanism used by SQLite on unix platforms is
 POSIX advisory locking, but there are other options.  By selecting an
 alternative [sqlite3_vfs] using the [sqlite3_open_v2()] interface, an
 application can make use of other locking protocols that might be more
 appropriate to certain filesystems.  For example, dot-file locking might
 be selected for use in an application that has to run on an NFS filesystem
 that does not support POSIX advisory locking.</p>

 <p>It is important that all connections to the same database file use
 the same locking protocol.
 If one application is using POSIX advisory locks and another application
 is using dot-file locking, then the two applications will not see each
 other's locks and will not be able to coordinate database access, possibly
 leading to database corruption.</p>

 <tcl>hd_fragment unlink {unlink corruption} {unlinked database files}</tcl>
-<h3>2.4 Unlinking or renaming a database file while in use</h3>
+<h2> Unlinking or renaming a database file while in use</h2>

 <p>If two processes have open connections to the same database file and
 one process closes its connection, unlinks the file, then creates a new
 database file in its place with the same name and reopens the new file,
 then the two processes will be talking to different database files with
 the same name.  (Note that this is only possible on Posix and Posix-like
 systems that permit a file to be unlinked while it is still open for
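
A sketch of selecting an alternative locking protocol with
[sqlite3_open_v2()], assuming the "unix-dotfile" VFS that ships in the
standard unix builds:

    #include "sqlite3.h"

    /* Open a database on an NFS mount using dot-file locking instead of
    ** POSIX advisory locks.  Every connection to this database must use
    ** the same VFS, or their locks will be invisible to one another. */
    int open_on_nfs(const char *zPath, sqlite3 **pDb){
      return sqlite3_open_v2(zPath, pDb,
                             SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE,
                             "unix-dotfile");
    }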
@@ -200,15 +201,15 @@
 results in behavior that is undefined and probably undesirable.</p>

 <p>Beginning with SQLite [version 3.7.17], the unix OS interface will
 send SQLITE_WARNING messages to the [error log] if a database file is unlinked
 while it is still in use.</p>

 <tcl>hd_fragment alias {database filename aliasing}</tcl>
-<h3>2.5 Multiple links to the same file</h3>
+<h2> Multiple links to the same file</h2>

 <p>If a single database file has multiple links (either hard or soft links)
 then that is just another way of saying that the file has multiple names.
 If two or more processes open the database using different names, then
 they will use different rollback journals and WAL files.  That means that
 if one process crashes, the other process will be unable to recover the
 transaction in progress because it will be looking in the wrong place
@@ -223,35 +224,35 @@

 <p>Beginning with SQLite [version 3.10.0], the unix OS interface will
 attempt to resolve symbolic links and open the database file by its
 canonical name.  Prior to version 3.10.0, opening a database file
 through a symbolic link was similar to opening a database file
 that had multiple hard links and resulted in undefined behavior.</p>

-<h2>3.0 Failure to sync</h2>
+<h1> Failure to sync</h1>

 <p>In order to guarantee that database files are always consistent, SQLite
 will occasionally ask the operating system to flush all pending writes to
 persistent storage then wait for that flush to complete.  This is
 accomplished using the <tt>fsync()</tt> system call under unix and
 <tt>FlushFileBuffers()</tt> under Windows.  We call this flush of
 pending writes a "sync".</p>

 <p>Actually, if one is only concerned with atomic and consistent writes and
 is willing to forego durable writes, the sync operation does not need
 to wait until the content is completely stored on persistent media.  Instead,
 the sync operation can be thought of as an I/O barrier.  As long as all
 writes that occur before the sync are completed before any write that happens
 after the sync, no database corruption will occur.  If sync is operating as
 an I/O barrier and not as a true sync, then a power failure or system crash
 might cause one or more previously committed transactions to roll back
 (in violation of the "durable" property of "ACID") but the database will at
 least continue to be consistent, and that is what most people care about.</p>

-<h3>3.1 Disk drives that do not honor sync requests</h3>
+<h2> Disk drives that do not honor sync requests</h2>

 <p>Unfortunately, most consumer-grade mass storage devices lie about
 syncing.  Disk drives will report that content is safely on persistent
 media as soon as it reaches the track buffer and before actually being
 written to oxide.  This makes the disk drives seem to operate faster
 (which is vitally important to the manufacturer so that they can show
 good benchmark numbers in trade magazines).  And in fairness, the lie
@@ -278,128 +279,128 @@
 out-of-order writes than in the default rollback journal modes.  In WAL
 mode, the only time that a failed sync operation can cause database corruption
 is during a [checkpoint] operation.  A sync failure during a COMMIT might
 result in loss of durability but not in a corrupt database file.  Hence,
 one line of defense against database corruption due to failed sync operations
 is to use SQLite in WAL mode and to checkpoint as infrequently as possible.</p>

-<h3>3.2 Disabling sync using PRAGMAs</h3>
+<h2> Disabling sync using PRAGMAs</h2>

 <p>The sync operations that SQLite performs to help ensure integrity
 can be disabled at run-time using the [synchronous pragma].  By setting
 PRAGMA synchronous=OFF, all sync operations are omitted.  This makes
 SQLite seem to run faster, but it also allows the operating system to freely
 reorder writes, which could result in database corruption if a power failure
 or hard reset occurs prior to all content reaching persistent storage.</p>

 <p>For maximum reliability and for robustness against database corruption,
 SQLite should always be run with its default synchronous setting of FULL.</p>

 <tcl>hd_fragment hardwarefault</tcl>
-<h2>4.0 Disk Drive and Flash Memory Failures</h2>
+<h1> Disk Drive and Flash Memory Failures</h1>

 <p>An SQLite database can become corrupt if the file content changes
 due to a disk drive or flash memory failure.  It is very rare, but disks
 will occasionally flip a bit in the middle of a sector.</p>

-<h3>4.1 Non-powersafe flash memory controllers</h3>
+<h2> Non-powersafe flash memory controllers</h2>

 <p>We are told that in some flash memory controllers the wear-leveling logic
 can cause random filesystem damage if power is interrupted during a write.
 This can manifest, for example, as random changes in the middle of a file
 that was not even open at the time of the power loss.  So, for example,
 a device might be writing content into an MP3 file in flash memory when a
 power loss occurs, and that could result in an SQLite database being
 corrupted even though the database was not even in use at the time of the
 power loss.</p>

 <tcl>hd_fragment fakeusb</tcl>
-<h3>4.2 Fake capacity USB sticks</h3>
+<h2> Fake capacity USB sticks</h2>

 <p>There are many fraudulent USB sticks in circulation that report to have
 a high capacity (ex: 8GB) but are really only capable of storing a much
 smaller amount (ex: 1GB).  Attempts to write on these devices will
 often result in unrelated files being overwritten.  Any use of a fraudulent
 flash memory device can therefore easily lead to database corruption.
 Internet searches such as "fake capacity usb" will turn up lots of
 disturbing information about this problem.

-<h2>5.0 Memory corruption</h2>
+<h1> Memory corruption</h1>

 <p>SQLite is a C-library that runs in the same address space as the
 application that it serves.  That means that stray pointers, buffer
 overruns, heap corruption, or other malfunctions in the application can
 corrupt internal SQLite data structures and ultimately result in a
 corrupt database file.  Normally these kinds of problems manifest themselves
 as segfaults prior to any database corruption occurring, but there have
 been instances where application code errors have caused SQLite to
 malfunction subtly so as to corrupt the database file rather than
 panicking.</p>

 <p>The memory corruption problem becomes more acute when
 using [memory-mapped I/O].
 When all or part of the database file is mapped into the application's
 address space, then a stray pointer that overwrites any part of that
 mapped space will immediately corrupt the database file, without
 requiring the application to do a subsequent write() system call.</p>

-<h2>6.0 Other operating system problems</h2>
+<h1> Other operating system problems</h1>

 <p>Sometimes operating systems will exhibit non-standard behavior which
 can lead to problems.  Sometimes this non-standard behavior is deliberate,
 and sometimes it is a mistake in the implementation.  But in any event,
 if the operating system performs differently from the way SQLite expects it
 to perform, the possibility of database corruption exists.</p>

-<h3>6.1 Linux Threads</h3>
+<h2> Linux Threads</h2>

 <p>Some older versions of Linux used the LinuxThreads library for thread
 support.  LinuxThreads is similar to Pthreads, but is subtly different
 with respect to handling of POSIX advisory locks.  SQLite versions
 2.2.3 through 3.6.23 recognized that LinuxThreads were being used at
 runtime and took appropriate action to work around the non-standard
 behavior of LinuxThreads.  But most modern Linux implementations make
 use of the newer, and correct, NPTL implementation of Pthreads.  Beginning
 with SQLite version 3.7.0, the use of NPTL is assumed.  No checks are
 made.  Hence, recent versions of SQLite will subtly malfunction and may
 corrupt database files if used in multi-threaded applications that run
 on older Linux systems that make use of LinuxThreads.</p>

-<h3>6.2 Failures of mmap() on QNX</h3>
+<h2> Failures of mmap() on QNX</h2>

 <p>There exists some subtle problem with mmap() on QNX such that making
 a second mmap() call against a single file descriptor can cause
 the memory obtained from the first mmap() call to be zeroed.  SQLite on
 unix uses mmap() to create a shared memory region for transaction
 coordination in [WAL | WAL mode], and it will call mmap() multiple times
 for large transactions.  The QNX mmap() has been demonstrated to corrupt
 database files under that scenario.  QNX engineers are aware of this problem
 and are working on a solution; the problem may have already been fixed by
 the time you read this.</p>

 <p>When running on QNX, it is recommended that [memory-mapped I/O] never
 be used.  Furthermore, to use [WAL mode], it is recommended that applications
 employ the [locking_mode | exclusive locking mode] in order to
 use [WAL without shared memory].

 <tcl>hd_fragment fscorruption {filesystem corruption}</tcl>
-<h3>6.3 Filesystem Corruption</h3>
+<h2> Filesystem Corruption</h2>

 <p>Since SQLite databases are ordinary disk files, any malfunction in the
 filesystem can corrupt the database.  Filesystems in modern operating systems
 are very reliable, but errors do still occur.  For example, on 2013-10-01
 the SQLite database that holds the
 <a href="http://wiki.tcl-lang.org/">Wiki for Tcl/Tk</a> went corrupt a few days
 after the host computer was moved to a dodgy build of the (linux) kernel
 that had issues in the filesystem layer.  In that event, the filesystem
 eventually became so badly corrupted that the machine was unusable, but
 the earliest symptom of trouble was the corrupted SQLite database.</p>

-<h2>7.0 Bugs in SQLite</h2>
+<h1> Bugs in SQLite</h1>

 <p>SQLite is [testing | very carefully tested] to help ensure that it is
 as bug-free as possible.  Among the many tests that are carried out for
 every SQLite version are tests that simulate power failures, I/O errors,
 and out-of-memory (OOM) errors and verify that no database corruption occurs
 during any of these events.  SQLite is also field-proven with approximately
 two billion active deployments with no serious problems.</p>
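
A sketch of pinning the recommended synchronous setting at connection
setup (the OFF variant in the comment is exactly the configuration the
text above warns against):

    #include "sqlite3.h"

    /* Keep the default synchronous=FULL for durability.  The commented
    ** alternative trades corruption resistance for speed. */
    int set_sync(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA synchronous=FULL;", 0, 0, 0);
      /* sqlite3_exec(db, "PRAGMA synchronous=OFF;", 0, 0, 0);  (risky) */
    }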
@@ -414,86 +415,86 @@
 of all database-corruption bugs found in SQLite during the
 four-year period from 2009-04-01 to 2013-04-15.
 This account should give the reader an intuitive sense of the
 kinds of bugs in SQLite that manage to slip through testing procedures
 and make it into a release.</p>

-<h3>7.1 False corruption reports due to database shrinkage</h3>
+<h2> False corruption reports due to database shrinkage</h2>

 <p>If a database is written by SQLite version 3.7.0 or later and then
 written again by SQLite version 3.6.23 or earlier in such a way as to
 make the size of the database file decrease, then the next time that
 SQLite version 3.7.0 accesses the database file, it might report that the
 database file is corrupt.  The database file is not really corrupt, however.
 Version 3.7.0 was simply being overly zealous in its corruption detection.</p>

 <p>The problem was fixed on 2011-02-20.  The fix first appears in
 SQLite version 3.7.6.</p>

-<h3>7.2 Corruption following switches between rollback and WAL modes</h3>
+<h2> Corruption following switches between rollback and WAL modes</h2>

 <p>Repeatedly switching an SQLite database in and out of [WAL | WAL mode]
 and running the [VACUUM] command in between switches, in one process or
 thread, can cause another process or thread that has the database file
 open to miss the fact that the database has changed.  That second process
 or thread might then try to modify the database using a stale cache and
 cause database corruption.</p>

 <p>This problem was discovered during internal testing and has never been
 observed in the wild.  The problem was fixed on 2011-01-27 and in version
 3.7.5.</p>

-<h3>7.3 I/O while obtaining a lock leads to corruption</h3>
+<h2> I/O while obtaining a lock leads to corruption</h2>

 <p>If the operating system returns an I/O error while attempting to obtain
 a certain lock on shared memory in [WAL | WAL mode] then SQLite might fail
 to reset its cache,
 which could lead to database corruption if subsequent writes are attempted.</p>

 <p>Note that this problem only occurs if the attempt to acquire the lock
 resulted in an I/O error.  If the lock is simply not granted (because some
 other thread or process is already holding a conflicting lock) then no
 corruption will ever occur.  We are not aware of any operating systems that
 will fail with an I/O error while attempting to get a file lock on shared
 memory.  So this is a theoretical problem rather than a real problem.
 Needless to say, this problem has never been observed in the wild.  The
 problem was discovered while doing stress testing of SQLite in a test
 harness that simulates I/O errors.</p>

 <p>This problem was fixed on 2010-09-20 for SQLite version 3.7.3.</p>

-<h3>7.4 Database pages leak from the free page list</h3>
+<h2> Database pages leak from the free page list</h2>

 <p>When content is deleted from an SQLite database, pages that are no
 longer used are added to a free list and are reused to hold content
 added by subsequent inserts.  A bug in SQLite that was present in
 versions 3.6.16 through 3.7.2 might cause pages to go missing from
 the free list when [incremental_vacuum] was used.  This would not cause
 data loss.  But it would result in the database file being larger than
 necessary.  And it would cause the [integrity_check pragma] to report
 pages missing from the free list.</p>

 <p>This problem was fixed on 2010-08-23 for SQLite version 3.7.2.</p>

-<h3>7.5 Corruption following alternating writes from 3.6 and 3.7.</h3>
+<h2> Corruption following alternating writes from 3.6 and 3.7.</h2>

 <p>SQLite version 3.7.0 introduced a number of new enhancements to
 the SQLite database file format (such as but not limited to [WAL]).
 The 3.7.0 release was a shake-out release for these new features.
 We expected to find problems and were not disappointed.</p>

 <p>If a database were originally created using SQLite version 3.7.0,
 then written by SQLite version 3.6.23.1 such that the size of the database
 file increased, then written again by SQLite version 3.7.0, the database
 file could go corrupt.</p>

 <p>This problem was fixed on 2010-08-04 for SQLite version 3.7.1.</p>

-<h3>7.6 Race condition in recovery on windows system.</h3>
+<h2> Race condition in recovery on windows system.</h2>

 <p>SQLite version 3.7.16.2 fixes a subtle race condition in the locking
 logic on Windows systems.  When a database file is in need
 of recovery because the previous process writing to it crashed in the
 middle of a transaction and two or more processes try to open
 that database at the same time, then the race condition might cause
 one of those processes to get a false indication that the recovery
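
A sketch of running the [integrity_check pragma] mentioned in section 7.4
from C (a healthy database produces the single row "ok"):

    #include <stdio.h>
    #include "sqlite3.h"

    /* Print each row returned by PRAGMA integrity_check. */
    static int row(void *u, int n, char **v, char **c){
      (void)u; (void)n; (void)c;
      printf("%s\n", v[0] ? v[0] : "NULL");
      return 0;
    }

    int check(sqlite3 *db){
      return sqlite3_exec(db, "PRAGMA integrity_check;", row, 0, 0);
    }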
Changes to pages/malloc.in.
@@ -1,30 +1,32 @@
 <title>Dynamic Memory Allocation In SQLite</title>

-<h1>Dynamic Memory Allocation In SQLite</h1>
+<table_of_contents>
 <tcl>hd_keywords {memory allocation}</tcl>

+<h1 style="margin-left:1.0em" notoc id=overview> Overview</h1>
+
 <p>SQLite uses dynamic memory allocation to obtain
 memory for storing various objects
 (ex: [database connections] and [prepared statements]) and to build
 a memory cache of the database file and to hold the results of queries.
 Much effort has gone into making the dynamic memory allocation subsystem
 of SQLite reliable, predictable, robust, secure, and efficient.</p>

 <p>This document provides an overview of dynamic memory allocation within
 SQLite.  The target audience is software engineers who are tuning their
 use of SQLite for peak performance in demanding environments.
 Nothing in this document is required knowledge for using SQLite.  The
 default settings and configuration for SQLite will work well in most
 applications.  However, the information contained in this document may
 be useful to engineers who are tuning SQLite to comply with special
 requirements or to run under unusual circumstances.</p>

 <a name="features"></a>
-<h2>1.0 Features</h2>
+<h1> Features</h1>

 <p>The SQLite core and its memory allocation subsystem provide the
 following capabilities:</p>

 <ul>
 <li><p>
 <b>Robust against allocation failures.</b>
@@ -110,15 +112,15 @@
 routines used by SQLite through the
 [sqlite3_malloc()], [sqlite3_realloc()], and [sqlite3_free()] interfaces.
 </p></li>

 </ul>

 <a name="testing"></a>
-<h2>2.0 Testing</h2>
+<h1> Testing</h1>

 <p>Most of the code in the SQLite source tree is devoted purely to
 [testing | testing and verification].  Reliability is important to SQLite.
 Among the tasks of the test infrastructure is to ensure that
 SQLite does not misuse dynamically allocated memory, that SQLite
 does not leak memory, and that SQLite responds
 correctly to a dynamic memory allocation failure.</p>
@@ -196,15 +198,15 @@
 the [TCL test suite] provides over 99% statement test coverage and that
 the [TH3] test harness provides [test coverage | 100% branch test coverage]
 with no memory leaks.  This is
 strong evidence that dynamic memory allocation is used correctly
 everywhere within SQLite.</p>

 <a name="allocarray"></a>
-<h3>2.1 Use of reallocarray()</h3>
+<h2> Use of reallocarray()</h2>

 <p>The reallocarray() interface is a recent innovation (circa 2014)
 from the OpenBSD community that grew out of efforts to prevent the
 next [http://heartbleed.com | "heartbleed" bug] by avoiding 32-bit integer
 arithmetic overflow on memory allocation size computations.  The
 reallocarray() function has both unit-size and count parameters.
 To allocate memory sufficient to hold an array of N elements each X-bytes
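
A sketch of the overflow check that reallocarray()-style interfaces
perform (an illustration of the idea, not SQLite's internal code):

    #include <stdint.h>
    #include <stdlib.h>

    /* Refuse any n*size that cannot be represented in a size_t,
    ** instead of letting the multiplication silently wrap. */
    void *checked_array_alloc(size_t n, size_t size){
      if( size!=0 && n > SIZE_MAX/size ) return NULL;  /* would overflow */
      return malloc(n*size);
    }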
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
also verifies that it is impossible to overflow a 64-bit integer
during the computation.</p>

<p>The code audits used to ensure that memory allocation size computations
do not overflow in SQLite are repeated prior to every SQLite release.</p>

<a name="config"></a>
<h2>3.0 Configuration</h2>

<p>The default memory allocation settings in SQLite are appropriate
for most applications.  However, applications with unusual or particularly
strict requirements may want to adjust the configuration to more closely
align SQLite to their needs.
Both compile-time and start-time configuration options are available.</p>

<tcl>hd_fragment altalloc {built-in memory allocators}</tcl>
<h3>3.1 Alternative low-level memory allocators</h3>

<p>The SQLite source code includes several different memory allocation
modules that can be selected at compile-time, or to a limited extent
at start-time.</p>

<tcl>hd_fragment defaultalloc {default memory allocator}</tcl>
<h4>3.1.1 The default memory allocator</h4>

<p>By default, SQLite uses the malloc(), realloc(), and free() routines
from the standard C library for its memory allocation needs.  These routines
are surrounded by a thin wrapper that also provides a "memsize()" function
that will return the size of an existing allocation.  The memsize() function
is needed to keep an accurate count of the number of bytes of outstanding
memory; memsize() determines how many bytes to remove from the outstanding
count when an allocation is freed.  The default allocator implements
memsize() by always allocating 8 extra bytes on each malloc() request and
storing the size of the allocation in that 8-byte header.</p>

<p>The default memory allocator is recommended for most applications.
If you do not have a compelling need to use an alternative memory
allocator, then use the default.</p>

<tcl>hd_fragment memdebug {debugging memory allocator} *memsys2</tcl>
<h4>3.1.2 The debugging memory allocator</h4>

<p>If SQLite is compiled with the [SQLITE_MEMDEBUG] compile-time option,
then a different, heavy wrapper is used around system malloc(), realloc(), 
and free().
The heavy wrapper allocates around 100 bytes of extra space
with each allocation.  The extra space is used to place sentinel values 
at both ends of the allocation returned to the SQLite core.  When an







|








|






|
















|







237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
also verifies that it is impossible to overflow a 64-bit integer
during the computation.</p>

<p>The code audits used to ensure that memory allocation size computations
do not overflow in SQLite are repeated prior to every SQLite release.</p>

<a name="config"></a>
<h1> Configuration</h1>

<p>The default memory allocation settings in SQLite are appropriate
for most applications.  However, applications with unusual or particularly
strict requirements may want to adjust the configuration to more closely
align SQLite to their needs.
Both compile-time and start-time configuration options are available.</p>

<tcl>hd_fragment altalloc {built-in memory allocators}</tcl>
<h2> Alternative low-level memory allocators</h2>

<p>The SQLite source code includes several different memory allocation
modules that can be selected at compile-time, or to a limited extent
at start-time.</p>

<tcl>hd_fragment defaultalloc {default memory allocator}</tcl>
<h3>The default memory allocator</h3>

<p>By default, SQLite uses the malloc(), realloc(), and free() routines
from the standard C library for its memory allocation needs.  These routines
are surrounded by a thin wrapper that also provides a "memsize()" function
that will return the size of an existing allocation.  The memsize() function
is needed to keep an accurate count of the number of bytes of outstanding
memory; memsize() determines how many bytes to remove from the outstanding
count when an allocation is freed.  The default allocator implements
memsize() by always allocating 8 extra bytes on each malloc() request and
storing the size of the allocation in that 8-byte header.</p>
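
<p>As a rough illustration (a sketch only, not SQLite's actual source code),
a wrapper of this kind might look like:</p>

<blockquote><pre>
#include <stdlib.h>

/* Sketch of a malloc() wrapper that records the allocation size in an
** 8-byte header so that a memsize() function can recover it later. */
static void *demoMalloc(int nByte){
  long long *p = malloc(nByte + 8);
  if( p ){
    p[0] = nByte;   /* remember the requested size */
    p++;            /* hand out the memory just past the header */
  }
  return (void*)p;
}
static void demoFree(void *pPrior){
  if( pPrior ) free(((long long*)pPrior) - 1);
}
static int demoMemsize(void *pPrior){
  return pPrior ? (int)(((long long*)pPrior)[-1]) : 0;
}
</pre></blockquote>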

<p>The default memory allocator is recommended for most applications.
If you do not have a compelling need to use an alternative memory
allocator, then use the default.</p>

<tcl>hd_fragment memdebug {debugging memory allocator} *memsys2</tcl>
<h3>The debugging memory allocator</h3>

<p>If SQLite is compiled with the [SQLITE_MEMDEBUG] compile-time option,
then a different, heavy wrapper is used around system malloc(), realloc(), 
and free().
The heavy wrapper allocates around 100 bytes of extra space
with each allocation.  The extra space is used to place sentinel values 
at both ends of the allocation returned to the SQLite core.  When an

<p>The heavy wrapper employed by [SQLITE_MEMDEBUG] is intended for use
only during testing, analysis, and debugging of SQLite.  The heavy wrapper
has a significant performance and memory overhead and probably should not
be used in production.</p>

<tcl>hd_fragment win32heap {Win32 native memory allocator}</tcl>
<h4>3.1.3 The Win32 native memory allocator</h4>

<p>If SQLite is compiled for Windows with the [SQLITE_WIN32_MALLOC]
compile-time option, then a different, thin wrapper is used around
HeapAlloc(), HeapReAlloc(), and HeapFree().  The thin wrapper uses the
configured SQLite heap, which will be different from the default process
heap if the [SQLITE_WIN32_HEAP_CREATE] compile-time option is used.  In
addition, when an allocation is made or freed, HeapValidate() will be
called if SQLite is compiled with assert() enabled and the
[SQLITE_WIN32_MALLOC_VALIDATE] compile-time option.</p>

<tcl>hd_fragment memsys5 *memsys5 {zero-malloc memory allocator}</tcl>
<h4>3.1.4 Zero-malloc memory allocator</h4>

<p>When SQLite is compiled with the [SQLITE_ENABLE_MEMSYS5] option, an
alternative memory allocator that does not use malloc() is included in the
build.  The SQLite developers refer to this alternative memory allocator
as "memsys5".  Even when it is included in the build, memsys5 is 
disabled by default.
To enable memsys5, the application must invoke the following SQLite 

<p>The heavy wrapper employed by [SQLITE_MEMDEBUG] is intended for use
only during testing, analysis, and debugging of SQLite.  The heavy wrapper
has a significant performance and memory overhead and probably should not
be used in production.</p>

<tcl>hd_fragment win32heap {Win32 native memory allocator}</tcl>
<h3>The Win32 native memory allocator</h3>

<p>If SQLite is compiled for Windows with the [SQLITE_WIN32_MALLOC]
compile-time option, then a different, thin wrapper is used around
HeapAlloc(), HeapReAlloc(), and HeapFree().  The thin wrapper uses the
configured SQLite heap, which will be different from the default process
heap if the [SQLITE_WIN32_HEAP_CREATE] compile-time option is used.  In
addition, when an allocation is made or freed, HeapValidate() will be
called if SQLite is compiled with assert() enabled and the
[SQLITE_WIN32_MALLOC_VALIDATE] compile-time option.</p>

<tcl>hd_fragment memsys5 *memsys5 {zero-malloc memory allocator}</tcl>
<h3>Zero-malloc memory allocator</h3>

<p>When SQLite is compiled with the [SQLITE_ENABLE_MEMSYS5] option, an
alternative memory allocator that does not use malloc() is included in the
build.  The SQLite developers refer to this alternative memory allocator
as "memsys5".  Even when it is included in the build, memsys5 is 
disabled by default.
To enable memsys5, the application must invoke the following SQLite 
requests are rounded up to a power of two and the request is satisfied
by the first free slot in pBuf that is large enough.  Adjacent freed
allocations are coalesced using a buddy system. When used appropriately,
this algorithm provides mathematical guarantees against fragmentation and
breakdown, as described further <a href="#nofrag">below</a>.</p>

<tcl>hd_fragment memsysx {experimental memory allocators}</tcl>
<h4>3.1.5 Experimental memory allocators</h4>

<p>The name "memsys5" used for the zero-malloc memory allocator implies
that there are several additional memory allocators available, and indeed
there are.  The default memory allocator is "memsys1".  The debugging
memory allocator is "memsys2".  Those have already been covered.</p>

<p>If SQLite is compiled with [SQLITE_ENABLE_MEMSYS3] then another
requests are rounded up to a power of two and the request is satisfied
by the first free slot in pBuf that is large enough.  Adjacent freed
allocations are coalesced using a buddy system. When used appropriately,
this algorithm provides mathematical guarantees against fragmentation and
breakdown, as described further <a href="#nofrag">below</a>.</p>

<tcl>hd_fragment memsysx {experimental memory allocators}</tcl>
<h3>Experimental memory allocators</h3>

<p>The name "memsys5" used for the zero-malloc memory allocator implies
that there are several additional memory allocators available, and indeed
there are.  The default memory allocator is "memsys1".  The debugging
memory allocator is "memsys2".  Those have already been covered.</p>

<p>If SQLite is compiled with [SQLITE_ENABLE_MEMSYS3] then another
[version 3.6.5].</p>

<p>Other experimental memory allocators might be added in future releases
of SQLite.  One may anticipate that these will be called memsys7, memsys8,
and so forth.</p>

<a name="appalloc"></a>
<h4>3.1.6 Application-defined memory allocators</h4>

<p>New memory allocators do not have to be part of the SQLite source tree
nor included in the sqlite3.c [amalgamation].  Individual applications can
supply their own memory allocators to SQLite at start-time.</p>

<p>To cause SQLite to use a new memory allocator, the application
simply calls:</p>
[version 3.6.5].</p>

<p>Other experimental memory allocators might be added in future releases
of SQLite.  One may anticipate that these will be called memsys7, memsys8,
and so forth.</p>

<a name="appalloc"></a>
<h3>Application-defined memory allocators</h3>

<p>New memory allocators do not have to be part of the SQLite source tree
nor included in the sqlite3.c [amalgamation].  Individual applications can
supply their own memory allocators to SQLite at start-time.</p>

<p>To cause SQLite to use a new memory allocator, the application
simply calls:</p>
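
<p>The call itself falls outside this hunk.  As a hedged sketch, the
registration looks something like the following, where the my* routines
are hypothetical application-supplied implementations of the
[sqlite3_mem_methods] callbacks:</p>

<blockquote><pre>
#include "sqlite3.h"

/* Hypothetical application-supplied allocation routines. */
extern void *myMalloc(int n);
extern void myFree(void *p);
extern void *myRealloc(void *p, int n);
extern int mySize(void *p);
extern int myRoundup(int n);
extern int myInit(void *pAppData);
extern void myShutdown(void *pAppData);

static const sqlite3_mem_methods myMethods = {
  myMalloc, myFree, myRealloc, mySize, myRoundup, myInit, myShutdown, 0
};

/* Must run before sqlite3_initialize() or any other use of SQLite. */
int installMyAllocator(void){
  return sqlite3_config(SQLITE_CONFIG_MALLOC, &myMethods);
}
</pre></blockquote>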

<p>In a multi-threaded application, access to the [sqlite3_mem_methods]
is serialized if and only if [SQLITE_CONFIG_MEMSTATUS] is enabled.
If [SQLITE_CONFIG_MEMSTATUS] is disabled then the methods in
[sqlite3_mem_methods] must take care of their own serialization needs.</p>

<a name="overlayalloc"></a>
<h4>3.1.7 Memory allocator overlays</h4>

<p>An application can insert layers or "overlays" in between the
SQLite core and the underlying memory allocator.
For example, the <a href="#oomtesting">out-of-memory test logic</a>
for SQLite uses an overlay that can simulate memory allocation
failures.</p>

<p>In a multi-threaded application, access to the [sqlite3_mem_methods]
is serialized if and only if [SQLITE_CONFIG_MEMSTATUS] is enabled.
If [SQLITE_CONFIG_MEMSTATUS] is disabled then the methods in
[sqlite3_mem_methods] must take care of their own serialization needs.</p>

<a name="overlayalloc"></a>
<h3>Memory allocator overlays</h3>

<p>An application can insert layers or "overlays" in between the
SQLite core and the underlying memory allocator.
For example, the <a href="#oomtesting">out-of-memory test logic</a>
for SQLite uses an overlay that can simulate memory allocation
failures.</p>

The existing allocator is saved by the overlay and is used as
a fallback to do real memory allocation.  Then the overlay is
inserted in place of the existing memory allocator using
the [sqlite3_config]([SQLITE_CONFIG_MALLOC],...) as described
<a href="#appalloc">above</a>.

<a name="stuballoc"></a>
<h4>3.1.8 No-op memory allocator stub</h4>

<p>If SQLite is compiled with the [SQLITE_ZERO_MALLOC] option, then
the [default memory allocator] is omitted and replaced by a stub
memory allocator that never allocates any memory.  Any calls to the
stub memory allocator will report back that no memory is available.</p>

<p>The no-op memory allocator is not useful by itself.  It exists only
as a placeholder so that SQLite has a memory allocator to link against
on systems that may not have malloc(), free(), or realloc() in their
standard library.
An application that is compiled with [SQLITE_ZERO_MALLOC] will need to
use [sqlite3_config()] together with [SQLITE_CONFIG_MALLOC] or
[SQLITE_CONFIG_HEAP] to specify a new alternative memory allocator
before beginning to use SQLite.</p>

<tcl>hd_fragment scratch {scratch memory allocator}</tcl>
<h3>3.2 Scratch memory</h3>

<p>SQLite occasionally needs a large chunk of "scratch" memory to
perform some transient calculation.  Scratch memory is used, for example,
as temporary storage when rebalancing a B-Tree.  These scratch memory
allocations are typically about 10 kilobytes in size and are
transient - lasting
only for the duration of a single, short-lived function call.</p>
The existing allocator is saved by the overlay and is used as
a fallback to do real memory allocation.  Then the overlay is
inserted in place of the existing memory allocator using
the [sqlite3_config]([SQLITE_CONFIG_MALLOC],...) as described
<a href="#appalloc">above</a>.
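
<p>For example, here is a minimal sketch (assuming only the documented
[SQLITE_CONFIG_GETMALLOC] and [SQLITE_CONFIG_MALLOC] options) of an
overlay that intercepts xMalloc and delegates everything else to the
saved allocator:</p>

<blockquote><pre>
#include "sqlite3.h"

static sqlite3_mem_methods realMethods;  /* the saved "real" allocator */

static void *overlayMalloc(int n){
  /* A failure-injection or logging hook could go here. */
  return realMethods.xMalloc(n);
}

int installOverlay(void){
  sqlite3_mem_methods overlay;
  sqlite3_config(SQLITE_CONFIG_GETMALLOC, &realMethods); /* save existing */
  overlay = realMethods;
  overlay.xMalloc = overlayMalloc;
  return sqlite3_config(SQLITE_CONFIG_MALLOC, &overlay);
}
</pre></blockquote>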

<a name="stuballoc"></a>
<h3>No-op memory allocator stub</h3>

<p>If SQLite is compiled with the [SQLITE_ZERO_MALLOC] option, then
the [default memory allocator] is omitted and replaced by a stub
memory allocator that never allocates any memory.  Any calls to the
stub memory allocator will report back that no memory is available.</p>

<p>The no-op memory allocator is not useful by itself.  It exists only
as a placeholder so that SQLite has a memory allocator to link against
on systems that may not have malloc(), free(), or realloc() in their
standard library.
An application that is compiled with [SQLITE_ZERO_MALLOC] will need to
use [sqlite3_config()] together with [SQLITE_CONFIG_MALLOC] or
[SQLITE_CONFIG_HEAP] to specify a new alternative memory allocator
before beginning to use SQLite.</p>

<tcl>hd_fragment scratch {scratch memory allocator}</tcl>
<h2> Scratch memory</h2>

<p>SQLite occasionally needs a large chunk of "scratch" memory to
perform some transient calculation.  Scratch memory is used, for example,
as temporary storage when rebalancing a B-Tree.  These scratch memory
allocations are typically about 10 kilobytes in size and are
transient - lasting
only for the duration of a single, short-lived function call.</p>

<p>If the scratch memory setup does not define enough memory, then
SQLite falls back to using the regular memory allocator for its scratch
memory allocations.  The default setup is sz=0 and N=0 so the use
of the regular memory allocator is the default behavior.</p>

<tcl>hd_fragment pagecache {pagecache memory allocator}</tcl>
<h3>3.3 Page cache memory</h3>

<p>In most applications, the database page cache subsystem within 
SQLite uses more dynamically allocated memory than all other parts
of SQLite combined.  It is not unusual to see the database page cache
consume over 10 times more memory than the rest of SQLite combined.</p>

<p>SQLite can be configured to make page cache memory allocations from

<p>If the scratch memory setup does not define enough memory, then
SQLite falls back to using the regular memory allocator for its scratch
memory allocations.  The default setup is sz=0 and N=0 so the use
of the regular memory allocator is the default behavior.</p>
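
<p>For example, a start-time scratch configuration might look like the
following sketch; the slot size and count are illustrative only:</p>

<blockquote><pre>
#include "sqlite3.h"

static char aScratch[10000*6];   /* sz=10000 bytes, N=6 slots */

int setupScratch(void){
  /* Must be called before SQLite is initialized. */
  return sqlite3_config(SQLITE_CONFIG_SCRATCH, aScratch, 10000, 6);
}
</pre></blockquote>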

<tcl>hd_fragment pagecache {pagecache memory allocator}</tcl>
<h2> Page cache memory</h2>

<p>In most applications, the database page cache subsystem within 
SQLite uses more dynamically allocated memory than all other parts
of SQLite combined.  It is not unusual to see the database page cache
consume over 10 times more memory than the rest of SQLite combined.</p>

<p>SQLite can be configured to make page cache memory allocations from
number of available allocations.</p>

<p>If SQLite needs a page-cache entry that is larger than "sz" bytes or
if it needs more than N entries, it falls back to using the
general-purpose memory allocator.</p>

<tcl>hd_fragment lookaside {lookaside memory allocator}</tcl>
<h3>3.4 Lookaside memory allocator</h3>

<p>SQLite [database connections] make many
small and short-lived memory allocations.
This occurs most commonly when compiling SQL statements using
[sqlite3_prepare_v2()] but also to a lesser extent when running
[prepared statements] using [sqlite3_step()].  These small memory
allocations are used to hold things such as the names of tables
number of available allocations.</p>

<p>If SQLite needs a page-cache entry that is larger than "sz" bytes or
if it needs more than N entries, it falls back to using the
general-purpose memory allocator.</p>
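
<p>For example (a sketch only; the 40-byte per-page header allowance
shown here is an assumption for illustration, not a documented
constant):</p>

<blockquote><pre>
#include "sqlite3.h"

/* 200 slots, each big enough for a 4096-byte page plus a header. */
static char aPCache[(4096+40)*200];

int setupPageCache(void){
  /* Must be called before SQLite is initialized. */
  return sqlite3_config(SQLITE_CONFIG_PAGECACHE, aPCache, 4096+40, 200);
}
</pre></blockquote>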

<tcl>hd_fragment lookaside {lookaside memory allocator}</tcl>
<h2> Lookaside memory allocator</h2>

<p>SQLite [database connections] make many
small and short-lived memory allocations.
This occurs most commonly when compiling SQL statements using
[sqlite3_prepare_v2()] but also to a lesser extent when running
[prepared statements] using [sqlite3_step()].  These small memory
allocations are used to hold things such as the names of tables
<p>The lookaside configuration can only be changed while there are
no outstanding lookaside allocations for the database connection.
Hence, the configuration should be set immediately after creating the 
database connection using [sqlite3_open()] (or equivalent) and before
evaluating any SQL statements on the connection.</p>

<tcl>hd_fragment memstatus {memory statistics}</tcl>
<h3>3.5 Memory status</h3>

<p>By default, SQLite keeps statistics on its memory usage.  These
statistics are useful in helping to determine how much memory an
application really needs.  The statistics can also be used in
high-reliability systems to determine
if the memory usage is coming close to or exceeding the limits 
of the [Robson proof] and hence that the memory allocation subsystem is 
<p>The lookaside configuration can only be changed while there are
no outstanding lookaside allocations for the database connection.
Hence, the configuration should be set immediately after creating the 
database connection using [sqlite3_open()] (or equivalent) and before
evaluating any SQL statements on the connection.</p>
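
<p>For example, a per-connection lookaside configuration might be set up
as in the following sketch; the slot size and count are illustrative:</p>

<blockquote><pre>
#include "sqlite3.h"

int openWithLookaside(sqlite3 **pDb){
  int rc = sqlite3_open("app.db", pDb);
  if( rc==SQLITE_OK ){
    /* 128 lookaside slots of 512 bytes each.  Passing a NULL buffer
    ** asks SQLite to allocate the lookaside memory itself. */
    rc = sqlite3_db_config(*pDb, SQLITE_DBCONFIG_LOOKASIDE, 0, 512, 128);
  }
  return rc;
}
</pre></blockquote>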

<tcl>hd_fragment memstatus {memory statistics}</tcl>
<h2> Memory status</h2>

<p>By default, SQLite keeps statistics on its memory usage.  These
statistics are useful in helping to determine how much memory an
application really needs.  The statistics can also be used in
high-reliability systems to determine
if the memory usage is coming close to or exceeding the limits 
of the [Robson proof] and hence that the memory allocation subsystem is 

<p>The per-connection statistics do not use global variables and hence
do not require mutexes to update or access.  Consequently the
per-connection statistics continue to function even if
[SQLITE_CONFIG_MEMSTATUS] is turned off.</p>

<a name="heaplimit"></a>
<h3>3.6 Setting memory usage limits</h3>

<p>The [sqlite3_soft_heap_limit64()] interface can be used to set an
upper bound on the total amount of outstanding memory that the
general-purpose memory allocator for SQLite will allow to be outstanding
at one time.  If attempts are made to allocate more memory than specified
by the soft heap limit, then SQLite will first attempt to free cache
memory before continuing with the allocation request.  The soft heap

<p>The per-connection statistics do not use global variables and hence
do not require mutexes to update or access.  Consequently the
per-connection statistics continue to function even if
[SQLITE_CONFIG_MEMSTATUS] is turned off.</p>
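
<p>For example, an application might poll the global statistics as in
this sketch:</p>

<blockquote><pre>
#include <stdio.h>
#include "sqlite3.h"

void reportMemory(void){
  int nCur, nHiwtr;
  /* The final 0 means the high-water mark is not reset. */
  sqlite3_status(SQLITE_STATUS_MEMORY_USED, &nCur, &nHiwtr, 0);
  printf("memory used: %d bytes (high-water mark: %d)\n", nCur, nHiwtr);
}
</pre></blockquote>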

<a name="heaplimit"></a>
<h2> Setting memory usage limits</h2>

<p>The [sqlite3_soft_heap_limit64()] interface can be used to set an
upper bound on the total amount of outstanding memory that the
general-purpose memory allocator for SQLite will allow to be outstanding
at one time.  If attempts are made to allocate more memory than specified
by the soft heap limit, then SQLite will first attempt to free cache
memory before continuing with the allocation request.  The soft heap
<p>As of SQLite version 3.6.1, the soft heap limit only applies to the
general-purpose memory allocator.  The soft heap limit does not know
about or interact with the [scratch memory allocator], 
the [pagecache memory allocator], or the [lookaside memory allocator].
This deficiency will likely be addressed in a future release.</p>

<tcl>hd_fragment nofrag {Robson proof}</tcl>
<h2>4.0 Mathematical Guarantees Against Memory Allocation Failures</h2>

<p>The problem of dynamic memory allocation, and specifically the
problem of a memory allocator breakdown, has been studied by
J. M. Robson and the results published as:</p>

<blockquote>
J. M. Robson.  "Bounds for Some Functions Concerning Dynamic
<p>As of SQLite version 3.6.1, the soft heap limit only applies to the
general-purpose memory allocator.  The soft heap limit does not know
about or interact with the [scratch memory allocator], 
the [pagecache memory allocator], or the [lookaside memory allocator].
This deficiency will likely be addressed in a future release.</p>
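
<p>For example (the limit shown is illustrative, not a recommendation):</p>

<blockquote><pre>
#include "sqlite3.h"

void capHeap(void){
  /* Try to keep SQLite's general-purpose allocations under 8MB.
  ** A limit of zero disables the soft heap limit mechanism. */
  sqlite3_soft_heap_limit64(8*1024*1024);
}
</pre></blockquote>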

<tcl>hd_fragment nofrag {Robson proof}</tcl>
<h1> Mathematical Guarantees Against Memory Allocation Failures</h1>

<p>The problem of dynamic memory allocation, and specifically the
problem of a memory allocator breakdown, has been studied by
J. M. Robson and the results published as:</p>

<blockquote>
J. M. Robson.  "Bounds for Some Functions Concerning Dynamic
for <b>N</b> that will guarantee that no call to any SQLite interface
will ever return [SQLITE_NOMEM].  The memory pool will never become
so fragmented that a new memory allocation request cannot be satisfied.
This is an important property for
applications where a software fault could cause injury, physical harm, or
loss of irreplaceable data.</p>

<h3>4.1 Computing and controlling parameters <b>M</b> and <b>n</b></h3>

<p>The Robson proof applies separately to each of the memory allocators
used by SQLite:</p>

<ul>
<li>The general-purpose memory allocator ([memsys5]).</li>
<li>The [scratch memory allocator].</li>
for <b>N</b> that will guarantee that no call to any SQLite interface
will ever return [SQLITE_NOMEM].  The memory pool will never become
so fragmented that a new memory allocation request cannot be satisfied.
This is an important property for
applications where a software fault could cause injury, physical harm, or
loss of irreplaceable data.</p>

<h2> Computing and controlling parameters <b>M</b> and <b>n</b></h2>

<p>The Robson proof applies separately to each of the memory allocators
used by SQLite:</p>

<ul>
<li>The general-purpose memory allocator ([memsys5]).</li>
<li>The [scratch memory allocator].</li>
[database connection] and [prepared statement] objects the application
uses, and on the complexity of the [prepared statements].  For any
given application, these factors are normally fixed and can be
determined experimentally using [SQLITE_STATUS_MEMORY_USED].
A typical application might only use about 40KB of general-purpose
memory.  This gives a value of <b>N</b> of around 100KB.</p>

<h3>4.2 Ductile failure</h3>

<p>If the memory allocation subsystems within SQLite are configured
for breakdown-free operation but the actual memory usage exceeds
design limits set by the [Robson proof], SQLite will usually continue 
to operate normally.
The [scratch memory allocator], the [pagecache memory allocator],
and the [lookaside memory allocator] all automatically fail over
[database connection] and [prepared statement] objects the application
uses, and on the complexity of the [prepared statements].  For any
given application, these factors are normally fixed and can be
determined experimentally using [SQLITE_STATUS_MEMORY_USED].
A typical application might only use about 40KB of general-purpose
memory.  This gives a value of <b>N</b> of around 100KB.</p>

<h2> Ductile failure</h2>

<p>If the memory allocation subsystems within SQLite are configured
for breakdown-free operation but the actual memory usage exceeds
design limits set by the [Robson proof], SQLite will usually continue 
to operate normally.
The [scratch memory allocator], the [pagecache memory allocator],
and the [lookaside memory allocator] all automatically fail over
the application will provide operators with abundant warning well
in advance of failure.
The [memory statistics] interfaces of SQLite provide the application with
all the mechanisms necessary to complete the monitoring portion of
this task.</p>

<a name="stability"></a>
<h2>5.0 Stability Of Memory Interfaces</h2>

<p><b>Update:</b> As of SQLite version 3.7.0 (2010-07-22), 
all of SQLite memory allocation interfaces
are considered stable and will be supported in future releases.</p>
the application will provide operators with abundant warning well
in advance of failure.
The [memory statistics] interfaces of SQLite provide the application with
all the mechanisms necessary to complete the monitoring portion of
this task.</p>

<a name="stability"></a>
<h1> Stability Of Memory Interfaces</h1>

<p><b>Update:</b> As of SQLite version 3.7.0 (2010-07-22), 
all of SQLite memory allocation interfaces
are considered stable and will be supported in future releases.</p>
Changes to pages/privatebranch.in.
<title>Private Branches Of SQLite</title>
<tcl>hd_keywords {private branches}</tcl>

<h1 align="center">
Maintaining Private Branches Of SQLite
</h1>

<h2>1.0 Introduction</h2>

<p>SQLite is designed to meet most developers' needs without any 
changes or customization.  When changes are needed, they can normally
be accomplished using start-time [sqlite3_config | (1)]
or runtime
[sqlite3_db_config | (2)]
[sqlite3_limit | (3)]
<title>Maintaining Private Branches Of SQLite</title>
<tcl>hd_keywords {private branches}</tcl>

<table_of_contents>



<h1> Introduction</h1>

<p>SQLite is designed to meet most developers' needs without any 
changes or customization.  When changes are needed, they can normally
be accomplished using start-time [sqlite3_config | (1)]
or runtime
[sqlite3_db_config | (2)]
[sqlite3_limit | (3)]
This article is not trying to impose a particular procedure on 
maintainers of private branches.  The point of this article is to offer
an example of one process for maintaining a private branch which can
be used as a template for designing processes best suited for the
circumstances of each individual project.</p>

<img src="images/private_branch.gif" align="right">
<h2>2.0 The Basic Idea</h2>


<p>We propose to use the
[http://www.fossil-scm.org | fossil software configuration management]
system to set up two branches.  One branch (the "public branch" or "trunk")
contains the published SQLite sources and the other branch is the 
private branch which contains the code that is customized for the project.
This article is not trying to impose a particular procedure on 
maintainers of private branches.  The point of this article is to offer
an example of one process for maintaining a private branch which can
be used as a template for designing processes best suited for the
circumstances of each individual project.</p>

<img src="images/private_branch.gif" align="right">
<h1> The Basic Idea</h1>


<p>We propose to use the
[http://www.fossil-scm.org | fossil software configuration management]
system to set up two branches.  One branch (the "public branch" or "trunk")
contains the published SQLite sources and the other branch is the 
private branch which contains the code that is customized for the project.
<p>The cycle above can be repeated many times.  The
diagram shows a third SQLite release, 3.6.17 in
circle (6).  The private branch maintainer can do
another merge in order to incorporate the changes
moving from (4) to (6) into the private branch, resulting
in version (7).</p>

<h2>3.0 The Procedure</h2>

<p>The remainder of this document will guide the reader through
the steps needed to maintain a private branch.  The general idea
is the same as outlined above.  This section merely provides more
detail.</p>

<p>We emphasize again that these steps are not intended to be the only
acceptable method for maintaining a private branch.  This approach
is one of many.  Use this document as a baseline for preparing
project-specific procedures.  Do not be afraid to experiment.</p>

<h3>3.1 Obtain The Software</h3>

<p>[http://www.fossil-scm.org/ | Fossil] is a computer program
that must be installed on your machine before you use it.
Fortunately, installing fossil is very easy.  Fossil is a single
"*.exe" file that you simply download and run.  To uninstall fossil,
simply delete the exe file.  
[http://www.fossil-scm.org/index.html/doc/tip/www/quickstart.wiki | Detailed instructions] for installing and getting started with
fossil are available on the 
[http://www.fossil-scm.org | fossil website].</p>

<h3>3.2 Create A Project Repository</h3>

<p>Create a fossil repository to host the private branch using the
following command:</p>

<blockquote><pre>
fossil new private-project.fossil
</pre></blockquote>
<p>The cycle above can be repeated many times.  The
diagram shows a third SQLite release, 3.6.17 in
circle (6).  The private branch maintainer can do
another merge in order to incorporate the changes
moving from (4) to (6) into the private branch, resulting
in version (7).</p>

<h1> The Procedure</h1>

<p>The remainder of this document will guide the reader through
the steps needed to maintain a private branch.  The general idea
is the same as outlined above.  This section merely provides more
detail.</p>

<p>We emphasize again that these steps are not intended to be the only
acceptable method for maintaining a private branch.  This approach
is one of many.  Use this document as a baseline for preparing
project-specific procedures.  Do not be afraid to experiment.</p>

<h2> Obtain The Software</h2>

<p>[http://www.fossil-scm.org/ | Fossil] is a computer program
that must be installed on your machine before you use it.
Fortunately, installing fossil is very easy.  Fossil is a single
"*.exe" file that you simply download and run.  To uninstall fossil,
simply delete the exe file.  
[http://www.fossil-scm.org/index.html/doc/tip/www/quickstart.wiki | Detailed instructions] for installing and getting started with
fossil are available on the 
[http://www.fossil-scm.org | fossil website].</p>

<h2> Create A Project Repository</h2>

<p>Create a fossil repository to host the private branch using the
following command:</p>

<blockquote><pre>
fossil new private-project.fossil
</pre></blockquote>
fossil open private-project.fossil
</pre></blockquote>

<p>You can have multiple checkouts of the same project if you want. 
And you can "clone" the repository to different machines so that multiple
developers can use it.  See the fossil website for further information.</p>

<h3>3.3 Installing The SQLite Baseline In Fossil</h3>

<p>The repository created in the previous step is initially empty.  The
next step is to load the baseline SQLite release - circle (1) in the diagram
above.</p>

<p>Begin by obtaining a copy of SQLite in whatever form you use it.
The public SQLite you obtain should be as close to your private edited
fossil open private-project.fossil
</pre></blockquote>

<p>You can have multiple checkouts of the same project if you want. 
And you can "clone" the repository to different machines so that multiple
developers can use it.  See the fossil website for further information.</p>

<h2> Installing The SQLite Baseline In Fossil</h2>

<p>The repository created in the previous step is initially empty.  The
next step is to load the baseline SQLite release - circle (1) in the diagram
above.</p>

<p>Begin by obtaining a copy of SQLite in whatever form you use it.
The public SQLite you obtain should be as close to your private edited

<blockquote><pre>
fossil timeline
fossil info
fossil status
</pre></blockquote>

<h3>3.4 Creating The Private Branch</h3>

<p>The previous step created circle (1) in the diagram above.
This step will create circle (2).  Run the following command:</p>

<blockquote><pre> 
fossil branch new private trunk -bgcolor "#add8e8"
</pre></blockquote>

<blockquote><pre>
fossil timeline
fossil info
fossil status
</pre></blockquote>

<h2> Creating The Private Branch</h2>

<p>The previous step created circle (1) in the diagram above.
This step will create circle (2).  Run the following command:</p>

<blockquote><pre> 
fossil branch new private trunk -bgcolor "#add8e8"
</pre></blockquote>
fossil update trunk
</pre></blockquote>

<p>Normally, fossil will modify all the files in your checkout when switching
between the private and the public branches.  But at this point, the files
are identical in both branches so no modifications need to be made.</p>

<h3>3.5 Adding Customizations To The Code In The Private Branch</h3>

<p>Now it is time to make the private, custom modifications to SQLite
which are the whole point of this exercise.  Switch to the private branch
(if you are not already there) using the "<tt>fossil update private</tt>"
command, then bring up the source files in your text editor and make
whatever changes you want to make.  Once you have finished making
changes, commit those changes using this command:</p>
fossil update trunk
</pre></blockquote>

<p>Normally, fossil will modify all the files in your checkout when switching
between the private and the public branches.  But at this point, the files
are identical in both branches so no modifications need to be made.</p>

<h2> Adding Customizations To The Code In The Private Branch</h2>

<p>Now it is time to make the private, custom modifications to SQLite
which are the whole point of this exercise.  Switch to the private branch
(if you are not already there) using the "<tt>fossil update private</tt>"
command, then bring up the source files in your text editor and make
whatever changes you want to make.  Once you have finished making
changes, commit those changes using this command:</p>
<p>Note that in the diagram above, we showed the private edits as a single
commit.  This was for clarity of presentation only.  There is nothing to stop
you from doing dozens or hundreds of separate tiny changes and committing
each separately.  In fact, making many small changes is the preferred way
to work.  The only reason for doing all the changes in a single commit
is that it makes the diagram easier to draw.</p>

<h3>3.6 Incorporating New Public SQLite Releases</h3>

<p>Suppose that after a while (about a month, usually) a new version of
SQLite is released: 3.6.16.  You will want to incorporate this new
public version of SQLite into your repository in the public branch (the
trunk).  To do this, first change your repository over to the trunk:</p>

<blockquote><pre>
fossil update trunk
</pre></blockquote>

<p>Then download the new version of the SQLite sources and overwrite the
files that are in the checkout.</p>

<p>If you made NL to CR-NL line ending changes or space to tab
indentation changes in the original baseline, make the same changes
to the new source file.</p>

<p>Once everything is ready, run the "<tt>fossil commit</tt>" command to
check in the changes.  This creates circle (4) in the diagram above.</p>

<h3>3.7 Merging Public SQLite Updates Into The Private Branch</h3>

<p>The next step is to move the changes in the public branch over into
the private branch.  In other words, we want to create circle (5) in the
diagram above.  Begin by changing to the private branch using
"<tt>fossil update private</tt>".  Then type this command:</p>

<blockquote><pre>
<p>Note that in the diagram above, we showed the private edits as a single
commit.  This was for clarity of presentation only.  There is nothing to stop
you from doing dozens or hundreds of separate tiny changes and committing
each separately.  In fact, making many small changes is the preferred way
to work.  The only reason for doing all the changes in a single commit
is that it makes the diagram easier to draw.</p>

<h2> Incorporating New Public SQLite Releases</h2>

<p>Suppose that after a while (about a month, usually) a new version of
SQLite is released: 3.6.16.  You will want to incorporate this new
public version of SQLite into your repository in the public branch (the
trunk).  To do this, first change your repository over to the trunk:</p>

<blockquote><pre>
fossil update trunk
</pre></blockquote>

<p>Then download the new version of the SQLite sources and overwrite the
files that are in the checkout.</p>

<p>If you made NL to CR-NL line ending changes or space to tab
indentation changes in the original baseline, make the same changes
to the new source file.</p>

<p>Once everything is ready, run the "<tt>fossil commit</tt>" command to
check in the changes.  This creates circle (4) in the diagram above.</p>
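
<p>For example (the commit message text is illustrative):</p>

<blockquote><pre>
fossil commit -m "Import public SQLite release 3.6.16"
</pre></blockquote>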

<h2> Merging Public SQLite Updates Into The Private Branch</h2>

<p>The next step is to move the changes in the public branch over into
the private branch.  In other words, we want to create circle (5) in the
diagram above.  Begin by changing to the private branch using
"<tt>fossil update private</tt>".  Then type this command:</p>

<blockquote><pre>
conflicts and manually resolve the conflicts.</p>

<p>After resolving conflicts, many users like to compile and test the
new version before committing it to the repository.  Or you can commit
first and test later.  Either way, run the "<tt>fossil commit</tt>"
command to check-in the circle (5) version.

<h3>3.8 Further Updates</h3>

<p>As new versions of SQLite are released, repeat steps 3.6 and 3.7 to
add changes in the new release to the private branch.
Additional private changes can be
made on the private branch in between releases if desired.</p>

<h2>4.0 Variations</h2>

<p>Since this document was first written, the canonical SQLite source code
has been moved from the venerable CVS system into a Fossil repository at
[http://www.sqlite.org/src].  This means that if you are working with
canonical SQLite source code (as opposed to the [amalgamation] source code
files, sqlite3.c and sqlite3.h) then you can create a private repository
simply by cloning the official repository:</p>

conflicts and manually resolve the conflicts.</p>

<p>After resolving conflicts, many users like to compile and test the
new version before committing it to the repository.  Or you can commit
first and test later.  Either way, run the "<tt>fossil commit</tt>"
command to check-in the circle (5) version.

<h2> Further Updates</h2>

<p>As new versions of SQLite are released, repeat steps 3.6 and 3.7 to
add changes in the new release to the private branch.
Additional private changes can be
made on the private branch in between releases if desired.</p>

<h1> Variations</h1>

<p>Since this document was first written, the canonical SQLite source code
has been moved from the venerable CVS system into a Fossil repository at
[http://www.sqlite.org/src].  This means that if you are working with
canonical SQLite source code (as opposed to the [amalgamation] source code
files, sqlite3.c and sqlite3.h) then you can create a private repository
simply by cloning the official repository:</p>
Changes to pages/queryplanner-ng.in.
  hd_puts "<center><table><tr><td><pre>\n"
  regsub {^[ \n]*\n} $txt {} txt
  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
hd_keywords {next generation query planner} {Next Generation Query Planner} NGQP
</tcl>
<h1 align="center">The Next Generation Query Planner</h1>


<h2>1.0 Introduction</h2>

<p>
The task of the "query planner" is to figure
out the best algorithm or "query plan" to accomplish an SQL statement.
Beginning with SQLite version 3.8.0, the query planner component has been
rewritten so that it runs faster and generates better plans.  The
rewrite is called the "next generation query planner" or "NGQP".

  hd_puts "<center><table><tr><td><pre>\n"
  regsub {^[ \n]*\n} $txt {} txt
  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
hd_keywords {next generation query planner} {Next Generation Query Planner} NGQP
</tcl>

<table_of_contents>

<h1> Introduction</h1>

<p>
The task of the "query planner" is to figure
out the best algorithm or "query plan" to accomplish an SQL statement.
Beginning with SQLite version 3.8.0, the query planner component has been
rewritten so that it runs faster and generates better plans.  The
rewrite is called the "next generation query planner" or "NGQP".
regressions.  This risk is considered and a checklist is provided
for reducing the risk and for fixing any issues that do arise.</p>

<p>This document focuses on the NGQP.  For a more general overview of the
SQLite query planner that encompasses the entire history of SQLite, see
"[query planner | The SQLite Query Optimizer Overview]".</p>

<h2>2.0 Background</h2>

<p>For simple queries against a single table with few indices, there is usually
an obvious choice for the best algorithm.
But for larger and more complex queries, such as
multi-way joins with many indices
and subqueries, there can be hundreds, thousands, or millions
of reasonable algorithms for computing the result.
regressions.  This risk is considered and a checklist is provided
for reducing the risk and for fixing any issues that do arise.</p>

<p>This document focuses on the NGQP.  For a more general overview of the
SQLite query planner that encompasses the entire history of SQLite, see
"[query planner | The SQLite Query Optimizer Overview]".</p>

<h1> Background</h1>

<p>For simple queries against a single table with few indices, there is usually
an obvious choice for the best algorithm.
But for larger and more complex queries, such as
multi-way joins with many indices
and subqueries, there can be hundreds, thousands, or millions
of reasonable algorithms for computing the result.
without actually running that plan.  So when comparing two
or more plans to figure out which is "best", the query planner has to make
some guesses and assumptions and those guesses and assumptions will 
sometimes be wrong. A good query planner is one that will
find the correct solution often enough that application
programmers rarely need to get involved.</p>

<h3>2.1 Query Planning In SQLite</h3>

<p>SQLite computes joins using nested loops, 
one loop for each table
in the join.  (Additional loops might be inserted for IN
and OR operators in the WHERE clause.  SQLite considers those too,
but for simplicity we will ignore them in this essay.)
One or more indices might be used on each loop to speed the search,
or a loop might be a "full table scan" that reads every row in the
table.  Thus query planning decomposes into two subtasks:</p>
<ol>
<li> Picking the nesting order of the various loops
<li> Choosing good indices for each loop
</ol>
<p>Picking the nesting order is generally the more challenging problem.
Once the nesting order of the join is established, the choice of indices
for each loop is normally obvious.</p>

<tcl>hd_fragment qpstab {query planner stability guarantee}</tcl>
<h3>2.2 The SQLite Query Planner Stability Guarantee</h3>

<p>SQLite will always pick the same query plan for any
given SQL statement as long as:
<ol type="a">
<li>the database schema does not change in significant ways such as 
    adding or dropping indices,</li>
<li>the ANALYZE command is not rerun, </li>

without actually running that plan.  So when comparing two
or more plans to figure out which is "best", the query planner has to make
some guesses and assumptions and those guesses and assumptions will 
sometimes be wrong. A good query planner is one that will
find the correct solution often enough that application
programmers rarely need to get involved.</p>

<h2> Query Planning In SQLite</h2>

<p>SQLite computes joins using nested loops, 
one loop for each table
in the join.  (Additional loops might be inserted for IN
and OR operators in the WHERE clause.  SQLite considers those too,
but for simplicity we will ignore them in this essay.)
One or more indices might be used on each loop to speed the search,
or a loop might be a "full table scan" that reads every row in the
table.  Thus query planning decomposes into two subtasks:</p>
<ol>
<li> Picking the nesting order of the various loops
<li> Choosing good indices for each loop
</ol>
<p>Picking the nesting order is generally the more challenging problem.
Once the nesting order of the join is established, the choice of indices
for each loop is normally obvious.</p>

<tcl>hd_fragment qpstab {query planner stability guarantee}</tcl>
<h2> The SQLite Query Planner Stability Guarantee</h2>

<p>SQLite will always pick the same query plan for any
given SQL statement as long as:
<ol type="a">
<li>the database schema does not change in significant ways such as 
    adding or dropping indices,</li>
<li>the ANALYZE command is not rerun, </li>
a different version of SQLite, then query plans might change.  In rare
cases, an SQLite version change might lead to a performance regression.  
This is one reason
you should consider statically linking your applications against SQLite 
rather than use a system-wide SQLite shared library which might 
change without your knowledge or control.</p>

<h2>3.0 A Difficult Case</h2>

<p>
"TPC-H Q8" is a test query from the
<a href="http://www.tpc.org/tpch/">Transaction Processing Performance
Council</a>.  The query planners in SQLite versions 3.7.17 and earlier 
do not choose good plans for TPC-H Q8.  And it has been determined that 
no amount
of tweaking of the legacy query planner will fix that.  In order to find
a good solution to the TPC-H Q8 query, and to continue improving the 
quality of SQLite's query planner, it became necessary to redesign the
query planner.  This section tries to explain why this redesign was
necessary and how the NGQP is different and addresses the TPC-H Q8 problem.
</p>

<h3>3.1 Query Details</h3>

<p>
TPC-H Q8 is an eight-way join.
As observed above, the main task of the query planner is
to figure out the best nesting order of the eight loops in order to minimize
the work needed to complete the join.
A simplified model of this problem for the case of TPC-H Q8 is shown
a different version of SQLite, then query plans might change.  In rare
cases, an SQLite version change might lead to a performance regression.  
This is one reason
you should consider statically linking your applications against SQLite 
rather than use a system-wide SQLite shared library which might 
change without your knowledge or control.</p>

<h1> A Difficult Case</h1>

<p>
"TPC-H Q8" is a test query from the
<a href="http://www.tpc.org/tpch/">Transaction Processing Performance
Council</a>.  The query planners in SQLite versions 3.7.17 and earlier 
do not choose good plans for TPC-H Q8.  And it has been determined that 
no amount
of tweaking of the legacy query planner will fix that.  In order to find
a good solution to the TPC-H Q8 query, and to continue improving the 
quality of SQLite's query planner, it became necessary to redesign the
query planner.  This section tries to explain why this redesign was
necessary and how the NGQP is different and addresses the TPC-H Q8 problem.
</p>

<h2> Query Details</h2>

<p>
TPC-H Q8 is an eight-way join.
As observed above, the main task of the query planner is
to figure out the best nesting order of the eight loops in order to minimize
the work needed to complete the join.
A simplified model of this problem for the case of TPC-H Q8 is shown
a minimum-cost path through the graph that visits each node
exactly once.</p>

<p>(Side note:  The cost estimates in the TPC-H Q8 graph were computed
by the query planner in SQLite 3.7.16 and converted using a natural logarithm.)
</p>

<h3>3.2 Complications</h3>

<p>The presentation of the query planner problem above is a simplification.
The costs are estimates.  We cannot
know what the true cost of running a loop is until we actually run the loop.
SQLite makes guesses for the cost of running a loop based on
the availability of indices and constraints found in the WHERE
clause.  These guesses are usually pretty good, but they can sometimes be
a minimum-cost path through the graph that visits each node
exactly once.</p>

<p>(Side note:  The cost estimates in the TPC-H Q8 graph were computed
by the query planner in SQLite 3.7.16 and converted using a natural logarithm.)
</p>

<h2> Complications</h2>

<p>The presentation of the query planner problem above is a simplification.
The costs are estimates.  We cannot
know what the true cost of running a loop is until we actually run the loop.
SQLite makes guesses for the cost of running a loop based on
the availability of indices and constraints found in the WHERE
clause.  These guesses are usually pretty good, but they can sometimes be
<p>In the TPC-H Q8 query, the setup costs are all negligible,
all dependencies are between individual nodes, and there is no ORDER BY,
GROUP BY, or DISTINCT clause. So for TPC-H Q8,
the graph above is a reasonable representation of what needs to be computed.
The general case involves a lot of extra complication, which for clarity
is neglected in the remainder of this article.</p>

<h3>3.3 Finding The Best Query Plan</h3>

<p>Prior to version 3.8.0, SQLite always used
the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan.
The NN heuristic makes a single traversal of the graph, always choosing
the lowest-cost arc as the next step.  
The NN heuristic works surprisingly well in most cases.
And NN is fast, so that SQLite is able to quickly find good plans
<p>In the TPC-H Q8 query, the setup costs are all negligible,
all dependencies are between individual nodes, and there is no ORDER BY,
GROUP BY, or DISTINCT clause. So for TPC-H Q8,
the graph above is a reasonable representation of what needs to be computed.
The general case involves a lot of extra complication, which for clarity
is neglected in the remainder of this article.</p>

<h2> Finding The Best Query Plan</h2>

<p>Prior to version 3.8.0, SQLite always used
the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan.
The NN heuristic makes a single traversal of the graph, always choosing
the lowest-cost arc as the next step.  
The NN heuristic works surprisingly well in most cases.
And NN is fast, so that SQLite is able to quickly find good plans
<p>One solution to this problem is to change SQLite to do an exhaustive
search for the best path.  But an exhaustive search requires time 
proportional to
K! (where K is the number of tables in the join) and so when you get 
beyond a 10-way join, the time
to run [sqlite3_prepare()] becomes very large.</p>

<h3>3.4 The N Nearest Neighbors or "N3" Heuristic</h3>

<p>The NGQP uses a new heuristic for seeking the best path through the
graph: "N Nearest Neighbors" (hereafter "N3").  With N3, instead of
choosing just one nearest neighbor for each step, the algorithm keeps
track of the N best paths at each step for some small integer N.</p>

<p>Suppose N=4.  Then for the TPC-H Q8 graph, the first step finds
<p>One solution to this problem is to change SQLite to do an exhaustive
search for the best path.  But an exhaustive search requires time 
proportional to
K! (where K is the number of tables in the join) and so when you get 
beyond a 10-way join, the time
to run [sqlite3_prepare()] becomes very large.</p>

<h2> The N Nearest Neighbors or "N3" Heuristic</h2>

<p>The NGQP uses a new heuristic for seeking the best path through the
graph: "N Nearest Neighbors" (hereafter "N3").  With N3, instead of
choosing just one nearest neighbor for each step, the algorithm keeps
track of the N best paths at each step for some small integer N.</p>

<p>Suppose N=4.  Then for the TPC-H Q8 graph, the first step finds
when N is 10 or greater.</p>

<p>The initial implementation of NGQP chooses N=1 for simple queries, N=5
for two-way joins and N=10 for all joins with three or more tables.  This
formula for selecting N might change in subsequent releases.</p>

<tcl>hd_fragment hazards {hazards of upgrading to the NGQP}</tcl>
<h2>4.0 Hazards Of Upgrading To NGQP</h2>

<p>For most applications, upgrading from the legacy query planner to the NGQP
requires little thought or effort.
Simply replace the older SQLite version with the newer version of SQLite,
recompile, and the application will run faster.
There are no API changes or modifications
to compilation procedures.</p>
when N is 10 or greater.</p>

<p>The initial implementation of NGQP chooses N=1 for simple queries, N=5
for two-way joins and N=10 for all joins with three or more tables.  This
formula for selecting N might change in subsequent releases.</p>
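
<p>To make the search strategy concrete, here is a small self-contained
sketch of an N3-style search.  It is illustrative only: the node count,
cost matrix, and choice of N are invented, and this is not SQLite's
actual implementation.</p>

<blockquote><pre>
#include <stdio.h>

#define K 4       /* number of tables (nodes) in the join */
#define NBEST 2   /* partial paths retained at each step */

typedef struct Path {
  double cost;    /* total estimated cost of this partial path */
  int n;          /* number of nodes visited so far */
  int node[K];    /* the nodes, in nesting order */
} Path;

/* Hypothetical arc costs: aCost[i][j] is the cost of running the
** loop for table j inside the loop for table i. */
static const double aCost[K][K] = {
  {0, 2, 9, 4}, {3, 0, 6, 8}, {7, 1, 0, 5}, {2, 6, 3, 0}
};

static int visited(const Path *p, int j){
  int i;
  for(i=0; i<p->n; i++) if( p->node[i]==j ) return 1;
  return 0;
}

int main(void){
  Path cur[NBEST*K], best[NBEST];
  int nCur, nBest, i, j, step;

  for(j=0; j<K; j++){ cur[j].cost = 0; cur[j].n = 1; cur[j].node[0] = j; }
  nCur = K;
  for(step=1; step<=K; step++){
    /* Keep only the NBEST cheapest partial paths (partial selection sort). */
    nBest = nCur<NBEST ? nCur : NBEST;
    for(i=0; i<nBest; i++){
      int iMin = i;
      Path tmp;
      for(j=i+1; j<nCur; j++) if( cur[j].cost<cur[iMin].cost ) iMin = j;
      tmp = cur[i]; cur[i] = cur[iMin]; cur[iMin] = tmp;
      best[i] = cur[i];
    }
    if( step==K ) break;
    /* Extend each retained path by every not-yet-visited node. */
    nCur = 0;
    for(i=0; i<nBest; i++){
      for(j=0; j<K; j++){
        if( !visited(&best[i], j) ){
          Path p = best[i];
          p.cost += aCost[p.node[p.n-1]][j];
          p.node[p.n++] = j;
          cur[nCur++] = p;
        }
      }
    }
  }
  printf("best nesting order found:");
  for(i=0; i<K; i++) printf(" t%d", best[0].node[i]);
  printf(" (estimated cost %g)\n", best[0].cost);
  return 0;
}
</pre></blockquote>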

<tcl>hd_fragment hazards {hazards of upgrading to the NGQP}</tcl>
<h1> Hazards Of Upgrading To NGQP</h1>

<p>For most applications, upgrading from the legacy query planner to the NGQP
requires little thought or effort.
Simply replace the older SQLite version with the newer version of SQLite,
recompile, and the application will run faster.
There are no API changes or modifications
to compilation procedures.</p>
</ul>

<p>Not all applications meet these conditions.  Fortunately,
the NGQP will still usually find good query plans, even without these conditions.
However, cases do arise (rarely) where performance regressions can occur.</p>

<tcl>hd_fragment fossilcasestudy {The Fossil NGQP Upgrade Case Study}</tcl>
<h3>4.1 Case Study: Upgrading Fossil to the NGQP</h3>

<p>The <a href="http://www.fossil-scm.org/">Fossil DVCS</a> is the version
control system used to track all of the SQLite source code.
A Fossil repository is an SQLite database file.
(Readers are invited to ponder this recursion as an independent exercise.)
Fossil is both the version-control system for SQLite and a test
platform for SQLite.  Whenever enhancements are made to SQLite, 
</ul>

<p>Not all applications meet these conditions.  Fortunately,
the NGQP will still usually find good query plans, even without these conditions.
However, cases do arise (rarely) where performance regressions can occur.</p>

<tcl>hd_fragment fossilcasestudy {The Fossil NGQP Upgrade Case Study}</tcl>
<h2> Case Study: Upgrading Fossil to the NGQP</h2>

<p>The <a href="http://www.fossil-scm.org/">Fossil DVCS</a> is the version
control system used to track all of the SQLite source code.
A Fossil repository is an SQLite database file.
(Readers are invited to ponder this recursion as an independent exercise.)
Fossil is both the version-control system for SQLite and a test
platform for SQLite.  Whenever enhancements are made to SQLite, 
<p>(Side note:  The cost estimates in the two most recent graphs 
were computed by the NGQP using a base-2 logarithm and slightly different
cost assumptions compared to the legacy query planner.  
Hence, the cost estimates in
these latter two graphs are not directly comparable to the cost estimates
in the TPC-H Q8 graph.)</p>

<h3>4.2 Fixing The Problem</h3>

<p>Running [ANALYZE] on the repository database immediately fixed the
performance problem.  However, we want Fossil to be robust and to always
work quickly regardless of whether or not its repository has been analyzed.
For this reason, the query was modified to use the CROSS JOIN operator 
instead of the plain JOIN operator.
SQLite will not reorder the tables of a CROSS JOIN.
<p>(Side note:  The cost estimates in the two most recent graphs 
were computed by the NGQP using a base-2 logarithm and slightly different
cost assumptions compared to the legacy query planner.  
Hence, the cost estimates in
these latter two graphs are not directly comparable to the cost estimates
in the TPC-H Q8 graph.)</p>

<h2> Fixing The Problem</h2>

<p>Running [ANALYZE] on the repository database immediately fixed the
performance problem.  However, we want Fossil to be robust and to always
work quickly regardless of whether or not its repository has been analyzed.
For this reason, the query was modified to use the CROSS JOIN operator 
instead of the plain JOIN operator.
SQLite will not reorder the tables of a CROSS JOIN.
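
<p>To make the rewrite concrete, here is a minimal sketch (the table and
column names are hypothetical stand-ins, not the actual Fossil query):</p>

<blockquote><pre>
-- Before: the query planner is free to reorder t1 and t2
SELECT * FROM t1 JOIN t2 ON t1.x=t2.y;

-- After: t1 is always the outer loop of the join
SELECT * FROM t1 CROSS JOIN t2 ON t1.x=t2.y;
</pre></blockquote>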
than PLINK_I1 and algorithm-2 really would be the faster choice.
However such repositories are very unlikely to appear in
practice, and so hard-coding the loop nesting order using the
CROSS JOIN syntax is a reasonable solution
to the problem in this case.</p>

<tcl>hd_fragment howtofix {query planner checklist}</tcl>
<h2>5.0 Checklist For Avoiding Or Fixing Query Planner Problems</h2>

<ol>
<li><p><b>Don't panic!</b>
Cases where the query planner picks an inferior plan are actually quite
rare.  You are unlikely to run across any problems in your application.
If you are not having performance issues, you do not need to worry
about any of this.</p>
than PLINK_I1 and algorithm-2 really would be the faster choice.
However such repositories are very unlikely to appear in
practice, and so hard-coding the loop nesting order using the
CROSS JOIN syntax is a reasonable solution
to the problem in this case.</p>

<tcl>hd_fragment howtofix {query planner checklist}</tcl>
<h1> Checklist For Avoiding Or Fixing Query Planner Problems</h1>

<ol>
<li><p><b>Don't panic!</b>
Cases where the query planner picks an inferior plan are actually quite
rare.  You are unlikely to run across any problems in your application.
If you are not having performance issues, you do not need to worry
about any of this.</p>
<li><p><b>Use the [INDEXED BY] syntax to enforce the selection of
particular indices on problem queries.</b>
As with the previous two bullets, avoid this step if possible, and 
especially avoid doing this early in development as it is clearly a
premature optimization.</p>
</ol>

<h2>6.0 Summary</h2>

<p>The query planner in SQLite normally does a terrific job of selecting
fast algorithms for running your SQL statements.  This is true of the 
legacy query planner and even more true of the new NGQP.  There may be
an occasional situation where, due to incomplete information, the query
planner selects a suboptimal plan.  This will happen less often with the
NGQP than with the legacy query planner, but it might still happen.  Only
in those rare cases do application developers need to get involved and
help the query planner to do the right thing.  In the common case, the
NGQP is just a new enhancement to SQLite that makes the application run
a little faster and which requires no new developer thought or action.</p>
<li><p><b>Use the [INDEXED BY] syntax to enforce the selection of
particular indices on problem queries.</b>
As with the previous two bullets, avoid this step if possible, and 
especially avoid doing this early in development as it is clearly a
premature optimization.</p>
</ol>
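
<p>As a minimal sketch of the INDEXED BY technique from the final checklist
item (the table and index names here are hypothetical):</p>

<blockquote><pre>
CREATE INDEX idx_t1_x ON t1(x);

-- Fails with an error, rather than silently picking another plan,
-- if idx_t1_x cannot be used:
SELECT * FROM t1 INDEXED BY idx_t1_x WHERE x=5;
</pre></blockquote>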

<h1> Summary</h1>

<p>The query planner in SQLite normally does a terrific job of selecting
fast algorithms for running your SQL statements.  This is true of the 
legacy query planner and even more true of the new NGQP.  There may be
an occasional situation where, due to incomplete information, the query
planner selects a suboptimal plan.  This will happen less often with the
NGQP than with the legacy query planner, but it might still happen.  Only
in those rare cases do application developers need to get involved and
help the query planner to do the right thing.  In the common case, the
NGQP is just a new enhancement to SQLite that makes the application run
a little faster and which requires no new developer thought or action.</p>
Changes to pages/queryplanner.in.
proc code {txt} {
  hd_puts "<center><table><tr><td><pre>\n"
  regsub {^[ \n]*\n} $txt {} txt
  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
</tcl>
<h1 align="center">Query Planning</h1>




<p>
The best feature of SQL (in <u>all</u> its implementations, not just SQLite)
is that it is a <i>declarative</i> language, not a <i>procedural</i>
language.  When programming in SQL you tell the system <i>what</i> you
want to compute, not <i>how</i> to compute it.  The task of figuring out
the <i>how</i> is delegated to the <i>query planner</i> subsystem within 
proc code {txt} {
  hd_puts "<center><table><tr><td><pre>\n"
  regsub {^[ \n]*\n} $txt {} txt
  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
</tcl>

<table_of_contents>

<h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> 

<p>
The best feature of SQL (in <u>all</u> its implementations, not just SQLite)
is that it is a <i>declarative</i> language, not a <i>procedural</i>
language.  When programming in SQL you tell the system <i>what</i> you
want to compute, not <i>how</i> to compute it.  The task of figuring out
the <i>how</i> is delegated to the <i>query planner</i> subsystem within 
with background information to help them understand
what is going on behind the scenes with SQLite, which in turn should make
it easier for programmers to create the indices that will help the SQLite
query planner to pick the best plans.
</p>

<tcl>hd_fragment searching strategies</tcl>
<h2>1.0 Searching</h2>

<h3>1.1 Tables Without Indices</h3>

<p>
Most tables in SQLite consist of zero or more rows with a unique integer
key (the [rowid] or [INTEGER PRIMARY KEY]) followed by content.  
(The exception is [WITHOUT ROWID] tables.)
The rows
are logically stored in order of increasing rowid.  As an example, this
with background information to help them understand
what is going on behind the scenes with SQLite, which in turn should make
it easier for programmers to create the indices that will help the SQLite
query planner to pick the best plans.
</p>

<tcl>hd_fragment searching strategies</tcl>
<h1> Searching</h1>

<h2> Tables Without Indices</h2>

<p>
Most tables in SQLite consist of zero or more rows with a unique integer
key (the [rowid] or [INTEGER PRIMARY KEY]) followed by content.  
(The exception is [WITHOUT ROWID] tables.)
The rows
are logically stored in order of increasing rowid.  As an example, this
With a table of only 7 rows, this is not a big deal, but if your table
contained 7 million rows, a full table scan might read megabytes of content
in order to find a single 8-byte number.  For that reason, one normally
tries to avoid full table scans.
</p>

<tcl>figure 2 #fig2 fullscan.gif {Full Table Scan}</tcl>

<h3>1.2 Lookup By Rowid</h3>

<p>
One technique for avoiding a full table scan is to do lookups by
rowid (or by the equivalent INTEGER PRIMARY KEY).   To lookup the
price of peaches, one would query for the entry with a rowid of 4:
</p>
With a table of only 7 rows, this is not a big deal, but if your table
contained 7 million rows, a full table scan might read megabytes of content
in order to find a single 8-byte number.  For that reason, one normally
tries to avoid full table scans.
</p>

<tcl>figure 2 #fig2 fullscan.gif {Full Table Scan}</tcl>

<h2> Lookup By Rowid</h2>

<p>
One technique for avoiding a full table scan is to do lookups by
rowid (or by the equivalent INTEGER PRIMARY KEY).   To lookup the
price of peaches, one would query for the entry with a rowid of 4:
</p>

to N as in a full table scan.  If the table contains 10 million elements,
that means the query will be on the order of N/logN or about 1 million
times faster!
</p>

<tcl>figure 3 #fig3 rowidlu.gif {Lookup By Rowid}</tcl>

<h3>1.3 Lookup By Index</h3>
<p>
The problem with looking up information by rowid is that you probably
do not care what the price of "item 4" is - you want to know the price
of peaches.  And so a rowid lookup is not helpful.
</p>

<p>

to N as in a full table scan.  If the table contains 10 million elements,
that means the query will be on the order of N/logN or about 1 million
times faster!
</p>

<tcl>figure 3 #fig3 rowidlu.gif {Lookup By Rowid}</tcl>

<h2> Lookup By Index</h2>
<p>
The problem with looking up information by rowid is that you probably
do not care what the price of "item 4" is - you want to know the price
of peaches.  And so a rowid lookup is not helpful.
</p>

<p>

<p>
SQLite has to do two binary searches to find the price of peaches using
the method shown above.  But for a table with a large number of rows, this
is still much faster than doing a full table scan.
</p>

<h3>1.4 Multiple Result Rows</h3>

<p>
In the previous query the fruit='Peach' constraint narrowed the result
down to a single row.  But the same technique works even if multiple
rows are obtained.  Suppose we looked up the price of Oranges instead of
Peaches:
</p>

<p>
SQLite has to do two binary searches to find the price of peaches using
the method shown above.  But for a table with a large number of rows, this
is still much faster than doing a full table scan.
</p>

<h2> Multiple Result Rows</h2>

<p>
In the previous query the fruit='Peach' constraint narrowed the result
down to a single row.  But the same technique works even if multiple
rows are obtained.  Suppose we looked up the price of Oranges instead of
Peaches:
</p>
cheap in comparison to a binary search that we usually ignore it.  So
our estimate for the total cost of this query is 3 binary searches.
If the number of rows of output is K and the number of rows in the table
is N, then in general the cost of doing the query is proportional
to (K+1)*logN.
</p>

<h3>1.5 Multiple AND-Connected WHERE-Clause Terms</h3>

<p>
Next, suppose that you want to look up the price of not just any orange,
but specifically California-grown oranges.  The appropriate query would
be as follows:
</p>
cheap in comparison to a binary search that we usually ignore it.  So
our estimate for the total cost of this query is 3 binary searches.
If the number of rows of output is K and the number of rows in the table
is N, then in general the cost of doing the query is proportional
to (K+1)*logN.
</p>

<h2> Multiple AND-Connected WHERE-Clause Terms</h2>

<p>
Next, suppose that you want to look up the price of not just any orange,
but specifically California-grown oranges.  The appropriate query would
be as follows:
</p>

search down to two rows.  So, if all else is equal, SQLite will
choose Idx1 with the hope of narrowing the search to as small
a number of rows as possible.  This choice is only possible because
of the statistics provided by [ANALYZE].  If [ANALYZE] has not been
run then the choice of which index to use is arbitrary.
</p>

<h3>1.6 Multi-Column Indices</h3>

<p>
To get the maximum performance out of a query with multiple AND-connected
terms in the WHERE clause, you really want a multi-column index with
columns for each of the AND terms.  In this case we create a new index
on the "fruit" and "state" columns of FruitsForSale:
</p>
search down to two rows.  So, if all else is equal, SQLite will
choose Idx1 with the hope of narrowing the search to as small
a number of rows as possible.  This choice is only possible because
of the statistics provided by [ANALYZE].  If [ANALYZE] has not been
run then the choice of which index to use is arbitrary.
</p>

<h2> Multi-Column Indices</h2>

<p>
To get the maximum performance out of a query with multiple AND-connected
terms in the WHERE clause, you really want a multi-column index with
columns for each of the AND terms.  In this case we create a new index
on the "fruit" and "state" columns of FruitsForSale:
</p>
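
<p>A plausible sketch of such a two-column index (the name "Idx3" matches
references to this index later in the document):</p>

<blockquote><pre>
CREATE INDEX Idx3 ON FruitsForSale(fruit, state);
</pre></blockquote>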
Hence, a good rule of thumb is that your database schema should never
contain two indices where one index is a prefix of the other.  Drop the
index with fewer columns.  SQLite will still be able to do efficient
lookups with the longer index.
</p>

<tcl>hd_fragment covidx {covering index} {covering indices}</tcl>
<h3>1.7 Covering Indices</h3>

<p>
The "price of California oranges" query was made more efficient through
the use of a two-column index.  But SQLite can do even better with a
three-column index that also includes the "price" column:
</p>
Hence, a good rule of thumb is that your database schema should never
contain two indices where one index is a prefix of the other.  Drop the
index with fewer columns.  SQLite will still be able to do efficient
lookups with the longer index.
</p>

<tcl>hd_fragment covidx {covering index} {covering indices}</tcl>
<h2> Covering Indices</h2>

<p>
The "price of California oranges" query was made more efficient through
the use of a two-column index.  But SQLite can do even better with a
three-column index that also includes the "price" column:
</p>
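
<p>A plausible sketch of such a three-column index (the name "Idx4" matches
references to this index later in the document):</p>

<blockquote><pre>
CREATE INDEX Idx4 ON FruitsForSale(fruit, state, price);
</pre></blockquote>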

A two-fold performance increase is not nearly as dramatic as the
one-million-fold increase seen when the table was first indexed.
And for most queries, the difference between 1 microsecond and
2 microseconds is unlikely to be noticed.
</p>

<tcl>hd_fragment or_in_where or-connected-terms</tcl>
<h3>1.8 OR-Connected Terms In The WHERE Clause</h3>

<p>
Multi-column indices only work if the constraint terms in the WHERE
clause of the query are connected by AND.
So Idx3 and Idx4 are helpful when the search is for items that
are both Oranges and grown in California, but neither index would
be that useful if we wanted all items that were either oranges
A two-fold performance increase is not nearly as dramatic as the
one-million-fold increase seen when the table was first indexed.
And for most queries, the difference between 1 microsecond and
2 microseconds is unlikely to be noticed.
</p>

<tcl>hd_fragment or_in_where or-connected-terms</tcl>
<h2> OR-Connected Terms In The WHERE Clause</h2>

<p>
Multi-column indices only work if the constraint terms in the WHERE
clause of the query are connected by AND.
So Idx3 and Idx4 are helpful when the search is for items that
are both Oranges and grown in California, but neither index would
be that useful if we wanted all items that were either oranges
database engines will do just that.  But the performance gain over using
just a single index is slight and so SQLite does not implement that technique
at this time.  However, a future version of SQLite might be enhanced to support
AND-by-INTERSECT.
</p>

<tcl>hd_fragment sorting sorting</tcl>
<h2>2.0 Sorting</h2>

<p>
SQLite (like all other SQL database engines) can also use indices to
satisfy the ORDER BY clauses in a query, in addition to expediting
lookup.  In other words, indices can be used to speed up sorting as
well as searching.
</p>
database engines will do just that.  But the performance gain over using
just a single index is slight and so SQLite does not implement that technique
at this time.  However, a future version of SQLite might be enhanced to support
AND-by-INTERSECT.
</p>
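
<p>For intuition only, an OR-connected query can be thought of as the union
of two separate indexed lookups, roughly as in this hand-written sketch
(not the plan SQLite literally generates):</p>

<blockquote><pre>
SELECT rowid FROM FruitsForSale WHERE fruit='Orange'
UNION
SELECT rowid FROM FruitsForSale WHERE state='CA';
</pre></blockquote>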

<tcl>hd_fragment sorting sorting</tcl>
<h1> Sorting</h1>

<p>
SQLite (like all other SQL database engines) can also use indices to
satisfy the ORDER BY clauses in a query, in addition to expediting
lookup.  In other words, indices can be used to speed up sorting as
well as searching.
</p>
full table scan.  Furthermore, the entire output is accumulated in
temporary storage (which might be either in main memory or on disk,
depending on various compile-time and run-time settings)
which can mean that a lot of temporary storage is required to complete
the query.
</p>

<h3>2.1 Sorting By Rowid</h3>

<p>
Because sorting can be expensive, SQLite works hard to convert ORDER BY
clauses into no-ops.  If SQLite determines that output will
naturally appear in the order specified, then no sorting is done.
So, for example, if you request the output in rowid order, no sorting
will be done:
full table scan.  Furthermore, the entire output is accumulated in
temporary storage (which might be either in main memory or on disk,
depending on various compile-time and run-time settings)
which can mean that a lot of temporary storage is required to complete
the query.
</p>

<h2> Sorting By Rowid</h2>

<p>
Because sorting can be expensive, SQLite works hard to convert ORDER BY
clauses into no-ops.  If SQLite determines that output will
naturally appear in the order specified, then no sorting is done.
So, for example, if you request the output in rowid order, no sorting
will be done:
SQLite will still omit the sorting step.  But in order for output to
appear in the correct order, SQLite will do the table scan starting at
the end and working toward the beginning, rather than starting at the
beginning and working toward the end as shown in 
<a href="#fig17">figure 17</a>.
</p>

<h3>2.2 Sorting By Index</h3>

<p>
Of course, ordering the output of a query by rowid is seldom useful.
Usually one wants to order the output by some other column.
</p>

<p>
SQLite will still omit the sorting step.  But in order for output to
appear in the correct order, SQLite will do the table scan starting at
the end and working toward the beginning, rather than starting at the
beginning and working toward the end as shown in 
<a href="#fig17">figure 17</a>.
</p>

<h2> Sorting By Index</h2>

<p>
Of course, ordering the output of a query by rowid is seldom useful.
Usually one wants to order the output by some other column.
</p>

<p>
time, and so this case could go either way depending on the table size and
what WHERE clause constraints were available, and so forth.  But generally
speaking, the indexed sort would probably be chosen, if for no other
reason, because it does not need to accumulate the entire result set in
temporary storage before sorting and thus uses much less temporary storage.
</p>

<h3>2.3 Sorting By Covering Index</h3>

<p>
If a covering index can be used for a query, then the multiple rowid lookups
can be avoided and the cost of the query drops dramatically.
</p>

<tcl>
figure 19 #fig19 obfruitidx4.gif {Sorting With A Covering Index}
</tcl>

<p>
With a covering index, SQLite can simply walk the index from one end to the
other and deliver the output in time proportional to N and without having to
allocate a large buffer to hold the result set.
</p>

<h2>3.0 Searching And Sorting At The Same Time</h2>

<p>
The previous discussion has treated searching and sorting as separate
topics.  But in practice, it is often the case that one wants to search
and sort at the same time.  Fortunately, it is possible to do this
using a single index.
</p>

<h3>3.1 Searching And Sorting With A Multi-Column Index</h3>

<p>
Suppose we want to find the prices of all kinds of oranges sorted in
order of the state where they are grown.  The query is this:
</p>

<tcl>
time, and so this case could go either way depending on the table size and
what WHERE clause constraints were available, and so forth.  But generally
speaking, the indexed sort would probably be chosen, if for no other
reason, because it does not need to accumulate the entire result set in
temporary storage before sorting and thus uses much less temporary storage.
</p>

<h2> Sorting By Covering Index</h2>

<p>
If a covering index can be used for a query, then the multiple rowid lookups
can be avoided and the cost of the query drops dramatically.
</p>

<tcl>
figure 19 #fig19 obfruitidx4.gif {Sorting With A Covering Index}
</tcl>

<p>
With a covering index, SQLite can simply walk the index from one end to the
other and deliver the output in time proportional to N and without having to
allocate a large buffer to hold the result set.
</p>
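
<p>A hypothetical example: if a covering index on (fruit, state, price)
exists, a query such as the following can be answered, already in sorted
order, by a single scan of that index:</p>

<blockquote><pre>
SELECT fruit, state, price FROM FruitsForSale ORDER BY fruit;
</pre></blockquote>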

<h1> Searching And Sorting At The Same Time</h1>

<p>
The previous discussion has treated searching and sorting as separate
topics.  But in practice, it is often the case that one wants to search
and sort at the same time.  Fortunately, it is possible to do this
using a single index.
</p>

<h2> Searching And Sorting With A Multi-Column Index</h2>

<p>
Suppose we want to find the prices of all kinds of oranges sorted in
order of the state where they are grown.  The query is this:
</p>

<tcl>
column also happens to be the first column after the fruit column in the
index.  So, if we scan entries of the index that have the same value for
the fruit column from top to bottom, those index entries are guaranteed to
be ordered by the state column.
</p>

<tcl>hd_fragment {srchsortcovidx}</tcl>
<h3>3.2 Searching And Sorting With A Covering Index</h3>

<p>
A [covering index] can also be used to search and sort at the same time.
Consider the following:
</p>

<tcl>
column also happens to be the first column after the fruit column in the
index.  So, if we scan entries of the index that have the same value for
the fruit column from top to bottom, those index entries are guaranteed to
be ordered by the state column.
</p>

<tcl>hd_fragment {srchsortcovidx}</tcl>
<h2> Searching And Sorting With A Covering Index</h2>

<p>
A [covering index] can also be used to search and sort at the same time.
Consider the following:
</p>

<tcl>
<p>
The same basic algorithm is followed, except this time the matching rows
of the index are scanned from bottom to top instead of from top to bottom,
so that the states will appear in descending order.
</p>

<tcl>hd_fragment {partialsort} {partial sorting by index} {block sorting}</tcl>
<h3>3.3 Partial Sorting Using An Index (a.k.a. Block Sorting)</h3>

<p>
Sometimes only part of an ORDER BY clause can be satisfied using indexes.
Consider, for example, the following query:
</p>

<tcl>
<p>
The same basic algorithm is followed, except this time the matching rows
of the index are scanned from bottom to top instead of from top to bottom,
so that the states will appear in descending order.
</p>

<tcl>hd_fragment {partialsort} {partial sorting by index} {block sorting}</tcl>
<h2> Partial Sorting Using An Index (a.k.a. Block Sorting)</h2>

<p>
Sometimes only part of an ORDER BY clause can be satisfied using indexes.
Consider, for example, the following query:
</p>

<tcl>
    completes, and well before the table scan is complete.
<li>If a LIMIT clause is present, it might be possible to avoid scanning
    the entire table.
</ol>
Because of these advantages, SQLite always tries to do a partial sort using an
index even if a complete sort by index is not possible.</p>

<h2>4.0 WITHOUT ROWID tables</h2>

<p>
The basic principles described above apply to both ordinary rowid tables
and [WITHOUT ROWID] tables.
The only difference is that the rowid column that serves as the key for
tables and that appears as the right-most term in indexes is replaced by
the PRIMARY KEY.
</p>
    completes, and well before the table scan is complete.
<li>If a LIMIT clause is present, it might be possible to avoid scanning
    the entire table.
</ol>
Because of these advantages, SQLite always tries to do a partial sort using an
index even if a complete sort by index is not possible.</p>
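
<p>A minimal sketch of a query that block sorting can serve, assuming
(hypothetically) an index on only the first ORDER BY column:</p>

<blockquote><pre>
CREATE INDEX idx_t1_a ON t1(a);

-- The index delivers rows already ordered by "a"; only the rows
-- within each group of equal "a" values need to be sorted on "b".
SELECT * FROM t1 ORDER BY a, b;
</pre></blockquote>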

<h1> WITHOUT ROWID tables</h1>

<p>
The basic principles described above apply to both ordinary rowid tables
and [WITHOUT ROWID] tables.
The only difference is that the rowid column that serves as the key for
tables and that appears as the right-most term in indexes is replaced by
the PRIMARY KEY.
</p>
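
<p>For reference, a WITHOUT ROWID table is declared by appending the
keywords to the end of the CREATE TABLE statement (a generic sketch):</p>

<blockquote><pre>
CREATE TABLE wordcount(
  word TEXT PRIMARY KEY,
  cnt INTEGER
) WITHOUT ROWID;
</pre></blockquote>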
Changes to pages/rescode.in.
<title>SQLite Result Codes</title>
<tcl>hd_keywords {result code} {result codes} {error code} {error codes}</tcl>
<h1 align="center">SQLite Result Codes</h1>




<p>
Many of the routines in the SQLite [C-language Interface] return
numeric result codes indicating either success or failure, and 
in the event of a failure, providing some idea of the cause of
the failure.  This document strives to explain what each
of those numeric result codes means.

<h2>Result Codes versus Error Codes</h2>

<p>
"Error codes" are a subset of "result codes" that indicate that
something has gone wrong.  There are only a few non-error result
codes:  [SQLITE_OK], [SQLITE_ROW], and [SQLITE_DONE].  The term
"error code" means any result code other than these three.

<tcl>hd_fragment pve {primary versus extended result codes} \
                     {*ext-v-prim} </tcl>
<h2>Primary Result Codes versus Extended Result Codes</h2>

<p>
Result codes are signed 32-bit integers.
The least significant 8 bits of the result code define a broad category
and are called the "primary result code".  More significant bits provide
more detailed information about the error and are called the
"extended result code"
<title>SQLite Result Codes</title>
<tcl>hd_keywords {result code} {result codes} {error code} {error codes}</tcl>

<table_of_contents>

<h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> 

<p>
Many of the routines in the SQLite [C-language Interface] return
numeric result codes indicating either success or failure, and 
in the event of a failure, providing some idea of the cause of
the failure.  This document strives to explain what each
of those numeric result codes means.

<h1>Result Codes versus Error Codes</h1>

<p>
"Error codes" are a subset of "result codes" that indicate that
something has gone wrong.  There are only a few non-error result
codes:  [SQLITE_OK], [SQLITE_ROW], and [SQLITE_DONE].  The term
"error code" means any result code other than these three.

<tcl>hd_fragment pve {primary versus extended result codes} \
                     {*ext-v-prim} </tcl>
<h1>Primary Result Codes versus Extended Result Codes</h1>

<p>
Result codes are signed 32-bit integers.
The least significant 8 bits of the result code define a broad category
and are called the "primary result code".  More significant bits provide
more detailed information about the error and are called the
"extended result code"
primary result codes by default.  
The extended result code for the most recent error can be
retrieved using the [sqlite3_extended_errcode()] interface.
The [sqlite3_extended_result_codes()] interface can be used to put
a [database connection] into a mode where it returns the
extended result codes instead of the primary result codes.

<h2>Definitions</h2>

<p>
All result codes are integers.
Symbolic names for all result codes are created using
"#define" macros in the sqlite3.h header file.
There are separate sections in the sqlite3.h header file for
the [result code definitions] and the [extended result code definitions].
primary result codes by default.  
The extended result code for the most recent error can be
retrieved using the [sqlite3_extended_errcode()] interface.
The [sqlite3_extended_result_codes()] interface can be used to put
a [database connection] into a mode where it returns the
extended result codes instead of the primary result codes.
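
<p>As a worked example, the primary result code [SQLITE_IOERR] is 10 and
the extended result code [SQLITE_IOERR_READ] is 266, which is
10 + (1&lt;&lt;8).  Masking off all but the least significant 8 bits
recovers the primary code, which can be checked directly in SQL:</p>

<blockquote><pre>
SELECT 266 &amp; 0xff;   -- yields 10, the primary code SQLITE_IOERR
</pre></blockquote>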

<h1>Definitions</h1>

<p>
All result codes are integers.
Symbolic names for all result codes are created using
"#define" macros in the sqlite3.h header file.
There are separate sections in the sqlite3.h header file for
the [result code definitions] and the [extended result code definitions].
    set prim_rc($name) $val
  } else {
    incr nExtCode
    set ext_rc($name) $val
  }
}


hd_puts "<h2>Primary Result Code List</h2>\n"
hd_puts "<p>The $nPrimCode result codes"
</tcl>
  are [result code definitions|defined in sqlite3.h] and are
  listed in alphabetical order below:
  <table border=0 width="100%" cellpadding=10>
  <tr><td valign="top" align="left"><ul>
<tcl>
set nrow [expr {($nPrimCode+2)/3}]
set i 0
foreach name [lsort [array names prim_rc]] {
  if {$i==$nrow} {
    hd_puts "</ul></td><td valign=\"top\" align=\"left\"><ul>\n"
    set i 0
  }
  incr i
  hd_resolve "<li> \[$name\] ($prim_rc($name))\n"
}
hd_puts "</td></td></table>\n\n"

hd_fragment extrc {extended result code} {extended result codes} \
                  {extended error code} {extended error codes}

hd_puts "<h2>Extended Result Code List</h2>\n"
hd_puts "<p>The $nExtCode extended result codes"
</tcl>
  are [extended result code definitions|defined in sqlite3.h] and are
  listed in alphabetical order below:
  <table border=0 width="100%" cellpadding=10>
  <tr><td valign="top" align="left"><ul>
<tcl>
set nrow [expr {($nExtCode+1)/2}]
set i 0
foreach name [lsort [array names ext_rc]] {
  if {$i==$nrow} {
    hd_puts "</ul></td><td valign=\"top\" align=\"left\"><ul>\n"
    set i 0
  }
  incr i
  hd_resolve "<li> \[$name\] ($ext_rc($name))\n"
}
hd_puts "</td></td></table>\n\n"

hd_puts "
<h2>Result Code Meanings</h2>
<p>
The meanings for all $nResCode result code values are shown below,
in numeric order.
"

# Generate the table of result codes
#
foreach val [lsort -int [array names valtoname]] {
  set name $valtoname($val)
  regsub {sqlite_} [string tolower $name] {} tag
  hd_puts "<!--------------------------------------------------------------->\n"
  hd_fragment $tag --override $name
  hd_puts "<h3>($val) $valtoname($val)</h3>\n"
  hd_resolve $resdesc($name)\n\n
}

</tcl>
    set prim_rc($name) $val
  } else {
    incr nExtCode
    set ext_rc($name) $val
  }
}

</tcl>
<h1>Primary Result Code List</h1>
<p>The $nPrimCode result codes are 

   [result code definitions|defined in sqlite3.h] and are listed in
   alphabetical order below: 
<table border=0 width="100%" cellpadding=10>
  <tr><td valign="top" align="left"><ul>
<tcl>
set nrow [expr {($nPrimCode+2)/3}]
set i 0
foreach name [lsort [array names prim_rc]] {
  if {$i==$nrow} {
    hd_puts "</ul></td><td valign=\"top\" align=\"left\"><ul>\n"
    set i 0
  }
  incr i
  hd_resolve "<li> \[$name\] ($prim_rc($name))\n"
}
hd_puts "</td></td></table>\n\n"

hd_fragment extrc {extended result code} {extended result codes} \
                  {extended error code} {extended error codes}
</tcl>
<h1>Extended Result Code List</h1>
<p>The $nExtCode extended result codes

  are [extended result code definitions|defined in sqlite3.h] and are
  listed in alphabetical order below:
  <table border=0 width="100%" cellpadding=10>
  <tr><td valign="top" align="left"><ul>
<tcl>
set nrow [expr {($nExtCode+1)/2}]
set i 0
foreach name [lsort [array names ext_rc]] {
  if {$i==$nrow} {
    hd_puts "</ul></td><td valign=\"top\" align=\"left\"><ul>\n"
    set i 0
  }
  incr i
  hd_resolve "<li> \[$name\] ($ext_rc($name))\n"
}
hd_puts "</td></td></table>\n\n"

</tcl>
<h1>Result Code Meanings</h1>
<p>
The meanings for all $nResCode result code values are shown below,
in numeric order.

<tcl>
# Generate the table of result codes
#
foreach val [lsort -int [array names valtoname]] {
  set name $valtoname($val)
  regsub {sqlite_} [string tolower $name] {} tag
  hd_puts "<!--------------------------------------------------------------->\n"
  hd_fragment $tag --override $name
  hd_puts "<h3>($val) $valtoname($val)</h3>\n"
  hd_resolve $resdesc($name)\n\n
}

</tcl>
Changes to pages/sharedcache.in.
<title>SQLite Shared-Cache Mode</title>
<tcl>hd_keywords {SQLite Shared-Cache Mode} \
        {shared cache} {shared cache mode}</tcl>

<tcl>
proc HEADING {level title} {
  global pnum
  incr pnum($level)
  foreach i [array names pnum] {
    if {$i>$level} {set pnum($i) 0}
  }
  set h [expr {$level+1}]
  if {$h>6} {set h 6}
  set n $pnum(1).$pnum(2)
  for {set i 3} {$i<=$level} {incr i} {
    append n .$pnum($i)
  }
  hd_puts "<h$h>$n $title</h$h>"
}
set pnum(1) 0
set pnum(2) 0
set pnum(3) 0
set pnum(4) 0
set pnum(5) 0
set pnum(6) 0
set pnum(7) 0
set pnum(8) 0
</tcl>

<tcl>HEADING 1 {SQLite Shared-Cache Mode}</tcl>

<p>Starting with version 3.3.0, SQLite includes a special "shared-cache"
mode (disabled by default) intended for use in embedded servers. If
shared-cache mode is enabled and a thread establishes multiple connections
to the same database, the connections share a single data and schema cache.
This can significantly reduce the quantity of memory and IO required by
the system.</p>
<title>SQLite Shared-Cache Mode</title>
<tcl>hd_keywords {SQLite Shared-Cache Mode} \
        {shared cache} {shared cache mode}</tcl>

<table_of_contents>
<h1>SQLite Shared-Cache Mode</h1>

<p>Starting with version 3.3.0, SQLite includes a special "shared-cache"
mode (disabled by default) intended for use in embedded servers. If
shared-cache mode is enabled and a thread establishes multiple connections
to the same database, the connections share a single data and schema cache.
This can significantly reduce the quantity of memory and IO required by
the system.</p>

<p>Shared-cache mode changes the semantics
of the locking model in some cases. The details are described by
this document. A basic understanding of the normal SQLite locking model (see
<a href="lockingv3.html">File Locking And Concurrency In SQLite Version 3</a>
for details) is assumed.</p>

<tcl>HEADING 1 {Shared-Cache Locking Model}</tcl>

<p>Externally, from the point of view of another process or thread, two
or more [sqlite3|database connections] using a shared-cache appear as a single 
connection. The locking protocol used to arbitrate between multiple 
shared-caches or regular database users is described elsewhere.
</p>


<p>Shared-cache mode changes the semantics
of the locking model in some cases. The details are described by
this document. A basic understanding of the normal SQLite locking model (see
<a href="lockingv3.html">File Locking And Concurrency In SQLite Version 3</a>
for details) is assumed.</p>

<h1>Shared-Cache Locking Model</h1>

<p>Externally, from the point of view of another process or thread, two
or more [sqlite3|database connections] using a shared-cache appear as a single 
connection. The locking protocol used to arbitrate between multiple 
shared-caches or regular database users is described elsewhere.
</p>

connections 2 and 3 is described in the remainder of this section.
</p>

<p>There are three levels to the shared-cache locking model: 
transaction level locking, table level locking, and schema level locking. 
They are described in the following three sub-sections.</p>

<tcl>HEADING 2 {Transaction Level Locking}</tcl>

<p>SQLite connections can open two kinds of transactions, read and write
transactions. This is not done explicitly; a transaction is implicitly a
read-transaction until it first writes to a database table, at which point
it becomes a write-transaction.
</p>
<p>At most one connection to a single shared cache may open a 
write transaction at any one time. This may co-exist with any number of read 
transactions. 
</p>

<tcl>HEADING 2 {Table Level Locking}</tcl>

<p>When two or more connections use a shared-cache, locks are used to 
serialize concurrent access attempts on a per-table basis. Tables support 
two types of locks, "read-locks" and "write-locks". Locks are granted to
connections - at any one time, each database connection has either a
read-lock, write-lock or no lock on each database table.
</p>

<p>At any one time, a single table may have any number of active read-locks
or a single active write lock. To read data from a table, a connection must 
first obtain a read-lock. To write to a table, a connection must obtain a 
write-lock on that table. If a required table lock cannot be obtained,
the query fails and SQLITE_LOCKED is returned to the caller.
</p> 

<p>Once a connection obtains a table lock, it is not released until the
current transaction (read or write) is concluded.
</p>

<tcl>HEADING 3 {Read-Uncommitted Isolation Mode}</tcl>

<p>The behaviour described above may be modified slightly by using the 
[read_uncommitted] pragma to change the isolation level from serialized 
(the default), to read-uncommitted.</p>

<p> A database connection in read-uncommitted mode does not attempt 
to obtain read-locks before reading from database tables as described 
connections 2 and 3 is described in the remainder of this section.
</p>

<p>There are three levels to the shared-cache locking model: 
transaction level locking, table level locking, and schema level locking. 
They are described in the following three sub-sections.</p>

<h2>Transaction Level Locking</h2>

<p>SQLite connections can open two kinds of transactions, read and write
transactions. This is not done explicitly; a transaction is implicitly a
read-transaction until it first writes to a database table, at which point
it becomes a write-transaction.
</p>
<p>At most one connection to a single shared cache may open a 
write transaction at any one time. This may co-exist with any number of read 
transactions. 
</p>

<h2>Table Level Locking</h2>

<p>When two or more connections use a shared-cache, locks are used to 
serialize concurrent access attempts on a per-table basis. Tables support 
two types of locks, "read-locks" and "write-locks". Locks are granted to
connections - at any one time, each database connection has either a
read-lock, write-lock or no lock on each database table.
</p>

<p>At any one time, a single table may have any number of active read-locks
or a single active write lock. To read data from a table, a connection must 
first obtain a read-lock. To write to a table, a connection must obtain a 
write-lock on that table. If a required table lock cannot be obtained,
the query fails and SQLITE_LOCKED is returned to the caller.
</p> 

<p>Once a connection obtains a table lock, it is not released until the
current transaction (read or write) is concluded.
</p>

<h3>Read-Uncommitted Isolation Mode</h3>

<p>The behaviour described above may be modified slightly by using the 
[read_uncommitted] pragma to change the isolation level from serialized 
(the default), to read-uncommitted.</p>

<p> A database connection in read-uncommitted mode does not attempt 
to obtain read-locks before reading from database tables as described 
  */
  PRAGMA read_uncommitted = &lt;boolean&gt;;

  /* Retrieve the current value of the read-uncommitted flag */
  PRAGMA read_uncommitted;
</pre></blockquote>

<tcl>HEADING 2 {Schema (sqlite_master) Level Locking}</tcl>

<p>The <i>sqlite_master</i> table supports shared-cache read and write 
locks in the same way as all other database tables (see description 
above). The following special rules also apply:
</p>

<ul>
<li>A connection must obtain a read-lock on <i>sqlite_master</i> before 
accessing any database tables or obtaining any other read or write locks.</li>
<li>Before executing a statement that modifies the database schema (i.e. 
a CREATE or DROP TABLE statement), a connection must obtain a write-lock on 
<i>sqlite_master</i>.
</li>
<li>A connection may not compile an SQL statement if any other connection
is holding a write-lock on the <i>sqlite_master</i> table of any attached
database (including the default database, "main"). 
</li>
</ul>

<tcl>HEADING 1 {Thread Related Issues}</tcl>

<p>In SQLite versions 3.3.0 through 3.4.2 when shared-cache mode is enabled, 
a database connection may only be
used by the thread that called [sqlite3_open()] to create it.
And a connection could only share cache with another connection in the
same thread.
These restrictions were dropped beginning with SQLite version 3.5.0.
</p>

<tcl>HEADING 1 {Shared Cache And Virtual Tables}</tcl>

<p>
In older versions of SQLite,
shared cache mode could not be used together with virtual tables.
This restriction was removed in SQLite [version 3.6.17].

<tcl>HEADING 1 {Enabling Shared-Cache Mode}</tcl>

<p>Shared-cache mode is enabled on a per-process basis. Using the C 
interface, the following API can be used to globally enable or disable
shared-cache mode:
</p>

<blockquote><pre>
  */
  PRAGMA read_uncommitted = &lt;boolean&gt;;

  /* Retrieve the current value of the read-uncommitted flag */
  PRAGMA read_uncommitted;
</pre></blockquote>

<h2>Schema (sqlite_master) Level Locking</h2>

<p>The <i>sqlite_master</i> table supports shared-cache read and write 
locks in the same way as all other database tables (see description 
above). The following special rules also apply:
</p>

<ul>
<li>A connection must obtain a read-lock on <i>sqlite_master</i> before 
accessing any database tables or obtaining any other read or write locks.</li>
<li>Before executing a statement that modifies the database schema (i.e. 
a CREATE or DROP TABLE statement), a connection must obtain a write-lock on 
<i>sqlite_master</i>.
</li>
<li>A connection may not compile an SQL statement if any other connection
is holding a write-lock on the <i>sqlite_master</i> table of any attached
database (including the default database, "main"). 
</li>
</ul>

<h1>Thread Related Issues</h1>

<p>In SQLite versions 3.3.0 through 3.4.2 when shared-cache mode is enabled, 
a database connection may only be
used by the thread that called [sqlite3_open()] to create it.
And a connection could only share cache with another connection in the
same thread.
These restrictions were dropped beginning with SQLite version 3.5.0.
</p>

<h1>Shared Cache And Virtual Tables</h1>

<p>
In older versions of SQLite,
shared cache mode could not be used together with virtual tables.
This restriction was removed in SQLite [version 3.6.17].

<h1>Enabling Shared-Cache Mode</h1>

<p>Shared-cache mode is enabled on a per-process basis. Using the C 
interface, the following API can be used to globally enable or disable
shared-cache mode:
</p>

<blockquote><pre>
cache sharing behavior of a database connection allows cache sharing to
be controlled in [ATTACH] statements.  For example:</p>

<blockquote><pre>
ATTACH 'file:aux.db?cache=shared' AS aux;
</pre></blockquote>

<tcl>
hd_fragment inmemsharedcache {in-memory shared-cache}
HEADING 1 {Shared Cache And In-Memory Databases}</tcl>

<p>
Beginning with SQLite [version 3.7.13], shared cache can be used on
[in-memory databases], provided that the database is created using
a [URI filename].  For backwards compatibility, shared cache is always
disabled for in-memory
databases if the unadorned name ":memory:" is used to open the database.

cache sharing behavior of a database connection allows cache sharing to
be controlled in [ATTACH] statements.  For example:</p>

<blockquote><pre>
ATTACH 'file:aux.db?cache=shared' AS aux;
</pre></blockquote>


<tcl> hd_fragment inmemsharedcache {in-memory shared-cache} </tcl>
<h1>Shared Cache And In-Memory Databases</h1>

<p>
Beginning with SQLite [version 3.7.13], shared cache can be used on
[in-memory databases], provided that the database is created using
a [URI filename].  For backwards compatibility, shared cache is always
disabled for in-memory
databases if the unadorned name ":memory:" is used to open the database.
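
<p>For example (assuming URI filenames are enabled), an in-memory database
that participates in shared cache can be attached with:</p>

<blockquote><pre>
ATTACH 'file::memory:?cache=shared' AS aux;
</pre></blockquote>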
Changes to pages/spellfix1.in.
<title>The spellfix1 virtual table</title>
<tcl>
hd_keywords {spellfix1}</tcl>

<h1 align='center'>The Spellfix1 Virtual Table</h1>


<p>This spellfix1 [virtual table] can be used to search
a large vocabulary for close matches.  For example, spellfix1
can be used to suggest corrections to misspelled words.  Or,
it could be used with [FTS4] to do full-text search using potentially
misspelled words.
<title>The Spellfix1 Virtual Table</title>
<tcl>
hd_keywords {spellfix1}</tcl>
<table_of_contents>

<h1>Overview</h1>

<p>This spellfix1 [virtual table] can be used to search
a large vocabulary for close matches.  For example, spellfix1
can be used to suggest corrections to misspelled words.  Or,
it could be used with [FTS4] to do full-text search using potentially
misspelled words.
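
<p>A quick sketch of typical usage (the virtual table name "demo" is also
used later in this document; the search term is arbitrary):</p>

<blockquote><pre>
CREATE VIRTUAL TABLE demo USING spellfix1;
INSERT INTO demo(word) VALUES('kennesaw');
SELECT word FROM demo WHERE word MATCH 'kennasaw';
</pre></blockquote>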

kennestone
kenneson
kenneys
keanes
keenes
</pre></blockquote>

<h2>Search Refinements</h2>

<p>By default, the spellfix1 table returns no more than 20 results.
(It might return fewer than 20 if there are not that many good matches.)
You can change the upper bound on the number of returned rows by
adding a "top=N" term to the WHERE clause of your query, where N
is the new maximum.  For example, to see the 5 best matches:
kennestone
kenneson
kenneys
keanes
keenes
</pre></blockquote>

<h1>Search Refinements</h1>

<p>By default, the spellfix1 table returns no more than 20 results.
(It might return fewer than 20 if there are not that many good matches.)
You can change the upper bound on the number of returned rows by
adding a "top=N" term to the WHERE clause of your query, where N
is the new maximum.  For example, to see the 5 best matches:


<p>Note that if you do not include the "langid=N" term in the WHERE clause,
the search will be against language 0 (English in the example above).
All spellfix1 searches are against a single language id.  There is no
way to search all languages at once.
 

<h2>Virtual Table Details</h2>

<p>Each row in the spellfix1 virtual table has a unique rowid 
with seven columns plus five extra hidden columns.
The columns are as follows:

<blockquote><dl>
<dt><p><b>rowid</b><dd>
A unique integer number associated with each
vocabulary item in the table.  This can be used
as a foreign key on other tables in the database.

<dt><p><b>word</b><dd>
The text of the word that matches the pattern.

<p>Note that if you do not include the "langid=N" term in the WHERE clause,
the search will be against language 0 (English in the example above).
All spellfix1 searches are against a single language id.  There is no
way to search all languages at once.
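
<p>A sketch of a language-restricted search (the langid value here is
arbitrary):</p>

<blockquote><pre>
SELECT word FROM demo WHERE word MATCH 'hello' AND langid=2;
</pre></blockquote>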
 

<h1>Virtual Table Details</h1>

<p>Each row in the spellfix1 virtual table has a unique rowid 
with seven columns plus five extra hidden columns.
The columns are as follows:

<dl>
<dt><p><b>rowid</b><dd>
A unique integer number associated with each
vocabulary item in the table.  This can be used
as a foreign key on other tables in the database.

<dt><p><b>word</b><dd>
The text of the word that matches the pattern.
<dt><p><b>command</b><dd>
(HIDDEN)  The value of the "command" column is always NULL.  However,
applications can insert special strings into the "command" column in order
to provoke certain behaviors in the spellfix1 virtual table.
For example, inserting the string 'reset' into the "command" column
will cause the virtual table to reread its edit distance weights
(if there are any).
</dl></blockquote>

<h2>Algorithm</h2>

<p>The spellfix1 virtual table creates a single
shadow table named "%_vocab" (where the % is replaced by the name of
the virtual table; Ex: "demo_vocab" for the "demo" virtual table).  
The shadow table contains the following columns:

<blockquote><dl>
<dt><p><b>id</b><dd>
The unique id (INTEGER PRIMARY KEY)

<dt><p><b>rank</b><dd>
The rank of word.

<dt><p><b>langid</b><dd>

<dt><p><b>command</b><dd>
(HIDDEN)  The value of the "command" column is always NULL.  However,
applications can insert special strings into the "command" column in order
to provoke certain behaviors in the spellfix1 virtual table.
For example, inserting the string 'reset' into the "command" column
will cause the virtual table to reread its edit distance weights
(if there are any).
</dl>
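
<p>For example (mirroring the "reset" insert shown later in this document
for the demo2 table):</p>

<blockquote><pre>
INSERT INTO demo(command) VALUES('reset');
</pre></blockquote>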

<h1>Algorithm</h1>

<p>The spellfix1 virtual table creates a single
shadow table named "%_vocab" (where the % is replaced by the name of
the virtual table; Ex: "demo_vocab" for the "demo" virtual table).  
The shadow table contains the following columns:

<dl>
<dt><p><b>id</b><dd>
The unique id (INTEGER PRIMARY KEY)

<dt><p><b>rank</b><dd>
The rank of word.

<dt><p><b>langid</b><dd>
single symbol "A".  And the letters "p", "b", "f", and
"v" all become "B".  All nasal sounds are represented
as "N".  And so forth.  The mapping is base on
ideas found in Soundex, Metaphone, and other
long-standing phonetic matching systems.  This key can
be generated by the function spellfix1_phonehash(X).  
Hence: k2 = spellfix1_phonehash(k1)
</dl></blockquote>

<p>There is also a function for computing the Wagner edit distance or the
Levenshtein distance between a pattern and a word.  This function
is exposed as spellfix1_editdist(X,Y).  The edit distance function
returns the "cost" of converting X into Y.  Some transformations
cost more than others.  Changing one vowel into a different vowel,
for example is relatively cheap, as is doubling a constant, or
single symbol "A".  And the letters "p", "b", "f", and
"v" all become "B".  All nasal sounds are represented
as "N".  And so forth.  The mapping is base on
ideas found in Soundex, Metaphone, and other
long-standing phonetic matching systems.  This key can
be generated by the function spellfix1_phonehash(X).  
Hence: k2 = spellfix1_phonehash(k1)
</dl>

<p>There is also a function for computing the Wagner edit distance or the
Levenshtein distance between a pattern and a word.  This function
is exposed as spellfix1_editdist(X,Y).  The edit distance function
returns the "cost" of converting X into Y.  Some transformations
cost more than others.  Changing one vowel into a different vowel,
for example is relatively cheap, as is doubling a constant, or

<p>Only terms of the vocabulary with a matching langid are searched.
Hence, the same table can contain entries from multiple languages
and only the requested language will be used.  The default langid
is 0.

<tcl>hd_fragment configeditdist {configurable edit distances}</tcl>
<h2>Configurable Edit Distance</h2>

<p>The built-in Wagner edit-distance function with fixed weights can be
replaced by the [editdist3()] edit-distance function
with application-defined weights and support for unicode, by specifying
the "edit_cost_table=<i>TABLENAME</i>" parameter to the spellfix1 module
when the virtual table is created.
For example:

<p>Only terms of the vocabulary with a matching langid are searched.
Hence, the same table can contain entries from multiple languages
and only the requested language will be used.  The default langid
is 0.

<tcl>hd_fragment configeditdist {configurable edit distances}</tcl>
<h1>Configurable Edit Distance</h1>

<p>The built-in Wagner edit-distance function with fixed weights can be
replaced by the [editdist3()] edit-distance function
with application-defined weights and support for unicode, by specifying
the "edit_cost_table=<i>TABLENAME</i>" parameter to the spellfix1 module
when the virtual table is created.
For example:
APPCOST table.  Hence, applications should run a SQL statement similar
to the following when changes to the APPCOST table occur:

<blockquote>
INSERT INTO demo2(command) VALUES("reset");
</blockquote>

<h2>Dealing With Unusual And Difficult Spellings</h2>

<p>The algorithm above works quite well for most cases, but there are
exceptions.  These exceptions can be dealt with by making additional
entries in the virtual table using the "soundslike" column.

<p>For example, many words of Greek origin begin with letters "ps" where
the "p" is silent.  Ex:  psalm, pseudonym, psoriasis, psyche.  In
APPCOST table.  Hence, applications should run a SQL statement similar
to the following when changes to the APPCOST table occur:

<blockquote>
INSERT INTO demo2(command) VALUES("reset");
</blockquote>

<h1>Dealing With Unusual And Difficult Spellings</h1>

<p>The algorithm above works quite well for most cases, but there are
exceptions.  These exceptions can be dealt with by making additional
entries in the virtual table using the "soundslike" column.

<p>For example, many words of Greek origin begin with letters "ps" where
the "p" is silent.  Ex:  psalm, pseudonym, psoriasis, psyche.  In
<li>"Tch" sounds in Slavic words:  Tchaikovsky vs. Chaykovsky
<li>The letter "j" pronounced like "h" in Spanish:  LaJolla
<li>Words beginning with "wr" versus "r":  write vs. rite
<li>Miscellaneous problem words such as "debt", "tsetse",
      "Nguyen", "Van Nuyes".
</ul>

<h1>Auxiliary Functions</h1>

<p>The source code module that implements the spellfix1 virtual table also
implements several SQL functions that might be useful to applications
that employ spellfix1 or for testing or diagnostic work while developing
applications that use spellfix1.  The following auxiliary functions are
available:

<dl>
<dt><p><b>editdist3(P,W)<br>editdist3(P,W,L)<br>editdist3(T)</b><dd>
These routines provide direct access to the version of the Wagner
edit-distance function that allows for application-defined weights
on edit operations.  The first two forms of this function compare
pattern P against word W and return the edit distance.  In the first
function, the langid is assumed to be 0 and in the second, the
langid is given by the L parameter.  The third form of this function

<dt><p><b>spellfix1_translit(X)</b><dd>
This routine transliterates unicode text into pure ascii, returning
the pure ascii representation of the input text X.  This is the function
that is used internally to transform vocabulary words into the K1
column of the shadow table.

</dl>
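
<p>For example, the following sketch (it assumes the spellfix1 extension
is loaded) strips the accents from its argument:

<blockquote><pre>
SELECT spellfix1_translit('pâté');  -- returns 'pate'
</pre></blockquote>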

<tcl>hd_fragment editdist3 editdist3</tcl>
<h1>The editdist3 function</h1>

<p>The editdist3 algorithm is a function that computes the minimum edit 
distance (a.k.a. the Levenshtein distance) between two input strings.
The editdist3 algorithm is a configurable alternative to the default
edit distance function of spellfix1.
Features of editdist3 include:

<ul>
<li><p>It works with unicode (UTF8) text.

<li><p>A table of insertion, deletion, and substitution costs can be 
       provided by the application.

<li><p>Multi-character insertions, deletions, and substitutions can be
       enumerated in the cost table.
</ul>

<h1>The editdist3 COST table</h1>

<p>To program the costs of editdist3, create a table such as the following:

<blockquote><pre>
CREATE TABLE editcost(
  iLang INT,   -- The language ID
  cFrom TEXT,  -- Convert text from this
<p>In the spellfix1 algorithm, cFrom is the text as the user entered it and
cTo is the correctly spelled text as it exists in the database.  The goal
of the editdist3 algorithm is to determine how close the user-entered text is
to the dictionary text.

<p>There are three special-case entries in the cost table:

<table border=1 align=center>
<tr><th>cFrom</th><th>cTo</th><th>Meaning</th></tr>
<tr><td>''</td><td>'?'</td><td>The default insertion cost</td></tr>
<tr><td>'?'</td><td>''</td><td>The default deletion cost</td></tr>
<tr><td>'?'</td><td>'?'</td><td>The default substitution cost</td></tr>
</table>
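
<p>For illustration, the three special-case entries could be populated
as follows.  The cost values shown here are arbitrary examples, not the
built-in defaults:

<blockquote><pre>
INSERT INTO editcost(iLang, cFrom, cTo, iCost) VALUES
  (0, '',  '?', 100),   -- insert any single character
  (0, '?', '',  100),   -- delete any single character
  (0, '?', '?', 150);   -- substitute any single character for another
</pre></blockquote>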

<p>If any of the special-case entries shown above are omitted, then the
INSERT INTO editcost(iLang, cFrom, cTo, iCost)
VALUES(0, 'ss', 'ß', 8);
</pre></blockquote>

<p>The number of characters in cFrom and cTo need not be the same.  The
rule above says that "ss" on user input will match "ß" with a penalty of 8.

<h1>Experimenting with the editdist3() function</h1>

<p>The spellfix1 virtual table
uses editdist3 if the "edit_cost_table=TABLE" option
is specified as an argument when the spellfix1 virtual table is created.  
But editdist3 can also be tested directly using the built-in "editdist3()"
SQL function.  The editdist3() SQL function has 3 forms:
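
<p>A sketch of the three forms, matching the signatures listed under
Auxiliary Functions above (the table and string values are illustrative):

<blockquote><pre>
SELECT editdist3('kitten', 'sitting');     -- langid 0 is assumed
SELECT editdist3('kitten', 'sitting', 1);  -- langid given explicitly
SELECT editdist3('editcost');              -- load the named cost table
</pre></blockquote>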

Changes to pages/tempfiles.in.
<title>Temporary Files Used By SQLite</title>
<tcl>hd_keywords {temporary disk files}</tcl>

<table_of_contents>

<h1>Introduction</h1>

<p>
One of the <a href="different.html">distinctive features</a> of
SQLite is that a database consists of a single disk file.
This simplifies the use of SQLite since moving or backing up a
database is as simple as copying a single file.  It also makes
SQLite appropriate for use as an
will use temporary files in the same way.  New kinds of temporary
files might be employed, and some of
the current temporary file uses might be discontinued
in future releases of SQLite.
</p>

<tcl>hd_fragment types</tcl>
<h1>Nine Kinds Of Temporary Files</h1>

<p>
SQLite currently uses nine distinct types of temporary files:
</p>

<ol>
<li>Rollback journals</li>

<p>
Additional information about each of these temporary file types
is in the sequel.
</p>

<tcl>hd_fragment rollbackjrnl</tcl>
<h2>Rollback Journals</h2>

<p>
A rollback journal is a temporary file used to implement
atomic commit and rollback capabilities in SQLite.
(For a detailed discussion of how this works, see
the separate document titled
<a href="atomiccommit.html">Atomic Commit In SQLite</a>.)
memory rather than on disk.  The ROLLBACK command still works when
the journal mode is MEMORY, but because no file exists on disk for
recovery, a crash or power loss in the middle of a transaction that uses
the MEMORY journal mode will likely result in a corrupt database.
</p>
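
<p>This mode is selected at run-time; for example:

<blockquote><pre>
PRAGMA journal_mode = MEMORY;  -- keep the rollback journal in memory only
</pre></blockquote>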

<tcl>hd_fragment walfile</tcl>
<h2>Write-Ahead Log (WAL) Files</h2>

<p>
A write-ahead log or WAL file is used in place of a rollback journal
when SQLite is operating in [WAL mode].  As with the rollback journal,
the purpose of the WAL file is to implement atomic commit and rollback.
The WAL file is always located in the same directory
as the database file and has the same name as the database
file except with the 4 characters "<b>-wal</b>" appended.
The WAL file is created when the first connection to the
database is opened and is normally removed when the last
connection to the database closes.  However, if the last connection
does not shut down cleanly, the WAL file will remain in the filesystem
and will be automatically cleaned up the next time the database is
opened.
</p>
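
<p>As a sketch, for a database file named "app.db" (an illustrative name):

<blockquote><pre>
PRAGMA journal_mode = WAL;  -- subsequent commits append to "app.db-wal"
</pre></blockquote>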

<tcl>hd_fragment shmfile</tcl>
<h2>Shared-Memory Files</h2>

<p>
When operating in [WAL mode], all SQLite database connections associated
with the same database file need to share some memory that is used as an
index for the WAL file.  In most implementations, this shared memory is
implemented by calling mmap() on a file created for this sole purpose:
the shared-memory file.  The shared-memory file, if it exists, is located
The shared-memory file is created when the WAL file is created and is
deleted when the WAL file is deleted.  During WAL file recovery, the
shared memory file is recreated from scratch based on the contents of
the WAL file being recovered.
</p>

<tcl>hd_fragment masterjrnl {master journal}</tcl>
<h2>Master Journal Files</h2>

<p>
The master journal file is used as part of the atomic commit
process when a single transaction makes changes to multiple
databases that have been added to a single [database connection]
using the [ATTACH] statement.  The master journal file is always
located in the same directory as the main database file
some files might roll back while others roll forward after
power is restored.
</p>
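
<p>A sketch of the kind of transaction that requires this coordination
(the file, schema, and table names are illustrative):

<blockquote><pre>
ATTACH 'second.db' AS aux1;
BEGIN;
UPDATE main.accounts SET balance = balance - 10 WHERE id = 1;
UPDATE aux1.accounts SET balance = balance + 10 WHERE id = 2;
COMMIT;  -- changes to both database files commit atomically
</pre></blockquote>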


<tcl>hd_fragment stmtjrnl {statement journal} {statement journals} \
   {Statement journals}</tcl>
<h2>Statement Journal Files</h2>

<p>
A statement journal file is used to roll back partial results of
a single statement within a larger transaction.  For example, suppose
an UPDATE statement will attempt to modify 100 rows in the database.
But after modifying the first 50 rows, the UPDATE hits
a constraint violation which should block the entire statement.
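
<p>A sketch of that scenario (the table and constraint are illustrative):

<blockquote><pre>
BEGIN;
-- Suppose a CHECK(qty>=0) constraint fails part-way through:
UPDATE inventory SET qty = qty - 1;
-- The statement journal backs out the partial UPDATE, but the
-- enclosing transaction remains active and can still be committed.
COMMIT;
</pre></blockquote>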
deleted at the conclusion of the transaction.  The size of the
statement journal is proportional to the size of the change implemented
by the UPDATE or INSERT statement that caused the statement journal
to be created.
</p>

<tcl>hd_fragment tempdb</tcl>
<h2>TEMP Databases</h2>

<p>Tables created using the "CREATE TEMP TABLE" syntax are only
visible to the [database connection] in which the "CREATE TEMP TABLE"
statement is originally evaluated.  These TEMP tables, together
with any associated indices, triggers, and views, are collectively
stored in a separate temporary database file that is created as
soon as the first "CREATE TEMP TABLE" statement is seen.
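
<p>For example (the table is illustrative):

<blockquote><pre>
CREATE TEMP TABLE session_cache(key TEXT PRIMARY KEY, value BLOB);
-- Visible only to this connection; stored in a separate temporary
-- database file, subject to the temp_store settings described below.
</pre></blockquote>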
<p>
The temporary files associated with the TEMP database and its
rollback journal are only created if the application makes use
of the "CREATE TEMP TABLE" statement.
</p>

<tcl>hd_fragment views</tcl>
<h2>Materializations Of Views And Subqueries</h2>

<p>Queries that contain subqueries must sometimes evaluate
the subqueries separately and store the results in a temporary
table, then use the content of the temporary table to evaluate
the outer query.
We call this "materializing" the subquery.
The query optimizer in SQLite attempts to avoid materializing,
or not the subquery or outer query contain aggregate functions,
ORDER BY or GROUP BY clauses, LIMIT clauses, and so forth.
The rules for when a query can and cannot be flattened are
very complex and are beyond the scope of this document.
</p>

<tcl>hd_fragment transidx</tcl>
<h2>Transient Indices</h2>

<p>
SQLite may make use of transient indices to
implement SQL language features such as:
</p>

<ul>
<p>
Note that the UNION ALL operator for compound queries does not
use transient indices by itself (though of course the right
and left subqueries of the UNION ALL might use transient indices
depending on how they are composed).

<tcl>hd_fragment vacuumdb</tcl>
<h2>Transient Database Used By [VACUUM]</h2>

<p>
The [VACUUM] command works by creating a temporary file
and then rebuilding the entire database into that temporary
file.  Then the content of the temporary file is copied back
into the original database file and the temporary file is
deleted.
</p>

<p>
The temporary file created by the [VACUUM] command exists only
for the duration of the command itself.  The size of the temporary
file will be no larger than the original database.
</p>

<tcl>hd_fragment tempstore *tempstore</tcl>
<h1>The SQLITE_TEMP_STORE Compile-Time Parameter and Pragma</h1>

<p>
The temporary files associated with transaction control, namely
the rollback journal, master journal, write-ahead log (WAL) files,
and shared-memory files, are always written to disk.
But the other kinds of temporary files might be stored in memory
only and never written to disk.
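
<p>When the compile-time setting permits, that choice can be steered at
run-time with the [temp_store pragma]; a sketch:

<blockquote><pre>
PRAGMA temp_store = MEMORY;   -- hold TEMP tables and transient indices in memory
PRAGMA temp_store = FILE;     -- force them into temporary disk files
PRAGMA temp_store = DEFAULT;  -- defer to the SQLITE_TEMP_STORE compile-time value
</pre></blockquote>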
and the master journal.  The rollback journal and the master
journal are always written to disk regardless of the settings of
the [SQLITE_TEMP_STORE] compile-time parameter and the
[temp_store pragma].
</p>

<tcl>hd_fragment otheropt</tcl>
<h1>Other Temporary File Optimizations</h1>

<p>
SQLite uses a page cache of recently read and written database
pages.  This page cache is used not just for the main database
file but also for transient indices and tables stored in temporary
files.  If SQLite needs to use a temporary index or table and
the [SQLITE_TEMP_STORE] compile-time parameter and the
same for every temporary table and index.  The value cannot
be changed at run-time or on a per-table or per-index basis.
Each temporary file gets its own private page cache with its
own SQLITE_DEFAULT_TEMP_CACHE_SIZE page limit.
</p>

<tcl>hd_fragment tempdir {temporary directory search algorithm}</tcl>
<h1>Temporary File Storage Locations</h1>

<p>
The directory or folder in which temporary files are created is
determined by the OS-specific [VFS].

<p>
On unix-like systems, directories are searched in the following order:
Changes to pages/testing.in.
<title>How SQLite Is Tested</title>
<tcl>hd_keywords testing *tested {test suite}</tcl>

<table_of_contents>

<tcl>
# This document contains many size statistics about SQLite, statistics
# that change frequently.  We want the document to be up-to-date.  To
# facilitate that, all the size values are defined by variables here
# which are then used as needed through the document.
#
# NOTE:  Also update the version number in the text!!!
}
proc version {} {
  hd_puts $::stat(version)
}

</tcl>

<h1>Introduction</h1>

<p>The reliability and robustness of SQLite is achieved in part
by thorough and careful testing.</p>

<p>As of [version 3.12.0],
the SQLite library consists of approximately
<tcl>KB {$stat(coreSLOC)}</tcl> KSLOC of C code.
(KSLOC means thousands of "Source Lines Of Code" or, in other words,
lines of code excluding blank lines and comments.)
By comparison, the project has
<tcl>
hd_puts "[expr {int($stat(totalSLOC)/$stat(coreSLOC))}] times as much"
</tcl>
test code and test scripts - 
<tcl>KB {$stat(totalSLOC)}</tcl> KSLOC.</p>

<h2>Executive Summary</h2>

<ul>
<li> Three independently developed test harnesses
<li> 100% branch test coverage in an as-deployed configuration
<li> Millions and millions of test cases
<li> Out-of-memory tests
<li> I/O error tests
<li> Crash and power loss tests
<li> Fuzz tests
<li> Boundary value tests
<li> Disabled optimization tests
<li> Regression tests
<li> Malformed database tests
<li> Extensive use of assert() and run-time checks
<li> Valgrind analysis
<li> Undefined behavior checks
<li> Checklists
</ul>

<tcl>hd_fragment {harnesses} {test harness} {three test harnesses}</tcl>
<h1>Test Harnesses</h1>

<p>There are three independent test harnesses used for testing the 
core SQLite library.
Each test harness is designed, maintained, and managed separately
from the others.
</p>

<tcl>KB {$stat(vqNEval)}</tcl> thousand test cases.
The veryquick tests include most tests other than the anomaly, fuzz, and 
soak tests.  The idea behind the veryquick tests is that they are
sufficient to catch most errors, but also run in only a few minutes
instead of a few hours.</p>

<tcl>hd_fragment anomoly</tcl>
<h1>Anomaly Testing</h1>

<p>Anomaly tests are tests designed to verify the correct behavior
of SQLite when something goes wrong.  It is (relatively) easy to build
an SQL database engine that behaves correctly on well-formed inputs
on a fully functional computer.  It is more difficult to build a system
that responds sanely to invalid inputs and continues to function following
system malfunctions.  The anomaly tests are designed to verify the latter
behavior.</p>

<tcl>hd_fragment oomtesting</tcl>
<h2>Out-Of-Memory Testing</h2>

<p>SQLite, like all SQL database engines, makes extensive use of
malloc().  (See the separate report on
[memory allocation | dynamic memory allocation in SQLite] for
additional detail.)
On servers and workstations, malloc() never fails in practice and so correct
handling of out-of-memory (OOM) errors is not particularly important.
repeated.  The loop continues until the entire operation runs to
completion without ever encountering a simulated OOM failure.
Tests like this are run twice, once with the instrumented malloc
set to fail only once, and again with the instrumented malloc set
to fail continuously after the first failure.</p>

<tcl>hd_fragment ioerrtesting</tcl>
<h2>I/O Error Testing</h2>

<p>I/O error testing seeks to verify that SQLite responds sanely
to failed I/O operations.  I/O errors might result from a full disk drive,
malfunctioning disk hardware, network outages when using a network
file system, system configuration or permission changes that occur in the 
middle of an SQL operation, or other hardware or operating system 
malfunctions.  Whatever the cause, it is important that SQLite be able

<p>In I/O error tests, after the I/O error simulation failure mechanism
is disabled, the database is examined using
[PRAGMA integrity_check] to make sure that the I/O error has not
introduced database corruption.</p>
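
<p>The check itself is a single statement:

<blockquote><pre>
PRAGMA integrity_check;  -- reports 'ok' when no corruption is found
</pre></blockquote>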

<tcl>hd_fragment crashtesting</tcl>
<h2>Crash Testing</h2>

<p>Crash testing seeks to demonstrate that an SQLite database will not
go corrupt if the application or operating system crashes or if there
is a power failure in the middle of a database update.  A separate
white-paper titled
<a href="atomiccommit.html">Atomic Commit in SQLite</a> describes the
defensive measures SQLite takes to prevent database corruption following
one expects to see following a power loss.  Then the database is opened
and checks are made to ensure that it is well-formed and that the
transaction either ran to completion or was completely rolled back.
The interior of the loop is repeated multiple times for each
snapshot with different random damage each time.</p>

<tcl>hd_fragment multifail</tcl>
<h2>Compound Failure Tests</h2>

<p>The test suites for SQLite also explore the result of stacking
multiple failures.  For example, tests are run to ensure correct behavior
when an I/O error or OOM fault occurs while trying to recover from a
prior crash.

<tcl>hd_fragment fuzztesting {fuzz testing} {SQL fuzzing}</tcl>
<h1>Fuzz Testing</h1>

<p>[http://en.wikipedia.org/wiki/Fuzz_testing | Fuzz testing]
seeks to establish that SQLite responds correctly to invalid, out-of-range,
or malformed inputs.</p>

<h2>SQL Fuzz</h2>

<p>SQL fuzz testing consists of creating syntactically correct yet
wildly nonsensical SQL statements and feeding them to SQLite to see
what it will do with them.  Usually some kind of error is returned
(such as "no such table").  Sometimes, purely by chance, the SQL
statement also happens to be semantically correct.  In that case, the
resulting prepared statement is run to make sure it gives a reasonable
result.</p>
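
<p>A hypothetical example of the flavor of statement such a generator
might emit (not taken from the actual test suite):

<blockquote><pre>
SELECT CASE WHEN -x THEN 'abc' ELSE quote(NULL) END
  FROM (SELECT 1 AS x)
 WHERE x IN (SELECT x UNION SELECT -17)
 ORDER BY hex(randomblob(4));
</pre></blockquote>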

<p>The SQL fuzz generator tests are part of the TCL test suite.
During a full test run, about <tcl>KB {$stat(nSqlFuzz)}</tcl> 
thousand fuzz SQL statements are
generated and tested.</p>

<tcl>hd_fragment aflfuzz {American Fuzzy Lop fuzzer}</tcl>
<h3>SQL Fuzz Using The American Fuzzy Lop Fuzzer</h3>

<p>The <a href="http://lcamtuf.coredump.cx/afl/">American Fuzzy Lop</a>
or "AFL" fuzzer is a recent (circa 2014) innovation from Michal Zalewski.
Unlike most other fuzzers that blindly generate random inputs, the AFL
fuzzer instruments the program being tested (by modifying the assembly-language
output from the C compiler) and uses that instrumentation to detect when
an input causes the program to do something different - to follow
Both SQL statements and database files are fuzzed.
Billions and billions of mutations have been tried, but AFL's 
instrumentation has narrowed them down to fewer than 50,000 test cases that
cover all distinct behaviors.  Newly discovered test cases are periodically
captured and added to the [TCL test suite] where they can be rerun using
the "make fuzztest" or "make valgrindfuzz" commands.

<h2>Malformed Database Files</h2>

<p>There are numerous test cases that verify that SQLite is able to
deal with malformed database files.
These tests first build a well-formed database file, then add
corruption by changing one or more bytes in the file by some means
other than SQLite.  Then SQLite is used to read the database.
In some cases, the byte changes are in the middle of data.
This causes the content of the database to change while keeping the
database well-formed.
In other cases, unused bytes of the file are modified, which has
no effect on the integrity of the database.
The interesting cases are when bytes of the file that
define database structure get changed.  The malformed database tests
verify that SQLite finds the file format errors and reports them
using the [SQLITE_CORRUPT] return code without overflowing
buffers, dereferencing NULL pointers, or performing other
unwholesome actions.</p>

<h2>Boundary Value Tests</h2>

<p>SQLite defines certain [limits] on its operation, such as the
maximum number of columns in a table, the maximum length of an 
SQL statement, or the maximum value of an integer.  The TCL and TH3 test
suites both contain numerous tests that push SQLite right to the edge
of its defined limits and verify that it performs correctly for
all allowed values.  Additional tests go beyond the defined limits
and verify that SQLite correctly returns errors.  The source code
contains [testcase macros] to verify that both sides of each boundary
have been tested.</p>

<tcl>hd_fragment regressiontesting</tcl>
<h1>Regression Testing</h1>

<p>Whenever a bug is reported against SQLite, that bug is not considered
fixed until new test cases that would exhibit the bug have been added 
to either the TCL or TH3 test suites.
Over the years,
this has resulted in thousands and thousands of new tests.
These regression tests ensure that bugs that have
been fixed in the past are not reintroduced into future versions of
SQLite.</p>

<tcl>hd_fragment leakcheck</tcl>
<h1>Automatic Resource Leak Detection</h1>

<p>A resource leak occurs when system resources
are allocated and never freed.  The most troublesome resource leaks
in many applications are memory leaks - when memory is allocated using
malloc() but never released using free().  But other kinds of resources
can also be leaked:  file descriptors, threads, mutexes, etc.</p>

<p>Both the TCL and TH3 test harnesses automatically track system
resources and report resource leaks on <u>every</u> test run.
No special configuration or setup is required.   The test harnesses
are especially vigilant with regard to memory leaks.  If a change
causes a memory leak, the test harnesses will recognize this
quickly.  SQLite is designed to never leak memory, even after
an exception such as an OOM error or disk I/O error.  The test
harnesses are zealous to enforce this.</p>

<tcl>hd_fragment coverage {test coverage}</tcl>
<h1>Test Coverage</h1>

<p>The SQLite core, including the unix [VFS],
has 100% branch test coverage under [TH3] in
its default configuration as measured by
[http://gcc.gnu.org/onlinedocs/gcc/Gcov.html | gcov].
Extensions such as FTS3 and RTree are excluded from this
analysis.</p>

<tcl>hd_fragment stmtvbr</tcl>
<h2>Statement versus branch coverage</h2>

<p>There are many ways to measure test coverage.  The most popular
metric is "statement coverage".  When you hear someone say that their
program as "XX% test coverage" without further explanation, they usually
mean statement coverage.  Statement coverage measures what percentage
of lines of code are executed at least once by the test suite.</p>

but all three are required for 100% branch coverage.  Generally speaking,
100% branch coverage implies 100% statement coverage, but the converse is
not true.  To reemphasize, the
[TH3] test harness for SQLite provides the stronger form of
test coverage - 100% branch test coverage.</p>

<tcl>hd_fragment defensivecode</tcl>
<h2>Coverage testing of defensive code</h2>

<p>A well-written C program will typically contain some defensive
conditionals which in practice are always true or always false.
This leads to a 
programming dilemma:  Does one remove defensive code in order to obtain
100% branch coverage?</p>

the ALWAYS() and NEVER() definitions shown above.  All three test runs
should yield exactly the same result.  There is a run-time test using
the [sqlite3_test_control]([SQLITE_TESTCTRL_ALWAYS], ...) interface that
can be used to verify that the macros are correctly set to the first
form (the pass-through form) for deployment.</p>

<tcl>hd_fragment {testcase} {testcase macros}</tcl>
<h2>Forcing coverage of boundary values and boolean vector tests</h2>

<p>Another macro used in conjunction with test coverage measurement is
the <tt>testcase()</tt> macro.  The argument is a condition for which
we want test cases that evaluate to both true and false.
In non-coverage builds (that is to say, in release builds) the
<tt>testcase()</tt> macro is a no-op:</p>

if( (mask & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB))!=0 ){ ... }
</pre></blockquote>

<p>The SQLite source code contains <tcl>N {$stat(nTestcase)}</tcl>
uses of the <tt>testcase()</tt> macro.</p>

<tcl>hd_fragment {mcdc} *MC/DC {MC/DC testing}</tcl>
<h2>Branch coverage versus MC/DC</h2>

<p>Two methods of measuring test coverage were described above:
"statement" and "branch" coverage.  There are many other test coverage
metrics besides these two.  Another popular metric is "Modified
Condition/Decision Coverage" or MC/DC.  
[http://en.wikipedia.org/wiki/Modified_Condition/Decision_Coverage | Wikipedia]
defines MC/DC as follows:</p>
might not be satisfied.</p>

<p>SQLite uses <tt>testcase()</tt> macros as described in the previous
subsection to make sure that every condition in a bit-vector decision takes
on every possible outcome.  In this way, SQLite also achieves 100% MC/DC
in addition to 100% branch coverage.</p>

<h2>Measuring branch coverage</h2>

<p>Branch coverage in SQLite is currently measured
using [https://gcc.gnu.org/onlinedocs/gcc/Gcov.html|gcov] with the "-b"
option.  First the test program is compiled using options
"-g -fprofile-arcs -ftest-coverage" and then the test program is run.
Then "gcov -b" is run to generate a coverage report.
The coverage report is verbose and inconvenient to read, 
or a bug in the compiler.
Note that SQLite has, over the previous decade, encountered bugs
in each of GCC, Clang, and MSVC.  Compiler bugs, while rare, do happen,
which is why it is so important to test the code in an as-delivered
configuration.

<tcl>hd_fragment thoughts1</tcl>
<h2>Experience with full test coverage</h2>

<p>The developers of SQLite have found that full coverage testing is an
extremely effective method for locating and preventing bugs.
Because every single branch
instruction in SQLite core code is covered by test cases, the developers
can be confident that changes made in one part of the code
do not have unintended consequences in other parts of the code.
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
However, we think that full-coverage testing is justified for a
[most widely deployed|very widely deployed] infrastructure library
like SQLite, and especially for a database library which by its very
nature "remembers" past mistakes.


<tcl>hd_fragment dynamicanalysis</tcl>
<h2>8.0 Dynamic Analysis</h2>

<p>Dynamic analysis refers to internal and external checks on the
SQLite code which are performed while the code is live and running.
Dynamic analysis has proven to be a great help in maintaining the
quality of SQLite.</p>

<tcl>hd_fragment asserts</tcl>
<h3>8.1 Assert</h3>

<p>The SQLite core contains <tcl>N {$stat(nAssert)}</tcl> <tt>assert()</tt>
statements that verify function preconditions and postconditions and
loop invariants.  Assert() is a macro which is a standard part of
ANSI-C.  The argument is a boolean value that is assumed to always be
true.  If the assertion is false, the program prints an error message
and halts.</p>

<p>Assert() macros are disabled by compiling with the NDEBUG macro defined.
In most systems, asserts are enabled by default.  But in SQLite, the
asserts are so numerous and in such performance-critical places that
the database engine runs about three times slower when asserts are enabled.
Hence, the default (production) build of SQLite disables asserts.  
Assert statements are only enabled when SQLite is compiled with the
SQLITE_DEBUG preprocessor macro defined.</p>

<tcl>hd_fragment valgrind</tcl>
<h3>8.2 Valgrind</h3>

<p>[http://valgrind.org/ | Valgrind] is perhaps the most amazing
and useful developer tool in the world.  Valgrind is a simulator - it simulates
an x86 running a Linux binary.  (Ports of Valgrind for platforms other
than Linux are in development, but as of this writing, Valgrind only
works reliably on Linux, which in the opinion of the SQLite developers 
means that Linux should be the preferred platform for all software development.)
However, we think that full-coverage testing is justified for a
[most widely deployed|very widely deployed] infrastructure library
like SQLite, and especially for a database library which by its very
nature "remembers" past mistakes.


<tcl>hd_fragment dynamicanalysis</tcl>
<h1>Dynamic Analysis</h1>

<p>Dynamic analysis refers to internal and external checks on the
SQLite code which are performed while the code is live and running.
Dynamic analysis has proven to be a great help in maintaining the
quality of SQLite.</p>

<tcl>hd_fragment asserts</tcl>
<h2>Assert</h2>

<p>The SQLite core contains <tcl>N {$stat(nAssert)}</tcl> <tt>assert()</tt>
statements that verify function preconditions and postconditions and
loop invariants.  Assert() is a macro which is a standard part of
ANSI-C.  The argument is a boolean value that is assumed to always be
true.  If the assertion is false, the program prints an error message
and halts.</p>

<p>Assert() macros are disabled by compiling with the NDEBUG macro defined.
In most systems, asserts are enabled by default.  But in SQLite, the
asserts are so numerous and in such performance-critical places that
the database engine runs about three times slower when asserts are enabled.
Hence, the default (production) build of SQLite disables asserts.  
Assert statements are only enabled when SQLite is compiled with the
SQLITE_DEBUG preprocessor macro defined.</p>
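
<p>Reduced to a schematic (the function below is hypothetical, not
code from the SQLite tree), the pattern looks like this:</p>

<blockquote><pre>
#include &lt;assert.h&gt;

/* When SQLITE_DEBUG is not defined, the SQLite build defines NDEBUG,
** so both assert() statements below compile to nothing in the
** production build. */
static int pageSlot(int pgno, int nPage){
  assert( pgno&gt;=1 );      /* precondition: page numbers start at 1 */
  assert( nPage&gt;=pgno );  /* precondition: the page must exist */
  return pgno-1;
}
</pre></blockquote>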

<tcl>hd_fragment valgrind</tcl>
<h2>Valgrind</h2>

<p>[http://valgrind.org/ | Valgrind] is perhaps the most amazing
and useful developer tool in the world.  Valgrind is a simulator - it simulates
an x86 running a Linux binary.  (Ports of Valgrind for platforms other
than Linux are in development, but as of this writing, Valgrind only
works reliably on Linux, which in the opinion of the SQLite developers 
means that Linux should be the preferred platform for all software development.)
running in Valgrind on a workstation will perform about the same as it
would running natively on a smartphone.)  So it is impractical to run the full
SQLite test suite through Valgrind.  However, the veryquick tests and
the coverage of the TH3 tests are run through Valgrind prior to every
release.</p>

<tcl>hd_fragment memtesting</tcl>
<h3>8.3 Memsys2</h3>

<p>SQLite contains a pluggable
[memory allocation | memory allocation subsystem].
The default implementation uses system malloc() and free(). 
However, if SQLite is compiled with [SQLITE_MEMDEBUG], an alternative
memory allocation wrapper ([memsys2])
is inserted that looks for memory allocation
errors at run-time.  The memsys2 wrapper checks for memory leaks, of
course, but also looks for buffer overruns, uses of uninitialized memory,
and attempts to use memory after it has been freed.  These same checks
are also done by Valgrind (and, indeed, Valgrind does them better)
but memsys2 has the advantage of being much faster than Valgrind, which
means the checks can be done more often and for longer tests.</p>

<tcl>hd_fragment mutextesting</tcl>
<h3>8.4 Mutex Asserts</h3>

<p>SQLite contains a pluggable mutex subsystem.  Depending on 
compile-time options, the default mutex system contains interfaces
[sqlite3_mutex_held()] and [sqlite3_mutex_notheld()] that detect
whether or not a particular mutex is held by the calling thread.
These two interfaces are used extensively within assert() statements
in SQLite to verify mutexes are held and released at all the right
moments, in order to double-check that SQLite does work correctly
in multi-threaded applications.</p>

<tcl>hd_fragment journaltest</tcl>
<h3>8.5 Journal Tests</h3>

<p>One of the things that SQLite does to ensure that transactions
are atomic across system crashes and power failures is to write
all changes into the rollback journal file prior to changing the
database.  The TCL test harness contains an alternative
[OS backend] implementation that helps to
verify this is occurring correctly.  The "journal-test VFS" monitors
all disk I/O traffic between the database file and rollback journal,
checking to make sure that nothing is written into the database
file which has not first been written and synced to the rollback journal.
If any discrepancies are found, an assertion fault is raised.</p>

<p>The journal tests are an additional double-check over and above
the crash tests to make sure that SQLite transactions will be atomic
across system crashes and power failures.</p>

<tcl>hd_fragment intoverflow</tcl>
<h3>8.6 Undefined Behavior Checks</h3>

<p>In the C programming language, it is very easy to write code that
has "undefined" or "implementation defined" behavior.
That means that the code might work during development, but then give
a different answer on a different system, or when recompiled using different
compiler options.  
Examples of undefined and implementation-defined behavior in
running in Valgrind on a workstation will perform about the same as it
would running natively on a smartphone.)  So it is impractical to run the full
SQLite test suite through Valgrind.  However, the veryquick tests and
the coverage of the TH3 tests are run through Valgrind prior to every
release.</p>

<tcl>hd_fragment memtesting</tcl>
<h2>Memsys2</h2>

<p>SQLite contains a pluggable
[memory allocation | memory allocation subsystem].
The default implementation uses system malloc() and free(). 
However, if SQLite is compiled with [SQLITE_MEMDEBUG], an alternative
memory allocation wrapper ([memsys2])
is inserted that looks for memory allocation
errors at run-time.  The memsys2 wrapper checks for memory leaks, of
course, but also looks for buffer overruns, uses of uninitialized memory,
and attempts to use memory after it has been freed.  These same checks
are also done by Valgrind (and, indeed, Valgrind does them better)
but memsys2 has the advantage of being much faster than Valgrind, which
means the checks can be done more often and for longer tests.</p>

<tcl>hd_fragment mutextesting</tcl>
<h2>Mutex Asserts</h2>

<p>SQLite contains a pluggable mutex subsystem.  Depending on 
compile-time options, the default mutex system contains interfaces
[sqlite3_mutex_held()] and [sqlite3_mutex_notheld()] that detect
whether or not a particular mutex is held by the calling thread.
These two interfaces are used extensively within assert() statements
in SQLite to verify mutexes are held and released at all the right
moments, in order to double-check that SQLite does work correctly
in multi-threaded applications.</p>
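
<p>In schematic form (the structure and function here are hypothetical;
<tt>sqlite3_mutex_held()</tt> is the real interface), the idiom is:</p>

<blockquote><pre>
#include &lt;assert.h&gt;
#include "sqlite3.h"

typedef struct SharedState {
  sqlite3_mutex *mutex;   /* guards nRef */
  int nRef;
} SharedState;

static void updateSharedState(SharedState *p){
  assert( sqlite3_mutex_held(p-&gt;mutex) );  /* caller must hold the mutex */
  p-&gt;nRef++;
}
</pre></blockquote>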

<tcl>hd_fragment journaltest</tcl>
<h2>Journal Tests</h2>

<p>One of the things that SQLite does to ensure that transactions
are atomic across system crashes and power failures is to write
all changes into the rollback journal file prior to changing the
database.  The TCL test harness contains an alternative
[OS backend] implementation that helps to
verify this is occurring correctly.  The "journal-test VFS" monitors
all disk I/O traffic between the database file and rollback journal,
checking to make sure that nothing is written into the database
file which has not first been written and synced to the rollback journal.
If any discrepancies are found, an assertion fault is raised.</p>

<p>The journal tests are an additional double-check over and above
the crash tests to make sure that SQLite transactions will be atomic
across system crashes and power failures.</p>

<tcl>hd_fragment intoverflow</tcl>
<h2>Undefined Behavior Checks</h2>

<p>In the C programming language, it is very easy to write code that
has "undefined" or "implementation defined" behavior.
That means that the code might work during development, but then give
a different answer on a different system, or when recompiled using different
compiler options.  
Examples of undefined and implementation-defined behavior in
on 32-bit and 64-bit systems and on big-endian and little-endian systems,
using a variety of CPU architectures.
Furthermore, the test suites are augmented with many test cases that are
deliberately designed to provoke undefined behavior.  For example:
"<b>SELECT -1*(-9223372036854775808);</b>".

<tcl>hd_fragment disopttest</tcl>
<h2>9.0 Disabled Optimization Tests</h2>

<p>The [sqlite3_test_control]([SQLITE_TESTCTRL_OPTIMIZATIONS], ...) interface
allows selected SQL statement optimizations to be disabled at run-time.
SQLite should always generate exactly the same answer with optimizations
enabled and with optimizations disabled; the answer simply arrives quicker
with the optimizations turned on.  So in a production environment, one always
leaves the optimizations turned on (the default setting).</p>
on 32-bit and 64-bit systems and on big-endian and little-endian systems,
using a variety of CPU architectures.
Furthermore, the test suites are augmented with many test cases that are
deliberately designed to provoke undefined behavior.  For example:
"<b>SELECT -1*(-9223372036854775808);</b>".

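<p>At the C level, the hazard probed by that SQL statement looks like
the following sketch; the saturating fallback shown is purely
illustrative, not a description of the answer SQLite computes:</p>

<blockquote><pre>
#include &lt;stdint.h&gt;

/* Negating the most negative 64-bit integer overflows, which is
** undefined behavior in C, so the case must be tested for explicitly
** rather than relying on any particular compiler's result. */
static int64_t negate64(int64_t v){
  if( v==INT64_MIN ) return INT64_MAX;  /* illustrative saturation */
  return -v;
}
</pre></blockquote>
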
<tcl>hd_fragment disopttest</tcl>
<h1>Disabled Optimization Tests</h1>

<p>The [sqlite3_test_control]([SQLITE_TESTCTRL_OPTIMIZATIONS], ...) interface
allows selected SQL statement optimizations to be disabled at run-time.
SQLite should always generate exactly the same answer with optimizations
enabled and with optimizations disabled; the answer simply arrives quicker
with the optimizations turned on.  So in a production environment, one always
leaves the optimizations turned on (the default setting).</p>
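
<p>A differential test built on this interface might be sketched as
follows; <tt>runQuery()</tt> is a hypothetical helper that executes an
SQL statement and returns its complete result as a string:</p>

<blockquote><pre>
#include &lt;assert.h&gt;
#include &lt;string.h&gt;
#include "sqlite3.h"

extern char *runQuery(sqlite3*, const char*);  /* hypothetical helper */

static void checkSameAnswer(sqlite3 *db, const char *zSql){
  char *zFast, *zSlow;
  sqlite3_test_control(SQLITE_TESTCTRL_OPTIMIZATIONS, db, 0x0000);
  zFast = runQuery(db, zSql);         /* all optimizations enabled */
  sqlite3_test_control(SQLITE_TESTCTRL_OPTIMIZATIONS, db, 0xffff);
  zSlow = runQuery(db, zSql);         /* optimizations disabled */
  assert( strcmp(zFast, zSlow)==0 );  /* the answers must agree */
}
</pre></blockquote>
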
But the majority of test cases simply check that the correct answer
was obtained, and all of those cases can be run successfully with and
without the optimizations, in order to show that the optimizations do not
cause malfunctions.</p>


<tcl>hd_fragment cklist</tcl>
<h2>10.0 Checklists</h2>

<p>The SQLite developers use an on-line checklist to coordinate testing
activity and to verify that all tests pass prior to each SQLite release.
<a href="http://www.sqlite.org/checklists/index.html">Past checklists</a>
are retained for historical reference.
(The checklists are read-only for anonymous internet viewers, but
developers can log in and update checklist items in their web
But the majority of test cases simply check that the correct answer
was obtained, and all of those cases can be run successfully with and
without the optimizations, in order to show that the optimizations do not
cause malfunctions.</p>


<tcl>hd_fragment cklist</tcl>
<h1>Checklists</h1>

<p>The SQLite developers use an on-line checklist to coordinate testing
activity and to verify that all tests pass prior to each SQLite release.
<a href="http://www.sqlite.org/checklists/index.html">Past checklists</a>
are retained for historical reference.
(The checklists are read-only for anonymous internet viewers, but
developers can log in and update checklist items in their web
potential problems are discovered, new checklist items are added to
make sure those problems do not appear in subsequent releases.  The
release checklist has proven to be an invaluable tool in helping to
ensure that nothing is overlooked during the release process.</p>


<tcl>hd_fragment staticanalysis</tcl>
<h2>11.0 Static Analysis</h2>

<p>Static analysis means analyzing code at or before compile-time to
check for correctness.  Static analysis includes looking at compiler
warning messages and running the code through more in-depth
analysis engines such as the
[http://clang-analyzer.llvm.org/ | Clang Static Analyzer].
SQLite compiles without warnings on GCC and Clang using 
potential problems are discovered, new checklist items are added to
make sure those problems do not appear in subsequent releases.  The
release checklist has proven to be an invaluable tool in helping to
ensure that nothing is overlooked during the release process.</p>


<tcl>hd_fragment staticanalysis</tcl>
<h1>Static Analysis</h1>

<p>Static analysis means analyzing code at or before compile-time to
check for correctness.  Static analysis includes looking at compiler
warning messages and running the code through more in-depth
analysis engines such as the
[http://clang-analyzer.llvm.org/ | Clang Static Analyzer].
SQLite compiles without warnings on GCC and Clang using 
<p>Static analysis has not proven to be especially helpful in finding
bugs in SQLite.  Static analysis has found a few bugs in SQLite, but
those are the exceptions.  More bugs have been
introduced into SQLite while trying to get it to compile without 
warnings than have been found by static analysis.</p>

<tcl>hd_fragment summary</tcl>
<h2>12.0 Summary</h2>

<p>SQLite is open source.  This gives many people the idea that
it is not as well tested as commercial software and is perhaps unreliable.
But that impression is false.  
SQLite has exhibited very high reliability in the field and
a very low defect rate, especially considering how rapidly it is evolving.
The quality of SQLite is achieved in part by careful code design and
implementation.  But extensive testing also plays a vital role in
maintaining and improving the quality of SQLite.  This document has
summarized the testing procedures that every release of SQLite undergoes
with the hope of inspiring confidence that SQLite is
suitable for use in mission-critical applications.</p>
<p>Static analysis has not proven to be especially helpful in finding
bugs in SQLite.  Static analysis has found a few bugs in SQLite, but
those are the exceptions.  More bugs have been
introduced into SQLite while trying to get it to compile without 
warnings than have been found by static analysis.</p>

<tcl>hd_fragment summary</tcl>
<h1>Summary</h1>

<p>SQLite is open source.  This gives many people the idea that
it is not as well tested as commercial software and is perhaps unreliable.
But that impression is false.  
SQLite has exhibited very high reliability in the field and
a very low defect rate, especially considering how rapidly it is evolving.
The quality of SQLite is achieved in part by careful code design and
implementation.  But extensive testing also plays a vital role in
maintaining and improving the quality of SQLite.  This document has
summarized the testing procedures that every release of SQLite undergoes
with the hope of inspiring confidence that SQLite is
suitable for use in mission-critical applications.</p>
Changes to pages/th3.in.
<title>SQLite TH3</title>
<tcl>hd_keywords {TH3}</tcl>

<h1 align="center">TH3: SQLite Test Harness #3</h1>

<h2>1.0 Overview</h2>

<p>SQLite Test Harness #3 (hereafter "TH3") is one of
[three test harnesses] used for testing SQLite.
TH3 is designed to meet the following objectives:</p>

<ul>
<li><p> TH3 is able to run on embedded platforms that lack the support
<title>TH3 Test Harness</title>
<tcl>hd_keywords {TH3}</tcl>


<fancy_format>
<h1>Overview</h1>

<p>SQLite Test Harness #3 (hereafter "TH3") is one of
[three test harnesses] used for testing SQLite.
TH3 is designed to meet the following objectives:</p>

<ul>
<li><p> TH3 is able to run on embedded platforms that lack the support
<p>TH3 was originally written for validation testing only, but has
subsequently been used for development testing and debugging
as well, and has proven very helpful in those roles.  A full-coverage
test takes less than five minutes on a workstation and hence
serves as a fast regression test during day-to-day maintenance
of the SQLite code base.</p>

<h2>2.0 Operation</h2>

<p>TH3 is a test program generator.  The output of TH3 is a program
implemented in C and intended to be
linked against the SQLite library under test.  The generated test
program is compiled and run on the target platform in order to verify
correct operation of SQLite on that platform.</p>

<p>TH3 was originally written for validation testing only, but has
subsequently been used for development testing and debugging
as well, and has proven very helpful in those roles.  A full-coverage
test takes less than five minutes on a workstation and hence
serves as a fast regression test during day-to-day maintenance
of the SQLite code base.</p>

<h1>Operation</h1>

<p>TH3 is a test program generator.  The output of TH3 is a program
implemented in C and intended to be
linked against the SQLite library under test.  The generated test
program is compiled and run on the target platform in order to verify
correct operation of SQLite on that platform.</p>

written SQL with the remainder either in pure C or a combination of C and SQL.
</p>

<p>Each test module file contains a header which describes the circumstances
under which the test is valid.  For a particular configuration, only those
modules that are compatible with the configuration are run.  </p>

<h2>3.0 Generating A Test Program</h2>

<p>The TH3 program generator is a TCL script named "<tt>mkth3.tcl</tt>".
To generate a test program, one has merely to run this script and supply
the names of files containing test modules and configurations on the
command line.  Test modules are files that use the "<tt>.test</tt>" suffix
and configurations are files that use the "<tt>.cfg</tt>" suffix.  A
typical invocation of mkth3.tcl might look something like the following:</p>
written SQL with the remainder either in pure C or a combination of C and SQL.
</p>

<p>Each test module file contains a header which describes the circumstances
under which the test is valid.  For a particular configuration, only those
modules that are compatible with the configuration are run.  </p>

<h1>Generating A Test Program</h1>

<p>The TH3 program generator is a TCL script named "<tt>mkth3.tcl</tt>".
To generate a test program, one has merely to run this script and supply
the names of files containing test modules and configurations on the
command line.  Test modules are files that use the "<tt>.test</tt>" suffix
and configurations are files that use the "<tt>.cfg</tt>" suffix.  A
typical invocation of mkth3.tcl might look something like the following:</p>
configurations and "pager08", "build33", "orderby01", etc. are test modules.
Compile-time and run-time options are available to increase or decrease
the amount of output.
The output can be increased by showing each test case within each
test module.  The output can be decreased
by degrees: omitting test module starts and stops,
omitting configuration starts and stops, and finally by omitting all output.</p>

<h3>3.1 Test Automation Scripts</h3>

<p>TH3 comes with additional TCL scripts that help automate the testing
process on workstations.  The "th3make" script automatically runs "mkth3.tcl"
and "gcc" and then runs the resulting test program and checks the results.
Arguments to th3make include all of the "*.test" test modules and 
"*.cfg" configurations that are to be included in the test.  Additional
options to th3make can cause the test program to be compiled using different
configurations and "pager08", "build33", "orderby01", etc. are test modules.
Compile-time and run-time options are available to increase or decrease
the amount of output.
The output can be increased by showing each test case within each
test module.  The output can be decreased
by degrees: omitting test module starts and stops,
omitting configuration starts and stops, and finally by omitting all output.</p>

<h2>Test Automation Scripts</h2>

<p>TH3 comes with additional TCL scripts that help automate the testing
process on workstations.  The "th3make" script automatically runs "mkth3.tcl"
and "gcc" and then runs the resulting test program and checks the results.
Arguments to th3make include all of the "*.test" test modules and 
"*.cfg" configurations that are to be included in the test.  Additional
options to th3make can cause the test program to be compiled using different
The two-line summary at the bottom shows the total number of errors and tests
over all th3make runs and the total elapsed time, together with the 
[SQLITE_SOURCE_ID] information for the version of SQLite that was
tested.  This summary information is recorded in the
<a href="https://www.sqlite.org/checklists/3081000/index#c6">release
checklist</a> during testing.

<h2>4.0 Test Coverage</h2>

<p>Using one particular subset of the available TH3 test modules (the "cov1"
tests) SQLite obtained 
[test coverage | 100% branch test coverage] and 100% [MC/DC] as measured
by [http://gcc.gnu.org/onlinedocs/gcc/Gcov.html | gcov]
on Linux x86 and x86_64 hardware.  All releases of SQLite since
[version 3.6.17] (circa 2009-08-10) have been tested to this standard. 
The SQLite developers 
are committed to maintaining 100% branch coverage and MC/DC for all 
future releases of SQLite.</p>

<p>The cov1 test set used to obtain 100% branch test coverage is only a
subset of the tests currently implemented using TH3.  New test modules are
added on a regular basis.</p>

<h2>5.0 TH3 License</h2>

<p>SQLite itself is in the <a href="copyright.html">public domain</a> and
can be used for any purpose.  But TH3 is proprietary and requires a license.
</p>

<p>Even though open-source users do not have direct access to TH3, all
users of SQLite benefit from TH3 indirectly since each version of SQLite is
validated running TH3 on multiple platforms (Linux, Windows, WinRT, Mac,
OpenBSD) prior to release.  So anyone using an official release
of SQLite can deploy their application with the confidence of knowing that
it has been tested using TH3.  They simply cannot rerun those tests
themselves without purchasing a TH3 license.</p>
The two-line summary at the bottom shows the total number of errors and tests
over all th3make runs and the total elapsed time, together with the 
[SQLITE_SOURCE_ID] information for the version of SQLite that was
tested.  This summary information is recorded in the
<a href="https://www.sqlite.org/checklists/3081000/index#c6">release
checklist</a> during testing.

<h1>Test Coverage</h1>

<p>Using one particular subset of the available TH3 test modules (the "cov1"
tests) SQLite obtained 
[test coverage | 100% branch test coverage] and 100% [MC/DC] as measured
by [http://gcc.gnu.org/onlinedocs/gcc/Gcov.html | gcov]
on Linux x86 and x86_64 hardware.  All releases of SQLite since
[version 3.6.17] (circa 2009-08-10) have been tested to this standard. 
The SQLite developers 
are committed to maintaining 100% branch coverage and MC/DC for all 
future releases of SQLite.</p>

<p>The cov1 test set used to obtain 100% branch test coverage is only a
subset of the tests currently implemented using TH3.  New test modules are
added on a regular basis.</p>

<h1>TH3 License</h1>

<p>SQLite itself is in the <a href="copyright.html">public domain</a> and
can be used for any purpose.  But TH3 is proprietary and requires a license.
</p>

<p>Even though open-source users do not have direct access to TH3, all
users of SQLite benefit from TH3 indirectly since each version of SQLite is
validated running TH3 on multiple platforms (Linux, Windows, WinRT, Mac,
OpenBSD) prior to release.  So anyone using an official release
of SQLite can deploy their application with the confidence of knowing that
it has been tested using TH3.  They simply cannot rerun those tests
themselves without purchasing a TH3 license.</p>
Changes to pages/threadsafe.in.
<title>Using SQLite In Multi-Threaded Applications</title>
<tcl>hd_keywords {threading mode}</tcl>


<h2>SQLite And Multiple Threads</h2>


<p>SQLite supports three different threading modes:</p>

<ol>
<li><p><b>Single-thread</b>.
In this mode, all mutexes are disabled and SQLite is unsafe to use in
more than a single thread at once.</p></li>
<title>Using SQLite In Multi-Threaded Applications</title>
<tcl>hd_keywords {threading mode}</tcl>

<fancy_format>

<h1>Overview</h1>

<p>SQLite supports three different threading modes:</p>

<ol>
<li><p><b>Single-thread</b>.
In this mode, all mutexes are disabled and SQLite is unsafe to use in
more than a single thread at once.</p></li>
overridden once selected.
</p>

<p>
The default mode is serialized.
</p>

<h3>Compile-time selection of threading mode</h3>

<p>
Use the [SQLITE_THREADSAFE] compile-time parameter to select the
threading mode.  If no [SQLITE_THREADSAFE] compile-time parameter is
present, then serialized mode is used.
This can be made explicit with 
[SQLITE_THREADSAFE | -DSQLITE_THREADSAFE=1].
overridden once selected.
</p>

<p>
The default mode is serialized.
</p>

<h1>Compile-time selection of threading mode</h1>

<p>
Use the [SQLITE_THREADSAFE] compile-time parameter to select the
threading mode.  If no [SQLITE_THREADSAFE] compile-time parameter is
present, then serialized mode is used.
This can be made explicit with 
[SQLITE_THREADSAFE | -DSQLITE_THREADSAFE=1].
<p>
If single-thread mode is selected at compile-time, then critical
mutexing logic is omitted from the build and it is impossible to
enable either multi-thread or serialized modes at start-time or
run-time.
</p>

<h3>Start-time selection of threading mode</h3>

<p>
Assuming that the compile-time threading mode is not single-thread, then
the threading mode can be changed during initialization using the
[sqlite3_config()] interface.  The [SQLITE_CONFIG_SINGLETHREAD] verb
puts SQLite into single-thread mode, the [SQLITE_CONFIG_MULTITHREAD]
verb sets multi-thread mode, and the [SQLITE_CONFIG_SERIALIZED] verb
sets serialized mode.
</p>

<h3>Run-time selection of threading mode</h3>

<p>If single-thread mode has not been selected at compile-time or start-time,
then individual database connections can be created as either multi-thread
or serialized.  It is not possible to downgrade an individual database
connection to single-thread mode.  Nor is it possible to escalate an
individual database connection if the compile-time or start-time mode
is single-thread.</p>
<p>
If single-thread mode is selected at compile-time, then critical
mutexing logic is omitted from the build and it is impossible to
enable either multi-thread or serialized modes at start-time or
run-time.
</p>

<h1>Start-time selection of threading mode</h1>

<p>
Assuming that the compile-time threading mode is not single-thread, then
the threading mode can be changed during initialization using the
[sqlite3_config()] interface.  The [SQLITE_CONFIG_SINGLETHREAD] verb
puts SQLite into single-thread mode, the [SQLITE_CONFIG_MULTITHREAD]
verb sets multi-thread mode, and the [SQLITE_CONFIG_SERIALIZED] verb
sets serialized mode.
</p>
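
<p>For example, an application that wants multi-thread mode could
initialize the library as in this sketch:</p>

<blockquote><pre>
#include "sqlite3.h"

/* Must run before any other SQLite interface is used (other than
** sqlite3_config() itself); otherwise sqlite3_config() returns
** SQLITE_MISUSE. */
static int initLibrary(void){
  int rc = sqlite3_config(SQLITE_CONFIG_MULTITHREAD);
  if( rc!=SQLITE_OK ) return rc;
  return sqlite3_initialize();
}
</pre></blockquote>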

<h1>Run-time selection of threading mode</h1>

<p>If single-thread mode has not been selected at compile-time or start-time,
then individual database connections can be created as either multi-thread
or serialized.  It is not possible to downgrade an individual database
connection to single-thread mode.  Nor is it possible to escalate an
individual database connection if the compile-time or start-time mode
is single-thread.</p>
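
<p>As a sketch, the per-connection choice is made with the
SQLITE_OPEN_NOMUTEX and SQLITE_OPEN_FULLMUTEX flags to
[sqlite3_open_v2()]:</p>

<blockquote><pre>
#include "sqlite3.h"

static int openBoth(sqlite3 **pDb1, sqlite3 **pDb2){
  int f = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE;
  /* db1 is multi-thread: safe only if no two threads use it at once */
  int rc = sqlite3_open_v2("app.db", pDb1, f | SQLITE_OPEN_NOMUTEX, 0);
  if( rc==SQLITE_OK ){
    /* db2 is serialized: safe for unrestricted use by many threads */
    rc = sqlite3_open_v2("app.db", pDb2, f | SQLITE_OPEN_FULLMUTEX, 0);
  }
  return rc;
}
</pre></blockquote>
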
Changes to pages/uri.in.
<title>Uniform Resource Identifiers</title>
<tcl>
hd_keywords {URI} {Uniform Resource Identifier} {URI filename} {URI filenames}
</tcl>



<h1>1.0 URI Filenames In SQLite</h1>

<p>
Beginning with [version 3.7.7], the SQLite database file argument to the
[sqlite3_open()], [sqlite3_open16()], and [sqlite3_open_v2()] interfaces
and to the [ATTACH] command can be specified
either as an ordinary filename or as a Uniform Resource Identifier or URI.
The advantage of using a URI filename is that query parameters on the URI can
be used to control details of the newly created database connection.
For example, an alternative [VFS] can be specified using a 
"vfs=" query parameter.
Or the database can be opened read-only by using "mode=ro" as a query
parameter.
</p>

<h1>2.0 Backwards Compatibility</h1>

<p>
^In order to maintain full backwards compatibility for legacy applications,
the URI filename capability is disabled by default.
^URI filenames can be enabled or disabled using the [SQLITE_USE_URI=1]
or [SQLITE_USE_URI=0] compile-time options.
^The compile-time setting for URI filenames can be changed
<title>Uniform Resource Identifiers</title>
<tcl>
hd_keywords {URI} {Uniform Resource Identifier} {URI filename} {URI filenames}
</tcl>

<fancy_format>

<h1>URI Filenames In SQLite</h1>

<p>
Beginning with [version 3.7.7], the SQLite database file argument to the
[sqlite3_open()], [sqlite3_open16()], and [sqlite3_open_v2()] interfaces
and to the [ATTACH] command can be specified
either as an ordinary filename or as a Uniform Resource Identifier or URI.
The advantage of using a URI filename is that query parameters on the URI can
be used to control details of the newly created database connection.
For example, an alternative [VFS] can be specified using a 
"vfs=" query parameter.
Or the database can be opened read-only by using "mode=ro" as a query
parameter.
</p>
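
<p>For example, the following sketch opens a database read-only through
a URI filename.  The path is made up, and the SQLITE_OPEN_URI flag
enables URI handling for this one call regardless of the library-wide
setting:</p>

<blockquote><pre>
#include "sqlite3.h"

static int openReadOnly(sqlite3 **pDb){
  return sqlite3_open_v2(
      "file:/home/user/app.db?mode=ro",  /* URI with a query parameter */
      pDb,
      SQLITE_OPEN_READONLY | SQLITE_OPEN_URI,
      0);                                /* 0 selects the default VFS */
}
</pre></blockquote>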

<h1>Backwards Compatibility</h1>

<p>
^In order to maintain full backwards compatibility for legacy applications,
the URI filename capability is disabled by default.
^URI filenames can be enabled or disabled using the [SQLITE_USE_URI=1]
or [SQLITE_USE_URI=0] compile-time options.
^The compile-time setting for URI filenames can be changed
with "<tt>file:</tt>"
as an ordinary filename regardless of the URI setting, and because it is
very unusual to have an actual file begin with "<tt>file:</tt>", 
it is safe for most applications to enable URI processing even if URI 
filenames are not currently being used.
</p>

<h1>3.0 URI Format</h1>

<p>
According to [http://tools.ietf.org/html/rfc3986 | RFC 3986], a URI consists
of a scheme, an authority, a path, a query string, and a fragment.  The
scheme is always required.  One of either the authority or the path is also
always required.  The query string and fragment are optional.
</p>
with "<tt>file:</tt>"
as an ordinary filename regardless of the URI setting, and because it is
very unusual to have an actual file begin with "<tt>file:</tt>", 
it is safe for most applications to enable URI processing even if URI 
filenames are not currently being used.
</p>

<h1>URI Format</h1>

<p>
According to [http://tools.ietf.org/html/rfc3986 | RFC 3986], a URI consists
of a scheme, an authority, a path, a query string, and a fragment.  The
scheme is always required.  One of either the authority or the path is also
always required.  The query string and fragment are optional.
</p>
<p>^A filename that is not a well-formed URI is interpreted as an
ordinary filename.</p>

<p>^URIs are processed as UTF8 text.
^The filename argument to sqlite3_open16() is converted from UTF16 
native byte order into UTF8 prior to processing.</p>

<h2>3.1 The URI Path</h2>

<p>^The path component of the URI specifies the disk file that is the
SQLite database to be opened.  ^(If the path component is omitted, then
the database is stored in a temporary file that will be automatically
deleted when the database connection closes.)^  ^If the authority section
is present, then the path is always an absolute pathname.  ^If the 
authority section is omitted, then the path is an absolute pathname if it
<p>^A filename that is not a well-formed URI is interpreted as an
ordinary filename.</p>

<p>^URIs are processed as UTF8 text.
^The filename argument to sqlite3_open16() is converted from UTF16 
native byte order into UTF8 prior to processing.</p>

<h2>The URI Path</h2>

<p>^The path component of the URI specifies the disk file that is the
SQLite database to be opened.  ^(If the path component is omitted, then
the database is stored in a temporary file that will be automatically
deleted when the database connection closes.)^  ^If the authority section
is present, then the path is always an absolute pathname.  ^If the 
authority section is omitted, then the path is an absolute pathname if it
<li>Convert all sequences of two or more "<tt>/</tt>" characters into a
    single "<tt>/</tt>" character.
<li>On Windows only, if the filename begins with a drive letter, prepend
    a single "<tt>/</tt>" character.
<li>Prepend the "<tt>file:</tt>" scheme.
</ol>

<h2>3.2 Query String</h2>

<p>^A URI filename can optionally be followed by a query string.
^The query string consists of text following the first "<tt>?</tt>"
character but excluding the optional fragment that begins with
"<tt>#</tt>".  ^The query string is divided into key/value pairs.
We usually refer to these key/value pairs as "query parameters".
^Key/value pairs are separated by a single "<tt>&amp;</tt>" character.
<li>Convert all sequences of two or more "<tt>/</tt>" characters into a
    single "<tt>/</tt>" character.
<li>On Windows only, if the filename begins with a drive letter, prepend
    a single "<tt>/</tt>" character.
<li>Prepend the "<tt>file:</tt>" scheme.
</ol>

<h2>Query String</h2>

<p>^A URI filename can optionally be followed by a query string.
^The query string consists of text following the first "<tt>?</tt>"
character but excluding the optional fragment that begins with
"<tt>#</tt>".  ^The query string is divided into key/value pairs.
We usually refer to these key/value pairs as "query parameters".
^Key/value pairs are separated by a single "<tt>&amp;</tt>" character.
^The list of query parameters appended to the xOpen filename
is terminated by a single zero-length key.
Note that the value of a query parameter can be an empty string.
</p>

<tcl>hd_fragment coreqp *coreqp {standard query parameters} {URI query parameters} \
    {query parameters with special meaning to SQLite}</tcl>
<h2>3.3 Recognized Query Parameters</h2>

<p>
Some query parameters are interpreted by the SQLite core and used to 
modify the characteristics of the new connection.  ^All query parameters
are always passed through into the xOpen method of the [VFS] even if
they are previously read and interpreted by the SQLite core.
</p>
^The list of query parameters appended to the xOpen filename
is terminated by a single zero-length key.
Note that the value of a query parameter can be an empty string.
</p>
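
<p>Code that holds the filename passed to xOpen (a custom [VFS], for
example) can read the parameters back with the real
<tt>sqlite3_uri_parameter()</tt> interface.  A small sketch, with a
hypothetical helper name:</p>

<blockquote><pre>
#include &lt;string.h&gt;
#include "sqlite3.h"

static int isReadOnlyUri(const char *zName){
  const char *zMode = sqlite3_uri_parameter(zName, "mode");
  if( zMode==0 ) return 0;         /* no mode= parameter was present */
  return strcmp(zMode, "ro")==0;
}
</pre></blockquote>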

<tcl>hd_fragment coreqp *coreqp {standard query parameters} {URI query parameters} \
    {query parameters with special meaning to SQLite}</tcl>
<h2>Recognized Query Parameters</h2>

<p>
Some query parameters are interpreted by the SQLite core and used to 
modify the characteristics of the new connection.  ^All query parameters
are always passed through into the xOpen method of the [VFS] even if
they are previously read and interpreted by the SQLite core.
</p>
the [SQLITE_IOCAP_IMMUTABLE] bit in xDeviceCharacteristics)
asserts that a database file is immutable and that file 
changes anyhow, then SQLite might return incorrect query 
results and/or [SQLITE_CORRUPT] errors.
</dd>
</dl>

<h1>4.0 See Also</h1>

<ul>
<li> [URI filenames in sqlite3_open()]
<li> [URI filename examples]
</ul>

the [SQLITE_IOCAP_IMMUTABLE] bit in xDeviceCharacteristics)
asserts that a database file is immutable and that file 
changes anyhow, then SQLite might return incorrect query 
results and/or [SQLITE_CORRUPT] errors.
</dd>
</dl>

<h1>See Also</h1>

<ul>
<li> [URI filenames in sqlite3_open()]
<li> [URI filename examples]
</ul>
Changes to pages/wal.in.
<title>Write-Ahead Logging</title>
<tcl>hd_keywords {WAL} {write-ahead log} {WAL mode}</tcl>


<h1 align="center">Write-Ahead Logging</h1>


<p>The default method by which SQLite implements
[atomic commit | atomic commit and rollback] is a [rollback journal].
Beginning with [version 3.7.0], a new "Write-Ahead Log" option
(hereafter referred to as "WAL") is available.</p>

<p>There are advantages and disadvantages to using WAL instead of
<title>Write-Ahead Logging</title>
<tcl>hd_keywords {WAL} {write-ahead log} {WAL mode}</tcl>

<fancy_format>

<h1>Overview</h1>

<p>The default method by which SQLite implements
[atomic commit | atomic commit and rollback] is a [rollback journal].
Beginning with [version 3.7.0], a new "Write-Ahead Log" option
(hereafter referred to as "WAL") is available.</p>

<p>There are advantages and disadvantages to using WAL instead of
    It is recommended that one of the rollback journal modes be used for
    transactions larger than a few dozen megabytes.</s>
    Beginning with [version 3.11.0], WAL mode works as efficiently with
    large transactions as does rollback mode.
    
</ol>

<h2>How WAL Works</h2>

<p>The traditional rollback journal works by writing a copy of the
original unchanged database content into a separate rollback journal file
and then writing changes directly into the database file.  In the
event of a crash or [ROLLBACK], the original content contained in the
rollback journal is played back into the database file to
revert the database file to its original state.  The [COMMIT] occurs
when the rollback journal is deleted.</p>

<p>The WAL approach inverts this.  The original content is preserved
in the database file and the changes are appended into a separate
WAL file.  A [COMMIT] occurs when a special record indicating a commit
is appended to the WAL.  Thus a COMMIT can happen without ever writing
to the original database, which allows readers to continue operating
from the original unaltered database while changes are simultaneously being
committed into the WAL.  Multiple transactions can be appended to the
end of a single WAL file.</p>

<tcl>hd_fragment ckpt checkpoint checkpointed checkpointing</tcl>
<h3>Checkpointing</h3>

<p>Of course, one wants to eventually transfer all the transactions that
are appended in the WAL file back into the original database.  Moving
the WAL file transactions back into the database is called a
"<i>checkpoint</i>".<p>

<p>Another way to think about the difference between rollback and 
    It is recommended that one of the rollback journal modes be used for
    transactions larger than a few dozen megabytes.</s>
    Beginning with [version 3.11.0], WAL mode works as efficiently with
    large transactions as does rollback mode.
    
</ol>

<h1>How WAL Works</h1>

<p>The traditional rollback journal works by writing a copy of the
original unchanged database content into a separate rollback journal file
and then writing changes directly into the database file.  In the
event of a crash or [ROLLBACK], the original content contained in the
rollback journal is played back into the database file to
revert the database file to its original state.  The [COMMIT] occurs
when the rollback journal is deleted.</p>

<p>The WAL approach inverts this.  The original content is preserved
in the database file and the changes are appended into a separate
WAL file.  A [COMMIT] occurs when a special record indicating a commit
is appended to the WAL.  Thus a COMMIT can happen without ever writing
to the original database, which allows readers to continue operating
from the original unaltered database while changes are simultaneously being
committed into the WAL.  Multiple transactions can be appended to the
end of a single WAL file.</p>

<tcl>hd_fragment ckpt checkpoint checkpointed checkpointing</tcl>
<h2>Checkpointing</h2>

<p>Of course, one wants to eventually transfer all the transactions that
are appended in the WAL file back into the original database.  Moving
the WAL file transactions back into the database is called a
"<i>checkpoint</i>".<p>

<p>Another way to think about the difference between rollback and 
specify a different default.) Applications using WAL do
not have to do anything in order for these checkpoints to occur.  
But if they want to, applications can adjust the automatic checkpoint
threshold.  Or they can turn off the automatic checkpoints and run 
checkpoints during idle moments or in a separate thread or process.</p>

<tcl>hd_fragment concurrency {WAL concurrency}</tcl>
<h3>Concurrency</h3>

<p>When a read operation begins on a WAL-mode database, it first
remembers the location of the last valid commit record in the WAL.
Call this point the "end mark".  Because the WAL can be growing and
adding new commit records while various readers connect to the database,
each reader can potentially have its own end mark.  But for any
particular reader, the end mark is unchanged for the duration of the
specify a different default.) Applications using WAL do
not have to do anything in order for these checkpoints to occur.  
But if they want to, applications can adjust the automatic checkpoint
threshold.  Or they can turn off the automatic checkpoints and run 
checkpoints during idle moments or in a separate thread or process.</p>

<tcl>hd_fragment concurrency {WAL concurrency}</tcl>
<h2>Concurrency</h2>

<p>When a read operation begins on a WAL-mode database, it first
remembers the location of the last valid commit record in the WAL.
Call this point the "end mark".  Because the WAL can be growing and
adding new commit records while various readers connect to the database,
each reader can potentially have its own end mark.  But for any
particular reader, the end mark is unchanged for the duration of the
the checkpointer has made, and if the entire WAL has been transferred into
the database and synced and if no readers are making use of the WAL, then
the writer will rewind the WAL back to the beginning and start putting new
transactions at the beginning of the WAL.  This mechanism prevents a WAL
file from growing without bound.</p>

<tcl>hd_fragment fast</tcl>
<h3>Performance Considerations</h3>

<p>Write transactions are very fast since they only involve writing
the content once (versus twice for rollback-journal transactions)
and because the writes are all sequential.  Further, syncing the
content to the disk is not required, as long as the application is
willing to sacrifice durability following a power loss or hard reboot.
(Writers sync the WAL on every transaction commit if
the checkpointer has made, and if the entire WAL has been transferred into
the database and synced and if no readers are making use of the WAL, then
the writer will rewind the WAL back to the beginning and start putting new
transactions at the beginning of the WAL.  This mechanism prevents a WAL
file from growing without bound.</p>

<tcl>hd_fragment fast</tcl>
<h2>Performance Considerations</h2>

<p>Write transactions are very fast since they only involve writing
the content once (versus twice for rollback-journal transactions)
and because the writes are all sequential.  Further, syncing the
content to the disk is not required, as long as the application is
willing to sacrifice durability following a power loss or hard reboot.
(Writers sync the WAL on every transaction commit if
vary from one application to another depending on the relative read
and write performance requirements of the application.
The default strategy is to run a checkpoint once the WAL
reaches 1000 pages and this strategy seems to work well in test applications on 
workstations, but other strategies might work better on different 
platforms or for different workloads.</p>

<h2>Activating And Configuring WAL Mode</h2>

<p>An SQLite database connection defaults to 
[journal_mode | journal_mode=DELETE].  To convert to WAL mode, use the
following pragma:</p>

<blockquote><pre>
PRAGMA journal_mode=WAL;
</pre></blockquote>

<p>The journal_mode pragma returns a string which is the new journal mode.
On success, the pragma will return the string "<tt>wal</tt>".  If 
the conversion to WAL could not be completed (for example, if the [VFS]
does not support the necessary shared-memory primitives) then the
journaling mode will be unchanged and the string returned from the
primitive will be the prior journaling mode (for example "<tt>delete</tt>").

<a name="how_to_checkpoint"></a>
<h3>Automatic Checkpoint</h3>

<p>By default, SQLite will automatically checkpoint whenever a [COMMIT]
occurs that causes the WAL file to be 1000 pages or more in size, or when the 
last database connection on a database file closes.  The default 
configuration is intended to work well for most applications.
But programs that want more control can force a checkpoint
using the [wal_checkpoint pragma] or by calling the
[sqlite3_wal_checkpoint()] C interface.  The automatic checkpoint
threshold can be changed or automatic checkpointing can be completely
disabled using the [wal_autocheckpoint pragma] or by calling the
[sqlite3_wal_autocheckpoint()] C interface.  A program can also 
use [sqlite3_wal_hook()] to register a callback to be invoked whenever
any transaction commits to the WAL.  This callback can then invoke
[sqlite3_wal_checkpoint()] or [sqlite3_wal_checkpoint_v2()] based on whatever
criteria it thinks is appropriate.  (The automatic checkpoint mechanism
is implemented as a simple wrapper around [sqlite3_wal_hook()].)</p>

<h3>Application-Initiated Checkpoints</h3>

<p>An application can initiate a checkpoint using any writable database
connection on the database simply by invoking
[sqlite3_wal_checkpoint()] or [sqlite3_wal_checkpoint_v2()].
There are three subtypes of checkpoints that vary in their aggressiveness:
PASSIVE, FULL, and RESTART.  The default checkpoint style is PASSIVE, which
does as much work as it can without interfering with other database
connections, and which might not run to completion if there are
concurrent readers or writers.
All checkpoints initiated by [sqlite3_wal_checkpoint()] and
by the automatic checkpoint mechanism are PASSIVE.  FULL and RESTART
checkpoints try harder to run the checkpoint to completion and can only
be initiated by a call to [sqlite3_wal_checkpoint_v2()].  See the
[sqlite3_wal_checkpoint_v2()] documentation for additional information
on FULL and RESTART checkpoints.</p>

<h3>Persistence of WAL mode</h3>

<p>Unlike the other journaling modes, 
[journal_mode | PRAGMA journal_mode=WAL] is
persistent.  If a process sets WAL mode, then closes and reopens the
database, the database will come back in WAL mode.  In contrast, if
a process sets (for example) PRAGMA journal_mode=TRUNCATE and then closes and
reopens the database, the database will come back up in the default rollback mode of
DELETE rather than the previous TRUNCATE setting.</p>

<p>The persistence of WAL mode means that applications can be converted
to using SQLite in WAL mode without making any changes to the application
itself.  One has merely to run "<tt>PRAGMA journal_mode=WAL;</tt>" on the
database file(s) using the [command-line shell] or other utility, then
restart the application.</p>

<p>The WAL journal mode will be set on all
connections to the same database file if it is set on any one connection.
</p>

<tcl>hd_fragment {readonly} {read-only WAL databases}</tcl>
<h2>Read-Only Databases</h2>

<p>No SQLite database (regardless of whether or not it is WAL mode) is
readable if it is located on read-only media and it requires recovery.
So, for example, if an application crashes and leaves an SQLite database
with a [hot journal], that database cannot be opened unless the opening
process has write privilege on the database file, the directory
containing the database file, and the hot journal.  This is because the 
vary from one application to another depending on the relative read
and write performance requirements of the application.
The default strategy is to run a checkpoint once the WAL
reaches 1000 pages and this strategy seems to work well in test applications on 
workstations, but other strategies might work better on different 
platforms or for different workloads.</p>

<h1>Activating And Configuring WAL Mode</h1>

<p>An SQLite database connection defaults to 
[journal_mode | journal_mode=DELETE].  To convert to WAL mode, use the
following pragma:</p>

<blockquote><pre>
PRAGMA journal_mode=WAL;
</pre></blockquote>

<p>The journal_mode pragma returns a string which is the new journal mode.
On success, the pragma will return the string "<tt>wal</tt>".  If 
the conversion to WAL could not be completed (for example, if the [VFS]
does not support the necessary shared-memory primitives) then the
journaling mode will be unchanged and the string returned from the
primitive will be the prior journaling mode (for example "<tt>delete</tt>").
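
<p>From C code, the same conversion and check might be sketched as
follows:</p>

<blockquote><pre>
#include &lt;string.h&gt;
#include "sqlite3.h"

static int setWalMode(sqlite3 *db){
  sqlite3_stmt *pStmt;
  int ok = 0;
  if( sqlite3_prepare_v2(db, "PRAGMA journal_mode=WAL;", -1,
                         &amp;pStmt, 0)==SQLITE_OK ){
    if( sqlite3_step(pStmt)==SQLITE_ROW ){
      /* the pragma returns the new journal mode as a string */
      ok = strcmp((const char*)sqlite3_column_text(pStmt,0), "wal")==0;
    }
    sqlite3_finalize(pStmt);
  }
  return ok;  /* true if the connection is now in WAL mode */
}
</pre></blockquote>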

<a name="how_to_checkpoint"></a>
<h2>Automatic Checkpoint</h2>

<p>By default, SQLite will automatically checkpoint whenever a [COMMIT]
occurs that causes the WAL file to be 1000 pages or more in size, or when the 
last database connection on a database file closes.  The default 
configuration is intended to work well for most applications.
But programs that want more control can force a checkpoint
using the [wal_checkpoint pragma] or by calling the
[sqlite3_wal_checkpoint()] C interface.  The automatic checkpoint
threshold can be changed or automatic checkpointing can be completely
disabled using the [wal_autocheckpoint pragma] or by calling the
[sqlite3_wal_autocheckpoint()] C interface.  A program can also 
use [sqlite3_wal_hook()] to register a callback to be invoked whenever
any transaction commits to the WAL.  This callback can then invoke
[sqlite3_wal_checkpoint()] or [sqlite3_wal_checkpoint_v2()] based on whatever
criteria it thinks is appropriate.  (The automatic checkpoint mechanism
is implemented as a simple wrapper around [sqlite3_wal_hook()].)</p>
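
<p>A custom policy built this way might be sketched as follows; the
8000-frame threshold is arbitrary, chosen only for illustration:</p>

<blockquote><pre>
#include "sqlite3.h"

static int walHook(void *pArg, sqlite3 *db, const char *zDb, int nFrame){
  (void)pArg;
  if( nFrame&gt;=8000 ){
    sqlite3_wal_checkpoint_v2(db, zDb, SQLITE_CHECKPOINT_PASSIVE, 0, 0);
  }
  return SQLITE_OK;
}

static void installHook(sqlite3 *db){
  sqlite3_wal_autocheckpoint(db, 0);  /* disable the built-in policy */
  sqlite3_wal_hook(db, walHook, 0);
}
</pre></blockquote>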

<h2>Application-Initiated Checkpoints</h2>

<p>An application can initiate a checkpoint using any writable database
connection on the database simply by invoking
[sqlite3_wal_checkpoint()] or [sqlite3_wal_checkpoint_v2()].
There are three subtypes of checkpoints that vary in their aggressiveness:
PASSIVE, FULL, and RESTART.  The default checkpoint style is PASSIVE, which
does as much work as it can without interfering with other database
connections, and which might not run to completion if there are
concurrent readers or writers.
All checkpoints initiated by [sqlite3_wal_checkpoint()] and
by the automatic checkpoint mechanism are PASSIVE.  FULL and RESTART
checkpoints try harder to run the checkpoint to completion and can only
be initiated by a call to [sqlite3_wal_checkpoint_v2()].  See the
[sqlite3_wal_checkpoint_v2()] documentation for additional information
on FULL and RESTART checkpoints.</p>
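
<p>A sketch of an application-initiated FULL checkpoint:</p>

<blockquote><pre>
#include "sqlite3.h"

static int checkpointFull(sqlite3 *db){
  int nLog = 0, nCkpt = 0;  /* frames in the WAL / frames checkpointed */
  return sqlite3_wal_checkpoint_v2(db, "main", SQLITE_CHECKPOINT_FULL,
                                   &amp;nLog, &amp;nCkpt);
}
</pre></blockquote>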

<h2>Persistence of WAL mode</h2>

<p>Unlike the other journaling modes, 
[journal_mode | PRAGMA journal_mode=WAL] is
persistent.  If a process sets WAL mode, then closes and reopens the
database, the database will come back in WAL mode.  In contrast, if
a process sets (for example) PRAGMA journal_mode=TRUNCATE and then closes and
reopens the database, the database will come back up in the default rollback mode of
DELETE rather than the previous TRUNCATE setting.</p>

<p>The persistence of WAL mode means that applications can be converted
to using SQLite in WAL mode without making any changes to the application
itself.  One has merely to run "<tt>PRAGMA journal_mode=WAL;</tt>" on the
database file(s) using the [command-line shell] or other utility, then
restart the application.</p>

<p>The WAL journal mode will be set on all
connections to the same database file if it is set on any one connection.
</p>

<tcl>hd_fragment {readonly} {read-only WAL databases}</tcl>
<h1>Read-Only Databases</h1>

<p>No SQLite database (regardless of whether or not it is WAL mode) is
readable if it is located on read-only media and it requires recovery.
So, for example, if an application crashes and leaves an SQLite database
with a [hot journal], that database cannot be opened unless the opening
process has write privilege on the database file, the directory
containing the database file, and the hot journal.  This is because the 
@@ -354,15 +356,15 @@
 
 <p>Also, if multiple processes are to access a WAL mode database, then
 all processes should run under user or group IDs that give them write
 access to the database files, the WAL file, the shared memory 
 <tt>-shm</tt> file, and the containing directory.</p>
 
 <tcl>hd_fragment bigwal {large WAL files} {avoiding large WAL files}</tcl>
-<h2>Avoiding Excessively Large WAL Files</h2>
+<h1>Avoiding Excessively Large WAL Files</h1>
 
 <p>In normal cases, new content is appended to the WAL file until the
 WAL file accumulates about 1000 pages (and is thus about 4MB 
 in size) at which point a checkpoint is automatically run and the WAL file
 is recycled.  The checkpoint does not normally truncate the WAL file
 (unless the [journal_size_limit pragma] is set).  Instead, it merely
 causes SQLite to start overwriting the WAL file from the beginning.
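<p>A minimal sketch of the [journal_size_limit pragma] mentioned in the hunk
above (the 4MB figure is an arbitrary example):</p>

<blockquote><pre>
-- Ask SQLite to truncate the WAL file back to roughly 4MB
-- (4,194,304 bytes) after a checkpoint, instead of merely
-- overwriting it from the beginning.
PRAGMA journal_size_limit=4194304;
</pre></blockquote>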
@@ -428,15 +430,15 @@
 should be proportional in size to the transaction itself.  Pages that
 are changed by the transaction should only be written into the WAL file
 once.  However, with older versions of SQLite, the same page might be
 written into the WAL file multiple times if the transaction grows larger
 than the page cache.
 </ul>
 
-<h2>Implementation Of Shared-Memory For The WAL-Index</h2>
+<h1>Implementation Of Shared-Memory For The WAL-Index</h1>
 
 <p>The [wal-index] is implemented using an ordinary file that is
 mmapped for robustness.  Early (pre-release) implementations of WAL mode
 stored the wal-index in volatile shared-memory, such as files created in
 /dev/shm on Linux or /tmp on other unix systems.  The problem
 with that approach is that processes with a different root directory
 (changed via <a href="http://en.wikipedia.org/wiki/Chroot">chroot</a>)
@@ -461,15 +463,15 @@
 shared memory is unacceptable can devise alternative methods via a
 custom [VFS].  
 For example, if it is known that a particular database
 will only be accessed by threads within a single process, the wal-index
 can be implemented using heap memory instead of true shared memory.</p>
 
 <tcl>hd_fragment noshm {WAL without shared memory}</tcl>
-<h2>Use of WAL Without Shared-Memory</h2>
+<h1>Use of WAL Without Shared-Memory</h1>
 
 <p>Beginning in SQLite version 3.7.4, ^WAL databases can be created, read, and
 written even if shared memory is unavailable as long as the
 [locking_mode] is set to EXCLUSIVE before the first attempted access.
 In other words, a process can interact with
 a WAL database without using shared memory if that
 process is guaranteed to be the only process accessing the database.
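<p>A minimal sketch of the usage pattern described in the hunk above,
assuming only a single process will ever access the database:</p>

<blockquote><pre>
-- Must be set before the first access to the WAL database;
-- SQLite then keeps the wal-index in heap memory instead of
-- in a shared-memory -shm file.
PRAGMA locking_mode=EXCLUSIVE;
PRAGMA journal_mode=WAL;
</pre></blockquote>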
@@ -498,15 +500,15 @@
 ^As long as exactly one connection is using a shared-memory wal-index, 
 the locking mode can be changed freely between NORMAL and EXCLUSIVE.  
 ^It is only when the shared-memory wal-index is omitted, when the locking 
 mode is EXCLUSIVE prior to the first WAL-mode database access, that the 
 locking mode is stuck in EXCLUSIVE.</p>
 
 <tcl>hd_fragment bkwrds {WAL backwards compatibility}</tcl>
-<h2>Backwards Compatibility</h2>
+<h1>Backwards Compatibility</h1>
 
 <p>The database file format is unchanged for WAL mode.  However, the
 WAL file and the [wal-index] are new concepts and so older versions of 
 SQLite will not know
 how to recover a crashed SQLite database that was operating in WAL mode
 when the crash occurred.
 ^(To prevent older versions of SQLite (prior to version 3.7.0, 2010-07-22)
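<p>A minimal sketch of the downgrade path implied by the hunk above:
converting a database back to a rollback-journal mode so that older
versions of SQLite can read it:</p>

<blockquote><pre>
-- Convert back to a rollback journal mode; after this, the
-- database file can be read by SQLite versions before 3.7.0.
PRAGMA journal_mode=DELETE;
</pre></blockquote>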
Changes to pages/withoutrowid.in.
@@ -1,22 +1,22 @@
 <title>The WITHOUT ROWID Optimization</title>
 <tcl>
 hd_keywords {WITHOUT rowid} {WITHOUT ROWID}
 </tcl>
-<h1 align="center">The WITHOUT ROWID Optimization</h1>
+<fancy_format>
 
-<h2>1.0 Introduction</h2>
+<h1>Introduction</h1>
 
 <p>^By default, every row in SQLite has a special column, usually called the
 "[rowid]", that uniquely identifies that row within the table.  ^However
 if the phrase "WITHOUT ROWID" is added to the end of a [CREATE TABLE] statement,
 then the special "rowid" column is omitted.  There are sometimes
 space and performance advantages to omitting the rowid.</p>
 
-<h3>1.1 Syntax</h3>
+<h2>Syntax</h2>
 
 <p>^(To create a WITHOUT ROWID table, simply add the keywords "WITHOUT ROWID"
 to the end of the [CREATE TABLE] statement.  For example:</p>
 
 <blockquote><pre>
 CREATE TABLE IF NOT EXISTS wordcount(
   word TEXT PRIMARY KEY,
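<p>The example in the hunk above is truncated at the hunk boundary; a
complete sketch of the same idea follows (the second column is illustrative,
not taken from the hunk):</p>

<blockquote><pre>
CREATE TABLE IF NOT EXISTS wordcount(
  word TEXT PRIMARY KEY,
  cnt INTEGER          -- hypothetical second column for illustration
) WITHOUT ROWID;
</pre></blockquote>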
@@ -31,38 +31,38 @@
 <p>Every WITHOUT ROWID table must have a [PRIMARY KEY].  ^An error is raised
 if a CREATE TABLE statement with the WITHOUT ROWID clause lacks a PRIMARY KEY.
 
 <p>In most contexts, the special "rowid" column of normal tables can 
 also be called "oid" or "_rowid_".  ^However, only "rowid" works as 
 the keyword in the CREATE TABLE statement.</p>
 
-<h3>1.2 Compatibility</h3>
+<h2>Compatibility</h2>
 
 <p>SQLite [version 3.8.2] or later is necessary in order to use a WITHOUT
 ROWID table.  An attempt to open a database that contains one or more WITHOUT
 ROWID tables using an earlier version of SQLite will result in a
 "malformed database schema" error.</p>
 
-<h3>1.3 Quirks</h3>
+<h2>Quirks</h2>
 
 <p>WITHOUT ROWID is found only in SQLite and is not compatible
 with any other SQL database engine, as far as we know.
 In an elegant system, all tables would behave as WITHOUT ROWID
 tables even without the WITHOUT ROWID keyword.  However, when SQLite was
 first designed, it used only integer [rowid|rowids] for row keys 
 to simplify the implementation.
 This approach worked well for many years.  But as the demands on
 SQLite grew, the need for tables in which the PRIMARY KEY really did
 correspond to the underlying row key grew more acute.  The WITHOUT ROWID
 concept was added
 in order to meet that need without breaking backwards
 compatibility with the billions of SQLite databases already in use at
 the time (circa 2013).
 
-<h2>2.0 Differences From Ordinary Rowid Tables</h2>
+<h1>Differences From Ordinary Rowid Tables</h1>
 
 <p>The WITHOUT ROWID syntax is an optimization.  It provides no new
 capabilities.  Anything that can be done using a WITHOUT ROWID table
 can also be done in exactly the same way, and exactly the same syntax,
 using an ordinary rowid table.  The only advantage of a WITHOUT ROWID
 table is that it can sometimes use less disk space and/or perform a little
 faster than an ordinary rowid table.</p>
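<p>A short sketch of the PRIMARY KEY requirement stated at the top of the
hunk above (table and column names are hypothetical):</p>

<blockquote><pre>
-- Rejected: a WITHOUT ROWID table must declare a PRIMARY KEY.
CREATE TABLE t1(a TEXT, b INT) WITHOUT ROWID;

-- Accepted: the same table with an explicit PRIMARY KEY.
CREATE TABLE t1(a TEXT, b INT, PRIMARY KEY(a)) WITHOUT ROWID;
</pre></blockquote>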
@@ -131,15 +131,15 @@
 Hence, the update hook is not invoked when a WITHOUT ROWID table changes.
 <p>Note that since the [session] extension uses the update hook, that means
 that the session extension will not work correctly on a database that includes
 WITHOUT ROWID tables.
 </ol>
 
 <tcl>hd_fragment bene {benefits of using WITHOUT ROWID}</tcl>
-<h2>3.0 Benefits Of WITHOUT ROWID Tables</h2>
+<h1>Benefits Of WITHOUT ROWID Tables</h1>
 
 <p>A WITHOUT ROWID table is an optimization that can reduce storage and
 processing requirements.
 
 <p>In an ordinary SQLite table, the PRIMARY KEY is really just a 
 [UNIQUE] index.  The key used to look up records on disk
 is the [rowid].
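<p>A sketch of the point made at the end of the hunk above: on an ordinary
rowid table, a non-integer PRIMARY KEY behaves much like a separate UNIQUE
index on a rowid-keyed table (names hypothetical; constraint bookkeeping
aside):</p>

<blockquote><pre>
-- A declared non-integer PRIMARY KEY on an ordinary table...
CREATE TABLE t2(key TEXT PRIMARY KEY, val);

-- ...is stored much like an implicit rowid key plus a
-- separate UNIQUE index on the declared key column:
CREATE TABLE t3(key TEXT UNIQUE, val);
</pre></blockquote>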
@@ -206,15 +206,15 @@
 of disk space and can operate nearly twice as fast.  Of course, in a 
 real-world schema, there will typically be secondary indices and/or
 UNIQUE constraints, and the situation is more complicated.  But even then,
 there can often be space and performance advantages to using WITHOUT ROWID
 on tables that have non-integer or composite PRIMARY KEYs.
 
 <tcl>hd_fragment wtu {when to use WITHOUT ROWID}</tcl>
-<h2>4.0 When To Use WITHOUT ROWID</h2>
+<h1>When To Use WITHOUT ROWID</h1>
 
 <p>The WITHOUT ROWID optimization is likely to be helpful for tables
 that have non-integer or composite (multi-column) PRIMARY KEYs and that do
 not store large strings or BLOBs.</p>
 
 <p>WITHOUT ROWID tables will work correctly (that is to say, they
 provide the correct answer) for tables with a single INTEGER PRIMARY KEY. 
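<p>A sketch of the kind of schema the hunk above recommends for WITHOUT
ROWID: a composite, non-integer key with small rows (schema hypothetical):</p>

<blockquote><pre>
-- Good candidate: composite PRIMARY KEY, small rows.
CREATE TABLE ledger(
  acct   TEXT,
  day    INTEGER,
  amount REAL,
  PRIMARY KEY(acct, day)
) WITHOUT ROWID;
</pre></blockquote>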
Changes to search/buildsearchdb.tcl.
@@ -44,14 +44,15 @@
       set nosearch(fileio.html) 1
       set nosearch(btreemodule.html) 1
       set nosearch(capi3ref.html) 1
       set nosearch(changes.html) 1
       set nosearch(fileformat2.html) 1
       set nosearch(index.html) 1
       set nosearch(docs.html) 1
+      set nosearch(mingw.html) 1
 
       set weight(chronology.html) 25
 
       foreach f [glob *.html] { 
         if {[string match lang_* $f]==0 && [info exists nosearch($f)]==0} {
           lappend lFiles $f 
         }