Overview

Comment: Add <fancy_format> or <table_of_contents> markup to a few more
documents, to ensure that there are enough anchors in the longer documents
for the search script to use.
SHA1: 066e5931ce18a98e3b8c778adb807d8e
User & Date: dan 2016-08-25 12:18:04.821
Context

2016-08-25
17:29  Change the way sub-sections are selected by the search script.
       (check-in: c7628dcb37 user: dan tags: experimental)
12:18  Add <fancy_format> or <table_of_contents> markup to a few more
       documents, to ensure that there are enough anchors in the longer
       documents for the search script to use.
       (check-in: 066e5931ce user: dan tags: experimental)

2016-08-24
19:26  Further updates to search database and script.
       (check-in: cc51dec17e user: dan tags: experimental)
Changes
Changes to pages/atomiccommit.in.
<title>Atomic Commit In SQLite</title>
<tcl>hd_keywords {atomic commit} {*Atomic Commit}</tcl>
<table_of_contents>

<h1> Introduction</h1>

<p>An important feature of transactional databases like SQLite
is "atomic commit". Atomic commit means that either all database
changes within a single transaction occur or none of them occur.
With atomic commit, it is as if many different writes to different
sections of the database file occur instantaneously and simultaneously.
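To make the guarantee concrete, here is a minimal C sketch of a two-statement
transaction that commits atomically; the "demo.db" filename and the accounts
table are invented for illustration, and error handling is abbreviated:

#include <stdio.h>
#include <sqlite3.h>

int main(void){
  sqlite3 *db;
  char *zErr = 0;
  if( sqlite3_open("demo.db", &db)!=SQLITE_OK ) return 1;
  sqlite3_exec(db,
    "CREATE TABLE IF NOT EXISTS accounts(id INTEGER PRIMARY KEY, bal INTEGER);"
    "INSERT OR IGNORE INTO accounts VALUES(1,100),(2,0);", 0, 0, &zErr);
  /* Both UPDATEs below take effect together or not at all.  If power is
  ** lost before the COMMIT completes, the rollback mechanism described
  ** in this article restores both rows on the next open. */
  sqlite3_exec(db,
    "BEGIN;"
    "UPDATE accounts SET bal = bal - 10 WHERE id = 1;"
    "UPDATE accounts SET bal = bal + 10 WHERE id = 2;"
    "COMMIT;", 0, 0, &zErr);
  if( zErr ){ fprintf(stderr, "error: %s\n", zErr); sqlite3_free(zErr); }
  sqlite3_close(db);
  return 0;
}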
︙
using a [write-ahead log]. SQLite still supports atomic commit when
write-ahead logging is enabled, but it accomplishes atomic commit by a
different mechanism from the one described in this article. See the
[WAL | write-ahead log documentation] for additional information on how
SQLite supports atomic commit in that context.</p>

<tcl>hd_fragment hardware</tcl>
<h1> Hardware Assumptions</h1>

<p>Throughout this article, we will call the mass storage device "disk"
even though the mass storage device might really be flash memory.</p>

<p>We assume that disk is written in chunks which we call a "sector".
It is not possible to modify any part of the disk smaller than a sector.
To change a part of the disk smaller than a sector, you have to read in

︙
default in recent versions of SQLite. The assumption of the powersafe
overwrite property can be disabled at compile-time or at run-time if
desired. See the [PSOW | powersafe overwrite documentation] for further
details.

<a name="section_3_0"></a>
<h1> Single File Commit</h1>

<p>We begin with an overview of the steps SQLite takes in order to
perform an atomic commit of a transaction against a single database file.
The details of file formats used to guard against damage from power
failures and techniques for performing an atomic commit across multiple
databases are discussed in later sections.</p>

<tcl>hd_fragment initstate</tcl>
<h2> Initial State</h2>
<img src="images/ac/commit-0.gif" align="right" hspace="15">

<p>The state of the computer when a database connection is first opened
is shown conceptually by the diagram at the right. The area of the
diagram on the extreme right (labeled "Disk") represents

︙
the process that is using SQLite. The database connection has just been
opened and no information has been read yet, so the user space is
empty.</p>
<br clear="both">

<tcl>hd_fragment rdlck</tcl>
<h2> Acquiring A Read Lock</h2>
<img src="images/ac/commit-1.gif" align="right" hspace="15">

<p>Before SQLite can write to a database, it must first read the database
to see what is there already. Even if it is just appending new data,
SQLite still has to read in the database schema from the
<b>sqlite_master</b> table so that it can know

︙
operating system crashes or if there is a power loss. It is usually also
the case that the lock will vanish if the process that created the lock
exits.</p>
<br clear="both">

<a name="section_3_3"></a>
<h2> Reading Information Out Of The Database</h2>
<img src="images/ac/commit-2.gif" align="right" hspace="15">

<p>After the shared lock is acquired, we can begin reading information
from the database file. In this scenario, we are assuming a cold cache,
so information must first be read from mass storage into the operating
system cache then

︙
pages out of eight being read. In a typical application, a database will
have thousands of pages and a query will normally only touch a small
percentage of those pages.</p>
<br clear="both">

<tcl>hd_fragment rsvdlock</tcl>
<h2> Obtaining A Reserved Lock</h2>
<img src="images/ac/commit-3.gif" align="right" hspace="15">

<p>Before making changes to the database, SQLite first obtains a
"reserved" lock on the database file. A reserved lock is similar to a
shared lock in that both a reserved lock and shared lock allow other
processes to read from the database

︙
And because the modifications have not yet started, other processes can
continue to read from the database. However, no other process should
also begin trying to write to the database.</p>
<br clear="both">

<a name="section_3_5"></a>
<h2> Creating A Rollback Journal File</h2>
<img src="images/ac/commit-4.gif" align="right" hspace="15">

<p>Prior to making any changes to the database file, SQLite first creates
a separate rollback journal file and writes into the rollback journal the
original content of the database pages that are to be altered. The idea
behind the rollback journal is that it contains

︙
is possible when doing real disk I/O. We illustrate this idea in the
diagram to the right by showing that the new rollback journal appears in
the operating system disk cache only and not on the disk itself.</p>
<br clear="both">

<a name="section_3_6"></a>
<h2> Changing Database Pages In User Space</h2>
<img src="images/ac/commit-5.gif" align="right" hspace="15">

<p>After the original page content has been saved in the rollback
journal, the pages can be modified in user memory. Each database
connection has its own private copy of user space, so the changes that
are made in user space are only visible to the database connection that
is making the changes. Other database connections still see the
information in operating system disk cache buffers which have not yet
been changed. And so even though one process is busy modifying the
database, other processes can continue to read their own copies of the
original database content.</p>
<br clear="both">

<a name="section_3_7"></a>
<h2> Flushing The Rollback Journal File To Mass Storage</h2>
<img src="images/ac/commit-6.gif" align="right" hspace="15">

<p>The next step is to flush the content of the rollback journal file to
nonvolatile storage. As we will see later, this is a critical step in
ensuring that the database can survive an unexpected power loss.
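How aggressively SQLite performs these flushes is governed by the
synchronous pragma, which this article returns to later. A minimal sketch
in C, assuming an already-open sqlite3* handle named db:

#include <sqlite3.h>

/* FULL (the default) performs the journal and database flushes
** described in steps 3.7 and 3.10; NORMAL performs fewer flushes
** and leans on the journal checksums discussed later. */
static int set_full_sync(sqlite3 *db){
  return sqlite3_exec(db, "PRAGMA synchronous=FULL;", 0, 0, 0);
}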
︙
rollback journal is modified to show the number of pages in the rollback
journal. Then the header is flushed to disk. The details on why we do
this header modification and extra flush are provided in a later section
of this paper.</p>
<br clear="both">

<a name="section_3_8"></a>
<h2> Obtaining An Exclusive Lock</h2>
<img src="images/ac/commit-7.gif" align="right" hspace="15">

<p>Prior to making changes to the database file itself, we must obtain an
exclusive lock on the database file. Obtaining an exclusive lock is
really a two-step process. First SQLite obtains a "pending" lock. Then
it escalates the pending lock to an exclusive lock.</p>

︙
that cycle by allowing existing shared locks to proceed but blocking new
shared locks from being established. Eventually all shared locks will
clear and the pending lock will then be able to escalate into an
exclusive lock.</p>
<br clear="both">

<a name="section_3_9"></a>
<h2> Writing Changes To The Database File</h2>
<img src="images/ac/commit-8.gif" align="right" hspace="15">

<p>Once an exclusive lock is held, we know that no other processes are
reading from the database file and it is safe to write changes into the
database file. Usually those changes only go as far as the operating
system's disk cache and do not make it all the way to mass storage.</p>
<br clear="both">

<a name="section_3_10"></a>
<h2> Flushing Changes To Mass Storage</h2>
<img src="images/ac/commit-9.gif" align="right" hspace="15">

<p>Another flush must occur to make sure that all the database changes
are written into nonvolatile storage. This is a critical step to ensure
that the database will survive a power loss without damage. However,
because of the inherent slowness of writing to disk or flash memory, this
step together with the rollback journal file flush in section 3.7 above
takes up most of the time required to complete a transaction commit in
SQLite.</p>
<br clear="both">

<a name="section_3_11"></a>
<h2> Deleting The Rollback Journal</h2>
<img src="images/ac/commit-A.gif" align="right" hspace="15">

<p>After the database changes are all safely on the mass storage device,
the rollback journal file is deleted. This is the instant where the
transaction commits. If a power failure or system crash occurs prior to
this point, then recovery processes to be described later make

︙
part of the header is malformed the journal will not roll back. Hence,
one can say that the commit occurs as soon as the header is sufficiently
changed to make it invalid. Typically this happens as soon as the first
byte of the header is zeroed.</p>
<br clear="both">

<a name="section_3_12"></a>
<h2> Releasing The Lock</h2>
<img src="images/ac/commit-B.gif" align="right" hspace="15">

<p>The last step in the commit process is to release the exclusive lock
so that other processes can once again start accessing the database
file.</p>

<p>In the diagram at the right, we show that the information

︙
database by checking that counter. If the database was modified, then
the user space cache must be cleared and reread. But it is commonly the
case that no changes have been made and the user space cache can be
reused for a significant performance savings.</p>
<br clear="both">

<tcl>hd_fragment rollback</tcl>
<h1> Rollback</h1>

<p>An atomic commit is supposed to happen instantaneously. But the
processing described above clearly takes a finite amount of time.
Suppose the power to the computer were cut part way through the commit
operation described above. In order to maintain the illusion that the
changes were instantaneous, we have to "rollback" any partial changes
and restore the database to the state it was in prior to the beginning
of the transaction.</p>

<tcl>hd_fragment crisis</tcl>
<h2> When Something Goes Wrong...</h2>
<img src="images/ac/rollback-0.gif" align="right" hspace="15">

<p>Suppose the power loss occurred during
<a href="#section_3_10">step 3.10</a> above, while the database changes
were being written to disk. After power is restored, the situation might
be something like what is shown to the right. We were trying to change
three pages of the database file but only one page was successfully
written. Another page was partially written and a third page was not
written at all.</p>

<p>The rollback journal is complete and intact on disk when the power is
restored. This is a key point. The reason for the flush operation in
<a href="#section_3_7">step 3.7</a> is to make absolutely sure that all
of the rollback journal is safely on nonvolatile storage prior to making
any changes to the database file itself.</p>
<br clear="both">

<a name="section_4_2"></a>
<h2> Hot Rollback Journals</h2>
<img src="images/ac/rollback-1.gif" align="right" hspace="15">

<p>The first time that any SQLite process attempts to access the database
file, it obtains a shared lock as described in
<a href="#section_3_2">section 3.2</a> above. But then it notices that
there is a rollback journal file present. SQLite then checks to see if

︙
it aborted for some reason prior to the completion of the commit. A hot
journal means that the database file is in an inconsistent state and
needs to be repaired (by rollback) prior to being used.</p>
<br clear="both">

<tcl>hd_fragment exlock</tcl>
<h2> Obtaining An Exclusive Lock On The Database</h2>
<img src="images/ac/rollback-2.gif" align="right" hspace="15">

<p>The first step toward dealing with a hot journal is to obtain an
exclusive lock on the database file. This prevents two or more processes
from trying to rollback the same hot journal at the same time.</p>
<br clear="both">

<a name="section_4_4"></a>
<h2> Rolling Back Incomplete Changes</h2>
<img src="images/ac/rollback-3.gif" align="right" hspace="15">

<p>Once a process obtains an exclusive lock, it is permitted to write to
the database file. It then proceeds to read the original content of
pages out of the rollback journal and write that content back to where it
came from in the database file. Recall that the header of the rollback
journal records the original size of the database file prior to the start
of the aborted transaction. SQLite uses this information to truncate the
database file back to its original size in cases where the incomplete
transaction caused the database to grow. At the end of this step, the
database should be the same size and contain the same information as it
did before the start of the aborted transaction.</p>
<br clear="both">

<tcl>hd_fragment delhotjrnl</tcl>
<h2> Deleting The Hot Journal</h2>
<img src="images/ac/rollback-4.gif" align="right" hspace="15">

<p>After all information in the rollback journal has been played back
into the database file (and flushed to disk in case we encounter yet
another power failure), the hot rollback journal can be deleted.</p>

<p>As in <a href="#section_3_11">section 3.11</a>, the journal file might
be truncated to zero length or its header might be overwritten with zeros
as an optimization on systems where deleting a file is expensive. Either
way, the journal is no longer hot after this step.</p>
<br clear="both">

<tcl>hd_fragment cont</tcl>
<h2> Continue As If The Uncompleted Writes Had Never Happened</h2>
<img src="images/ac/rollback-5.gif" align="right" hspace="15">

<p>The final recovery step is to reduce the exclusive lock back to a
shared lock. Once this happens, the database is back in the state that
it would have been if the aborted transaction had never started. Since
all of this recovery activity happens completely automatically and
transparently, it appears to the program using SQLite as if the aborted
transaction had never begun.</p>
<br clear="both">

<tcl>hd_fragment multicommit</tcl>
<h1> Multi-file Commit</h1>

<p>SQLite allows a single <a href="c3ref/sqlite3.html">database
connection</a> to talk to two or more database files simultaneously
through the use of the <a href="lang_attach.html">ATTACH DATABASE</a>
command. When multiple database files are modified within a single
transaction, all files are updated atomically. In other words, either
all of the database files are updated or else none of them are.
Achieving an atomic commit across multiple database files is more
complex than doing so for a single file. This section describes how
SQLite works that bit of magic.</p>

<tcl>hd_fragment multijrnl</tcl>
<h2> Separate Rollback Journals For Each Database</h2>
<img src="images/ac/multi-0.gif" align="right" hspace="15">

<p>When multiple database files are involved in a transaction, each
database has its own rollback journal and each database is locked
separately. The diagram at the right shows a scenario where three
different database files have been modified within one transaction. The
situation at this step is analogous to

︙
operating system cache and information that is on disk. All of these
factors still apply in a multi-file commit scenario. They just take up a
lot of space in the diagrams and they do not add any new information, so
they are omitted here.</p>
<br clear="both">

<tcl>hd_fragment masterjrnl</tcl>
<h2> The Master Journal File</h2>
<img src="images/ac/multi-1.gif" align="right" hspace="15">

<p>The next step in a multi-file commit is the creation of a "master
journal" file. The name of the master journal file is the same name as
the original database filename (the database that was opened using the
<a href="c3ref/open.html">sqlite3_open()</a> interface,

︙
transactions are atomic across a power-loss. But if the database files
have other settings that compromise integrity across a power-loss event
(such as [PRAGMA synchronous=OFF] or [PRAGMA journal_mode=MEMORY]) then
the creation of the master journal is omitted, as an optimization.
<br clear="both">

<tcl>hd_fragment multijrnlupdate</tcl>
<h2> Updating Rollback Journal Headers</h2>
<img src="images/ac/multi-2.gif" align="right" hspace="15">

<p>The next step is to record the full pathname of the master journal
file in the header of every rollback journal. Space to hold the master
journal filename was reserved at the beginning of each rollback journal
as the rollback journals were created.</p>

<p>The content of each rollback journal is flushed to disk both before
and after the master journal filename is written into the rollback
journal header. It is important to do both of these flushes.
Fortunately, the second flush is usually inexpensive since typically only
a single page of the journal file (the first page) has changed.</p>

<p>This step is analogous to <a href="#section_3_7">step 3.7</a> in the
single-file commit scenario described above.</p>
<br clear="both">

<tcl>hd_fragment multidbupdate</tcl>
<h2> Updating The Database Files</h2>
<img src="images/ac/multi-3.gif" align="right" hspace="15">

<p>Once all rollback journal files have been flushed to disk, it is safe
to begin updating database files. We have to obtain an exclusive lock on
all database files before writing the changes. After all the changes are
written, it is important to flush the changes to disk so that they will
be preserved in the event of a power failure or operating system
crash.</p>

<p>This step corresponds to steps <a href="#section_3_8">3.8</a>,
<a href="#section_3_9">3.9</a>, and <a href="#section_3_10">3.10</a> in
the single-file commit scenario described previously.</p>
<br clear="both">

<a name="section_5_5"></a>
<h2> Delete The Master Journal File</h2>
<img src="images/ac/multi-4.gif" align="right" hspace="15">

<p>The next step is to delete the master journal file. This is the point
where the multi-file transaction commits. This step corresponds to
<a href="#section_3_11">step 3.11</a> in the single-file commit scenario
where the rollback journal is deleted.</p>

<p>If a power failure or operating system crash occurs at this point, the
transaction will not rollback when the system reboots even though there
are rollback journals present. The difference is the master journal
pathname in the header of the rollback journal. Upon restart, SQLite
only considers a journal to be hot and will only playback the journal if
there is no master journal filename in the header (which is the case for
a single-file commit) or if the master journal file still exists on
disk.</p>
<br clear="both">

<tcl>hd_fragment cleanup</tcl>
<h2> Clean Up The Rollback Journals</h2>
<img src="images/ac/multi-5.gif" align="right" hspace="15">

<p>The final step in a multi-file commit is to delete the individual
rollback journals and drop the exclusive locks on the database files so
that other processes can see the changes. This corresponds to
<a href="#section_3_12">step 3.12</a> in the single-file

︙
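For illustration, a multi-file transaction driven through the C API might
look like the sketch below; the filenames and the log tables are
hypothetical, and error handling is omitted:

#include <sqlite3.h>

/* db is assumed to be open on "main.db".  Both INSERTs commit
** atomically via the master journal mechanism described above. */
static int two_file_commit(sqlite3 *db){
  return sqlite3_exec(db,
    "ATTACH DATABASE 'aux.db' AS aux;"
    "BEGIN;"
    "INSERT INTO main.log VALUES('hello');"
    "INSERT INTO aux.log  VALUES('world');"
    "COMMIT;",
    0, 0, 0);
}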
this so that all rollback journals are deleted before any database files
are unlocked. As long as the rollback journal is deleted before its
corresponding database file is unlocked it does not matter in what order
the rollback journals are deleted or the database files are unlocked.</p>

<tcl>hd_fragment moredetail</tcl>
<h1> Additional Details Of The Commit Process</h1>

<p><a href="#section_3_0">Section 3.0</a> above provides an overview of
how atomic commit works in SQLite. But it glosses over a number of
important details. The following subsections will attempt to fill in the
gaps.</p>

<tcl>hd_fragment completesectors</tcl>
<h2> Always Journal Complete Sectors</h2>

<p>When the original content of a database page is written into the
rollback journal (as shown in <a href="#section_3_5">section 3.5</a>),
SQLite always writes a complete sector of data, even if the page size of
the database is smaller than the sector size. Historically, the sector
size in SQLite has been hard coded to 512 bytes and since the minimum
page size is also 512 bytes, this has never

︙
content of pages 1, 3, and 4 since the hardware must write the complete
sector. If this write operation is interrupted by a power outage, one or
more of the pages 1, 3, or 4 might be left with incorrect data. Hence,
to avoid lasting corruption to the database, the original content of all
of those pages must be contained in the rollback journal.</p>

<tcl>hd_fragment journalgarbage</tcl>
<h2> Dealing With Garbage Written Into Journal Files</h2>

<p>When data is appended to the end of the rollback journal, SQLite
normally makes the pessimistic assumption that the file is first extended
with invalid "garbage" data and that afterwards the correct data replaces
the garbage. In other words, SQLite assumes that the file size is
increased first and then afterwards the content is written into the
file. If a power failure occurs after the file

︙
<p>Note that the checksums in the rollback journal are not necessary if
the synchronous setting is FULL. We only depend on the checksums when
synchronous is lowered to NORMAL. Nevertheless, the checksums never hurt
and so they are included in the rollback journal regardless of the
synchronous setting.</p>

<tcl>hd_fragment cachespill</tcl>
<h2> Cache Spill Prior To Commit</h2>

<p>The commit process shown in <a href="#section_3_0">section 3.0</a>
assumes that all database changes fit in memory until it is time to
commit. This is the common case. But sometimes a larger change will
overflow the user-space cache prior to transaction commit. In those
cases, the cache must spill to the database before the transaction is
complete.</p>
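One way an application can make such spills less likely is to give the
pager a larger user-space cache. The cache_size pragma is the standard
knob; the value below is only illustrative:

#include <sqlite3.h>

/* Enlarge the page cache to roughly 10000 pages so that bigger
** transactions fit in memory until COMMIT.  The right value
** depends on the application's page size and memory budget. */
static int enlarge_cache(sqlite3 *db){
  return sqlite3_exec(db, "PRAGMA cache_size=10000;", 0, 0, 0);
}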
︙
escalate from reserved to exclusive. This reduces concurrency. A cache
spill also causes extra disk flush or fsync operations to occur and these
operations are slow, hence a cache spill can seriously reduce
performance. For these reasons a cache spill is avoided whenever
possible.</p>

<tcl>hd_fragment opts</tcl>
<h1> Optimizations</h1>

<p>Profiling indicates that for most systems and in most circumstances
SQLite spends most of its time doing disk I/O. It follows then that
anything we can do to reduce the amount of disk I/O will likely have a
large positive impact on the performance of SQLite. This section
describes some of the techniques used by SQLite to try to reduce the
amount of disk I/O to a minimum while still preserving atomic commit.</p>

<tcl>hd_fragment keepcache</tcl>
<h2> Cache Retained Between Transactions</h2>

<p><a href="#section_3_12">Step 3.12</a> of the commit process shows that
once the shared lock has been released, all user-space cache images of
database content must be discarded. This is done because without a
shared lock, other processes are free to modify the database file content
and so any user-space image of that content might become obsolete.
Consequently, each new transaction would begin by rereading

︙
which is incremented during every change operation. SQLite saves a copy
of this counter prior to releasing its database lock. Then after
acquiring the next database lock it compares the saved counter value
against the current counter value and erases the cache if the values are
different, or reuses the cache if they are the same.</p>

<a name="section_7_2"></a>
<h2> Exclusive Access Mode</h2>

<p>SQLite version 3.3.14 adds the concept of "Exclusive Access Mode".
In exclusive access mode, SQLite retains the exclusive database lock at
the conclusion of each transaction. This prevents other processes from
accessing the database, but in many deployments only a single process is
using a database so this is not a serious problem. The advantage of
exclusive access mode is that

︙
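Exclusive access mode is selected with the locking_mode pragma. A minimal
sketch, assuming an open handle db:

#include <sqlite3.h>

/* After the next transaction, the connection keeps its lock until
** locking_mode is set back to NORMAL or the connection is closed. */
static int enter_exclusive_mode(sqlite3 *db){
  return sqlite3_exec(db, "PRAGMA locking_mode=EXCLUSIVE;", 0, 0, 0);
}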
deleting the rollback journal file, does not depend on holding an
exclusive lock at all times. This optimization can be set independently
of exclusive lock mode using the [journal_mode pragma] as described in
<a href="#section_7_6">section 7.6</a> below.</p>

<tcl>hd_fragment freelistjrnl</tcl>
<h2> Do Not Journal Freelist Pages</h2>

<p>When information is deleted from an SQLite database, the pages used to
hold the deleted information are added to a "[freelist]". Subsequent
inserts will draw pages off of this freelist rather than expanding the
database file.</p>

<p>Some freelist pages contain critical data; specifically the locations

︙
Similarly, the content of a new freelist page is never written back into
the database at <a href="#section_3_9">step 3.9</a> nor read from the
database at <a href="#section_3_3">step 3.3</a>. These optimizations can
greatly reduce the amount of I/O that occurs when making changes to a
database file that contains free space.</p>

<tcl>hd_fragment atomicsector</tcl>
<h2> Single Page Updates And Atomic Sector Writes</h2>

<p>Beginning in SQLite version 3.5.0, the new Virtual File System (VFS)
interface contains a method named xDeviceCharacteristics which reports on
special properties that the underlying mass storage device might have.
Among the special properties that xDeviceCharacteristics might report is
the ability to do an atomic sector write.
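As a hedged sketch of the mechanism, a custom VFS's file object might
advertise this capability as follows; the SQLITE_IOCAP_* flags are the
real constants from sqlite3.h, while DemoFile is an invented wrapper type:

#include <sqlite3.h>

typedef struct DemoFile DemoFile;
struct DemoFile { sqlite3_file base; };  /* base must come first */

/* Installed as the xDeviceCharacteristics method of the VFS's
** sqlite3_io_methods; reports 512-byte atomic writes plus safe
** append semantics (see the next subsection). */
static int demoDeviceCharacteristics(sqlite3_file *pFile){
  (void)pFile;  /* capability is the same for every file */
  return SQLITE_IOCAP_ATOMIC512 | SQLITE_IOCAP_SAFE_APPEND;
}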
︙
only touches a single database page, then SQLite skips the whole
journaling and syncing process and simply writes the modified page
directly into the database file. The change counter in the first page of
the database file is modified separately since no harm is done if power
is lost before the change counter can be updated.</p>

<tcl>hd_fragment safeappend</tcl>
<h2> Filesystems With Safe Append Semantics</h2>

<p>Another optimization introduced in SQLite version 3.5.0 makes use of
"safe append" behavior of the underlying disk. Recall that SQLite
assumes that when data is appended to a file (specifically to the
rollback journal) that the size of the file is increased first and that
the content is written second. So if power is lost after the file size
is increased but before the

︙
occurs, we save a single flush operation and a sector write of the first
page of the journal file. Furthermore, when a cache spill occurs we no
longer need to append a new journal header to the end of the journal; we
can simply continue appending new pages to the end of the existing
journal.</p>

<a name="section_7_6"></a>
<h2> Persistent Rollback Journals</h2>

<p>Deleting a file is an expensive operation on many systems. So as an
optimization, SQLite can be configured to avoid the delete operation of
<a href="#section_3_11">section 3.11</a>. Instead of deleting the
journal file in order to commit a transaction, the file is either
truncated to zero bytes in length or its header is overwritten with
zeros. Truncating the file to zero

︙
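The persistent-journal optimization is selected with the journal_mode
pragma; PERSIST zero-fills the header instead of deleting the journal and
TRUNCATE shortens it to zero bytes. A minimal sketch:

#include <sqlite3.h>

/* Avoid the file-delete at commit on systems where unlink is slow.
** DELETE is the default mode; see the trade-offs discussed below. */
static int use_persistent_journal(sqlite3 *db){
  return sqlite3_exec(db, "PRAGMA journal_mode=PERSIST;", 0, 0, 0);
}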
in slower behavior than PERSIST. The commit operation is the same
speed. But subsequent transactions are slower following a TRUNCATE
because it is faster to overwrite existing content than to append to the
end of a file. New journal file entries will always be appended
following a TRUNCATE but will usually overwrite with PERSIST.</p>

<tcl>hd_fragment testing</tcl>
<h1> Testing Atomic Commit Behavior</h1>

<p>The developers of SQLite are confident that it is robust in the face
of power failures and system crashes because the automatic test
procedures do extensive checks on the ability of SQLite to recover from
simulated power loss. We call these the "crash tests".</p>

︙
using only code inspection and analysis techniques. From this
experience, the developers of SQLite feel confident that any other
database system that does not use a similar crash test system likely
contains undetected bugs that will lead to database corruption following
a system crash or power failure.</p>

<tcl>hd_fragment {sect_9_0} {Things That Can Go Wrong}</tcl>
<h1> Things That Can Go Wrong</h1>

<p>The atomic commit mechanism in SQLite has proven to be robust, but it
can be circumvented by a sufficiently creative adversary or a
sufficiently broken operating system implementation. This section
describes a few of the ways in which an SQLite database might be
corrupted by a power failure or system crash. (See also:
[how to corrupt | How To Corrupt Your Database Files].)</p>

<tcl>hd_fragment brokenlocks</tcl>
<h2> Broken Locking Implementations</h2>

<p>SQLite uses filesystem locks to make sure that only one process and
database connection is trying to modify the database at a time. The
filesystem locking mechanism is implemented in the VFS layer and is
different for every operating system. SQLite depends on this
implementation being correct. If something goes wrong and two or more
processes are able to write the same

︙
mechanisms do not exclude one another, so if one process is accessing a
file using (for example) AFP locking and another process (perhaps on a
different machine) is using dot-file locks, the two processes might
collide because AFP locks do not exclude dot-file locks or vice versa.</p>

<tcl>hd_fragment fsync</tcl>
<h2> Incomplete Disk Flushes</h2>

<p>SQLite uses the fsync() system call on Unix and the FlushFileBuffers()
system call on w32 in order to sync the file system buffers onto disk
oxide as shown in <a href="#section_3_7">step 3.7</a> and
<a href="#section_3_10">step 3.10</a>. Unfortunately, we have received
reports that neither of these interfaces works as advertised on many
systems. We hear that FlushFileBuffers() can be completely disabled

︙
<p>Setting fullfsync on a Mac will guarantee that data really does get
pushed out to the disk platter on a flush. But the implementation of
fullfsync involves resetting the disk controller. And so not only is it
profoundly slow, it also slows down other unrelated disk I/O. So its use
is not recommended.</p>

<tcl>hd_fragment filedel</tcl>
<h2> Partial File Deletions</h2>

<p>SQLite assumes that file deletion is an atomic operation from the
point of view of a user process. If power fails in the middle of a file
deletion, then after power is restored SQLite expects to see either the
entire file with all of its original data intact, or it expects not to
find the file at all. Transactions may not be atomic on systems that do
not work this way.</p>

<tcl>hd_fragment filegarbage</tcl>
<h2> Garbage Written Into Files</h2>

<p>SQLite database files are ordinary disk files that can be opened and
written by ordinary user processes. A rogue process can open an SQLite
database and fill it with corrupt data. Corrupt data might also be
introduced into an SQLite database by bugs in the operating system or
disk controller; especially bugs triggered by a power failure. There is
nothing SQLite can do to defend against these kinds of problems.</p>

<tcl>hd_fragment mvhotjrnl</tcl>
<h2> Deleting Or Renaming A Hot Journal</h2>

<p>If a crash or power loss does occur and a hot journal is left on the
disk, it is essential that the original database file and the hot journal
remain on disk with their original names until the database file is
opened by another SQLite process and rolled back. During recovery at
<a href="#section_4_2">step 4.2</a> SQLite locates the hot journal by
looking for a file in the same directory as the

︙
that SQLite can do to prevent it. If you are running on a system that is
vulnerable to this kind of filesystem namespace corruption (most modern
journalling filesystems are immune, we believe) then you might want to
consider putting each SQLite database file in its own private
subdirectory.</p>

<tcl>hd_fragment future</tcl>
<h1> Future Directions And Conclusion</h1>

<p>Every now and then someone discovers a new failure mode for the atomic
commit mechanism in SQLite and the developers have to put in a patch.
This is happening less and less and the failure modes are becoming more
and more obscure. But it would still be foolish to suppose that the
atomic commit logic of SQLite is entirely bug-free. The developers are
committed to fixing

︙
Changes to pages/cintro.in.
<title>An Introduction To The SQLite C/C++ Interface</title>
<table_of_contents>
<tcl>
set level(0) 0
set level(1) 0
proc HEADING {n name {tag {}}} {
  if {$tag!=""} {
    hd_fragment $tag
  }
  global level

︙
    }
  }
  incr n 1
  hd_puts "<h$n>$num $name</h$n>"
}
hd_keywords {*cintro}
</tcl>

<h1>Summary</h1>

<p>The following two objects and eight methods comprise the essential
elements of the SQLite interface:

<table border="0" cellpadding="0">
<tr>
<td valign="top"><b>[sqlite3]</b></td><td> </td>

︙
<td valign="top"><b>[sqlite3_exec()]</b></td><td> </td>
<td>A wrapper function that does [sqlite3_prepare()], [sqlite3_step()],
[sqlite3_column_int|sqlite3_column()], and [sqlite3_finalize()] for a
string of one or more SQL statements.</td>
</tr>
</table>

<h1>Introduction</h1>

<p>
SQLite currently has over 200 distinct APIs. This can be overwhelming to
a new programmer. Fortunately, most of the interfaces are very
specialized and need not be considered by beginners. The core API is
small, simple, and easy to learn.
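A minimal sketch of the two objects and the core methods working together
follows; sqlite3_prepare_v2() is the recommended variant of the prepare
interface, and the "demo.db" filename is invented for illustration:

#include <stdio.h>
#include <sqlite3.h>

int main(void){
  sqlite3 *db;            /* database connection */
  sqlite3_stmt *pStmt;    /* prepared statement */
  if( sqlite3_open("demo.db", &db)!=SQLITE_OK ) return 1;
  if( sqlite3_prepare_v2(db, "SELECT name FROM sqlite_master", -1,
                         &pStmt, 0)==SQLITE_OK ){
    while( sqlite3_step(pStmt)==SQLITE_ROW ){          /* one row at a time */
      printf("%s\n", (const char*)sqlite3_column_text(pStmt, 0));
    }
    sqlite3_finalize(pStmt);   /* destroy the statement */
  }
  sqlite3_close(db);           /* close the connection */
  return 0;
}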
︙
the reader understands the basic principles of operation for SQLite,
[capi3ref | that document] should be used as a reference guide. This
article is intended as an introduction only and is neither a complete nor
authoritative reference for the SQLite API.
</p>

<h1>Core Objects And Interfaces</h1>

<p>
The principal task of an SQL database engine is to evaluate statements
of SQL. To accomplish this, the developer needs two objects:
</p>

<p><ul>

︙
to [sqlite3_open()]. All [prepared statements] associated with the
connection should be [sqlite3_finalize | finalized] prior to closing the
connection.
</td>
</table>

<h1>Typical Usage Of Core Routines And Objects</h1>

<p>
An application will typically use [sqlite3_open()] to create a single
[database connection] during initialization. Note that [sqlite3_open()]
can be used to either open existing database files or to create and open
new database files.

︙
</ol></p>

<p>
The foregoing is all one really needs to know in order to use SQLite
effectively. All the rest is optimization and detail.
</p>

<h1>Convenience Wrappers Around Core Routines</h1>

<p>
The [sqlite3_exec()] interface is a convenience wrapper that carries out
all four of the above steps with a single function call. A callback
function passed into [sqlite3_exec()] is used to process each row of the
result set. The [sqlite3_get_table()] interface is another convenience
wrapper that does all four of the above steps. The [sqlite3_get_table()]
interface differs from [sqlite3_exec()] in that it stores the results of
queries in heap memory rather than invoking a callback.
</p>

<p>
It is important to realize that neither [sqlite3_exec()] nor
[sqlite3_get_table()] do anything that cannot be accomplished using the
core routines. In fact, these wrappers are implemented purely in terms
of the core routines.
</p>

<h1>Binding Parameters and Reusing Prepared Statements</h1>

<p>
In prior discussion, it was assumed that each SQL statement is prepared
once, evaluated, then destroyed. However, SQLite allows the same
[prepared statement] to be evaluated multiple times. This is
accomplished using the following routines:
</p>
︙
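The prepare-once, run-many pattern looks roughly like the sketch below;
the table t1 is hypothetical and error checking is omitted for brevity:

#include <sqlite3.h>

static void insert_many(sqlite3 *db, const int *aVal, int n){
  sqlite3_stmt *pStmt;
  int i;
  sqlite3_prepare_v2(db, "INSERT INTO t1(x) VALUES(?1)", -1, &pStmt, 0);
  for(i=0; i<n; i++){
    sqlite3_bind_int(pStmt, 1, aVal[i]);  /* bind parameter ?1 */
    sqlite3_step(pStmt);                  /* run the INSERT */
    sqlite3_reset(pStmt);                 /* make it runnable again */
  }
  sqlite3_finalize(pStmt);
}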
create all of the [prepared statements] they will ever need. Other
applications keep a cache of the most recently used [prepared statements]
and then reuse [prepared statements] out of the cache when available.
Another approach is to only reuse [prepared statements] when they are
inside of a loop.
</p>

<h1>Configuring SQLite</h1>

<p>
The default configuration for SQLite works great for most applications.
But sometimes developers want to tweak the setup to try to squeeze out a
little more performance, or take advantage of some obscure feature.
<p>
The [sqlite3_config()] interface is used to make global, process-wide

︙
application-defined mutex system.
</ul>

<p>
After process-wide configuration is complete and [database connections]
have been created, individual database connections can be configured
using calls to [sqlite3_limit()] and [sqlite3_db_config()].

<h1>Extending SQLite</h1>

<p>
SQLite includes interfaces that can be used to extend its functionality.
Such routines include:
</p>

<p><ul>

︙
[http://www.sqlite.org/src/doc/trunk/src/func.c | func.c] source files
for examples.
</p>

<p>
Shared libraries or DLLs can be used as [loadable extensions] to SQLite.

<h1>Other Interfaces</h1>

<p>
This article only mentions the most important and most commonly used
SQLite interfaces. The SQLite library includes many other APIs
implementing useful features that are not described here. A
[capi3ref_funclist | complete list of functions] that form the SQLite
application programming interface is found at the
[capi3ref | C/C++ Interface Specification]. Refer to that document for
complete and authoritative information about all SQLite interfaces.
</p>
Changes to pages/compile.in.
<title>Compilation Options For SQLite</title>
<tcl>hd_keywords {compile-time options}</tcl>
<table_of_contents>

<h1>Overview</h1>

<p>
For most purposes, SQLite can be built just fine using the default
compilation options. However, if required, the compile-time options
documented below can be used to
<a href="#omitfeatures">omit SQLite features</a> (resulting in a
[relfootprint | smaller compiled library size]) or to change the

︙
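At run-time, a program can ask which compile-time options its copy of
SQLite was built with; both interfaces below are part of the public API:

#include <stdio.h>
#include <sqlite3.h>

int main(void){
  int i;
  const char *zOpt;
  /* Option names are reported without the "SQLITE_" prefix */
  for(i=0; (zOpt = sqlite3_compileoption_get(i))!=0; i++){
    printf("SQLITE_%s\n", zOpt);
  }
  /* Query a single option; the prefix is optional here */
  printf("THREADSAFE used: %d\n",
         sqlite3_compileoption_used("SQLITE_THREADSAFE"));
  return 0;
}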
  hd_puts <p><b>$name</b></p>
  regsub -all "\n\\s*\n" $text "</p>\n\n<p>" text
  hd_resolve <blockquote><p>$text</p></blockquote>
}
</tcl>

<a name="osconfig"></a>
<h1> Platform Configuration</h1>

<tcl>
COMPILE_OPTION {_HAVE_SQLITE_CONFIG_H} {
  If the _HAVE_SQLITE_CONFIG_H macro is defined then the SQLite source
  code will attempt to #include a file named "config.h". The "config.h"
  file usually contains other configuration options, especially
  "HAVE_<i>INTERFACE</i>" type options generated by autoconf scripts.

︙
  If the HAVE_UTIME option is true, then the built-in but non-standard
  "unix-dotfile" VFS will use the utime() system call, instead of
  utimes(), to set the last access time on the lock file.
}
</tcl>

<a name="defaults"></a>
<h1> Options To Set Default Parameter Values</h1>

<tcl>
COMPILE_OPTION {SQLITE_DEFAULT_AUTOMATIC_INDEX=<i><0 or 1></i>} {
  This macro determines the initial setting for
  [PRAGMA automatic_index] for newly opened [database connections].
  For all versions of SQLite through 3.7.17,

︙
  Developers whose applications contain SQL statements that need more
  than 100 LALR(1) stack entries should seriously consider refactoring
  their SQL as it is likely to be well beyond the ability of any human to
  comprehend.
}
</tcl>

<h1> Options To Set Size Limits</h1>

<p>There are compile-time options that will set upper bounds on the
sizes of various structures in SQLite. The compile-time options normally
set a hard upper bound that can be changed at run-time on individual
[database connections] using the [sqlite3_limit()] interface.</p>
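For example, a compile-time maximum can be lowered (never raised) on a
per-connection basis at run-time; sqlite3_limit() and
SQLITE_LIMIT_SQL_LENGTH are real interfaces, and 100000 is only an
illustrative value:

#include <sqlite3.h>

/* Restrict this connection to SQL statements of at most
** 100000 bytes, regardless of the compile-time maximum. */
static void restrict_sql_length(sqlite3 *db){
  sqlite3_limit(db, SQLITE_LIMIT_SQL_LENGTH, 100000);
}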
︙
<li> [SQLITE_MAX_LIKE_PATTERN_LENGTH] </li>
<li> [SQLITE_MAX_PAGE_COUNT] </li>
<li> [SQLITE_MAX_SQL_LENGTH] </li>
<li> [SQLITE_MAX_VARIABLE_NUMBER] </li>
</ul>

<a name="controlfeatures"></a>
<h1> Options To Control Operating Characteristics</h1>

<tcl>
COMPILE_OPTION {SQLITE_4_BYTE_ALIGNED_MALLOC} {
  On most systems, the malloc() system call returns a buffer that is
  aligned to an 8-byte boundary. But on some systems (ex: windows)
  malloc() returns a 4-byte aligned pointer. This compile-time option
  must be used on systems that return 4-byte aligned pointers from
  malloc().

︙
572 573 574 575 576 577 578 | This option causes the [URI filename] process logic to be enabled by default. } </tcl> <a name="enablefeatures"></a> | | | 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 | This option causes the [URI filename] process logic to be enabled by default. } </tcl> <a name="enablefeatures"></a> <h1> Options To Enable Features Normally Turned Off</h1> <tcl> COMPILE_OPTION {SQLITE_ALLOW_URI_AUTHORITY} { [URI filenames] normally throws an error if the authority section is not either empty or "localhost". However, if SQLite is compiled with the SQLITE_ALLOW_URI_AUTHORITY compile-time option, then the URI is converted into a Uniform Naming Convention (UNC) filename and passed |
︙ | ︙ | |||
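<p>For reference, a sketch of how an application opens a database using a [URI filename] once URI processing is enabled (either by the SQLITE_USE_URI compile-time option or by passing SQLITE_OPEN_URI to [sqlite3_open_v2()]). The filename and query parameters here are illustrative:</p>
<blockquote><pre>
sqlite3 *db = 0;
int rc = sqlite3_open_v2("file:data.db?mode=ro&cache=shared",
                         &db, SQLITE_OPEN_READONLY | SQLITE_OPEN_URI, 0);
</pre></blockquote>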
948 949 950 951 952 953 954 | (determined at compile-time using the [YYSTACKDEPTH] options). This option can be used to help determine if an application is getting close to exceeding the maximum LALR(1) stack depth. } </tcl> <a name="disablefeatures"></a> | | | 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 | (determined at compile-time using the [YYSTACKDEPTH] options). This option can be used to help determine if an application is getting close to exceeding the maximum LALR(1) stack depth. } </tcl> <a name="disablefeatures"></a> <h1> Options To Disable Features Normally Turned On</h1> <tcl> COMPILE_OPTION {SQLITE_DISABLE_LFS} { If this C-preprocessor macro is defined, large file support is disabled. } |
︙ | ︙ | |||
982 983 984 985 986 987 988 | } </tcl> <tcl> hd_fragment "omitfeatures" hd_keywords "omitfeatures" </tcl> | | | 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 | } </tcl> <tcl> hd_fragment "omitfeatures" hd_keywords "omitfeatures" </tcl> <h1> Options To Omit Features</h1> <p>The following options can be used to [relfootprint | reduce the size of the compiled library] by omitting unused features. This is probably only useful in embedded systems where space is especially tight, as even with all features included the SQLite library is relatively small. Don't forget to tell your compiler to optimize for binary size! (the -Os option if |
︙ | ︙ | |||
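<p>A sketch of a size-reduction build using a few of the omit options under gcc. The specific options chosen are examples only, and note that many SQLITE_OMIT_* options are only supported when building from canonical sources rather than from the amalgamation:</p>
<blockquote><pre>
gcc -Os -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_PROGRESS_CALLBACK \
    -DSQLITE_OMIT_SHARED_CACHE -c sqlite3.c
</pre></blockquote>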
1456 1457 1458 1459 1460 1461 1462 | So the net effect of this compile-time option is that it allows SQLite to be compiled and linked against a system library that does not support malloc(), free(), and/or realloc(). } </tcl> <a name="debugoptions"></a> | | | 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 | So the net effect of this compile-time option is that it allows SQLite to be compiled and linked against a system library that does not support malloc(), free(), and/or realloc(). } </tcl> <a name="debugoptions"></a> <h1> Analysis and Debugging Options</h1> <tcl> COMPILE_OPTION {SQLITE_DEBUG} { The SQLite source code contains literally thousands of assert() statements used to verify internal assumptions and subroutine preconditions and postconditions. These assert() statements are normally turned off (they generate no code) since turning them on makes SQLite run approximately |
︙ | ︙ | |||
1484 1485 1486 1487 1488 1489 1490 | writing off the ends of a memory allocation, freeing memory not previously obtained from the memory allocator, or failing to initialize newly allocated memory. } </tcl> <a name="win32options"></a> | | | 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 | writing off the ends of a memory allocation, freeing memory not previously obtained from the memory allocator, or failing to initialize newly allocated memory. } </tcl> <a name="win32options"></a> <h1> Windows-Specific Options</h1> <tcl> COMPILE_OPTION {SQLITE_WIN32_HEAP_CREATE} { This option forces the Win32 native memory allocator, when enabled, to create a private heap to hold all memory allocations. } COMPILE_OPTION {SQLITE_WIN32_MALLOC_VALIDATE} { This option forces the Win32 native memory allocator, when enabled, to make strategic calls into the HeapValidate() function if assert() is also enabled. } </tcl> <a name="linkage"></a> <h1>Compiler Linkage Control</h1> <p>The following macros specify interface linkage for certain kinds of SQLite builds. The Makefiles will normally handle setting these macros automatically. Application developers should not need to worry about these macros. The following documentation about these macros is included for completeness.</p> |
︙ | ︙ |
Changes to pages/dbstat.in.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | <title>The DBSTAT Virtual Table</title> <tcl>hd_keywords dbstat {dbstat virtual table}</tcl> <fancy_format> <p> The DBSTAT virtual table is a read-only [eponymous virtual table] that returns information about which pages of the database files are used by which tables and indexes in the schema. The DBSTAT virtual table is used to implement the [sqlite3_analyzer.exe] utility program, and to help compute the [https://www.sqlite.org/src/repo-tabsize|table size pie-chart] in the [https://www.fossil-scm.org/|Fossil-implemented] version control system for SQLite. </p> | > > > < < < | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | <title>The DBSTAT Virtual Table</title> <tcl>hd_keywords dbstat {dbstat virtual table}</tcl> <fancy_format> <h1>Overview</h1> <p> The DBSTAT virtual table is a read-only [eponymous virtual table] that returns information about which pages of the database files are used by which tables and indexes in the schema. The DBSTAT virtual table is used to implement the [sqlite3_analyzer.exe] utility program, and to help compute the [https://www.sqlite.org/src/repo-tabsize|table size pie-chart] in the [https://www.fossil-scm.org/|Fossil-implemented] version control system for SQLite. </p> <p> ^The <b>dbstat</b> virtual table is available on all [database connections] when SQLite is built using the [SQLITE_ENABLE_DBSTAT_VTAB] compile-time option. ^The dbstat virtual table provides low-level information about btree and overflow pages in a database file. |
︙ | ︙ |
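<p>A short sketch of the kind of query the dbstat table supports: summing the bytes of storage used by each table and index, much as [sqlite3_analyzer.exe] does. This assumes an open connection "db" and an SQLite built with [SQLITE_ENABLE_DBSTAT_VTAB]:</p>
<blockquote><pre>
sqlite3_stmt *pStmt = 0;
int rc = sqlite3_prepare_v2(db,
    "SELECT name, sum(pgsize) FROM dbstat GROUP BY name ORDER BY 2 DESC",
    -1, &pStmt, 0);
if( rc==SQLITE_OK ){
  while( sqlite3_step(pStmt)==SQLITE_ROW ){
    printf("%-24s %10lld bytes\n",
           (const char*)sqlite3_column_text(pStmt, 0),
           (long long)sqlite3_column_int64(pStmt, 1));
  }
}
sqlite3_finalize(pStmt);
</pre></blockquote>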
Changes to pages/errlog.in.
1 2 3 | <title>The Error And Warning Log</title> <tcl>hd_keywords {errlog} {error log}</tcl> | | > | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | <title>The Error And Warning Log</title> <tcl>hd_keywords {errlog} {error log}</tcl> <table_of_contents> <h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> <p>SQLite can be configured to invoke a callback function, passing in an error code and a terse error message, whenever anomalies occur. This mechanism is very helpful in tracking obscure problems that occur rarely and in the field. Application developers are encouraged to take advantage of the error logging facility of SQLite in their products, as it has very low CPU and memory cost but can be a huge aid for debugging.</p> <h1>Setting Up The Error Logging Callback</h1> <p>There can only be a single error logging callback per process. The error logging callback is registered at start-time using C-code similar to the following: <blockquote><pre> [sqlite3_config]([SQLITE_CONFIG_LOG], errorLogCallback, pData); |
︙ | ︙ | |||
41 42 43 44 45 46 47 | <p>Do not misunderstand: There is nothing technically wrong with displaying the error logger messages to end users. The messages do not contain sensitive or private information that must be protected from unauthorized viewing. Rather the messages are technical in nature and are not useful or meaningful to the typical end user. The messages coming from the error logger are intended for database geeks. Display them accordingly.</p> | | | 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 | <p>Do not misunderstand: There is nothing technically wrong with displaying the error logger messages to end users. The messages do not contain sensitive or private information that must be protected from unauthorized viewing. Rather the messages are technical in nature and are not useful or meaningful to the typical end user. The messages coming from the error logger are intended for database geeks. Display them accordingly.</p> <h1>Interface Details</h1> <p>The third argument to the [sqlite3_config]([SQLITE_CONFIG_LOG],...) interface (the "pData" argument in the example above) is a pointer to arbitrary data. SQLite passes this pointer through to the first argument of the error logger callback. The pointer can be used to pass application-specific setup or state information, if desired. Or it can simply be a NULL pointer which is ignored by the callback.</p> |
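<p>Putting the pieces above together: a minimal sketch of an error logger that writes each message to standard error. The function name is arbitrary; only the signature matters. The callback must be threadsafe and must not call back into SQLite:</p>
<blockquote><pre>
#include <stdio.h>
#include "sqlite3.h"

static void errorLogCallback(void *pNotUsed, int iErrCode, const char *zMsg){
  fprintf(stderr, "sqlite(%d): %s\n", iErrCode, zMsg);
}

int main(void){
  sqlite3_config(SQLITE_CONFIG_LOG, errorLogCallback, (void*)0);
  /* ... open database connections and run the application as normal ... */
  return 0;
}
</pre></blockquote>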
︙ | ︙ | |||
71 72 73 74 75 76 77 | to try to allocate memory inside the error logger. Do not even think about trying to store the error message in another SQLite database.</p> <p>Applications can use the [sqlite3_log(E,F,..)] API to send new messages to the log, if desired, but this is discouraged. The [sqlite3_log()] interface is intended for use by extensions only, not by applications.</p> | | | 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 | to try to allocate memory inside the error logger. Do not even think about trying to store the error message in another SQLite database.</p> <p>Applications can use the [sqlite3_log(E,F,..)] API to send new messages to the log, if desired, but this is discouraged. The [sqlite3_log()] interface is intended for use by extensions only, not by applications.</p> <h1>Variety of Error Messages</h1> <p>The error messages that might be sent to the error logger and their exact format are subject to change from one release to the next. So applications should not depend on any particular error message text formats or error codes. Things do not change capriciously, but they do sometimes change.</p> |
︙ | ︙ | |||
142 143 144 145 146 147 148 | to the error logger when there really is something wrong. Applications might further cull the error message traffic by deliberately ignoring certain classes of error messages that they do not care about. For example, an application that makes frequent database schema changes might want to ignore all SQLITE_SCHEMA errors.</p> | | | 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 | to the error logger when there really is something wrong. Applications might further cull the error message traffic by deliberately ignoring certain classes of error messages that they do not care about. For example, an application that makes frequent database schema changes might want to ignore all SQLITE_SCHEMA errors.</p> <h1>Summary</h1> <p>The use of the error logger callback is highly recommended. The debugging information that the error logger provides has proven very useful in tracking down obscure problems that occur with applications after they get into the field. The error logger callback has also proven useful in catching occasional errors that the application misses because of inconsistent checking of API return codes. Developers are encouraged to implement an error logger callback early in the development cycle in order to spot unexpected behavior quickly, and to leave the error logger callback turned on through deployment. If the error logger never finds a problem, then no harm is done. But failure to set up an appropriate error logger might compromise diagnostic capabilities later on.</p> |
Changes to pages/fts5.in.
1 2 3 4 5 6 | <tcl>hd_keywords *fts5 FTS5</tcl> <title>SQLite FTS5 Extension</title> <table_of_contents> | < < | 1 2 3 4 5 6 7 8 9 10 11 12 13 | <tcl>hd_keywords *fts5 FTS5</tcl> <title>SQLite FTS5 Extension</title> <table_of_contents> <h1>Overview of FTS5</h1> <p>FTS5 is an SQLite [virtual table module] that provides <a href=http://en.wikipedia.org/wiki/Full_text_search>full-text search</a> functionality to database applications. In their most elementary form, full-text search engines allow the user to efficiently search a large collection of documents for the subset that contains one or more instances of a |
︙ | ︙ |
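<p>To make the overview concrete, a hedged sketch of elementary FTS5 usage from C. The table and column names are illustrative, error checking is omitted, and an open connection "db" plus an FTS5-enabled build are assumed:</p>
<blockquote><pre>
sqlite3_exec(db,
    "CREATE VIRTUAL TABLE email USING fts5(sender, title, body);",
    0, 0, 0);
sqlite3_exec(db,
    "INSERT INTO email VALUES('alice', 'status', 'full-text search demo');",
    0, 0, 0);
/* MATCH runs a full-text query against the index: */
sqlite3_exec(db,
    "SELECT rowid, title FROM email WHERE email MATCH 'search';",
    0, 0, 0);   /* a real application would supply a row callback */
</pre></blockquote>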
Changes to pages/howtocompile.in.
1 2 3 4 | <title>How To Compile SQLite</title> <tcl>hd_keywords {how to compile} {How To Compile SQLite}</tcl> | | > | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | <title>How To Compile SQLite</title> <tcl>hd_keywords {how to compile} {How To Compile SQLite}</tcl> <table_of_contents> <h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> <p> SQLite is ANSI-C source code. It must be compiled into machine code before it is useful. This article is a guide to the various ways of compiling SQLite. </p> <p>This article does not contain a step-by-step recipe for compiling SQLite. That would be difficult since each development situation is different. Rather, this article describes and illustrates the principles behind the compilation of SQLite. Typical compilation commands are provided as examples with the expectation that application developers can use these examples as guidance for developing their own custom compilation procedures. In other words, this article provides ideas and insights, not turnkey solutions.</p> <h1>Amalgamation Versus Individual Source Files</h1> <p>SQLite is built from over one hundred files of C code and script spread across multiple directories. The implementation of SQLite is pure ANSI-C, but many of the C-language source code files are either generated or transformed by auxiliary C programs and AWK, SED, and TCL scripts prior to being incorporated into the finished SQLite library. Building the necessary C programs and transforming and/or creating the |
︙ | ︙ | |||
54 55 56 57 58 59 60 | For those situations, it is recommended that a customized amalgamation be built (as described [building the amalgamation | below]) and used. In other words, even if a project requires building SQLite beginning with individual source files, it is still recommended that an amalgamation source file be used as an intermediate step.</p> <tcl>hd_fragment {cli} {compiling the CLI}</tcl> | | | 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 | For those situations, it is recommended that a customized amalgamation be built (as described [building the amalgamation | below]) and used. In other words, even if a project requires building SQLite beginning with individual source files, it is still recommended that an amalgamation source file be used as an intermediate step.</p> <tcl>hd_fragment {cli} {compiling the CLI}</tcl> <h1>Compiling The Command-Line Interface</h1> <p>A build of the [CLI | command-line interface] requires three source files:</p> <ul> <li><b>sqlite3.c</b>: The SQLite amalgamation source file <li><b>sqlite3.h</b>: The header files that accompanies sqlite3.c and |
︙ | ︙ | |||
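<p>A typical unix command line for the build just described, compiling the two C files together into the command-line shell; the library flags vary by platform and are shown only as a plausible example:</p>
<blockquote><pre>
gcc -O2 shell.c sqlite3.c -lpthread -ldl -o sqlite3
</pre></blockquote>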
135 136 137 138 139 140 141 | <p>The key point is this: Building the CLI consists of compiling together two C-language files. The <b>shell.c</b> file contains the definition of the entry point and the user input loop and the SQLite amalgamation <b>sqlite3.c</b> contains the complete implementation of the SQLite library.</p> <tcl>hd_fragment {tcl} {compiling the TCL interface}</tcl> | | | 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 | <p>The key point is this: Building the CLI consists of compiling together two C-language files. The <b>shell.c</b> file contains the definition of the entry point and the user input loop and the SQLite amalgamation <b>sqlite3.c</b> contains the complete implementation of the SQLite library.</p> <tcl>hd_fragment {tcl} {compiling the TCL interface}</tcl> <h1>Compiling The TCL Interface</h1> <p>The TCL interface for SQLite is a small module that is added into the regular amalgamation. The result is a new amalgamated source file called "<b>tclsqlite3.c</b>". This single source file is all that is needed to generate a shared library that can be loaded into a standard [http://wiki.tcl-lang.org/2541 | tclsh] or |
︙ | ︙ | |||
175 176 177 178 179 180 181 | SQLite includes a <b>main()</b> procedure that initializes a TCL interpreter and enters a command-line loop when it is compiled with -DTCLSH=1. The command above works on both Linux and Mac OS X, though one may need to adjust the library options depending on the platform and which version of TCL one is linking against.</p> <tcl>hd_fragment {amal} {building the amalgamation}</tcl> | | | 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 | SQLite includes a <b>main()</b> procedure that initializes a TCL interpreter and enters a command-line loop when it is compiled with -DTCLSH=1. The command above works on both Linux and Mac OS X, though one may need to adjust the library options depending on the platform and which version of TCL one is linking against.</p> <tcl>hd_fragment {amal} {building the amalgamation}</tcl> <h1>Building The Amalgamation</h1> <p>The versions of the SQLite amalgamation that are supplied on the [download page] are normally adequate for most users. However, some projects may want or need to build their own amalgamations. A common reason for building a custom amalgamation is in order to use certain [compile-time options] to customize the SQLite library. Recall that the SQLite amalgamation contains a lot of C-code that is generated by |
︙ | ︙ | |||
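<p>On unix-like systems, building a custom amalgamation from the canonical sources generally reduces to the following sketch; the tarball and directory names are illustrative:</p>
<blockquote><pre>
tar xzf sqlite.tar.gz        # unpack the canonical sources
cd sqlite
./configure                  # creates the Makefile
make sqlite3.c               # generates sqlite3.c and sqlite3.h
</pre></blockquote>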
244 245 246 247 248 249 250 | "<b>sqlite3.c</b>" amalgamation source file, its header file "<b>sqlite3.h</b>", and the "<b>tclsqlite3.c</b>" amalgamation source file that includes the TCL interface. Afterwards, the needed files can be copied into project directories and compiled according to the procedures outlined above.</p> <tcl>hd_fragment {dll} {building a DLL}</tcl> | | | 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 | "<b>sqlite3.c</b>" amalgamation source file, its header file "<b>sqlite3.h</b>", and the "<b>tclsqlite3.c</b>" amalgamation source file that includes the TCL interface. Afterwards, the needed files can be copied into project directories and compiled according to the procedures outlined above.</p> <tcl>hd_fragment {dll} {building a DLL}</tcl> <h1>Building A Windows DLL</h1> <p>To build a DLL of SQLite for use in Windows, first acquire the appropriate amalgamated source code files, sqlite3.c and sqlite3.h. These can either be downloaded from the [http://www.sqlite.org/download.html | SQLite website] or custom generated from sources as shown above.</p> |
︙ | ︙ |
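<p>With the amalgamation in hand, an MSVC command of roughly the following form is believed sufficient to produce the DLL; exact options depend on the compiler version and the desired configuration:</p>
<blockquote><pre>
cl sqlite3.c -link -dll -out:sqlite3.dll
</pre></blockquote>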
Changes to pages/howtocorrupt.in.
1 2 3 | <title>How To Corrupt An SQLite Database File</title> <tcl>hd_keywords {how to corrupt}</tcl> | | > | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | <title>How To Corrupt An SQLite Database File</title> <tcl>hd_keywords {how to corrupt}</tcl> <table_of_contents> <h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> <p>An SQLite database is highly resistant to corruption. If an application crash, or an operating-system crash, or even a power failure occurs in the middle of a transaction, the partially written transaction should be automatically rolled back the next time the database file is accessed. The recovery process is fully automatic and does not require any action on the part of the user or the application. </p> <p>Though SQLite is resistant to database corruption, it is not immune. This document describes the various ways that an SQLite database might go corrupt.</p> <h1> File overwrite by a rogue thread or process</h1> <p>SQLite database files are ordinary disk files. That means that any process can open the file and overwrite it with garbage. There is nothing that the SQLite library can do to defend against this.</p> <h2> Continuing to use a file descriptor after it has been closed</h2> <p>We have seen multiple cases where a file descriptor was open on a file, then that file descriptor was closed and reopened on an SQLite database. Later, some other thread continued to write into the old file descriptor, not realizing that the original file had been closed already. But because the file descriptor had been reopened by SQLite, the information that was intended to go into the original file ended up |
︙ | ︙ | |||
48 49 50 51 52 53 54 | for database files. (See [SQLITE_MINIMUM_FILE_DESCRIPTOR] for additional information.)</p> <p>Another example of corruption caused by using a closed file descriptor was [https://code.facebook.com/posts/313033472212144/debugging-file-corruption-on-ios/|reported by facebook engineers] in a blog post on 2014-08-12.</p> | | | | | | | 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 | for database files. (See [SQLITE_MINIMUM_FILE_DESCRIPTOR] for additional information.)</p> <p>Another example of corruption caused by using a closed file descriptor was [https://code.facebook.com/posts/313033472212144/debugging-file-corruption-on-ios/|reported by facebook engineers] in a blog post on 2014-08-12.</p> <h2> Backup or restore while a transaction is active</h2> <p>Systems that run automatic backups in the background might try to make a backup copy of an SQLite database file while it is in the middle of a transaction. The backup copy then might contain some old and some new content, and thus be corrupt.</p> <p>The best approach to make reliable backup copies of an SQLite database is to make use of the [backup API] that is part of the SQLite library. Failing that, it is safe to make a copy of an SQLite database file as long as there are no transactions in progress by any process. If the previous transaction failed, then it is important that any rollback journal (the <tt>*-journal</tt> file) or write-ahead log (the <tt>*-wal</tt> file) be copied together with the database file itself.</p> <tcl>hd_fragment delhotjrnl {deleting a hot journal}</tcl> <h2> Deleting a hot journal</h2> <p>SQLite normally stores all content in a single disk file. However, while performing a transaction, information necessary to roll back that transaction following a crash or power failure is stored in auxiliary journal files. These journal files have the same name as the original database file with the addition of <tt>-journal</tt> or <tt>-wal</tt> suffix.</p> <p>SQLite must see the journal files in order to recover from a crash or power failure. If the journal files are moved, deleted, or renamed after a crash or power failure, then automatic recovery will not work and the database may go corrupt.</p> <p>Another manifestation of this problem is [database corruption caused by inconsistent use of 8+3 filenames].</p> <h1> File locking problems</h1> <p>SQLite uses file locks on the database file, and on the [write-ahead log] or [WAL] file, to coordinate access between concurrent processes. Without coordination, two threads or processes might try to make incompatible changes to a database file at the same time, resulting in database corruption.</p> <h2> Filesystems with broken or missing lock implementations</h2> <p>SQLite depends on the underlying filesystem to do locking as the documentation says it will. But some filesystems contain bugs in their locking logic such that the locks do not always behave as advertised. This is especially true of network filesystems and NFS in particular. 
If SQLite is used on a filesystem where the locking primitives contain bugs, and if two or more threads or processes try to access the same database at the same time, then database corruption might result.</p> <tcl>hd_fragment posix_close_bug</tcl> <h2> Posix advisory locks canceled by a separate thread doing close()</h2> <p>The default locking mechanism used by SQLite on unix platforms is POSIX advisory locking. Unfortunately, POSIX advisory locking has design quirks that make it prone to misuse and failure. In particular, any thread in the same process with a file descriptor that is holding a POSIX advisory lock can override that lock using a different file descriptor. One particularly pernicious problem is that the <tt>close()</tt> system |
︙ | ︙ | |||
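<p>A sketch of the [backup API] usage recommended above for making a reliable copy of a live database. Here "pSrc" and "pDest" are assumed to be open connections to the source and destination databases:</p>
<blockquote><pre>
sqlite3_backup *pBackup = sqlite3_backup_init(pDest, "main", pSrc, "main");
if( pBackup ){
  sqlite3_backup_step(pBackup, -1);   /* copy the entire database */
  sqlite3_backup_finish(pBackup);
}
int rc = sqlite3_errcode(pDest);      /* SQLITE_OK on success */
</pre></blockquote>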
134 135 136 137 138 139 140 | <p>Note that it is perfectly safe for two or more threads to access the same SQLite database file using the SQLite library. The unix drivers for SQLite know about the POSIX advisory locking quirks and work around them. This problem only arises when a thread tries to bypass the SQLite library and read the database file directly.</p> | | | 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 | <p>Note that it is perfectly safe for two or more threads to access the same SQLite database file using the SQLite library. The unix drivers for SQLite know about the POSIX advisory locking quirks and work around them. This problem only arises when a thread tries to bypass the SQLite library and read the database file directly.</p> <h3>Multiple copies of SQLite linked into the same application</h3> <p>As pointed out in the previous paragraph, SQLite takes steps to work around the quirks of POSIX advisory locking. Part of that work-around involves keeping a global list (mutex protected) of open SQLite database files. But, if multiple copies of SQLite are linked into the same application, then there will be multiple instances of this global list. Database connections opened using one copy of the SQLite library |
︙ | ︙ | |||
158 159 160 161 162 163 164 | with exactly this bug. The vendor came to the SQLite developers seeking help in tracking down some infrequent database corruption issues they were seeing on Linux and Mac. The problem was eventually traced to the fact that the application was linking against two separate copies of SQLite. The solution was to change the application build procedures to link against just one copy of SQLite instead of two.</p> | | | 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 | with exactly this bug. The vendor came to the SQLite developers seeking help in tracking down some infrequent database corruption issues they were seeing on Linux and Mac. The problem was eventually traced to the fact that the application was linking against two separate copies of SQLite. The solution was to change the application build procedures to link against just one copy of SQLite instead of two.</p> <h2> Two processes using different locking protocols</h2> <p>The default locking mechanism used by SQLite on unix platforms is POSIX advisory locking, but there are other options. By selecting an alternative [sqlite3_vfs] using the [sqlite3_open_v2()] interface, an application can make use of other locking protocols that might be more appropriate to certain filesystems. For example, dot-file locking might be selected for use in an application that has to run on an NFS filesystem that does not support POSIX advisory locking.</p> <p>It is important that all connections to the same database file use the same locking protocol. If one application is using POSIX advisory locks and another application is using dot-file locking, then the two applications will not see each other's locks and will not be able to coordinate database access, possibly leading to database corruption.</p> <tcl>hd_fragment unlink {unlink corruption} {unlinked database files}</tcl> <h2> Unlinking or renaming a database file while in use</h2> <p>If two processes have open connections to the same database file and one process closes its connection, unlinks the file, then creates a new database file in its place with the same name and reopens the new file, then the two processes will be talking to different database files with the same name. (Note that this is only possible on Posix and Posix-like systems that permit a file to be unlinked while it is still open for |
︙ | ︙ | |||
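<p>A sketch of selecting an alternative locking protocol via the fourth argument of [sqlite3_open_v2()], using the built-in "unix-dotfile" VFS as the example. Every process that opens this database must choose the same VFS, for the reasons given above:</p>
<blockquote><pre>
sqlite3 *db = 0;
int rc = sqlite3_open_v2("data.db", &db,
             SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, "unix-dotfile");
</pre></blockquote>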
200 201 202 203 204 205 206 | results in behavior that is undefined and probably undesirable.</p> <p>Beginning with SQLite [version 3.7.17], the unix OS interface will send SQLITE_WARNING messages to the [error log] if a database file is unlinked while it is still in use.</p> <tcl>hd_fragment alias {database filename aliasing}</tcl> | | | 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 | results in behavior that is undefined and probably undesirable.</p> <p>Beginning with SQLite [version 3.7.17], the unix OS interface will send SQLITE_WARNING messages to the [error log] if a database file is unlinked while it is still in use.</p> <tcl>hd_fragment alias {database filename aliasing}</tcl> <h2> Multiple links to the same file</h2> <p>If a single database file has multiple links (either hard or soft links) then that is just another way of saying that the file has multiple names. If two or more processes open the database using different names, then they will use different rollback journals and WAL files. That means that if one process crashes, the other process will be unable to recover the transaction in progress because it will be looking in the wrong place |
︙ | ︙ | |||
223 224 225 226 227 228 229 | <p>Beginning with SQLite [version 3.10.0], the unix OS interface will attempt to resolve symbolic links and open the database file by its canonical name. Prior to version 3.10.0, opening a database file through a symbolic link was similar to opening a database file that had multiple hard links and resulted in undefined behavior.</p> | | | | 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 | <p>Beginning with SQLite [version 3.10.0], the unix OS interface will attempt to resolve symbolic links and open the database file by its canonical name. Prior to version 3.10.0, opening a database file through a symbolic link was similar to opening a database file that had multiple hard links and resulted in undefined behavior.</p> <h1> Failure to sync</h1> <p>In order to guarantee that database files are always consistent, SQLite will occasionally ask the operating system to flush all pending writes to persistent storage then wait for that flush to complete. This is accomplished using the <tt>fsync()</tt> system call under unix and <tt>FlushFileBuffers()</tt> under Windows. We call this flush of pending writes a "sync".</p> <p>Actually, if one is only concerned with atomic and consistent writes and is willing to forego durable writes, the sync operation does not need to wait until the content is completely stored on persistent media. Instead, the sync operation can be thought of as an I/O barrier. As long as all writes that occur before the sync are completed before any write that happens after the sync, no database corruption will occur. If sync is operating as an I/O barrier and not as a true sync, then a power failure or system crash might cause one or more previously committed transactions to roll back (in violation of the "durable" property of "ACID") but the database will at least continue to be consistent, and that is what most people care about.</p> <h2> Disk drives that do not honor sync requests</h2> <p>Unfortunately, most consumer-grade mass storage devices lie about syncing. Disk drives will report that content is safely on persistent media as soon as it reaches the track buffer and before actually being written to oxide. This makes the disk drives seem to operate faster (which is vitally important to the manufacturer so that they can show good benchmark numbers in trade magazines). And in fairness, the lie |
︙ | ︙ | |||
278 279 280 281 282 283 284 | out-of-order writes than in the default rollback journal modes. In WAL mode, the only time that a failed sync operation can cause database corruption is during a [checkpoint] operation. A sync failure during a COMMIT might result in loss of durability but not in a corrupt database file. Hence, one line of defense against database corruption due to failed sync operations is to use SQLite in WAL mode and to checkpoint as infrequently as possible.</p> | | | 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 | out-of-order writes than in the default rollback journal modes. In WAL mode, the only time that a failed sync operation can cause database corruption is during a [checkpoint] operation. A sync failure during a COMMIT might result in loss of durability but not in a corrupt database file. Hence, one line of defense against database corruption due to failed sync operations is to use SQLite in WAL mode and to checkpoint as infrequently as possible.</p> <h2> Disabling sync using PRAGMAs</h2> <p>The sync operations that SQLite performs to help ensure integrity can be disabled at run-time using the [synchronous pragma]. By setting PRAGMA synchronous=OFF, all sync operations are omitted. This makes SQLite seem to run faster, but it also allows the operating system to freely reorder writes, which could result in database corruption if a power failure or hard reset occurs prior to all content reaching persistent storage.</p> <p>For maximum reliability and for robustness against database corruption, SQLite should always be run with its default synchronous setting of FULL.</p> <tcl>hd_fragment hardwarefault</tcl> <h1> Disk Drive and Flash Memory Failures</h1> <p>An SQLite database can become corrupt if the file content changes due to a disk drive or flash memory failure. It is very rare, but disks will occasionally flip a bit in the middle of a sector.</p> <h2> Non-powersafe flash memory controllers</h2> <p>We are told that in some flash memory controllers the wear-leveling logic can cause random filesystem damage if power is interrupted during a write. This can manifest, for example, as random changes in the middle of a file that was not even open at the time of the power loss. So, for example, a device would be writing content into an MP3 file in flash memory when a power loss occurs, and that could result in an SQLite database being corrupted even though the database was not even in use at the time of the power loss.</p> <tcl>hd_fragment fakeusb</tcl> <h2> Fake capacity USB sticks</h2> <p>There are many fraudulent USB sticks in circulation that claim to have a high capacity (ex: 8GB) but are really only capable of storing a much smaller amount (ex: 1GB). Attempts to write on these devices will often result in unrelated files being overwritten. Any use of a fraudulent flash memory device can therefore easily lead to database corruption. Internet searches such as "fake capacity usb" will turn up lots of disturbing information about this problem.
<h1> Memory corruption</h1> <p>SQLite is a C-library that runs in the same address space as the application that it serves. That means that stray pointers, buffer overruns, heap corruption, or other malfunctions in the application can corrupt internal SQLite data structures and ultimately result in a corrupt database file. Normally these kinds of problems manifest themselves as segfaults prior to any database corruption occurring, but there have been instances where application code errors have caused SQLite to malfunction subtly so as to corrupt the database file rather than panicking.</p> <p>The memory corruption problem becomes more acute when using [memory-mapped I/O]. When all or part of the database file is mapped into the application's address space, then a stray pointer that overwrites any part of that mapped space will immediately corrupt the database file, without requiring the application to do a subsequent write() system call.</p> <h1> Other operating system problems</h1> <p>Sometimes operating systems will exhibit non-standard behavior which can lead to problems. Sometimes this non-standard behavior is deliberate, and sometimes it is a mistake in the implementation. But in any event, if the operating system performs differently from the way SQLite expects it to perform, the possibility of database corruption exists.</p> <h2> Linux Threads</h2> <p>Some older versions of Linux used the LinuxThreads library for thread support. LinuxThreads is similar to Pthreads, but is subtly different with respect to handling of POSIX advisory locks. SQLite versions 2.2.3 through 3.6.23 recognized that LinuxThreads were being used at runtime and took appropriate action to work around the non-standard behavior of LinuxThreads. But most modern Linux implementations make use of the newer, and correct, NPTL implementation of Pthreads. Beginning with SQLite version 3.7.0, the use of NPTL is assumed. No checks are made. Hence, recent versions of SQLite will subtly malfunction and may corrupt database files if used in multi-threaded applications that run on older Linux systems that make use of LinuxThreads.</p> <h2> Failures of mmap() on QNX</h2> <p>There exists a subtle problem with mmap() on QNX such that making a second mmap() call against a single file descriptor can cause the memory obtained from the first mmap() call to be zeroed. SQLite on unix uses mmap() to create a shared memory region for transaction coordination in [WAL | WAL mode], and it will call mmap() multiple times for large transactions. The QNX mmap() has been demonstrated to corrupt database files under that scenario. QNX engineers are aware of this problem and are working on a solution; the problem may have already been fixed by the time you read this.</p> <p>When running on QNX, it is recommended that [memory-mapped I/O] never be used. Furthermore, to use [WAL mode], it is recommended that applications employ the [locking_mode | exclusive locking mode] in order to use [WAL without shared memory]. <tcl>hd_fragment fscorruption {filesystem corruption}</tcl> <h2> Filesystem Corruption</h2> <p>Since SQLite databases are ordinary disk files, any malfunction in the filesystem can corrupt the database. Filesystems in modern operating systems are very reliable, but errors do still occur. For example, on 2013-10-01 the SQLite database that holds the <a href="http://wiki.tcl-lang.org/">Wiki for Tcl/Tk</a> went corrupt a few days after the host computer was moved to a dodgy build of the (linux) kernel that had issues in the filesystem layer.
In that event, the filesystem eventually became so badly corrupted that the machine was unusable, but the earliest symptom of trouble was the corrupted SQLite database.</p> <h1> Bugs in SQLite</h1> <p>SQLite is [testing | very carefully tested] to help ensure that it is as bug-free as possible. Among the many tests that are carried out for every SQLite version are tests that simulate power failures, I/O errors, and out-of-memory (OOM) errors and verify that no database corruption occurs during any of these events. SQLite is also field-proven with approximately two billion active deployments with no serious problems.</p> |
︙ | ︙ | |||
414 415 416 417 418 419 420 | of all database-corruption bugs found in SQLite during the four-year period from 2009-04-01 to 2013-04-15. This account should give the reader an intuitive sense of the kinds of bugs in SQLite that manage to slip through testing procedures and make it into a release.</p> | | | 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 | of all database-corruption bugs found in SQLite during the four-year period from 2009-04-01 to 2013-04-15. This account should give the reader an intuitive sense of the kinds of bugs in SQLite that manage to slip through testing procedures and make it into a release.</p> <h2> False corruption reports due to database shrinkage</h2> <p>If a database is written by SQLite version 3.7.0 or later and then written again by SQLite version 3.6.23 or earlier in such a way as to make the size of the database file decrease, then the next time that SQLite version 3.7.0 accesses the database file, it might report that the database file is corrupt. The database file is not really corrupt, however. Version 3.7.0 was simply being overly zealous in its corruption detection.</p> <p>The problem was fixed on 2011-02-20. The fix first appears in SQLite version 3.7.6.</p> <h2> Corruption following switches between rollback and WAL modes</h2> <p>Repeatedly switching an SQLite database in and out of [WAL | WAL mode] and running the [VACUUM] command in between switches, in one process or thread, can cause another process or thread that has the database file open to miss the fact that the database has changed. That second process or thread might then try to modify the database using a stale cache and cause database corruption.</p> <p>This problem was discovered during internal testing and has never been observed in the wild. The problem was fixed on 2011-01-27 and in version 3.7.5.</p> <h2> I/O while obtaining a lock leads to corruption</h2> <p>If the operating system returns an I/O error while attempting to obtain a certain lock on shared memory in [WAL | WAL mode] then SQLite might fail to reset its cache, which could lead to database corruption if subsequent writes are attempted.</p> <p>Note that this problem only occurs if the attempt to acquire the lock resulted in an I/O error. If the lock is simply not granted (because some other thread or process is already holding a conflicting lock) then no corruption will ever occur. We are not aware of any operating systems that will fail with an I/O error while attempting to get a file lock on shared memory. So this is a theoretical problem rather than a real problem. Needless to say, this problem has never been observed in the wild. The problem was discovered while doing stress testing of SQLite in a test harness that simulates I/O errors.</p> <p>This problem was fixed on 2010-09-20 for SQLite version 3.7.3.</p> <h2> Database pages leak from the free page list</h2> <p>When content is deleted from an SQLite database, pages that are no longer used are added to a free list and are reused to hold content added by subsequent inserts. A bug in SQLite that was present in version 3.6.16 through 3.7.2 might cause pages to go missing out of the free list when [incremental_vacuum] was used. This would not cause data loss.
But it would result in the database file being larger than necessary. And it would cause the [integrity_check pragma] to report pages missing from the free list.</p> <p>This problem was fixed on 2010-08-23 for SQLite version 3.7.2.</p> <h2> Corruption following alternating writes from 3.6 and 3.7.</h2> <p>SQLite version 3.7.0 introduced a number of new enhancements to the SQLite database file format (such as but not limited to [WAL]). The 3.7.0 release was a shake-out release for these new features. We expected to find problems and were not disappointed.</p> <p>If a database were originally created using SQLite version 3.7.0, then written by SQLite version 3.6.23.1 such that the size of the database file increased, then written again by SQLite version 3.7.0, the database file could go corrupt.</p> <p>This problem was fixed on 2010-08-04 for SQLite version 3.7.1.</p> <h2> Race condition in recovery on Windows systems.</h2> <p>SQLite version 3.7.16.2 fixes a subtle race condition in the locking logic on Windows systems. When a database file is in need of recovery because the previous process writing to it crashed in the middle of a transaction and two or more processes try to open that database at the same time, then the race condition might cause one of those processes to get a false indication that the recovery |
︙ | ︙ |
Changes to pages/malloc.in.
1 2 | <title>Dynamic Memory Allocation In SQLite</title> | | > > | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | <title>Dynamic Memory Allocation In SQLite</title> <table_of_contents> <tcl>hd_keywords {memory allocation}</tcl> <h1 style="margin-left:1.0em" notoc id=overview> Overview</h1> <p>SQLite uses dynamic memory allocation to obtain memory for storing various objects (ex: [database connections] and [prepared statements]) and to build a memory cache of the database file and to hold the results of queries. Much effort has gone into making the dynamic memory allocation subsystem of SQLite reliable, predictable, robust, secure, and efficient.</p> <p>This document provides an overview of dynamic memory allocation within SQLite. The target audience is software engineers who are tuning their use of SQLite for peak performance in demanding environments. Nothing in this document is required knowledge for using SQLite. The default settings and configuration for SQLite will work well in most applications. However, the information contained in this document may be useful to engineers who are tuning SQLite to comply with special requirements or to run under unusual circumstances.</p> <a name="features"></a> <h1> Features</h1> <p>The SQLite core and its memory allocation subsystem provides the following capabilities:</p> <ul> <li><p> <b>Robust against allocation failures.</b> |
︙ | ︙ | |||
110 111 112 113 114 115 116 | routines used by SQLite through the [sqlite3_malloc()], [sqlite3_realloc()], and [sqlite3_free()] interfaces. </p></li> </ul> <a name="testing"></a> | | | 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 | routines used by SQLite through the [sqlite3_malloc()], [sqlite3_realloc()], and [sqlite3_free()] interfaces. </p></li> </ul> <a name="testing"></a> <h1> Testing</h1> <p>Most of the code in the SQLite source tree is devoted purely to [testing | testing and verification]. Reliability is important to SQLite. Among the tasks of the test infrastructure is to ensure that SQLite does not misuse dynamically allocated memory, that SQLite does not leak memory, and that SQLite responds correctly to a dynamic memory allocation failure.</p> |
︙ | ︙ | |||
196 197 198 199 200 201 202 | the [TCL test suite] provides over 99% statement test coverage and that the [TH3] test harness provides [test coverage | 100% branch test coverage] with no memory leaks. This is strong evidence that dynamic memory allocation is used correctly everywhere within SQLite.</p> <a name="allocarray"></a> | | | 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 | the [TCL test suite] provides over 99% statement test coverage and that the [TH3] test harness provides [test coverage | 100% branch test coverage] with no memory leaks. This is strong evidence that dynamic memory allocation is used correctly everywhere within SQLite.</p> <a name="allocarray"></a> <h2> Use of reallocarray()</h2> <p>The reallocarray() interface is a recent innovation (circa 2014) from the OpenBSD community that grew out of efforts to prevent the next [http://heartbleed.com | "heartbleed" bug] by avoiding 32-bit integer arithmetic overflow on memory allocation size computations. The reallocarray() function has both unit-size and count parameters. To allocate memory sufficient to hold an array of N elements each X-bytes |
︙ | ︙ | |||
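<p>The 64-bit defense described above can be sketched as follows; "nElem" and "szElem" are hypothetical 32-bit inputs, and the point is that the multiplication happens in 64-bit arithmetic before the allocation is attempted:</p>
<blockquote><pre>
sqlite3_int64 nByte = (sqlite3_int64)nElem * (sqlite3_int64)szElem;
void *p = sqlite3_malloc64((sqlite3_uint64)nByte);
</pre></blockquote>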
235 236 237 238 239 240 241 | also verifies that it is impossible to overflow a 64-bit integer during the computation.</p> <p>The code audits used to ensure that memory allocation size computations do not overflow in SQLite are repeated prior to every SQLite release.</p> <a name="config"></a> | | | | | | 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 | also verifies that it is impossible to overflow a 64-bit integer during the computation.</p> <p>The code audits used to ensure that memory allocation size computations do not overflow in SQLite are repeated prior to every SQLite release.</p> <a name="config"></a> <h1> Configuration</h1> <p>The default memory allocation settings in SQLite are appropriate for most applications. However, applications with unusual or particularly strict requirements may want to adjust the configuration to more closely align SQLite to their needs. Both compile-time and start-time configuration options are available.</p> <tcl>hd_fragment altalloc {built-in memory allocators}</tcl> <h2> Alternative low-level memory allocators</h2> <p>The SQLite source code includes several different memory allocation modules that can be selected at compile-time, or to a limited extent at start-time.</p> <tcl>hd_fragment defaultalloc {default memory allocator}</tcl> <h3>The default memory allocator</h3> <p>By default, SQLite uses the malloc(), realloc(), and free() routines from the standard C library for its memory allocation needs. These routines are surrounded by a thin wrapper that also provides a "memsize()" function that will return the size of an existing allocation. The memsize() function is needed to keep an accurate count of the number of bytes of outstanding memory; memsize() determines how many bytes to remove from the outstanding count when an allocation is freed. The default allocator implements memsize() by always allocating 8 extra bytes on each malloc() request and storing the size of the allocation in that 8-byte header.</p> <p>The default memory allocator is recommended for most applications. If you do not have a compelling need to use an alternative memory allocator, then use the default.</p> <tcl>hd_fragment memdebug {debugging memory allocator} *memsys2</tcl> <h3>The debugging memory allocator</h3> <p>If SQLite is compiled with the [SQLITE_MEMDEBUG] compile-time option, then a different, heavy wrapper is used around system malloc(), realloc(), and free(). The heavy wrapper allocates around 100 bytes of extra space with each allocation. The extra space is used to place sentinel values at both ends of the allocation returned to the SQLite core. When an |
︙ | ︙ | |||
299 300 301 302 303 304 305 | <p>The heavy wrapper employed by [SQLITE_MEMDEBUG] is intended for use only during testing, analysis, and debugging of SQLite. The heavy wrapper has a significant performance and memory overhead and probably should not be used in production.</p> <tcl>hd_fragment win32heap {Win32 native memory allocator}</tcl> | | | | 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 | <p>The heavy wrapper employed by [SQLITE_MEMDEBUG] is intended for use only during testing, analysis, and debugging of SQLite. The heavy wrapper has a significant performance and memory overhead and probably should not be used in production.</p> <tcl>hd_fragment win32heap {Win32 native memory allocator}</tcl> <h3>The Win32 native memory allocator</h3> <p>If SQLite is compiled for Windows with the [SQLITE_WIN32_MALLOC] compile-time option, then a different, thin wrapper is used around HeapAlloc(), HeapReAlloc(), and HeapFree(). The thin wrapper uses the configured SQLite heap, which will be different from the default process heap if the [SQLITE_WIN32_HEAP_CREATE] compile-time option is used. In addition, when an allocation is made or freed, HeapValidate() will be called if SQLite is compiled with assert() enabled and the [SQLITE_WIN32_MALLOC_VALIDATE] compile-time option.</p> <tcl>hd_fragment memsys5 *memsys5 {zero-malloc memory allocator}</tcl> <h3>Zero-malloc memory allocator</h3> <p>When SQLite is compiled with the [SQLITE_ENABLE_MEMSYS5] option, an alternative memory allocator that does not use malloc() is included in the build. The SQLite developers refer to this alternative memory allocator as "memsys5". Even when it is included in the build, memsys5 is disabled by default. To enable memsys5, the application must invoke the following SQLite |
︙ | ︙ | |||
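<p>The start-time call referred to above takes a buffer, the buffer size, and a minimum allocation size. A sketch, with an 8MB static buffer and a 64-byte minimum chosen purely for illustration:</p>
<blockquote><pre>
static char aMemBuf[8*1024*1024];   /* memory pool handed to memsys5 */
sqlite3_config(SQLITE_CONFIG_HEAP, aMemBuf, sizeof(aMemBuf), 64);
</pre></blockquote>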
351 352 353 354 355 356 357 | requests are rounded up to a power of two and the request is satisfied by the first free slot in pBuf that is large enough. Adjacent freed allocations are coalesced using a buddy system. When used appropriately, this algorithm provides mathematical guarantees against fragmentation and breakdown, as described further <a href="#nofrag">below</a>.</p> <tcl>hd_fragment memsysx {experimental memory allocators}</tcl> | | | 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 | requests are rounded up to a power of two and the request is satisfied by the first free slot in pBuf that is large enough. Adjacent freed allocations are coalesced using a buddy system. When used appropriately, this algorithm provides mathematical guarantees against fragmentation and breakdown, as described further <a href="#nofrag">below</a>.</p> <tcl>hd_fragment memsysx {experimental memory allocators}</tcl> <h3>Experimental memory allocators</h3> <p>The name "memsys5" used for the zero-malloc memory allocator implies that there are several additional memory allocators available, and indeed there are. The default memory allocator is "memsys1". The debugging memory allocator is "memsys2". Those have already been covered.</p> <p>If SQLite is compiled with [SQLITE_ENABLE_MEMSYS3] then another |
︙ | ︙ | |||
408 409 410 411 412 413 414 | [version 3.6.5].</p> <p>Other experimental memory allocators might be added in future releases of SQLite. One may anticipate that these will be called memsys7, memsys8, and so forth.</p> <a name="appalloc"></a> | | | 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 | [version 3.6.5].</p> <p>Other experimental memory allocators might be added in future releases of SQLite. One may anticipate that these will be called memsys7, memsys8, and so forth.</p> <a name="appalloc"></a> <h3>Application-defined memory allocators</h3> <p>New memory allocators do not have to be part of the SQLite source tree nor included in the sqlite3.c [amalgamation]. Individual applications can supply their own memory allocators to SQLite at start-time.</p> <p>To cause SQLite to use a new memory allocator, the application simply calls:</p> |
︙ | ︙ | |||
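<p>The call in question supplies an instance of the [sqlite3_mem_methods] object. The following sketch wraps the system malloc() with an 8-byte size header, mirroring the technique attributed to the default allocator earlier in this document; all of the function names are of the application's own choosing:</p>
<blockquote><pre>
#include <stdlib.h>
#include "sqlite3.h"

static void *xMalloc(int n){
  sqlite3_int64 *p = malloc(n+8);
  if( p ){ *p = n; p++; }              /* record the size in the header */
  return p;
}
static void xFree(void *pPrior){
  if( pPrior ) free(((sqlite3_int64*)pPrior)-1);
}
static int xSize(void *pPrior){
  return pPrior ? (int)((sqlite3_int64*)pPrior)[-1] : 0;
}
static void *xRealloc(void *pPrior, int n){
  sqlite3_int64 *p = pPrior ? ((sqlite3_int64*)pPrior)-1 : 0;
  p = realloc(p, n+8);
  if( p ){ *p = n; p++; }
  return p;
}
static int xRoundup(int n){ return (n+7)&~7; }
static int xInit(void *pAppData){ return SQLITE_OK; }
static void xShutdown(void *pAppData){ /* no-op */ }

static sqlite3_mem_methods myMethods = {
  xMalloc, xFree, xRealloc, xSize, xRoundup, xInit, xShutdown, 0
};
/* Before any other SQLite interface is used:
**   sqlite3_config(SQLITE_CONFIG_MALLOC, &myMethods);          */
</pre></blockquote>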
433 434 435 436 437 438 439 | <p>In a multi-threaded application, access to the [sqlite3_mem_methods] is serialized if and only if [SQLITE_CONFIG_MEMSTATUS] is enabled. If [SQLITE_CONFIG_MEMSTATUS] is disabled then the methods in [sqlite3_mem_methods] must take care of their own serialization needs.</p> <a name="overlayalloc"></a> | | | 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 | <p>In a multi-threaded application, access to the [sqlite3_mem_methods] is serialized if and only if [SQLITE_CONFIG_MEMSTATUS] is enabled. If [SQLITE_CONFIG_MEMSTATUS] is disabled then the methods in [sqlite3_mem_methods] must take care of their own serialization needs.</p> <a name="overlayalloc"></a> <h3>Memory allocator overlays</h3> <p>An application can insert layers or "overlays" in between the SQLite core and the underlying memory allocator. For example, the <a href="#oomtesting">out-of-memory test logic</a> for SQLite uses an overlay that can simulate memory allocation failures.</p> |
︙ | ︙ | |||
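<p>For the overlay pattern just described, the current allocator can be saved with SQLITE_CONFIG_GETMALLOC before the replacement is installed. A sketch; "overlayMethods" stands for a hypothetical [sqlite3_mem_methods] object whose routines fall back to the saved "baseMethods":</p>
<blockquote><pre>
static sqlite3_mem_methods baseMethods;     /* the saved "real" allocator */
sqlite3_config(SQLITE_CONFIG_GETMALLOC, &baseMethods);
sqlite3_config(SQLITE_CONFIG_MALLOC, &overlayMethods);
</pre></blockquote>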
455 456 457 458 459 460 461 | The existing allocator is saved by the overlay and is used as a fallback to do real memory allocation. Then the overlay is inserted in place of the existing memory allocator using the [sqlite3_config]([SQLITE_CONFIG_MALLOC],...) as described <a href="#appalloc">above</a>. <a name="stuballoc"></a> | | | | 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 | The existing allocator is saved by the overlay and is used as a fallback to do real memory allocation. Then the overlay is inserted in place of the existing memory allocator using the [sqlite3_config]([SQLITE_CONFIG_MALLOC],...) as described <a href="#appalloc">above</a>. <a name="stuballoc"></a> <h3>No-op memory allocator stub</h3> <p>If SQLite is compiled with the [SQLITE_ZERO_MALLOC] option, then the [default memory allocator] is omitted and replaced by a stub memory allocator that never allocates any memory. Any calls to the stub memory allocator will report back that no memory is available.</p> <p>The no-op memory allocator is not useful by itself. It exists only as a placeholder so that SQLite has a memory allocator to link against on systems that may not have malloc(), free(), or realloc() in their standard library. An application that is compiled with [SQLITE_ZERO_MALLOC] will need to use [sqlite3_config()] together with [SQLITE_CONFIG_MALLOC] or [SQLITE_CONFIG_HEAP] to specify a new alternative memory allocator before beginning to use SQLite.</p> <tcl>hd_fragment scratch {scratch memory allocator}</tcl> <h2> Scratch memory</h2> <p>SQLite occasionally needs a large chunk of "scratch" memory to perform some transient calculation. Scratch memory is used, for example, as temporary storage when rebalancing a B-Tree. These scratch memory allocations are typically about 10 kilobytes in size and are transient - lasting only for the duration of a single, short-lived function call.</p> |
︙ | ︙ | |||
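<p>Scratch memory is configured with SQLITE_CONFIG_SCRATCH, which takes a buffer, a per-slot size "sz", and a slot count "N". A sketch with values chosen only for illustration; sz should comfortably exceed the roughly 10 kilobyte transient need described above:</p>
<blockquote><pre>
static char aScratch[16384*8];   /* 8 slots of 16KB each */
sqlite3_config(SQLITE_CONFIG_SCRATCH, aScratch, 16384, 8);
</pre></blockquote>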
516 517 518 519 520 521 522 | <p>If the scratch memory setup does not define enough memory, then SQLite falls back to using the regular memory allocator for its scratch memory allocations. The default setup is sz=0 and N=0 so the use of the regular memory allocator is the default behavior.</p> <tcl>hd_fragment pagecache {pagecache memory allocator}</tcl> | | | 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 | <p>If the scratch memory setup does not define enough memory, then SQLite falls back to using the regular memory allocator for its scratch memory allocations. The default setup is sz=0 and N=0 so the use of the regular memory allocator is the default behavior.</p> <tcl>hd_fragment pagecache {pagecache memory allocator}</tcl> <h2> Page cache memory</h2> <p>In most applications, the database page cache subsystem within SQLite uses more dynamically allocated memory than all other parts of SQLite combined. It is not unusual to see the database page cache consume over 10 times more memory than the rest of SQLite combined.</p> <p>SQLite can be configured to make page cache memory allocations from |
︙ | ︙ | |||
566 567 568 569 570 571 572 | number of available allocations.</p> <p>If SQLite needs a page-cache entry that is larger than "sz" bytes or if it needs more than N entries, it falls back to using the general-purpose memory allocator.</p> <tcl>hd_fragment lookaside {lookaside memory allocator}</tcl> | | | 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 | number of available allocations.</p> <p>If SQLite needs a page-cache entry that is larger than "sz" bytes or if it needs more than N entries, it falls back to using the general-purpose memory allocator.</p> <tcl>hd_fragment lookaside {lookaside memory allocator}</tcl> <h2> Lookaside memory allocator</h2> <p>SQLite [database connections] make many small and short-lived memory allocations. This occurs most commonly when compiling SQL statements using [sqlite3_prepare_v2()] but also to a lesser extent when running [prepared statements] using [sqlite3_step()]. These small memory allocations are used to hold things such as the names of tables |
︙ | ︙ | |||
654 655 656 657 658 659 660 | <p>The lookaside configuration can only be changed while there are no outstanding lookaside allocations for the database connection. Hence, the configuration should be set immediately after creating the database connection using [sqlite3_open()] (or equivalent) and before evaluating any SQL statements on the connection.</p> <tcl>hd_fragment memstatus {memory statistics}</tcl> | | 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 | <p>The lookaside configuration can only be changed while there are no outstanding lookaside allocations for the database connection. Hence, the configuration should be set immediately after creating the database connection using [sqlite3_open()] (or equivalent) and before evaluating any SQL statements on the connection.</p> <tcl>hd_fragment memstatus {memory statistics}</tcl> <h2> Memory status</h2> <p>By default, SQLite keeps statistics on its memory usage. These statistics are useful in helping to determine how much memory an application really needs. The statistics can also be used in high-reliability systems to determine if the memory usage is coming close to or exceeding the limits of the [Robson proof] and hence that the memory allocation subsystem is
︙ | ︙ | |||
716 717 718 719 720 721 722 | <p>The per-connection statistics do not use global variables and hence do not require mutexes to update or access. Consequently the per-connection statistics continue to function even if [SQLITE_CONFIG_MEMSTATUS] is turned off.</p> <a name="heaplimit"></a> | | 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 | <p>The per-connection statistics do not use global variables and hence do not require mutexes to update or access. Consequently the per-connection statistics continue to function even if [SQLITE_CONFIG_MEMSTATUS] is turned off.</p> <a name="heaplimit"></a> <h2> Setting memory usage limits</h2> <p>The [sqlite3_soft_heap_limit64()] interface can be used to set an upper bound on the total amount of memory that the general-purpose memory allocator for SQLite will allow to be outstanding at one time. If attempts are made to allocate more memory than specified by the soft heap limit, then SQLite will first attempt to free cache memory before continuing with the allocation request. The soft heap
︙ | ︙ | |||
742 743 744 745 746 747 748 | <p>As of SQLite version 3.6.1, the soft heap limit only applies to the general-purpose memory allocator. The soft heap limit does not know about or interact with the [scratch memory allocator], the [pagecache memory allocator], or the [lookaside memory allocator]. This deficiency will likely be addressed in a future release.</p> <tcl>hd_fragment nofrag {Robson proof}</tcl> | | | 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 | <p>As of SQLite version 3.6.1, the soft heap limit only applies to the general-purpose memory allocator. The soft heap limit does not know about or interact with the [scratch memory allocator], the [pagecache memory allocator], or the [lookaside memory allocator]. This deficiency will likely be addressed in a future release.</p> <tcl>hd_fragment nofrag {Robson proof}</tcl> <h1> Mathematical Guarantees Against Memory Allocation Failures</h1> <p>The problem of dynamic memory allocation, and specifically the problem of a memory allocator breakdown, has been studied by J. M. Robson and the results published as:</p> <blockquote> J. M. Robson. "Bounds for Some Functions Concerning Dynamic |
︙ | ︙ | |||
822 823 824 825 826 827 828 | for <b>N</b> that will guarantee that no call to any SQLite interface will ever return [SQLITE_NOMEM]. The memory pool will never become so fragmented that a new memory allocation request cannot be satisfied. This is an important property for applications where a software fault could cause injury, physical harm, or loss of irreplaceable data.</p> | | | 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 | for <b>N</b> that will guarantee that no call to any SQLite interface will ever return [SQLITE_NOMEM]. The memory pool will never become so fragmented that a new memory allocation request cannot be satisfied. This is an important property for applications where a software fault could cause injury, physical harm, or loss of irreplaceable data.</p> <h2> Computing and controlling parameters <b>M</b> and <b>n</b></h2> <p>The Robson proof applies separately to each of the memory allocators used by SQLite:</p> <ul> <li>The general-purpose memory allocator ([memsys5]).</li> <li>The [scratch memory allocator].</li> |
︙ | ︙ | |||
963 964 965 966 967 968 969 | [database connection] and [prepared statement] objects the application uses, and on the complexity of the [prepared statements]. For any given application, these factors are normally fixed and can be determined experimentally using [SQLITE_STATUS_MEMORY_USED]. A typical application might only use about 40KB of general-purpose memory. This gives a value of <b>N</b> of around 100KB.</p> | | 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 | [database connection] and [prepared statement] objects the application uses, and on the complexity of the [prepared statements]. For any given application, these factors are normally fixed and can be determined experimentally using [SQLITE_STATUS_MEMORY_USED]. A typical application might only use about 40KB of general-purpose memory. This gives a value of <b>N</b> of around 100KB.</p> <h2> Ductile failure</h2> <p>If the memory allocation subsystems within SQLite are configured for breakdown-free operation but the actual memory usage exceeds design limits set by the [Robson proof], SQLite will usually continue to operate normally. The [scratch memory allocator], the [pagecache memory allocator], and the [lookaside memory allocator] all automatically fail over
︙ | ︙ | |||
992 993 994 995 996 997 998 | the application will provide operators with abundant warning well in advance of failure. The [memory statistics] interfaces of SQLite provide the application with all the mechanisms necessary to complete the monitoring portion of this task.</p> <a name="stability"></a> | | 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 | the application will provide operators with abundant warning well in advance of failure. The [memory statistics] interfaces of SQLite provide the application with all the mechanisms necessary to complete the monitoring portion of this task.</p> <a name="stability"></a> <h1> Stability Of Memory Interfaces</h1> <p><b>Update:</b> As of SQLite version 3.7.0 (2010-07-22), all of the SQLite memory allocation interfaces are considered stable and will be supported in future releases.</p>
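<p>A short sketch tying together the statistics and limit interfaces discussed above: impose a soft heap limit at startup, then periodically check how close current usage is to that limit. The 10MB limit and the 80% warning threshold are invented for the example; [sqlite3_soft_heap_limit64()] and [sqlite3_memory_used()] are the real interfaces.</p>

<blockquote><pre>
#include "sqlite3.h"
#include <stdio.h>

void setupAndMonitorMemory(void){
  sqlite3_soft_heap_limit64(10*1024*1024);        /* example: 10MB cap */

  /* ... later, called periodically by the application: */
  sqlite3_int64 limit = sqlite3_soft_heap_limit64(-1); /* query only */
  sqlite3_int64 inUse = sqlite3_memory_used();
  if( limit>0 && inUse > (limit*4)/5 ){           /* within 20% of cap */
    fprintf(stderr, "warning: %lld of %lld bytes in use\n",
            (long long)inUse, (long long)limit);
  }
}
</pre></blockquote>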
Changes to pages/privatebranch.in.
|
| | | < < | | 1 2 3 4 5 6 7 8 9 10 11 12 13 | <title>Maintaining Private Branches Of SQLite</title> <tcl>hd_keywords {private branches}</tcl> <table_of_contents> <h1> Introduction</h1> <p>SQLite is designed to meet most developers' needs without any changes or customization. When changes are needed, they can normally be accomplished using start-time [sqlite3_config | (1)] or runtime [sqlite3_db_config | (2)] [sqlite3_limit | (3)]
︙ | ︙ | |||
35 36 37 38 39 40 41 | This article is not trying to impose a particular procedure on maintainers of private branches. The point of this article is to offer an example of one process for maintaining a private branch which can be used as a template for designing processes best suited for the circumstances of each individual project.</p> <img src="images/private_branch.gif" align="right"> | | | 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | This article is not trying to impose a particular procedure on maintainers of private branches. The point of this article is to offer an example of one process for maintaining a private branch which can be used as a template for designing processes best suited for the circumstances of each individual project.</p> <img src="images/private_branch.gif" align="right"> <h1> The Basic Idea</h1> <p>We propose to use the [http://www.fossil-scm.org | fossil software configuration management] system to set up two branches. One branch (the "public branch" or "trunk") contains the published SQLite sources and the other branch is the private branch which contains the code that is customized for the project. |
︙ | ︙ | |||
88 89 90 91 92 93 94 | <p>The cycle above can be repeated many times. The diagram shows a third SQLite release, 3.6.17, in circle (6). The private branch maintainer can do another merge in order to incorporate the changes moving from (4) to (6) into the private branch, resulting in version (7).</p> | | | 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 | <p>The cycle above can be repeated many times. The diagram shows a third SQLite release, 3.6.17, in circle (6). The private branch maintainer can do another merge in order to incorporate the changes moving from (4) to (6) into the private branch, resulting in version (7).</p> <h1> The Procedure</h1> <p>The remainder of this document will guide the reader through the steps needed to maintain a private branch. The general idea is the same as outlined above. This section merely provides more detail.</p> <p>We emphasize again that these steps are not intended to be the only acceptable method for maintaining a private branch. This approach is one of many. Use this document as a baseline for preparing project-specific procedures. Do not be afraid to experiment.</p> <h2> Obtain The Software</h2> <p>[http://www.fossil-scm.org/ | Fossil] is a computer program that must be installed on your machine before you use it. Fortunately, installing fossil is very easy. Fossil is a single "*.exe" file that you simply download and run. To uninstall fossil, simply delete the exe file. [http://www.fossil-scm.org/index.html/doc/tip/www/quickstart.wiki | Detailed instructions] for installing and getting started with fossil are available on the [http://www.fossil-scm.org | fossil website].</p> <h2> Create A Project Repository</h2> <p>Create a fossil repository to host the private branch using the following command:</p> <blockquote><pre> fossil new private-project.fossil </pre></blockquote>
︙ | ︙ | |||
152 153 154 155 156 157 158 | fossil open private-project.fossil </pre></blockquote> <p>You can have multiple checkouts of the same project if you want. And you can "clone" the repository to different machines so that multiple developers can use it. See the fossil website for further information.</p> | | | 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 | fossil open private-project.fossil </pre></blockquote> <p>You can have multiple checkouts of the same project if you want. And you can "clone" the repository to different machines so that multiple developers can use it. See the fossil website for further information.</p> <h2> Installing The SQLite Baseline In Fossil</h2> <p>The repository created in the previous step is initially empty. The next step is to load the baseline SQLite release - circle (1) in the diagram above.</p> <p>Begin by obtaining a copy of SQLite in whatever form you use it. The public SQLite you obtain should be as close to your private edited |
︙ | ︙ | |||
217 218 219 220 221 222 223 | <blockquote><pre> fossil timeline fossil info fossil status </pre></blockquote> | | | 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 | <blockquote><pre> fossil timeline fossil info fossil status </pre></blockquote> <h2> Creating The Private Branch</h2> <p>The previous step created circle (1) in the diagram above. This step will create circle (2). Run the following command:</p> <blockquote><pre> fossil branch new private trunk -bgcolor "#add8e8" </pre></blockquote> |
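<p>The new branch can then be verified and selected. These are standard fossil commands, though not part of the original procedure text:</p>

<blockquote><pre>
fossil branch list
fossil update private
</pre></blockquote>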
︙ | ︙ | |||
258 259 260 261 262 263 264 | fossil update trunk </pre></blockquote> <p>Normally, fossil will modify all the files in your checkout when switching between the private and the public branches. But at this point, the files are identical in both branches so no modifications need to be made.</p> | | 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 | fossil update trunk </pre></blockquote> <p>Normally, fossil will modify all the files in your checkout when switching between the private and the public branches. But at this point, the files are identical in both branches so no modifications need to be made.</p> <h2> Adding Customizations To The Code In The Private Branch</h2> <p>Now it is time to make the private, custom modifications to SQLite which are the whole point of this exercise. Switch to the private branch (if you are not already there) using the "<tt>fossil update private</tt>" command, then bring up the source files in your text editor and make whatever changes you want to make. Once you have finished making changes, commit those changes using this command:</p>
︙ | ︙ | |||
287 288 289 290 291 292 293 | <p>Note that in the diagram above, we showed the private edits as a single commit. This was for clarity of presentation only. There is nothing to stop you from doing dozens or hundreds of separate tiny changes and committing each separately. In fact, making many small changes is the preferred way to work. The only reason for doing all the changes in a single commit is that it makes the diagram easier to draw.</p> | | 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 | <p>Note that in the diagram above, we showed the private edits as a single commit. This was for clarity of presentation only. There is nothing to stop you from doing dozens or hundreds of separate tiny changes and committing each separately. In fact, making many small changes is the preferred way to work. The only reason for doing all the changes in a single commit is that it makes the diagram easier to draw.</p> <h2> Incorporating New Public SQLite Releases</h2> <p>Suppose that after a while (about a month, usually) a new version of SQLite is released: 3.6.16. You will want to incorporate this new public version of SQLite into your repository in the public branch (the trunk). To do this, first change your repository over to the trunk:</p> <blockquote><pre> fossil update trunk </pre></blockquote> <p>Then download the new version of the SQLite sources and overwrite the files that are in the checkout.</p> <p>If you made NL to CR-NL line ending changes or space to tab indentation changes in the original baseline, make the same changes to the new source file.</p> <p>Once everything is ready, run the "<tt>fossil commit</tt>" command to check in the changes. This creates circle (4) in the diagram above.</p> <h2> Merging Public SQLite Updates Into The Private Branch</h2> <p>The next step is to move the changes in the public branch over into the private branch. In other words, we want to create circle (5) in the diagram above. Begin by changing to the private branch using "<tt>fossil update private</tt>". Then type this command:</p> <blockquote><pre>
︙ | ︙ | |||
336 337 338 339 340 341 342 | conflicts and manually resolve the conflicts.</p> <p>After resolving conflicts, many users like to compile and test the new version before committing it to the repository. Or you can commit first and test later. Either way, run the "<tt>fossil commit</tt>" command to check in the circle (5) version.</p> | | 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 | conflicts and manually resolve the conflicts.</p> <p>After resolving conflicts, many users like to compile and test the new version before committing it to the repository. Or you can commit first and test later. Either way, run the "<tt>fossil commit</tt>" command to check in the circle (5) version.</p> <h2> Further Updates</h2> <p>As new versions of SQLite are released, repeat steps 3.6 and 3.7 to add changes in the new release to the private branch. Additional private changes can be made on the private branch in between releases if desired.</p> <h1> Variations</h1> <p>Since this document was first written, the canonical SQLite source code has been moved from the venerable CVS system into a Fossil repository at [http://www.sqlite.org/src]. This means that if you are working with canonical SQLite source code (as opposed to the [amalgamation] source code files, sqlite3.c and sqlite3.h) then you can create a private repository simply by cloning the official repository:</p>
︙ | ︙ |
Changes to pages/queryplanner-ng.in.
︙ | ︙ | |||
10 11 12 13 14 15 16 | hd_puts "<center><table><tr><td><pre>\n" regsub {^[ \n]*\n} $txt {} txt hd_puts [string trimright $txt]\n hd_puts "</pre></table></center>\n" } hd_keywords {next generation query planner} {Next Generation Query Planner} NGQP </tcl> | | > | | 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | hd_puts "<center><table><tr><td><pre>\n" regsub {^[ \n]*\n} $txt {} txt hd_puts [string trimright $txt]\n hd_puts "</pre></table></center>\n" } hd_keywords {next generation query planner} {Next Generation Query Planner} NGQP </tcl> <table_of_contents> <h1> Introduction</h1> <p> The task of the "query planner" is to figure out the best algorithm or "query plan" to accomplish an SQL statement. Beginning with SQLite version 3.8.0, the query planner component has been rewritten so that it runs faster and generates better plans. The rewrite is called the "next generation query planner" or "NGQP". |
︙ | ︙ | |||
37 38 39 40 41 42 43 | regressions. This risk is considered and a checklist is provided for reducing the risk and for fixing any issues that do arise.</p> <p>This document focuses on the NGQP. For a more general overview of the SQLite query planner that encompasses the entire history of SQLite, see "[query planner | The SQLite Query Optimizer Overview]".</p> | | | 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | regressions. This risk is considered and a checklist is provided for reducing the risk and for fixing any issues that do arise.</p> <p>This document focuses on the NGQP. For a more general overview of the SQLite query planner that encompasses the entire history of SQLite, see "[query planner | The SQLite Query Optimizer Overview]".</p> <h1> Background</h1> <p>For simple queries against a single table with few indices, there is usually an obvious choice for the best algorithm. But for larger and more complex queries, such as multi-way joins with many indices and subqueries, there can be hundreds, thousands, or millions of reasonable algorithms for computing the result. |
︙ | ︙ | |||
72 73 74 75 76 77 78 | without actually running that plan. So when comparing two or more plans to figure out which is "best", the query planner has to make some guesses and assumptions and those guesses and assumptions will sometimes be wrong. A good query planner is one that will find the correct solution often enough that application programmers rarely need to get involved.</p> | | | | 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 | without actually running that plan. So when comparing two or more plans to figure out which is "best", the query planner has to make some guesses and assumptions and those guesses and assumptions will sometimes be wrong. A good query planner is one that will find the correct solution often enough that application programmers rarely need to get involved.</p> <h2> Query Planning In SQLite</h2> <p>SQLite computes joins using nested loops, one loop for each table in the join. (Additional loops might be inserted for IN and OR operators in the WHERE clause. SQLite considers those too, but for simplicity we will ignore them in this essay.) One or more indices might be used on each loop to speed the search, or a loop might be a "full table scan" that reads every row in the table. Thus query planning decomposes into two subtasks:</p> <ol> <li> Picking the nested order of the various loops <li> Choosing good indices for each loop </ol> <p>Picking the nesting order is generally the more challenging problem. Once the nesting order of the join is established, the choice of indices for each loop is normally obvious.</p> <tcl>hd_fragment qpstab {query planner stability guarantee}</tcl> <h2> The SQLite Query Planner Stability Guarantee</h2> <p>SQLite will always pick the same query plan for any given SQL statement as long as: <ol type="a"> <li>the database schema does not change in significant ways such as adding or dropping indices,</li> <li>the ANALYZE command is not rerun, </li> |
︙ | ︙ | |||
140 141 142 143 144 145 146 | a different version of SQLite, then query plans might change. In rare cases, an SQLite version change might lead to a performance regression. This is one reason you should consider statically linking your applications against SQLite rather than using a system-wide SQLite shared library which might change without your knowledge or control.</p> | | 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 | a different version of SQLite, then query plans might change. In rare cases, an SQLite version change might lead to a performance regression. This is one reason you should consider statically linking your applications against SQLite rather than using a system-wide SQLite shared library which might change without your knowledge or control.</p> <h1> A Difficult Case</h1> <p> "TPC-H Q8" is a test query from the <a href="http://www.tpc.org/tpch/">Transaction Processing Performance Council</a>. The query planners in SQLite versions 3.7.17 and earlier do not choose good plans for TPC-H Q8. And it has been determined that no amount of tweaking of the legacy query planner will fix that. In order to find a good solution to the TPC-H Q8 query, and to continue improving the quality of SQLite's query planner, it became necessary to redesign the query planner. This section tries to explain why this redesign was necessary and how the NGQP is different and addresses the TPC-H Q8 problem. </p> <h2> Query Details</h2> <p> TPC-H Q8 is an eight-way join. As observed above, the main task of the query planner is to figure out the best nesting order of the eight loops in order to minimize the work needed to complete the join. A simplified model of this problem for the case of TPC-H Q8 is shown
︙ | ︙ | |||
205 206 207 208 209 210 211 | a minimum-cost path through the graph that visits each node exactly once.</p> <p>(Side note: The cost estimates in the TPC-H Q8 graph were computed by the query planner in SQLite 3.7.16 and converted using a natural logarithm.) </p> | | 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 | a minimum-cost path through the graph that visits each node exactly once.</p> <p>(Side note: The cost estimates in the TPC-H Q8 graph were computed by the query planner in SQLite 3.7.16 and converted using a natural logarithm.) </p> <h2> Complications</h2> <p>The presentation of the query planner problem above is a simplification. The costs are estimates. We cannot know what the true cost of running a loop is until we actually run the loop. SQLite makes guesses for the cost of running a loop based on the availability of indices and constraints found in the WHERE clause. These guesses are usually pretty good, but they can sometimes be
︙ | ︙ | |||
252 253 254 255 256 257 258 | <p>In the TPC-H Q8 query, the setup costs are all negligible, all dependencies are between individual nodes, and there is no ORDER BY, GROUP BY, or DISTINCT clause. So for TPC-H Q8, the graph above is a reasonable representation of what needs to be computed. The general case involves a lot of extra complication, which for clarity is neglected in the remainder of this article.</p> | | | 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 | <p>In the TPC-H Q8 query, the setup costs are all negligible, all dependencies are between individual nodes, and there is no ORDER BY, GROUP BY, or DISTINCT clause. So for TPC-H Q8, the graph above is a reasonable representation of what needs to be computed. The general case involves a lot of extra complication, which for clarity is neglected in the remainder of this article.</p> <h2> Finding The Best Query Plan</h2> <p>Prior to version 3.8.0, SQLite always used the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan. The NN heuristic makes a single traversal of the graph, always choosing the lowest-cost arc as the next step. The NN heuristic works surprisingly well in most cases. And NN is fast, so that SQLite is able to quickly find good plans |
︙ | ︙ | |||
283 284 285 286 287 288 289 | <p>One solution to this problem is to change SQLite to do an exhaustive search for the best path. But an exhaustive search requires time proportional to K! (where K is the number of tables in the join) and so when you get beyond a 10-way join, the time to run [sqlite3_prepare()] becomes very large.</p> | | 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 | <p>One solution to this problem is to change SQLite to do an exhaustive search for the best path. But an exhaustive search requires time proportional to K! (where K is the number of tables in the join) and so when you get beyond a 10-way join, the time to run [sqlite3_prepare()] becomes very large.</p> <h2> The N Nearest Neighbors or "N3" Heuristic</h2> <p>The NGQP uses a new heuristic for seeking the best path through the graph: "N Nearest Neighbors" (hereafter "N3"). With N3, instead of choosing just one nearest neighbor for each step, the algorithm keeps track of the N best paths at each step for some small integer N.</p> <p>Suppose N=4. Then for the TPC-H Q8 graph, the first step finds
︙ | ︙ | |||
345 346 347 348 349 350 351 | when N is 10 or greater.</p> <p>The initial implementation of NGQP chooses N=1 for simple queries, N=5 for two-way joins and N=10 for all joins with three or more tables. This formula for selecting N might change in subsequent releases.</p> <tcl>hd_fragment hazards {hazards of upgrading to the NGQP}</tcl> | | 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 | when N is 10 or greater.</p> <p>The initial implementation of NGQP chooses N=1 for simple queries, N=5 for two-way joins and N=10 for all joins with three or more tables. This formula for selecting N might change in subsequent releases.</p> <tcl>hd_fragment hazards {hazards of upgrading to the NGQP}</tcl> <h1> Hazards Of Upgrading To NGQP</h1> <p>For most applications, upgrading from the legacy query planner to the NGQP requires little thought or effort. Simply replace the older version of SQLite with the newer version, recompile, and the application will run faster. There are no API changes or modifications to compilation procedures.</p>
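<p>One low-effort way to check for plan changes across an upgrade (not from the original text, and using hypothetical table names) is to run [EXPLAIN QUERY PLAN] on important queries under both versions of SQLite and compare the output:</p>

<blockquote><pre>
-- Hypothetical tables t1 and t2; run under both SQLite versions
-- and diff the resulting plans.
EXPLAIN QUERY PLAN
SELECT t1.a FROM t1 JOIN t2 ON t1.b=t2.b WHERE t2.c=5;
</pre></blockquote>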
︙ | ︙ | |||
385 386 387 388 389 390 391 | </ul> <p>Not all applications meet these conditions. Fortunately, the NGQP will still usually find good query plans, even without these conditions. However, cases do arise (rarely) where performance regressions can occur.</p> <tcl>hd_fragment fossilcasestudy {The Fossil NGQP Upgrade Case Study}</tcl> | | | 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 | </ul> <p>Not all applications meet these conditions. Fortunately, the NGQP will still usually find good query plans, even without these conditions. However, cases do arise (rarely) where performance regressions can occur.</p> <tcl>hd_fragment fossilcasestudy {The Fossil NGQP Upgrade Case Study}</tcl> <h2> Case Study: Upgrading Fossil to the NGQP</h2> <p>The <a href="http://www.fossil-scm.org/">Fossil DVCS</a> is the version control system used to track all of the SQLite source code. A Fossil repository is an SQLite database file. (Readers are invited to ponder this recursion as an independent exercise.) Fossil is both the version-control system for SQLite and a test platform for SQLite. Whenever enhancements are made to SQLite, |
︙ | ︙ | |||
582 583 584 585 586 587 588 | <p>(Side note: The cost estimates in the two most recent graphs were computed by the NGQP using a base-2 logarithm and slightly different cost assumptions compared to the legacy query planner. Hence, the cost estimates in these latter two graphs are not directly comparable to the cost estimates in the TPC-H Q8 graph.)</p> | | 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 | <p>(Side note: The cost estimates in the two most recent graphs were computed by the NGQP using a base-2 logarithm and slightly different cost assumptions compared to the legacy query planner. Hence, the cost estimates in these latter two graphs are not directly comparable to the cost estimates in the TPC-H Q8 graph.)</p> <h2> Fixing The Problem</h2> <p>Running [ANALYZE] on the repository database immediately fixed the performance problem. However, we want Fossil to be robust and to always work quickly regardless of whether or not its repository has been analyzed. For this reason, the query was modified to use the CROSS JOIN operator instead of the plain JOIN operator. SQLite will not reorder the tables of a CROSS JOIN.
︙ | ︙ | |||
610 611 612 613 614 615 616 | than PLINK_I1 and algorithm-2 really would be the faster choice. However, such repositories are very unlikely to appear in practice and so hard-coding the loop nesting order using the CROSS JOIN syntax is a reasonable solution to the problem in this case.</p> <tcl>hd_fragment howtofix {query planner checklist}</tcl> | | 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 | than PLINK_I1 and algorithm-2 really would be the faster choice. However, such repositories are very unlikely to appear in practice and so hard-coding the loop nesting order using the CROSS JOIN syntax is a reasonable solution to the problem in this case.</p> <tcl>hd_fragment howtofix {query planner checklist}</tcl> <h1> Checklist For Avoiding Or Fixing Query Planner Problems</h1> <ol> <li><p><b>Don't panic!</b> Cases where the query planner picks an inferior plan are actually quite rare. You are unlikely to run across any problems in your application. If you are not having performance issues, you do not need to worry about any of this.</p>
︙ | ︙ | |||
699 700 701 702 703 704 705 | <li><p><b>Use the [INDEXED BY] syntax to enforce the selection of particular indices on problem queries.</b> As with the previous two bullets, avoid this step if possible, and especially avoid doing this early in development as it is clearly a premature optimization.</p> </ol> | | | 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 | <li><p><b>Use the [INDEXED BY] syntax to enforce the selection of particular indices on problem queries.</b> As with the previous two bullets, avoid this step if possible, and especially avoid doing this early in development as it is clearly a premature optimization.</p> </ol> <h1> Summary</h1> <p>The query planner in SQLite normally does a terrific job of selecting fast algorithms for running your SQL statements. This is true of the legacy query planner and even more true of the new NGQP. There may be an occasional situation where, due to incomplete information, the query planner selects a suboptimal plan. This will happen less often with the NGQP than with the legacy query planner, but it might still happen. Only in those rare cases do application developers need to get involved and help the query planner to do the right thing. In the common case, the NGQP is just a new enhancement to SQLite that makes the application run a little faster and which requires no new developer thought or action.</p> |
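<p>As a coda to the checklist above, here is what the two manual-control techniques it mentions look like in SQL. The table and index names are hypothetical; the CROSS JOIN and [INDEXED BY] syntax are the real mechanisms described in this document:</p>

<blockquote><pre>
-- Force t1 to be the outer loop of the join:
SELECT * FROM t1 CROSS JOIN t2 WHERE t1.x=t2.y;

-- Require the query to use the index named i2:
SELECT z FROM t1 INDEXED BY i2 WHERE x=5;
</pre></blockquote>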
Changes to pages/queryplanner.in.
︙ | ︙ | |||
14 15 16 17 18 19 20 | proc code {txt} { hd_puts "<center><table><tr><td><pre>\n" regsub {^[ \n]*\n} $txt {} txt hd_puts [string trimright $txt]\n hd_puts "</pre></table></center>\n" } </tcl> | | > > > | 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | proc code {txt} { hd_puts "<center><table><tr><td><pre>\n" regsub {^[ \n]*\n} $txt {} txt hd_puts [string trimright $txt]\n hd_puts "</pre></table></center>\n" } </tcl> <table_of_contents> <h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> <p> The best feature of SQL (in <u>all</u> its implementations, not just SQLite) is that it is a <i>declarative</i> language, not a <i>procedural</i> language. When programming in SQL you tell the system <i>what</i> you want to compute, not <i>how</i> to compute it. The task of figuring out the <i>how</i> is delegated to the <i>query planner</i> subsystem within |
︙ | ︙ | |||
46 47 48 49 50 51 52 | with background information to help them understand what is going on behind the scenes with SQLite, which in turn should make it easier for programmers to create the indices that will help the SQLite query planner to pick the best plans. </p> <tcl>hd_fragment searching strategies</tcl> | | | | 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | with background information to help them understand what is going on behind the scenes with SQLite, which in turn should make it easier for programmers to create the indices that will help the SQLite query planner to pick the best plans. </p> <tcl>hd_fragment searching strategies</tcl> <h1> Searching</h1> <h2> Tables Without Indices</h2> <p> Most tables in SQLite consist of zero or more rows with a unique integer key (the [rowid] or [INTEGER PRIMARY KEY]) followed by content. (The exception is [WITHOUT ROWID] tables.) The rows are logically stored in order of increasing rowid. As an example, this |
︙ | ︙ | |||
109 110 111 112 113 114 115 | With a table of only 7 rows, this is not a big deal, but if your table contained 7 million rows, a full table scan might read megabytes of content in order to find a single 8-byte number. For that reason, one normally tries to avoid full table scans. </p> <tcl>figure 2 #fig2 fullscan.gif {Full Table Scan}</tcl> | | 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 | With a table of only 7 rows, this is not a big deal, but if your table contained 7 million rows, a full table scan might read megabytes of content in order to find a single 8-byte number. For that reason, one normally tries to avoid full table scans. </p> <tcl>figure 2 #fig2 fullscan.gif {Full Table Scan}</tcl> <h2> Lookup By Rowid</h2> <p> One technique for avoiding a full table scan is to do lookups by rowid (or by the equivalent INTEGER PRIMARY KEY). To look up the price of peaches, one would query for the entry with a rowid of 4: </p>
︙ | ︙ | |||
133 134 135 136 137 138 139 | to N as in a full table scan. If the table contains 10 million elements, that means the query will be on the order of N/logN or about 1 million times faster! </p> <tcl>figure 3 #fig3 rowidlu.gif {Lookup By Rowid}</tcl> | | | 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 | to N as in a full table scan. If the table contains 10 million elements, that means the query will be on the order of N/logN or about 1 million times faster! </p> <tcl>figure 3 #fig3 rowidlu.gif {Lookup By Rowid}</tcl> <h2> Lookup By Index</h2> <p> The problem with looking up information by rowid is that you probably do not care what the price of "item 4" is - you want to know the price of peaches. And so a rowid lookup is not helpful. </p> <p> |
︙ | ︙ | |||
194 195 196 197 198 199 200 | <p> SQLite has to do two binary searches to find the price of peaches using the method shown above. But for a table with a large number of rows, this is still much faster than doing a full table scan. </p> | | 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 | <p> SQLite has to do two binary searches to find the price of peaches using the method shown above. But for a table with a large number of rows, this is still much faster than doing a full table scan. </p> <h2> Multiple Result Rows</h2> <p> In the previous query the fruit='Peach' constraint narrowed the result down to a single row. But the same technique works even if multiple rows are obtained. Suppose we looked up the price of Oranges instead of Peaches: </p>
︙ | ︙ | |||
225 226 227 228 229 230 231 | cheap in comparison to a binary search that we usually ignore it. So our estimate for the total cost of this query is 3 binary searches. If the number of rows of output is K and the number of rows in the table is N, then in general the cost of doing the query is proportional to (K+1)*logN. </p> | | | 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 | cheap in comparison to a binary search that we usually ignore it. So our estimate for the total cost of this query is 3 binary searches. If the number of rows of output is K and the number of rows in the table is N, then in general the cost of doing the query is proportional to (K+1)*logN. </p> <h2> Multiple AND-Connected WHERE-Clause Terms</h2> <p> Next, suppose that you want to look up the price of not just any orange, but specifically California-grown oranges. The appropriate query would be as follows: </p> |
︙ | ︙ | |||
301 302 303 304 305 306 307 | search down to two rows. So, if all else is equal, SQLite will choose Idx1 with the hope of narrowing the search to as small a number of rows as possible. This choice is only possible because of the statistics provided by [ANALYZE]. If [ANALYZE] has not been run then the choice of which index to use is arbitrary. </p> | | | 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 | search down to two rows. So, if all else is equal, SQLite will choose Idx1 with the hope of narrowing the search to as small a number of rows as possible. This choice is only possible because of the statistics provided by [ANALYZE]. If [ANALYZE] has not been run then the choice of which index to use is arbitrary. </p> <h2> Multi-Column Indices</h2> <p> To get the maximum performance out of a query with multiple AND-connected terms in the WHERE clause, you really want a multi-column index with columns for each of the AND terms. In this case we create a new index on the "fruit" and "state" columns of FruitsForSale: </p> |
︙ | ︙ | |||
367 368 369 370 371 372 373 | Hence, a good rule of thumb is that your database schema should never contain two indices where one index is a prefix of the other. Drop the index with fewer columns. SQLite will still be able to do efficient lookups with the longer index. </p> <tcl>hd_fragment covidx {covering index} {covering indices}</tcl> | | | 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 | Hence, a good rule of thumb is that your database schema should never contain two indices where one index is a prefix of the other. Drop the index with fewer columns. SQLite will still be able to do efficient lookups with the longer index. </p> <tcl>hd_fragment covidx {covering index} {covering indices}</tcl> <h2> Covering Indices</h2> <p> The "price of California oranges" query was made more efficient through the use of a two-column index. But SQLite can do even better with a three-column index that also includes the "price" column: </p> |
︙ | ︙ | |||
406 407 408 409 410 411 412 | A two-fold performance increase is not nearly as dramatic as the one-million-fold increase seen when the table was first indexed. And for most queries, the difference between 1 microsecond and 2 microseconds is unlikely to be noticed. </p> <tcl>hd_fragment or_in_where or-connected-terms</tcl> | | | 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 | A two-fold performance increase is not nearly as dramatic as the one-million-fold increase seen when the table was first indexed. And for most queries, the difference between 1 microsecond and 2 microseconds is unlikely to be noticed. </p> <tcl>hd_fragment or_in_where or-connected-terms</tcl> <h2> OR-Connected Terms In The WHERE Clause</h2> <p> Multi-column indices only work if the constraint terms in the WHERE clause of the query are connected by AND. So Idx3 and Idx4 are helpful when the search is for items that are both Oranges and grown in California, but neither index would be that useful if we wanted all items that were either oranges |
︙ | ︙ | |||
464 465 466 467 468 469 470 | database engines will do just that. But the performance gain over using just a single index is slight and so SQLite does not implement that technique at this time. However, a future version of SQLite might be enhanced to support AND-by-INTERSECT. </p> <tcl>hd_fragment sorting sorting</tcl> | | 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 | database engines will do just that. But the performance gain over using just a single index is slight and so SQLite does not implement that technique at this time. However, a future version of SQLite might be enhanced to support AND-by-INTERSECT. </p> <tcl>hd_fragment sorting sorting</tcl> <h1> Sorting</h1> <p> SQLite (like all other SQL database engines) can also use indices to satisfy the ORDER BY clauses in a query, in addition to expediting lookup. In other words, indices can be used to speed up sorting as well as searching. </p>
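<p>One quick way (not part of the original discussion) to see whether a query pays for a sort is [EXPLAIN QUERY PLAN]: a plan row reading "USE TEMP B-TREE FOR ORDER BY" indicates a separate sorting pass, and its absence means an index delivered the rows already in order. Using the FruitsForSale table from earlier sections:</p>

<blockquote><pre>
EXPLAIN QUERY PLAN
SELECT * FROM fruitsforsale ORDER BY fruit;
</pre></blockquote>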
︙ | ︙ | |||
499 500 501 502 503 504 505 | full table scan. Furthermore, the entire output is accumulated in temporary storage (which might be either in main memory or on disk, depending on various compile-time and run-time settings) which can mean that a lot of temporary storage is required to complete the query. </p> | | | 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 | full table scan. Furthermore, the entire output is accumulated in temporary storage (which might be either in main memory or on disk, depending on various compile-time and run-time settings) which can mean that a lot of temporary storage is required to complete the query. </p> <h2> Sorting By Rowid</h2> <p> Because sorting can be expensive, SQLite works hard to convert ORDER BY clauses into no-ops. If SQLite determines that output will naturally appear in the order specified, then no sorting is done. So, for example, if you request the output in rowid order, no sorting will be done: |
︙ | ︙ | |||
528 529 530 531 532 533 534 | SQLite will still omit the sorting step. But in order for output to appear in the correct order, SQLite will do the table scan starting at the end and working toward the beginning, rather than starting at the beginning and working toward the end as shown in <a href="#fig17">figure 17</a>. </p> | | | 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 | SQLite will still omit the sorting step. But in order for output to appear in the correct order, SQLite will do the table scan starting at the end and working toward the beginning, rather than starting at the beginning and working toward the end as shown in <a href="#fig17">figure 17</a>. </p> <h2> Sorting By Index</h2> <p> Of course, ordering the output of a query by rowid is seldom useful. Usually one wants to order the output by some other column. </p> <p> |
︙ | ︙ | |||
572 573 574 575 576 577 578 | time, and so this case could go either way depending on the table size and what WHERE clause constraints were available, and so forth. But generally speaking, the indexed sort would probably be chosen, if for no other reason, because it does not need to accumulate the entire result set in temporary storage before sorting and thus uses much less temporary storage. </p> | | 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 | time, and so this case could go either way depending on the table size and what WHERE clause constraints were available, and so forth. But generally speaking, the indexed sort would probably be chosen, if for no other reason, because it does not need to accumulate the entire result set in temporary storage before sorting and thus uses much less temporary storage. </p> <h2> Sorting By Covering Index</h2> <p> If a covering index can be used for a query, then the multiple rowid lookups can be avoided and the cost of the query drops dramatically. </p> <tcl> figure 19 #fig19 obfruitidx4.gif {Sorting With A Covering Index} </tcl> <p> With a covering index, SQLite can simply walk the index from one end to the other and deliver the output in time proportional to N and without having to allocate a large buffer to hold the result set. </p> <h1> Searching And Sorting At The Same Time</h1> <p> The previous discussion has treated searching and sorting as separate topics. But in practice, it is often the case that one wants to search and sort at the same time. Fortunately, it is possible to do this using a single index. </p> <h2> Searching And Sorting With A Multi-Column Index</h2> <p> Suppose we want to find the prices of all kinds of oranges sorted in order of the state where they are grown. The query is this: </p> <tcl>
︙ | ︙ | |||
639 640 641 642 643 644 645 | column also happens to be the first column after the fruit column in the index. So, if we scan entries of the index that have the same value for the fruit column from top to bottom, those index entries are guaranteed to be ordered by the state column. </p> <tcl>hd_fragment {srchsortcovidx}</tcl> | | | 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 | column also happens to be the first column after the fruit column in the index. So, if we scan entries of the index that have the same value for the fruit column from top to bottom, those index entries are guaranteed to be ordered by the state column. </p> <tcl>hd_fragment {srchsortcovidx}</tcl> <h2> Searching And Sorting With A Covering Index</h2> <p> A [covering index] can also be used to search and sort at the same time. Consider the following: </p> <tcl> |
︙ | ︙ | |||
679 680 681 682 683 684 685 | <p> The same basic algorithm is followed, except this time the matching rows of the index are scanned from bottom to top instead of from top to bottom, so that the states will appear in descending order. </p> <tcl>hd_fragment {partialsort} {partial sorting by index} {block sorting}</tcl> | | | 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 | <p> The same basic algorithm is followed, except this time the matching rows of the index are scanned from bottom to top instead of from top to bottom, so that the states will appear in descending order. </p> <tcl>hd_fragment {partialsort} {partial sorting by index} {block sorting}</tcl> <h2> Partial Sorting Using An Index (a.k.a. Block Sorting)</h2> <p> Sometimes only part of an ORDER BY clause can be satisfied using indexes. Consider, for example, the following query: </p> <tcl> |
︙ | ︙ | |||
726 727 728 729 730 731 732 | completes, and well before the table scan is complete. <li>If a LIMIT clause is present, it might be possible to avoid scanning the entire table. </ol> Because of these advantages, SQLite always tries to do a partial sort using an index even if a complete sort by index is not possible.</p> | | 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 | completes, and well before the table scan is complete. <li>If a LIMIT clause is present, it might be possible to avoid scanning the entire table. </ol> Because of these advantages, SQLite always tries to do a partial sort using an index even if a complete sort by index is not possible.</p> <h1> WITHOUT ROWID tables</h1> <p> The basic principles described above apply to both ordinary rowid tables and [WITHOUT ROWID] tables. The only difference is that the rowid column that serves as the key for tables and that appears as the right-most term in indexes is replaced by the PRIMARY KEY. </p>
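<p>As a sketch of that last point, the fruit data could be declared as a [WITHOUT ROWID] table (the table name here is invented): the composite PRIMARY KEY then plays the role that the rowid plays in the diagrams above, serving as the right-most key of the implied index.</p>

<blockquote><pre>
CREATE TABLE FruitsForSale2(
  fruit TEXT,
  state TEXT,
  price REAL,
  PRIMARY KEY(fruit, state)
) WITHOUT ROWID;
</pre></blockquote>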
Changes to pages/rescode.in.
1 2 | <title>SQLite Result Codes</title> <tcl>hd_keywords {result code} {result codes} {error code} {error codes}</tcl> | | > > > | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | <title>SQLite Result Codes</title> <tcl>hd_keywords {result code} {result codes} {error code} {error codes}</tcl> <table_of_contents> <h2 style="margin-left:1.0em" notoc id=overview> Overview</h2> <p> Many of the routines in the SQLite [C-language Interface] return numeric result codes indicating either success or failure, and in the event of a failure, providing some idea of the cause of the failure. This document strives to explain what each of those numeric result codes means. <h1>Result Codes versus Error Codes</h1> <p> "Error codes" are a subset of "result codes" that indicate that something has gone wrong. There are only a few non-error result codes: [SQLITE_OK], [SQLITE_ROW], and [SQLITE_DONE]. The term "error code" means any result code other than these three. <tcl>hd_fragment pve {primary versus extended result codes} \ {*ext-v-prim} </tcl> <h1>Primary Result Codes versus Extended Result Codes</h1> <p> Result codes are signed 32-bit integers. The least significant 8 bits of the result code define a broad category and are called the "primary result code". More significant bits provide more detailed information about the error and are called the "extended result code" |
︙ | ︙ | |||
43 44 45 46 47 48 49 | primary result codes by default. The extended result code for the most recent error can be retrieved using the [sqlite3_extended_errcode()] interface. The [sqlite3_extended_result_codes()] interface can be used to put a [database connection] into a mode where it returns the extended result codes instead of the primary result codes. | | | 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | primary result codes by default. The extended result code for the most recent error can be retrieved using the [sqlite3_extended_errcode()] interface. The [sqlite3_extended_result_codes()] interface can be used to put a [database connection] into a mode where it returns the extended result codes instead of the primary result codes. <h1>Definitions</h1> <p> All result codes are integers. Symbolic names for all result codes are created using "#define" macros in the sqlite3.h header file. There are separate sections in the sqlite3.h header file for the [result code definitions] and the [extended result code definitions]. |
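<p>The following sketch shows the two interfaces together. The database filename and table name are placeholders; [sqlite3_extended_result_codes()] and [sqlite3_extended_errcode()] are the real APIs, and masking with 0xff recovers the primary code from an extended one:</p>

<blockquote><pre>
#include "sqlite3.h"
#include <stdio.h>

int demoExtendedCodes(void){
  sqlite3 *db;
  int rc = sqlite3_open("test.db", &db);      /* placeholder filename */
  if( rc!=SQLITE_OK ) return rc;
  sqlite3_extended_result_codes(db, 1);       /* report extended codes */
  rc = sqlite3_exec(db, "INSERT INTO t1 VALUES(1)", 0, 0, 0);
  if( rc!=SQLITE_OK ){
    int ext = sqlite3_extended_errcode(db);
    printf("extended=%d primary=%d: %s\n", ext, ext&0xff,
           sqlite3_errmsg(db));
  }
  sqlite3_close(db);
  return rc;
}
</pre></blockquote>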
︙ | ︙ | |||
726 727 728 729 730 731 732 | set prim_rc($name) $val } else { incr nExtCode set ext_rc($name) $val } } | > | | < | | | > | | < | | | | | 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 | set prim_rc($name) $val } else { incr nExtCode set ext_rc($name) $val } } </tcl> <h1>Primary Result Code List</h1> <p>The $nPrimCode result codes are [result code definitions|defined in sqlite3.h] and are listed in alphabetical order below: <table border=0 width="100%" cellpadding=10> <tr><td valign="top" align="left"><ul> <tcl> set nrow [expr {($nPrimCode+2)/3}] set i 0 foreach name [lsort [array names prim_rc]] { if {$i==$nrow} { hd_puts "</ul></td><td valign=\"top\" align=\"left\"><ul>\n" set i 0 } incr i hd_resolve "<li> \[$name\] ($prim_rc($name))\n" } hd_puts "</ul></td></tr></table>\n\n" hd_fragment extrc {extended result code} {extended result codes} {extended error code} {extended error codes} </tcl> <h1>Extended Result Code List</h1> <p>The $nExtCode extended result codes are [extended result code definitions|defined in sqlite3.h] and are listed in alphabetical order below: <table border=0 width="100%" cellpadding=10> <tr><td valign="top" align="left"><ul> <tcl> set nrow [expr {($nExtCode+1)/2}] set i 0 foreach name [lsort [array names ext_rc]] { if {$i==$nrow} { hd_puts "</ul></td><td valign=\"top\" align=\"left\"><ul>\n" set i 0 } incr i hd_resolve "<li> \[$name\] ($ext_rc($name))\n" } hd_puts "</ul></td></tr></table>\n\n" </tcl> <h1>Result Code Meanings</h1> <p> The meanings for all $nResCode result code values are shown below, in numeric order. <tcl> # Generate the table of result codes # foreach val [lsort -int [array names valtoname]] { set name $valtoname($val) regsub {sqlite_} [string tolower $name] {} tag hd_puts "<!--------------------------------------------------------------->\n" hd_fragment $tag --override $name hd_puts "<h3>($val) $valtoname($val)</h3>\n" hd_resolve $resdesc($name)\n\n } </tcl>
Changes to pages/sharedcache.in.
1 2 3 4 | <title>SQLite Shared-Cache Mode</title> <tcl>hd_keywords {SQLite Shared-Cache Mode} \ {shared cache} {shared cache mode}</tcl> | | < < < < < | < < < < < < < < < < < < < < < < < < | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 | <title>SQLite Shared-Cache Mode</title> <tcl>hd_keywords {SQLite Shared-Cache Mode} \ {shared cache} {shared cache mode}</tcl> <table_of_contents> <h1>SQLite Shared-Cache Mode</h1> <p>Starting with version 3.3.0, SQLite includes a special "shared-cache" mode (disabled by default) intended for use in embedded servers. If shared-cache mode is enabled and a thread establishes multiple connections to the same database, the connections share a single data and schema cache. This can significantly reduce the quantity of memory and IO required by the system.</p> |
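<p>A minimal sketch of that setup in C. The filename is a placeholder; [sqlite3_enable_shared_cache()] is the real interface, and both connections opened afterwards share a single cache:</p>

<blockquote><pre>
#include "sqlite3.h"

int openSharingPair(sqlite3 **pDb1, sqlite3 **pDb2){
  int rc = sqlite3_enable_shared_cache(1);   /* before any opens */
  if( rc!=SQLITE_OK ) return rc;
  rc = sqlite3_open("test.db", pDb1);        /* placeholder filename */
  if( rc!=SQLITE_OK ) return rc;
  return sqlite3_open("test.db", pDb2);      /* shares cache with pDb1 */
}
</pre></blockquote>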
︙ | ︙ | |||
45 46 47 48 49 50 51 | <p>Shared-cache mode changes the semantics of the locking model in some cases. The details are described by this document. A basic understanding of the normal SQLite locking model (see <a href="lockingv3.html">File Locking And Concurrency In SQLite Version 3</a> for details) is assumed.</p> | | | 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | <p>Shared-cache mode changes the semantics of the locking model in some cases. The details are described by this document. A basic understanding of the normal SQLite locking model (see <a href="lockingv3.html">File Locking And Concurrency In SQLite Version 3</a> for details) is assumed.</p> <h1>Shared-Cache Locking Model</h1> <p>Externally, from the point of view of another process or thread, two or more [sqlite3|database connections] using a shared-cache appear as a single connection. The locking protocol used to arbitrate between multiple shared-caches or regular database users is described elsewhere. </p> |
︙ | ︙ | |||
88 89 90 91 92 93 94 | connections 2 and 3 is described in the remainder of this section. </p> <p>There are three levels to the shared-cache locking model: transaction level locking, table level locking, and schema level locking. They are described in the following three sub-sections.</p> | | 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 | connections 2 and 3 is described in the remainder of this section. </p> <p>There are three levels to the shared-cache locking model: transaction level locking, table level locking, and schema level locking. They are described in the following three sub-sections.</p> <h2>Transaction Level Locking</h2> <p>SQLite connections can open two kinds of transactions, read and write transactions. This is not done explicitly: a transaction is implicitly a read-transaction until it first writes to a database table, at which point it becomes a write-transaction. </p> <p>At most one connection to a single shared cache may open a write transaction at any one time. This may co-exist with any number of read transactions. </p> <h2>Table Level Locking</h2> <p>When two or more connections use a shared-cache, locks are used to serialize concurrent access attempts on a per-table basis. Tables support two types of locks, "read-locks" and "write-locks". Locks are granted to connections - at any one time, each database connection has either a read-lock, write-lock or no lock on each database table. </p> <p>At any one time, a single table may have any number of active read-locks or a single active write lock. To read data from a table, a connection must first obtain a read-lock. To write to a table, a connection must obtain a write-lock on that table. If a required table lock cannot be obtained, the query fails and SQLITE_LOCKED is returned to the caller. </p> <p>Once a connection obtains a table lock, it is not released until the current transaction (read or write) is concluded. </p> <h3>Read-Uncommitted Isolation Mode</h3> <p>The behaviour described above may be modified slightly by using the [read_uncommitted] pragma to change the isolation level from serialized (the default), to read-uncommitted.</p> <p> A database connection in read-uncommitted mode does not attempt to obtain read-locks before reading from database tables as described
︙ | ︙ | |||
153 154 155 156 157 158 159 | */
PRAGMA read_uncommitted = <boolean>;

/* Retrieve the current value of the read-uncommitted flag */
PRAGMA read_uncommitted;
</pre></blockquote>
 | | | | 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 | */
PRAGMA read_uncommitted = <boolean>;

/* Retrieve the current value of the read-uncommitted flag */
PRAGMA read_uncommitted;
</pre></blockquote>

<h2>Schema (sqlite_master) Level Locking</h2>

<p>The <i>sqlite_master</i> table supports shared-cache read and write
locks in the same way as all other database tables (see description
above). The following special rules also apply:
</p>

<ul>
<li>A connection must obtain a read-lock on <i>sqlite_master</i> before
accessing any database tables or obtaining any other read or write
locks.</li>
<li>Before executing a statement that modifies the database schema (i.e.
a CREATE or DROP TABLE statement), a connection must obtain a write-lock on
<i>sqlite_master</i>.
</li>
<li>A connection may not compile an SQL statement if any other connection
is holding a write-lock on the <i>sqlite_master</i> table of any attached
database (including the default database, "main").
</li>
</ul>

<h1>Thread Related Issues</h1>

<p>In SQLite versions 3.3.0 through 3.4.2 when shared-cache mode is enabled,
a database connection could only be used by the thread that called
[sqlite3_open()] to create it. And a connection could only share cache
with another connection in the same thread. These restrictions were dropped
beginning with SQLite version 3.5.0.
</p>

<h1>Shared Cache And Virtual Tables</h1>

<p>
In older versions of SQLite, shared cache mode could not be used
together with virtual tables. This restriction was removed in
SQLite [version 3.6.17].

<h1>Enabling Shared-Cache Mode</h1>

<p>Shared-cache mode is enabled on a per-process basis. Using the C
interface, the following API can be used to globally enable or disable
shared-cache mode:
</p>

<blockquote><pre>
︙ | ︙ | |||
228 229 230 231 232 233 234 | cache sharing behavior of a database connection allows cache sharing to
be controlled in [ATTACH] statements. For example:</p>

<blockquote><pre>
ATTACH 'file:aux.db?cache=shared' AS aux;
</pre></blockquote>
 < | | | 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 | cache sharing behavior of a database connection allows cache sharing to
be controlled in [ATTACH] statements. For example:</p>

<blockquote><pre>
ATTACH 'file:aux.db?cache=shared' AS aux;
</pre></blockquote>

<tcl> hd_fragment inmemsharedcache {in-memory shared-cache} </tcl>
<h1>Shared Cache And In-Memory Databases</h1>

<p>
Beginning with SQLite [version 3.7.13], shared cache can be used on
[in-memory databases], provided that the database is created using
a [URI filename]. For backwards compatibility, shared cache is always
disabled for in-memory databases if the unadorned name ":memory:"
is used to open the database.
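<p>A minimal sketch of the URI form (error checking omitted): two handles
opened with the documented "file::memory:?cache=shared" name refer to the
same shared in-memory database, unlike two independent ":memory:" opens:</p>

<blockquote><pre>
/* Sketch: one shared in-memory database, two connections. */
#include "sqlite3.h"

int main(void){
  sqlite3 *db1, *db2;
  /* SQLITE_OPEN_URI enables URI filename interpretation for these opens */
  int flags = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_URI;

  sqlite3_open_v2("file::memory:?cache=shared", &db1, flags, 0);
  sqlite3_open_v2("file::memory:?cache=shared", &db2, flags, 0);

  /* Tables created through db1 are now visible through db2 */

  sqlite3_close(db2);
  sqlite3_close(db1);
  return 0;
}
</pre></blockquote>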
︙ | ︙ |
Changes to pages/spellfix1.in.
|
| | > | > | 1 2 3 4 5 6 7 8 9 10 11 12 13 | <title>The Spellfix1 Virtual Table</title> <tcl> hd_keywords {spellfix1}</tcl> <table_of_contents> <h1>Overview</h1> <p>This spellfix1 [virtual table] can be used to search a large vocabulary for close matches. For example, spellfix1 can be used to suggest corrections to misspelled words. Or, it could be used with [FTS4] to do full-text search using potentially misspelled words. |
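<p>The following sketch shows the typical usage pattern. The table name
"demo" and the sample vocabulary are hypothetical, and the sketch assumes
a build in which the spellfix1 extension is already available:</p>

<blockquote><pre>
#include <stdio.h>
#include "sqlite3.h"

/* Print the first column of each result row */
static int print_word(void *pArg, int nCol, char **azVal, char **azCol){
  printf("%s\n", azVal[0]);
  return 0;
}

int main(void){
  sqlite3 *db;
  sqlite3_open(":memory:", &db);
  sqlite3_exec(db, "CREATE VIRTUAL TABLE demo USING spellfix1", 0, 0, 0);
  sqlite3_exec(db, "INSERT INTO demo(word) VALUES"
                   "('kennesaw'),('kennestone'),('kenosha')", 0, 0, 0);
  /* Search the vocabulary for close matches to a misspelling */
  sqlite3_exec(db, "SELECT word FROM demo WHERE word MATCH 'kenesaw'",
               print_word, 0, 0);
  sqlite3_close(db);
  return 0;
}
</pre></blockquote>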
︙ | ︙ | |||
90 91 92 93 94 95 96 | kennestone
kenneson
kenneys
keanes
keenes
</pre></blockquote>
 | | 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 | kennestone
kenneson
kenneys
keanes
keenes
</pre></blockquote>

<h1>Search Refinements</h1>

<p>By default, the spellfix1 table returns no more than 20 results.
(It might return fewer than 20 if there were fewer good matches.)
You can change the upper bound on the number of returned rows by adding
a "top=N" term to the WHERE clause of your query, where N is the new
maximum. For example, to see the 5 best matches:
︙ | ︙ | |||
132 133 134 135 136 137 138 | <p>Note that if you do not include the "langid=N" term in the WHERE
clause, the search will be against language 0 (English in the example
above.) All spellfix1 searches are against a single language id.
There is no way to search all languages at once.
 | | 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 | <p>Note that if you do not include the "langid=N" term in the WHERE
clause, the search will be against language 0 (English in the example
above.) All spellfix1 searches are against a single language id.
There is no way to search all languages at once.

<h1>Virtual Table Details</h1>

<p>Each row in the spellfix1 virtual table has a unique rowid and seven
columns, plus five extra hidden columns. The columns are as follows:

<dl>
<dt><p><b>rowid</b><dd>
A unique integer number associated with each
vocabulary item in the table. This can be used
as a foreign key on other tables in the database.

<dt><p><b>word</b><dd>
The text of the word that matches the pattern.
︙ | ︙ | |||
221 222 223 224 225 226 227 | <dt><p><b>command</b><dd>
(HIDDEN) The value of the "command" column is always NULL. However,
applications can insert special strings into the "command" column in order
to provoke certain behaviors in the spellfix1 virtual table.
For example, inserting the string 'reset' into the "command" column
will cause the virtual table to reread its edit distance weights
(if there are any).
 | | | 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 | <dt><p><b>command</b><dd>
(HIDDEN) The value of the "command" column is always NULL. However,
applications can insert special strings into the "command" column in order
to provoke certain behaviors in the spellfix1 virtual table.
For example, inserting the string 'reset' into the "command" column
will cause the virtual table to reread its edit distance weights
(if there are any).
</dl>

<h1>Algorithm</h1>

<p>The spellfix1 virtual table creates a single shadow table named
"%_vocab" (where the % is replaced by the name of the virtual table;
Ex: "demo_vocab" for the "demo" virtual table). The shadow table contains
the following columns:

<dl>
<dt><p><b>id</b><dd>
The unique id (INTEGER PRIMARY KEY)

<dt><p><b>rank</b><dd>
The rank of word.

<dt><p><b>langid</b><dd>
︙ | ︙ | |||
264 265 266 267 268 269 270 | single symbol "A". And the letters "p", "b", "f", and "v" all become
"B". All nasal sounds are represented as "N". And so forth. The mapping
is based on ideas found in Soundex, Metaphone, and other long-standing
phonetic matching systems. This key can be generated by the function
spellfix1_phonehash(X). Hence: k2 = spellfix1_phonehash(k1)
 | | 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 | single symbol "A". And the letters "p", "b", "f", and "v" all become
"B". All nasal sounds are represented as "N". And so forth. The mapping
is based on ideas found in Soundex, Metaphone, and other long-standing
phonetic matching systems. This key can be generated by the function
spellfix1_phonehash(X). Hence: k2 = spellfix1_phonehash(k1)
</dl>

<p>There is also a function for computing the Wagner edit distance or the
Levenshtein distance between a pattern and a word. This function
is exposed as spellfix1_editdist(X,Y). The edit distance function
returns the "cost" of converting X into Y. Some transformations
cost more than others. Changing one vowel into a different vowel,
for example, is relatively cheap, as is doubling a consonant, or
︙ | ︙ | |||
321 322 323 324 325 326 327 | <p>Only terms of the vocabulary with a matching langid are searched. Hence, the same table can contain entries from multiple languages and only the requested language will be used. The default langid is 0. <tcl>hd_fragment configeditdist {configurable edit distances}</tcl> | | | 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 | <p>Only terms of the vocabulary with a matching langid are searched. Hence, the same table can contain entries from multiple languages and only the requested language will be used. The default langid is 0. <tcl>hd_fragment configeditdist {configurable edit distances}</tcl> <h1>Configurable Edit Distance</h1> <p>The built-in Wagner edit-distance function with fixed weights can be replaced by the [editdist3()] edit-distance function with application-defined weights and support for unicode, by specifying the "edit_cost_table=<i>TABLENAME</i>" parameter to the spellfix1 module when the virtual table is created. For example: |
︙ | ︙ | |||
361 362 363 364 365 366 367 | APPCOST table. Hence, applications should run a SQL statement similar to the following when changes to the APPCOST table occur: <blockquote> INSERT INTO demo2(command) VALUES("reset"); </blockquote> | | | 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 | APPCOST table. Hence, applications should run a SQL statement similar to the following when changes to the APPCOST table occur: <blockquote> INSERT INTO demo2(command) VALUES("reset"); </blockquote> <h1>Dealing With Unusual And Difficult Spellings</h1> <p>The algorithm above works quite well for most cases, but there are exceptions. These exceptions can be dealt with by making additional entries in the virtual table using the "soundslike" column. <p>For example, many words of Greek origin begin with letters "ps" where the "p" is silent. Ex: psalm, pseudonym, psoriasis, psyche. In |
︙ | ︙ | |||
413 414 415 416 417 418 419 | <li>"Tch" sounds in Slavic words: Tchaikovsky vs. Chaykovsky
<li>The letter "j" pronounced like "h" in Spanish: LaJolla
<li>Words beginning with "wr" versus "r": write vs. rite
<li>Miscellaneous problem words such as "debt", "tsetse", "Nguyen",
"Van Nuys".
</ul>
 | | | | 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 | <li>"Tch" sounds in Slavic words: Tchaikovsky vs. Chaykovsky
<li>The letter "j" pronounced like "h" in Spanish: LaJolla
<li>Words beginning with "wr" versus "r": write vs. rite
<li>Miscellaneous problem words such as "debt", "tsetse", "Nguyen",
"Van Nuys".
</ul>

<h1>Auxiliary Functions</h1>

<p>The source code module that implements the spellfix1 virtual table
also implements several SQL functions that might be useful to applications
that employ spellfix1 or for testing or diagnostic work while developing
applications that use spellfix1. The following auxiliary functions are
available:

<dl>
<dt><p><b>editdist3(P,W)<br>editdist3(P,W,L)<br>editdist3(T)</b><dd>
These routines provide direct access to the version of the Wagner
edit-distance function that allows for application-defined weights
on edit operations. The first two forms of this function compare pattern
P against word W and return the edit distance. In the first
function, the langid is assumed to be 0 and in the second, the
langid is given by the L parameter. The third form of this function
︙ | ︙ | |||
459 460 461 462 463 464 465 | <dt><p><b>spellfix1_translit(X)</b><dd> This routine transliterates unicode text into pure ascii, returning the pure ascii representation of the input text X. This is the function that is used internally to transform vocabulary words into the K1 column of the shadow table. | | | | | 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 | <dt><p><b>spellfix1_translit(X)</b><dd> This routine transliterates unicode text into pure ascii, returning the pure ascii representation of the input text X. This is the function that is used internally to transform vocabulary words into the K1 column of the shadow table. </dl> <tcl>hd_fragment editdist3 editdist3</tcl> <h1>The editdist3 function</h1> <p>The editdist3 algorithm is a function that computes the minimum edit distance (a.k.a. the Levenshtein distance) between two input strings. The editdist3 algorithm is a configurable alternative to the default edit distance function of spellfix1. Features of editdist3 include: <ul> <li><p>It works with unicode (UTF8) text. <li><p>A table of insertion, deletion, and substitution costs can be provided by the application. <li><p>Multi-character insertions, deletions, and substitutions can be enumerated in the cost table. </ul> <h1>The editdist3 COST table</h1> <p>To program the costs of editdist3, create a table such as the following: <blockquote><pre> CREATE TABLE editcost( iLang INT, -- The language ID cFrom TEXT, -- Convert text from this |
︙ | ︙ | |||
523 524 525 526 527 528 529 | <p>In the spellfix1 algorithm, cFrom is the text as the user entered it
and cTo is the correctly spelled text as it exists in the database.
The goal of the editdist3 algorithm is to determine how close the
user-entered text is to the dictionary text.

<p>There are three special-case entries in the cost table:
 | | 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 | <p>In the spellfix1 algorithm, cFrom is the text as the user entered it
and cTo is the correctly spelled text as it exists in the database.
The goal of the editdist3 algorithm is to determine how close the
user-entered text is to the dictionary text.

<p>There are three special-case entries in the cost table:

<table border=1 align=center>
<tr><th>cFrom</th><th>cTo</th><th>Meaning</th></tr>
<tr><td>''</td><td>'?'</td><td>The default insertion cost</td></tr>
<tr><td>'?'</td><td>''</td><td>The default deletion cost</td></tr>
<tr><td>'?'</td><td>'?'</td><td>The default substitution cost</td></tr>
</table>

<p>If any of the special-case entries shown above are omitted, then the
︙ | ︙ | |||
559 560 561 562 563 564 565 | INSERT INTO editcost(iLang, cFrom, cTo, iCost)
VALUES(0, 'ss', 'ß', 8);
</pre></blockquote>

<p>The number of characters in cFrom and cTo does not need to be the
same. The rule above says that "ss" on user input will match "ß"
with a penalty of 8.
 | | 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 | INSERT INTO editcost(iLang, cFrom, cTo, iCost)
VALUES(0, 'ss', 'ß', 8);
</pre></blockquote>

<p>The number of characters in cFrom and cTo does not need to be the
same. The rule above says that "ss" on user input will match "ß"
with a penalty of 8.

<h1>Experimenting with the editdist3() function</h1>

<p>The spellfix1 virtual table uses editdist3 if the
"edit_cost_table=TABLE" option is specified as an argument when the
spellfix1 virtual table is created. But editdist3 can also be tested
directly using the built-in "editdist3()" SQL function. The
editdist3() SQL function has 3 forms:
︙ | ︙ |
Changes to pages/tempfiles.in.
1 2 3 | <title>Temporary Files Used By SQLite</title>
<tcl>hd_keywords {temporary disk files}</tcl>
 | | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 | <title>Temporary Files Used By SQLite</title>
<tcl>hd_keywords {temporary disk files}</tcl>

<table_of_contents>

<h1>Introduction</h1>

<p>
One of the <a href="different.html">distinctive features</a> of
SQLite is that a database consists of a single disk file.
This simplifies the use of SQLite since moving or backing up a
database is as simple as copying a single file. It also
makes SQLite appropriate for use as an
︙ | ︙ | |||
34 35 36 37 38 39 40 | will use temporary files in the same way. New kinds of temporary files might be employed and some of the current temporary file uses might be discontinued in future releases of SQLite. </p> <tcl>hd_fragment types</tcl> | | | 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | will use temporary files in the same way. New kinds of temporary files might be employed and some of the current temporary file uses might be discontinued in future releases of SQLite. </p> <tcl>hd_fragment types</tcl> <h1>Nine Kinds Of Temporary Files</h1> <p> SQLite currently uses nine distinct types of temporary files: </p> <ol> <li>Rollback journals</li> |
︙ | ︙ | |||
58 59 60 61 62 63 64 | <p> Additional information about each of these temporary file types is in the sequel. </p> <tcl>hd_fragment rollbackjrnl</tcl> | | | 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 | <p> Additional information about each of these temporary file types is in the sequel. </p> <tcl>hd_fragment rollbackjrnl</tcl> <h2>Rollback Journals</h2> <p> A rollback journal is a temporary file used to implement atomic commit and rollback capabilities in SQLite. (For a detailed discussion of how this works, see the separate document titled <a href="atomiccommit.html">Atomic Commit In SQLite</a>.) |
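<p>The journaling behavior can be inspected and changed from code. A
brief sketch (the filename "test.db" is a placeholder; the settings
shown are the documented journal_mode values):</p>

<blockquote><pre>
#include <stdio.h>
#include "sqlite3.h"

static int show(void *pArg, int nCol, char **azVal, char **azCol){
  printf("journal_mode = %s\n", azVal[0]);
  return 0;
}

int main(void){
  sqlite3 *db;
  sqlite3_open("test.db", &db);
  sqlite3_exec(db, "PRAGMA journal_mode", show, 0, 0); /* "delete" by default */
  /* Other settings: TRUNCATE, PERSIST, MEMORY, WAL, OFF */
  sqlite3_exec(db, "PRAGMA journal_mode=TRUNCATE", show, 0, 0);
  sqlite3_close(db);
  return 0;
}
</pre></blockquote>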
︙ | ︙ | |||
145 146 147 148 149 150 151 | memory rather than on disk. The ROLLBACK command still works when
the journal mode is MEMORY, but because no file exists on disk for
recovery, a crash or power loss in the middle of a transaction that
uses the MEMORY journal mode will likely result in a corrupt database.
</p>

<tcl>hd_fragment walfile</tcl>
 | | 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 | memory rather than on disk. The ROLLBACK command still works when
the journal mode is MEMORY, but because no file exists on disk for
recovery, a crash or power loss in the middle of a transaction that
uses the MEMORY journal mode will likely result in a corrupt database.
</p>

<tcl>hd_fragment walfile</tcl>
<h2>Write-Ahead Log (WAL) Files</h2>

<p>
A write-ahead log or WAL file is used in place of a rollback journal
when SQLite is operating in [WAL mode]. As with the rollback journal,
the purpose of the WAL file is to implement atomic commit and rollback.
The WAL file is always located in the same directory as the
database file and has the same name as the database file except with
the 4 characters "<b>-wal</b>" appended.
The WAL file is created when the first connection to the database
is opened and is normally removed when the last
connection to the database closes. However, if the last connection
does not shut down cleanly, the WAL file will remain in the filesystem
and will be automatically cleaned up the next time the database is
opened.
</p>

<tcl>hd_fragment shmfile</tcl>
<h2>Shared-Memory Files</h2>

<p>
When operating in [WAL mode], all SQLite database connections associated
with the same database file need to share some memory that is used as an
index for the WAL file. In most implementations, this shared memory is
implemented by calling mmap() on a file created for this sole purpose:
the shared-memory file. The shared-memory file, if it exists, is located
︙ | ︙ | |||
198 199 200 201 202 203 204 | The shared-memory file is created when the WAL file is created and is deleted when the WAL file is deleted. During WAL file recovery, the shared memory file is recreated from scratch based on the contents of the WAL file being recovered. </p> <tcl>hd_fragment masterjrnl {master journal}</tcl> | | | 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 | The shared-memory file is created when the WAL file is created and is deleted when the WAL file is deleted. During WAL file recovery, the shared memory file is recreated from scratch based on the contents of the WAL file being recovered. </p> <tcl>hd_fragment masterjrnl {master journal}</tcl> <h2>Master Journal Files</h2> <p> The master journal file is used as part of the atomic commit process when a single transaction makes changes to multiple databases that have been added to a single [database connection] using the [ATTACH] statement. The master journal file is always located in the same directory as the main database file |
︙ | ︙ | |||
252 253 254 255 256 257 258 | some files might roll back while others roll forward after power
is restored.
</p>

<tcl>hd_fragment stmtjrnl {statement journal} {statement journals} \
{Statement journals}</tcl>
 | | 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 | some files might roll back while others roll forward after power
is restored.
</p>

<tcl>hd_fragment stmtjrnl {statement journal} {statement journals} \
{Statement journals}</tcl>
<h2>Statement Journal Files</h2>

<p>
A statement journal file is used to roll back partial results of a
single statement within a larger transaction. For example, suppose an
UPDATE statement will attempt to modify 100 rows in the database. But
after modifying the first 50 rows, the UPDATE hits a constraint violation
which should block the entire statement.
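<p>The effect is easy to reproduce. In the sketch below (the table and
values are hypothetical), the UPDATE collides with a UNIQUE constraint
part-way through; the statement journal rolls back only that statement,
and the enclosing transaction remains open:</p>

<blockquote><pre>
#include <stdio.h>
#include "sqlite3.h"

int main(void){
  sqlite3 *db;
  int rc;
  sqlite3_open(":memory:", &db);
  sqlite3_exec(db, "CREATE TABLE t1(a INTEGER PRIMARY KEY, b UNIQUE)", 0,0,0);
  sqlite3_exec(db, "INSERT INTO t1 VALUES(1,10),(2,20),(3,21)", 0, 0, 0);
  sqlite3_exec(db, "BEGIN", 0, 0, 0);

  /* Row 1 is changed from 10 to 11 first; then changing 20 to 21
  ** collides with row 3.  The statement journal undoes the change
  ** already made to row 1, but the transaction stays open. */
  rc = sqlite3_exec(db, "UPDATE t1 SET b=b+1", 0, 0, 0);
  printf("update rc=%d (SQLITE_CONSTRAINT=%d)\n", rc, SQLITE_CONSTRAINT);

  /* The transaction itself is still active and can be committed */
  sqlite3_exec(db, "COMMIT", 0, 0, 0);
  sqlite3_close(db);
  return 0;
}
</pre></blockquote>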
︙ | ︙ | |||
301 302 303 304 305 306 307 | deleted at the conclusion of the transaction. The size of the statement journal is proportional to the size of the change implemented by the UPDATE or INSERT statement that caused the statement journal to be created. </p> <tcl>hd_fragment tempdb</tcl> | | | 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 | deleted at the conclusion of the transaction. The size of the statement journal is proportional to the size of the change implemented by the UPDATE or INSERT statement that caused the statement journal to be created. </p> <tcl>hd_fragment tempdb</tcl> <h2>TEMP Databases</h2> <p>Tables created using the "CREATE TEMP TABLE" syntax are only visible to the [database connection] in which the "CREATE TEMP TABLE" statement is originally evaluated. These TEMP tables, together with any associated indices, triggers, and views, are collectively stored in a separate temporary database file that is created as soon as the first "CREATE TEMP TABLE" statement is seen. |
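<p>A short sketch of the visibility rule (the filename and table name
are placeholders): a TEMP table created on one connection is invisible
to a second connection on the same database file:</p>

<blockquote><pre>
#include "sqlite3.h"

int main(void){
  sqlite3 *db1, *db2;
  sqlite3_open("test.db", &db1);
  sqlite3_open("test.db", &db2);

  /* The first CREATE TEMP TABLE creates db1's private temp database */
  sqlite3_exec(db1, "CREATE TEMP TABLE scratch(x)", 0, 0, 0);
  sqlite3_exec(db1, "INSERT INTO scratch VALUES(1)", 0, 0, 0);

  /* This fails with "no such table: scratch" - TEMP tables are
  ** visible only to the connection that created them. */
  sqlite3_exec(db2, "SELECT x FROM scratch", 0, 0, 0);

  sqlite3_close(db2);
  sqlite3_close(db1);
  return 0;
}
</pre></blockquote>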
︙ | ︙ | |||
336 337 338 339 340 341 342 | <p>
The temporary files associated with the TEMP database and its
rollback journal are only created if the application makes use of
the "CREATE TEMP TABLE" statement.
</p>

<tcl>hd_fragment views</tcl>
 | | 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 | <p>
The temporary files associated with the TEMP database and its
rollback journal are only created if the application makes use of
the "CREATE TEMP TABLE" statement.
</p>

<tcl>hd_fragment views</tcl>
<h2>Materializations Of Views And Subqueries</h2>

<p>Queries that contain subqueries must sometimes evaluate
the subqueries separately and store the results in a temporary
table, then use the content of the temporary table to evaluate
the outer query. We call this
"materializing" the subquery.
The query optimizer in SQLite attempts to avoid materializing,
︙ | ︙ | |||
442 443 444 445 446 447 448 | or not the subquery or outer query contain aggregate functions, ORDER BY or GROUP BY clauses, LIMIT clauses, and so forth. The rules for when a query can and cannot be flattened are very complex and are beyond the scope of this document. </p> <tcl>hd_fragment transidx</tcl> | | | 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 | or not the subquery or outer query contain aggregate functions, ORDER BY or GROUP BY clauses, LIMIT clauses, and so forth. The rules for when a query can and cannot be flattened are very complex and are beyond the scope of this document. </p> <tcl>hd_fragment transidx</tcl> <h2>Transient Indices</h2> <p> SQLite may make use of transient indices to implement SQL language features such as: </p> <ul> |
︙ | ︙ | |||
524 525 526 527 528 529 530 | <p> Note that the UNION ALL operator for compound queries does not use transient indices by itself (though of course the right and left subqueries of the UNION ALL might use transient indices depending on how they are composed.) <tcl>hd_fragment vacuumdb</tcl> | | | | 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 | <p> Note that the UNION ALL operator for compound queries does not use transient indices by itself (though of course the right and left subqueries of the UNION ALL might use transient indices depending on how they are composed.) <tcl>hd_fragment vacuumdb</tcl> <h2>Transient Database Used By [VACUUM]</h2> <p> The [VACUUM] command works by creating a temporary file and then rebuilding the entire database into that temporary file. Then the content of the temporary file is copied back into the original database file and the temporary file is deleted. </p> <p> The temporary file created by the [VACUUM] command exists only for the duration of the command itself. The size of the temporary file will be no larger than the original database. </p> <tcl>hd_fragment tempstore *tempstore</tcl> <h1>The SQLITE_TEMP_STORE Compile-Time Parameter and Pragma</h1> <p> The temporary files associated with transaction control, namely the rollback journal, master journal, write-ahead log (WAL) files, and shared-memory files, are always written to disk. But the other kinds of temporary files might be stored in memory only and never written to disk. |
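<p>For example, a connection that wants the lesser temporary files kept
in memory can say so with the [temp_store pragma] (a sketch; the
filename is a placeholder):</p>

<blockquote><pre>
#include "sqlite3.h"

int main(void){
  sqlite3 *db;
  sqlite3_open("test.db", &db);
  /* Accepted values: DEFAULT (0), FILE (1), MEMORY (2) */
  sqlite3_exec(db, "PRAGMA temp_store=MEMORY", 0, 0, 0);
  /* ... TEMP tables and transient indices now live in memory ... */
  sqlite3_close(db);
  return 0;
}
</pre></blockquote>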
︙ | ︙ | |||
630 631 632 633 634 635 636 | and the master journal. The rollback journal and the master journal are always written to disk regardless of the settings of the [SQLITE_TEMP_STORE] compile-time parameter and the [temp_store pragma]. </p> <tcl>hd_fragment otheropt</tcl> | | | 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 | and the master journal. The rollback journal and the master journal are always written to disk regardless of the settings of the [SQLITE_TEMP_STORE] compile-time parameter and the [temp_store pragma]. </p> <tcl>hd_fragment otheropt</tcl> <h1>Other Temporary File Optimizations</h1> <p> SQLite uses a page cache of recently read and written database pages. This page cache is used not just for the main database file but also for transient indices and tables stored in temporary files. If SQLite needs to use a temporary index or table and the [SQLITE_TEMP_STORE] compile-time parameter and the |
︙ | ︙ | |||
666 667 668 669 670 671 672 | same for every temporary table and index. The value cannot be changed at run-time or on a per-table or per-index basis. Each temporary file gets its own private page cache with its own SQLITE_DEFAULT_TEMP_CACHE_SIZE page limit. </p> <tcl>hd_fragment tempdir {temporary directory search algorithm}</tcl> | | | 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 | same for every temporary table and index. The value cannot be changed at run-time or on a per-table or per-index basis. Each temporary file gets its own private page cache with its own SQLITE_DEFAULT_TEMP_CACHE_SIZE page limit. </p> <tcl>hd_fragment tempdir {temporary directory search algorithm}</tcl> <h1>Temporary File Storage Locations</h1> <p> The directory or folder in which temporary files are created is determined by the OS-specific [VFS]. <p> On unix-like systems, directories are searched in the following order: |
︙ | ︙ |
Changes to pages/testing.in.
1 2 3 4 5 6 7 8 9 10 | <title>How SQLite Is Tested</title> <tcl>hd_keywords testing *tested {test suite}</tcl> <tcl> # This document contains many size statistics about SQLite, statistics # that change frequently. We want the document to be up-to-date. To # facilitate that, all the size values are defined by variables here # which are then used as needed through the document. # # NOTE: Also update the version number in the text!!! | > > | 1 2 3 4 5 6 7 8 9 10 11 12 | <title>How SQLite Is Tested</title> <tcl>hd_keywords testing *tested {test suite}</tcl> <table_of_contents> <tcl> # This document contains many size statistics about SQLite, statistics # that change frequently. We want the document to be up-to-date. To # facilitate that, all the size values are defined by variables here # which are then used as needed through the document. # # NOTE: Also update the version number in the text!!! |
︙ | ︙ | |||
95 96 97 98 99 100 101 | } proc version {} { hd_puts $::stat(version) } </tcl> | < < | | | | 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 | } proc version {} { hd_puts $::stat(version) } </tcl> <h1>Introduction</h1> <p>The reliability and robustness of SQLite is achieved in part by thorough and careful testing.</p> <p>As of [version 3.12.0], the SQLite library consists of approximately <tcl>KB {$stat(coreSLOC)}</tcl> KSLOC of C code. (KSLOC means thousands of "Source Lines Of Code" or, in other words, lines of code excluding blank lines and comments.) By comparison, the project has <tcl> hd_puts "[expr {int($stat(totalSLOC)/$stat(coreSLOC))}] times as much" </tcl> test code and test scripts - <tcl>KB {$stat(totalSLOC)}</tcl> KSLOC.</p> <h2>Executive Summary</h2> <ul> <li> Three independently developed test harnesses <li> 100% branch test coverage in an as-deployed configuration <li> Millions and millions of test cases <li> Out-of-memory tests <li> I/O error tests <li> Crash and power loss tests <li> Fuzz tests <li> Boundary value tests <li> Disabled optimization tests <li> Regression tests <li> Malformed database tests <li> Extensive use of assert() and run-time checks <li> Valgrind analysis <li> Undefined behavior checks <li> Checklists </ul> <tcl>hd_fragment {harnesses} {test harness} {three test harnesses}</tcl> <h1>Test Harnesses</h1> <p>There are three independent test harnesses used for testing the core SQLite library. Each test harness is designed, maintained, and managed separately from the others. </p> |
︙ | ︙ | |||
220 221 222 223 224 225 226 | <tcl>KB {$stat(vqNEval)}</tcl> thousand test cases.
The veryquick tests include most tests other than the anomaly, fuzz, and
soak tests. The idea behind the veryquick tests is that they are
sufficient to catch most errors, but also run in only a few minutes
instead of a few hours.</p>

<tcl>hd_fragment anomoly</tcl>
 | | 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 | <tcl>KB {$stat(vqNEval)}</tcl> thousand test cases.
The veryquick tests include most tests other than the anomaly, fuzz, and
soak tests. The idea behind the veryquick tests is that they are
sufficient to catch most errors, but also run in only a few minutes
instead of a few hours.</p>

<tcl>hd_fragment anomoly</tcl>
<h1>Anomaly Testing</h1>

<p>Anomaly tests are tests designed to verify the correct behavior
of SQLite when something goes wrong. It is (relatively) easy to build
an SQL database engine that behaves correctly on well-formed inputs
on a fully functional computer. It is more difficult to build a system
that responds sanely to invalid inputs and continues to function following
system malfunctions. The anomaly tests are designed to verify the latter
behavior.</p>

<tcl>hd_fragment oomtesting</tcl>
<h2>Out-Of-Memory Testing</h2>

<p>SQLite, like all SQL database engines, makes extensive use of
malloc() (See the separate report on
[memory allocation | dynamic memory allocation in SQLite] for
additional detail.)
On servers and workstations, malloc() never fails in practice and so
correct handling of out-of-memory (OOM) errors is not particularly
important.
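<p>The general shape of such an instrumented allocator can be sketched
using the documented [sqlite3_config()] interface. The countdown logic
and the names here are illustrative only, not the actual test-harness
code:</p>

<blockquote><pre>
/* Sketch of an OOM-injecting allocator: the iFail-th allocation fails.
** Must be installed before sqlite3_initialize() is called. */
#include "sqlite3.h"

static sqlite3_mem_methods defaultMem;  /* saved default allocator */
static int nCountdown = 0;              /* fail when this reaches zero */

static void *oomMalloc(int nByte){
  if( nCountdown>0 && (--nCountdown)==0 ) return 0;  /* simulated OOM */
  return defaultMem.xMalloc(nByte);
}

/* Install the wrapper.  A real harness would wrap xRealloc too. */
void oom_setup(int iFail){
  static sqlite3_mem_methods oomMem;
  sqlite3_config(SQLITE_CONFIG_GETMALLOC, &defaultMem);
  oomMem = defaultMem;
  oomMem.xMalloc = oomMalloc;
  nCountdown = iFail;
  sqlite3_config(SQLITE_CONFIG_MALLOC, &oomMem);
}
</pre></blockquote>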
︙ | ︙ | |||
263 264 265 266 267 268 269 | repeated. The loop continues until the entire operation runs to completion without ever encountering a simulated OOM failure. Tests like this are run twice, once with the instrumented malloc set to fail only once, and again with the instrumented malloc set to fail continuously after the first failure.</p> <tcl>hd_fragment ioerrtesting</tcl> | | | 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 | repeated. The loop continues until the entire operation runs to completion without ever encountering a simulated OOM failure. Tests like this are run twice, once with the instrumented malloc set to fail only once, and again with the instrumented malloc set to fail continuously after the first failure.</p> <tcl>hd_fragment ioerrtesting</tcl> <h2>I/O Error Testing</h2> <p>I/O error testing seeks to verify that SQLite responds sanely to failed I/O operations. I/O errors might result from a full disk drive, malfunctioning disk hardware, network outages when using a network file system, system configuration or permission changes that occur in the middle of an SQL operation, or other hardware or operating system malfunctions. Whatever the cause, it is important that SQLite be able |
︙ | ︙ | |||
294 295 296 297 298 299 300 | <p>In I/O error tests, after the I/O error simulation failure mechanism
is disabled, the database is examined using [PRAGMA integrity_check] to
make sure that the I/O error has not introduced database corruption.</p>

<tcl>hd_fragment crashtesting</tcl>
 | | 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 | <p>In I/O error tests, after the I/O error simulation failure mechanism
is disabled, the database is examined using [PRAGMA integrity_check] to
make sure that the I/O error has not introduced database corruption.</p>

<tcl>hd_fragment crashtesting</tcl>
<h2>Crash Testing</h2>

<p>Crash testing seeks to demonstrate that an SQLite database will not
go corrupt if the application or operating system crashes or if there
is a power failure in the middle of a database update. A separate
white-paper titled
<a href="atomiccommit.html">Atomic Commit in SQLite</a> describes the
defensive measures SQLite takes to prevent database corruption following
︙ | ︙ | |||
339 340 341 342 343 344 345 | one expects to see following a power loss. Then the database is opened and checks are made to ensure that it is well-formed and that the transaction either ran to completion or was completely rolled back. The interior of the loop is repeated multiple times for each snapshot with different random damage each time.</p> <tcl>hd_fragment multifail</tcl> | | | | | | 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 | one expects to see following a power loss. Then the database is opened and checks are made to ensure that it is well-formed and that the transaction either ran to completion or was completely rolled back. The interior of the loop is repeated multiple times for each snapshot with different random damage each time.</p> <tcl>hd_fragment multifail</tcl> <h2>Compound failure tests</h2> <p>The test suites for SQLite also explore the result of stacking multiple failures. For example, tests are run to ensure correct behavior when an I/O error or OOM fault occurs while trying to recover from a prior crash. <tcl>hd_fragment fuzztesting {fuzz testing} {SQL fuzzing}</tcl> <h1>Fuzz Testing</h1> <p>[http://en.wikipedia.org/wiki/Fuzz_testing | Fuzz testing] seeks to establish that SQLite responds correctly to invalid, out-of-range, or malformed inputs.</p> <h2>SQL Fuzz</h2> <p>SQL fuzz testing consists of creating syntactically correct yet wildly nonsensical SQL statements and feeding them to SQLite to see what it will do with them. Usually some kind of error is returned (such as "no such table"). Sometimes, purely by chance, the SQL statement also happens to be semantically correct. In that case, the resulting prepared statement is run to make sure it gives a reasonable result.</p> <p>The SQL fuzz generator tests are part of the TCL test suite. During a full test run, about <tcl>KB {$stat(nSqlFuzz)}</tcl> thousand fuzz SQL statements are generated and tested.</p> <tcl>hd_fragment aflfuzz {American Fuzzy Lop fuzzer}</tcl> <h3>SQL Fuzz Using The American Fuzzy Lop Fuzzer</h3> <p>The <a href="http://lcamtuf.coredump.cx/afl/">American Fuzzy Lop</a> or "AFL" fuzzer is a recent (circa 2014) innovation from Michal Zalewski. Unlike most other fuzzers that blindly generate random inputs, the AFL fuzzer instruments the program being tested (by modifying the assembly-language output from the C compiler) and uses that instrumentation to detect when an input causes the program to do something different - to follow |
︙ | ︙ | |||
397 398 399 400 401 402 403 | Both SQL statements and database files are fuzzed.
Billions and billions of mutations have been tried, but AFL's instrumentation
has narrowed them down to fewer than 50,000 test cases that cover all
distinct behaviors. Newly discovered test cases are periodically captured
and added to the [TCL test suite] where they can be rerun using the
"make fuzztest" or "make valgrindfuzz" commands.
 | | | | | | 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 | Both SQL statements and database files are fuzzed.
Billions and billions of mutations have been tried, but AFL's instrumentation
has narrowed them down to fewer than 50,000 test cases that cover all
distinct behaviors. Newly discovered test cases are periodically captured
and added to the [TCL test suite] where they can be rerun using the
"make fuzztest" or "make valgrindfuzz" commands.

<h2>Malformed Database Files</h2>

<p>There are numerous test cases that verify that SQLite is able to deal
with malformed database files. These tests first build a well-formed
database file, then add corruption by changing one or more bytes in the
file by some means other than SQLite. Then SQLite is used to read the
database. In some cases, the byte changes are in the middle of data.
This causes the content of the database to change while keeping the
database well-formed. In other cases, unused bytes of the file are
modified, which has no effect on the integrity of the database. The
interesting cases are when bytes of the file that define database
structure get changed. The malformed database tests verify that SQLite
finds the file format errors and reports them using the [SQLITE_CORRUPT]
return code without overflowing buffers, dereferencing NULL pointers,
or performing other unwholesome actions.</p>

<h2>Boundary Value Tests</h2>

<p>SQLite defines certain [limits] on its operation, such as the maximum
number of columns in a table, the maximum length of an SQL statement,
or the maximum value of an integer. The TCL and TH3 test suites both
contain numerous tests that push SQLite right to the edge of its defined
limits and verify that it performs correctly for all allowed values.
Additional tests go beyond the defined limits and verify that SQLite
correctly returns errors. The source code contains [testcase macros]
to verify that both sides of each boundary have been tested.</p>

<tcl>hd_fragment regressiontesting</tcl>
<h1>Regression Testing</h1>

<p>Whenever a bug is reported against SQLite, that bug is not considered
fixed until new test cases that would exhibit the bug have been added
to either the TCL or TH3 test suites. Over the years, this has resulted
in thousands and thousands of new tests. These regression tests ensure
that bugs that have been fixed in the past are not reintroduced into
future versions of SQLite.</p>

<tcl>hd_fragment leakcheck</tcl>
<h1>Automatic Resource Leak Detection</h1>

<p>A resource leak occurs when system resources are allocated and never
freed. The most troublesome resource leaks in many applications are
memory leaks - when memory is allocated using malloc() but never released
using free().
But other kinds of resources
can also be leaked: file descriptors, threads, mutexes, etc.</p>

<p>Both the TCL and TH3 test harnesses automatically track system resources
and report resource leaks on <u>every</u> test run. No special configuration
or setup is required. The test harnesses are especially vigilant with
regard to memory leaks. If a change causes a memory leak, the test
harnesses will recognize this quickly. SQLite is designed to never leak
memory, even after an exception such as an OOM error or disk I/O error.
The test harnesses are zealous to enforce this.</p>

<tcl>hd_fragment coverage {test coverage}</tcl>
<h1>Test Coverage</h1>

<p>The SQLite core, including the unix [VFS], has 100% branch test coverage
under [TH3] in its default configuration as measured by
[http://gcc.gnu.org/onlinedocs/gcc/Gcov.html | gcov]. Extensions such
as FTS3 and RTree are excluded from this analysis.</p>

<tcl>hd_fragment stmtvbr</tcl>
<h2>Statement versus branch coverage</h2>

<p>There are many ways to measure test coverage. The most popular metric
is "statement coverage". When you hear someone say that their program
has "XX% test coverage" without further explanation, they usually mean
statement coverage. Statement coverage measures what percentage of lines
of code are executed at least once by the test suite.</p>
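<p>The distinction is easy to see in a toy example (not taken from the
SQLite sources):</p>

<blockquote><pre>
/* One test with p!=0 executes every line of this function (100%
** statement coverage) but exercises only one arm of the branch.
** A second test with p==0 is needed for 100% branch coverage. */
int get_value(int *p){
  if( p ) return *p;
  return 0;
}
</pre></blockquote>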
︙ | ︙ | |||
516 517 518 519 520 521 522 | but all three are required for 100% branch coverage. Generally speaking, 100% branch coverage implies 100% statement coverage, but the converse is not true. To reemphasize, the [TH3] test harness for SQLite provides the stronger form of test coverage - 100% branch test coverage.</p> <tcl>hd_fragment defensivecode</tcl> | | | 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 | but all three are required for 100% branch coverage. Generally speaking, 100% branch coverage implies 100% statement coverage, but the converse is not true. To reemphasize, the [TH3] test harness for SQLite provides the stronger form of test coverage - 100% branch test coverage.</p> <tcl>hd_fragment defensivecode</tcl> <h2>Coverage testing of defensive code</h2> <p>A well-written C program will typically contain some defensive conditionals which in practice are always true or always false. This leads to a programming dilemma: Does one remove defensive code in order to obtain 100% branch coverage?</p> |
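<p>SQLite resolves the dilemma with macros. As recalled from the public
SQLite sources (approximately; the authoritative definitions live in the
source tree), ALWAYS() and NEVER() are pass-throughs in release builds,
hard assertions in debug builds, and constants in coverage builds:</p>

<blockquote><pre>
#if defined(SQLITE_COVERAGE_TEST)
# define ALWAYS(X)  (1)                    /* coverage: branch removed */
# define NEVER(X)   (0)
#elif !defined(NDEBUG)
# define ALWAYS(X)  ((X)?1:(assert(0),0))  /* debug: hard assertion */
# define NEVER(X)   ((X)?(assert(0),1):0)
#else
# define ALWAYS(X)  (X)                    /* release: pass-through */
# define NEVER(X)   (X)
#endif
</pre></blockquote>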
︙ | ︙ | |||
565 566 567 568 569 570 571 | the ALWAYS() and NEVER() definitions shown above. All three test runs should yield exactly the same result. There is a run-time test using the [sqlite3_test_control]([SQLITE_TESTCTRL_ALWAYS], ...) interface that can be used to verify that the macros are correctly set to the first form (the pass-through form) for deployment.</p> <tcl>hd_fragment {testcase} {testcase macros}</tcl> | | | 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 | the ALWAYS() and NEVER() definitions shown above. All three test runs should yield exactly the same result. There is a run-time test using the [sqlite3_test_control]([SQLITE_TESTCTRL_ALWAYS], ...) interface that can be used to verify that the macros are correctly set to the first form (the pass-through form) for deployment.</p> <tcl>hd_fragment {testcase} {testcase macros}</tcl> <h2>Forcing coverage of boundary values and boolean vector tests</h2> <p>Another macro used in conjunction with test coverage measurement is the <tt>testcase()</tt> macro. The argument is a condition for which we want test cases that evaluate to both true and false. In non-coverage builds (that is to say, in release builds) the <tt>testcase()</tt> macro is a no-op:</p> |
︙ | ︙ | |||
624 625 626 627 628 629 630 | if( (mask & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB))!=0 ){ ... } </pre></blockquote> <p>The SQLite source code contains <tcl>N {$stat(nTestcase)}</tcl> uses of the <tt>testcase()</tt> macro.</p> <tcl>hd_fragment {mcdc} *MC/DC {MC/DC testing}</tcl> | | | 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 | if( (mask & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB))!=0 ){ ... } </pre></blockquote> <p>The SQLite source code contains <tcl>N {$stat(nTestcase)}</tcl> uses of the <tt>testcase()</tt> macro.</p> <tcl>hd_fragment {mcdc} *MC/DC {MC/DC testing}</tcl> <h2>Branch coverage versus MC/DC</h2> <p>Two methods of measuring test coverage were described above: "statement" and "branch" coverage. There are many other test coverage metrics besides these two. Another popular metric is "Modified Condition/Decision Coverage" or MC/DC. [http://en.wikipedia.org/wiki/Modified_Condition/Decision_Coverage | Wikipedia] defines MC/DC as follows:</p> |
︙ | ︙ | |||
655 656 657 658 659 660 661 | might not be satisfied.</p> <p>SQLite uses <tt>testcase()</tt> macros as described in the previous subsection to make sure that every condition in a bit-vector decision takes on every possible outcome. In this way, SQLite also achieves 100% MC/DC in addition to 100% branch coverage.</p> | | | 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 | might not be satisfied.</p> <p>SQLite uses <tt>testcase()</tt> macros as described in the previous subsection to make sure that every condition in a bit-vector decision takes on every possible outcome. In this way, SQLite also achieves 100% MC/DC in addition to 100% branch coverage.</p> <h2>Measuring branch coverage</h2> <p>Branch coverage in SQLite is currently measured using [https://gcc.gnu.org/onlinedocs/gcc/Gcov.html|gcov] with the "-b" option. First the test program is compiled using options "-g -fprofile-arcs -ftest-coverage" and then the test program is run. Then "gcov -b" is run to generate a coverage report. The coverage report is verbose and inconvenient to read, |
︙ | ︙ | |||
691 692 693 694 695 696 697 | or a bug in the compiler. Note that SQLite has, over the previous decade, encountered bugs in each of GCC, Clang, and MSVC. Compiler bugs, while rare, do happen, which is why it is so important to test the code in an as-delivered configuration. <tcl>hd_fragment thoughts1</tcl> | | | 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 | or a bug in the compiler. Note that SQLite has, over the previous decade, encountered bugs in each of GCC, Clang, and MSVC. Compiler bugs, while rare, do happen, which is why it is so important to test the code in an as-delivered configuration. <tcl>hd_fragment thoughts1</tcl> <h2>Experience with full test coverage</h2> <p>The developers of SQLite have found that full coverage testing is an extremely effective method for locating and preventing bugs. Because every single branch instruction in SQLite core code is covered by test cases, the developers can be confident that changes made in one part of the code do not have unintended consequences in other parts of the code. |
︙ | ︙ | |||
713 714 715 716 717 718 719 | However, we think that full-coverage testing is justified for a [most widely deployed|very widely deployed] infrastructure library like SQLite, and especially for a database library which by its very nature "remembers" past mistakes. <tcl>hd_fragment dynamicanalysis</tcl> | | | | | 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 | However, we think that full-coverage testing is justified for a [most widely deployed|very widely deployed] infrastructure library like SQLite, and especially for a database library which by its very nature "remembers" past mistakes. <tcl>hd_fragment dynamicanalysis</tcl> <h1>Dynamic Analysis</h1> <p>Dynamic analysis refers to internal and external checks on the SQLite code which are performed while the code is live and running. Dynamic analysis has proven to be a great help in maintaining the quality of SQLite.</p> <tcl>hd_fragment asserts</tcl> <h2>Assert</h2> <p>The SQLite core contains <tcl>N {$stat(nAssert)}</tcl> <tt>assert()</tt> statements that verify function preconditions and postconditions and loop invariants. Assert() is a macro which is a standard part of ANSI-C. The argument is a boolean value that is assumed to always be true. If the assertion is false, the program prints an error message and halts.</p> <p>Assert() macros are disabled by compiling with the NDEBUG macro defined. In most systems, asserts are enabled by default. But in SQLite, the asserts are so numerous and are in such performance critical places, that the database engine runs about three times slower when asserts are enabled. Hence, the default (production) build of SQLite disables asserts. Assert statements are only enabled when SQLite is compiled with the SQLITE_DEBUG preprocessor macro defined.</p> <tcl>hd_fragment valgrind</tcl> <h2>Valgrind</h2> <p>[http://valgrind.org/ | Valgrind] is perhaps the most amazing and useful developer tool in the world. Valgrind is a simulator - it simulates an x86 running a Linux binary. (Ports of Valgrind for platforms other than Linux are in development, but as of this writing, Valgrind only works reliably on Linux, which in the opinion of the SQLite developers means that Linux should be the preferred platform for all software development.) |
︙ | ︙ | |||
764 765 766 767 768 769 770 | running in Valgrind on a workstation will perform about the same as it would running natively on a smartphone.) So it is impractical to run the full SQLite test suite through Valgrind. However, the veryquick tests and the coverage of the TH3 tests are run through Valgrind prior to every release.</p> <tcl>hd_fragment memtesting</tcl> | | | | | | 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 | running in Valgrind on a workstation will perform about the same as it would running natively on a smartphone.) So it is impractical to run the full SQLite test suite through Valgrind. However, the veryquick tests and the coverage of the TH3 tests are run through Valgrind prior to every release.</p> <tcl>hd_fragment memtesting</tcl> <h2>Memsys2</h2> <p>SQLite contains a pluggable [memory allocation | memory allocation subsystem]. The default implementation uses system malloc() and free(). However, if SQLite is compiled with [SQLITE_MEMDEBUG], an alternative memory allocation wrapper ([memsys2]) is inserted that looks for memory allocation errors at run-time. The memsys2 wrapper checks for memory leaks, of course, but also looks for buffer overruns, uses of uninitialized memory, and attempts to use memory after it has been freed. These same checks are also done by valgrind (and, indeed, Valgrind does them better) but memsys2 has the advantage of being much faster than Valgrind, which means the checks can be done more often and for longer tests.</p> <tcl>hd_fragment mutextesting</tcl> <h2>Mutex Asserts</h2> <p>SQLite contains a pluggable mutex subsystem. Depending on compile-time options, the default mutex system contains interfaces [sqlite3_mutex_held()] and [sqlite3_mutex_notheld()] that detect whether or not a particular mutex is held by the calling thread. These two interfaces are used extensively within assert() statements in SQLite to verify mutexes are held and released at all the right moments, in order to double-check that SQLite does work correctly in multi-threaded applications.</p> <tcl>hd_fragment journaltest</tcl> <h2>Journal Tests</h2> <p>One of the things that SQLite does to ensure that transactions are atomic across system crashes and power failures is to write all changes into the rollback journal file prior to changing the database. The TCL test harness contains an alternative [OS backend] implementation that helps to verify this is occurring correctly. The "journal-test VFS" monitors all disk I/O traffic between the database file and rollback journal, checking to make sure that nothing is written into the database file which has not first been written and synced to the rollback journal. If any discrepancies are found, an assertion fault is raised.</p> <p>The journal tests are an additional double-check over and above the crash tests to make sure that SQLite transactions will be atomic across system crashes and power failures.</p> <tcl>hd_fragment intoverflow</tcl> <h2>Undefined Behavior Checks</h2> <p>In the C programming language, it is very easy to write code that has "undefined" or "implementation defined" behavior. That means that the code might work during development, but then give a different answer on a different system, or when recompiled using different compiler options. Examples of undefined and implementation-defined behavior in |
︙ | ︙ | |||
855 856 857 858 859 860 861 | on 32-bit and 64-bit systems and on big-endian and little-endian systems, using a variety of CPU architectures. Furthermore, the test suites are augmented with many test cases that are deliberately designed to provoke undefined behavior. For example: "<b>SELECT -1*(-9223372036854775808);</b>". <tcl>hd_fragment disopttest</tcl> | | | 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 | on 32-bit and 64-bit systems and on big-endian and little-endian systems, using a variety of CPU architectures. Furthermore, the test suites are augmented with many test cases that are deliberately designed to provoke undefined behavior. For example: "<b>SELECT -1*(-9223372036854775808);</b>". <tcl>hd_fragment disopttest</tcl> <h1>Disabled Optimization Tests</h1> <p>The [sqlite3_test_control]([SQLITE_TESTCTRL_OPTIMIZATIONS], ...) interface allows selected SQL statement optimizations to be disabled at run-time. SQLite should always generate exactly the same answer with optimizations enabled and with optimizations disabled; the answer simply arrives quicker with the optimizations turned on. So in a production environment, one always leaves the optimizations turned on (the default setting).</p> |
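<p>A sketch of how a test might use this interface (the 0xffff mask is
an arbitrary example that sets every bit; real tests use the specific
bit values defined in the SQLite sources):</p>

<blockquote><pre>
#include "sqlite3.h"

/* Run the same query with and without optimizations. */
void check_query_both_ways(sqlite3 *db, const char *zSql){
  sqlite3_exec(db, zSql, 0, 0, 0);   /* optimizations on (default) */

  /* Disable the optimizations named by the mask bits and rerun */
  sqlite3_test_control(SQLITE_TESTCTRL_OPTIMIZATIONS, db, 0xffff);
  sqlite3_exec(db, zSql, 0, 0, 0);

  /* Restore the default; result comparison omitted in this sketch */
  sqlite3_test_control(SQLITE_TESTCTRL_OPTIMIZATIONS, db, 0);
}
</pre></blockquote>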
︙ | ︙ | |||
881 882 883 884 885 886 887 | But the majority of test cases simply check that the correct answer
was obtained, and all of those cases can be run successfully with and
without the optimizations, in order to show that the optimizations do not
cause malfunctions.</p>

<tcl>hd_fragment cklist</tcl>
 | | 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 | But the majority of test cases simply check that the correct answer
was obtained, and all of those cases can be run successfully with and
without the optimizations, in order to show that the optimizations do not
cause malfunctions.</p>

<tcl>hd_fragment cklist</tcl>
<h1>Checklists</h1>

<p>The SQLite developers use an on-line checklist to coordinate testing
activity and to verify that all tests pass prior to each SQLite release.
<a href="http://www.sqlite.org/checklists/index.html">Past checklists</a>
are retained for historical reference.
(The checklists are read-only for anonymous internet viewers, but
developers can log in and update checklist items in their web
︙ | ︙ | |||
915 916 917 918 919 920 921 | potential problems are discovered, new checklist items are added to make sure those problems do not appear in subsequent releases. The release checklist has proven to be an invaluable tool in helping to ensure that nothing is overlooked during the release process.</p> <tcl>hd_fragment staticanalysis</tcl> | | | 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 | potential problems are discovered, new checklist items are added to make sure those problems do not appear in subsequent releases. The release checklist has proven to be an invaluable tool in helping to ensure that nothing is overlooked during the release process.</p> <tcl>hd_fragment staticanalysis</tcl> <h1>Static Analysis</h1> <p>Static analysis means analyzing code at or before compile-time to check for correctness. Static analysis includes looking at compiler warning messages and running the code through more in-depth analysis engines such as the [http://clang-analyzer.llvm.org/ | Clang Static Analyzer]. SQLite compiles without warnings on GCC and Clang using |
︙ | ︙ | |||
939 940 941 942 943 944 945 | <p>Static analysis has not proven to be especially helpful in finding
bugs in SQLite. Static analysis has found a few bugs in SQLite, but
those are the exceptions. More bugs have been introduced into SQLite
while trying to get it to compile without warnings than have been found
by static analysis.</p>

<tcl>hd_fragment summary</tcl>
 | | 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 | <p>Static analysis has not proven to be especially helpful in finding
bugs in SQLite. Static analysis has found a few bugs in SQLite, but
those are the exceptions. More bugs have been introduced into SQLite
while trying to get it to compile without warnings than have been found
by static analysis.</p>

<tcl>hd_fragment summary</tcl>
<h1>Summary</h1>

<p>SQLite is open source. This gives many people the idea that it is
not as well tested as commercial software and is perhaps unreliable.
But that impression is false. SQLite has exhibited very high reliability
in the field and a very low defect rate, especially considering how
rapidly it is evolving. The quality of SQLite is achieved in part by
careful code design and implementation. But extensive testing also
plays a vital role in maintaining and improving the quality of SQLite.
This document has summarized the testing procedures that every release
of SQLite undergoes with the hope of inspiring confidence that SQLite
is suitable for use in mission-critical applications.</p>
Changes to pages/th3.in.
|
| | < | | | 1 2 3 4 5 6 7 8 9 10 11 12 | <title>TH3 Test Harness</title> <tcl>hd_keywords {TH3}</tcl> <fancy_format> <h1>Overview</h1> <p>SQLite Test Harness #3 (hereafter "TH3") is one of [three test harnesses] used for testing SQLite. TH3 is designed to meet the following objectives:</p> <ul> <li><p> TH3 is able to run on embedded platforms that lack the support |
︙ | ︙ | |||
35 36 37 38 39 40 41 | <p>TH3 was originally written for validation testing only, but has subsequently been used for development testing and debugging as well, and has proven very helpful in those roles. A full-coverage test takes less than five minutes on a workstation and hence serves as a fast regression test during day-to-day maintenance of the SQLite code base.</p> | | | 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | <p>TH3 was originally written for validation testing only, but has subsequently been used for development testing and debugging as well, and has proven very helpful in those roles. A full-coverage test takes less than five minutes on a workstation and hence serves as a fast regression test during day-to-day maintenance of the SQLite code base.</p> <h1>Operation</h1> <p>TH3 is a test program generator. The output of TH3 is a program implemented in C-code and intended to be linked against the SQLite library under test. The generated test program is compiled and run on the target platform in order to verify correct operation of SQLite on that platform.</p> |
︙ | ︙ | |||
75 76 77 78 79 80 81 | written SQL with the remainder either in pure C or a combination of C and SQL. </p> <p>Each test module file contains a header which describes the circumstances under which the test is valid. For a particular configuration, only those modules that are compatible with the configuration are run. </p> | | | 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 | written SQL with the remainder either in pure C or a combination of C and SQL. </p> <p>Each test module file contains a header which describes the circumstances under which the test is valid. For a particular configuration, only those modules that are compatible with the configuration are run. </p> <h1>Generating A Test Program</h1> <p>The TH3 program generator is a TCL script named "<tt>mkth3.tcl</tt>". To generate a test program, one has merely to run this script and supply the names of files containing test modules and configurations on the command line. Test modules are files that use the "<tt>.test</tt>" suffix and configurations are files that use the "<tt>.cfg</tt>" suffix. A typical invocation of mkth3.tcl might look something like the following:</p> |
︙ | ︙ | |||
171 172 173 174 175 176 177 | configurations and "pager08", "build33", "orderby01", etc. are test
modules.
Compile-time and run-time options are available to increase or decrease
the amount of output.
The output can be increased by showing each test case within each test
module.  The output can be decreased by degrees: omitting test module
starts and stops, omitting configuration starts and stops, and finally
omitting all output.

 | 

 | 

 | 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 | configurations and "pager08", "build33", "orderby01", etc. are test
modules.
Compile-time and run-time options are available to increase or decrease
the amount of output.
The output can be increased by showing each test case within each test
module.  The output can be decreased by degrees: omitting test module
starts and stops, omitting configuration starts and stops, and finally
omitting all output.

<h2>Test Automation Scripts</h2>

<p>TH3 comes with additional TCL scripts that help automate the
testing process on workstations.  The "th3make" script automatically
runs "mkth3.tcl" and "gcc" and then runs the resulting test program and
checks the results.  Arguments to th3make include all of the "*.test"
test modules and "*.cfg" configurations that are to be included in the
test.  Additional options to th3make can cause the test program to
be compiled using different
︙ | ︙ | |||
284 285 286 287 288 289 290 | The two-line summary at the bottom shows the total number of errors
and tests over all th3make runs and the total elapsed time, together
with the [SQLITE_SOURCE_ID] information for the version of SQLite that
was tested.
This summary information is recorded in the
<a href="https://www.sqlite.org/checklists/3081000/index#c6">release
checklist</a> during testing.

 | 

 | 

 | 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 | The two-line summary at the bottom shows the total number of errors
and tests over all th3make runs and the total elapsed time, together
with the [SQLITE_SOURCE_ID] information for the version of SQLite that
was tested.
This summary information is recorded in the
<a href="https://www.sqlite.org/checklists/3081000/index#c6">release
checklist</a> during testing.

<h1>Test Coverage</h1>

<p>Using one particular subset of the available TH3 test modules
(the "cov1" tests), SQLite obtained
[test coverage | 100% branch test coverage] and 100% [MC/DC]
as measured by
[http://gcc.gnu.org/onlinedocs/gcc/Gcov.html | gcov]
on Linux x86 and x86_64 hardware.
All releases of SQLite since [version 3.6.17] (circa 2009-08-10)
have been tested to this standard.
The SQLite developers are committed to maintaining
100% branch coverage and MC/DC for all future releases of SQLite.</p>

<p>The cov1 test set used to obtain 100% branch test coverage is only
a subset of the tests currently implemented using TH3.  New test modules
are added on a regular basis.</p>

<h1>TH3 License</h1>

<p>SQLite itself is in the <a href="copyright.html">public domain</a> and
can be used for any purpose.  But TH3 is proprietary and requires a license.
</p>

<p>Even though open-source users do not have direct access to TH3, all
users of SQLite benefit from TH3 indirectly since each version of SQLite
is validated by running TH3 on multiple platforms (Linux, Windows, WinRT,
Mac, OpenBSD) prior to release.  So anyone using an official release of
SQLite can deploy their application with the confidence of knowing that
it has been tested using TH3.  They simply cannot rerun those tests
themselves without purchasing a TH3 license.</p>
Changes to pages/threadsafe.in.
1 2 3 | <title>Using SQLite In Multi-Threaded Applications</title> <tcl>hd_keywords {threading mode}</tcl> | > | > | 1 2 3 4 5 6 7 8 9 10 11 12 13 | <title>Using SQLite In Multi-Threaded Applications</title> <tcl>hd_keywords {threading mode}</tcl> <fancy_format> <h1>Overview</h1> <p>SQLite supports three different threading modes:</p> <ol> <li><p><b>Single-thread</b>. In this mode, all mutexes are disabled and SQLite is unsafe to use in more than a single thread at once.</p></li> |
︙ | ︙ | |||
30 31 32 33 34 35 36 | overridden once selected.
</p>

<p>
The default mode is serialized.
</p>

 | 

 | 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | overridden once selected.
</p>

<p>
The default mode is serialized.
</p>

<h1>Compile-time selection of threading mode</h1>

<p>
Use the [SQLITE_THREADSAFE] compile-time parameter to select the
threading mode.  If no [SQLITE_THREADSAFE] compile-time parameter is
present, then serialized mode is used.
This can be made explicit with
[SQLITE_THREADSAFE | -DSQLITE_THREADSAFE=1].
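<p>The compile-time selection can be checked at run-time: the
[sqlite3_threadsafe()] interface returns zero if and only if the
library was compiled with the mutexing code omitted
(SQLITE_THREADSAFE=0).  A minimal sketch:</p>

<blockquote><pre>
/* A minimal sketch: confirm that this build of SQLite contains
** the mutexing code before relying on multi-threaded use. */
#include &lt;sqlite3.h&gt;
#include &lt;stdio.h&gt;

int main(void){
  if( sqlite3_threadsafe()==0 ){
    printf("single-thread build: mutexes omitted at compile-time\n");
  }else{
    printf("mutexing code present: threading mode is selectable\n");
  }
  return 0;
}
</pre></blockquote>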
︙ | ︙ | |||
64 65 66 67 68 69 70 | <p>
If single-thread mode is selected at compile-time, then critical
mutexing logic is omitted from the build and it is impossible to
enable either multi-thread or serialized modes at start-time or
run-time.
</p>

 | 

 | 

 | 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 | <p>
If single-thread mode is selected at compile-time, then critical
mutexing logic is omitted from the build and it is impossible to
enable either multi-thread or serialized modes at start-time or
run-time.
</p>

<h1>Start-time selection of threading mode</h1>

<p>
Assuming that the compile-time threading mode is not single-thread,
the threading mode can be changed during
initialization using the [sqlite3_config()] interface.
The [SQLITE_CONFIG_SINGLETHREAD] verb puts SQLite into single-thread mode,
the [SQLITE_CONFIG_MULTITHREAD] verb sets multi-thread mode, and
the [SQLITE_CONFIG_SERIALIZED] verb sets serialized mode.
</p>

<h1>Run-time selection of threading mode</h1>

<p>If single-thread mode has not been selected at compile-time or
start-time, then individual database connections can be created as
either multi-thread or serialized.  It is not possible to downgrade
an individual database connection to single-thread mode.  Nor is it possible
to escalate an individual database connection if the compile-time or
start-time mode is single-thread.</p>
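<p>For example, the following sketch (the filename "app.db" is
hypothetical) selects multi-thread mode at start-time and then
escalates one particular connection to serialized mode at run-time:</p>

<blockquote><pre>
/* Sketch: start-time and run-time threading-mode selection. */
#include &lt;sqlite3.h&gt;

int open_serialized(sqlite3 **ppDb){
  int rc;

  /* Start-time: must run before the library is initialized;
  ** otherwise sqlite3_config() returns SQLITE_MISUSE. */
  rc = sqlite3_config(SQLITE_CONFIG_MULTITHREAD);
  if( rc!=SQLITE_OK ) return rc;

  /* Run-time: SQLITE_OPEN_FULLMUTEX makes just this connection
  ** serialized; SQLITE_OPEN_NOMUTEX would select multi-thread. */
  return sqlite3_open_v2("app.db", ppDb,
      SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE|SQLITE_OPEN_FULLMUTEX, 0);
}
</pre></blockquote>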
︙ | ︙ |
Changes to pages/uri.in.
1 2 3 4 5 | <title>Uniform Resource Identifiers</title> <tcl> hd_keywords {URI} {Uniform Resource Identifier} {URI filename} {URI filenames} </tcl> | > > | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | <title>Uniform Resource Identifiers</title> <tcl> hd_keywords {URI} {Uniform Resource Identifier} {URI filename} {URI filenames} </tcl> <fancy_format> <h1>URI Filenames In SQLite</h1> <p> Beginning with [version 3.7.7], the SQLite database file argument to the [sqlite3_open()], [sqlite3_open16()], and [sqlite3_open_v2()] interfaces and to the [ATTACH] command can be specified either as an ordinary filename or as a Uniform Resource Identifier or URI. The advantage of using a URI filename is that query parameters on the URI can be used to control details of the newly created database connection. For example, an alternative [VFS] can be specified using a "vfs=" query parameter. Or the database can be opened read-only by using "mode=ro" as a query parameter. </p> <h1>Backwards Compatibility</h1> <p> ^In order to maintain full backwards compatibility for legacy applications, the URI filename capability is disabled by default. ^URI filenames can be enabled or disabled using the [SQLITE_USE_URI=1] or [SQLITE_USE_URI=0] compile-time options. ^The compile-time setting for URI filenames can be changed |
︙ | ︙ | |||
46 47 48 49 50 51 52 | with "<tt>file:</tt>" as an ordinary filename regardless of the URI setting,
and because it is very unusual to have an actual file begin with
"<tt>file:</tt>", it is safe for most applications to enable URI processing
even if URI filenames are not currently being used.
</p>

 | 

 | 

 | 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 | with "<tt>file:</tt>" as an ordinary filename regardless of the URI setting,
and because it is very unusual to have an actual file begin with
"<tt>file:</tt>", it is safe for most applications to enable URI processing
even if URI filenames are not currently being used.
</p>

<h1>URI Format</h1>

<p>
According to [http://tools.ietf.org/html/rfc3986 | RFC 3986], a URI consists
of a scheme, an authority, a path, a query string, and a fragment.  The
scheme is always required.  Either the authority or the path is also
always required.  The query string and fragment are optional.
</p>
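<p>To make the terminology concrete, consider the following hypothetical
URI filename:</p>

<blockquote><pre>
file://localhost/home/fred/data.db?mode=ro
</pre></blockquote>

<p>Here "file" is the scheme, "localhost" is the authority,
"/home/fred/data.db" is the path, and "mode=ro" is the query string;
no fragment is present.</p>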
︙ | ︙ | |||
97 98 99 100 101 102 103 | <p>^A filename that is not a well-formed URI is interpreted as an
ordinary filename.</p>

<p>^URIs are processed as UTF8 text.
^The filename argument to sqlite3_open16() is converted from UTF16
native byte order into UTF8 prior to processing.

 | 

 | 

 | 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 | <p>^A filename that is not a well-formed URI is interpreted as an
ordinary filename.</p>

<p>^URIs are processed as UTF8 text.
^The filename argument to sqlite3_open16() is converted from UTF16
native byte order into UTF8 prior to processing.

<h2>The URI Path</h2>

<p>^The path component of the URI specifies the disk file that is the
SQLite database to be opened.  ^(If the path component is omitted, then
the database is stored in a temporary file that will be automatically
deleted when the database connection closes.)^  ^If the authority section
is present, then the path is always an absolute pathname.  ^If the
authority section is omitted, then the path is an absolute pathname if it
︙ | ︙ | |||
128 129 130 131 132 133 134 | <li>Convert all sequences of two or more "<tt>/</tt>" characters into a single "<tt>/</tt>" character. <li>On windows only, if the filename begins with a drive letter, prepend a single "<tt>/</tt>" character. <li>Prepend the "<tt>file:</tt>" scheme. </ol> | | | 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 | <li>Convert all sequences of two or more "<tt>/</tt>" characters into a single "<tt>/</tt>" character. <li>On windows only, if the filename begins with a drive letter, prepend a single "<tt>/</tt>" character. <li>Prepend the "<tt>file:</tt>" scheme. </ol> <h2>Query String</h2> <p>^A URI filename can optionally be followed by a query string. ^The query string consists of text following the first "<tt>?</tt>" character but excluding the optional fragment that begins with "<tt>#</tt>". ^The query string is divided into key/value pairs. We usually refer to these key/value pairs as "query parameters". ^Key/value pairs are separated by a single "<tt>&</tt>" character. |
︙ | ︙ | |||
155 156 157 158 159 160 161 | ^The list of query parameters appended to the xOpen filename is terminated by a single zero-length key. Note that the value of a query parameter can be an empty string. </p> <tcl>hd_fragment coreqp *coreqp {standard query parameters} {URI query parameters} \ {query parameters with special meaning to SQLite}</tcl> | | | 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 | ^The list of query parameters appended to the xOpen filename is terminated by a single zero-length key. Note that the value of a query parameter can be an empty string. </p> <tcl>hd_fragment coreqp *coreqp {standard query parameters} {URI query parameters} \ {query parameters with special meaning to SQLite}</tcl> <h2>Recognized Query Parameters</h2> <p> Some query parameters are interpreted by the SQLite core and used to modify the characteristics of the new connection. ^All query parameters are always passed through into the xOpen method of the [VFS] even if they are previously read and interpreted by the SQLite core. </p> |
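<p>As an illustration (the filename is hypothetical, and
"cache=private" is assumed here as one of the standard parameters),
the following sketch opens a database read-only through a URI, using
the SQLITE_OPEN_URI flag so that URI processing is enabled for this
one connection regardless of the global setting:</p>

<blockquote><pre>
/* Sketch: open "data.db" read-only via a URI filename. */
#include &lt;sqlite3.h&gt;

int open_readonly(sqlite3 **ppDb){
  /* SQLITE_OPEN_URI enables URI interpretation for this call
  ** even when URI filenames are disabled globally. */
  return sqlite3_open_v2("file:data.db?mode=ro&cache=private",
      ppDb, SQLITE_OPEN_READONLY|SQLITE_OPEN_URI, 0);
}
</pre></blockquote>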
︙ | ︙ | |||
228 229 230 231 232 233 234 | the [SQLITE_IOCAP_IMMUTABLE] bit in xDeviceCharacteristics) asserts that a database file is immutable and that file changes anyhow, then SQLite might return incorrect query results and/or [SQLITE_CORRUPT] errors. </dd> </dl> | | | 230 231 232 233 234 235 236 237 238 239 240 241 242 | the [SQLITE_IOCAP_IMMUTABLE] bit in xDeviceCharacteristics) asserts that a database file is immutable and that file changes anyhow, then SQLite might return incorrect query results and/or [SQLITE_CORRUPT] errors. </dd> </dl> <h1>See Also</h1> <ul> <li> [URI filenames in sqlite3_open()] <li> [URI filename examples] </ul> |
Changes to pages/wal.in.
1 2 3 | <title>Write-Ahead Logging</title> <tcl>hd_keywords {WAL} {write-ahead log} {WAL mode}</tcl> | > | > | 1 2 3 4 5 6 7 8 9 10 11 12 13 | <title>Write-Ahead Logging</title> <tcl>hd_keywords {WAL} {write-ahead log} {WAL mode}</tcl> <fancy_format> <h1>Overview</h1> <p>The default method by which SQLite implements [atomic commit | atomic commit and rollback] is a [rollback journal]. Beginning with [version 3.7.0], a new "Write-Ahead Log" option (hereafter referred to as "WAL") is available.</p> <p>There are advantages and disadvantages to using WAL instead of |
︙ | ︙ | |||
62 63 64 65 66 67 68 | It is recommended that one of the rollback journal modes be used for transactions larger than a few dozen megabytes.</s> Beginning with [version 3.11.0], WAL mode works as efficiently with large transactions as does rollback mode. </ol> | | | | 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 | It is recommended that one of the rollback journal modes be used for transactions larger than a few dozen megabytes.</s> Beginning with [version 3.11.0], WAL mode works as efficiently with large transactions as does rollback mode. </ol> <h1>How WAL Works</h1> <p>The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. In the event of a crash or [ROLLBACK], the original content contained in the rollback journal is played back into the database file to revert the database file to its original state. The [COMMIT] occurs when the rollback journal is deleted.</p> <p>The WAL approach inverts this. The original content is preserved in the database file and the changes are appended into a separate WAL file. A [COMMIT] occurs when a special record indicating a commit is appended to the WAL. Thus a COMMIT can happen without ever writing to the original database, which allows readers to continue operating from the original unaltered database while changes are simultaneously being committed into the WAL. Multiple transactions can be appended to the end of a single WAL file.</p> <tcl>hd_fragment ckpt checkpoint checkpointed checkpointing</tcl> <h2>Checkpointing</h2> <p>Of course, one wants to eventually transfer all the transactions that are appended in the WAL file back into the original database. Moving the WAL file transactions back into the database is called a "<i>checkpoint</i>".<p> <p>Another way to think about the difference between rollback and |
︙ | ︙ | |||
106 107 108 109 110 111 112 | specify a different default.)  Applications using WAL do
not have to do anything in order for these checkpoints to occur.
But if they want to, applications can adjust the automatic checkpoint
threshold.  Or they can turn off the automatic checkpoints and run
checkpoints during idle moments or in a separate thread or process.</p>

<tcl>hd_fragment ckpt checkpoint checkpointed checkpointing</tcl>

 | 

 | 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 | specify a different default.)  Applications using WAL do
not have to do anything in order for these checkpoints to occur.
But if they want to, applications can adjust the automatic checkpoint
threshold.  Or they can turn off the automatic checkpoints and run
checkpoints during idle moments or in a separate thread or process.</p>

<tcl>hd_fragment concurrency {WAL concurrency}</tcl>
<h2>Concurrency</h2>

<p>When a read operation begins on a WAL-mode database, it first
remembers the location of the last valid commit record in the WAL.
Call this point the "end mark".  Because the WAL can be growing and
adding new commit records while various readers connect to the database,
each reader can potentially have its own end mark.  But for any
particular reader, the end mark is unchanged for the duration of the
︙ | ︙ | |||
160 161 162 163 164 165 166 | the checkpointer has made, and if the entire WAL has been transferred into the database and synced and if no readers are making use of the WAL, then the writer will rewind the WAL back to the beginning and start putting new transactions at the beginning of the WAL. This mechanism prevents a WAL file from growing without bound.</p> <tcl>hd_fragment fast</tcl> | | | 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 | the checkpointer has made, and if the entire WAL has been transferred into the database and synced and if no readers are making use of the WAL, then the writer will rewind the WAL back to the beginning and start putting new transactions at the beginning of the WAL. This mechanism prevents a WAL file from growing without bound.</p> <tcl>hd_fragment fast</tcl> <h2>Performance Considerations</h2> <p>Write transactions are very fast since they only involve writing the content once (versus twice for rollback-journal transactions) and because the writes are all sequential. Further, syncing the content to the disk is not required, as long as the application is willing to sacrifice durability following a power loss or hard reboot. (Writers sync the WAL on every transaction commit if |
︙ | ︙ | |||
235 236 237 238 239 240 241 | vary from one application to another depending on the relative read
and write performance requirements of the application.
The default strategy is to run a checkpoint once the WAL
reaches 1000 pages and this strategy seems to work well in
test applications on workstations, but other strategies might
work better on different platforms or for different workloads.</p>

 | 

 | 

 | 

 | 

 | 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 | vary from one application to another depending on the relative read
and write performance requirements of the application.
The default strategy is to run a checkpoint once the WAL
reaches 1000 pages and this strategy seems to work well in
test applications on workstations, but other strategies might
work better on different platforms or for different workloads.</p>

<h1>Activating And Configuring WAL Mode</h1>

<p>An SQLite database connection defaults to
[journal_mode | journal_mode=DELETE].  To convert to WAL mode, use the
following pragma:</p>

<blockquote><pre>
PRAGMA journal_mode=WAL;
</pre></blockquote>

<p>The journal_mode pragma returns a string which is the new journal mode.
On success, the pragma will return the string "<tt>wal</tt>".  If
the conversion to WAL could not be completed (for example, if the [VFS]
does not support the necessary shared-memory primitives) then the
journaling mode will be unchanged and the string returned by the
pragma will be the prior journaling mode (for example "<tt>delete</tt>").

<a name="how_to_checkpoint"></a>
<h2>Automatic Checkpoint</h2>

<p>By default, SQLite will automatically checkpoint whenever a [COMMIT]
occurs that causes the WAL file to be 1000 pages or more in size, or when
the last database connection on a database file closes.  The default
configuration is intended to work well for most applications.
But programs that want more control can force a checkpoint using the
[wal_checkpoint pragma] or by calling the
[sqlite3_wal_checkpoint()] C interface.
The automatic checkpoint threshold can be changed or automatic
checkpointing can be completely disabled using the
[wal_autocheckpoint pragma] or by calling the
[sqlite3_wal_autocheckpoint()] C interface.
A program can also use [sqlite3_wal_hook()] to register a callback
to be invoked whenever any transaction commits to the WAL.
This callback can then invoke [sqlite3_wal_checkpoint()]
or [sqlite3_wal_checkpoint_v2()] based on whatever criteria it
thinks is appropriate.
(The automatic checkpoint mechanism is implemented as a simple
wrapper around [sqlite3_wal_hook()].)</p>

<h2>Application-Initiated Checkpoints</h2>

<p>An application can initiate a checkpoint using any writable database
connection on the database simply by invoking
[sqlite3_wal_checkpoint()] or [sqlite3_wal_checkpoint_v2()].
There are three subtypes of checkpoints that vary in their aggressiveness:
PASSIVE, FULL, and RESTART.  The default checkpoint style is PASSIVE, which
does as much work as it can without interfering with other database
connections, and which might not run to completion if there are
concurrent readers or writers.
All checkpoints initiated by [sqlite3_wal_checkpoint()] and by the
automatic checkpoint mechanism are PASSIVE.
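<p>As a minimal sketch of the hook-based approach described above
(the 2000-page threshold is an arbitrary example), an application
might replace the automatic checkpoint with its own policy:</p>

<blockquote><pre>
/* Sketch: a custom checkpoint policy via sqlite3_wal_hook().
** Registering this hook displaces the built-in automatic
** checkpoint, which is itself just a wrapper around the hook. */
#include &lt;sqlite3.h&gt;

static int xWalHook(void *pArg, sqlite3 *db,
                    const char *zDb, int nPages){
  (void)pArg;  /* unused client data */
  if( nPages>2000 ){
    /* PASSIVE: do what can be done without blocking anyone. */
    sqlite3_wal_checkpoint_v2(db, zDb,
        SQLITE_CHECKPOINT_PASSIVE, 0, 0);
  }
  return SQLITE_OK;
}

void install_checkpoint_policy(sqlite3 *db){
  sqlite3_wal_hook(db, xWalHook, 0);
}
</pre></blockquote>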
FULL and RESTART checkpoints try harder to run the checkpoint to completion
and can only be initiated by a call to [sqlite3_wal_checkpoint_v2()].
See the [sqlite3_wal_checkpoint_v2()] documentation for additional
information on FULL and RESTART checkpoints.

<h2>Persistence of WAL mode</h2>

<p>Unlike the other journaling modes,
[journal_mode | PRAGMA journal_mode=WAL] is persistent.  If a process
sets WAL mode, then closes and reopens the database, the database will
come back in WAL mode.  In contrast, if a process sets (for example)
PRAGMA journal_mode=TRUNCATE and then closes and reopens, the database
will come back up in the default rollback mode of DELETE rather
than the previous TRUNCATE setting.</p>

<p>The persistence of WAL mode means that applications can be converted
to using SQLite in WAL mode without making any changes to the application
itself.  One has merely to run "<tt>PRAGMA journal_mode=WAL;</tt>" on the
database file(s) using the [command-line shell] or other utility, then
restart the application.</p>

<p>The WAL journal mode will be set on all connections to the same database
file if it is set on any one connection.
</p>

<tcl>hd_fragment {readonly} {read-only WAL databases}</tcl>
<h1>Read-Only Databases</h1>

<p>No SQLite database (regardless of whether or not it is WAL mode) is
readable if it is located on read-only media and it requires recovery.
So, for example, if an application crashes and leaves an SQLite database
with a [hot journal], that database cannot be opened unless the opening
process has write privilege on the database file, the directory containing
the database file, and the hot journal.  This is because the
︙ | ︙ | |||
354 355 356 357 358 359 360 | <p>Also, if multiple processes are to access a WAL mode database, then
all processes should run under user or group IDs that give them write
access to the database files, the WAL file, the shared memory
<tt>-shm</tt> file, and the containing directory.</p>

<tcl>hd_fragment bigwal {large WAL files} {avoiding large WAL files}</tcl>

 | 

 | 

 | 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 | <p>Also, if multiple processes are to access a WAL mode database, then
all processes should run under user or group IDs that give them write
access to the database files, the WAL file, the shared memory
<tt>-shm</tt> file, and the containing directory.</p>

<tcl>hd_fragment bigwal {large WAL files} {avoiding large WAL files}</tcl>
<h1>Avoiding Excessively Large WAL Files</h1>

<p>In normal cases, new content is appended to the WAL file until the
WAL file accumulates about 1000 pages (and is thus about 4MB in size),
at which point a checkpoint is automatically run and the WAL file is
recycled.  The checkpoint does not normally truncate the WAL file
(unless the [journal_size_limit pragma] is set).
Instead, it merely causes SQLite to start overwriting the WAL file from
the beginning.
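<p>For example (the 4MB figure is an arbitrary choice), an application
that wants checkpoints to also truncate an oversized WAL file can set a
journal size limit:</p>

<blockquote><pre>
/* Sketch: cap the WAL file left behind after a checkpoint at
** roughly 4MB; a larger WAL file is truncated down toward this
** limit rather than merely overwritten from the beginning. */
#include &lt;sqlite3.h&gt;

int limit_wal_size(sqlite3 *db){
  return sqlite3_exec(db,
      "PRAGMA journal_size_limit=4194304;", 0, 0, 0);
}
</pre></blockquote>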
︙ | ︙ | |||
428 429 430 431 432 433 434 | should be proportional in size to the transaction itself. Pages that are changed by the transaction should only be written into the WAL file once. However, with older versions of SQLite, the same page might be written into the WAL file multiple times if the transaction grows larger than the page cache. </ul> | | | 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 | should be proportional in size to the transaction itself. Pages that are changed by the transaction should only be written into the WAL file once. However, with older versions of SQLite, the same page might be written into the WAL file multiple times if the transaction grows larger than the page cache. </ul> <h1>Implementation Of Shared-Memory For The WAL-Index</h1> <p>The [wal-index] is implemented using an ordinary file that is mmapped for robustness. Early (pre-release) implementations of WAL mode stored the wal-index in volatile shared-memory, such as files created in /dev/shm on Linux or /tmp on other unix systems. The problem with that approach is that processes with a different root directory (changed via <a href="http://en.wikipedia.org/wiki/Chroot">chroot</a>) |
︙ | ︙ | |||
461 462 463 464 465 466 467 | shared memory is unacceptable can devise alternative methods via a custom [VFS]. For example, if it is known that a particular database will only be accessed by threads within a single process, the wal-index can be implemented using heap memory instead of true shared memory.</p> <tcl>hd_fragment noshm {WAL without shared memory}</tcl> | | | 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 | shared memory is unacceptable can devise alternative methods via a custom [VFS]. For example, if it is known that a particular database will only be accessed by threads within a single process, the wal-index can be implemented using heap memory instead of true shared memory.</p> <tcl>hd_fragment noshm {WAL without shared memory}</tcl> <h1>Use of WAL Without Shared-Memory</h1> <p>Beginning in SQLite version 3.7.4, ^WAL databases can be created, read, and written even if shared memory is unavailable as long as the [locking_mode] is set to EXCLUSIVE before the first attempted access. In other words, a process can interact with a WAL database without using shared memory if that process is guaranteed to be the only process accessing the database. |
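<p>A sketch of that pattern (the filename "private.db" is hypothetical);
the locking mode must be set to EXCLUSIVE before the first WAL-mode
access:</p>

<blockquote><pre>
/* Sketch: use WAL without shared memory by taking the EXCLUSIVE
** locking mode before the database is first accessed in WAL mode. */
#include &lt;sqlite3.h&gt;

int open_exclusive_wal(sqlite3 **ppDb){
  int rc = sqlite3_open("private.db", ppDb);
  if( rc!=SQLITE_OK ) return rc;
  rc = sqlite3_exec(*ppDb, "PRAGMA locking_mode=EXCLUSIVE;", 0, 0, 0);
  if( rc==SQLITE_OK ){
    rc = sqlite3_exec(*ppDb, "PRAGMA journal_mode=WAL;", 0, 0, 0);
  }
  return rc;
}
</pre></blockquote>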
︙ | ︙ | |||
498 499 500 501 502 503 504 | ^As long as exactly one connection is using a shared-memory wal-index, the locking mode can be changed freely between NORMAL and EXCLUSIVE. ^It is only when the shared-memory wal-index is omitted, when the locking mode is EXCLUSIVE prior to the first WAL-mode database access, that the locking mode is stuck in EXCLUSIVE.</p> <tcl>hd_fragment bkwrds {WAL backwards compatibility}</tcl> | | | 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 | ^As long as exactly one connection is using a shared-memory wal-index, the locking mode can be changed freely between NORMAL and EXCLUSIVE. ^It is only when the shared-memory wal-index is omitted, when the locking mode is EXCLUSIVE prior to the first WAL-mode database access, that the locking mode is stuck in EXCLUSIVE.</p> <tcl>hd_fragment bkwrds {WAL backwards compatibility}</tcl> <h1>Backwards Compatibility</h1> <p>The database file format is unchanged for WAL mode. However, the WAL file and the [wal-index] are new concepts and so older versions of SQLite will not know how to recover a crashed SQLite database that was operating in WAL mode when the crash occurred. ^(To prevent older versions of SQLite (prior to version 3.7.0, 2010-07-22) |
︙ | ︙ |
Changes to pages/withoutrowid.in.
1 2 3 4 | <title>The WITHOUT ROWID Optimization</title> <tcl> hd_keywords {WITHOUT rowid} {WITHOUT ROWID} </tcl> | | | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | <title>The WITHOUT ROWID Optimization</title> <tcl> hd_keywords {WITHOUT rowid} {WITHOUT ROWID} </tcl> <fancy_format> <h1>Introduction</h1> <p>^By default, every row in SQLite has a special column, usually called the "[rowid]", that uniquely identifies that row within the table. ^However if the phrase "WITHOUT ROWID" is added to the end of a [CREATE TABLE] statement, then the special "rowid" column is omitted. There are sometimes space and performance advantages to omitting the rowid.</p> <h2>Syntax</h2> <p>^(To create a WITHOUT ROWID table, simply add the keywords "WITHOUT ROWID" to the end of the [CREATE TABLE] statement. For example:</p> <blockquote><pre> CREATE TABLE IF NOT EXISTS wordcount( word TEXT PRIMARY KEY, |
︙ | ︙ | |||
31 32 33 34 35 36 37 | <p>Every WITHOUT ROWID table must have a [PRIMARY KEY].  ^An error is
raised if a CREATE TABLE statement with the WITHOUT ROWID clause lacks
a PRIMARY KEY.

<p>In most contexts, the special "rowid" column of normal tables can
also be called "oid" or "_rowid_".  ^However, only "rowid" works as the
keyword in the CREATE TABLE statement.</p>

 | 

 | 

 | 

 | 

 | 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 | <p>Every WITHOUT ROWID table must have a [PRIMARY KEY].  ^An error is
raised if a CREATE TABLE statement with the WITHOUT ROWID clause lacks
a PRIMARY KEY.

<p>In most contexts, the special "rowid" column of normal tables can
also be called "oid" or "_rowid_".  ^However, only "rowid" works as the
keyword in the CREATE TABLE statement.</p>

<h2>Compatibility</h2>

<p>SQLite [version 3.8.2] or later is necessary in order to use
a WITHOUT ROWID table.  An attempt to open a database that contains
one or more WITHOUT ROWID tables using an earlier version of SQLite
will result in a "malformed database schema" error.</p>

<h2>Quirks</h2>

<p>WITHOUT ROWID is found only in SQLite and is not compatible with any
other SQL database engine, as far as we know.
In an elegant system, all tables would behave as WITHOUT ROWID tables
even without the WITHOUT ROWID keyword.  However, when SQLite was first
designed, it used only integer [rowid|rowids] for row keys to
simplify the implementation.  This approach worked well for many years.
But as the demands on SQLite grew, the need for tables in which the
PRIMARY KEY really did correspond to the underlying row key grew more
acute.  The WITHOUT ROWID concept was added in order to meet that need
without breaking backwards compatibility with the billions of SQLite
databases already in use at the time (circa 2013).

<h1>Differences From Ordinary Rowid Tables</h1>

<p>The WITHOUT ROWID syntax is an optimization.  It provides no new
capabilities.  Anything that can be done using a WITHOUT ROWID table can
also be done in exactly the same way, and with exactly the same syntax,
using an ordinary rowid table.  The only advantage of a WITHOUT ROWID
table is that it can sometimes use less disk space and/or perform a
little faster than an ordinary rowid table.</p>
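<p>One difference is easy to demonstrate (the table names here are
hypothetical): the implicit "rowid" column simply does not exist in a
WITHOUT ROWID table:</p>

<blockquote><pre>
/* Sketch: ordinary tables expose an implicit rowid column;
** WITHOUT ROWID tables do not. */
#include &lt;sqlite3.h&gt;

void rowid_demo(sqlite3 *db){
  sqlite3_exec(db, "CREATE TABLE a(k TEXT PRIMARY KEY, v);", 0, 0, 0);
  sqlite3_exec(db,
      "CREATE TABLE b(k TEXT PRIMARY KEY, v) WITHOUT ROWID;", 0, 0, 0);

  /* Succeeds on the ordinary table... */
  sqlite3_exec(db, "SELECT rowid FROM a;", 0, 0, 0);

  /* ...but fails with SQLITE_ERROR ("no such column") on the
  ** WITHOUT ROWID table. */
  sqlite3_exec(db, "SELECT rowid FROM b;", 0, 0, 0);
}
</pre></blockquote>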
︙ | ︙ | |||
131 132 133 134 135 136 137 | Hence, the update hook is not invoked when a WITHOUT ROWID table changes.

<p>Note that because the [session] extension uses the update hook,
the session extension will not work correctly on a database
that includes WITHOUT ROWID tables.
</ol>

<tcl>hd_fragment bene {benefits of using WITHOUT ROWID}</tcl>

 | 

 | 

 | 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 | Hence, the update hook is not invoked when a WITHOUT ROWID table changes.

<p>Note that because the [session] extension uses the update hook,
the session extension will not work correctly on a database
that includes WITHOUT ROWID tables.
</ol>

<tcl>hd_fragment bene {benefits of using WITHOUT ROWID}</tcl>
<h1>Benefits Of WITHOUT ROWID Tables</h1>

<p>A WITHOUT ROWID table is an optimization that can reduce storage
and processing requirements.

<p>In an ordinary SQLite table, the PRIMARY KEY is really just a
[UNIQUE] index.  The key used to look up records on disk is the [rowid].
︙ | ︙ | |||
206 207 208 209 210 211 212 | of disk space and can operate nearly twice as fast. Of course, in a real-world schema, there will typically be secondary indices and/or UNIQUE constraints, and the situation is more complicated. But even then, there can often be space and performance advantages to using WITHOUT ROWID on tables that have non-integer or composite PRIMARY KEYs. <tcl>hd_fragment wtu {when to use WITHOUT ROWID}</tcl> | | | 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 | of disk space and can operate nearly twice as fast. Of course, in a real-world schema, there will typically be secondary indices and/or UNIQUE constraints, and the situation is more complicated. But even then, there can often be space and performance advantages to using WITHOUT ROWID on tables that have non-integer or composite PRIMARY KEYs. <tcl>hd_fragment wtu {when to use WITHOUT ROWID}</tcl> <h1>When To Use WITHOUT ROWID</h1> <p>The WITHOUT ROWID optimization is likely to be helpful for tables that have non-integer or composite (multi-column) PRIMARY KEYs and that do not store large strings or BLOBs.</p> <p>WITHOUT ROWID tables will work correctly (that is to say, they provide the correct answer) for tables with a single INTEGER PRIMARY KEY. |
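<p>For the favorable case described at the start of this section, a
composite key over small values, a hypothetical schema might look like
this:</p>

<blockquote><pre>
/* Sketch: an association table whose composite PRIMARY KEY is
** also its entire content, a typical WITHOUT ROWID candidate. */
#include &lt;sqlite3.h&gt;

int create_filetag(sqlite3 *db){
  /* Storing rows directly in the primary-key b-tree avoids the
  ** separate rowid lookup and the redundant PRIMARY KEY index. */
  return sqlite3_exec(db,
      "CREATE TABLE IF NOT EXISTS filetag("
      "  fileid INTEGER,"
      "  tagid  INTEGER,"
      "  PRIMARY KEY(fileid, tagid)"
      ") WITHOUT ROWID;",
      0, 0, 0);
}
</pre></blockquote>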
︙ | ︙ |
Changes to search/buildsearchdb.tcl.
︙ | ︙ | |||
44 45 46 47 48 49 50 51 52 53 54 55 56 57 | set nosearch(fileio.html) 1 set nosearch(btreemodule.html) 1 set nosearch(capi3ref.html) 1 set nosearch(changes.html) 1 set nosearch(fileformat2.html) 1 set nosearch(index.html) 1 set nosearch(docs.html) 1 set weight(chronology.html) 25 foreach f [glob *.html] { if {[string match lang_* $f]==0 && [info exists nosearch($f)]==0} { lappend lFiles $f } | > | 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 | set nosearch(fileio.html) 1 set nosearch(btreemodule.html) 1 set nosearch(capi3ref.html) 1 set nosearch(changes.html) 1 set nosearch(fileformat2.html) 1 set nosearch(index.html) 1 set nosearch(docs.html) 1 set nosearch(mingw.html) 1 set weight(chronology.html) 25 foreach f [glob *.html] { if {[string match lang_* $f]==0 && [info exists nosearch($f)]==0} { lappend lFiles $f } |
︙ | ︙ |