Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Comment: | Move fileio.html and fileformat.html from the other repository to this one. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
861079d8dcd21b67c81d72f5f6e19036 |
User & Date: | dan 2008-07-22 17:24:03.000 |
2008-07-22
| ||
17:40 | Change the requirement numbers used in fileformat.in and fileio.in to start with "H2". (check-in: c1bf78dd4f user: dan tags: trunk) | |
17:24 | Move fileio.html and fileformat.html from the other repository to this one. (check-in: 861079d8dc user: dan tags: trunk) | |
17:22 | (no comment) (check-in: 906aef1f58 user: dan tags: trunk) | |
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 | /* Style for requirements paragraph. */ .req { margin: 1.0em 10ex; color: darkblue } .subreq { margin: 1.0em 5ex 1.0em 15ex; color: darkblue } .subsubreq { margin: 1.0em 5ex 1.0em 20ex; color: darkblue } .req:before { color: black; content: "[" attr(id) "] "} .subreq:before { color: black; content: "[" attr(id) "] "} .subsubreq:before { color: black; content: "[SUBSUBREQ] " } /* The dark-green color used for headings. */ h1,h2,h3,h4,#toc_header,#document_title { color: #80a796 } .req code {display: block; margin: 0.5em 5ex} code {white-space: pre} /* Style for content headings */ h2 { margin-left: 10px } h3 { margin-left: 20px } h4 { margin-left: 30px } h1,h2,h3,h4 { font-weight: normal } /* Style for document and toc headings */ #toc_header { font-size: 1.5em; margin: 1.0em; } #document_title { font-size: 2em; text-align: center } #toc a { color: darkblue ; text-decoration: none } /* Document font */ body { font-family: sans-serif } /* Margins for block boxes that occur in the document flow. */ p,ul,ol { margin: 1em 5ex } td p, td ul, td ol { margin: 1em auto } /* Table style */ table.striped, table#glossary { margin: 1em auto; width: 80% ; border-spacing: 0} .striped th, #glossary th { white-space:nowrap; text-align:left; border-bottom: solid 1px #444444; padding: 0.2em 1ex; } .striped td, #glossary td { vertical-align: top } .striped td, #glossary td { padding: 0.2em 1ex; } .spacedlist li { margin-top 0.5em ; margin-bottom: 0.5em } /* Style for "todo" notes. These are represented by markup like: ** ** <span class=todo>Fix this bit!</span> ** <p class=todo>Longer todo note.</p> */ .todo { color: #AA3333 ; font-style : italic } .todo:before { content: 'TODO:' } p.todo { border: solid #AA3333 1px; padding: 1ex } cite { font-style: normal; font-weight: normal } cite a { color: inherit; text-decoration: none } :link:hover,:visited:hover { background: wheat } img {display:block} |
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 | function populate_toc () { var children = document.getElementsByTagName("h1").item(0).parentNode.childNodes var toc = "" var counters = new Array() counters[1] = 0 counters[2] = 0 counters[3] = 0 counters[4] = 0 /* Generate the table of contents */ for(var ii=0; ii<children.length; ii++){ var node = children.item(ii) var iHeader = -1 if( node.tagName == "H1" ){ iHeader = 1 } if( node.tagName == "H2" ){ iHeader = 2 } if( node.tagName == "H3" ){ iHeader = 3 } if( node.tagName == "H4" ){ iHeader = 4 } if( iHeader>0 ){ var anchor = "tocentry_" + ii for(var jj=iHeader+1; jj<=4; jj++){ counters[jj] = 0 } counters[iHeader]++ var number = "" for(var jj=1; jj<=iHeader; jj++){ number += counters[jj] + "." } toc += '<div style="margin-left:' + (iHeader*6) + 'ex">' toc += '<a href="#' + anchor + '">' + number + " " + node.innerHTML toc += "</a></div>" var a = '<a style="color:inherit" name="' + anchor + '">' + number + '</a>' node.innerHTML = a + " " + node.innerHTML } } document.getElementById("toc").innerHTML = toc } function number_figs () { /* Number the figures in this document */ var figcounter = 1 var spans = document.getElementsByTagName("span") for(var ii=0; ii<spans.length; ii++){ var s = spans.item(ii) if( s.className=="fig" ){ s.innerHTML = figcounter figcounter++ } } } function populate_refs () { /* Fix up <cite> references */ var cites = document.getElementsByTagName("cite") for(var ii=0; ii<cites.length; ii++){ var t = cites.item(ii).innerHTML var h = document.getElementById(t) if( !h ){ alert("Bad reference: " + t) continue } var label if( h.tagName=="H1" || h.tagName=="H2" || h.tagName=="H3" || h.tagName=="H4" ){ label = h.firstChild.firstChild.data label = label.substring(0, label.length-1) } else { label = h.firstChild.data } cites.item(ii).innerHTML = '<a href="#' + t + '">' + label + '</a>' } } function decorate_tables () { /* Decorate tables */ var tables = document.getElementsByTagName("table") for(var ii=0; ii<tables.length; ii++){ var t = tables.item(ii) if( t.className!="striped" ) continue var rows = t.rows for(var jj=1; jj<rows.length; jj += 2){ rows.item(jj).style.backgroundColor = '#DDDDDD' } } } function check_for_duplicates () { var aReq = new Array(); var ps = document.getElementsByTagName("p") for(var ii=0; ii<ps.length; ii++){ var p = ps.item(ii) if( p.className!="req" || !p.id ) continue; if( aReq[p.id] ){ alert("Duplicate requirement number: " + p.id) } aReq[p.id] = 1; } } onload = function () { number_figs() populate_toc() populate_refs() decorate_tables() check_for_duplicates() } |
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
cannot compute difference between binary files
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> <head> <link type="text/css" rel="stylesheet" href="images/fileformat/rtdocs.css"> <script type="text/javascript" src=images/fileformat/rtdocs.js></script> </head> <div id=document_title>SQLite File IO Specification</div> <div id=toc_header>Table Of Contents</div> <div id=toc> <b>Javascript is required for some features of this document, including table of contents, figure numbering and internal references (section numbers and hyper-links. </b> </div> <!-- End of standard rt docs header --> <h1>Document Overview</h1> <h2>Scope and Purpose</h2> <p> The <i>SQLite File Database File Format</i> document <cite>ff_sqlitert_requirements</cite> contains a description of the serialized format used to store an SQLite database <span class=todo>ref</span>. This document describes, in detail, the methods used by SQLite to safely read and modify a database file stored in a file-system. <p> SQLite does not interact directly with the host operating system that provides the file-system service. Instead, the user is required to supply an adaptor component that implements the <i>SQLite Virtual File System</i> interface (described in <cite>capi_sqlitert_requirements</cite>) used by SQLite. The adaptor component is responsible for translating the calls made by SQLite to the calls to the operating system specific interface provided to access the file-system service. <center><img src="images/fileformat/vfs_role.gif"> <p><i>Figure <span class=fig id=figure_vfs_role></span> - Virtual File System (VFS) Adaptor</i> </center> <p> The descriptions in this document are of the way that SQLite invokes the <i>VFS API</i> provided by the adaptor component depicted in figure <cite>figure_vfs_role</cite>. <h2>Document Organization</h2> <h2>Glossary</h2> <p class=todo> After this document is ready, make the vocabulary consistent and then add a glossary here. <ul> <li>xFileControl <li>xDeviceCharacteristics <li>xSectorSize <li>xAccess <li>xOpen <li>xFullPathname <li>xClose </ul> <h1>VFS Adaptor Implementations</h1> <h2>SQLite File-System Usage </h2> <p> SQLite uses the file-system service made available via the <i>VFS adaptor</i> to create, read and modify files for a variety of purposes, including: <ul> <li> database files, <li> journal files, <li> master journal files, and <li> statement journal files. </ul> <p> A database file is a file-system file used to store an SQLite database image (see <cite>ff_sqlitert_requirements</cite>). <h2 id=fs_characteristics>File-System Characteristics</h2> <p> In the event of an operating system or power failure, the various combinations of file-system and storage hardware available provide varying levels of guarantee as to the integrity of the data written to the file system just before or during the failure. The exact combination of IO operations that SQLite is required to perform in order to safely modify a database file depend on the exact characteristics of the target platform. <p> SQLite queries an implementation for file-system characteristics using the xDeviceCharacteristics() and xSectorSize() methods of the database file file-handle. These two methods are only ever called on file-handles open on database files. They are not called for <i>journal files</i>, <i>master-journal files</i> or <i>temporary database files</i>. <p> The return value of the xSectorSize() method, the <i>sector-size</i>, is expected by SQLite to be a power of 2 value greater than or equal to 512. <p class=todo> What does it do if this is not the case? If the sector size is less than 512 then 512 is used instead. How about a non power-of-two value? UPDATE: How this situation is handled should be described in the API requirements. Here we can just refer to the other document. <p> SQLite assumes that files are stored and written to within the file-system as a collection of blocks (hereafter sectors) of data, each <i>sector-size</i> bytes in size. This model is used to derive the following assumptions related to the expected state of the file-system following a power failure or operating system crash. <ul> <li> After part or all of a file sector has been modified using the xWrite() method of an open file-handle, the sector is said to be in a transient state, where the operating system makes no guarantees about the actual content of the sector on the persistent media. The sector remains in the transient state until the next successful call to xSync() on the same file-handle returns. If a power failure or operating system crash occurs, then part or all of all sectors in the transient state when the crash occured may contain invalid data following system recovery. <li> Following a power failure or operating system crash, the content of all sectors that were not in a transient state when the crash occured may be trusted. </ul> <p class=todo> What do we assume about the other three file-system write operations - xTruncate(), xDelete() and "create file"? <p> For example, if the <i>sector-size</i> of a given file-system is 2048 bytes, and SQLite opens a file and writes a 1024 byte block of data to offset 3072 of the file, then according to the model the second sector of the file is in the transient state. If a power failure or operating system crash occurs before or during the next call to xSync() on the file handle, then following system recovery SQLite assumes that all file data between byte offsets 2048 and 4095, inclusive, is invalid. It also assumes that since the first sector of the file, containing the data from byte offset 0 to 2047 inclusive, is valid, since it was not in a transient state when the crash occured. <p> The xDeviceCharacteristics() method returns a set of flags, indicating which of the following properties (if any) the file-system provides: <ul> <li>The <b><i>sequential IO</i></b> property. If a file-system has this property, then in the event of a crash at most a single sector may contain invalid data. The file-system guarantees <li>The <b><i>safe-append</i></b> property. <li>The <b><i>atomic write</i></b> property. </ul> <p class=todo> Write an explanation as to how the file-system properties influence the model used to predict file damage after a catastrophy. <h1>Database Connections</h1> <p> Within this document, the term <i>database connection</i> has a slightly different meaning from that which one might assume. The handles returned by the <code>sqlite3_open()</code> and <code>sqlite3_open16()</code> APIs (<span class=todo>reference</span>) are referred to as <i>database handles</i>. A <i>database connection</i> is a connection to a single database file using a single file-handle, which is held open for the lifetime of the connection. Using the "ATTACH" syntax, multiple <i>database connections</i> may be accessed via a single <i>database handle</i>. Or, using SQLite's <i>shared-cache mode</i> feature, multiple <i>database handles</i> may access a single <i>database connection</i>. <center><img src="images/fileformat/db_connection.gif"> <p><i>Figure <span class=fig id=figure_db_connection></span> - Relationship between Database Connections and Database Handles.</i> </center> <p> As well as a file-handle open on the database file, each <i>database connection</i> has a <i>page cache</i> associated with it. The <i>page cache</i> is used to cache data read from the database file to reduce the amount of data that must be read from the file-handle. It is also used to accumulate data written to the database file so that write operations can be batched for greater efficiency. Figure <cite>figure_db_connection</cite> illustrates a system containing two database connections, each to a separate database file. The leftmost of the two depicted <i>database connections</i> is shared between two <i>database handles</i>. The connection illustrated towards the right of the diagram is used by a single <i>database handle</i>. <p> A database connection is always in one of the following states: <ol> <li><i>Unlocked state</i> (no transaction). <li><i>Shared lock state</i> (read-only transaction). <li><i>Reserved lock state</i> (read/write transaction). <li><i>Pending lock state</i> (read/write transaction). <li><i>Exclusive lock state</i> (read/write transaction). </ol> <p> Obviously, each state corresponds to the type of lock held on the database file. In some cases, various actions apart from simply obtaining the file-system lock must take place when a <i>database connection</i> transitions from one state to another. <p class=todo> Maybe a state diagram will be possible... <h2 id=open_new_connection>Opening a New Connection</h2> <p> Opening a new database connection is a two-step process: <ol> <li> A file-handle is opened on the database file. <li> If step 1 was successful, an attempt is made to read the <i>database file header</i> from the database file using the new file-handle. </ol> <p> In step 2 of the procedure above, the database file is not locked before it is read from. This is the only exception to the locking rules described in section <cite>reading_data</cite>. <p> One reason for attempting to read the <i>database file header</i> is to determine the <i>page-size</i> used by the database file. Because it is not possible to be certain as to the <i>page-size</i> without holding at least a <i>shared lock</i> on the database file (because some other <i>database connection</i> might have changed it since the <i>database file header</i> was read), the value read from the <i>database file header</i> is known as the <i>expected page size</i>. <p class=req id=FS0060> When a new <i>database connection</i> is required, SQLite shall attempt to open a file-handle on the database file. If the attempt fails, then no new <i>database connection</i> is created and an error returned. <p class=req id=FS0070> When a new <i>database connection</i> is required, after opening the new file-handle, SQLite shall attempt to read the first 100 bytes of the database file. If the attempt fails for any other reason than that the opened file is less than 100 bytes in size, then the file-handle is closed, no new <i>database connection</i> is created and an error returned instead. <p class=req id=FS0080> If the <i>database file header</i> is successfully read from a newly opened database file, the connections <i>expected page-size</i> shall be set to the value stored in the <i>page-size field</i> of the database header. <p class=req id=FS0090> If the <i>database file header</i> cannot be read from a newly opened database file (because the file is less than 100 bytes in size), the connections <i>expected page-size</i> shall be set to the compile time value of the SQLITE_DEFAULT_PAGESIZE option. <h2>Closing a Connection</h2> <p class=todo> Close file handles. Discard pager cache. </ol> <h2>The Page Cache</h2> <h1 id=reading_data>Reading Data</h1> <p> In order to return data from the database to the user, for example as the results of a SELECT query, SQLite must at some point read data from the database file. Usually, data is read from the database file in aligned blocks of <i>page-size</i> bytes. The exception is when the database file header fields are being inspected, before the <i>page-size</i> used by the database can be known. <p> With two exceptions, a <i>database connection</i> must have an open transaction (either a <i>read-only transaction</i> or a <i>read/write transaction</i>) on the database file before data may be read from the database connection. In this case, data "read from the database connection" includes data that is read from the database file and data that is already present in the <i>page cache</i>. Without an open transaction on the database file, the contents of the <i>page cache</i> may not be trusted. <p> The two exceptions are: <ul> <li> When an attempt is made to read the 100 byte <i>database file header</i> immediately after opening the <i>database connection</i> (see section <cite>open_new_connection</cite>). When this occurs no lock is held on the database file. <li> Data read while in the process of opening a read-only transaction (see section <cite>open_read_only_trans</cite>). These occur after a <i>shared lock</i> is held on the database file. </ul> <p> Once a transaction has been opened, reading data from a database connection is a simple operation. Using the xRead() method of the file-handle open on the database file, the required database file pages are read one at a time. SQLite never reads partial pages and always uses a single call to xRead() for each required page. After reading the data for a database page, SQLite adds it to the connections <i>page cache</i> so that it does not have to be read if required again. Refer to section <cite>page_cache_algorithms</cite> for a description of how this affects the IO performed by SQLite. <p class=req id=FS0010> Except for the read operation required by FS0070 and those reads made as part of opening a read-only transaction, SQLite shall only read data from a <i>database connection</i> while the <i>database connection</i> has an open read-only or read/write transaction. <p> In the above requirement, reading data from a database connection includes retrieving data from the connections <i>page cache</i>. <p class=req id=FS0020> Aside from those read operations described by FS0070 and FSXXXX, SQLite shall read data from the database in aligned blocks of <i>page-size</i> bytes, where <i>page-size</i> is the database page size used by the database file. <h2 id=open_read_only_trans>Opening a Read-Only Transaction</h2> <p> Before data may be read from a <i>database connection</i>, a <i>read-only transaction</i> must be successfully opened (this is true even if the connection will eventually write to the database, as a <i>read/write transaction</i> may only be opened by upgrading from a <i>read-only transaction</i>). This section describes the procedure for opening a <i>read-only transaction</i>. <p> The key element of a <i>read-only transaction</i> is that the file-handle open on the database file obtains and holds a <i>shared-lock</i> on the database file. Because a connection requires an <i>exclusive-lock</i> before it may actually modify the contents of the database file, and by definition while one connection is holding a <i>shared-lock</i> no other connection may hold an <i>exclusive-lock</i>, holding a <i>shared-lock</i> guarantees that no other process may modify the database file while the <i>read-only transaction</i> remains open. <p>Obtaining the <i>shared lock</i> itself on the database file is quite simple, SQLite just calls the xLock() method of the database file handle. Some of the other processes that take place as part of opening the <i>read-only transaction</i> are quite complex. The list of steps SQLite is required to take to open a <i>read-only transaction</i>, in the order in which the must occur, is as follows: <ol> <li>A <i>shared-lock</i> is obtained on the database file. <li>The connection checks if a <i>hot journal file</i> exists in the file-system. If one does, then it is rolled back before continuing. <li>The connection checks if the data in the <i>page cache</i> may still be trusted. If not, all page cache data is discarded. <li>If the file-size is not zero bytes and the page cache does not contain valid data for the first page of the database, then the data for the first page must be read from the database. </ol> <p> Of course, an error may occur while attempting any of the 4 steps enumerated above. If this happens, then the <i>shared-lock</i> is released (if it was obtained) and an error returned to the user. Step 2 of the procedure above is described in more detail in section <cite>hot_journal_detection</cite>. Section <cite>cache_validation</cite> describes the process identified by step 3 above. Further detail on step 4 may be found in section <cite>read_page_one</cite>. <p class=req id=FS0100> When required to open a <i>read-only transaction</i> using a <i>database connection</i>, SQLite shall first attempt to obtain a <i>shared-lock</i> on the file-handle open on the database file. <p class=req id=FS0110> If, while opening a <i>read-only transaction</i>, SQLite fails to obtain the <i>shared-lock</i> on the database file, then the process is abandoned, no transaction is opened and an error returned to the user. <p> The most common reason an attempt to obtain a <i>shared-lock</i> may fail is that some other connection is holding an <i>exclusive</i> or <i>pending lock</i>. However it may also fail because some other error (e.g. IO, comms related) occurs within the call to the xLock() method. <p class=req id=FS0030> While opening a <i>read-only transaction</i>, after successfully obtaining a <i>shared lock</i> on the database file, SQLite shall attempt to detect and roll back a <i>hot journal file</i> associated with the same database file. <p class=req id=FS0120> If, while opening a <i>read-only transaction</i>, SQLite encounters an error while attempting to detect or roll back a <i>hot journal file</i>, then the <i>shared-lock</i> on the database file is released, no transaction is opened and an error returned to the user. <p> Section <cite>hot_journal_detection</cite> contains a description of and requirements governing the detection of a hot-journal file refered to in the above requirements. <p class=req id=FS0040> Assuming no errors have occured, then after attempting to detect and roll back a <i>hot journal file</i>, if the connections <i>page cache</i> is not empty, then SQLite shall validate the contents of the <i>page cache</i> by testing the <i>file change counter</i>. This procedure is known as <i>cache validiation</i>. <p class=req id=FS0050> If the contents of the <i>page cache</i> are found to be invalid by the check prescribed by F20040, SQLite shall discard the cache contents before continuing. <h3 id=hot_journal_detection>Hot Journal Detection</h3> <p> This section describes the procedure that SQLite uses to detect a <i>hot journal file</i>. If a <i>hot journal file</i> is detected, this indicates that at some point the process of writing a transaction to the database was interrupted and a recovery operation (<i>hot journal rollback</i>) needs to take place. <p> The procedure used to detect a <i>hot-journal file</i> is quite complex. The following steps take place: <ol class=spacedlist> <li>Using the VFS xAccess() method, SQLite queries the file-system to see if the journal file associated with the database exists. If it does not, then there is no hot-journal file. <li>By invoking the xCheckReservedLock() method of the file-handle opened on the database file, SQLite checks if some other connection holds a <i>reserved lock</i> or greater. If some other connection does hold a <i>reserved lock</i>, this indicates that the other connection is midway through a <i>read/write transaction</i> (see section <cite>writing_data</cite>). In this case the <i>journal file</i> is not a <i>hot-journal</i> and must not be rolled back. <li>Using the xFileSize() method of the file-handle opened on the database file, SQLite checks if the database file is 0 bytes in size. If it is, the journal file is not considered to be a <i>hot journal</i> file. Instead of rolling back the journal file, in this case it is deleted from the file-system by calling the VFS xDelete() method. <span class=todo>Technically, there is a race condition here. This step should be moved to after the exclusive lock is held.</span> <li>An attempt is made to upgrade to an <i>exclusive lock</i> on the database file. If the attempt fails, then all locks, including the recently obtained <i>shared lock</i> are dropped. The attempt to open a <i>read-only transaction</i> has failed. This occurs when some other connection is also attempting to open a <i>read-only transaction</i> and the attempt to gain the <i>exclusive lock</i> fails because the other connection is also holding a <i>shared lock</i>. It is left to the other connection to roll back the <i>hot journal</i>. <div style="margin-top:0.5em"></div> It is important that the file-handle lock is upgraded directly from <i>shared</i> to <i>exclusive</i> in this case, instead of first upgrading to <i>reserved</i> or </i>pending</i> locks as is required when obtaining an <i>exclusive lock</i> to write to the database file (section <cite>writing_data</cite>). If SQLite were to first upgrade to a <i>reserved</i> or <i>pending</i> lock in this scenario, then a second process also trying to open a <i>read-transaction</i> on the database file might detect the <i>reserved</i> lock in step 2 of this process, conclude that there was no <i>hot journal</i>, and commence reading data from the <i>database file</i>. <li>The xAccess() method is invoked again to detect if the journal file is still in the file system. If it is, then it is a hot-journal file and SQLite tries to roll it back (see section <cite>rollback</cite>). </ol> <p> The following requirements describe step 1 of the above procedure in more detail. <p class=req id=FS0140> When required to attempt to detect a <i>hot-journal file</i>, SQLite shall first use the xAccess() method of the VFS layer to check if a journal file exists in the file-system. <p class=req id=FS0150> When required to attempt to detect a <i>hot-journal file</i>, if the call to xAccess() required by FS0140 indicates that a journal file does not exist, then the attempt to detect a <i>hot-journal file</i> is finished. A <i>hot-journal file</i> was not detected. <p> The following requirements describe step 2 of the above procedure in more detail. <p class=req id=FS0160> When required to attempt to detect a <i>hot-journal file</i>, if the call to xAccess() required by FS0140 indicates that a journal file is present, then the xCheckReservedLock() method of the database file file-handle is invoked to determine whether or not some other process is holding a <i>reserved</i> or greater lock on the database file. <p class=req id=FS0170> If the call to xCheckReservedLock() required by FS0160 indicates that some other <i>database connection</i> is holding a <i>reserved</i> or greater lock on the database file, <p class=todo> Finish this section. <h3 id=cache_validation>Cache Validation</h3> <p> When a <i>database connection</i> opens a <i>read transaction</i>, the associated <i>page cache</i> may already contain data. However, if another process has modified the database file since the cached pages were loaded it is possible that the cached data is invalid. <p> SQLite determines whether or not the contents of a <i>page cache</i> are valid or not using the <i>file change counter</i>, a field in the <i>database file header</i>. The <i>file change counter</i> is a 4-byte big-endian integer field stored starting at byte offset 24 of the <i>database file header</i>. Before the conclusion of a <i>read/write transaction</i> that modifies the contents of the database file in any way (see section <cite>writing_data</cite>), the value stored in the <i>file change counter</i> is incremented. When a <i>database connection</i> unlocks the database file, it stores the current value of the <i>file change counter</i>. Later, while opening a new <i>read-only transaction</i>, SQLite checks the value of the <i>file change counter</i> stored in the database file. If the value has not changed since the database file was unlocked, then the contents of the <i>page cache</i> can be trusted. If the value has changed, then the <i>page cache</i> cannot be trusted and all data is discarded. <p class=req id=FS0180> When a file-handle open on a database file is unlocked, if the <i>page cache</i> belonging to the associated <i>database connection</i> is not empty, SQLite shall store the value of the <i>file change counter</i> internally. <p class=req id=FS0190> When required to perform <i>cache validation</i> as part of opening a <i>read transaction</i>, SQLite shall read a 16 byte block starting at byte offset 24 of the <i>database file</i> using the xRead() method of the <i>database connections</i> file handle. <p class=todo> Why a 16 byte block? Why not 4? (something to do with encrypted databases). <p class=req id=FS0200> While performing <i>cache validation</i>, after loading the 16 byte block as required by FS0190, SQLite shall compare the 32-bit big-endian integer stored in the first 4 bytes of the block to the most recently stored value of the <i>file change counter</i> (see FS0180). If the values are not the same, then SQLite shall conclude that the contents of the cache are invalid. <p> Requirement FS0050 (section <cite>open_read_only_trans</cite>) specifies the action SQLite is required to take upon determining that the cache contents are invalid. <h3 id=read_page_one>Page 1 and the Expected Page Size</h3> <p> As the last step in opening a <i>read transaction</i> on a database file that is more than 0 bytes in size, SQLite is required to load data for page 1 of the database into the <i>page cache</i>, if it is not already there. This is slightly more complicated than it seems, as the database <i>page-size</i> is no known at this point. <p> Even though the database <i>page-size</i> cannot be known for sure, SQLite is usually able to guess correctly by assuming it to be equal to the connections <i>expected page size</i>. The <i>expected page size</i> is the value of the <i>page-size</i> field read from the <i>database file header</i> while opening the database connection (see section <cite>open_new_connection</cite>), or the <i>page-size</i> stored of the database file when the most <i>read transaction</i> was concluded. <p class=req id=FS0210> During the conclusing of a <i>read transaction</i>, before unlocking the database file, SQLite shall set the connections <i>expected page size</i> to the current database <i>page-size</i>. <p class=req id=FS0220> As part of opening a new <i>read transaction</i>, immediately after performing <i>cache validation</i>, if there is no data for database page 1 in the <i>page cache</i>, SQLite shall read <i>N</i> bytes from the start of the database file using the xRead() method of the connections file handle, where <i>N</i> is the connections current <i>expected page size</i> value. <p class=req id=FS0230> If page 1 data is read as required by FS0230, then the value of the <i>page-size</i> field that appears in the database file header that consumes the first 100 bytes of the read block is not the same as the connections current <i>expected page size</i>, then the <i>expected page size</i> is set to this value, the database file is unlocked and the entire procedure to open a <i>read transaction</i> is repeated. <p class=req id=FS0240> If page 1 data is read as required by FS0230, then the value of the <i>page-size</i> field that appears in the database file header that consumes the first 100 bytes of the read block is the same as the connections current <i>expected page size</i>, then the block of data read is added to the connections <i>page cache</i> as page 1. <h2>Ending a Read-only Transaction</h2> <p> To end a <i>read-only transaction</i>, SQLite simply relinquishes the <i>shared lock</i> on the file-handle open on the database file. No other action is required. <p class=req id=FS0130> When required to end a <i>read-only transaction</i>, SQLite shall relinquish the <i>shared lock</i> held on the database file by calling the xUnlock() method of the file-handle. <p> See also requirements FS0180 and FS0210 above. <h1 id=writing_data>Writing Data</h1> <p> Safely writing data to a database file is also a complex procedure. The database file must be updated in such a way that if a power failure, operating system crash or application fault occurs while SQLite is midway through writing to the database file the database contents are still accessible and correct after system recovery. <p> Logically, an SQLite database file is modified using <i>write transactions</i>. Each <i>write transaction</i> may contain any number of modifications to the database files content or size. From the point of view of an external observer (a second <i>database connection</i>) an entire <i>write transaction</i> is applied to the database file atomically. If a failure of some sort occurs while SQLite is midway through applying a <i>write transaction</i> to a database file, then it must appear from the point of view of the next <i>database connection</i> that reads data from the <i>database file</i> that the aborted transaction was not applied. <p> SQLite accomplishes these goals using two techniques: <ul class=spacedlist> <li>While modifying the content or size of a <i>database file</i> to apply a <i>write transaction</i>, SQLite maintains an <i>exclusive lock</i> on the <i>database file</i>. Because reading from the <i>database file</i> requires a <i>shared lock</i> (see section <cite>reading_data</cite>), and because holding an <i>exclusive lock</i> guarantees that no other <i>database connection</i> is holding or can obtain a <i>shared lock</i>, this ensures that no other connection may read data from the <i>database file</i> at a point when a <i>write transaction</i> has been partially applied. This alone ensures that the absence of an application or system failure <i>write transactions</i> appear to be atomically applied from the point of view of a second <i>database connection</i>. <li>In almost all cases, before the contents of a page of the <i>database file</i> may be modified or deleted by file truncation, the original contents of that page is stored in the <i>journal file</i>. Similarly, before the size of the <i>database file</i> may be modifed (either by extending or truncating the file), then the original size of the database file is stored in the <i>journal file</i>. If an application or system failure occurs while updating the <i>database file</i>, then the database file content may be restored based on the contents of the <i>journal file</i> before data is next read from it. Restoring the contents of a <i>database file</i> using the contents of the <i>journal file</i> after an application or system failure occurs is known as <i>hot-journal rollback</i> and is described in section <cite>hot_journal_rollback</cite>. The methods used by SQLite for detecting that <i>hot-journal rollback</i> is required is detailed in section <cite>hot_journal_detection</cite>. </ul> <p> The <i>page cache</i> belonging to the <i>database connection</i> is used to buffer writes before they are written to the <i>database file</i>. Often, all changes for an entire <i>write transaction</i> are accumulated within the <i>page cache</i>. In this case no write operations are performed on the database file until the user commits the transaction. <p> Even if an application or system failure does not occur while a <i>write transaction</i> is in progress, a rollback operation to restore the database file to the state that it was in before the transaction started may be required. This may occur if the user explicitly requests transaction rollback (i.e. by issuing a "ROLLBACK" command), or automatically, as a result of encountering an SQL constraint (see <cite>sql_sqlitert_requirements</cite>). For this reason, the original page content is stored in the <i>journal file</i> before the page is even modified within the <i>page cache</i>. <p class=req id=FS0250> Before modifying or adding any in-memory <i>page cache</i> pages in preparation for writing to the <i>database file</i>, the <i>database connection</i> shall open a <i>write transaction</i> on the database file. <p class=req id=FS0260> Before modifying the <i>page cache</i> image of a database page that existed and was not a <i>free-list leaf</i> page when the current <i>write transaction</i> began, SQLite shall ensure that the original page content has been written to the journal file (<i>journalled</i>). <p class=todo> If the sector size is larger than the page-size, coresident pages must also be journalled. <p> The process of <i>journalling a database page</i> is described in detail in section <cite>journalling_a_page</cite>. <p> Eventually, the content of pages modified by a transaction must be copied from the <i>page cache</i> and into the actual database file. This may occur for either of the following two reasons: <ol> <li> Because the <i>write transaction</i> is being committed (section <cite>committing_a_transaction</cite>), or <li> To free up memory if the number of modified pages grows too large (see section <cite>page_cache_algorithms</cite>). </ol> <p> In both cases, the region of the journal file containing the original data for the pages being modified within the database file must be flushed through to the persistent media before the database file may be written to. This is to ensure that the original data is recoverable in the event of a system failure. This process is known as <i>syncing the journal file</i> and is described in section <cite>syncing_journal_file</cite>. <p> A <i>write transaction</i> may be terminated in one of two ways. It may be committed, meaning that the changes involved in the transaction are written to the database file, or rolled back, meaning no changes are applied. Committing a transaction is described in section <cite>committing_a_transaction</cite>. Transaction rollback is described in section <cite>rollback</cite>. <p> Figure <cite>figure_write_transaction</cite> depicts an overview of an entire <i>write transaction</i>. This is intended to be illustrative only, many operations are omitted. <center><img src="images/fileformat/write_transaction.gif"> <p><i>Figure <span class=fig id=figure_write_transaction></span> - Progression of a Write Transaction</i> </center> <h2>Journal File Format</h2> <h3 id=journal_header_format>Journal Header Format</h3> <p> A <i>journal header</i> is <i>sector-size</i> bytes in size, where <i> sector-size</i> is the value returned by the xSectorSize method of the file handle opened on the database file. <center><img src="images/fileformat/journal_header.gif"> <p><i>Figure <span class=fig id=figure_journal_record></span> - Journal Header Format</i> </center> <h3 id=journal_record_format>Journal Record Format</h3> <center><img src="images/fileformat/journal_record.gif"> <p><i>Figure <span class=fig id=figure_journal_record></span> - Journal Record Format</i> </center> <h3>Master Journal Pointer</h3> <h2>Write Transactions</h2> <h3>Beginning a Write Transaction</h3> <p> Before any database pages may be modified within the <i>page cache</i>, the <i>database connection</i> must open a <i>write transaction</i>. Opening a <i>write transaction</i> requires that the <i>database connection</i> obtains a <i>reserved lock</i> (or greater) on the <i>database file</i>. Because a obtaining a <i>reserved lock</i> on a <i>database file</i> guarantees that no other <i>database connection</i> may hold or obtain a <i>reserved lock</i> or greater, it follows that no other <i>database connection</i> may have an open <i>write transaction</i>. <p> A <i>reserved lock</i> on the <i>database file</i> may be thought of as an exclusive lock on the <i>journal file</i>. No <i>database connection</i> may read from or write to a <i>journal file</i> without a <i>reserved</i> or greater lock on the corresponding <i>database file</i>. <p> Before opening a <i>write transaction</i>, a <i>database connection</i> must have an open <i>read transaction</i>, opened via the procedure described in section <cite>open_read_only_trans</cite>. This ensures that there is no <i>hot-journal file</i> that needs to be rolled back and that the content of the <i>page cache</i>, if any, can be trusted. <p> Once a <i>read transaction</i> has been opened, upgrading to a <i>write transaction</i> is a two step process, as follows: <ol> <li>A <i>reserved lock</i> is obtained on the <i>database file</i>. <li>The <i>journal file</i> is opened and created if necessary (using the VFS xOpen method), and a <i>journal file header</i> written to the start of it using a single call to the file handles xWrite method. </ol> <p> Requirements describing step 1 of the above procedure in detail: <p class=req id=FS0350> When required to open a <i>write transaction</i> on the database, SQLite shall first open a <i>read transaction</i>, if the <i>database connection</i> in question has not already opened one. <p class=req id=FS0360> When required to open a <i>write transaction</i> on the database, after ensuring a <i>read transaction</i> has already been opened, SQLite shall obtain a <i>reserved lock</i> on the database file by calling the xLock method of the file-handle open on the database file. <p> Requirements describing step 2 of the above procedure in detail: <p class=req id=FS0370> When required to open a <i>write transaction</i> on the database, after obtaining a <i>reserved lock</i> on the database file, SQLite shall open a read/write file-handle on the corresponding <i>journal file</i>. <p class=req id=FS0380> When required to open a <i>write transaction</i> on the database, after opening a file-handle on the <i>journal file</i>, SQLite shall write a <i>journal header</i> into the first <i>sector-size</i> bytes of the journal file, using single call to the xWrite method of the recently opened file-handle. <p> Requirements describing the <i>journal header</i> written to the <i>journal file</i>: <p class=req id=FS0390> The first 8 bytes of the <i>journal header</i> required to be written by FS0380 shall be: <p class=todo> Reqirements describing the details of opening a <i>write transaction</i>. <p class=todo> Reqirement for error handling? <h3 id=journalling_a_page>Journalling a Database Page</h3> <p> Before modifying a database page within the <i>page cache</i>, the page must be <i>journalled</i>. <i>Journalling a page</i> is the process of copying that pages original data into the journal file so that it can be recovered if the <i>write transaction</i> is rolled back. <p> A page is journalled by adding a <i>journal record</i> to the <i> journal file</i>. The format of a <i>journal record</i> is described in section <cite>journal_record_format</cite>. <p class=req id=FS0270> When required to <i>journal a database page</i>, SQLite shall first append the <i>page number</i> of the page being journalled to the <i>journal file</i>, formatted as a 4-byte big-endian unsigned integer, using a single call to the xWrite method of the file-handle opened on the journal file. <p class=req id=FS0280> When required to <i>journal a database page</i>, if the attempt to append the <i>page number</i> to the journal file is successful, then the current page data (<i>page-size</i> bytes) shall be appended to the journal file, using a single call to the xWrite method of the file-handle opened on the journal file. <p class=req id=FS0290> When required to <i>journal a database page</i>, if the attempt to append the current page data to the journal file is successful, then SQLite shall append a 4-byte big-endian integer checksum value to the to the journal file, using a single call to the xWrite method of the file-handle opened on the journal file. <p> The checksum value written to the <i>journal file</i> immediately after the page data (requirement FS0290), is a function of both the page data and the <i>checksum initializer</i> field stored in the <i>journal header</i> (see section <cite>journal_header_format</cite>). Specifically, it is the sum of the <i>checksum initializer</i> and the value of every 200th byte of page data interpreted as an 8-bit unsigned integer, starting with the (<i>page-size</i> % 200)'th byte of page data. For example, if the <i>page-size</i> is 1024 bytes, then a checksum is calculated by adding the values of the bytes at offsets 23, 223, 423, 623, 823 and 1023 (the last byte of the page) together with the value of the <i>checksum initializer</i>. <p class=req id=FS0300> The checksum value written to the <i>journal file</i> by the write required by FS0290 shall be equal to the sum of the <i>checksum initializer</i> field stored in the <i>journal header</i> (FS0XXX) and every 200th byte of the page data, beginning with the (<i>page-size</i> % 200)th byte. <p> The '%' character is used in the two paragraphs to represent the modulo operator, just as it is in programming languages such as C, Java and Javascript. <h3 id=syncing_journal_file>Syncing the Journal File</h3> <p> Even after the original data of a database page has been written into the journal file using calls to the journal file file-handle xWrite method (section <cite>journalling_a_page</cite>), it is still not safe to write to the page within the database file. This is because in the event of a system failure the data written to the journal file may still be corrupted (see section <cite>fs_characteristics</cite>). Before the page can be updated within the database itself, the following procedure takes place: <ol> <li> The xSync method of the file-handle opened on the journal file is called. This operation ensures that all <i>journal records</i> in the journal file have been written to persistent storage, and that they will not become corrupted as a result of a subsequent system failure. <li> The <i>journal record count</i> field (see section <cite>journal_header_format</cite>) of the most recently written journal header in the journal file is updated to contain the number of <i>journal records</i> added to the journal file since the header was written. <li> The xSync method is called again, to ensure that the update to the <i>journal record count</i> has been committed to persistent storage. </ol> <p> If all three of the steps enumerated above are executed succesfully, then it is safe to modify the content of the <i>journalled</i> database pages within the database file itself. The combination of the three steps above is refered to as <i>syncing the journal file</i>. <h3 id=upgrading_to_exclusive_lock>Upgrading to an Exclusive Lock</h3> <p> Before the content of a page modified within the <i>page cache</i> may be written to the database file, an <i>exclusive lock</i> must be held on the database file. The purpose of this lock is to prevent another connection from reading from the database file while the first connection is midway through writing to it. Whether the reason for writing to the database file is because a transaction is being committed, or to free up space within the <i>page cache</i>, upgrading to an <i>exclusive lock</i> always occurs immediately after <i>syncing the journal file</i>. <h3 id=committing_a_transaction>Committing a Transaction</h3> <p> Committing a <i>write transaction</i> is the final step in updating the database file. <ol> <li> Update the change counter. <li> Sync the journal file. <li> Obtain exclusive lock <li> Write the database file. <li> Sync the database file. <li> Delete the journal file. </ol> <h4 id=writing_out_cache>Writing out the Page Cache</h4> <p> When a modification is made to the database, the change is first applied in-memory, to pages stored in the <i>page cache</i>. The process of copying the modified pages from the <i>page cache</i> to the actual database file in the file system is known as <i>writing out the page cache</i>. There are two circumstances in which this may occur: <ol> <li> Because the <i>write transaction</i> is being committed (section <cite>committing_a_transaction</cite>), or <li> To free up memory if the number of modified pages grows too large (see section <cite>page_cache_algorithms</cite>). </ol> <p> Before any data can be written into the database file, it must be locked with an <i>exclusive</i> lock if it is not already. This is to prevent any other <i>database connection</i> from reading the database after a subset of the modifications that have been or will be made by a <i>write transaction</i> have been written into the database file. <p class=todo> Journal header operations? <p class=req id=FS0310> Unless a <i>pending</i> or <i>exclusive</i> lock has already been obtained, when SQLite is required to <i>write out a page cache</i>, it shall first upgrade the lock on the database file to a <i>pending lock</i> using a call to the xLock method of the file-handle open on the database file. <p class=req id=FS0320> Unless one has already been obtained, when SQLite is required to <i>write out a page cache</i>, after successfully obtaining a <i>pending lock</i> it shall upgrade the lock on the database file to an <i>exclusive lock</i> using a call to the xLock method of the file-handle open on the database file. <p class=todo> If obtaining the lock fails? <p class=req id=FS0330> When SQLite is required to <i>write out a page cache</i>, if the required <i>exclusive lock</i> is already held or successfully obtained, SQLite shall copy the contents of all pages that have been modified within the <i>page cache</i> to the database file, using a single write of <i>page-size</i> bytes for each. <p class=req id=FS0340> When the modified contents of a <i>page cache</i> is copied into the database file, as required by FS0330, the write operations shall occur in <i>page number</i> order, from lowest to highest. <p> The above requirement to write data to the database file in the order in which it occurs in the file is added to improve performance. On many systems, sorting the regions of the file to be written before writing to them allows the storage hardware to operate more efficiently. <h2>Statement Transactions</h2> <h2>Multi-File Transactions</h2> <h1 id=rollback>Rollback</h1> <h2 id=hot_journal_rollback>Hot Journal Rollback</h2> <h2>Transaction Rollback</h2> <h2>Statement Rollback</h2> <h1 id=page_cache_algorithms>Page Cache Algorithms</h1> <h1>References</h1> <table id="refs" style="width:auto; margin: 1em 5ex"> <tr><td style="width:5ex" id="capi_sqlitert_requirements">[1]<td> C API Requirements Document. <tr><td style="width:5ex" id="sql_sqlitert_requirements">[2]<td> SQL Requirements Document. <tr><td style="width:5ex" id="ff_sqlitert_requirements">[3]<td> File Format Requirements Document. </table> |