SQLite: Check-in [a71a22b52f]


Overview

Comment:	Add a large comment to wal.c describing the WAL and wal-index file formats.
SHA1:	a71a22b52f4570e934063553a81b39268127dc44
User & Date:	drh 2010-05-19 01:53:54.000

Context

2010-05-19
17:49		Refactoring some variable names in wal.c. (check-in: 1d201ff51f user: drh tags: trunk)
01:53		Add a large comment to wal.c describing the WAL and wal-index file formats. (check-in: a71a22b52f user: drh tags: trunk)
2010-05-18
23:29		Update the wal-index hash format so that hash-table space is reused following a rollback, thus preventing hash table overflows. Add assert()s to verify that hash tables do not overfill. Further refactoring of the wal-index code. (check-in: ada9a8c7b6 user: drh tags: trunk)

Changes

Changes to src/wal.c.

1 2 3 4 5 6 7 8 9 10 11 12 ~~13 14~~ 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 ~~65 66~~ 67 68 69 70 71 72 73	/* 2010 February 1 The author disclaims copyright to this source code. In place of a legal notice, here is a blessing: May you do good and not evil. May you find forgiveness for yourself and forgive others. May you share freely, never taking more than you give. ********************************************************************* This file contains the implementation of a write-ahead log ~~file~~ used in "journal_mode=~~wal~~" mode. WRITE-AHEAD LOG (WAL) FILE FORMAT A wal file consists of a header followed by zero or more "frames". Each frame records the revised content of a single page ~~within~~ the database file. All changes to the database are recorded by writing frames into the WAL. Transactions commit when a frame is written that contains a commit marker. A single WAL can and usually does record multiple transactions. Periodically, the content of the WAL is transferred back into the database file in an operation called a "checkpoint". A single WAL file can be used multiple times. In other words, the WAL can fill up with frames and then be checkpointed~~. T~~hen new frames can overwrite the old ones. A WAL always grows from beginning toward the end. Checksums and counters attached to each frame are used to determine which frames within the WAL are valid and which are leftovers from prior checkpoints. The WAL header is 12 bytes in size and consists of the following three big-endian 32-bit unsigned integer values: 0: Database page size, 4: Randomly selected salt value 1, 8: Randomly selected salt value 2. Immediately following the header are zero or more frames. Each frame ~~itself~~ consists of a 16-byte header followed by a <page-size> bytes of page data. 
The header is broken into 4 big-endian 32-bit unsigned integer values, as follows: 0: Page number. 4: For commit records, the size of the database image in pages after the commit. For all other records, zero. 8: Checksum value 1. 12: Checksum value 2. READER ALGORITHM To read a page from the database (call it page number P), a reader first checks the WAL to see if it contains page P. If so, then the last valid instance of page P that is or is followed by a commit frame become the value read. If the WAL contains no copies of page P that are valid and which are or are followed by a commit frame, then page P is read from the database file. The reader algorithm in the previous paragraph works correctly, but because frames for page P can appear anywhere within the WAL, the reader has to scan the e~~ither~~ WAL looking for page P frames. If the WAL is large (multiple megabytes is typical) that scan can be slow, and read performanc suffers. To overcome this problem, a separate datastructure called the wal-index is maintained to expedite the search for frames of a particular page. WAL-INDEX FORMAT Conceptually, the wal-index is shared memory, though VFS implementations might choose to implement the wal-index using a mmapped file. Because ** the wal-index is shared memory, SQLite does not support journal_mode=WAL	\| \| \| \| \| \| \| \|	1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73	/* 2010 February 1 The author disclaims copyright to this source code. In place of a legal notice, here is a blessing: May you do good and not evil. May you find forgiveness for yourself and forgive others. May you share freely, never taking more than you give. ********************************************************************* This file contains the implementation of a write-ahead log (WAL) used in "journal_mode=WAL" mode. 
WRITE-AHEAD LOG (WAL) FILE FORMAT A wal file consists of a header followed by zero or more "frames". Each frame records the revised content of a single page from the database file. All changes to the database are recorded by writing frames into the WAL. Transactions commit when a frame is written that contains a commit marker. A single WAL can and usually does record multiple transactions. Periodically, the content of the WAL is transferred back into the database file in an operation called a "checkpoint". A single WAL file can be used multiple times. In other words, the WAL can fill up with frames and then be checkpointed and then new frames can overwrite the old ones. A WAL always grows from beginning toward the end. Checksums and counters attached to each frame are used to determine which frames within the WAL are valid and which are leftovers from prior checkpoints. The WAL header is 12 bytes in size and consists of the following three big-endian 32-bit unsigned integer values: 0: Database page size, 4: Randomly selected salt value 1, 8: Randomly selected salt value 2. Immediately following the header are zero or more frames. Each frame consists of a 16-byte header followed by a <page-size> bytes of page data. The header is broken into 4 big-endian 32-bit unsigned integer values, as follows: 0: Page number. 4: For commit records, the size of the database image in pages after the commit. For all other records, zero. 8: Checksum value 1. 12: Checksum value 2. READER ALGORITHM To read a page from the database (call it page number P), a reader first checks the WAL to see if it contains page P. If so, then the last valid instance of page P that is or is followed by a commit frame become the value read. If the WAL contains no copies of page P that are valid and which are or are followed by a commit frame, then page P is read from the database file. 
The reader algorithm in the previous paragraph works correctly, but because frames for page P can appear anywhere within the WAL, the reader has to scan the entire WAL looking for page P frames. If the WAL is large (multiple megabytes is typical) that scan can be slow, and read performance suffers. To overcome this problem, a separate data structure called the wal-index is maintained to expedite the search for frames of a particular page. WAL-INDEX FORMAT Conceptually, the wal-index is shared memory, though VFS implementations might choose to implement the wal-index using a mmapped file. Because ** the wal-index is shared memory, SQLite does not support journal_mode=WAL
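The header layouts described in the comment above can be illustrated with a small decoder. This is only a sketch of the 12-byte WAL header and 16-byte frame header as this check-in describes them; the `Demo*` types and `demo_*` functions are hypothetical names, not part of wal.c, and the frame-offset arithmetic is an assumption derived from "header followed by zero or more frames", since the body of `walFrameOffset()` is not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>

#define WAL_HDRSIZE        12   /* page size + salt 1 + salt 2 */
#define WAL_FRAME_HDRSIZE  16   /* pgno + db size + checksum 1 + checksum 2 */

/* Hypothetical in-memory views of the two on-disk headers. */
typedef struct DemoWalHdr { uint32_t pgsz, salt1, salt2; } DemoWalHdr;
typedef struct DemoFrameHdr { uint32_t pgno, nTruncate, cksum1, cksum2; } DemoFrameHdr;

/* Decode one big-endian 32-bit unsigned integer. */
static uint32_t demo_get4byte(const unsigned char *p){
  return ((uint32_t)p[0]<<24) | ((uint32_t)p[1]<<16)
       | ((uint32_t)p[2]<<8)  |  (uint32_t)p[3];
}

/* Offsets 0, 4, 8: page size, salt value 1, salt value 2. */
static void demo_parse_wal_hdr(const unsigned char *a, DemoWalHdr *h){
  h->pgsz  = demo_get4byte(&a[0]);
  h->salt1 = demo_get4byte(&a[4]);
  h->salt2 = demo_get4byte(&a[8]);
}

/* Offsets 0, 4, 8, 12: page number, database size in pages after the
** commit (zero for non-commit frames), checksum 1, checksum 2. */
static void demo_parse_frame_hdr(const unsigned char *a, DemoFrameHdr *f){
  f->pgno      = demo_get4byte(&a[0]);
  f->nTruncate = demo_get4byte(&a[4]);
  f->cksum1    = demo_get4byte(&a[8]);
  f->cksum2    = demo_get4byte(&a[12]);
}

/* A frame is a commit record when its database-size field is non-zero. */
static int demo_is_commit(const DemoFrameHdr *f){
  return f->nTruncate!=0;
}

/* ASSUMPTION: frames are 1-based and packed back to back after the
** WAL header, each one WAL_FRAME_HDRSIZE + pgsz bytes long. */
static int64_t demo_frame_offset(uint32_t iFrame, uint32_t pgsz){
  assert( iFrame>=1 );
  return WAL_HDRSIZE + (int64_t)(iFrame-1)*(WAL_FRAME_HDRSIZE + pgsz);
}
```

Under that assumption, the two checksum words of frame N sit at `demo_frame_offset(N, pgsz) + 8`, which matches the `sizeof(u32)*2` adjustment that `sqlite3WalSavepointUndo()` applies further down in this diff.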
︙			︙
86 87 88 89 90 91 92 93 94 95 96 97 98 99 100	The purpose of the wal-index is to answer this question quickly: Given a page number P, return the index of the last frame for page P in the WAL, or return NULL if there are no frames for page P in the WAL. The wal-index consists of a header region, followed by an one or more index blocks. ~~To b~~e ~~completed...~~. / #ifndef SQLITE_OMIT_WAL #include "wal.h" / Object declarations */	> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > \| > > > > > > > > > > > > > > > > > > > > > > > > > > >	86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177	The purpose of the wal-index is to answer this question quickly: Given a page number P, return the index of the last frame for page P in the WAL, or return NULL if there are no frames for page P in the WAL. The wal-index consists of a header region, followed by an one or more index blocks. The wal-index header contains the total number of frames within the WAL in the the mxFrame field. Each index block contains information on HASHTABLE_NPAGE frames. Each index block contains two sections, a mapping which is a database page number for each frame, and a hash table used to look up frames by page number. The mapping section is an array of HASHTABLE_NPAGE 32-bit page numbers. The first entry on the array is the page number for the first frame; the second entry is the page number for the second frame; and so forth. The last index block holds a total of (mxFrame%HASHTABLE_NPAGE) page numbers. All index blocks other than the last are completely full with HASHTABLE_NPAGE page numbers. 
All index blocks are the same size; the mapping section of the last index block merely contains unused entries if mxFrame is not an even multiple of HASHTABLE_NPAGE. Even without using the hash table, the last frame for page P can be found by scanning the mapping sections of each index block starting with the last index block and moving toward the first, and within each index block, starting at the end and moving toward the beginning. The first entry that equals P corresponds to the frame holding the content for that page. The hash table consists of HASHTABLE_NSLOT 16-bit unsigned integers. HASHTABLE_NSLOT = 2*HASHTABLE_NPAGE, and there is one entry in the hash table for each page number in the mapping section, so the hash table is never more than half full. The expected number of collisions prior to finding a match is 1. Each entry of the hash table is an 1-based index of an entry in the mapping section of the same index block. Let K be the 1-based index of the largest entry in the mapping section. (For index blocks other than the last, K will always be exactly HASHTABLE_NPAGE (4096) and for the last index block K will be (mxFrame%HASHTABLE_NPAGE).) Unused slots of the hash table contain a value greater than K. Note that no hash table slot ever contains a zero value. To look for page P in the hash table, first compute a hash iKey on P as follows: iKey = (P * 383) % HASHTABLE_NSLOT Then start scanning entries of the hash table, starting with iKey (wrapping around to the beginning when the end of the hash table is reached) until an unused hash slot is found. Let the first unused slot be at index iUnused. (iUnused might be less than iKey if there was wrap-around.) Because the hash table is never more than half full, the search is guaranteed to eventually hit an unused entry. Let iMax be the value between iKey and iUnused, closest to iUnused, where aHash[iMax]==P. 
If there is no iMax entry (if there exists no hash slot such that aHash[i]==p) then page P is not in the current index block. Otherwise the iMax-th mapping entry of the current index block corresponds to the last entry that references page P. A hash search begins with the last index block and moves toward the first index block, looking for entries corresponding to page P. On average, only two or three slots in each index block need to be examined in order to either find the last entry for page P, or to establish that no such entry exists in the block. Each index block holds over 4000 entries. So two or three index blocks are sufficient to cover a typical 10 megabyte WAL file, assuming 1K pages. 8 or 10 comparisons (on average) suffice to either locate a frame in the WAL or to establish that the frame does not exist in the WAL. This is much faster than scanning the entire 10MB WAL. Note that entries are added in order of increasing K. Hence, one reader might be using some value K0 and a second reader that started at a later time (after additional transactions were added to the WAL and to the wal-index) might be using a different value K1, where K1>K0. Both readers can use the same hash table and mapping section to get the correct result. There may be entries in the hash table with K>K0 but to the first reader, those entries will appear to be unused slots in the hash table and so the first reader will get an answer as if no values greater than K0 had ever been inserted into the hash table in the first place - which is what reader one wants. Meanwhile, the second reader using K1 will see additional values that were inserted later, which is exactly what reader two wants. When a rollback occurs, the value of K is decreased. This has the effect of automatically removing entries from the hash table. / #ifndef SQLITE_OMIT_WAL #include "wal.h" / Object declarations */
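The lookup described in the new comment can be sketched for a single index block as follows. This is an illustrative sketch, not the code in wal.c: the `DemoIndexBlock` type and the `demo_*` functions are hypothetical, initializing unused slots to 0xFFFF is an assumption (the comment only requires that unused slots hold a value greater than K), and a match is tested by comparing the mapping entry that a slot points at against P.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASHTABLE_NPAGE 4096
#define HASHTABLE_NSLOT (2*HASHTABLE_NPAGE)

/* Hypothetical layout of one index block: mapping section + hash table. */
typedef struct DemoIndexBlock {
  uint32_t aPgno[HASHTABLE_NPAGE];   /* aPgno[k-1] = page number of entry k */
  uint16_t aHash[HASHTABLE_NSLOT];   /* 1-based indexes into aPgno[] */
} DemoIndexBlock;

/* ASSUMPTION: 0xFFFF marks a never-used slot; it exceeds any legal K. */
static void demo_block_init(DemoIndexBlock *b){
  memset(b->aPgno, 0, sizeof(b->aPgno));
  memset(b->aHash, 0xFF, sizeof(b->aHash));
}

/* iKey = (P * 383) % HASHTABLE_NSLOT, as given in the comment. */
static uint32_t demo_hash(uint32_t P){
  return (P*383u) % HASHTABLE_NSLOT;
}

/* Append page P as entry K+1 and return the new K. Any slot holding a
** value greater than K counts as unused and may be overwritten. */
static uint32_t demo_block_append(DemoIndexBlock *b, uint32_t K, uint32_t P){
  uint32_t i = demo_hash(P);
  assert( K<HASHTABLE_NPAGE );
  while( b->aHash[i]<=K ) i = (i+1) % HASHTABLE_NSLOT;
  b->aPgno[K] = P;
  b->aHash[i] = (uint16_t)(K+1);
  return K+1;
}

/* Return the 1-based index of the last entry for page P among the first
** K entries, or 0 if P does not appear. The scan stops at the first slot
** whose value exceeds K; the match closest to that slot wins. */
static uint32_t demo_block_find(const DemoIndexBlock *b, uint32_t K, uint32_t P){
  uint32_t i = demo_hash(P);
  uint32_t iFound = 0;
  while( b->aHash[i]<=K ){
    uint32_t v = b->aHash[i];        /* never zero, per the comment */
    if( b->aPgno[v-1]==P ) iFound = v;
    i = (i+1) % HASHTABLE_NSLOT;
  }
  return iFound;
}
```

Passing a smaller K to `demo_block_find()` models both an older reader snapshot and a rollback: entries numbered above K simply look like unused slots, so no explicit deletion is needed.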
︙			︙
Old lines 108-122, new lines 185-199:

     ** Member variables iCheck1 and iCheck2 contain the checksum for the last
     ** frame written to the wal, or 2 and 3 respectively if the log
     ** is currently empty.
     */
     struct WalIndexHdr {
       u32 iChange;                    /* Counter incremented each transaction */
       u32 pgsz;                       /* Database page size in bytes */
    -  u32 iLastPg;                    /* Index of last valid frame in the WAL */
    +  u32 mxFrame;                    /* Index of last valid frame in the WAL */
       u32 nPage;                      /* Size of database in pages */
       u32 iCheck1;                    /* Checkpoint value 1 */
       u32 iCheck2;                    /* Checkpoint value 2 */
     };

     /* Size of serialized WalIndexHdr object. */
     #define WALINDEX_HDR_NFIELD (sizeof(WalIndexHdr) / sizeof(u32))
︙			︙
Old lines 616-643, new lines 693-720:

           rc = walIndexAppend(pWal, ++iFrame, pgno);
           if( rc!=SQLITE_OK ) break;

           /* If nTruncate is non-zero, this is a commit record. */
           if( nTruncate ){
             hdr.iCheck1 = aCksum[0];
             hdr.iCheck2 = aCksum[1];
    -        hdr.iLastPg = iFrame;
    +        hdr.mxFrame = iFrame;
             hdr.nPage = nTruncate;
             hdr.pgsz = nPgsz;
           }
         }

         sqlite3_free(aFrame);
       }else{
         hdr.iCheck1 = 2;
         hdr.iCheck2 = 3;
       }

     finished:
    -  if( rc==SQLITE_OK && hdr.iLastPg==0 ){
    +  if( rc==SQLITE_OK && hdr.mxFrame==0 ){
         rc = walIndexRemap(pWal, WALINDEX_MMAP_INCREMENT);
       }
       if( rc==SQLITE_OK ){
         walIndexWriteHdr(pWal, &hdr);
         memcpy(&pWal->hdr, &hdr, sizeof(hdr));
       }
       return rc;
︙			︙
Old lines 837-865, new lines 914-942:

       int i;                          /* Iterator variable */
       int nFinal;                     /* Number of unindexed entries */
       u8 *aTmp;                       /* Temp space used by merge-sort */
       int rc;                         /* Return code of walIndexMap() */
       u8 *aSpace;                     /* Surplus space on the end of the allocation */

       /* Make sure the wal-index is mapped into local memory */
    -  rc = walIndexMap(pWal, walMappingSize(pWal->hdr.iLastPg));
    +  rc = walIndexMap(pWal, walMappingSize(pWal->hdr.mxFrame));
       if( rc!=SQLITE_OK ){
         return rc;
       }

       /* This routine only runs while holding SQLITE_SHM_CHECKPOINT.
       ** No other thread is able to write to shared memory while this routine is
       ** running (or, indeed, while the WalIterator object exists). Hence,
       ** we can cast off the volatile qualifacation from shared memory
       */
       assert( pWal->lockState==SQLITE_SHM_CHECKPOINT );
       aData = (u32*)pWal->pWiData;

       /* Allocate space for the WalIterator object */
    -  iLast = pWal->hdr.iLastPg;
    +  iLast = pWal->hdr.mxFrame;
       nSegment = (iLast >> 8) + 1;
       nFinal = (iLast & 0x000000FF);
       nByte = sizeof(WalIterator) + (nSegment+1)*(sizeof(struct WalSegment)+256);
       p = (WalIterator *)sqlite3_malloc(nByte);
       if( !p ){
         return SQLITE_NOMEM;
       }
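The shift-and-mask arithmetic in the hunk above splits the last frame index into 256-entry segments. The following is a minimal worked sketch of just that arithmetic; `DemoSegCount` and `demo_segments()` are hypothetical names introduced for illustration, and nothing here reproduces the rest of `walIteratorInit()`.

```c
#include <stdint.h>

/* Hypothetical holder for the two values computed from iLast. */
typedef struct DemoSegCount {
  uint32_t nSegment;   /* (iLast >> 8) + 1 */
  uint32_t nFinal;     /* iLast & 0xFF */
} DemoSegCount;

/* Mirror the two expressions used when sizing the WalIterator. */
static DemoSegCount demo_segments(uint32_t iLast){
  DemoSegCount c;
  c.nSegment = (iLast >> 8) + 1;      /* number of 256-entry segments */
  c.nFinal   = (iLast & 0x000000FF);  /* remainder beyond whole segments */
  return c;
}
```

For example, with `iLast` equal to 1000 the quotient 1000/256 is 3 and the remainder is 232, so `nSegment` is 4 and `nFinal` is 232.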
︙			︙
Old lines 911-925, new lines 988-1002:

       int pgsz = pWal->hdr.pgsz;      /* Database page-size */
       WalIterator *pIter = 0;         /* Wal iterator context */
       u32 iDbpage = 0;                /* Next database page to write */
       u32 iFrame = 0;                 /* Wal frame containing data for iDbpage */

       /* Allocate the iterator */
       rc = walIteratorInit(pWal, &pIter);
    -  if( rc!=SQLITE_OK || pWal->hdr.iLastPg==0 ){
    +  if( rc!=SQLITE_OK || pWal->hdr.mxFrame==0 ){
         goto out;
       }

       if( pWal->hdr.pgsz!=nBuf ){
         rc = SQLITE_CORRUPT_BKPT;
         goto out;
       }
︙			︙
Old lines 945-959, new lines 1022-1036:

       if( rc!=SQLITE_OK ) goto out;

       /* Sync the database file. If successful, update the wal-index. */
       if( sync_flags ){
         rc = sqlite3OsSync(pWal->pDbFd, sync_flags);
         if( rc!=SQLITE_OK ) goto out;
       }
    -  pWal->hdr.iLastPg = 0;
    +  pWal->hdr.mxFrame = 0;
       pWal->hdr.iCheck1 = 2;
       pWal->hdr.iCheck2 = 3;
       walIndexWriteHdr(pWal, &pWal->hdr);

       /* TODO: If a crash occurs and the current log is copied into the
       ** database there is no problem. However, if a crash occurs while
       ** writing the next transaction into the start of the log, such that:
︙			︙
Old lines 1202-1216, new lines 1279-1293:

       Pgno pgno,                      /* Database page number to read data for */
       int *pInWal,                    /* OUT: True if data is read from WAL */
       int nOut,                       /* Size of buffer pOut in bytes */
       u8 *pOut                        /* Buffer to write page data to */
     ){
       int rc;                         /* Return code */
       u32 iRead = 0;                  /* If !=0, WAL frame to return data from */
    -  u32 iLast = pWal->hdr.iLastPg;  /* Last page in WAL for this reader */
    +  u32 iLast = pWal->hdr.mxFrame;  /* Last page in WAL for this reader */
       int iHash;                      /* Used to loop through N hash tables */

       /* If the "last page" field of the wal-index header snapshot is 0, then
       ** no data will be read from the wal under any circumstances. Return early
       ** in this case to avoid the walIndexMap/Unmap overhead.
       */
       if( iLast==0 ){
︙			︙
Old lines 1378-1422, new lines 1455-1499:

     ** Otherwise, if the callback function does not return an error, this
     ** function returns SQLITE_OK.
     */
     int sqlite3WalUndo(Wal *pWal, int (*xUndo)(void *, Pgno), void *pUndoCtx){
       int rc = SQLITE_OK;
       if( pWal->lockState==SQLITE_SHM_WRITE ){
         int unused;
    -    Pgno iMax = pWal->hdr.iLastPg;
    +    Pgno iMax = pWal->hdr.mxFrame;
         Pgno iFrame;

         assert( pWal->pWiData==0 );
         rc = walIndexReadHdr(pWal, &unused);
    -    for(iFrame=pWal->hdr.iLastPg+1; rc==SQLITE_OK && iFrame<=iMax; iFrame++){
    +    for(iFrame=pWal->hdr.mxFrame+1; rc==SQLITE_OK && iFrame<=iMax; iFrame++){
           assert( pWal->lockState==SQLITE_SHM_WRITE );
           rc = xUndo(pUndoCtx, pWal->pWiData[walIndexEntry(iFrame)]);
         }
         walIndexUnmap(pWal);
       }
       return rc;
     }

     /* Return an integer that records the current (uncommitted) write
     ** position in the WAL
     */
     u32 sqlite3WalSavepoint(Wal *pWal){
       assert( pWal->lockState==SQLITE_SHM_WRITE );
    -  return pWal->hdr.iLastPg;
    +  return pWal->hdr.mxFrame;
     }

     /* Move the write position of the WAL back to iFrame. Called in
     ** response to a ROLLBACK TO command.
     */
     int sqlite3WalSavepointUndo(Wal *pWal, u32 iFrame){
       int rc = SQLITE_OK;
       u8 aCksum[8];
       assert( pWal->lockState==SQLITE_SHM_WRITE );

    -  pWal->hdr.iLastPg = iFrame;
    +  pWal->hdr.mxFrame = iFrame;
       if( iFrame>0 ){
         i64 iOffset = walFrameOffset(iFrame, pWal->hdr.pgsz) + sizeof(u32)*2;
         rc = sqlite3OsRead(pWal->pWalFd, aCksum, sizeof(aCksum), iOffset);
         pWal->hdr.iCheck1 = sqlite3Get4byte(&aCksum[0]);
         pWal->hdr.iCheck2 = sqlite3Get4byte(&aCksum[4]);
       }
︙			︙
Old lines 1449-1463, new lines 1526-1540:

       assert( pWal->pWiData==0 );

       /* If this is the first frame written into the log, write the WAL
       ** header to the start of the WAL file. See comments at the top of
       ** this source file for a description of the WAL header format.
       */
       assert( WAL_FRAME_HDRSIZE>=WAL_HDRSIZE );
    -  iFrame = pWal->hdr.iLastPg;
    +  iFrame = pWal->hdr.mxFrame;
       if( iFrame==0 ){
         sqlite3Put4byte(aFrame, nPgsz);
         sqlite3_randomness(8, &aFrame[4]);
         pWal->hdr.iCheck1 = sqlite3Get4byte(&aFrame[4]);
         pWal->hdr.iCheck2 = sqlite3Get4byte(&aFrame[8]);
         rc = sqlite3OsWrite(pWal->pWalFd, aFrame, WAL_HDRSIZE, 0);
         if( rc!=SQLITE_OK ){
︙			︙
Old lines 1521-1549, new lines 1598-1626:

       assert( pWal->pWiData==0 );

       /* Append data to the wal-index. It is not necessary to lock the
       ** wal-index to do this as the SQLITE_SHM_WRITE lock held on the wal-index
       ** guarantees that there are no other writers, and no data that may
       ** be in use by existing readers is being overwritten.
       */
    -  iFrame = pWal->hdr.iLastPg;
    +  iFrame = pWal->hdr.mxFrame;
       for(p=pList; p && rc==SQLITE_OK; p=p->pDirty){
         iFrame++;
         rc = walIndexAppend(pWal, iFrame, p->pgno);
       }
       while( nLast>0 && rc==SQLITE_OK ){
         iFrame++;
         nLast--;
         rc = walIndexAppend(pWal, iFrame, pLast->pgno);
       }

       if( rc==SQLITE_OK ){
         /* Update the private copy of the header. */
         pWal->hdr.pgsz = nPgsz;
    -    pWal->hdr.iLastPg = iFrame;
    +    pWal->hdr.mxFrame = iFrame;
         if( isCommit ){
           pWal->hdr.iChange++;
           pWal->hdr.nPage = nTruncate;
         }
         pWal->hdr.iCheck1 = aCksum[0];
         pWal->hdr.iCheck2 = aCksum[1];
︙			︙