Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | If a readonly_shm connection cannot map the *-shm file because no other process is holding the DMS lock, have it read from the database file only, ignoring any content in the wal file. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | readonly-wal-recovery |
Files: | files | file ages | folders |
SHA3-256: |
ce5d13c2de69b73378637d4f7e109714 |
User & Date: | dan 2017-11-01 20:59:28.295 |
References
2017-11-13
| ||
05:51 | Remove some branches in walTryBeginRead() that were added by check-in [ce5d13c2de] but became unreachable with the addition of logic in check-in [18b26843] that enabled read-only clients to parse the WAL file into a heap-memory WAL-index, thus guaranteeing that the WAL-index header is always available. (check-in: 9c6b38b9a9 user: drh tags: readonly-wal-recovery) | |
Context
2017-11-02
| ||
11:12 | Avoid locking shm-lock WAL_READ_LOCK(0) during recovery. Doing this allows recovery to proceed while a readonly_shm connection in unlocked mode has an ongoing read transaction. (check-in: 5190d84a29 user: dan tags: readonly-wal-recovery) | |
2017-11-01
| ||
20:59 | If a readonly_shm connection cannot map the *-shm file because no other process is holding the DMS lock, have it read from the database file only, ignoring any content in the wal file. (check-in: ce5d13c2de user: dan tags: readonly-wal-recovery) | |
07:06 | Merge latest trunk changes into this branch. (check-in: 985bfc9929 user: dan tags: readonly-wal-recovery) | |
Changes
Changes to src/os_unix.c.
︙ | ︙ | |||
4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 | unixInodeInfo *pInode; /* unixInodeInfo that owns this SHM node */ sqlite3_mutex *mutex; /* Mutex to access this object */ char *zFilename; /* Name of the mmapped file */ int h; /* Open file descriptor */ int szRegion; /* Size of shared-memory regions */ u16 nRegion; /* Size of array apRegion */ u8 isReadonly; /* True if read-only */ char **apRegion; /* Array of mapped shared-memory regions */ int nRef; /* Number of unixShm objects pointing to this */ unixShm *pFirst; /* All unixShm objects pointing to this */ #ifdef SQLITE_DEBUG u8 exclMask; /* Mask of exclusive locks held */ u8 sharedMask; /* Mask of shared locks held */ u8 nextShmId; /* Next available unixShm.id value */ | > | 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 | unixInodeInfo *pInode; /* unixInodeInfo that owns this SHM node */ sqlite3_mutex *mutex; /* Mutex to access this object */ char *zFilename; /* Name of the mmapped file */ int h; /* Open file descriptor */ int szRegion; /* Size of shared-memory regions */ u16 nRegion; /* Size of array apRegion */ u8 isReadonly; /* True if read-only */ u8 isUnlocked; /* True if no DMS lock held */ char **apRegion; /* Array of mapped shared-memory regions */ int nRef; /* Number of unixShm objects pointing to this */ unixShm *pFirst; /* All unixShm objects pointing to this */ #ifdef SQLITE_DEBUG u8 exclMask; /* Mask of exclusive locks held */ u8 sharedMask; /* Mask of shared locks held */ u8 nextShmId; /* Next available unixShm.id value */ |
︙ | ︙ | |||
4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 | robust_close(pFd, p->h, __LINE__); p->h = -1; } p->pInode->pShmNode = 0; sqlite3_free(p); } } /* ** Open a shared-memory area associated with open database file pDbFd. ** This particular implementation uses mmapped files. ** ** The file used to implement shared-memory is in the same directory ** as the open database file and has the same name as the open database | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 | robust_close(pFd, p->h, __LINE__); p->h = -1; } p->pInode->pShmNode = 0; sqlite3_free(p); } } /* ** The DMS lock has not yet been taken on shm file pShmNode. Attempt to ** take it now. Return SQLITE_OK if successful, or an SQLite error ** code otherwise. ** ** If the DMS cannot be locked because this is a readonly_shm=1 ** connection and no other process already holds a lock, return ** SQLITE_READONLY_CANTLOCK and set pShmNode->isUnlocked=1. */ static int unixLockSharedMemory(unixFile *pDbFd, unixShmNode *pShmNode){ struct flock lock; int rc = SQLITE_OK; /* Use F_GETLK to determine the locks other processes are holding ** on the DMS byte. If it indicates that another process is holding ** a SHARED lock, then this process may also take a SHARED lock ** and proceed with opening the *-shm file. ** ** Or, if no other process is holding any lock, then this process ** is the first to open it. In this case take an EXCLUSIVE lock on the ** DMS byte and truncate the *-shm file to zero bytes in size. Then ** downgrade to a SHARED lock on the DMS byte. ** ** If another process is holding an EXCLUSIVE lock on the DMS byte, ** return SQLITE_BUSY to the caller (it will try again). An earlier ** version of this code attempted the SHARED lock at this point. But ** this introduced a subtle race condition: if the process holding ** EXCLUSIVE failed just before truncating the *-shm file, then this ** process might open and use the *-shm file without truncating it. ** And if the *-shm file has been corrupted by a power failure or ** system crash, the database itself may also become corrupt. */ lock.l_whence = SEEK_SET; lock.l_start = UNIX_SHM_DMS; lock.l_len = 1; lock.l_type = F_WRLCK; if( osFcntl(pShmNode->h, F_GETLK, &lock)!=0 ) { rc = SQLITE_IOERR_LOCK; }else if( lock.l_type==F_UNLCK ){ if( pShmNode->isReadonly ){ pShmNode->isUnlocked = 1; rc = SQLITE_READONLY_CANTLOCK; }else{ rc = unixShmSystemLock(pDbFd, F_WRLCK, UNIX_SHM_DMS, 1); if( rc==SQLITE_OK && robust_ftruncate(pShmNode->h, 0) ){ rc = unixLogError(SQLITE_IOERR_SHMOPEN,"ftruncate",pShmNode->zFilename); } } }else if( lock.l_type==F_WRLCK ){ rc = SQLITE_BUSY; } if( rc==SQLITE_OK ){ assert( lock.l_type==F_UNLCK || lock.l_type==F_RDLCK ); rc = unixShmSystemLock(pDbFd, F_RDLCK, UNIX_SHM_DMS, 1); } return rc; } /* ** Open a shared-memory area associated with open database file pDbFd. ** This particular implementation uses mmapped files. ** ** The file used to implement shared-memory is in the same directory ** as the open database file and has the same name as the open database |
︙ | ︙ | |||
4304 4305 4306 4307 4308 4309 4310 | ** that no other processes are able to read or write the database. In ** that case, we do not really need shared memory. No shared memory ** file is created. The shared memory will be simulated with heap memory. */ static int unixOpenSharedMemory(unixFile *pDbFd){ struct unixShm *p = 0; /* The connection to be opened */ struct unixShmNode *pShmNode; /* The underlying mmapped file */ | | | 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 | ** that no other processes are able to read or write the database. In ** that case, we do not really need shared memory. No shared memory ** file is created. The shared memory will be simulated with heap memory. */ static int unixOpenSharedMemory(unixFile *pDbFd){ struct unixShm *p = 0; /* The connection to be opened */ struct unixShmNode *pShmNode; /* The underlying mmapped file */ int rc = SQLITE_OK; /* Result code */ unixInodeInfo *pInode; /* The inode of fd */ char *zShmFilename; /* Name of the file used for SHM */ int nShmFilename; /* Size of the SHM filename in bytes */ /* Allocate space for the new unixShm object. */ p = sqlite3_malloc64( sizeof(*p) ); if( p==0 ) return SQLITE_NOMEM_BKPT; |
︙ | ︙ | |||
4368 4369 4370 4371 4372 4373 4374 | if( pShmNode->mutex==0 ){ rc = SQLITE_NOMEM_BKPT; goto shm_open_err; } } if( pInode->bProcessLock==0 ){ | < | < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < | < < < < | | 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 | if( pShmNode->mutex==0 ){ rc = SQLITE_NOMEM_BKPT; goto shm_open_err; } } if( pInode->bProcessLock==0 ){ int openFlags = O_RDWR | O_CREAT; if( sqlite3_uri_boolean(pDbFd->zPath, "readonly_shm", 0) ){ openFlags = O_RDONLY; pShmNode->isReadonly = 1; } pShmNode->h = robust_open(zShmFilename, openFlags, (sStat.st_mode&0777)); if( pShmNode->h<0 ){ rc = unixLogError(SQLITE_CANTOPEN_BKPT, "open", zShmFilename); goto shm_open_err; } /* If this process is running as root, make sure that the SHM file ** is owned by the same user that owns the original database. Otherwise, ** the original owner will not be able to connect. */ robustFchown(pShmNode->h, sStat.st_uid, sStat.st_gid); rc = unixLockSharedMemory(pDbFd, pShmNode); if( rc!=SQLITE_OK && rc!=SQLITE_READONLY_CANTLOCK ) goto shm_open_err; } } /* Make the new connection a child of the unixShmNode */ p->pShmNode = pShmNode; #ifdef SQLITE_DEBUG p->id = pShmNode->nextShmId++; |
︙ | ︙ | |||
4452 4453 4454 4455 4456 4457 4458 | ** at pShmNode->pFirst. This must be done while holding the pShmNode->mutex ** mutex. */ sqlite3_mutex_enter(pShmNode->mutex); p->pNext = pShmNode->pFirst; pShmNode->pFirst = p; sqlite3_mutex_leave(pShmNode->mutex); | | | 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 | ** at pShmNode->pFirst. This must be done while holding the pShmNode->mutex ** mutex. */ sqlite3_mutex_enter(pShmNode->mutex); p->pNext = pShmNode->pFirst; pShmNode->pFirst = p; sqlite3_mutex_leave(pShmNode->mutex); return rc; /* Jump here on any error */ shm_open_err: unixShmPurge(pDbFd); /* This call frees pShmNode if required */ sqlite3_free(p); unixLeaveMutex(); return rc; |
︙ | ︙ | |||
4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 | rc = unixOpenSharedMemory(pDbFd); if( rc!=SQLITE_OK ) return rc; } p = pDbFd->pShm; pShmNode = p->pShmNode; sqlite3_mutex_enter(pShmNode->mutex); assert( szRegion==pShmNode->szRegion || pShmNode->nRegion==0 ); assert( pShmNode->pInode==pDbFd->pInode ); assert( pShmNode->h>=0 || pDbFd->pInode->bProcessLock==1 ); assert( pShmNode->h<0 || pDbFd->pInode->bProcessLock==0 ); /* Minimum number of regions required to be mapped. */ nReqRegion = ((iRegion+nShmPerMap) / nShmPerMap) * nShmPerMap; | > > > > > | 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 | rc = unixOpenSharedMemory(pDbFd); if( rc!=SQLITE_OK ) return rc; } p = pDbFd->pShm; pShmNode = p->pShmNode; sqlite3_mutex_enter(pShmNode->mutex); if( pShmNode->isUnlocked ){ rc = unixLockSharedMemory(pDbFd, pShmNode); if( rc!=SQLITE_OK ) goto shmpage_out; pShmNode->isUnlocked = 0; } assert( szRegion==pShmNode->szRegion || pShmNode->nRegion==0 ); assert( pShmNode->pInode==pDbFd->pInode ); assert( pShmNode->h>=0 || pDbFd->pInode->bProcessLock==1 ); assert( pShmNode->h<0 || pDbFd->pInode->bProcessLock==0 ); /* Minimum number of regions required to be mapped. */ nReqRegion = ((iRegion+nShmPerMap) / nShmPerMap) * nShmPerMap; |
︙ | ︙ |
Changes to src/wal.c.
︙ | ︙ | |||
571 572 573 574 575 576 577 | if( pWal->exclusiveMode==WAL_HEAPMEMORY_MODE ){ pWal->apWiData[iPage] = (u32 volatile *)sqlite3MallocZero(WALINDEX_PGSZ); if( !pWal->apWiData[iPage] ) rc = SQLITE_NOMEM_BKPT; }else{ rc = sqlite3OsShmMap(pWal->pDbFd, iPage, WALINDEX_PGSZ, pWal->writeLock, (void volatile **)&pWal->apWiData[iPage] ); | | > | > | 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 | if( pWal->exclusiveMode==WAL_HEAPMEMORY_MODE ){ pWal->apWiData[iPage] = (u32 volatile *)sqlite3MallocZero(WALINDEX_PGSZ); if( !pWal->apWiData[iPage] ) rc = SQLITE_NOMEM_BKPT; }else{ rc = sqlite3OsShmMap(pWal->pDbFd, iPage, WALINDEX_PGSZ, pWal->writeLock, (void volatile **)&pWal->apWiData[iPage] ); if( (rc&0xff)==SQLITE_READONLY ){ pWal->readOnly |= WAL_SHM_RDONLY; if( rc==SQLITE_READONLY ){ rc = SQLITE_OK; } } } } *ppPage = pWal->apWiData[iPage]; assert( iPage==0 || *ppPage || rc!=SQLITE_OK ); return rc; |
︙ | ︙ | |||
2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 | /* Ensure that page 0 of the wal-index (the page that contains the ** wal-index header) is mapped. Return early if an error occurs here. */ assert( pChanged ); rc = walIndexPage(pWal, 0, &page0); if( rc!=SQLITE_OK ){ return rc; }; assert( page0 || pWal->writeLock==0 ); /* If the first page of the wal-index has been mapped, try to read the ** wal-index header immediately, without holding any lock. This usually ** works, but may fail if the wal-index header is corrupt or currently | > > > > > > > > | 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 | /* Ensure that page 0 of the wal-index (the page that contains the ** wal-index header) is mapped. Return early if an error occurs here. */ assert( pChanged ); rc = walIndexPage(pWal, 0, &page0); if( rc!=SQLITE_OK ){ if( rc==SQLITE_READONLY_CANTLOCK #ifdef SQLITE_ENABLE_SNAPSHOT && pWal->pSnapshot==0 #endif ){ memset(&pWal->hdr, 0, sizeof(WalIndexHdr)); rc = SQLITE_OK; } return rc; }; assert( page0 || pWal->writeLock==0 ); /* If the first page of the wal-index has been mapped, try to read the ** wal-index header immediately, without holding any lock. This usually ** works, but may fail if the wal-index header is corrupt or currently |
︙ | ︙ | |||
2255 2256 2257 2258 2259 2260 2261 | } } if( rc!=SQLITE_OK ){ return rc; } } | > > | | > | > | 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 | } } if( rc!=SQLITE_OK ){ return rc; } } assert( pWal->nWiData>0 ); assert( pWal->apWiData[0] || (pWal->readOnly & WAL_SHM_RDONLY) ); pInfo = pWal->apWiData[0] ? walCkptInfo(pWal) : 0; if( !useWal && (pInfo==0 || pInfo->nBackfill==pWal->hdr.mxFrame) #ifdef SQLITE_ENABLE_SNAPSHOT && (pWal->pSnapshot==0 || pWal->hdr.mxFrame==0 || 0==memcmp(&pWal->hdr, pWal->pSnapshot, sizeof(WalIndexHdr))) #endif ){ /* The WAL has been completely backfilled (or it is empty). ** and can be safely ignored. */ rc = walLockShared(pWal, WAL_READ_LOCK(0)); walShmBarrier(pWal); if( rc==SQLITE_OK ){ if( pInfo && memcmp((void *)walIndexHdr(pWal), &pWal->hdr, sizeof(WalIndexHdr)) ){ /* It is not safe to allow the reader to continue here if frames ** may have been appended to the log before READ_LOCK(0) was obtained. ** When holding READ_LOCK(0), the reader ignores the entire log file, ** which implies that the database file contains a trustworthy ** snapshot. Since holding READ_LOCK(0) prevents a checkpoint from ** happening, this is usually correct. ** |
︙ | ︙ |
Added test/walro2.test.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 | # 2011 May 09 # # The author disclaims copyright to this source code. In place of # a legal notice, here is a blessing: # # May you do good and not evil. # May you find forgiveness for yourself and forgive others. # May you share freely, never taking more than you give. # #*********************************************************************** # # This file contains tests for using WAL databases in read-only mode. # set testdir [file dirname $argv0] source $testdir/tester.tcl source $testdir/lock_common.tcl set ::testprefix walro # These tests are only going to work on unix. # if {$::tcl_platform(platform) != "unix"} { finish_test return } # And only if the build is WAL-capable. # ifcapable !wal { finish_test return } do_multiclient_test tn { # Close all connections and delete the database. # code1 { db close } code2 { db2 close } code3 { db3 close } forcedelete test.db # Do not run tests with the connections in the same process. # if {$tn==2} continue foreach c {code1 code2 code3} { $c { sqlite3_shutdown sqlite3_config_uri 1 } } do_test 1.1 { code2 { sqlite3 db2 test.db } sql2 { CREATE TABLE t1(x, y); PRAGMA journal_mode = WAL; INSERT INTO t1 VALUES('a', 'b'); INSERT INTO t1 VALUES('c', 'd'); } file exists test.db-shm } {1} do_test 1.2 { forcecopy test.db test.db2 forcecopy test.db-wal test.db2-wal forcecopy test.db-shm test.db2-shm code1 { sqlite3 db file:test.db2?readonly_shm=1 } sql1 { SELECT * FROM t1 } } {} do_test 1.3.1 { code3 { sqlite3 db3 test.db2 } sql3 { SELECT * FROM t1 } } {a b c d} do_test 1.3.2 { sql1 { SELECT * FROM t1 } } {a b c d} } finish_test |