Overview
Comment: | Have worker clients and writers that discard an old in-memory tree update a read-lock slot before concluding their work or write transaction. This is required for read-only clients - which cannot set the value of their own read-lock slot. |
---|---|
SHA1: | 798d9e23be2109448da99844b21bf328 |
User & Date: | dan 2013-02-09 16:55:14.330 |
Context
2013-02-11
15:17 | Remove two MySQL-isms: Strings in double-quotes and identifiers quoted by grave accents. check-in: eec75c074c user: drh tags: trunk | |
2013-02-09
19:42 | Add definitions for the extra locks required for read-only clients to detect whether or not a database is live. check-in: 69f33cfa12 user: dan tags: read-only-clients | |
16:55 | Have worker clients and writers that discard an old in-memory tree update a read-lock slot before concluding their work or write transaction. This is required for read-only clients - which cannot set the value of their own read-lock slot. check-in: 798d9e23be user: dan tags: trunk | |
2013-02-08
15:22 | Avoid extending the database file when truncating it to the minimum number of blocks required during system shutdown. check-in: 9afc42d70d user: dan tags: trunk | |
Changes
Changes to src/lsmInt.h.
︙ | ︙ | |||
Lines 327-341 after the change (1 line added, marked `+`):

      int iReader;                    /* Read lock held (-1 == unlocked) */
      MultiCursor *pCsr;              /* List of all open cursors */
      LogWriter *pLogWriter;          /* Context for writing to the log file */
      int nTransOpen;                 /* Number of opened write transactions */
      int nTransAlloc;                /* Allocated size of aTrans[] array */
      TransMark *aTrans;              /* Array of marks for transaction rollback */
      IntArray rollback;              /* List of tree-nodes to roll back */
    + int bDiscardOld;                /* True if lsmTreeDiscardOld() was called */

      /* Worker context */
      Snapshot *pWorker;              /* Worker snapshot (or NULL) */
      Freelist *pFreelist;            /* See sortedNewToplevel() */
      int bUseFreelist;               /* True to use pFreelist */
      int bIncrMerge;                 /* True if currently doing a merge */
︙ | ︙ |
Changes to src/lsm_shared.c.
︙ | ︙ | |||
Lines 784-797 after the change (1 line removed after line 790; the removed line is not shown):

      ** but then not used. This function is used to push the block back onto
      ** the freelist. Refreeing a block is different from freeing is, as a refreed
      ** block may be reused immediately. Whereas a freed block can not be reused
      ** until (at least) after the next checkpoint.
      */
      int lsmBlockRefree(lsm_db *pDb, int iBlk){
        int rc = LSM_OK;                /* Return code */

      #ifdef LSM_LOG_FREELIST
        lsmLogMessage(pDb, LSM_OK, "lsmBlockRefree(): Refree block %d", iBlk);
      #endif

        rc = freelistAppend(pDb, iBlk, 0);
        return rc;
︙ | ︙ | |||
Lines 897-937 after the change. Changes begin at line 904, inside lsmFinishWork(); the previous body is not shown:

      /*
      ** Argument bFlush is true if the contents of the in-memory tree has just
      ** been flushed to disk. The significance of this is that once the snapshot
      ** created to hold the updated state of the database is synced to disk, log
      ** file space can be recycled.
      */
      void lsmFinishWork(lsm_db *pDb, int bFlush, int *pRc){
        int rc = *pRc;
        assert( rc!=0 || pDb->pWorker );
        if( pDb->pWorker ){
          /* If no error has occurred, serialize the worker snapshot and write
          ** it to shared memory. */
          if( rc==LSM_OK ){
            rc = lsmSaveWorker(pDb, bFlush);
          }

          /* Assuming no error has occurred, update a read lock slot with the
          ** new snapshot id (see comments above function lsmSetReadLock()). */
          if( rc==LSM_OK ){
            if( pDb->iReader<0 ){
              rc = lsmTreeLoadHeader(pDb, 0);
            }
            if( rc==LSM_OK ){
              rc = lsmSetReadLock(pDb, pDb->pWorker->iId, pDb->treehdr.iUsedShmid);
            }
          }

          /* Free the snapshot object. */
          lsmFreeSnapshot(pDb->pEnv, pDb->pWorker);
          pDb->pWorker = 0;
        }

        lsmShmLock(pDb, LSM_LOCK_WORKER, LSM_LOCK_UNLOCK, 0);
        *pRc = rc;
      }

      /*
      ** Called when recovery is finished.
      */
      int lsmFinishRecovery(lsm_db *pDb){
        lsmTreeEndTransaction(pDb, 1);
︙ | ︙ | |||
Lines 1092-1106 after the change (1 line added, marked `+`):

      ** Open a write transaction.
      */
      int lsmBeginWriteTrans(lsm_db *pDb){
        int rc;                         /* Return code */
        ShmHeader *pShm = pDb->pShmhdr; /* Shared memory header */

        assert( pDb->nTransOpen==0 );
    +   assert( pDb->bDiscardOld==0 );

        /* If there is no read-transaction open, open one now. */
        rc = lsmBeginReadTrans(pDb);

        /* Attempt to take the WRITER lock */
        if( rc==LSM_OK ){
          rc = lsmShmLock(pDb, LSM_LOCK_WRITER, LSM_LOCK_EXCL, 0);
︙ | ︙ | |||
Lines 1127-1141 after the change (1 line added, marked `+`):

        ** WRITER lock and return an error code.  */
        if( rc==LSM_OK ){
          TreeHeader *p = &pDb->treehdr;
          pShm->bWriter = 1;
          p->root.iTransId++;
          if( lsmTreeHasOld(pDb) && p->iOldLog==pDb->pClient->iLogOff ){
            lsmTreeDiscardOld(pDb);
    +       pDb->bDiscardOld = 1;
          }
        }else{
          lsmShmLock(pDb, LSM_LOCK_WRITER, LSM_LOCK_UNLOCK, 0);
          if( pDb->pCsr==0 ) lsmFinishReadTrans(pDb);
        }
        return rc;
      }
︙ | ︙ | |||
Lines 1160-1183 after the change. Changes begin at line 1167; the previous version of those lines is not shown:

          lsmLogEnd(pDb, bCommit);
          if( rc==LSM_OK && bCommit && lsmTreeSize(pDb)>pDb->nTreeLimit ){
            bFlush = 1;
            lsmTreeMakeOld(pDb);
          }
          lsmTreeEndTransaction(pDb, bCommit);

          if( rc==LSM_OK ){
            if( bFlush && pDb->bAutowork ){
              rc = lsmSortedAutoWork(pDb, 1);
            }else if( bCommit && pDb->bDiscardOld ){
              rc = lsmSetReadLock(pDb, pDb->pClient->iId, pDb->treehdr.iUsedShmid);
            }
          }
          pDb->bDiscardOld = 0;
          lsmShmLock(pDb, LSM_LOCK_WRITER, LSM_LOCK_UNLOCK, 0);

          if( bFlush && pDb->bAutowork==0 && pDb->xWork ){
            pDb->xWork(pDb, pDb->pWorkCtx);
          }
        }
        return rc;
      }
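The lifecycle of the `bDiscardOld` flag across a write transaction may be easier to follow in isolation. The following is a minimal sketch in C, not the LSM code itself: the `Conn` struct, the counters `nSlotUpdate` and `nAutoWork`, and the functions `begin_write()` and `finish_write()` are hypothetical stand-ins for `lsm_db`, `lsmSetReadLock()`, `lsmSortedAutoWork()`, `lsmBeginWriteTrans()` and `lsmFinishWriteTrans()`.

```c
#include <assert.h>

/* Hypothetical, stripped-down model of a database connection. */
typedef struct {
  int bDiscardOld;   /* Mirrors the new lsm_db.bDiscardOld field */
  int bAutowork;     /* Mirrors lsm_db.bAutowork */
  int nSlotUpdate;   /* Hypothetical: counts stand-in lsmSetReadLock() calls */
  int nAutoWork;     /* Hypothetical: counts stand-in lsmSortedAutoWork() calls */
} Conn;

/* Models the relevant part of lsmBeginWriteTrans(). Argument
** bHasDiscardableOld stands in for the lsmTreeHasOld()/iOldLog test. */
static void begin_write(Conn *p, int bHasDiscardableOld){
  assert( p->bDiscardOld==0 );      /* Flag never leaks between transactions */
  if( bHasDiscardableOld ){
    p->bDiscardOld = 1;             /* The old tree was discarded */
  }
}

/* Models the relevant part of lsmFinishWriteTrans(). */
static void finish_write(Conn *p, int bCommit, int bFlush){
  if( bFlush && p->bAutowork ){
    p->nAutoWork++;                 /* Work path taken instead */
  }else if( bCommit && p->bDiscardOld ){
    p->nSlotUpdate++;               /* Publish a read-lock slot */
  }
  p->bDiscardOld = 0;               /* Always cleared, commit or rollback */
}
```

In this model a committed transaction that discarded the old tree increments `nSlotUpdate` once, a rollback leaves it unchanged, and a flush with auto-work enabled takes the work path instead, which in the check-in presumably ends in `lsmFinishWork()` and updates a slot there.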
︙ | ︙ | |||
Lines 1193-1253 after the change (the comment and function lsmSetReadLock() are new, marked `+`):

      static int slotIsUsable(ShmReader *p, i64 iLsm, u32 iShmMin, u32 iShmMax){
        return(
            p->iLsmId && p->iLsmId<=iLsm
            && shm_sequence_ge(iShmMax, p->iTreeId)
            && shm_sequence_ge(p->iTreeId, iShmMin)
        );
      }

    + /*
    + ** Attempt to populate one of the read-lock slots to contain lock values
    + ** iLsm/iShm. Or, if such a slot exists already, this function is a no-op.
    + **
    + ** It is not an error if no slot can be populated because the write-lock
    + ** cannot be obtained. If any other error occurs, return an LSM error code.
    + ** Otherwise, LSM_OK.
    + **
    + ** This function is called at various points to try to ensure that there
    + ** always exists at least one read-lock slot that can be used by a read-only
    + ** client. And so that, in the usual case, there is an "exact match" available
    + ** whenever a read transaction is opened by any client. At present this
    + ** function is called when:
    + **
    + **    * A write transaction that called lsmTreeDiscardOld() is committed, and
    + **    * Whenever the working snapshot is updated (i.e. lsmFinishWork()).
    + */
    + int lsmSetReadLock(lsm_db *db, i64 iLsm, u32 iShm){
    +   int rc = LSM_OK;
    +   ShmHeader *pShm = db->pShmhdr;
    +   int i;
    +
    +   /* Check if there is already a slot containing the required values. */
    +   for(i=0; i<LSM_LOCK_NREADER; i++){
    +     ShmReader *p = &pShm->aReader[i];
    +     if( p->iLsmId==iLsm && p->iTreeId==iShm ) return LSM_OK;
    +   }
    +
    +   /* Iterate through all read-lock slots, attempting to take a write-lock
    +   ** on each of them. If a write-lock succeeds, populate the locked slot
    +   ** with the required values and break out of the loop.  */
    +   for(i=0; rc==LSM_OK && i<LSM_LOCK_NREADER; i++){
    +     rc = lsmShmLock(db, LSM_LOCK_READER(i), LSM_LOCK_EXCL, 0);
    +     if( rc==LSM_BUSY ){
    +       rc = LSM_OK;
    +     }else{
    +       ShmReader *p = &pShm->aReader[i];
    +       p->iLsmId = iLsm;
    +       p->iTreeId = iShm;
    +       lsmShmLock(db, LSM_LOCK_READER(i), LSM_LOCK_UNLOCK, 0);
    +       break;
    +     }
    +   }
    +
    +   return rc;
    + }
    +
      /*
      ** Obtain a read-lock on database version identified by the combination
      ** of snapshot iLsm and tree iTree. Return LSM_OK if successful, or
      ** an LSM error code otherwise.
      */
      int lsmReadlock(lsm_db *db, i64 iLsm, u32 iShmMin, u32 iShmMax){
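The two-pass slot search performed by `lsmSetReadLock()` can be modelled without shared memory or file locks. The sketch below is a simplified, single-process illustration under stated assumptions: `Slot`, `NREADER`, the `bLocked` field and `set_read_lock()` are hypothetical names, `bLocked` stands in for an exclusive lock attempt that would return `LSM_BUSY`, and the function returns a slot index (or -1) purely so the outcome can be inspected, whereas the real function returns `LSM_OK` in all of these cases.

```c
#include <assert.h>
#include <stdint.h>

#define NREADER 6   /* Hypothetical slot count; stands in for LSM_LOCK_NREADER */

/* Hypothetical model of one read-lock slot. */
typedef struct {
  int64_t iLsmId;    /* Snapshot id (0 means the slot is empty) */
  uint32_t iTreeId;  /* Tree (shared-memory) id */
  int bLocked;       /* Non-zero models a slot another client has locked */
} Slot;

/* Return the index of the slot holding (iLsm, iShm) after the call, or -1
** if no matching slot exists and every slot was busy. Being unable to
** populate a slot is not treated as an error, as in the original. */
static int set_read_lock(Slot *aSlot, int64_t iLsm, uint32_t iShm){
  int i;
  assert( aSlot!=0 );

  /* Pass 1: an exact match already exists, so there is nothing to do. */
  for(i=0; i<NREADER; i++){
    if( aSlot[i].iLsmId==iLsm && aSlot[i].iTreeId==iShm ) return i;
  }

  /* Pass 2: populate the first slot that can be locked exclusively. */
  for(i=0; i<NREADER; i++){
    if( aSlot[i].bLocked ) continue;   /* Models LSM_BUSY: try the next slot */
    aSlot[i].iLsmId = iLsm;
    aSlot[i].iTreeId = iShm;
    return i;
  }
  return -1;
}
```

Calling `set_read_lock()` twice with the same pair reuses the slot found on the first call, which is the "exact match" case the comment above describes.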
︙ | ︙ | |||
Lines 1433-1446 after the change (1 line removed after line 1439; the removed line is not shown):

      ** Ensure that database connection db has cached pointers to at least the
      ** first nChunk chunks of shared memory.
      */
      int lsmShmCacheChunks(lsm_db *db, int nChunk){
        int rc = LSM_OK;
        if( nChunk>db->nShm ){
          static const int NINCR = 16;
          Database *p = db->pDatabase;
          lsm_env *pEnv = db->pEnv;
          int nAlloc;
          int i;

          /* Ensure that the db->apShm[] array is large enough. If an attempt to
          ** allocate memory fails, return LSM_NOMEM immediately. The apShm[] array
︙ | ︙ |
Changes to www/lsm.wiki.
︙ | ︙ | |||
Lines 428-443 after the change (lines 435-436 changed):

      <ol>
        <li> <p>Load the current tree-header from shared-memory.

        <li> <p>Load the current snapshot from shared-memory.

      <p>Steps 1 and 2 are similar. In both cases, there are two copies of the data
      structure being read in shared memory. No lock is held to prevent
      another client updating them while the read is taking place. Updaters use
      the following pattern:

        <ol type=i>
          <li> Update copy 2.
          <li> Invoke xShmBarrier().
          <li> Update copy 1.
        </ol>
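The updater pattern in the wiki text (copy 2, barrier, copy 1) can be sketched in C as follows. This is an illustration under several assumptions, not the LSM implementation: `Hdr`, `SharedHdr`, `hdr_cksum()`, `hdr_update()` and `hdr_read()` are hypothetical names, C11 `atomic_thread_fence()` stands in for `xShmBarrier()`, the checksum is a toy function, and the reader side (validate copy 1, fall back to copy 2) is an assumed counterpart that the hunk above does not show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header record: payload plus a checksum over the payload. */
typedef struct {
  uint32_t aData[4];
  uint32_t cksum;
} Hdr;

/* Hypothetical shared-memory layout holding the two copies. */
typedef struct {
  Hdr copy1;
  Hdr copy2;
} SharedHdr;

/* Toy checksum, for illustration only. */
static uint32_t hdr_cksum(const Hdr *p){
  uint32_t s = 0x9e3779b9u;
  for(int i=0; i<4; i++) s = (s*31u) ^ p->aData[i];
  return s;
}

/* Updater: follows the three steps listed in the wiki text. */
static void hdr_update(SharedHdr *pShm, const uint32_t *aNew){
  Hdr h;
  assert( pShm!=0 && aNew!=0 );
  memcpy(h.aData, aNew, sizeof(h.aData));
  h.cksum = hdr_cksum(&h);
  pShm->copy2 = h;                             /* i.   Update copy 2 */
  atomic_thread_fence(memory_order_seq_cst);   /* ii.  Stand-in for xShmBarrier() */
  pShm->copy1 = h;                             /* iii. Update copy 1 */
}

/* Assumed reader: returns 1 if copy 1 was valid, 2 if it fell back to
** copy 2, or 0 if neither copy passed its checksum. */
static int hdr_read(const SharedHdr *pShm, Hdr *pOut){
  *pOut = pShm->copy1;
  if( hdr_cksum(pOut)==pOut->cksum ) return 1;
  atomic_thread_fence(memory_order_seq_cst);
  *pOut = pShm->copy2;
  if( hdr_cksum(pOut)==pOut->cksum ) return 2;
  return 0;
}
```

Because copy 2 is completely written before copy 1 is touched, a reader that catches copy 1 mid-update can still find a consistent record in copy 2. The sketch runs in a single process; real cross-process use would depend on the barrier and memory-mapping guarantees of the host environment.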
︙ | ︙ |