Overview
Comment: | Fix LSM single-process mode so that it holds an exclusive lock on the database file - preventing connections from within external processes. |
---|---|
SHA1: | d6bd08ca0eb731d9ac0a0e3b573947bf |
User & Date: | dan 2013-02-02 16:45:05.723 |
Context
2013-02-05
09:52 | Add test file lsm3.test, which should have been added a few days ago. check-in: 5dfd8651df user: dan tags: trunk |
2013-02-04
19:04 | Map and unmap parts of the database file on an LRU basis to limit the amount of address space consumed at any one time (for 32-bit address spaces). It looks like this might be slower than read() and write() anyway... check-in: d1b1a9e969 user: dan tags: mmap-on-demand |
2013-02-02
16:45 | Fix LSM single-process mode so that it holds an exclusive lock on the database file - preventing connections from within external processes. check-in: d6bd08ca0e user: dan tags: trunk |
2013-02-01
19:49 | Simplifications and clarifications to lsmusr.wiki. check-in: 33eca2e1f4 user: dan tags: trunk |
Changes
Changes to src/kvlsm.c.
︙ | ︙ | |||
Lines 449-464 (new version):

  if( pNew==0 ){
    rc = SQLITE4_NOMEM;
  }else{
    struct Config {
      const char *zParam;
      int eParam;
    } aConfig[] = {
      { "lsm_block_size", LSM_CONFIG_BLOCK_SIZE },
      { "lsm_multiple_processes", LSM_CONFIG_MULTIPLE_PROCESSES }
    };

    memset(pNew, 0, sizeof(KVLsm));
    pNew->base.pStoreVfunc = &kvlsmMethods;
    pNew->base.pEnv = pEnv;
    rc = lsm_new(0, &pNew->pDb);
    if( rc==SQLITE4_OK ){
︙ | ︙ |
Changes to src/lsm_shared.c.
︙ | ︙ | |||
Lines 30-59 (new version; lines marked + were added):

  /*
  ** Database structure. There is one such structure for each distinct
  ** database accessed by this process. They are stored in the singly linked
  ** list starting at global variable gShared.pDatabase. Database objects are
  ** reference counted. Once the number of connections to the associated
  ** database drops to zero, they are removed from the linked list and deleted.
+ **
+ ** pFile:
+ **   In multi-process mode, this file descriptor is used to obtain locks
+ **   and to access shared-memory. In single process mode, its only job is
+ **   to hold the exclusive lock on the file.
+ **
  */
  struct Database {
    /* Protected by the global mutex (enterGlobalMutex/leaveGlobalMutex): */
    char *zName;                    /* Canonical path to database file */
    int nName;                      /* strlen(zName) */
    int nDbRef;                     /* Number of associated lsm_db handles */
    Database *pDbNext;              /* Next Database structure in global list */

    /* Protected by the local mutex (pClientMutex) */
+   int bMultiProc;                 /* True if running in multi-process mode */
    lsm_file *pFile;                /* Used for locks/shm in multi-proc mode */
    LsmFile *pLsmFile;              /* List of deferred closes */
    lsm_mutex *pClientMutex;        /* Protects the apShmChunk[] and pConn */
    int nShmChunk;                  /* Number of entries in apShmChunk[] array */
    void **apShmChunk;              /* Array of "shared" memory regions */
    lsm_db *pConn;                  /* List of connections to this db. */
  };
︙ | ︙ | |||
Lines 272-286 (new version):

      /* If the checkpoint was written successfully, delete the log file
      ** and, if possible, truncate the database file.  */
      if( rc==LSM_OK ){
        Database *p = pDb->pDatabase;
        dbTruncateFile(pDb);
        lsmFsCloseAndDeleteLog(pDb->pFS);
        if( p->pFile && p->bMultiProc ) lsmEnvShmUnmap(pDb->pEnv, p->pFile, 1);
      }
    }
  }

  lsmShmLock(pDb, LSM_LOCK_DMS2, LSM_LOCK_UNLOCK, 0);
  lsmShmLock(pDb, LSM_LOCK_DMS1, LSM_LOCK_UNLOCK, 0);
  pDb->pShmhdr = 0;
︙ | ︙ | |||
Lines 321-345 (new version):

    if( rc==LSM_OK ){
      rc = lsmLogRecover(pDb);
    }
  }else if( rc==LSM_BUSY ){
    rc = LSM_OK;
  }

  /* Take a shared lock on DMS2. In multi-process mode this lock "cannot"
  ** fail, as connections may only hold an exclusive lock on DMS2 if they
  ** first hold an exclusive lock on DMS1. And this connection is currently
  ** holding the exclusive lock on DMS1.
  **
  ** However, if some other connection has the database open in single-process
  ** mode, this operation will fail. In this case, return the error to the
  ** caller - the attempt to connect to the db has failed.
  */
  if( rc==LSM_OK ){
    rc = lsmShmLock(pDb, LSM_LOCK_DMS2, LSM_LOCK_SHARED, 0);
  }

  /* If anything went wrong, unlock DMS2. Unlock DMS1 in any case. */
  if( rc!=LSM_OK ){
    lsmShmLock(pDb, LSM_LOCK_DMS2, LSM_LOCK_UNLOCK, 0);
    pDb->pShmhdr = 0;
  }
︙ | ︙ | |||
Lines 370-412 (new version):

  int nName = lsmStrlen(zName);

  assert( pDb->pDatabase==0 );
  rc = enterGlobalMutex(pEnv);
  if( rc==LSM_OK ){

    /* Search the global list for an existing object. TODO: Need something
    ** better than the memcmp() below to figure out if a given Database
    ** object represents the requested file.  */
    for(p=gShared.pDatabase; p; p=p->pDbNext){
      if( nName==p->nName && 0==memcmp(zName, p->zName, nName) ) break;
    }

    /* If no suitable Database object was found, allocate a new one. */
    if( p==0 ){
      p = (Database *)lsmMallocZeroRc(pEnv, sizeof(Database)+nName+1, &rc);

      /* If the allocation was successful, fill in other fields and
      ** allocate the client mutex. */
      if( rc==LSM_OK ){
        p->bMultiProc = pDb->bMultiProc;
        p->zName = (char *)&p[1];
        p->nName = nName;
        memcpy((void *)p->zName, zName, nName+1);
        rc = lsmMutexNew(pEnv, &p->pClientMutex);
      }

      /* If nothing has gone wrong so far, open the shared fd. And if that
      ** succeeds and this connection requested single-process mode,
      ** attempt to take the exclusive lock on DMS2.  */
      if( rc==LSM_OK ){
        rc = lsmEnvOpen(pDb->pEnv, p->zName, &p->pFile);
      }
      if( rc==LSM_OK && p->bMultiProc==0 ){
        rc = lsmEnvLock(pDb->pEnv, p->pFile, LSM_LOCK_DMS2, LSM_LOCK_EXCL);
      }

      if( rc==LSM_OK ){
        p->pDbNext = gShared.pDatabase;
        gShared.pDatabase = p;
      }else{
        freeDatabase(pEnv, p);
        p = 0;
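Reviewer note: the hunk above relies on a file lock held by one process being visible to every other process that opens the same file. The following is a minimal sketch of that idea using POSIX `fcntl()` byte-range locks. It is not the LSM implementation: the helper names `try_excl_lock` and `other_process_locked_out`, and the choice of which byte to lock, are assumptions made for illustration only.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attempt a non-blocking exclusive (write) lock on a
** single byte of the file. Returns 0 on success, -1 if the lock is held
** elsewhere or some other error occurs. */
static int try_excl_lock(int fd, off_t iByte){
  struct flock lock;
  assert( fd>=0 );
  memset(&lock, 0, sizeof(lock));
  lock.l_type = F_WRLCK;
  lock.l_whence = SEEK_SET;
  lock.l_start = iByte;
  lock.l_len = 1;
  return fcntl(fd, F_SETLK, &lock)==0 ? 0 : -1;
}

/* Hypothetical helper: fork a child that opens zPath and tries to take the
** same lock. Returns 1 if the child was locked out, 0 if the child obtained
** the lock (it is released again when the child exits), -1 on error. */
static int other_process_locked_out(const char *zPath, off_t iByte){
  int status = 0;
  pid_t pid = fork();
  if( pid<0 ) return -1;
  if( pid==0 ){
    int fd = open(zPath, O_RDWR);
    int rc = 2;                     /* 2 means "could not open the file" */
    if( fd>=0 ) rc = (try_excl_lock(fd, iByte)==0) ? 0 : 1;
    _exit(rc);
  }
  if( waitpid(pid, &status, 0)<0 || !WIFEXITED(status) ) return -1;
  if( WEXITSTATUS(status)==2 ) return -1;
  return WEXITSTATUS(status)==1;
}
```

Because POSIX record locks are owned by the process rather than by the file descriptor, a second descriptor opened inside the same process would not be locked out. That is consistent with the check-in keeping one shared `pFile` per `Database` object and tracking per-connection locks in memory.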
︙ | ︙ | |||
Lines 472-512 (new version):

    if( pDb->pShmhdr ){
      doDbDisconnect(pDb);
    }

    lsmMutexEnter(pDb->pEnv, p->pClientMutex);
    for(ppDb=&p->pConn; *ppDb!=pDb; ppDb=&((*ppDb)->pNext));
    *ppDb = pDb->pNext;
    dbDeferClose(pDb);
    lsmMutexLeave(pDb->pEnv, p->pClientMutex);

    enterGlobalMutex(pDb->pEnv);
    p->nDbRef--;
    if( p->nDbRef==0 ){
      LsmFile *pIter;
      LsmFile *pNext;
      Database **pp;

      /* Remove the Database structure from the linked list. */
      for(pp=&gShared.pDatabase; *pp!=p; pp=&((*pp)->pDbNext));
      *pp = p->pDbNext;

      /* If they were allocated from the heap, free the shared memory chunks */
      if( p->bMultiProc==0 ){
        int i;
        for(i=0; i<p->nShmChunk; i++){
          lsmFree(pDb->pEnv, p->apShmChunk[i]);
        }
      }

      /* Close any outstanding file descriptors */
      for(pIter=p->pLsmFile; pIter; pIter=pNext){
        pNext = pIter->pNext;
        lsmEnvClose(pDb->pEnv, pIter->pFile);
        lsmFree(pDb->pEnv, pIter);
      }
      freeDatabase(pDb->pEnv, p);
    }
    leaveGlobalMutex(pDb->pEnv);
  }
}
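Reviewer note: the connect and disconnect hunks share one pattern: look up a reference-counted object by name in a singly linked global list, and unlink it with a pointer-to-pointer walk once the count reaches zero. A stripped-down sketch of that pattern follows. The names `Node`, `node_acquire`, `node_release` and `list_length` are hypothetical, and the sketch omits the global mutex that the real code holds around these operations.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Node Node;
struct Node {
  char *zName;          /* Points into the same allocation, after the struct */
  int nRef;             /* Number of outstanding references */
  Node *pNext;          /* Next entry in the global list */
};

static Node *gList = 0; /* Head of the global list */

/* Return the entry named zName, creating it if required. The reference
** count is incremented. Returns 0 if memory allocation fails. */
static Node *node_acquire(const char *zName){
  Node *p;
  for(p=gList; p; p=p->pNext){
    if( strcmp(p->zName, zName)==0 ) break;
  }
  if( p==0 ){
    size_t n = strlen(zName);
    p = (Node *)calloc(1, sizeof(Node)+n+1);
    if( p==0 ) return 0;
    p->zName = (char *)&p[1];
    memcpy(p->zName, zName, n+1);
    p->pNext = gList;
    gList = p;
  }
  p->nRef++;
  return p;
}

/* Drop one reference. When the count reaches zero, unlink the entry by
** walking a pointer-to-pointer so that no "previous" node is needed. */
static void node_release(Node *p){
  p->nRef--;
  if( p->nRef==0 ){
    Node **pp;
    for(pp=&gList; *pp!=p; pp=&((*pp)->pNext));
    *pp = p->pNext;
    free(p);
  }
}

/* Count the entries currently in the list. */
static int list_length(void){
  int n = 0;
  Node *p;
  for(p=gList; p; p=p->pNext) n++;
  return n;
}
```

Acquiring the same name twice returns the same pointer, and the entry only leaves the list after the matching number of releases.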
︙ | ︙ | |||
Lines 1306-1320 (new version):

/*
** This function may only be called after a successful call to
** lsmDbDatabaseConnect(). It returns true if the connection is in
** multi-process mode, or false otherwise.
*/
int lsmDbMultiProc(lsm_db *pDb){
  return pDb->pDatabase && pDb->pDatabase->bMultiProc;
}


/*************************************************************************
**************************************************************************
**************************************************************************
**************************************************************************
︙ | ︙ | |||
Lines 1364-1378 (new version):

      }
      p->apShmChunk = apShm;
    }

    for(i=db->nShm; rc==LSM_OK && i<nChunk; i++){
      if( i>=p->nShmChunk ){
        void *pChunk = 0;
        if( p->bMultiProc==0 ){
          /* Single process mode */
          pChunk = lsmMallocZeroRc(pEnv, LSM_SHM_CHUNK_SIZE, &rc);
        }else{
          /* Multi-process mode */
          rc = lsmEnvShmMap(pEnv, p->pFile, i, LSM_SHM_CHUNK_SIZE, &pChunk);
        }
        if( rc==LSM_OK ){
︙ | ︙ | |||
Lines 1388-1409 (new version; lines marked + were added):

      /* Release the client mutex */
      lsmMutexLeave(pEnv, p->pClientMutex);
    }

    return rc;
  }

+ static int lockSharedFile(lsm_env *pEnv, Database *p, int iLock, int eOp){
+   int rc = LSM_OK;
+   if( p->bMultiProc ){
+     rc = lsmEnvLock(pEnv, p->pFile, iLock, eOp);
+   }
+   return rc;
+ }
+
  /*
  ** Attempt to obtain the lock identified by the iLock and bExcl parameters.
  ** If successful, return LSM_OK. If the lock cannot be obtained because
  ** there exists some other conflicting lock, return LSM_BUSY. If some other
  ** error occurs, return an LSM error code.
  **
︙ | ︙ | |||
Lines 1450-1486 (new version):

      assert( nExcl==0 || nExcl==1 );
      assert( nExcl==0 || nShared==0 );
      assert( nExcl==0 || (db->mLock & (me|ms))==0 );

      switch( eOp ){
        case LSM_LOCK_UNLOCK:
          if( nShared==0 ){
            lockSharedFile(db->pEnv, p, iLock, LSM_LOCK_UNLOCK);
          }
          db->mLock &= ~(me|ms);
          break;

        case LSM_LOCK_SHARED:
          if( nExcl ){
            rc = LSM_BUSY;
          }else{
            if( nShared==0 ){
              rc = lockSharedFile(db->pEnv, p, iLock, LSM_LOCK_SHARED);
            }
            db->mLock |= ms;
            db->mLock &= ~me;
          }
          break;

        default:
          assert( eOp==LSM_LOCK_EXCL );
          if( nExcl || nShared ){
            rc = LSM_BUSY;
          }else{
            rc = lockSharedFile(db->pEnv, p, iLock, LSM_LOCK_EXCL);
            db->mLock |= (me|ms);
          }
          break;
      }

      lsmMutexLeave(db->pEnv, p->pClientMutex);
    }
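Reviewer note: the switch above combines two layers. Each connection records its locks in a bitmask, and the shared file descriptor is only touched when the first in-process holder takes a lock or the last one drops it, and only in multi-process mode. The model below is a simplified sketch of that bookkeeping, not the LSM code. The type and function names, the return codes, and the bit layout (exclusive bit `iLock-1`, shared bit `iLock+31`) are assumptions for illustration, and `file_lock` merely counts the calls that would have reached the file.

```c
#include <stdint.h>

enum { LOCK_UNLOCK = 0, LOCK_SHARED = 1, LOCK_EXCL = 2 };
enum { RC_OK = 0, RC_BUSY = 5 };          /* Hypothetical return codes */

typedef struct Conn { uint64_t mLock; } Conn;

typedef struct Shared {
  Conn *aConn;          /* All connections in this process */
  int nConn;            /* Number of entries in aConn[] */
  int bMultiProc;       /* True in multi-process mode */
  int nFileOp;          /* Calls that would have reached the file lock */
} Shared;

/* Stand-in for a real file lock: a no-op in single-process mode. */
static int file_lock(Shared *p, int iLock, int eOp){
  (void)iLock; (void)eOp;
  if( p->bMultiProc ) p->nFileOp++;
  return RC_OK;
}

static int shm_lock(Shared *p, Conn *db, int iLock, int eOp){
  uint64_t me = (uint64_t)1 << (iLock-1);       /* Exclusive bit (assumed) */
  uint64_t ms = (uint64_t)1 << (iLock+32-1);    /* Shared bit (assumed) */
  int nExcl = 0, nShared = 0, rc = RC_OK, i;

  /* Count the locks held by the other connections in this process. */
  for(i=0; i<p->nConn; i++){
    Conn *pIter = &p->aConn[i];
    if( pIter==db ) continue;
    if( pIter->mLock & me ) nExcl++;
    else if( pIter->mLock & ms ) nShared++;
  }

  switch( eOp ){
    case LOCK_UNLOCK:
      if( nShared==0 ) file_lock(p, iLock, LOCK_UNLOCK);
      db->mLock &= ~(me|ms);
      break;
    case LOCK_SHARED:
      if( nExcl ){
        rc = RC_BUSY;
      }else{
        if( nShared==0 ) rc = file_lock(p, iLock, LOCK_SHARED);
        /* Sketch-only choice: record the lock only if file_lock succeeded. */
        if( rc==RC_OK ){ db->mLock |= ms; db->mLock &= ~me; }
      }
      break;
    default:  /* LOCK_EXCL */
      if( nExcl || nShared ){
        rc = RC_BUSY;
      }else{
        rc = file_lock(p, iLock, LOCK_EXCL);
        if( rc==RC_OK ) db->mLock |= (me|ms);
      }
      break;
  }
  return rc;
}
```

With two connections in multi-process mode, the second shared lock and the first unlock never reach the file. In single-process mode `nFileOp` stays at zero throughout.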
︙ | ︙ |
Changes to www/lsmusr.wiki.
︙ | ︙ | |||
Lines 802-838 (new version):

<li> <p>
Once sufficient data has been accumulated in an in-memory tree (by default
"sufficient data" means 1MB, including data structure overhead), it is
marked as "old" and a new "live" in-memory tree created. An old in-memory
tree is immutable - new data is always inserted into the live tree. There
may be at most one old tree in memory at a time.

<li> <p>
The contents of an old in-memory tree may be written into the database
file at any point. Once its contents have been written (or "flushed") to
the database file, the in-memory tree may be discarded. Flushing an
in-memory tree to the database file creates a new database "segment". A
database segment is an immutable b-tree structure stored within the
database file. A single database file may contain up to 64 segments.

<li> <p>
At any point, two or more existing segments within the database file may
be merged together into a single segment. Once their contents have been
merged into the new segment, the original segments may be discarded.

<li> <p>
After the set of segments in a database file has been modified (either by
flushing an in-memory tree to disk or by merging existing segments
together), the changes may be made persistent by "checkpointing" the
database. Checkpointing involves updating the database file header and
(usually) syncing the contents of the database file to disk.
</ol>

<p>Steps 3 and 4 above are known as "working" on the database. Step 5 is
referred to as "checkpointing". By default, database connections perform work
and checkpoint operations periodically from within calls to API functions
<code>lsm_insert</code>, <code>lsm_delete</code>, <code>lsm_delete_range</code>
and <code>lsm_commit</code> (i.e. functions that write to the database).
︙ | ︙ | |||
Lines 994-1008 (new version):

closing read and write transactions.

<p>This option can only be set before lsm_open() is called on the database
connection.

<p>If this option is set to false and there is already a connection to the
database from another process when lsm_open() is called, the lsm_open()
call fails with error code LSM_BUSY.

<dt> <a href=lsmapi.wiki#LSM_CONFIG_SAFETY>LSM_CONFIG_SAFETY</a>
<dd> <p style=margin-top:0>
The effect of this option on <a href=#data_durability>data durability</a>
is described above.

<p>From a performance point of view, this option determines how often the
︙ | ︙ | |||
Lines 1298-1318 (new version; lines marked + were added):

  the nMerge argument set to 1 and the third parameter set to a negative value
  (interpreted as - keep working until there is no more work to do). For
  example:

  <verbatim>
    rc = lsm_work(db, 1, -1, 0);
  </verbatim>

+ <p><span style=color:red>todo: the -1 as the 3rd argument above is currently
+ not supported</span>
+
  <p>When optimizing the database as above, either the LSM_CONFIG_AUTOCHECKPOINT
  parameter should be set to a non-zero value or lsm_checkpoint() should be
  called periodically. Otherwise, no checkpoints will be performed, preventing
  the library from reusing any space occupied by old segments even after their
  content has been merged into the new segment. The result - a database file
  that is optimized, except that it is up to twice as large as it otherwise
  would be.