Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Allocate more overflow data onto overflow pages, thus wasting less disk space. (CVS 1367) |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
1d52a4bb478648ef53a0dbb21865ccb9 |
User & Date: | drh 2004-05-13 01:12:57.000 |
Context
2004-05-13
| ||
05:16 | Manifest types in indices. At the moment indices use manifest typing, but some other parts of the SQL engine do not, which can lead to some strange results. (CVS 1368) (check-in: 9f2b6d9d3a user: danielk1977 tags: trunk) | |
01:12 | Allocate more overflow data onto overflow pages, thus wasting less disk space. (CVS 1367) (check-in: 1d52a4bb47 user: drh tags: trunk) | |
2004-05-12
| ||
21:11 | Fix a problem with B+trees. (CVS 1366) (check-in: 64a75c4cd4 user: drh tags: trunk) | |
Changes
Changes to src/btree.c.
1 2 3 4 5 6 7 8 9 10 11 | /* ** 2004 April 6 ** ** The author disclaims copyright to this source code. In place of ** a legal notice, here is a blessing: ** ** May you do good and not evil. ** May you find forgiveness for yourself and forgive others. ** May you share freely, never taking more than you give. ** ************************************************************************* | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | /* ** 2004 April 6 ** ** The author disclaims copyright to this source code. In place of ** a legal notice, here is a blessing: ** ** May you do good and not evil. ** May you find forgiveness for yourself and forgive others. ** May you share freely, never taking more than you give. ** ************************************************************************* ** $Id: btree.c,v 1.131 2004/05/13 01:12:57 drh Exp $ ** ** This file implements a external (disk-based) database using BTrees. ** For a detailed discussion of BTrees, refer to ** ** Donald E. Knuth, THE ART OF COMPUTER PROGRAMMING, Volume 3: ** "Sorting And Searching", pages 473-480. Addison-Wesley ** Publishing Company, Reading, Massachusetts. |
︙ | ︙ | |||
57 58 59 60 61 62 63 | ** of that header is as follows: ** ** OFFSET SIZE DESCRIPTION ** 0 16 Header string: "SQLite format 3\000" ** 16 2 Page size in bytes. ** 18 1 File format write version ** 19 1 File format read version | | > | > | > > | > > | | > > > > > > > > > > > > > > > > | | | | | > | | | 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 | ** of that header is as follows: ** ** OFFSET SIZE DESCRIPTION ** 0 16 Header string: "SQLite format 3\000" ** 16 2 Page size in bytes. ** 18 1 File format write version ** 19 1 File format read version ** 20 1 Bytes of unused space at the end of each page ** 21 1 Max embedded payload fraction ** 22 1 Min embedded payload fraction ** 23 1 Min leaf payload fraction ** 24 4 File change counter ** 28 4 Reserved for future use ** 32 4 First freelist page ** 36 4 Number of freelist pages in the file ** 40 60 15 4-byte meta values passed to higher layers ** ** All of the integer values are big-endian (most significant byte first). ** ** The file change counter is incremented every time the database is more ** than once within the same second. This counter, together with the ** modification time of the file, allows other processes to know ** when the file has changed and thus when they need to flush their ** cache. ** ** The max embedded payload fraction is the amount of the total usable ** space in a page that can be consumed by a single cell for standard ** B-tree (non-LEAFDATA) tables. A value of 255 means 100%. The default ** is to limit the maximum cell size so that at least 4 cells will fit ** on one pages. Thus the default max embedded payload fraction is 64. ** ** If the payload for a cell is larger than the max payload, then extra ** payload is spilled to overflow pages. Once an overflow page is allocated, ** as many bytes as possible are moved into the overflow pages without letting ** the cell size drop below the min embedded payload fraction. ** ** The min leaf payload fraction is like the min embedded payload fraction ** except that it applies to leaf nodes in a LEAFDATA tree. The maximum ** payload fraction for a LEAFDATA tree is always 100% (or 255) and it ** not specified in the header. ** ** Each btree page begins with a header described below. Note that the ** header for page one begins at byte 100. For all other btree pages, the ** header begins on byte zero. ** ** OFFSET SIZE DESCRIPTION ** 0 1 Flags. 1: intkey, 2: zerodata, 4: leafdata, 8: leaf ** 1 2 byte offset to the first freeblock ** 3 2 byte offset to the first cell ** 5 1 number of fragmented free bytes ** 6 4 Right child (the Ptr(N+1) value). Omitted if leaf ** ** The flags define the format of this btree page. The leaf flag means that ** this page has no children. The zerodata flag means that this page carries ** only keys and no data. The intkey flag means that the key is a single ** variable length integer at the beginning of the payload. ** ** A variable-length integer is 1 to 9 bytes where the lower 7 bits of each ** byte are used. The integer consists of all bytes that have bit 8 set and ** the first byte with bit 8 clear. The most significant byte of the integer ** appears first. ** ** 0x00 becomes 0x00000000 ** 0x7f becomes 0x0000007f ** 0x81 0x00 becomes 0x00000080 ** 0x82 0x00 becomes 0x00000100 ** 0x80 0x7f becomes 0x0000007f ** 0x8a 0x91 0xd1 0xac 0x78 becomes 0x12345678 ** 0x81 0x81 0x81 0x81 0x01 becomes 0x10204081 ** ** Variable length integers are used for rowids and to hold the number of ** bytes of key and data in a btree cell. ** ** Unused space within a btree page is collected into a linked list of ** freeblocks. Each freeblock is at least 4 bytes in size. The byte offset |
︙ | ︙ | |||
160 161 162 163 164 165 166 | ** or any machine where memory and especially stack memory is limited, ** one may wish to chose a smaller value for the maximum page size. */ #ifndef MX_PAGE_SIZE # define MX_PAGE_SIZE 1024 #endif | < < < < < < | | 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 | ** or any machine where memory and especially stack memory is limited, ** one may wish to chose a smaller value for the maximum page size. */ #ifndef MX_PAGE_SIZE # define MX_PAGE_SIZE 1024 #endif /* The following value is the maximum cell size assuming a maximum page ** size give above. */ #define MX_CELL_SIZE (MX_PAGE_SIZE-10) /* The maximum number of cells on a single page of the database. This ** assumes a minimum cell size of 3 bytes. Such small cells will be ** exceedingly rare, but they are possible. */ #define MX_CELL ((MX_PAGE_SIZE-10)/3) |
︙ | ︙ | |||
247 248 249 250 251 252 253 | Pager *pPager; /* The page cache */ BtCursor *pCursor; /* A list of all open cursors */ MemPage *pPage1; /* First page of the database */ u8 inTrans; /* True if a transaction is in progress */ u8 inStmt; /* True if there is a checkpoint on the transaction */ u8 readOnly; /* True if the underlying file is readonly */ int pageSize; /* Number of usable bytes on each page */ | | > > > > > > | 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 | Pager *pPager; /* The page cache */ BtCursor *pCursor; /* A list of all open cursors */ MemPage *pPage1; /* First page of the database */ u8 inTrans; /* True if a transaction is in progress */ u8 inStmt; /* True if there is a checkpoint on the transaction */ u8 readOnly; /* True if the underlying file is readonly */ int pageSize; /* Number of usable bytes on each page */ int maxLocal; /* Maximum local payload in non-LEAFDATA tables */ int minLocal; /* Minimum local payload in non-LEAFDATA tables */ int maxLeaf; /* Maximum local payload in a LEAFDATA table */ int minLeaf; /* Minimum local payload in a LEAFDATA table */ u8 maxEmbedFrac; /* Maximum payload as % of total page size */ u8 minEmbedFrac; /* Minimum payload as % of total page size */ u8 minLeafFrac; /* Minimum leaf payload as % of total page size */ }; typedef Btree Bt; /* ** A cursor is a pointer to a particular entry in the BTree. ** The entry is identified by its MemPage and the index in ** MemPage.aCell[] of the entry. |
︙ | ︙ | |||
271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 | int idx; /* Index of the entry in pPage->aCell[] */ u8 wrFlag; /* True if writable */ u8 iMatch; /* compare result from last sqlite3BtreeMoveto() */ u8 isValid; /* TRUE if points to a valid entry */ u8 status; /* Set to SQLITE_ABORT if cursors is invalidated */ }; /* ** Read or write a two-, four-, and eight-byte big-endian integer values. */ static u32 get2byte(unsigned char *p){ return (p[0]<<8) | p[1]; } static u32 get4byte(unsigned char *p){ return (p[0]<<24) | (p[1]<<16) | (p[2]<<8) | p[3]; } | > > > > > > > > > > > > > > < < < < > > > > > > > | > > | | > > > | > > > > > > > > > > | | | > > > | | | < < > > > | | | | > > > > > > > > > > > > > > > > > > > > > > > > > > | < < < | < < < < < < < < | | 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 | int idx; /* Index of the entry in pPage->aCell[] */ u8 wrFlag; /* True if writable */ u8 iMatch; /* compare result from last sqlite3BtreeMoveto() */ u8 isValid; /* TRUE if points to a valid entry */ u8 status; /* Set to SQLITE_ABORT if cursors is invalidated */ }; /* ** An instance of the following structure is used to hold information ** about a cell. The parseCell() function fills the structure in. */ typedef struct CellInfo CellInfo; struct CellInfo { i64 nKey; /* The key for INTKEY tables, or number of bytes in key */ u32 nData; /* Number of bytes of data */ int nHeader; /* Size of the header in bytes */ int nLocal; /* Amount of payload held locally */ int iOverflow; /* Offset to overflow page number. Zero if none */ int nSize; /* Size of the cell */ }; /* ** Read or write a two-, four-, and eight-byte big-endian integer values. */ static u32 get2byte(unsigned char *p){ return (p[0]<<8) | p[1]; } static u32 get4byte(unsigned char *p){ return (p[0]<<24) | (p[1]<<16) | (p[2]<<8) | p[3]; } static void put2byte(unsigned char *p, u32 v){ p[0] = v>>8; p[1] = v; } static void put4byte(unsigned char *p, u32 v){ p[0] = v>>24; p[1] = v>>16; p[2] = v>>8; p[3] = v; } #if 0 /* NOT_USED */ static u64 get8byte(unsigned char *p){ u64 v = get4byte(p); return (v<<32) | get4byte(&p[4]); } static void put8byte(unsigned char *p, u64 v){ put4byte(&p[4], v>>32); put4byte(p, v); } #endif /* ** Read a variable-length integer. Store the result in *pResult. ** Return the number of bytes in the integer. */ static unsigned int getVarint(unsigned char *p, u64 *pResult){ u64 x = 0; int n = 0; unsigned char c; do{ c = p[n++]; x = (x<<7) | (c & 0x7f); }while( (c & 0x80)!=0 ); *pResult = x; return n; } static unsigned int getVarint32(unsigned char *p, u32 *pResult){ u32 x = 0; int n = 0; unsigned char c; do{ c = p[n++]; x = (x<<7) | (c & 0x7f); }while( (c & 0x80)!=0 ); *pResult = x; return n; } /* ** Write a variable length integer with value v into p[]. Return ** the number of bytes written. */ static unsigned int putVarint(unsigned char *p, u64 v){ int i, j, n; u8 buf[10]; n = 0; do{ buf[n++] = (v & 0x7f) | 0x80; v >>= 7; }while( v!=0 ); buf[0] &= 0x7f; for(i=0, j=n-1; j>=0; j--, i++){ p[i] = buf[j]; } return n; } /* ** Parse a cell header and fill in the CellInfo structure. */ static void parseCell( MemPage *pPage, /* Page containing the cell */ unsigned char *pCell, /* The cell */ CellInfo *pInfo /* Fill in this structure */ ){ int n; int nPayload; Btree *pBt; int minLocal, maxLocal; if( pPage->leaf ){ n = 2; }else{ n = 6; } if( pPage->hasData ){ n += getVarint32(&pCell[n], &pInfo->nData); }else{ pInfo->nData = 0; } n += getVarint(&pCell[n], &pInfo->nKey); pInfo->nHeader = n; nPayload = pInfo->nData; if( !pPage->intKey ){ nPayload += pInfo->nKey; } pBt = pPage->pBt; if( pPage->leafData ){ minLocal = pBt->minLeaf; maxLocal = pBt->pageSize - 23; }else{ minLocal = pBt->minLocal; maxLocal = pBt->maxLocal; } if( nPayload<=maxLocal ){ pInfo->nLocal = nPayload; pInfo->iOverflow = 0; pInfo->nSize = nPayload + n; }else{ int surplus = minLocal + (nPayload - minLocal)%(pBt->pageSize - 4); if( surplus <= maxLocal ){ pInfo->nLocal = surplus; }else{ pInfo->nLocal = minLocal; } pInfo->iOverflow = pInfo->nLocal + n; pInfo->nSize = pInfo->iOverflow + 4; } } /* ** Compute the total number of bytes that a Cell needs on the main ** database page. The number returned includes the Cell header, ** local payload storage, and the pointer to overflow pages (if ** applicable). Additional space allocated on overflow pages ** is NOT included in the value returned from this routine. */ static int cellSize(MemPage *pPage, unsigned char *pCell){ CellInfo info; parseCell(pPage, pCell, &info); return info.nSize; } /* ** Do sanity checking on a page. Throw an exception if anything is ** not right. ** ** This routine is used for internal error checking only. It is omitted |
︙ | ︙ | |||
900 901 902 903 904 905 906 907 908 | return rc; } sqlite3pager_set_destructor(pBt->pPager, pageDestructor); pBt->pCursor = 0; pBt->pPage1 = 0; pBt->readOnly = sqlite3pager_isreadonly(pBt->pPager); pBt->pageSize = SQLITE_PAGE_SIZE; /* FIX ME - read from header */ /* maxLocal is the maximum amount of payload to store locally for | > > > | | | > > > | > > > > | | 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 | return rc; } sqlite3pager_set_destructor(pBt->pPager, pageDestructor); pBt->pCursor = 0; pBt->pPage1 = 0; pBt->readOnly = sqlite3pager_isreadonly(pBt->pPager); pBt->pageSize = SQLITE_PAGE_SIZE; /* FIX ME - read from header */ pBt->maxEmbedFrac = 64; /* FIX ME - read from header */ pBt->minEmbedFrac = 32; /* FIX ME - read from header */ pBt->minLeafFrac = 32; /* FIX ME - read from header */ /* maxLocal is the maximum amount of payload to store locally for ** a cell. Make sure it is small enough so that at least minFanout ** cells can will fit on one page. We assume a 10-byte page header. ** Besides the payload, the cell must store: ** 2-byte pointer to next cell ** 4-byte child pointer ** 9-byte nKey value ** 4-byte nData value ** 4-byte overflow page pointer ** So a cell consists of a header which is as much as 19 bytes long, ** 0 to N bytes of payload, and an optional 4 byte overflow page pointer. */ assert(pBt->maxEmbedFrac>0 && 255/pBt->maxEmbedFrac>=3 ); pBt->maxLocal = (pBt->pageSize-10)*pBt->maxEmbedFrac/255 - 23; pBt->minLocal = (pBt->pageSize-10)*pBt->minEmbedFrac/255 - 23; pBt->maxLeaf = pBt->pageSize - 33; pBt->minLeaf = (pBt->pageSize-10)*pBt->minLeafFrac/255 - 23; assert( pBt->maxLeaf + 23 <= MX_CELL_SIZE ); *ppBtree = pBt; return SQLITE_OK; } /* ** Close an open database and invalidate all cursors. */ |
︙ | ︙ | |||
1433 1434 1435 1436 1437 1438 1439 | ** Failure is not possible. If the cursor is not currently ** pointing to an entry (which can happen, for example, if ** the database is empty) then *pSize is set to 0. */ int sqlite3BtreeDataSize(BtCursor *pCur, u32 *pSize){ MemPage *pPage; unsigned char *cell; | < | < < | 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 | ** Failure is not possible. If the cursor is not currently ** pointing to an entry (which can happen, for example, if ** the database is empty) then *pSize is set to 0. */ int sqlite3BtreeDataSize(BtCursor *pCur, u32 *pSize){ MemPage *pPage; unsigned char *cell; if( !pCur->isValid ){ return pCur->status ? pCur->status : SQLITE_INTERNAL; } pPage = pCur->pPage; assert( pPage!=0 ); assert( pPage->isInit ); pageIntegrity(pPage); if( !pPage->hasData ){ *pSize = 0; }else{ assert( pCur->idx>=0 && pCur->idx<pPage->nCell ); cell = pPage->aCell[pCur->idx]; cell += 2; /* Skip the offset to the next cell */ if( !pPage->leaf ){ cell += 4; /* Skip the child pointer */ } getVarint32(cell, pSize); } return SQLITE_OK; } /* ** Read payload information from the entry that the pCur cursor is ** pointing to. Begin reading the payload at "offset" and read |
︙ | ︙ | |||
1478 1479 1480 1481 1482 1483 1484 | int skipKey /* offset begins at data if this is true */ ){ unsigned char *aPayload; Pgno nextPage; int rc; MemPage *pPage; Btree *pBt; | < < | > < < | < < | < < < < | | | < | | | | | | 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 | int skipKey /* offset begins at data if this is true */ ){ unsigned char *aPayload; Pgno nextPage; int rc; MemPage *pPage; Btree *pBt; int ovflSize; CellInfo info; assert( pCur!=0 && pCur->pPage!=0 ); assert( pCur->isValid ); pBt = pCur->pBt; pPage = pCur->pPage; pageIntegrity(pPage); assert( pCur->idx>=0 && pCur->idx<pPage->nCell ); aPayload = pPage->aCell[pCur->idx]; parseCell(pPage, aPayload, &info); aPayload += info.nHeader; if( pPage->intKey ){ info.nKey = 0; } assert( offset>=0 ); if( skipKey ){ offset += info.nKey; } if( offset+amt > info.nKey+info.nData ){ return SQLITE_ERROR; } if( offset<info.nLocal ){ int a = amt; if( a+offset>info.nLocal ){ a = info.nLocal - offset; } memcpy(pBuf, &aPayload[offset], a); if( a==amt ){ return SQLITE_OK; } offset = 0; pBuf += a; amt -= a; }else{ offset -= info.nLocal; } if( amt>0 ){ nextPage = get4byte(&aPayload[info.nLocal]); } ovflSize = pBt->pageSize - 4; while( amt>0 && nextPage ){ rc = sqlite3pager_get(pBt->pPager, nextPage, (void**)&aPayload); if( rc!=0 ){ return rc; } |
︙ | ︙ | |||
1625 1626 1627 1628 1629 1630 1631 | BtCursor *pCur, /* Cursor pointing to entry to read from */ int amt, /* Amount requested */ int skipKey /* read beginning at data if this is true */ ){ unsigned char *aPayload; MemPage *pPage; Btree *pBt; | < < | < < | < < | < < < < | < | | | | | | | | 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 | BtCursor *pCur, /* Cursor pointing to entry to read from */ int amt, /* Amount requested */ int skipKey /* read beginning at data if this is true */ ){ unsigned char *aPayload; MemPage *pPage; Btree *pBt; CellInfo info; assert( pCur!=0 && pCur->pPage!=0 ); assert( pCur->isValid ); pBt = pCur->pBt; pPage = pCur->pPage; pageIntegrity(pPage); assert( pCur->idx>=0 && pCur->idx<pPage->nCell ); aPayload = pPage->aCell[pCur->idx]; parseCell(pPage, aPayload, &info); aPayload += info.nHeader; if( pPage->intKey ){ info.nKey = 0; } if( skipKey ){ aPayload += info.nKey; info.nLocal -= info.nKey; if( amt<0 ) amt = info.nData; assert( amt<=info.nData ); }else{ if( amt<0 ) amt = info.nKey; assert( amt<=info.nKey ); } if( amt>info.nLocal ){ return 0; /* If any of the data is not local, return nothing */ } return aPayload; } /* |
︙ | ︙ | |||
2293 2294 2295 2296 2297 2298 2299 | } /* ** Free any overflow pages associated with the given Cell. */ static int clearCell(MemPage *pPage, unsigned char *pCell){ Btree *pBt = pPage->pBt; | | < < > | < < | < < < | | 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 | } /* ** Free any overflow pages associated with the given Cell. */ static int clearCell(MemPage *pPage, unsigned char *pCell){ Btree *pBt = pPage->pBt; CellInfo info; Pgno ovflPgno; int rc; parseCell(pPage, pCell, &info); if( info.iOverflow==0 ){ return SQLITE_OK; /* No overflow pages. Return without doing anything */ } ovflPgno = get4byte(&pCell[info.iOverflow]); while( ovflPgno!=0 ){ MemPage *pOvfl; rc = getPage(pBt, ovflPgno, &pOvfl); if( rc ) return rc; ovflPgno = get4byte(pOvfl->aData); rc = freePage(pOvfl); if( rc ) return rc; |
︙ | ︙ | |||
2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 | MemPage *pOvfl = 0; MemPage *pToRelease = 0; unsigned char *pPrior; unsigned char *pPayload; Btree *pBt = pPage->pBt; Pgno pgnoOvfl = 0; int nHeader; /* Fill in the header. */ nHeader = 2; if( !pPage->leaf ){ nHeader += 4; } if( pPage->hasData ){ nHeader += putVarint(&pCell[nHeader], nData); } nHeader += putVarint(&pCell[nHeader], *(u64*)&nKey); /* Fill in the payload */ | > > > > > > > < < < < < < | < | | | 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 | MemPage *pOvfl = 0; MemPage *pToRelease = 0; unsigned char *pPrior; unsigned char *pPayload; Btree *pBt = pPage->pBt; Pgno pgnoOvfl = 0; int nHeader; CellInfo info; /* Fill in the header. */ nHeader = 2; if( !pPage->leaf ){ nHeader += 4; } if( pPage->hasData ){ nHeader += putVarint(&pCell[nHeader], nData); }else{ nData = 0; } nHeader += putVarint(&pCell[nHeader], *(u64*)&nKey); parseCell(pPage, pCell, &info); assert( info.nHeader==nHeader ); assert( info.nKey==nKey ); assert( info.nData==nData ); /* Fill in the payload */ nPayload = nData; if( pPage->intKey ){ pSrc = pData; nSrc = nData; nData = 0; }else{ nPayload += nKey; pSrc = pKey; nSrc = nKey; } *pnSize = info.nSize; spaceLeft = info.nLocal; pPayload = &pCell[nHeader]; pPrior = &pCell[info.iOverflow]; while( nPayload>0 ){ if( spaceLeft==0 ){ rc = allocatePage(pBt, &pOvfl, &pgnoOvfl, pgnoOvfl); if( rc ){ releasePage(pToRelease); clearCell(pPage, pCell); |
︙ | ︙ | |||
3102 3103 3104 3105 3106 3107 3108 | int sz; pCell = apCell[j]; sz = szCell[j] + leafCorrection; if( !pNew->leaf ){ memcpy(&pNew->aData[6], pCell+2, 4); pTemp = 0; }else if( leafData ){ | < < | | | | 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 | int sz; pCell = apCell[j]; sz = szCell[j] + leafCorrection; if( !pNew->leaf ){ memcpy(&pNew->aData[6], pCell+2, 4); pTemp = 0; }else if( leafData ){ CellInfo info; j--; parseCell(pNew, apCell[j], &info); pCell = aInsBuf[i]; fillInCell(pParent, pCell, 0, info.nKey, 0, 0, &sz); pTemp = 0; }else{ pCell -= 4; pTemp = aInsBuf[i]; } insertCell(pParent, nxDiv, pCell, sz, pTemp); put4byte(&pParent->aCell[nxDiv][2], pNew->pgno); |
︙ | ︙ | |||
3573 3574 3575 3576 3577 3578 3579 | printf("PAGE %d: flags=0x%02x frag=%d parent=%d\n", pgno, data[hdr], data[hdr+5], (pPage->isInit && pPage->pParent) ? pPage->pParent->pgno : 0); i = 0; assert( hdr == (pgno==1 ? 100 : 0) ); idx = get2byte(&data[hdr+3]); while( idx>0 && idx<=pBt->pageSize ){ | < < | | > > > > < | | | | | | 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 | printf("PAGE %d: flags=0x%02x frag=%d parent=%d\n", pgno, data[hdr], data[hdr+5], (pPage->isInit && pPage->pParent) ? pPage->pParent->pgno : 0); i = 0; assert( hdr == (pgno==1 ? 100 : 0) ); idx = get2byte(&data[hdr+3]); while( idx>0 && idx<=pBt->pageSize ){ CellInfo info; Pgno child; unsigned char *pCell = &data[idx]; int sz; pCell = &data[idx]; parseCell(pPage, pCell, &info); sz = info.nSize; sprintf(range,"%d..%d", idx, idx+sz-1); if( pPage->leaf ){ child = 0; }else{ child = get4byte(&pCell[2]); } sz = info.nData; if( !pPage->intKey ) sz += info.nKey; if( sz>sizeof(payload)-1 ) sz = sizeof(payload)-1; memcpy(payload, &pCell[info.nHeader], sz); for(j=0; j<sz; j++){ if( payload[j]<0x20 || payload[j]>0x7f ) payload[j] = '.'; } payload[sz] = 0; printf( "cell %2d: i=%-10s chld=%-4d nk=%-4lld nd=%-4d payload=%s\n", i, range, child, info.nKey, info.nData, payload ); if( pPage->isInit && pPage->aCell[i]!=pCell ){ printf("**** aCell[%d] does not match on prior entry ****\n", i); } i++; idx = get2byte(pCell); } |
︙ | ︙ | |||
3787 3788 3789 3790 3791 3792 3793 | N -= n; } iPage = get4byte(pOvfl); sqlite3pager_unref(pOvfl); } } | < < < < < < < < < < < < < < < < < | 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 | N -= n; } iPage = get4byte(pOvfl); sqlite3pager_unref(pOvfl); } } /* ** Do various sanity checks on a single page of a tree. Return ** the tree depth. Root pages return 0. Parents of root pages ** return 1, and so forth. ** ** These checks are done: ** |
︙ | ︙ | |||
3836 3837 3838 3839 3840 3841 3842 | char *zUpperBound, /* All keys should be less than this, if not NULL */ int nUpper /* Number of characters in zUpperBound */ ){ MemPage *pPage; int i, rc, depth, d2, pgno, cnt; int hdr; u8 *data; | < < < > | < < | > > | | | | | | | 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 | char *zUpperBound, /* All keys should be less than this, if not NULL */ int nUpper /* Number of characters in zUpperBound */ ){ MemPage *pPage; int i, rc, depth, d2, pgno, cnt; int hdr; u8 *data; BtCursor cur; Btree *pBt; int maxLocal, pageSize; char zMsg[100]; char zContext[100]; char hit[MX_PAGE_SIZE]; /* Check that the page exists */ cur.pBt = pBt = pCheck->pBt; pageSize = pBt->pageSize; if( iPage==0 ) return 0; if( checkRef(pCheck, iPage, zParentContext) ) return 0; if( (rc = getPage(pBt, (Pgno)iPage, &pPage))!=0 ){ sprintf(zMsg, "unable to get the page. error code=%d", rc); checkAppendMsg(pCheck, zContext, zMsg); return 0; } maxLocal = pPage->leafData ? pBt->maxLeaf : pBt->maxLocal; if( (rc = initPage(pPage, pParent))!=0 ){ sprintf(zMsg, "initPage() returns error code %d", rc); checkAppendMsg(pCheck, zContext, zMsg); releasePage(pPage); return 0; } /* Check out all the cells. */ depth = 0; cur.pPage = pPage; for(i=0; i<pPage->nCell; i++){ u8 *pCell; int sz; CellInfo info; /* Check payload overflow pages */ sprintf(zContext, "On tree page %d cell %d: ", iPage, i); pCell = pPage->aCell[i]; parseCell(pPage, pCell, &info); sz = info.nData; if( !pPage->intKey ) sz += info.nKey; if( sz>info.nLocal ){ int nPage = (sz - info.nLocal + pageSize - 5)/(pageSize - 4); checkList(pCheck, 0, get4byte(&pCell[info.iOverflow]),nPage,zContext); } /* Check sanity of left child page. */ if( !pPage->leaf ){ pgno = get4byte(&pCell[2]); d2 = checkTreePage(pCheck,pgno,pPage,zContext,0,0,0,0); |
︙ | ︙ |