/ Check-in [c86b7c06]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Improved comments and speed tweaks to btree.c. (CVS 1497)
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: c86b7c065a798cd108189b96e87d100115862ff2
User & Date: drh 2004-05-30 19:19:05
Context
2004-05-30
20:46
Various speed enhancements. (CVS 1498) check-in: a0db15bb user: drh tags: trunk
19:19
Improved comments and speed tweaks to btree.c. (CVS 1497) check-in: c86b7c06 user: drh tags: trunk
02:14
Additional minor speed improvements. (CVS 1496) check-in: a90264c0 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to src/btree.c.

5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
..
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
..
93
94
95
96
97
98
99
100
101















102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
...
251
252
253
254
255
256
257



258
259
260
261
262
263
264
...
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
...
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
...
361
362
363
364
365
366
367
368
369


370
371
372
373
374
375
376
377
...
400
401
402
403
404
405
406
407
408
409
410
411
412
413

414
415
416
417
418

419
420
421
422
423
424
425
426
427
428
429

430
431
432
433
434
435
436
437
438
439
440

441















442
443
444
445
446
447
448
449
...
730
731
732
733
734
735
736

























737
738
739
740
741
742
743
...
748
749
750
751
752
753
754
755
756
757
758
759



760
761
762
763
764
765
766
...
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
...
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
....
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
....
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
....
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
....
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
....
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
....
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
....
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
....
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
....
2105
2106
2107
2108
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
....
2172
2173
2174
2175
2176
2177
2178
2179
2180
2181
2182
2183
2184
2185
2186
....
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** $Id: btree.c,v 1.150 2004/05/30 02:14:18 drh Exp $
**
** This file implements a external (disk-based) database using BTrees.
** For a detailed discussion of BTrees, refer to
**
**     Donald E. Knuth, THE ART OF COMPUTER PROGRAMMING, Volume 3:
**     "Sorting And Searching", pages 473-480. Addison-Wesley
**     Publishing Company, Reading, Massachusetts.
................................................................................
** The file is divided into pages.  The first page is called page 1,
** the second is page 2, and so forth.  A page number of zero indicates
** "no such page".  The page size can be anything between 512 and 65536.
** Each page can be either a btree page, a freelist page or an overflow
** page.
**
** The first page is always a btree page.  The first 100 bytes of the first
** page contain a special header that describes the file.  The format
** of that header is as follows:
**
**   OFFSET   SIZE    DESCRIPTION
**      0      16     Header string: "SQLite format 3\000"
**     16       2     Page size in bytes.  
**     18       1     File format write version
**     19       1     File format read version
**     20       1     Bytes of unused space at the end of each page
................................................................................
** The min leaf payload fraction is like the min embedded payload fraction
** except that it applies to leaf nodes in a LEAFDATA tree.  The maximum
** payload fraction for a LEAFDATA tree is always 100% (or 255) and it
** not specified in the header.
**
** Each btree pages is divided into three sections:  The header, the
** cell pointer array, and the cell area area.  Page 1 also has a 100-byte
** file header that occurs before the page header.   The 100-byte file
** header occurs on page 1 only.















**
** The page headers looks like this:
**
**   OFFSET   SIZE     DESCRIPTION
**      0       1      Flags. 1: intkey, 2: zerodata, 4: leafdata, 8: leaf
**      1       2      byte offset to the first freeblock
**      3       2      number of cells on this page
**      5       2      first byte past the cell array area
**      7       1      number of fragmented free bytes
**      8       4      Right child (the Ptr(N+1) value).  Omitted if leaf
**
** The flags define the format of this btree page.  The leaf flag means that
** this page has no children.  The zerodata flag means that this page carries
** only keys and no data.  The intkey flag means that the key is a single
** variable length integer at the beginning of the payload.
**
** The cell pointer array begins on the first byte after the page header.
................................................................................
  u8 nOverflow;        /* Number of overflow cell bodies in aCell[] */
  u8 intKey;           /* True if intkey flag is set */
  u8 leaf;             /* True if leaf flag is set */
  u8 zeroData;         /* True if table stores keys only */
  u8 leafData;         /* True if tables stores data on leaves only */
  u8 hasData;          /* True if this page stores data */
  u8 hdrOffset;        /* 100 for page 1.  0 otherwise */



  u16 cellOffset;      /* Index in aData of first cell pointer */
  u16 idxParent;       /* Index in parent of this node */
  u16 nFree;           /* Number of free bytes on the page */
  u16 nCell;           /* Number of cells on this page, local and ovfl */
  struct _OvflCell {   /* Cells that will not fit on aData[] */
    u8 *pCell;           /* Pointers to the body of the overflow cell */
    u16 idx;             /* Insert this cell before idx-th non-overflow cell */
................................................................................
  int maxLeaf;          /* Maximum local payload in a LEAFDATA table */
  int minLeaf;          /* Minimum local payload in a LEAFDATA table */
};
typedef Btree Bt;

/*
** An instance of the following structure is used to hold information
** about a cell.  The parseCell() function fills in this structure
** based on information extract from the raw disk page.
*/
typedef struct CellInfo CellInfo;
struct CellInfo {
  u8 *pCell;     /* Pointer to the start of cell content */
  i64 nKey;      /* The key for INTKEY tables, or number of bytes in key */
  u32 nData;     /* Number of bytes of data */
  u16 nHeader;   /* Size of the cell header in bytes */
  u16 nLocal;    /* Amount of payload held locally */
  u16 iOverflow; /* Offset to overflow page number.  Zero if no overflow */
  u16 nSize;     /* Total size of the cell content (on the main b-tree page) */
};

/*
** A cursor is a pointer to a particular entry in the BTree.
** The entry is identified by its MemPage and the index in
** MemPage.aCell[] of the entry.
*/
................................................................................
  BtCursor *pShared;        /* Loop of cursors with the same root page */
  int (*xCompare)(void*,int,const void*,int,const void*); /* Key comp func */
  void *pArg;               /* First arg to xCompare() */
  Pgno pgnoRoot;            /* The root page of this tree */
  MemPage *pPage;           /* Page that contains the entry */
  int idx;                  /* Index of the entry in pPage->aCell[] */
  CellInfo info;            /* A parse of the cell we are pointing at */
  u8 infoValid;             /* True if information in BtCursor.info is valid */
  u8 wrFlag;                /* True if writable */
  u8 isValid;               /* TRUE if points to a valid entry */
  u8 status;                /* Set to SQLITE_ABORT if cursors is invalidated */
};

/*
** Read or write a two- and four-byte big-endian integer values.
................................................................................
** file.
*/
#define getVarint    sqlite3GetVarint
#define getVarint32  sqlite3GetVarint32
#define putVarint    sqlite3PutVarint

/*
** Return a pointer to the start of cell content for the given
** cell of a page.  This routine works only for pages that


** do not contain overflow cells.
*/
static u8 *findCell(MemPage *pPage, int iCell){
  u8 *data = pPage->aData;
  assert( iCell>=0 );
  assert( iCell<get2byte(&data[pPage->hdrOffset+3]) );
  return data + get2byte(&data[pPage->cellOffset+2*iCell]);
}
................................................................................
** body of the cell as its second argument.
*/
static void parseCellPtr(
  MemPage *pPage,         /* Page containing the cell */
  u8 *pCell,              /* Pointer to the cell text. */
  CellInfo *pInfo         /* Fill in this structure */
){
  int n;
  int nPayload;
  Btree *pBt;
  int minLocal, maxLocal;

  pInfo->pCell = pCell;
  assert( pPage->leaf==0 || pPage->leaf==1 );

  n = 4 - 4*pPage->leaf;
  if( pPage->hasData ){
    n += getVarint32(&pCell[n], &pInfo->nData);
  }else{
    pInfo->nData = 0;

  }
  n += getVarint(&pCell[n], &pInfo->nKey);
  pInfo->nHeader = n;
  nPayload = pInfo->nData;
  if( !pPage->intKey ){
    nPayload += pInfo->nKey;
  }
  pBt = pPage->pBt;
  if( pPage->leafData ){
    minLocal = pBt->minLeaf;
    maxLocal = pBt->maxLeaf;

  }else{
    minLocal = pBt->minLocal;
    maxLocal = pBt->maxLocal;
  }
  if( nPayload<=maxLocal ){
    pInfo->nLocal = nPayload;
    pInfo->iOverflow = 0;
    pInfo->nSize = nPayload + n;
    if( pInfo->nSize<4 ){
      pInfo->nSize = 4;  /* Minimum cell size is 4 */
    }

  }else{















    int surplus = minLocal + (nPayload - minLocal)%(pBt->usableSize - 4);
    if( surplus <= maxLocal ){
      pInfo->nLocal = surplus;
    }else{
      pInfo->nLocal = minLocal;
    }
    pInfo->iOverflow = pInfo->nLocal + n;
    pInfo->nSize = pInfo->iOverflow + 4;
................................................................................
    int top;
    pbegin = get2byte(&data[hdr+1]);
    memcpy(&data[hdr+1], &data[pbegin], 2);
    top = get2byte(&data[hdr+5]);
    put2byte(&data[hdr+5], top + get2byte(&data[pbegin+2]));
  }
}


























/*
** Initialize the auxiliary information for a disk block.
**
** The pParent parameter must be a pointer to the MemPage which
** is the parent of the page being initialized.  The root of a
** BTree has no parent and so for that page, pParent==NULL.
................................................................................
** guarantee that the page is well-formed.  It only shows that
** we failed to detect any corruption.
*/
static int initPage(
  MemPage *pPage,        /* The page to be initialized */
  MemPage *pParent       /* The parent.  Might be NULL */
){
  int c, pc, i, hdr;
  unsigned char *data;
  int usableSize, cellOffset;
  int nFree;
  int top;




  assert( pPage->pBt!=0 );
  assert( pParent==0 || pParent->pBt==pPage->pBt );
  assert( pPage->pgno==sqlite3pager_pagenumber(pPage->aData) );
  assert( pPage->aData == &((unsigned char*)pPage)[-pPage->pBt->pageSize] );
  assert( pPage->pParent==0 || pPage->pParent==pParent );
  assert( pPage->pParent==pParent || !pPage->isInit );
................................................................................
  if( pPage->isInit ) return SQLITE_OK;
  if( pPage->pParent==0 && pParent!=0 ){
    pPage->pParent = pParent;
    sqlite3pager_ref(pParent->aData);
  }
  hdr = pPage->hdrOffset;
  data = pPage->aData;
  c = data[hdr];
  assert( pPage->hdrOffset==(pPage->pgno==1 ? 100 : 0) );
  pPage->intKey = (c & (PTF_INTKEY|PTF_LEAFDATA))!=0;
  pPage->zeroData = (c & PTF_ZERODATA)!=0;
  pPage->leafData = (c & PTF_LEAFDATA)!=0;
  pPage->leaf = (c & PTF_LEAF)!=0;
  pPage->hasData = !(pPage->zeroData || (!pPage->leaf && pPage->leafData));
  pPage->nOverflow = 0;
  pPage->idxShift = 0;
  usableSize = pPage->pBt->usableSize;
  pPage->cellOffset = cellOffset = hdr + 12 - 4*pPage->leaf;
  top = get2byte(&data[hdr+5]);
  pPage->nCell = get2byte(&data[hdr+3]);

................................................................................
  memset(&data[hdr], 0, pBt->usableSize - hdr);
  data[hdr] = flags;
  first = hdr + 8 + 4*((flags&PTF_LEAF)==0);
  memset(&data[hdr+1], 0, 4);
  data[hdr+7] = 0;
  put2byte(&data[hdr+5], pBt->usableSize);
  pPage->nFree = pBt->usableSize - first;
  pPage->intKey = (flags & (PTF_INTKEY|PTF_LEAFDATA))!=0;
  pPage->zeroData = (flags & PTF_ZERODATA)!=0;
  pPage->leafData = (flags & PTF_LEAFDATA)!=0;
  pPage->leaf = (flags & PTF_LEAF)!=0;
  pPage->hasData = !(pPage->zeroData || (!pPage->leaf && pPage->leafData));
  pPage->hdrOffset = hdr;
  pPage->cellOffset = first;
  pPage->nOverflow = 0;
  pPage->idxShift = 0;
  pPage->nCell = 0;
  pPage->isInit = 1;
  pageIntegrity(pPage);
................................................................................
    goto create_cursor_exception;
  }
  pCur->xCompare = xCmp ? xCmp : dfltCompare;
  pCur->pArg = pArg;
  pCur->pBt = pBt;
  pCur->wrFlag = wrFlag;
  pCur->idx = 0;
  pCur->infoValid = 0;
  pCur->pNext = pBt->pCursor;
  if( pCur->pNext ){
    pCur->pNext->pPrev = pCur;
  }
  pCur->pPrev = 0;
  pRing = pBt->pCursor;
  while( pRing && pRing->pgnoRoot!=pCur->pgnoRoot ){ pRing = pRing->pNext; }
................................................................................
** Make sure the BtCursor.info field of the given cursor is valid.
** If it is not already valid, call parseCell() to fill it in.
**
** BtCursor.info is a cache of the information in the current cell.
** Using this cache reduces the number of calls to parseCell().
*/
static void getCellInfo(BtCursor *pCur){
  if( !pCur->infoValid ){
    parseCell(pCur->pPage, pCur->idx, &pCur->info);
    pCur->infoValid = 1;
  }else{
#ifndef NDEBUG
    CellInfo info;
    parseCell(pCur->pPage, pCur->idx, &info);
    assert( memcmp(&info, &pCur->info, sizeof(info))==0 );
#endif
  }
................................................................................
  pageIntegrity(pNewPage);
  pNewPage->idxParent = pCur->idx;
  pOldPage = pCur->pPage;
  pOldPage->idxShift = 0;
  releasePage(pOldPage);
  pCur->pPage = pNewPage;
  pCur->idx = 0;
  pCur->infoValid = 0;
  if( pNewPage->nCell<1 ){
    return SQLITE_CORRUPT;
  }
  return SQLITE_OK;
}

/*
................................................................................
  assert( pParent!=0 );
  pageIntegrity(pParent);
  idxParent = pPage->idxParent;
  sqlite3pager_ref(pParent->aData);
  oldPgno = pPage->pgno;
  releasePage(pPage);
  pCur->pPage = pParent;
  pCur->infoValid = 0;
  assert( pParent->idxShift==0 );
  pCur->idx = idxParent;
}

/*
** Move the cursor to the root page
*/
................................................................................
    pCur->isValid = 0;
    return rc;
  }
  releasePage(pCur->pPage);
  pageIntegrity(pRoot);
  pCur->pPage = pRoot;
  pCur->idx = 0;
  pCur->infoValid = 0;
  if( pRoot->nCell==0 && !pRoot->leaf ){
    Pgno subpage;
    assert( pRoot->pgno==1 );
    subpage = get4byte(&pRoot->aData[pRoot->hdrOffset+8]);
    assert( subpage>0 );
    pCur->isValid = 1;
    rc = moveToChild(pCur, subpage);
................................................................................
  while( !(pPage = pCur->pPage)->leaf ){
    pgno = get4byte(&pPage->aData[pPage->hdrOffset+8]);
    pCur->idx = pPage->nCell;
    rc = moveToChild(pCur, pgno);
    if( rc ) return rc;
  }
  pCur->idx = pPage->nCell - 1;
  pCur->infoValid = 0;
  return SQLITE_OK;
}

/* Move the cursor to the first entry in the table.  Return SQLITE_OK
** on success.  Set *pRes to 0 if the cursor actually points to something
** or set *pRes to 1 if the table is empty.
*/
................................................................................
    lwr = 0;
    upr = pPage->nCell-1;
    pageIntegrity(pPage);
    while( lwr<=upr ){
      const void *pCellKey;
      i64 nCellKey;
      pCur->idx = (lwr+upr)/2;
      pCur->infoValid = 0;
      sqlite3BtreeKeySize(pCur, &nCellKey);
      if( pPage->intKey ){
        if( nCellKey<nKey ){
          c = -1;
        }else if( nCellKey>nKey ){
          c = +1;
        }else{
................................................................................
    }
    if( chldPg==0 ){
      assert( pCur->idx>=0 && pCur->idx<pCur->pPage->nCell );
      if( pRes ) *pRes = c;
      return SQLITE_OK;
    }
    pCur->idx = lwr;
    pCur->infoValid = 0;
    rc = moveToChild(pCur, chldPg);
    if( rc ){
      return rc;
    }
  }
  /* NOT REACHED */
}
................................................................................
  if( pCur->isValid==0 ){
    *pRes = 1;
    return SQLITE_OK;
  }
  assert( pPage->isInit );
  assert( pCur->idx<pPage->nCell );
  pCur->idx++;
  pCur->infoValid = 0;
  if( pCur->idx>=pPage->nCell ){
    if( !pPage->leaf ){
      rc = moveToChild(pCur, get4byte(&pPage->aData[pPage->hdrOffset+8]));
      if( rc ) return rc;
      rc = moveToLeftmost(pCur);
      *pRes = 0;
      return rc;
................................................................................
        *pRes = 1;
        return SQLITE_OK;
      }
      moveToParent(pCur);
      pPage = pCur->pPage;
    }
    pCur->idx--;
    pCur->infoValid = 0;
    if( pPage->leafData ){
      rc = sqlite3BtreePrevious(pCur, pRes);
    }else{
      rc = SQLITE_OK;
    }
  }
  *pRes = 0;
................................................................................
    szOld = cellSizePtr(pPage, oldCell);
    rc = clearCell(pPage, oldCell);
    if( rc ) return rc;
    dropCell(pPage, pCur->idx, szOld);
  }else if( loc<0 && pPage->nCell>0 ){
    assert( pPage->leaf );
    pCur->idx++;
    pCur->infoValid = 0;
  }else{
    assert( pPage->leaf );
  }
  insertCell(pPage, pCur->idx, newCell, szNew, 0);
  rc = balance(pPage);
  /* sqlite3BtreePageDump(pCur->pBt, pCur->pgnoRoot, 1); */
  /* fflush(stdout); */







|







 







|
|







 







|
|
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







|

|







 







>
>
>







 







|







|


|







 







<







 







|
|
>
>
|







 







|
|
<
<



>
|

|

<
>



|



|
|
|
<
>
|
<
<
<
<


|
|
|

>

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|







 







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







 







|
|
|
|
|
>
>
>







 







|
<
<
<
<
<
<







 







|
<
<
<
<







 







|







 







|

<







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
..
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
..
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
...
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
...
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
...
343
344
345
346
347
348
349

350
351
352
353
354
355
356
...
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
...
419
420
421
422
423
424
425
426
427


428
429
430
431
432
433
434
435

436
437
438
439
440
441
442
443
444
445
446

447
448




449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
...
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
...
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
...
825
826
827
828
829
830
831
832






833
834
835
836
837
838
839
...
875
876
877
878
879
880
881
882




883
884
885
886
887
888
889
....
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
....
1549
1550
1551
1552
1553
1554
1555
1556
1557

1558
1559
1560
1561
1562
1563
1564
....
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
....
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
....
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
....
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
....
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
....
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125
2126
2127
2128
....
2152
2153
2154
2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
....
2219
2220
2221
2222
2223
2224
2225
2226
2227
2228
2229
2230
2231
2232
2233
....
3404
3405
3406
3407
3408
3409
3410
3411
3412
3413
3414
3415
3416
3417
3418
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** $Id: btree.c,v 1.151 2004/05/30 19:19:05 drh Exp $
**
** This file implements a external (disk-based) database using BTrees.
** For a detailed discussion of BTrees, refer to
**
**     Donald E. Knuth, THE ART OF COMPUTER PROGRAMMING, Volume 3:
**     "Sorting And Searching", pages 473-480. Addison-Wesley
**     Publishing Company, Reading, Massachusetts.
................................................................................
** The file is divided into pages.  The first page is called page 1,
** the second is page 2, and so forth.  A page number of zero indicates
** "no such page".  The page size can be anything between 512 and 65536.
** Each page can be either a btree page, a freelist page or an overflow
** page.
**
** The first page is always a btree page.  The first 100 bytes of the first
** page contain a special header (the "file header") that describes the file.
** The format of the file header is as follows:
**
**   OFFSET   SIZE    DESCRIPTION
**      0      16     Header string: "SQLite format 3\000"
**     16       2     Page size in bytes.  
**     18       1     File format write version
**     19       1     File format read version
**     20       1     Bytes of unused space at the end of each page
................................................................................
** The min leaf payload fraction is like the min embedded payload fraction
** except that it applies to leaf nodes in a LEAFDATA tree.  The maximum
** payload fraction for a LEAFDATA tree is always 100% (or 255) and it
** not specified in the header.
**
** Each btree pages is divided into three sections:  The header, the
** cell pointer array, and the cell area area.  Page 1 also has a 100-byte
** file header that occurs before the page header.
**
**      |----------------|
**      | file header    |   100 bytes.  Page 1 only.
**      |----------------|
**      | page header    |   8 bytes for leaves.  12 bytes for interior nodes
**      |----------------|
**      | cell pointer   |   |  2 bytes per cell.  Sorted order.
**      | array          |   |  Grows downward
**      |                |   v
**      |----------------|
**      | unallocated    |
**      | space          |
**      |----------------|   ^  Grows upwards
**      | cell content   |   |  Arbitrary order interspersed with freeblocks.
**      | area           |   |  and free space fragments.
**      |----------------|
**
** The page headers looks like this:
**
**   OFFSET   SIZE     DESCRIPTION
**      0       1      Flags. 1: intkey, 2: zerodata, 4: leafdata, 8: leaf
**      1       2      byte offset to the first freeblock
**      3       2      number of cells on this page
**      5       2      first byte of the cell content area
**      7       1      number of fragmented free bytes
**      8       4      Right child (the Ptr(N+1) value).  Omitted on leaves.
**
** The flags define the format of this btree page.  The leaf flag means that
** this page has no children.  The zerodata flag means that this page carries
** only keys and no data.  The intkey flag means that the key is a single
** variable length integer at the beginning of the payload.
**
** The cell pointer array begins on the first byte after the page header.
................................................................................
  u8 nOverflow;        /* Number of overflow cell bodies in aCell[] */
  u8 intKey;           /* True if intkey flag is set */
  u8 leaf;             /* True if leaf flag is set */
  u8 zeroData;         /* True if table stores keys only */
  u8 leafData;         /* True if tables stores data on leaves only */
  u8 hasData;          /* True if this page stores data */
  u8 hdrOffset;        /* 100 for page 1.  0 otherwise */
  u8 childPtrSize;     /* 0 if leaf==1.  4 if leaf==0 */
  u8 maxLocal;         /* Copy of Btree.maxLocal or Btree.maxLeaf */
  u8 minLocal;         /* Copy of Btree.minLocal or Btree.minLeaf */
  u16 cellOffset;      /* Index in aData of first cell pointer */
  u16 idxParent;       /* Index in parent of this node */
  u16 nFree;           /* Number of free bytes on the page */
  u16 nCell;           /* Number of cells on this page, local and ovfl */
  struct _OvflCell {   /* Cells that will not fit on aData[] */
    u8 *pCell;           /* Pointers to the body of the overflow cell */
    u16 idx;             /* Insert this cell before idx-th non-overflow cell */
................................................................................
  int maxLeaf;          /* Maximum local payload in a LEAFDATA table */
  int minLeaf;          /* Minimum local payload in a LEAFDATA table */
};
typedef Btree Bt;

/*
** An instance of the following structure is used to hold information
** about a cell.  The parseCellPtr() function fills in this structure
** based on information extract from the raw disk page.
*/
typedef struct CellInfo CellInfo;
struct CellInfo {
  u8 *pCell;     /* Pointer to the start of cell content */
  i64 nKey;      /* The key for INTKEY tables, or number of bytes in key */
  u32 nData;     /* Number of bytes of data */
  u16 nHeader;   /* Size of the cell content header in bytes */
  u16 nLocal;    /* Amount of payload held locally */
  u16 iOverflow; /* Offset to overflow page number.  Zero if no overflow */
  u16 nSize;     /* Size of the cell content on the main b-tree page */
};

/*
** A cursor is a pointer to a particular entry in the BTree.
** The entry is identified by its MemPage and the index in
** MemPage.aCell[] of the entry.
*/
................................................................................
  BtCursor *pShared;        /* Loop of cursors with the same root page */
  int (*xCompare)(void*,int,const void*,int,const void*); /* Key comp func */
  void *pArg;               /* First arg to xCompare() */
  Pgno pgnoRoot;            /* The root page of this tree */
  MemPage *pPage;           /* Page that contains the entry */
  int idx;                  /* Index of the entry in pPage->aCell[] */
  CellInfo info;            /* A parse of the cell we are pointing at */

  u8 wrFlag;                /* True if writable */
  u8 isValid;               /* TRUE if points to a valid entry */
  u8 status;                /* Set to SQLITE_ABORT if cursors is invalidated */
};

/*
** Read or write a two- and four-byte big-endian integer values.
................................................................................
** file.
*/
#define getVarint    sqlite3GetVarint
#define getVarint32  sqlite3GetVarint32
#define putVarint    sqlite3PutVarint

/*
** Given a btree page and a cell index (0 means the first cell on
** the page, 1 means the second cell, and so forth) return a pointer
** to the cell content.
**
** This routine works only for pages that do not contain overflow cells.
*/
static u8 *findCell(MemPage *pPage, int iCell){
  u8 *data = pPage->aData;
  assert( iCell>=0 );
  assert( iCell<get2byte(&data[pPage->hdrOffset+3]) );
  return data + get2byte(&data[pPage->cellOffset+2*iCell]);
}
................................................................................
** body of the cell as its second argument.
*/
static void parseCellPtr(
  MemPage *pPage,         /* Page containing the cell */
  u8 *pCell,              /* Pointer to the cell text. */
  CellInfo *pInfo         /* Fill in this structure */
){
  int n;                  /* Number bytes in cell content header */
  u32 nPayload;           /* Number of bytes of cell payload */



  pInfo->pCell = pCell;
  assert( pPage->leaf==0 || pPage->leaf==1 );
  n = pPage->childPtrSize;
  assert( n==4-4*pPage->leaf );
  if( pPage->hasData ){
    n += getVarint32(&pCell[n], &nPayload);
  }else{

    nPayload = 0;
  }
  n += getVarint(&pCell[n], &pInfo->nKey);
  pInfo->nHeader = n;
  pInfo->nData = nPayload;
  if( !pPage->intKey ){
    nPayload += pInfo->nKey;
  }
  if( nPayload<=pPage->maxLocal ){
    /* This is the (easy) common case where the entire payload fits
    ** on the local page.  No overflow is required.

    */
    int nSize;          /* Total size of cell content in bytes */




    pInfo->nLocal = nPayload;
    pInfo->iOverflow = 0;
    nSize = nPayload + n;
    if( nSize<4 ){
      nSize = 4;        /* Minimum cell size is 4 */
    }
    pInfo->nSize = nSize;
  }else{
    /* If the payload will not fit completely on the local page, we have
    ** to decide how much to store locally and how much to spill onto
    ** overflow pages.  The strategy is to minimize the amount of unused
    ** space on overflow pages while keeping the amount of local storage
    ** in between minLocal and maxLocal.
    **
    ** Warning:  changing the way overflow payload is distributed in any
    ** way will result in an incompatible file format.
    */
    int minLocal;  /* Minimum amount of payload held locally */
    int maxLocal;  /* Maximum amount of payload held locally */
    int surplus;   /* Overflow payload available for local storage */

    minLocal = pPage->minLocal;
    maxLocal = pPage->maxLocal;
    surplus = minLocal + (nPayload - minLocal)%(pPage->pBt->usableSize - 4);
    if( surplus <= maxLocal ){
      pInfo->nLocal = surplus;
    }else{
      pInfo->nLocal = minLocal;
    }
    pInfo->iOverflow = pInfo->nLocal + n;
    pInfo->nSize = pInfo->iOverflow + 4;
................................................................................
    int top;
    pbegin = get2byte(&data[hdr+1]);
    memcpy(&data[hdr+1], &data[pbegin], 2);
    top = get2byte(&data[hdr+5]);
    put2byte(&data[hdr+5], top + get2byte(&data[pbegin+2]));
  }
}

/*
** Decode the flags byte (the first byte of the header) for a page
** and initialize fields of the MemPage structure accordingly.
*/
static void decodeFlags(MemPage *pPage, int flagByte){
  Btree *pBt;     /* A copy of pPage->pBt */

  assert( pPage->hdrOffset==(pPage->pgno==1 ? 100 : 0) );
  pPage->intKey = (flagByte & (PTF_INTKEY|PTF_LEAFDATA))!=0;
  pPage->zeroData = (flagByte & PTF_ZERODATA)!=0;
  pPage->leaf = (flagByte & PTF_LEAF)!=0;
  pPage->childPtrSize = 4*(pPage->leaf==0);
  pBt = pPage->pBt;
  if( flagByte & PTF_LEAFDATA ){
    pPage->leafData = 1;
    pPage->maxLocal = pBt->maxLeaf;
    pPage->minLocal = pBt->minLeaf;
  }else{
    pPage->leafData = 0;
    pPage->maxLocal = pBt->maxLocal;
    pPage->minLocal = pBt->minLocal;
  }
  pPage->hasData = !(pPage->zeroData || (!pPage->leaf && pPage->leafData));
}

/*
** Initialize the auxiliary information for a disk block.
**
** The pParent parameter must be a pointer to the MemPage which
** is the parent of the page being initialized.  The root of a
** BTree has no parent and so for that page, pParent==NULL.
................................................................................
** guarantee that the page is well-formed.  It only shows that
** we failed to detect any corruption.
*/
static int initPage(
  MemPage *pPage,        /* The page to be initialized */
  MemPage *pParent       /* The parent.  Might be NULL */
){
  int pc;            /* Address of a freeblock within pPage->aData[] */
  int i;             /* Loop counter */
  int hdr;           /* Offset to beginning of page header */
  u8 *data;          /* Equal to pPage->aData */
  int usableSize;    /* Amount of usable space on each page */
  int cellOffset;    /* Offset from start of page to first cell pointer */
  int nFree;         /* Number of unused bytes on the page */
  int top;           /* First byte of the cell content area */

  assert( pPage->pBt!=0 );
  assert( pParent==0 || pParent->pBt==pPage->pBt );
  assert( pPage->pgno==sqlite3pager_pagenumber(pPage->aData) );
  assert( pPage->aData == &((unsigned char*)pPage)[-pPage->pBt->pageSize] );
  assert( pPage->pParent==0 || pPage->pParent==pParent );
  assert( pPage->pParent==pParent || !pPage->isInit );
................................................................................
  if( pPage->isInit ) return SQLITE_OK;
  if( pPage->pParent==0 && pParent!=0 ){
    pPage->pParent = pParent;
    sqlite3pager_ref(pParent->aData);
  }
  hdr = pPage->hdrOffset;
  data = pPage->aData;
  decodeFlags(pPage, data[hdr]);






  pPage->nOverflow = 0;
  pPage->idxShift = 0;
  usableSize = pPage->pBt->usableSize;
  pPage->cellOffset = cellOffset = hdr + 12 - 4*pPage->leaf;
  top = get2byte(&data[hdr+5]);
  pPage->nCell = get2byte(&data[hdr+3]);

................................................................................
  memset(&data[hdr], 0, pBt->usableSize - hdr);
  data[hdr] = flags;
  first = hdr + 8 + 4*((flags&PTF_LEAF)==0);
  memset(&data[hdr+1], 0, 4);
  data[hdr+7] = 0;
  put2byte(&data[hdr+5], pBt->usableSize);
  pPage->nFree = pBt->usableSize - first;
  decodeFlags(pPage, flags);




  pPage->hdrOffset = hdr;
  pPage->cellOffset = first;
  pPage->nOverflow = 0;
  pPage->idxShift = 0;
  pPage->nCell = 0;
  pPage->isInit = 1;
  pageIntegrity(pPage);
................................................................................
    goto create_cursor_exception;
  }
  pCur->xCompare = xCmp ? xCmp : dfltCompare;
  pCur->pArg = pArg;
  pCur->pBt = pBt;
  pCur->wrFlag = wrFlag;
  pCur->idx = 0;
  pCur->info.nSize = 0;
  pCur->pNext = pBt->pCursor;
  if( pCur->pNext ){
    pCur->pNext->pPrev = pCur;
  }
  pCur->pPrev = 0;
  pRing = pBt->pCursor;
  while( pRing && pRing->pgnoRoot!=pCur->pgnoRoot ){ pRing = pRing->pNext; }
................................................................................
** Make sure the BtCursor.info field of the given cursor is valid.
** If it is not already valid, call parseCell() to fill it in.
**
** BtCursor.info is a cache of the information in the current cell.
** Using this cache reduces the number of calls to parseCell().
*/
static void getCellInfo(BtCursor *pCur){
  if( pCur->info.nSize==0 ){
    parseCell(pCur->pPage, pCur->idx, &pCur->info);

  }else{
#ifndef NDEBUG
    CellInfo info;
    parseCell(pCur->pPage, pCur->idx, &info);
    assert( memcmp(&info, &pCur->info, sizeof(info))==0 );
#endif
  }
................................................................................
  pageIntegrity(pNewPage);
  pNewPage->idxParent = pCur->idx;
  pOldPage = pCur->pPage;
  pOldPage->idxShift = 0;
  releasePage(pOldPage);
  pCur->pPage = pNewPage;
  pCur->idx = 0;
  pCur->info.nSize = 0;
  if( pNewPage->nCell<1 ){
    return SQLITE_CORRUPT;
  }
  return SQLITE_OK;
}

/*
................................................................................
  assert( pParent!=0 );
  pageIntegrity(pParent);
  idxParent = pPage->idxParent;
  sqlite3pager_ref(pParent->aData);
  oldPgno = pPage->pgno;
  releasePage(pPage);
  pCur->pPage = pParent;
  pCur->info.nSize = 0;
  assert( pParent->idxShift==0 );
  pCur->idx = idxParent;
}

/*
** Move the cursor to the root page
*/
................................................................................
    pCur->isValid = 0;
    return rc;
  }
  releasePage(pCur->pPage);
  pageIntegrity(pRoot);
  pCur->pPage = pRoot;
  pCur->idx = 0;
  pCur->info.nSize = 0;
  if( pRoot->nCell==0 && !pRoot->leaf ){
    Pgno subpage;
    assert( pRoot->pgno==1 );
    subpage = get4byte(&pRoot->aData[pRoot->hdrOffset+8]);
    assert( subpage>0 );
    pCur->isValid = 1;
    rc = moveToChild(pCur, subpage);
................................................................................
  while( !(pPage = pCur->pPage)->leaf ){
    pgno = get4byte(&pPage->aData[pPage->hdrOffset+8]);
    pCur->idx = pPage->nCell;
    rc = moveToChild(pCur, pgno);
    if( rc ) return rc;
  }
  pCur->idx = pPage->nCell - 1;
  pCur->info.nSize = 0;
  return SQLITE_OK;
}

/* Move the cursor to the first entry in the table.  Return SQLITE_OK
** on success.  Set *pRes to 0 if the cursor actually points to something
** or set *pRes to 1 if the table is empty.
*/
................................................................................
    lwr = 0;
    upr = pPage->nCell-1;
    pageIntegrity(pPage);
    while( lwr<=upr ){
      const void *pCellKey;
      i64 nCellKey;
      pCur->idx = (lwr+upr)/2;
      pCur->info.nSize = 0;
      sqlite3BtreeKeySize(pCur, &nCellKey);
      if( pPage->intKey ){
        if( nCellKey<nKey ){
          c = -1;
        }else if( nCellKey>nKey ){
          c = +1;
        }else{
................................................................................
    }
    if( chldPg==0 ){
      assert( pCur->idx>=0 && pCur->idx<pCur->pPage->nCell );
      if( pRes ) *pRes = c;
      return SQLITE_OK;
    }
    pCur->idx = lwr;
    pCur->info.nSize = 0;
    rc = moveToChild(pCur, chldPg);
    if( rc ){
      return rc;
    }
  }
  /* NOT REACHED */
}
................................................................................
  if( pCur->isValid==0 ){
    *pRes = 1;
    return SQLITE_OK;
  }
  assert( pPage->isInit );
  assert( pCur->idx<pPage->nCell );
  pCur->idx++;
  pCur->info.nSize = 0;
  if( pCur->idx>=pPage->nCell ){
    if( !pPage->leaf ){
      rc = moveToChild(pCur, get4byte(&pPage->aData[pPage->hdrOffset+8]));
      if( rc ) return rc;
      rc = moveToLeftmost(pCur);
      *pRes = 0;
      return rc;
................................................................................
        *pRes = 1;
        return SQLITE_OK;
      }
      moveToParent(pCur);
      pPage = pCur->pPage;
    }
    pCur->idx--;
    pCur->info.nSize = 0;
    if( pPage->leafData ){
      rc = sqlite3BtreePrevious(pCur, pRes);
    }else{
      rc = SQLITE_OK;
    }
  }
  *pRes = 0;
................................................................................
    szOld = cellSizePtr(pPage, oldCell);
    rc = clearCell(pPage, oldCell);
    if( rc ) return rc;
    dropCell(pPage, pCur->idx, szOld);
  }else if( loc<0 && pPage->nCell>0 ){
    assert( pPage->leaf );
    pCur->idx++;
    pCur->info.nSize = 0;
  }else{
    assert( pPage->leaf );
  }
  insertCell(pPage, pCur->idx, newCell, szNew, 0);
  rc = balance(pPage);
  /* sqlite3BtreePageDump(pCur->pBt, pCur->pgnoRoot, 1); */
  /* fflush(stdout); */