SQLite

Check-in [6c304024bb]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Parameterize the number of adjacent pages that participate in the balancing algorithm in the BTree. But leave the setting at the current value of 3. (CVS 812)
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 6c304024bbd21a1886a57ada795531349aa627d7
User & Date: drh 2003-01-04 19:44:08.000
Context
2003-01-05
21:41
More optimizations. (CVS 813) (check-in: 5809132f5b user: drh tags: trunk)
2003-01-04
19:44
Parameterize the number of adjacent pages that participate in the balancing algorithm in the BTree. But leave the setting at the current value of 3. (CVS 812) (check-in: 6c304024bb user: drh tags: trunk)
18:53
Another optimization to the btree logic. (CVS 811) (check-in: 03d2067361 user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to src/btree.c.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/*
** 2001 September 15
**
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** $Id: btree.c,v 1.78 2003/01/04 18:53:28 drh Exp $
**
** This file implements a external (disk-based) database using BTrees.
** For a detailed discussion of BTrees, refer to
**
**     Donald E. Knuth, THE ART OF COMPUTER PROGRAMMING, Volume 3:
**     "Sorting And Searching", pages 473-480. Addison-Wesley
**     Publishing Company, Reading, Massachusetts.











|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/*
** 2001 September 15
**
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** $Id: btree.c,v 1.79 2003/01/04 19:44:08 drh Exp $
**
** This file implements a external (disk-based) database using BTrees.
** For a detailed discussion of BTrees, refer to
**
**     Donald E. Knuth, THE ART OF COMPUTER PROGRAMMING, Volume 3:
**     "Sorting And Searching", pages 473-480. Addison-Wesley
**     Publishing Company, Reading, Massachusetts.
2057
2058
2059
2060
2061
2062
2063















2064
2065
2066
2067
2068
2069
2070
      *((uptr*)&pTo->apCell[i]) = x + to - from;
    }else{
      pTo->apCell[i] = pFrom->apCell[i];
    }
  }
}
















/*
** This routine redistributes Cells on pPage and up to two siblings
** of pPage so that all pages have about the same amount of free space.
** Usually one sibling on either side of pPage is used in the balancing,
** though both siblings might come from one side if pPage is the first
** or last child of its parent.  If pPage has fewer than two siblings
** (something which can only happen if pPage is the root page or a 







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







2057
2058
2059
2060
2061
2062
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
2078
2079
2080
2081
2082
2083
2084
2085
      *((uptr*)&pTo->apCell[i]) = x + to - from;
    }else{
      pTo->apCell[i] = pFrom->apCell[i];
    }
  }
}

/*
** The following parameters determine how many adjacent pages get involved
** in a balancing operation.  NN is the number of neighbors on either side
** of the page that participate in the balancing operation.  NB is the
** total number of pages that participate, including the target page and
** NN neighbors on either side.
**
** The minimum value of NN is 1 (of course).  Increasing NN above 1
** (to 2 or 3) gives a modest improvement in SELECT and DELETE performance
** in exchange for a larger degradation in INSERT and UPDATE performance.
** The value of NN appears to give the best results overall.
*/
#define NN 1             /* Number of neighbors on either side of pPage */
#define NB (NN*2+1)      /* Total pages involved in the balance */

/*
** This routine redistributes Cells on pPage and up to two siblings
** of pPage so that all pages have about the same amount of free space.
** Usually one sibling on either side of pPage is used in the balancing,
** though both siblings might come from one side if pPage is the first
** or last child of its parent.  If pPage has fewer than two siblings
** (something which can only happen if pPage is the root page or a 
2099
2100
2101
2102
2103
2104
2105
2106
2107
2108
2109
2110
2111
2112
2113
2114
2115
2116
2117
2118
2119
2120
2121
2122
2123
2124
2125


2126


2127
2128


2129


2130
2131
2132
2133
2134
2135
2136
**
** If this routine fails for any reason, it might leave the database
** in a corrupted state.  So if this routine fails, the database should
** be rolled back.
*/
static int balance(Btree *pBt, MemPage *pPage, BtCursor *pCur){
  MemPage *pParent;            /* The parent of pPage */
  MemPage *apOld[3];           /* pPage and up to two siblings */
  Pgno pgnoOld[3];             /* Page numbers for each page in apOld[] */
  MemPage *apNew[4];           /* pPage and up to 3 siblings after balancing */
  Pgno pgnoNew[4];             /* Page numbers for each page in apNew[] */
  int idxDiv[3];               /* Indices of divider cells in pParent */
  Cell *apDiv[3];              /* Divider cells in pParent */
  int nCell;                   /* Number of cells in apCell[] */
  int nOld;                    /* Number of pages in apOld[] */
  int nNew;                    /* Number of pages in apNew[] */
  int nDiv;                    /* Number of cells in apDiv[] */
  int i, j, k;                 /* Loop counters */
  int idx;                     /* Index of pPage in pParent->apCell[] */
  int nxDiv;                   /* Next divider slot in pParent->apCell[] */
  int rc;                      /* The return code */
  int iCur;                    /* apCell[iCur] is the cell of the cursor */
  MemPage *pOldCurPage;        /* The cursor originally points to this page */
  int subtotal;                /* Subtotal of bytes in cells on one page */
  int cntNew[4];               /* Index in apCell[] of cell after i-th page */
  int szNew[4];                /* Combined size of cells place on i-th page */
  MemPage *extraUnref = 0;     /* A page that needs to be unref-ed */


  Cell *apCell[MX_CELL*3+5];   /* All cells from pages being balanceed */


  int szCell[MX_CELL*3+5];     /* Local size of all cells */
  Cell aTemp[2];               /* Temporary holding area for apDiv[] */


  MemPage aOld[3];             /* Temporary copies of pPage and its siblings */



  /* 
  ** Return without doing any work if pPage is neither overfull nor
  ** underfull.
  */
  assert( sqlitepager_iswriteable(pPage) );
  if( !pPage->isOverfull && pPage->nFree<SQLITE_PAGE_SIZE/2 







<
<
<
<
<
<











<
<

>
>
|
>
>
|
|
>
>
|
>
>







2114
2115
2116
2117
2118
2119
2120






2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131


2132
2133
2134
2135
2136
2137
2138
2139
2140
2141
2142
2143
2144
2145
2146
2147
2148
2149
2150
2151
**
** If this routine fails for any reason, it might leave the database
** in a corrupted state.  So if this routine fails, the database should
** be rolled back.
*/
static int balance(Btree *pBt, MemPage *pPage, BtCursor *pCur){
  MemPage *pParent;            /* The parent of pPage */






  int nCell;                   /* Number of cells in apCell[] */
  int nOld;                    /* Number of pages in apOld[] */
  int nNew;                    /* Number of pages in apNew[] */
  int nDiv;                    /* Number of cells in apDiv[] */
  int i, j, k;                 /* Loop counters */
  int idx;                     /* Index of pPage in pParent->apCell[] */
  int nxDiv;                   /* Next divider slot in pParent->apCell[] */
  int rc;                      /* The return code */
  int iCur;                    /* apCell[iCur] is the cell of the cursor */
  MemPage *pOldCurPage;        /* The cursor originally points to this page */
  int subtotal;                /* Subtotal of bytes in cells on one page */


  MemPage *extraUnref = 0;     /* A page that needs to be unref-ed */
  MemPage *apOld[NB];          /* pPage and up to two siblings */
  Pgno pgnoOld[NB];            /* Page numbers for each page in apOld[] */
  MemPage *apNew[NB+1];        /* pPage and up to NB siblings after balancing */
  Pgno pgnoNew[NB+1];          /* Page numbers for each page in apNew[] */
  int idxDiv[NB];              /* Indices of divider cells in pParent */
  Cell *apDiv[NB];             /* Divider cells in pParent */
  Cell aTemp[NB];              /* Temporary holding area for apDiv[] */
  int cntNew[NB+1];            /* Index in apCell[] of cell after i-th page */
  int szNew[NB+1];             /* Combined size of cells place on i-th page */
  MemPage aOld[NB];            /* Temporary copies of pPage and its siblings */
  Cell *apCell[(MX_CELL+2)*NB]; /* All cells from pages being balanced */
  int szCell[(MX_CELL+2)*NB];  /* Local size of all cells */

  /* 
  ** Return without doing any work if pPage is neither overfull nor
  ** underfull.
  */
  assert( sqlitepager_iswriteable(pPage) );
  if( !pPage->isOverfull && pPage->nFree<SQLITE_PAGE_SIZE/2 
2239
2240
2241
2242
2243
2244
2245
2246
2247
2248
2249
2250
2251
2252
2253
2254
2255
2256


2257
2258
2259
2260
2261
2262
2263
2264
2265
  ** directly to balance_cleanup at any moment.
  */
  nOld = nNew = 0;
  sqlitepager_ref(pParent);

  /*
  ** Find sibling pages to pPage and the Cells in pParent that divide
  ** the siblings.  An attempt is made to find one sibling on either
  ** side of pPage.  Both siblings are taken from one side, however, if
  ** pPage is either the first or last child of its parent.  If pParent
  ** has 3 or fewer children then all children of pParent are taken.
  */
  if( idx==pParent->nCell ){
    nxDiv = idx - 2;
  }else{
    nxDiv = idx - 1;
  }
  if( nxDiv<0 ) nxDiv = 0;


  nDiv = 0;
  for(i=0, k=nxDiv; i<3; i++, k++){
    if( k<pParent->nCell ){
      idxDiv[i] = k;
      apDiv[i] = pParent->apCell[k];
      nDiv++;
      pgnoOld[i] = SWAB32(pBt, apDiv[i]->h.leftChild);
    }else if( k==pParent->nCell ){
      pgnoOld[i] = SWAB32(pBt, pParent->u.hdr.rightChild);







|
|
|
|

<
|
|
|

|
>
>

|







2254
2255
2256
2257
2258
2259
2260
2261
2262
2263
2264
2265

2266
2267
2268
2269
2270
2271
2272
2273
2274
2275
2276
2277
2278
2279
2280
2281
  ** directly to balance_cleanup at any moment.
  */
  nOld = nNew = 0;
  sqlitepager_ref(pParent);

  /*
  ** Find sibling pages to pPage and the Cells in pParent that divide
  ** the siblings.  An attempt is made to find NN siblings on either
  ** side of pPage.  More siblings are taken from one side, however, if
  ** pPage there are fewer than NN siblings on the other side.  If pParent
  ** has NB or fewer children then all children of pParent are taken.
  */

  nxDiv = idx - NN;
  if( nxDiv + NB > pParent->nCell ){
    nxDiv = pParent->nCell - NB + 1;
  }
  if( nxDiv<0 ){
    nxDiv = 0;
  }
  nDiv = 0;
  for(i=0, k=nxDiv; i<NB; i++, k++){
    if( k<pParent->nCell ){
      idxDiv[i] = k;
      apDiv[i] = pParent->apCell[k];
      nDiv++;
      pgnoOld[i] = SWAB32(pBt, apDiv[i]->h.leftChild);
    }else if( k==pParent->nCell ){
      pgnoOld[i] = SWAB32(pBt, pParent->u.hdr.rightChild);
2396
2397
2398
2399
2400
2401
2402

2403
2404
2405
2406
2407
2408
2409
2410
2411
2412
2413
  ** Put the new pages in accending order.  This helps to
  ** keep entries in the disk file in order so that a scan
  ** of the table is a linear scan through the file.  That
  ** in turn helps the operating system to deliver pages
  ** from the disk more rapidly.
  **
  ** An O(n^2) insertion sort algorithm is used, but since

  ** n is never more than 3, that should not be a problem.
  **
  ** This one optimization makes the database about 25%
  ** faster for large insertions and deletions.
  */
  for(i=0; i<k-1; i++){
    int minV = pgnoNew[i];
    int minI = i;
    for(j=i+1; j<k; j++){
      if( pgnoNew[j]<minV ){
        minI = j;







>
|

|
|







2412
2413
2414
2415
2416
2417
2418
2419
2420
2421
2422
2423
2424
2425
2426
2427
2428
2429
2430
  ** Put the new pages in accending order.  This helps to
  ** keep entries in the disk file in order so that a scan
  ** of the table is a linear scan through the file.  That
  ** in turn helps the operating system to deliver pages
  ** from the disk more rapidly.
  **
  ** An O(n^2) insertion sort algorithm is used, but since
  ** n is never more than NB (a small constant), that should
  ** not be a problem.
  **
  ** When NB==3, this one optimization makes the database
  ** about 25% faster for large insertions and deletions.
  */
  for(i=0; i<k-1; i++){
    int minV = pgnoNew[i];
    int minI = i;
    for(j=i+1; j<k; j++){
      if( pgnoNew[j]<minV ){
        minI = j;