SQLite

Check-in [31e480f68d]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Improved comments an presentation for the recent IN operator decision improvement.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 31e480f68dfd887cfd9114f9f9fec53d751d3ecc27e36f55c0166b51b2fbb08c
User & Date: drh 2018-06-09 14:13:46.495
Context
2018-06-09
16:49
Slightly smaller and faster code by encapsulating wal-index hash table location information in a separate WalHashLoc object rather than passing around the various elements as separate variables. (check-in: 538a365b7a user: drh tags: trunk)
14:13
Improved comments an presentation for the recent IN operator decision improvement. (check-in: 31e480f68d user: drh tags: trunk)
02:49
Performance improvement to sqlite3WhereExprUsage(). (check-in: fd09341390 user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to src/where.c.
2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
2457
2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470
2471
2472
2473
2474

2475
2476
2477
2478
2479
2480
2481
        || (pNew->wsFlags & WHERE_COLUMN_NULL)!=0 
        || (pNew->wsFlags & WHERE_COLUMN_IN)!=0 
        || (pNew->wsFlags & WHERE_SKIPSCAN)!=0 
    );

    if( eOp & WO_IN ){
      Expr *pExpr = pTerm->pExpr;
      LogEst M, logK;
      if( ExprHasProperty(pExpr, EP_xIsSelect) ){
        /* "x IN (SELECT ...)":  TUNING: the SELECT returns 25 rows */
        int i;
        nIn = 46;  assert( 46==sqlite3LogEst(25) );

        /* The expression may actually be of the form (x, y) IN (SELECT...).
        ** In this case there is a separate term for each of (x) and (y).
        ** However, the nIn multiplier should only be applied once, not once
        ** for each such term. The following loop checks that pTerm is the
        ** first such term in use, and sets nIn back to 0 if it is not. */
        for(i=0; i<pNew->nLTerm-1; i++){
          if( pNew->aLTerm[i] && pNew->aLTerm[i]->pExpr==pExpr ) nIn = 0;
        }
      }else if( ALWAYS(pExpr->x.pList && pExpr->x.pList->nExpr) ){
        /* "x IN (value, value, ...)" */
        nIn = sqlite3LogEst(pExpr->x.pList->nExpr);
        assert( nIn>0 );  /* RHS always has 2 or more terms...  The parser
                          ** changes "x IN (?)" into "x=?". */
      }
      if( pProbe->hasStat1 ){

        /* Let:
        **   N = the total number of rows in the table
        **   K = the number of entries on the RHS of the IN operator
        **   M = the number of rows in the table that match terms to the 
        **       to the left in the same index.  If the IN operator is on
        **       the left-most index column, M==N.
        **







<




















>







2447
2448
2449
2450
2451
2452
2453

2454
2455
2456
2457
2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470
2471
2472
2473
2474
2475
2476
2477
2478
2479
2480
2481
        || (pNew->wsFlags & WHERE_COLUMN_NULL)!=0 
        || (pNew->wsFlags & WHERE_COLUMN_IN)!=0 
        || (pNew->wsFlags & WHERE_SKIPSCAN)!=0 
    );

    if( eOp & WO_IN ){
      Expr *pExpr = pTerm->pExpr;

      if( ExprHasProperty(pExpr, EP_xIsSelect) ){
        /* "x IN (SELECT ...)":  TUNING: the SELECT returns 25 rows */
        int i;
        nIn = 46;  assert( 46==sqlite3LogEst(25) );

        /* The expression may actually be of the form (x, y) IN (SELECT...).
        ** In this case there is a separate term for each of (x) and (y).
        ** However, the nIn multiplier should only be applied once, not once
        ** for each such term. The following loop checks that pTerm is the
        ** first such term in use, and sets nIn back to 0 if it is not. */
        for(i=0; i<pNew->nLTerm-1; i++){
          if( pNew->aLTerm[i] && pNew->aLTerm[i]->pExpr==pExpr ) nIn = 0;
        }
      }else if( ALWAYS(pExpr->x.pList && pExpr->x.pList->nExpr) ){
        /* "x IN (value, value, ...)" */
        nIn = sqlite3LogEst(pExpr->x.pList->nExpr);
        assert( nIn>0 );  /* RHS always has 2 or more terms...  The parser
                          ** changes "x IN (?)" into "x=?". */
      }
      if( pProbe->hasStat1 ){
        LogEst M, logK, safetyMargin;
        /* Let:
        **   N = the total number of rows in the table
        **   K = the number of entries on the RHS of the IN operator
        **   M = the number of rows in the table that match terms to the 
        **       to the left in the same index.  If the IN operator is on
        **       the left-most index column, M==N.
        **
2489
2490
2491
2492
2493
2494
2495

2496
2497
2498
2499
2500
2501
2502
2503
        ** a safety margin of 2 (LogEst: 10) that favors using the IN operator
        ** with the index, as using an index has better worst-case behavior.
        ** If we do not have real sqlite_stat1 data, always prefer to use
        ** the index.
        */
        M = pProbe->aiRowLogEst[saved_nEq];
        logK = estLog(nIn);

        if( M + logK + 10 < nIn + rLogSize ){
          WHERETRACE(0x40,
            ("Scan preferred over IN operator on column %d of \"%s\" (%d<%d)\n",
             saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
          continue;
        }else{
          WHERETRACE(0x40,
            ("IN operator preferred on column %d of \"%s\" (%d>=%d)\n",







>
|







2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
        ** a safety margin of 2 (LogEst: 10) that favors using the IN operator
        ** with the index, as using an index has better worst-case behavior.
        ** If we do not have real sqlite_stat1 data, always prefer to use
        ** the index.
        */
        M = pProbe->aiRowLogEst[saved_nEq];
        logK = estLog(nIn);
        safetyMargin = 10;  /* TUNING: extra weight for indexed IN */
        if( M + logK + safetyMargin < nIn + rLogSize ){
          WHERETRACE(0x40,
            ("Scan preferred over IN operator on column %d of \"%s\" (%d<%d)\n",
             saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
          continue;
        }else{
          WHERETRACE(0x40,
            ("IN operator preferred on column %d of \"%s\" (%d>=%d)\n",