/ Check-in [b442525b]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Change the cost estimator in the query planner to take into account the logN rowid lookup cost when going from an index to a table.
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | stat2-enhancement
Files: files | file ages | folders
SHA1: b442525b0ba642bb8d57b87b7b9e373b6046454a
User & Date: drh 2011-01-24 15:11:23
Context
2011-01-24
17:46
Restructuring and generalizing analyze5.test. The whole script is currently disabled and will need to be reenabled prior to merging with trunk. check-in: 31fcc706 user: drh tags: stat2-enhancement
15:11
Change the cost estimator in the query planner to take into account the logN rowid lookup cost when going from an index to a table. check-in: b442525b user: drh tags: stat2-enhancement
2011-01-22
00:10
Add the ability to use indices for constraints of the form "x IS NOT NULL" when sqlite_stat2 is available and most entries for column x are NULL. check-in: 5d5bddd2 user: drh tags: stat2-enhancement
Changes
Hide Diffs Side-by-Side Diffs Ignore Whitespace Patch

Changes to src/where.c.

  2915   2915           whereEqualScanEst(pParse, pProbe, pFirstTerm->pExpr->pRight, &nRow);
  2916   2916         }else if( pFirstTerm->eOperator==WO_IN && bInEst==0 ){
  2917   2917           whereInScanEst(pParse, pProbe, pFirstTerm->pExpr->x.pList, &nRow);
  2918   2918         }
  2919   2919       }
  2920   2920   #endif /* SQLITE_ENABLE_STAT2 */
  2921   2921   
  2922         -    /* Assume constant cost to access a row and logarithmic cost to
  2923         -    ** do a binary search.  Hence, the initial cost is the number of output
  2924         -    ** rows plus log2(table-size) times the number of binary searches.
  2925         -    */
  2926         -    cost = nRow + nInMul*estLog(aiRowEst[0]);
  2927         -
  2928   2922       /* Adjust the number of rows and the cost downward to reflect rows
  2929   2923       ** that are excluded by range constraints.
  2930   2924       */
  2931   2925       nRow = (nRow * (double)estBound) / (double)100;
  2932         -    cost = (cost * (double)estBound) / (double)100;
  2933   2926   
  2934         -    /* Add in the estimated cost of sorting the result
         2927  +    /* Assume constant cost to access a row and logarithmic cost to
         2928  +    ** do a binary search.  Hence, the initial cost is the number of output
         2929  +    ** rows plus log2(table-size) times the number of binary searches.
         2930  +    */
         2931  +    if( pIdx && bLookup ){
         2932  +      cost = nRow + (nInMul+nRow)*estLog(aiRowEst[0]);
         2933  +    }else{
         2934  +      cost = nRow + nInMul*estLog(aiRowEst[0]);
         2935  +    }
         2936  +
         2937  +    /* Add in the estimated cost of sorting the result.  This cost is expanded
         2938  +    ** by a fudge factor of 3.0 to account for the fact that a sorting step 
         2939  +    ** involves a write and is thus more expensive than a lookup step.
  2935   2940       */
  2936   2941       if( bSort ){
  2937         -      cost += cost*estLog(cost);
         2942  +      cost += nRow*estLog(nRow)*(double)3;
  2938   2943       }
  2939   2944   
  2940         -    /* If all information can be taken directly from the index, we avoid
  2941         -    ** doing table lookups.  This reduces the cost by half.  (Not really -
  2942         -    ** this needs to be fixed.)
  2943         -    */
  2944         -    if( pIdx && bLookup==0 ){
  2945         -      cost /= (double)2;
  2946         -    }
  2947   2945       /**** Cost of using this index has now been computed ****/
  2948   2946   
  2949   2947       /* If there are additional constraints on this table that cannot
  2950   2948       ** be used with the current index, but which might lower the number
  2951   2949       ** of output rows, adjust the nRow value accordingly.  This only 
  2952   2950       ** matters if the current index is the least costly, so do not bother
  2953   2951       ** with this step if we already know this index will not be chosen.