Index: src/analyze.c ================================================================== --- src/analyze.c +++ src/analyze.c @@ -8,10 +8,112 @@ ** May you find forgiveness for yourself and forgive others. ** May you share freely, never taking more than you give. ** ************************************************************************* ** This file contains code associated with the ANALYZE command. +** +** The ANALYZE command gather statistics about the content of tables +** and indices. These statistics are made available to the query planner +** to help it make better decisions about how to perform queries. +** +** The following system tables are or have been supported: +** +** CREATE TABLE sqlite_stat1(tbl, idx, stat); +** CREATE TABLE sqlite_stat2(tbl, idx, sampleno, sample); +** CREATE TABLE sqlite_stat3(tbl, idx, nEq, nLt, nDLt, sample); +** +** Additional tables might be added in future releases of SQLite. +** The sqlite_stat2 table is not created or used unless the SQLite version +** is between 3.6.18 and 3.7.8, inclusive, and unless SQLite is compiled +** with SQLITE_ENABLE_STAT2. The sqlite_stat2 table is deprecated. +** The sqlite_stat2 table is superceded by sqlite_stat3, which is only +** created and used by SQLite versions 3.7.9 and later and with +** SQLITE_ENABLE_STAT3 defined. The fucntionality of sqlite_stat3 +** is a superset of sqlite_stat2. +** +** Format of sqlite_stat1: +** +** There is normally one row per index, with the index identified by the +** name in the idx column. The tbl column is the name of the table to +** which the index belongs. In each such row, the stat column will be +** a string consisting of a list of integers. The first integer in this +** list is the number of rows in the index and in the table. The second +** integer is the average number of rows in the index that have the same +** value in the first column of the index. The third integer is the average +** number of rows in the index that have the same value for the first two +** columns. The N-th integer (for N>1) is the average number of rows in +** the index which have the same value for the first N-1 columns. For +** a K-column index, there will be K+1 integers in the stat column. If +** the index is unique, then the last integer will be 1. +** +** The list of integers in the stat column can optionally be followed +** by the keyword "unordered". The "unordered" keyword, if it is present, +** must be separated from the last integer by a single space. If the +** "unordered" keyword is present, then the query planner assumes that +** the index is unordered and will not use the index for a range query. +** +** If the sqlite_stat1.idx column is NULL, then the sqlite_stat1.stat +** column contains a single integer which is the (estimated) number of +** rows in the table identified by sqlite_stat1.tbl. +** +** Format of sqlite_stat2: +** +** The sqlite_stat2 is only created and is only used if SQLite is compiled +** with SQLITE_ENABLE_STAT2 and if the SQLite version number is between +** 3.6.18 and 3.7.8. The "stat2" table contains additional information +** about the distribution of keys within an index. The index is identified by +** the "idx" column and the "tbl" column is the name of the table to which +** the index belongs. There are usually 10 rows in the sqlite_stat2 +** table for each index. +** +** The sqlite_stat2 entries for an index that have sampleno between 0 and 9 +** inclusive are samples of the left-most key value in the index taken at +** evenly spaced points along the index. Let the number of samples be S +** (10 in the standard build) and let C be the number of rows in the index. +** Then the sampled rows are given by: +** +** rownumber = (i*C*2 + C)/(S*2) +** +** For i between 0 and S-1. Conceptually, the index space is divided into +** S uniform buckets and the samples are the middle row from each bucket. +** +** The format for sqlite_stat2 is recorded here for legacy reference. This +** version of SQLite does not support sqlite_stat2. It neither reads nor +** writes the sqlite_stat2 table. This version of SQLite only supports +** sqlite_stat3. +** +** Format for sqlite_stat3: +** +** The sqlite_stat3 is an enhancement to sqlite_stat2. A new name is +** used to avoid compatibility problems. +** +** The format of the sqlite_stat3 table is similar to the format of +** the sqlite_stat2 table. There are multiple entries for each index. +** The idx column names the index and the tbl column is the table of the +** index. If the idx and tbl columns are the same, then the sample is +** of the INTEGER PRIMARY KEY. The sample column is a value taken from +** the left-most column of the index. The nEq column is the approximate +** number of entires in the index whose left-most column exactly matches +** the sample. nLt is the approximate number of entires whose left-most +** column is less than the sample. The nDLt column is the approximate +** number of distinct left-most entries in the index that are less than +** the sample. +** +** Future versions of SQLite might change to store a string containing +** multiple integers values in the nDLt column of sqlite_stat3. The first +** integer will be the number of prior index entires that are distinct in +** the left-most column. The second integer will be the number of prior index +** entries that are distinct in the first two columns. The third integer +** will be the number of prior index entries that are distinct in the first +** three columns. And so forth. With that extension, the nDLt field is +** similar in function to the sqlite_stat1.stat field. +** +** There can be an arbitrary number of sqlite_stat3 entries per index. +** The ANALYZE command will typically generate sqlite_stat3 tables +** that contain between 10 and 40 samples which are distributed across +** the key space, though not uniformly, and which include samples with +** largest possible nEq values. */ #ifndef SQLITE_OMIT_ANALYZE #include "sqliteInt.h" /* @@ -40,12 +142,18 @@ static const struct { const char *zName; const char *zCols; } aTable[] = { { "sqlite_stat1", "tbl,idx,stat" }, -#ifdef SQLITE_ENABLE_STAT2 - { "sqlite_stat2", "tbl,idx,sampleno,sample" }, +#ifdef SQLITE_ENABLE_STAT3 + { "sqlite_stat3", "tbl,idx,neq,nlt,ndlt,sample" }, +#endif + }; + static const char *azToDrop[] = { + "sqlite_stat2", +#ifndef SQLITE_ENABLE_STAT3 + "sqlite_stat3", #endif }; int aRoot[] = {0, 0}; u8 aCreateTbl[] = {0, 0}; @@ -57,10 +165,24 @@ if( v==0 ) return; assert( sqlite3BtreeHoldsAllMutexes(db) ); assert( sqlite3VdbeDb(v)==db ); pDb = &db->aDb[iDb]; + /* Drop all statistics tables that this version of SQLite does not + ** understand. + */ + for(i=0; izName); + if( pTab ){ + sqlite3CodeDropTable(pParse, pTab, iDb, 0); + break; + } + } + + /* Create new statistic tables if they do not exist, or clear them + ** if they do already exist. + */ for(i=0; izName))==0 ){ /* The sqlite_stat[12] table does not exist. Create it. Note that a @@ -87,17 +209,237 @@ sqlite3VdbeAddOp2(v, OP_Clear, aRoot[i], iDb); } } } - /* Open the sqlite_stat[12] tables for writing. */ + /* Open the sqlite_stat[13] tables for writing. */ for(i=0; ia[0])*mxSample; + p = sqlite3_malloc( n ); + if( p==0 ){ + sqlite3_result_error_nomem(context); + return; + } + memset(p, 0, n); + p->a = (struct Stat3Sample*)&p[1]; + p->nRow = nRow; + p->mxSample = mxSample; + p->nPSample = p->nRow/(mxSample/3+1) + 1; + sqlite3_randomness(sizeof(p->iPrn), &p->iPrn); + sqlite3_result_blob(context, p, sizeof(p), sqlite3_free); +} +static const FuncDef stat3InitFuncdef = { + 2, /* nArg */ + SQLITE_UTF8, /* iPrefEnc */ + 0, /* flags */ + 0, /* pUserData */ + 0, /* pNext */ + stat3Init, /* xFunc */ + 0, /* xStep */ + 0, /* xFinalize */ + "stat3_init", /* zName */ + 0, /* pHash */ + 0 /* pDestructor */ +}; + + +/* +** Implementation of the stat3_push(nEq,nLt,nDLt,rowid,P) SQL function. The +** arguments describe a single key instance. This routine makes the +** decision about whether or not to retain this key for the sqlite_stat3 +** table. +** +** The return value is NULL. +*/ +static void stat3Push( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + Stat3Accum *p = (Stat3Accum*)sqlite3_value_blob(argv[4]); + tRowcnt nEq = sqlite3_value_int64(argv[0]); + tRowcnt nLt = sqlite3_value_int64(argv[1]); + tRowcnt nDLt = sqlite3_value_int64(argv[2]); + i64 rowid = sqlite3_value_int64(argv[3]); + u8 isPSample = 0; + u8 doInsert = 0; + int iMin = p->iMin; + struct Stat3Sample *pSample; + int i; + u32 h; + + UNUSED_PARAMETER(context); + UNUSED_PARAMETER(argc); + if( nEq==0 ) return; + h = p->iPrn = p->iPrn*1103515245 + 12345; + if( (nLt/p->nPSample)!=((nEq+nLt)/p->nPSample) ){ + doInsert = isPSample = 1; + }else if( p->nSamplemxSample ){ + doInsert = 1; + }else{ + if( nEq>p->a[iMin].nEq || (nEq==p->a[iMin].nEq && h>p->a[iMin].iHash) ){ + doInsert = 1; + } + } + if( !doInsert ) return; + if( p->nSample==p->mxSample ){ + assert( p->nSample - iMin - 1 >= 0 ); + memmove(&p->a[iMin], &p->a[iMin+1], sizeof(p->a[0])*(p->nSample-iMin-1)); + pSample = &p->a[p->nSample-1]; + }else{ + pSample = &p->a[p->nSample++]; + } + pSample->iRowid = rowid; + pSample->nEq = nEq; + pSample->nLt = nLt; + pSample->nDLt = nDLt; + pSample->iHash = h; + pSample->isPSample = isPSample; + + /* Find the new minimum */ + if( p->nSample==p->mxSample ){ + pSample = p->a; + i = 0; + while( pSample->isPSample ){ + i++; + pSample++; + assert( inSample ); + } + nEq = pSample->nEq; + h = pSample->iHash; + iMin = i; + for(i++, pSample++; inSample; i++, pSample++){ + if( pSample->isPSample ) continue; + if( pSample->nEqnEq==nEq && pSample->iHashnEq; + h = pSample->iHash; + } + } + p->iMin = iMin; + } +} +static const FuncDef stat3PushFuncdef = { + 5, /* nArg */ + SQLITE_UTF8, /* iPrefEnc */ + 0, /* flags */ + 0, /* pUserData */ + 0, /* pNext */ + stat3Push, /* xFunc */ + 0, /* xStep */ + 0, /* xFinalize */ + "stat3_push", /* zName */ + 0, /* pHash */ + 0 /* pDestructor */ +}; + +/* +** Implementation of the stat3_get(P,N,...) SQL function. This routine is +** used to query the results. Content is returned for the Nth sqlite_stat3 +** row where N is between 0 and S-1 and S is the number of samples. The +** value returned depends on the number of arguments. +** +** argc==2 result: rowid +** argc==3 result: nEq +** argc==4 result: nLt +** argc==5 result: nDLt +*/ +static void stat3Get( + sqlite3_context *context, + int argc, + sqlite3_value **argv +){ + int n = sqlite3_value_int(argv[1]); + Stat3Accum *p = (Stat3Accum*)sqlite3_value_blob(argv[0]); + + assert( p!=0 ); + if( p->nSample<=n ) return; + switch( argc ){ + case 2: sqlite3_result_int64(context, p->a[n].iRowid); break; + case 3: sqlite3_result_int64(context, p->a[n].nEq); break; + case 4: sqlite3_result_int64(context, p->a[n].nLt); break; + default: sqlite3_result_int64(context, p->a[n].nDLt); break; + } +} +static const FuncDef stat3GetFuncdef = { + -1, /* nArg */ + SQLITE_UTF8, /* iPrefEnc */ + 0, /* flags */ + 0, /* pUserData */ + 0, /* pNext */ + stat3Get, /* xFunc */ + 0, /* xStep */ + 0, /* xFinalize */ + "stat3_get", /* zName */ + 0, /* pHash */ + 0 /* pDestructor */ +}; +#endif /* SQLITE_ENABLE_STAT3 */ + + + /* ** Generate code to do an analysis of all indices associated with ** a single table. */ @@ -117,24 +459,31 @@ int endOfLoop; /* The end of the loop */ int jZeroRows = -1; /* Jump from here if number of rows is zero */ int iDb; /* Index of database containing pTab */ int regTabname = iMem++; /* Register containing table name */ int regIdxname = iMem++; /* Register containing index name */ - int regSampleno = iMem++; /* Register containing next sample number */ - int regCol = iMem++; /* Content of a column analyzed table */ + int regStat1 = iMem++; /* The stat column of sqlite_stat1 */ +#ifdef SQLITE_ENABLE_STAT3 + int regNumEq = regStat1; /* Number of instances. Same as regStat1 */ + int regNumLt = iMem++; /* Number of keys less than regSample */ + int regNumDLt = iMem++; /* Number of distinct keys less than regSample */ + int regSample = iMem++; /* The next sample value */ + int regRowid = regSample; /* Rowid of a sample */ + int regAccum = iMem++; /* Register to hold Stat3Accum object */ + int regLoop = iMem++; /* Loop counter */ + int regCount = iMem++; /* Number of rows in the table or index */ + int regTemp1 = iMem++; /* Intermediate register */ + int regTemp2 = iMem++; /* Intermediate register */ + int once = 1; /* One-time initialization */ + int shortJump = 0; /* Instruction address */ + int iTabCur = pParse->nTab++; /* Table cursor */ +#endif + int regCol = iMem++; /* Content of a column in analyzed table */ int regRec = iMem++; /* Register holding completed record */ int regTemp = iMem++; /* Temporary use register */ - int regRowid = iMem++; /* Rowid for the inserted record */ - -#ifdef SQLITE_ENABLE_STAT2 - int addr = 0; /* Instruction address */ - int regTemp2 = iMem++; /* Temporary use register */ - int regSamplerecno = iMem++; /* Index of next sample to record */ - int regRecno = iMem++; /* Current sample index */ - int regLast = iMem++; /* Index of last sample to record */ - int regFirst = iMem++; /* Index of first sample to record */ -#endif + int regNewRowid = iMem++; /* Rowid for the inserted record */ + v = sqlite3GetVdbe(pParse); if( v==0 || NEVER(pTab==0) ){ return; } @@ -163,13 +512,18 @@ iIdxCur = pParse->nTab++; sqlite3VdbeAddOp4(v, OP_String8, 0, regTabname, 0, pTab->zName, 0); for(pIdx=pTab->pIndex; pIdx; pIdx=pIdx->pNext){ int nCol; KeyInfo *pKey; + int addrIfNot = 0; /* address of OP_IfNot */ + int *aChngAddr; /* Array of jump instruction addresses */ if( pOnlyIdx && pOnlyIdx!=pIdx ) continue; + VdbeNoopComment((v, "Begin analysis of %s", pIdx->zName)); nCol = pIdx->nColumn; + aChngAddr = sqlite3DbMallocRaw(db, sizeof(int)*nCol); + if( aChngAddr==0 ) continue; pKey = sqlite3IndexKeyinfo(pParse, pIdx); if( iMem+1+(nCol*2)>pParse->nMem ){ pParse->nMem = iMem+1+(nCol*2); } @@ -180,35 +534,24 @@ VdbeComment((v, "%s", pIdx->zName)); /* Populate the register containing the index name. */ sqlite3VdbeAddOp4(v, OP_String8, 0, regIdxname, 0, pIdx->zName, 0); -#ifdef SQLITE_ENABLE_STAT2 - - /* If this iteration of the loop is generating code to analyze the - ** first index in the pTab->pIndex list, then register regLast has - ** not been populated. In this case populate it now. */ - if( pTab->pIndex==pIdx ){ - sqlite3VdbeAddOp2(v, OP_Integer, SQLITE_INDEX_SAMPLES, regSamplerecno); - sqlite3VdbeAddOp2(v, OP_Integer, SQLITE_INDEX_SAMPLES*2-1, regTemp); - sqlite3VdbeAddOp2(v, OP_Integer, SQLITE_INDEX_SAMPLES*2, regTemp2); - - sqlite3VdbeAddOp2(v, OP_Count, iIdxCur, regLast); - sqlite3VdbeAddOp2(v, OP_Null, 0, regFirst); - addr = sqlite3VdbeAddOp3(v, OP_Lt, regSamplerecno, 0, regLast); - sqlite3VdbeAddOp3(v, OP_Divide, regTemp2, regLast, regFirst); - sqlite3VdbeAddOp3(v, OP_Multiply, regLast, regTemp, regLast); - sqlite3VdbeAddOp2(v, OP_AddImm, regLast, SQLITE_INDEX_SAMPLES*2-2); - sqlite3VdbeAddOp3(v, OP_Divide, regTemp2, regLast, regLast); - sqlite3VdbeJumpHere(v, addr); - } - - /* Zero the regSampleno and regRecno registers. */ - sqlite3VdbeAddOp2(v, OP_Integer, 0, regSampleno); - sqlite3VdbeAddOp2(v, OP_Integer, 0, regRecno); - sqlite3VdbeAddOp2(v, OP_Copy, regFirst, regSamplerecno); -#endif +#ifdef SQLITE_ENABLE_STAT3 + if( once ){ + once = 0; + sqlite3OpenTable(pParse, iTabCur, iDb, pTab, OP_OpenRead); + } + sqlite3VdbeAddOp2(v, OP_Count, iIdxCur, regCount); + sqlite3VdbeAddOp2(v, OP_Integer, SQLITE_STAT3_SAMPLES, regTemp1); + sqlite3VdbeAddOp2(v, OP_Integer, 0, regNumEq); + sqlite3VdbeAddOp2(v, OP_Integer, 0, regNumLt); + sqlite3VdbeAddOp2(v, OP_Integer, -1, regNumDLt); + sqlite3VdbeAddOp4(v, OP_Function, 1, regCount, regAccum, + (char*)&stat3InitFuncdef, P4_FUNCDEF); + sqlite3VdbeChangeP5(v, 2); +#endif /* SQLITE_ENABLE_STAT3 */ /* The block of memory cells initialized here is used as follows. ** ** iMem: ** The total number of rows in the table. @@ -234,79 +577,87 @@ /* Start the analysis loop. This loop runs through all the entries in ** the index b-tree. */ endOfLoop = sqlite3VdbeMakeLabel(v); sqlite3VdbeAddOp2(v, OP_Rewind, iIdxCur, endOfLoop); topOfLoop = sqlite3VdbeCurrentAddr(v); - sqlite3VdbeAddOp2(v, OP_AddImm, iMem, 1); + sqlite3VdbeAddOp2(v, OP_AddImm, iMem, 1); /* Increment row counter */ for(i=0; iazColl!=0 ); assert( pIdx->azColl[i]!=0 ); pColl = sqlite3LocateCollSeq(pParse, pIdx->azColl[i]); - sqlite3VdbeAddOp4(v, OP_Ne, regCol, 0, iMem+nCol+i+1, - (char*)pColl, P4_COLLSEQ); + aChngAddr[i] = sqlite3VdbeAddOp4(v, OP_Ne, regCol, 0, iMem+nCol+i+1, + (char*)pColl, P4_COLLSEQ); sqlite3VdbeChangeP5(v, SQLITE_NULLEQ); - } - if( db->mallocFailed ){ - /* If a malloc failure has occurred, then the result of the expression - ** passed as the second argument to the call to sqlite3VdbeJumpHere() - ** below may be negative. Which causes an assert() to fail (or an - ** out-of-bounds write if SQLITE_DEBUG is not defined). */ - return; + VdbeComment((v, "jump if column %d changed", i)); +#ifdef SQLITE_ENABLE_STAT3 + if( i==0 ){ + sqlite3VdbeAddOp2(v, OP_AddImm, regNumEq, 1); + VdbeComment((v, "incr repeat count")); + } +#endif } sqlite3VdbeAddOp2(v, OP_Goto, 0, endOfLoop); for(i=0; inColumn, regRowid); + sqlite3VdbeAddOp3(v, OP_Add, regNumEq, regNumLt, regNumLt); + sqlite3VdbeAddOp2(v, OP_AddImm, regNumDLt, 1); + sqlite3VdbeAddOp2(v, OP_Integer, 1, regNumEq); +#endif } - sqlite3VdbeJumpHere(v, addr2); /* Set jump dest for the OP_Ne */ sqlite3VdbeAddOp2(v, OP_AddImm, iMem+i+1, 1); sqlite3VdbeAddOp3(v, OP_Column, iIdxCur, i, iMem+nCol+i+1); } + sqlite3DbFree(db, aChngAddr); - /* End of the analysis loop. */ + /* Always jump here after updating the iMem+1...iMem+1+nCol counters */ sqlite3VdbeResolveLabel(v, endOfLoop); + sqlite3VdbeAddOp2(v, OP_Next, iIdxCur, topOfLoop); sqlite3VdbeAddOp1(v, OP_Close, iIdxCur); +#ifdef SQLITE_ENABLE_STAT3 + sqlite3VdbeAddOp4(v, OP_Function, 1, regNumEq, regTemp2, + (char*)&stat3PushFuncdef, P4_FUNCDEF); + sqlite3VdbeChangeP5(v, 5); + sqlite3VdbeAddOp2(v, OP_Integer, -1, regLoop); + shortJump = + sqlite3VdbeAddOp2(v, OP_AddImm, regLoop, 1); + sqlite3VdbeAddOp4(v, OP_Function, 1, regAccum, regTemp1, + (char*)&stat3GetFuncdef, P4_FUNCDEF); + sqlite3VdbeChangeP5(v, 2); + sqlite3VdbeAddOp1(v, OP_IsNull, regTemp1); + sqlite3VdbeAddOp3(v, OP_NotExists, iTabCur, shortJump, regTemp1); + sqlite3VdbeAddOp3(v, OP_Column, iTabCur, pIdx->aiColumn[0], regSample); + sqlite3ColumnDefault(v, pTab, pIdx->aiColumn[0], regSample); + sqlite3VdbeAddOp4(v, OP_Function, 1, regAccum, regNumEq, + (char*)&stat3GetFuncdef, P4_FUNCDEF); + sqlite3VdbeChangeP5(v, 3); + sqlite3VdbeAddOp4(v, OP_Function, 1, regAccum, regNumLt, + (char*)&stat3GetFuncdef, P4_FUNCDEF); + sqlite3VdbeChangeP5(v, 4); + sqlite3VdbeAddOp4(v, OP_Function, 1, regAccum, regNumDLt, + (char*)&stat3GetFuncdef, P4_FUNCDEF); + sqlite3VdbeChangeP5(v, 5); + sqlite3VdbeAddOp4(v, OP_MakeRecord, regTabname, 6, regRec, "bbbbbb", 0); + sqlite3VdbeAddOp2(v, OP_NewRowid, iStatCur+1, regNewRowid); + sqlite3VdbeAddOp3(v, OP_Insert, iStatCur+1, regRec, regNewRowid); + sqlite3VdbeAddOp2(v, OP_Goto, 0, shortJump); + sqlite3VdbeJumpHere(v, shortJump+2); +#endif /* Store the results in sqlite_stat1. ** ** The result is a single row of the sqlite_stat1 table. The first ** two columns are the names of the table and index. The third column @@ -322,50 +673,51 @@ ** ** If K==0 then no entry is made into the sqlite_stat1 table. ** If K>0 then it is always the case the D>0 so division by zero ** is never possible. */ - sqlite3VdbeAddOp2(v, OP_SCopy, iMem, regSampleno); + sqlite3VdbeAddOp2(v, OP_SCopy, iMem, regStat1); if( jZeroRows<0 ){ jZeroRows = sqlite3VdbeAddOp1(v, OP_IfNot, iMem); } for(i=0; ipIndex==0 ){ sqlite3VdbeAddOp3(v, OP_OpenRead, iIdxCur, pTab->tnum, iDb); VdbeComment((v, "%s", pTab->zName)); - sqlite3VdbeAddOp2(v, OP_Count, iIdxCur, regSampleno); + sqlite3VdbeAddOp2(v, OP_Count, iIdxCur, regStat1); sqlite3VdbeAddOp1(v, OP_Close, iIdxCur); - jZeroRows = sqlite3VdbeAddOp1(v, OP_IfNot, regSampleno); + jZeroRows = sqlite3VdbeAddOp1(v, OP_IfNot, regStat1); }else{ sqlite3VdbeJumpHere(v, jZeroRows); jZeroRows = sqlite3VdbeAddOp0(v, OP_Goto); } sqlite3VdbeAddOp2(v, OP_Null, 0, regIdxname); sqlite3VdbeAddOp4(v, OP_MakeRecord, regTabname, 3, regRec, "aaa", 0); - sqlite3VdbeAddOp2(v, OP_NewRowid, iStatCur, regRowid); - sqlite3VdbeAddOp3(v, OP_Insert, iStatCur, regRec, regRowid); + sqlite3VdbeAddOp2(v, OP_NewRowid, iStatCur, regNewRowid); + sqlite3VdbeAddOp3(v, OP_Insert, iStatCur, regRec, regNewRowid); sqlite3VdbeChangeP5(v, OPFLAG_APPEND); if( pParse->nMemnMem = regRec; sqlite3VdbeJumpHere(v, jZeroRows); } + /* ** Generate code that will cause the most recent index analysis to ** be loaded into internal hash tables where is can be used. */ @@ -386,11 +738,11 @@ int iStatCur; int iMem; sqlite3BeginWriteOperation(pParse, 0, iDb); iStatCur = pParse->nTab; - pParse->nTab += 2; + pParse->nTab += 3; openStatTable(pParse, iDb, iStatCur, 0, 0); iMem = pParse->nMem+1; assert( sqlite3SchemaMutexHeld(db, iDb, 0) ); for(k=sqliteHashFirst(&pSchema->tblHash); k; k=sqliteHashNext(k)){ Table *pTab = (Table*)sqliteHashData(k); @@ -411,11 +763,11 @@ assert( pTab!=0 ); assert( sqlite3BtreeHoldsAllMutexes(pParse->db) ); iDb = sqlite3SchemaToIndex(pParse->db, pTab->pSchema); sqlite3BeginWriteOperation(pParse, 0, iDb); iStatCur = pParse->nTab; - pParse->nTab += 2; + pParse->nTab += 3; if( pOnlyIdx ){ openStatTable(pParse, iDb, iStatCur, pOnlyIdx->zName, "idx"); }else{ openStatTable(pParse, iDb, iStatCur, pTab->zName, "tbl"); } @@ -516,11 +868,11 @@ static int analysisLoader(void *pData, int argc, char **argv, char **NotUsed){ analysisInfo *pInfo = (analysisInfo*)pData; Index *pIndex; Table *pTable; int i, c, n; - unsigned int v; + tRowcnt v; const char *z; assert( argc==3 ); UNUSED_PARAMETER2(NotUsed, argc); @@ -559,40 +911,172 @@ /* ** If the Index.aSample variable is not NULL, delete the aSample[] array ** and its contents. */ void sqlite3DeleteIndexSamples(sqlite3 *db, Index *pIdx){ -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 if( pIdx->aSample ){ int j; - for(j=0; jnSample; j++){ IndexSample *p = &pIdx->aSample[j]; if( p->eType==SQLITE_TEXT || p->eType==SQLITE_BLOB ){ sqlite3DbFree(db, p->u.z); } } sqlite3DbFree(db, pIdx->aSample); + } + if( db && db->pnBytesFreed==0 ){ + pIdx->nSample = 0; + pIdx->aSample = 0; } #else UNUSED_PARAMETER(db); UNUSED_PARAMETER(pIdx); #endif } +#ifdef SQLITE_ENABLE_STAT3 +/* +** Load content from the sqlite_stat3 table into the Index.aSample[] +** arrays of all indices. +*/ +static int loadStat3(sqlite3 *db, const char *zDb){ + int rc; /* Result codes from subroutines */ + sqlite3_stmt *pStmt = 0; /* An SQL statement being run */ + char *zSql; /* Text of the SQL statement */ + Index *pPrevIdx = 0; /* Previous index in the loop */ + int idx = 0; /* slot in pIdx->aSample[] for next sample */ + int eType; /* Datatype of a sample */ + IndexSample *pSample; /* A slot in pIdx->aSample[] */ + + if( !sqlite3FindTable(db, "sqlite_stat3", zDb) ){ + return SQLITE_OK; + } + + zSql = sqlite3MPrintf(db, + "SELECT idx,count(*) FROM %Q.sqlite_stat3" + " GROUP BY idx", zDb); + if( !zSql ){ + return SQLITE_NOMEM; + } + rc = sqlite3_prepare(db, zSql, -1, &pStmt, 0); + sqlite3DbFree(db, zSql); + if( rc ) return rc; + + while( sqlite3_step(pStmt)==SQLITE_ROW ){ + char *zIndex; /* Index name */ + Index *pIdx; /* Pointer to the index object */ + int nSample; /* Number of samples */ + + zIndex = (char *)sqlite3_column_text(pStmt, 0); + if( zIndex==0 ) continue; + nSample = sqlite3_column_int(pStmt, 1); + pIdx = sqlite3FindIndex(db, zIndex, zDb); + if( pIdx==0 ) continue; + assert( pIdx->nSample==0 ); + pIdx->nSample = nSample; + pIdx->aSample = sqlite3MallocZero( nSample*sizeof(IndexSample) ); + pIdx->avgEq = pIdx->aiRowEst[1]; + if( pIdx->aSample==0 ){ + db->mallocFailed = 1; + sqlite3_finalize(pStmt); + return SQLITE_NOMEM; + } + } + rc = sqlite3_finalize(pStmt); + if( rc ) return rc; + + zSql = sqlite3MPrintf(db, + "SELECT idx,neq,nlt,ndlt,sample FROM %Q.sqlite_stat3", zDb); + if( !zSql ){ + return SQLITE_NOMEM; + } + rc = sqlite3_prepare(db, zSql, -1, &pStmt, 0); + sqlite3DbFree(db, zSql); + if( rc ) return rc; + + while( sqlite3_step(pStmt)==SQLITE_ROW ){ + char *zIndex; /* Index name */ + Index *pIdx; /* Pointer to the index object */ + int i; /* Loop counter */ + tRowcnt sumEq; /* Sum of the nEq values */ + + zIndex = (char *)sqlite3_column_text(pStmt, 0); + if( zIndex==0 ) continue; + pIdx = sqlite3FindIndex(db, zIndex, zDb); + if( pIdx==0 ) continue; + if( pIdx==pPrevIdx ){ + idx++; + }else{ + pPrevIdx = pIdx; + idx = 0; + } + assert( idxnSample ); + pSample = &pIdx->aSample[idx]; + pSample->nEq = (tRowcnt)sqlite3_column_int64(pStmt, 1); + pSample->nLt = (tRowcnt)sqlite3_column_int64(pStmt, 2); + pSample->nDLt = (tRowcnt)sqlite3_column_int64(pStmt, 3); + if( idx==pIdx->nSample-1 ){ + if( pSample->nDLt>0 ){ + for(i=0, sumEq=0; i<=idx-1; i++) sumEq += pIdx->aSample[i].nEq; + pIdx->avgEq = (pSample->nLt - sumEq)/pSample->nDLt; + } + if( pIdx->avgEq<=0 ) pIdx->avgEq = 1; + } + eType = sqlite3_column_type(pStmt, 4); + pSample->eType = (u8)eType; + switch( eType ){ + case SQLITE_INTEGER: { + pSample->u.i = sqlite3_column_int64(pStmt, 4); + break; + } + case SQLITE_FLOAT: { + pSample->u.r = sqlite3_column_double(pStmt, 4); + break; + } + case SQLITE_NULL: { + break; + } + default: assert( eType==SQLITE_TEXT || eType==SQLITE_BLOB ); { + const char *z = (const char *)( + (eType==SQLITE_BLOB) ? + sqlite3_column_blob(pStmt, 4): + sqlite3_column_text(pStmt, 4) + ); + int n = z ? sqlite3_column_bytes(pStmt, 4) : 0; + pSample->nByte = n; + if( n < 1){ + pSample->u.z = 0; + }else{ + pSample->u.z = sqlite3Malloc(n); + if( pSample->u.z==0 ){ + db->mallocFailed = 1; + sqlite3_finalize(pStmt); + return SQLITE_NOMEM; + } + memcpy(pSample->u.z, z, n); + } + } + } + } + return sqlite3_finalize(pStmt); +} +#endif /* SQLITE_ENABLE_STAT3 */ + /* -** Load the content of the sqlite_stat1 and sqlite_stat2 tables. The +** Load the content of the sqlite_stat1 and sqlite_stat3 tables. The ** contents of sqlite_stat1 are used to populate the Index.aiRowEst[] -** arrays. The contents of sqlite_stat2 are used to populate the +** arrays. The contents of sqlite_stat3 are used to populate the ** Index.aSample[] arrays. ** ** If the sqlite_stat1 table is not present in the database, SQLITE_ERROR -** is returned. In this case, even if SQLITE_ENABLE_STAT2 was defined -** during compilation and the sqlite_stat2 table is present, no data is +** is returned. In this case, even if SQLITE_ENABLE_STAT3 was defined +** during compilation and the sqlite_stat3 table is present, no data is ** read from it. ** -** If SQLITE_ENABLE_STAT2 was defined during compilation and the -** sqlite_stat2 table is not present in the database, SQLITE_ERROR is +** If SQLITE_ENABLE_STAT3 was defined during compilation and the +** sqlite_stat3 table is not present in the database, SQLITE_ERROR is ** returned. However, in this case, data is read from the sqlite_stat1 ** table (if it is present) before returning. ** ** If an OOM error occurs, this function always sets db->mallocFailed. ** This means if the caller does not care about other errors, the return @@ -610,12 +1094,14 @@ /* Clear any prior statistics */ assert( sqlite3SchemaMutexHeld(db, iDb, 0) ); for(i=sqliteHashFirst(&db->aDb[iDb].pSchema->idxHash);i;i=sqliteHashNext(i)){ Index *pIdx = sqliteHashData(i); sqlite3DefaultRowEst(pIdx); +#ifdef SQLITE_ENABLE_STAT3 sqlite3DeleteIndexSamples(db, pIdx); pIdx->aSample = 0; +#endif } /* Check to make sure the sqlite_stat1 table exists */ sInfo.db = db; sInfo.zDatabase = db->aDb[iDb].zName; @@ -623,91 +1109,23 @@ return SQLITE_ERROR; } /* Load new statistics out of the sqlite_stat1 table */ zSql = sqlite3MPrintf(db, - "SELECT tbl, idx, stat FROM %Q.sqlite_stat1", sInfo.zDatabase); + "SELECT tbl,idx,stat FROM %Q.sqlite_stat1", sInfo.zDatabase); if( zSql==0 ){ rc = SQLITE_NOMEM; }else{ rc = sqlite3_exec(db, zSql, analysisLoader, &sInfo, 0); sqlite3DbFree(db, zSql); } - /* Load the statistics from the sqlite_stat2 table. */ -#ifdef SQLITE_ENABLE_STAT2 - if( rc==SQLITE_OK && !sqlite3FindTable(db, "sqlite_stat2", sInfo.zDatabase) ){ - rc = SQLITE_ERROR; - } - if( rc==SQLITE_OK ){ - sqlite3_stmt *pStmt = 0; - - zSql = sqlite3MPrintf(db, - "SELECT idx,sampleno,sample FROM %Q.sqlite_stat2", sInfo.zDatabase); - if( !zSql ){ - rc = SQLITE_NOMEM; - }else{ - rc = sqlite3_prepare(db, zSql, -1, &pStmt, 0); - sqlite3DbFree(db, zSql); - } - - if( rc==SQLITE_OK ){ - while( sqlite3_step(pStmt)==SQLITE_ROW ){ - char *zIndex; /* Index name */ - Index *pIdx; /* Pointer to the index object */ - - zIndex = (char *)sqlite3_column_text(pStmt, 0); - pIdx = zIndex ? sqlite3FindIndex(db, zIndex, sInfo.zDatabase) : 0; - if( pIdx ){ - int iSample = sqlite3_column_int(pStmt, 1); - if( iSample=0 ){ - int eType = sqlite3_column_type(pStmt, 2); - - if( pIdx->aSample==0 ){ - static const int sz = sizeof(IndexSample)*SQLITE_INDEX_SAMPLES; - pIdx->aSample = (IndexSample *)sqlite3DbMallocRaw(0, sz); - if( pIdx->aSample==0 ){ - db->mallocFailed = 1; - break; - } - memset(pIdx->aSample, 0, sz); - } - - assert( pIdx->aSample ); - { - IndexSample *pSample = &pIdx->aSample[iSample]; - pSample->eType = (u8)eType; - if( eType==SQLITE_INTEGER || eType==SQLITE_FLOAT ){ - pSample->u.r = sqlite3_column_double(pStmt, 2); - }else if( eType==SQLITE_TEXT || eType==SQLITE_BLOB ){ - const char *z = (const char *)( - (eType==SQLITE_BLOB) ? - sqlite3_column_blob(pStmt, 2): - sqlite3_column_text(pStmt, 2) - ); - int n = sqlite3_column_bytes(pStmt, 2); - if( n>24 ){ - n = 24; - } - pSample->nByte = (u8)n; - if( n < 1){ - pSample->u.z = 0; - }else{ - pSample->u.z = sqlite3DbStrNDup(0, z, n); - if( pSample->u.z==0 ){ - db->mallocFailed = 1; - break; - } - } - } - } - } - } - } - rc = sqlite3_finalize(pStmt); - } + /* Load the statistics from the sqlite_stat3 table. */ +#ifdef SQLITE_ENABLE_STAT3 + if( rc==SQLITE_OK ){ + rc = loadStat3(db, sInfo.zDatabase); } #endif if( rc==SQLITE_NOMEM ){ db->mallocFailed = 1; Index: src/build.c ================================================================== --- src/build.c +++ src/build.c @@ -1988,11 +1988,15 @@ Parse *pParse, /* The parsing context */ int iDb, /* The database number */ const char *zType, /* "idx" or "tbl" */ const char *zName /* Name of index or table */ ){ - static const char *azStatTab[] = { "sqlite_stat1", "sqlite_stat2" }; + static const char *azStatTab[] = { + "sqlite_stat1", + "sqlite_stat2", + "sqlite_stat3", + }; int i; const char *zDbName = pParse->db->aDb[iDb].zName; for(i=0; idb, azStatTab[i], zDbName) ){ sqlite3NestedParse(pParse, @@ -2000,10 +2004,80 @@ zDbName, azStatTab[i], zType, zName ); } } } + +/* +** Generate code to drop a table. +*/ +void sqlite3CodeDropTable(Parse *pParse, Table *pTab, int iDb, int isView){ + Vdbe *v; + sqlite3 *db = pParse->db; + Trigger *pTrigger; + Db *pDb = &db->aDb[iDb]; + + v = sqlite3GetVdbe(pParse); + assert( v!=0 ); + sqlite3BeginWriteOperation(pParse, 1, iDb); + +#ifndef SQLITE_OMIT_VIRTUALTABLE + if( IsVirtual(pTab) ){ + sqlite3VdbeAddOp0(v, OP_VBegin); + } +#endif + + /* Drop all triggers associated with the table being dropped. Code + ** is generated to remove entries from sqlite_master and/or + ** sqlite_temp_master if required. + */ + pTrigger = sqlite3TriggerList(pParse, pTab); + while( pTrigger ){ + assert( pTrigger->pSchema==pTab->pSchema || + pTrigger->pSchema==db->aDb[1].pSchema ); + sqlite3DropTriggerPtr(pParse, pTrigger); + pTrigger = pTrigger->pNext; + } + +#ifndef SQLITE_OMIT_AUTOINCREMENT + /* Remove any entries of the sqlite_sequence table associated with + ** the table being dropped. This is done before the table is dropped + ** at the btree level, in case the sqlite_sequence table needs to + ** move as a result of the drop (can happen in auto-vacuum mode). + */ + if( pTab->tabFlags & TF_Autoincrement ){ + sqlite3NestedParse(pParse, + "DELETE FROM %Q.sqlite_sequence WHERE name=%Q", + pDb->zName, pTab->zName + ); + } +#endif + + /* Drop all SQLITE_MASTER table and index entries that refer to the + ** table. The program name loops through the master table and deletes + ** every row that refers to a table of the same name as the one being + ** dropped. Triggers are handled seperately because a trigger can be + ** created in the temp database that refers to a table in another + ** database. + */ + sqlite3NestedParse(pParse, + "DELETE FROM %Q.%s WHERE tbl_name=%Q and type!='trigger'", + pDb->zName, SCHEMA_TABLE(iDb), pTab->zName); + if( !isView && !IsVirtual(pTab) ){ + destroyTable(pParse, pTab); + } + + /* Remove the table entry from SQLite's internal schema and modify + ** the schema cookie. + */ + if( IsVirtual(pTab) ){ + sqlite3VdbeAddOp4(v, OP_VDestroy, iDb, 0, 0, pTab->zName, 0); + } + sqlite3VdbeAddOp4(v, OP_DropTable, iDb, 0, 0, pTab->zName, 0); + sqlite3ChangeCookie(pParse, iDb); + sqliteViewResetAll(db, iDb); +} /* ** This routine is called to do the work of a DROP TABLE statement. ** pName is the name of the table to be dropped. */ @@ -2093,72 +2167,15 @@ /* Generate code to remove the table from the master table ** on disk. */ v = sqlite3GetVdbe(pParse); if( v ){ - Trigger *pTrigger; - Db *pDb = &db->aDb[iDb]; sqlite3BeginWriteOperation(pParse, 1, iDb); - -#ifndef SQLITE_OMIT_VIRTUALTABLE - if( IsVirtual(pTab) ){ - sqlite3VdbeAddOp0(v, OP_VBegin); - } -#endif + sqlite3ClearStatTables(pParse, iDb, "tbl", pTab->zName); sqlite3FkDropTable(pParse, pName, pTab); - - /* Drop all triggers associated with the table being dropped. Code - ** is generated to remove entries from sqlite_master and/or - ** sqlite_temp_master if required. - */ - pTrigger = sqlite3TriggerList(pParse, pTab); - while( pTrigger ){ - assert( pTrigger->pSchema==pTab->pSchema || - pTrigger->pSchema==db->aDb[1].pSchema ); - sqlite3DropTriggerPtr(pParse, pTrigger); - pTrigger = pTrigger->pNext; - } - -#ifndef SQLITE_OMIT_AUTOINCREMENT - /* Remove any entries of the sqlite_sequence table associated with - ** the table being dropped. This is done before the table is dropped - ** at the btree level, in case the sqlite_sequence table needs to - ** move as a result of the drop (can happen in auto-vacuum mode). - */ - if( pTab->tabFlags & TF_Autoincrement ){ - sqlite3NestedParse(pParse, - "DELETE FROM %s.sqlite_sequence WHERE name=%Q", - pDb->zName, pTab->zName - ); - } -#endif - - /* Drop all SQLITE_MASTER table and index entries that refer to the - ** table. The program name loops through the master table and deletes - ** every row that refers to a table of the same name as the one being - ** dropped. Triggers are handled seperately because a trigger can be - ** created in the temp database that refers to a table in another - ** database. - */ - sqlite3NestedParse(pParse, - "DELETE FROM %Q.%s WHERE tbl_name=%Q and type!='trigger'", - pDb->zName, SCHEMA_TABLE(iDb), pTab->zName); - sqlite3ClearStatTables(pParse, iDb, "tbl", pTab->zName); - if( !isView && !IsVirtual(pTab) ){ - destroyTable(pParse, pTab); - } - - /* Remove the table entry from SQLite's internal schema and modify - ** the schema cookie. - */ - if( IsVirtual(pTab) ){ - sqlite3VdbeAddOp4(v, OP_VDestroy, iDb, 0, 0, pTab->zName, 0); - } - sqlite3VdbeAddOp4(v, OP_DropTable, iDb, 0, 0, pTab->zName, 0); - sqlite3ChangeCookie(pParse, iDb); - } - sqliteViewResetAll(db, iDb); + sqlite3CodeDropTable(pParse, pTab, iDb, isView); + } exit_drop_table: sqlite3SrcListDelete(db, pName); } @@ -2637,24 +2654,24 @@ */ nName = sqlite3Strlen30(zName); nCol = pList->nExpr; pIndex = sqlite3DbMallocZero(db, sizeof(Index) + /* Index structure */ + sizeof(tRowcnt)*(nCol+1) + /* Index.aiRowEst */ sizeof(int)*nCol + /* Index.aiColumn */ - sizeof(int)*(nCol+1) + /* Index.aiRowEst */ sizeof(char *)*nCol + /* Index.azColl */ sizeof(u8)*nCol + /* Index.aSortOrder */ nName + 1 + /* Index.zName */ nExtra /* Collation sequence names */ ); if( db->mallocFailed ){ goto exit_create_index; } - pIndex->azColl = (char**)(&pIndex[1]); + pIndex->aiRowEst = (tRowcnt*)(&pIndex[1]); + pIndex->azColl = (char**)(&pIndex->aiRowEst[nCol+1]); pIndex->aiColumn = (int *)(&pIndex->azColl[nCol]); - pIndex->aiRowEst = (unsigned *)(&pIndex->aiColumn[nCol]); - pIndex->aSortOrder = (u8 *)(&pIndex->aiRowEst[nCol+1]); + pIndex->aSortOrder = (u8 *)(&pIndex->aiColumn[nCol]); pIndex->zName = (char *)(&pIndex->aSortOrder[nCol]); zExtra = (char *)(&pIndex->zName[nName+1]); memcpy(pIndex->zName, zName, nName+1); pIndex->pTable = pTab; pIndex->nColumn = pList->nExpr; @@ -2927,13 +2944,13 @@ ** Apart from that, we have little to go on besides intuition as to ** how aiRowEst[] should be initialized. The numbers generated here ** are based on typical values found in actual indices. */ void sqlite3DefaultRowEst(Index *pIdx){ - unsigned *a = pIdx->aiRowEst; + tRowcnt *a = pIdx->aiRowEst; int i; - unsigned n; + tRowcnt n; assert( a!=0 ); a[0] = pIdx->pTable->nRowEst; if( a[0]<10 ) a[0] = 10; n = 10; for(i=1; i<=pIdx->nColumn; i++){ Index: src/ctime.c ================================================================== --- src/ctime.c +++ src/ctime.c @@ -114,10 +114,13 @@ #ifdef SQLITE_ENABLE_RTREE "ENABLE_RTREE", #endif #ifdef SQLITE_ENABLE_STAT2 "ENABLE_STAT2", +#endif +#ifdef SQLITE_ENABLE_STAT3 + "ENABLE_STAT3", #endif #ifdef SQLITE_ENABLE_UNLOCK_NOTIFY "ENABLE_UNLOCK_NOTIFY", #endif #ifdef SQLITE_ENABLE_UPDATE_DELETE_LIMIT Index: src/os_win.c ================================================================== --- src/os_win.c +++ src/os_win.c @@ -2613,11 +2613,11 @@ if( h==INVALID_HANDLE_VALUE ){ pFile->lastErrno = GetLastError(); winLogError(SQLITE_CANTOPEN, "winOpen", zUtf8Name); free(zConverted); - if( isReadWrite ){ + if( isReadWrite && !isExclusive ){ return winOpen(pVfs, zName, id, ((flags|SQLITE_OPEN_READONLY)&~(SQLITE_OPEN_CREATE|SQLITE_OPEN_READWRITE)), pOutFlags); }else{ return SQLITE_CANTOPEN_BKPT; } Index: src/sqlite.h.in ================================================================== --- src/sqlite.h.in +++ src/sqlite.h.in @@ -2843,11 +2843,11 @@ ** a schema change, on the first [sqlite3_step()] call following any change ** to the [sqlite3_bind_text | bindings] of that [parameter]. ** ^The specific value of WHERE-clause [parameter] might influence the ** choice of query plan if the parameter is the left-hand side of a [LIKE] ** or [GLOB] operator or if the parameter is compared to an indexed column -** and the [SQLITE_ENABLE_STAT2] compile-time option is enabled. +** and the [SQLITE_ENABLE_STAT3] compile-time option is enabled. ** the ** ** */ int sqlite3_prepare( Index: src/sqliteInt.h ================================================================== --- src/sqliteInt.h +++ src/sqliteInt.h @@ -449,10 +449,22 @@ ** is 0x00000000ffffffff. But because of quirks of some compilers, we ** have to specify the value in the less intuitive manner shown: */ #define SQLITE_MAX_U32 ((((u64)1)<<32)-1) +/* +** The datatype used to store estimates of the number of rows in a +** table or index. This is an unsigned integer type. For 99.9% of +** the world, a 32-bit integer is sufficient. But a 64-bit integer +** can be used at compile-time if desired. +*/ +#ifdef SQLITE_64BIT_STATS + typedef u64 tRowcnt; /* 64-bit only if requested at compile-time */ +#else + typedef u32 tRowcnt; /* 32-bit is the default */ +#endif + /* ** Macros to determine whether the machine is big or little endian, ** evaluated at runtime. */ #ifdef SQLITE_AMALGAMATION @@ -1282,11 +1294,11 @@ int iPKey; /* If not negative, use aCol[iPKey] as the primary key */ int nCol; /* Number of columns in this table */ Column *aCol; /* Information about each column */ Index *pIndex; /* List of SQL indexes on this table. */ int tnum; /* Root BTree node for this table (see note above) */ - unsigned nRowEst; /* Estimated rows in table - from sqlite_stat1 table */ + tRowcnt nRowEst; /* Estimated rows in table - from sqlite_stat1 table */ Select *pSelect; /* NULL for tables. Points to definition if a view. */ u16 nRef; /* Number of pointers to this Table */ u8 tabFlags; /* Mask of TF_* values */ u8 keyConf; /* What to do in case of uniqueness conflict on iPKey */ FKey *pFKey; /* Linked list of all foreign keys in this table */ @@ -1481,11 +1493,11 @@ */ struct Index { char *zName; /* Name of this index */ int nColumn; /* Number of columns in the table used by this index */ int *aiColumn; /* Which columns are used by this index. 1st is 0 */ - unsigned *aiRowEst; /* Result of ANALYZE: Est. rows selected by each column */ + tRowcnt *aiRowEst; /* Result of ANALYZE: Est. rows selected by each column */ Table *pTable; /* The SQL table being indexed */ int tnum; /* Page containing root of this index in database file */ u8 onError; /* OE_Abort, OE_Ignore, OE_Replace, or OE_None */ u8 autoIndex; /* True if is automatically created (ex: by UNIQUE) */ u8 bUnordered; /* Use this index for == or IN queries only */ @@ -1492,24 +1504,32 @@ char *zColAff; /* String defining the affinity of each column */ Index *pNext; /* The next index associated with the same table */ Schema *pSchema; /* Schema containing this index */ u8 *aSortOrder; /* Array of size Index.nColumn. True==DESC, False==ASC */ char **azColl; /* Array of collation sequence names for index */ - IndexSample *aSample; /* Array of SQLITE_INDEX_SAMPLES samples */ +#ifdef SQLITE_ENABLE_STAT3 + int nSample; /* Number of elements in aSample[] */ + tRowcnt avgEq; /* Average nEq value for key values not in aSample */ + IndexSample *aSample; /* Samples of the left-most key */ +#endif }; /* ** Each sample stored in the sqlite_stat2 table is represented in memory ** using a structure of this type. */ struct IndexSample { union { char *z; /* Value if eType is SQLITE_TEXT or SQLITE_BLOB */ - double r; /* Value if eType is SQLITE_FLOAT or SQLITE_INTEGER */ + double r; /* Value if eType is SQLITE_FLOAT */ + i64 i; /* Value if eType is SQLITE_INTEGER */ } u; u8 eType; /* SQLITE_NULL, SQLITE_INTEGER ... etc. */ - u8 nByte; /* Size in byte of text or blob. */ + int nByte; /* Size in byte of text or blob. */ + tRowcnt nEq; /* Est. number of rows where the key equals this sample */ + tRowcnt nLt; /* Est. number of rows where key is less than this sample */ + tRowcnt nDLt; /* Est. number of distinct keys less than this sample */ }; /* ** Each token coming out of the lexer is an instance of ** this structure. Tokens are also used as part of an expression. @@ -2714,10 +2734,11 @@ #else # define sqlite3ViewGetColumnNames(A,B) 0 #endif void sqlite3DropTable(Parse*, SrcList*, int, int); +void sqlite3CodeDropTable(Parse*, Table*, int, int); void sqlite3DeleteTable(sqlite3*, Table*); #ifndef SQLITE_OMIT_AUTOINCREMENT void sqlite3AutoincrementBegin(Parse *pParse); void sqlite3AutoincrementEnd(Parse *pParse); #else @@ -2970,11 +2991,11 @@ void sqlite3ValueSetStr(sqlite3_value*, int, const void *,u8, void(*)(void*)); void sqlite3ValueFree(sqlite3_value*); sqlite3_value *sqlite3ValueNew(sqlite3 *); char *sqlite3Utf16to8(sqlite3 *, const void*, int, u8); -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 char *sqlite3Utf8to16(sqlite3 *, u8, char *, int, int *); #endif int sqlite3ValueFromExpr(sqlite3 *, Expr *, u8, u8, sqlite3_value **); void sqlite3ValueApplyAffinity(sqlite3_value *, u8, u8); #ifndef SQLITE_AMALGAMATION Index: src/test_config.c ================================================================== --- src/test_config.c +++ src/test_config.c @@ -421,10 +421,16 @@ #ifdef SQLITE_ENABLE_STAT2 Tcl_SetVar2(interp, "sqlite_options", "stat2", "1", TCL_GLOBAL_ONLY); #else Tcl_SetVar2(interp, "sqlite_options", "stat2", "0", TCL_GLOBAL_ONLY); #endif + +#ifdef SQLITE_ENABLE_STAT3 + Tcl_SetVar2(interp, "sqlite_options", "stat3", "1", TCL_GLOBAL_ONLY); +#else + Tcl_SetVar2(interp, "sqlite_options", "stat3", "0", TCL_GLOBAL_ONLY); +#endif #if !defined(SQLITE_ENABLE_LOCKING_STYLE) # if defined(__APPLE__) # define SQLITE_ENABLE_LOCKING_STYLE 1 # else Index: src/utf.c ================================================================== --- src/utf.c +++ src/utf.c @@ -462,11 +462,11 @@ ** no longer required. ** ** If a malloc failure occurs, NULL is returned and the db.mallocFailed ** flag set. */ -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 char *sqlite3Utf8to16(sqlite3 *db, u8 enc, char *z, int n, int *pnOut){ Mem m; memset(&m, 0, sizeof(m)); m.db = db; sqlite3VdbeMemSetStr(&m, z, n, SQLITE_UTF8, SQLITE_STATIC); Index: src/vdbeaux.c ================================================================== --- src/vdbeaux.c +++ src/vdbeaux.c @@ -573,12 +573,12 @@ /* ** Change the P2 operand of instruction addr so that it points to ** the address of the next instruction to be coded. */ void sqlite3VdbeJumpHere(Vdbe *p, int addr){ - assert( addr>=0 ); - sqlite3VdbeChangeP2(p, addr, p->nOp); + assert( addr>=0 || p->db->mallocFailed ); + if( addr>=0 ) sqlite3VdbeChangeP2(p, addr, p->nOp); } /* ** If the input FuncDef structure is ephemeral, then free it. If Index: src/vdbemem.c ================================================================== --- src/vdbemem.c +++ src/vdbemem.c @@ -1024,15 +1024,15 @@ *ppVal = 0; return SQLITE_OK; } op = pExpr->op; - /* op can only be TK_REGISTER if we have compiled with SQLITE_ENABLE_STAT2. + /* op can only be TK_REGISTER if we have compiled with SQLITE_ENABLE_STAT3. ** The ifdef here is to enable us to achieve 100% branch test coverage even - ** when SQLITE_ENABLE_STAT2 is omitted. + ** when SQLITE_ENABLE_STAT3 is omitted. */ -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 if( op==TK_REGISTER ) op = pExpr->op2; #else if( NEVER(op==TK_REGISTER) ) op = pExpr->op2; #endif Index: src/where.c ================================================================== --- src/where.c +++ src/where.c @@ -116,14 +116,14 @@ #define TERM_CODED 0x04 /* This term is already coded */ #define TERM_COPIED 0x08 /* Has a child */ #define TERM_ORINFO 0x10 /* Need to free the WhereTerm.u.pOrInfo object */ #define TERM_ANDINFO 0x20 /* Need to free the WhereTerm.u.pAndInfo obj */ #define TERM_OR_OK 0x40 /* Used during OR-clause processing */ -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 # define TERM_VNULL 0x80 /* Manufactured x>NULL or x<=NULL term */ #else -# define TERM_VNULL 0x00 /* Disabled if not using stat2 */ +# define TERM_VNULL 0x00 /* Disabled if not using stat3 */ #endif /* ** An instance of the following structure holds all information about a ** WHERE clause. Mostly this is a container for one or more WhereTerms. @@ -1338,12 +1338,12 @@ pNewTerm->prereqAll = pTerm->prereqAll; } } #endif /* SQLITE_OMIT_VIRTUALTABLE */ -#ifdef SQLITE_ENABLE_STAT2 - /* When sqlite_stat2 histogram data is available an operator of the +#ifdef SQLITE_ENABLE_STAT3 + /* When sqlite_stat3 histogram data is available an operator of the ** form "x IS NOT NULL" can sometimes be evaluated more efficiently ** as "x>NULL" if x is not an INTEGER PRIMARY KEY. So construct a ** virtual term of that form. ** ** Note that the virtual term must be tagged with TERM_VNULL. This @@ -1377,11 +1377,11 @@ pTerm->nChild = 1; pTerm->wtFlags |= TERM_COPIED; pNewTerm->prereqAll = pTerm->prereqAll; } } -#endif /* SQLITE_ENABLE_STAT2 */ +#endif /* SQLITE_ENABLE_STAT */ /* Prevent ON clause terms of a LEFT JOIN from being used to drive ** an index for tables to the left of the join. */ pTerm->prereqRight |= extraRight; @@ -2425,71 +2425,90 @@ */ bestOrClauseIndex(pParse, pWC, pSrc, notReady, notValid, pOrderBy, pCost); } #endif /* SQLITE_OMIT_VIRTUALTABLE */ -/* -** Argument pIdx is a pointer to an index structure that has an array of -** SQLITE_INDEX_SAMPLES evenly spaced samples of the first indexed column -** stored in Index.aSample. These samples divide the domain of values stored -** the index into (SQLITE_INDEX_SAMPLES+1) regions. -** Region 0 contains all values less than the first sample value. Region -** 1 contains values between the first and second samples. Region 2 contains -** values between samples 2 and 3. And so on. Region SQLITE_INDEX_SAMPLES -** contains values larger than the last sample. -** -** If the index contains many duplicates of a single value, then it is -** possible that two or more adjacent samples can hold the same value. -** When that is the case, the smallest possible region code is returned -** when roundUp is false and the largest possible region code is returned -** when roundUp is true. -** -** If successful, this function determines which of the regions value -** pVal lies in, sets *piRegion to the region index (a value between 0 -** and SQLITE_INDEX_SAMPLES+1, inclusive) and returns SQLITE_OK. -** Or, if an OOM occurs while converting text values between encodings, -** SQLITE_NOMEM is returned and *piRegion is undefined. -*/ -#ifdef SQLITE_ENABLE_STAT2 -static int whereRangeRegion( +#ifdef SQLITE_ENABLE_STAT3 +/* +** Estimate the location of a particular key among all keys in an +** index. Store the results in aStat as follows: +** +** aStat[0] Est. number of rows less than pVal +** aStat[1] Est. number of rows equal to pVal +** +** Return SQLITE_OK on success. +*/ +static int whereKeyStats( Parse *pParse, /* Database connection */ Index *pIdx, /* Index to consider domain of */ sqlite3_value *pVal, /* Value to consider */ - int roundUp, /* Return largest valid region if true */ - int *piRegion /* OUT: Region of domain in which value lies */ + int roundUp, /* Round up if true. Round down if false */ + tRowcnt *aStat /* OUT: stats written here */ ){ + tRowcnt n; + IndexSample *aSample; + int i, eType; + int isEq = 0; + i64 v; + double r, rS; + assert( roundUp==0 || roundUp==1 ); - if( ALWAYS(pVal) ){ - IndexSample *aSample = pIdx->aSample; - int i = 0; - int eType = sqlite3_value_type(pVal); - - if( eType==SQLITE_INTEGER || eType==SQLITE_FLOAT ){ - double r = sqlite3_value_double(pVal); - for(i=0; i=SQLITE_TEXT ) break; - if( roundUp ){ - if( aSample[i].u.r>r ) break; - }else{ - if( aSample[i].u.r>=r ) break; - } - } - }else if( eType==SQLITE_NULL ){ - i = 0; - if( roundUp ){ - while( inSample>0 ); + if( pVal==0 ) return SQLITE_ERROR; + n = pIdx->aiRowEst[0]; + aSample = pIdx->aSample; + i = 0; + eType = sqlite3_value_type(pVal); + + if( eType==SQLITE_INTEGER ){ + v = sqlite3_value_int64(pVal); + r = (i64)v; + for(i=0; inSample; i++){ + if( aSample[i].eType==SQLITE_NULL ) continue; + if( aSample[i].eType>=SQLITE_TEXT ) break; + if( aSample[i].eType==SQLITE_INTEGER ){ + if( aSample[i].u.i>=v ){ + isEq = aSample[i].u.i==v; + break; + } + }else{ + assert( aSample[i].eType==SQLITE_FLOAT ); + if( aSample[i].u.r>=r ){ + isEq = aSample[i].u.r==r; + break; + } + } + } + }else if( eType==SQLITE_FLOAT ){ + r = sqlite3_value_double(pVal); + for(i=0; inSample; i++){ + if( aSample[i].eType==SQLITE_NULL ) continue; + if( aSample[i].eType>=SQLITE_TEXT ) break; + if( aSample[i].eType==SQLITE_FLOAT ){ + rS = aSample[i].u.r; + }else{ + rS = aSample[i].u.i; + } + if( rS>=r ){ + isEq = rS==r; + break; + } + } + }else if( eType==SQLITE_NULL ){ + i = 0; + if( aSample[0].eType==SQLITE_NULL ) isEq = 1; + }else{ + assert( eType==SQLITE_TEXT || eType==SQLITE_BLOB ); + for(i=0; inSample; i++){ + if( aSample[i].eType==SQLITE_TEXT || aSample[i].eType==SQLITE_BLOB ){ + break; + } + } + if( inSample ){ sqlite3 *db = pParse->db; CollSeq *pColl; const u8 *z; - int n; - - /* pVal comes from sqlite3ValueFromExpr() so the type cannot be NULL */ - assert( eType==SQLITE_TEXT || eType==SQLITE_BLOB ); - if( eType==SQLITE_BLOB ){ z = (const u8 *)sqlite3_value_blob(pVal); pColl = db->pDfltColl; assert( pColl->enc==SQLITE_UTF8 ); }else{ @@ -2504,16 +2523,16 @@ return SQLITE_NOMEM; } assert( z && pColl && pColl->xCmp ); } n = sqlite3ValueBytes(pVal, pColl->enc); - - for(i=0; inSample; i++){ int c; int eSampletype = aSample[i].eType; - if( eSampletype==SQLITE_NULL || eSampletypeenc!=SQLITE_UTF8 ){ int nSample; char *zSample = sqlite3Utf8to16( db, pColl->enc, aSample[i].u.z, aSample[i].nByte, &nSample @@ -2527,20 +2546,51 @@ }else #endif { c = pColl->xCmp(pColl->pUser, aSample[i].nByte, aSample[i].u.z, n, z); } - if( c-roundUp>=0 ) break; + if( c>=0 ){ + if( c==0 ) isEq = 1; + break; + } } } + } - assert( i>=0 && i<=SQLITE_INDEX_SAMPLES ); - *piRegion = i; + /* At this point, aSample[i] is the first sample that is greater than + ** or equal to pVal. Or if i==pIdx->nSample, then all samples are less + ** than pVal. If aSample[i]==pVal, then isEq==1. + */ + if( isEq ){ + assert( inSample ); + aStat[0] = aSample[i].nLt; + aStat[1] = aSample[i].nEq; + }else{ + tRowcnt iLower, iUpper, iGap; + if( i==0 ){ + iLower = 0; + iUpper = aSample[0].nLt; + }else{ + iUpper = i>=pIdx->nSample ? n : aSample[i].nLt; + iLower = aSample[i-1].nEq + aSample[i-1].nLt; + } + aStat[1] = pIdx->avgEq; + if( iLower>=iUpper ){ + iGap = 0; + }else{ + iGap = iUpper - iLower; + } + if( roundUp ){ + iGap = (iGap*2)/3; + }else{ + iGap = iGap/3; + } + aStat[0] = iLower + iGap; } return SQLITE_OK; } -#endif /* #ifdef SQLITE_ENABLE_STAT2 */ +#endif /* SQLITE_ENABLE_STAT3 */ /* ** If expression pExpr represents a literal value, set *pp to point to ** an sqlite3_value structure containing the same value, with affinity ** aff applied to it, before returning. It is the responsibility of the @@ -2554,11 +2604,11 @@ ** ** If neither of the above apply, set *pp to NULL. ** ** If an error occurs, return an error code. Otherwise, SQLITE_OK. */ -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 static int valueFromExpr( Parse *pParse, Expr *pExpr, u8 aff, sqlite3_value **pp @@ -2602,106 +2652,92 @@ ** ** ... FROM t1 WHERE a > ? AND a < ? ... ** ** then nEq should be passed 0. ** -** The returned value is an integer between 1 and 100, inclusive. A return -** value of 1 indicates that the proposed range scan is expected to visit -** approximately 1/100th (1%) of the rows selected by the nEq equality -** constraints (if any). A return value of 100 indicates that it is expected -** that the range scan will visit every row (100%) selected by the equality -** constraints. +** The returned value is an integer divisor to reduce the estimated +** search space. A return value of 1 means that range constraints are +** no help at all. A return value of 2 means range constraints are +** expected to reduce the search space by half. And so forth... ** -** In the absence of sqlite_stat2 ANALYZE data, each range inequality -** reduces the search space by 3/4ths. Hence a single constraint (x>?) -** results in a return of 25 and a range constraint (x>? AND x?) +** results in a return of 4 and a range constraint (x>? AND xaCol[] of the range-compared column */ WhereTerm *pLower, /* Lower bound on the range. ex: "x>123" Might be NULL */ WhereTerm *pUpper, /* Upper bound on the range. ex: "x<455" Might be NULL */ - int *piEst /* OUT: Return value */ + double *pRangeDiv /* OUT: Reduce search space by this divisor */ ){ int rc = SQLITE_OK; -#ifdef SQLITE_ENABLE_STAT2 - - if( nEq==0 && p->aSample ){ - sqlite3_value *pLowerVal = 0; - sqlite3_value *pUpperVal = 0; - int iEst; - int iLower = 0; - int iUpper = SQLITE_INDEX_SAMPLES; - int roundUpUpper = 0; - int roundUpLower = 0; +#ifdef SQLITE_ENABLE_STAT3 + + if( nEq==0 && p->nSample ){ + sqlite3_value *pRangeVal; + tRowcnt iLower = 0; + tRowcnt iUpper = p->aiRowEst[0]; + tRowcnt a[2]; u8 aff = p->pTable->aCol[p->aiColumn[0]].affinity; if( pLower ){ Expr *pExpr = pLower->pExpr->pRight; - rc = valueFromExpr(pParse, pExpr, aff, &pLowerVal); + rc = valueFromExpr(pParse, pExpr, aff, &pRangeVal); assert( pLower->eOperator==WO_GT || pLower->eOperator==WO_GE ); - roundUpLower = (pLower->eOperator==WO_GT) ?1:0; + if( rc==SQLITE_OK + && whereKeyStats(pParse, p, pRangeVal, 0, a)==SQLITE_OK + ){ + iLower = a[0]; + if( pLower->eOperator==WO_GT ) iLower += a[1]; + } + sqlite3ValueFree(pRangeVal); } if( rc==SQLITE_OK && pUpper ){ Expr *pExpr = pUpper->pExpr->pRight; - rc = valueFromExpr(pParse, pExpr, aff, &pUpperVal); + rc = valueFromExpr(pParse, pExpr, aff, &pRangeVal); assert( pUpper->eOperator==WO_LT || pUpper->eOperator==WO_LE ); - roundUpUpper = (pUpper->eOperator==WO_LE) ?1:0; - } - - if( rc!=SQLITE_OK || (pLowerVal==0 && pUpperVal==0) ){ - sqlite3ValueFree(pLowerVal); - sqlite3ValueFree(pUpperVal); - goto range_est_fallback; - }else if( pLowerVal==0 ){ - rc = whereRangeRegion(pParse, p, pUpperVal, roundUpUpper, &iUpper); - if( pLower ) iLower = iUpper/2; - }else if( pUpperVal==0 ){ - rc = whereRangeRegion(pParse, p, pLowerVal, roundUpLower, &iLower); - if( pUpper ) iUpper = (iLower + SQLITE_INDEX_SAMPLES + 1)/2; - }else{ - rc = whereRangeRegion(pParse, p, pUpperVal, roundUpUpper, &iUpper); - if( rc==SQLITE_OK ){ - rc = whereRangeRegion(pParse, p, pLowerVal, roundUpLower, &iLower); - } - } - WHERETRACE(("range scan regions: %d..%d\n", iLower, iUpper)); - - iEst = iUpper - iLower; - testcase( iEst==SQLITE_INDEX_SAMPLES ); - assert( iEst<=SQLITE_INDEX_SAMPLES ); - if( iEst<1 ){ - *piEst = 50/SQLITE_INDEX_SAMPLES; - }else{ - *piEst = (iEst*100)/SQLITE_INDEX_SAMPLES; - } - sqlite3ValueFree(pLowerVal); - sqlite3ValueFree(pUpperVal); - return rc; - } -range_est_fallback: + if( rc==SQLITE_OK + && whereKeyStats(pParse, p, pRangeVal, 1, a)==SQLITE_OK + ){ + iUpper = a[0]; + if( pUpper->eOperator==WO_LE ) iUpper += a[1]; + } + sqlite3ValueFree(pRangeVal); + } + if( rc==SQLITE_OK ){ + if( iUpper<=iLower ){ + *pRangeDiv = (double)p->aiRowEst[0]; + }else{ + *pRangeDiv = (double)p->aiRowEst[0]/(double)(iUpper - iLower); + } + WHERETRACE(("range scan regions: %u..%u div=%g\n", + (u32)iLower, (u32)iUpper, *pRangeDiv)); + return SQLITE_OK; + } + } #else UNUSED_PARAMETER(pParse); UNUSED_PARAMETER(p); UNUSED_PARAMETER(nEq); #endif assert( pLower || pUpper ); - *piEst = 100; - if( pLower && (pLower->wtFlags & TERM_VNULL)==0 ) *piEst /= 4; - if( pUpper ) *piEst /= 4; + *pRangeDiv = (double)1; + if( pLower && (pLower->wtFlags & TERM_VNULL)==0 ) *pRangeDiv *= (double)4; + if( pUpper ) *pRangeDiv *= (double)4; return rc; } -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 /* ** Estimate the number of rows that will be returned based on ** an equality constraint x=VALUE and where that VALUE occurs in ** the histogram data. This only works when x is the left-most -** column of an index and sqlite_stat2 histogram data is available +** column of an index and sqlite_stat3 histogram data is available ** for that index. When pExpr==NULL that means the constraint is ** "x IS NULL" instead of "x=VALUE". ** ** Write the estimated row count into *pnRow and return SQLITE_OK. ** If unable to make an estimate, leave *pnRow unchanged and return @@ -2717,44 +2753,36 @@ Index *p, /* The index whose left-most column is pTerm */ Expr *pExpr, /* Expression for VALUE in the x=VALUE constraint */ double *pnRow /* Write the revised row estimate here */ ){ sqlite3_value *pRhs = 0; /* VALUE on right-hand side of pTerm */ - int iLower, iUpper; /* Range of histogram regions containing pRhs */ u8 aff; /* Column affinity */ int rc; /* Subfunction return code */ - double nRowEst; /* New estimate of the number of rows */ + tRowcnt a[2]; /* Statistics */ assert( p->aSample!=0 ); + assert( p->nSample>0 ); aff = p->pTable->aCol[p->aiColumn[0]].affinity; if( pExpr ){ rc = valueFromExpr(pParse, pExpr, aff, &pRhs); if( rc ) goto whereEqualScanEst_cancel; }else{ pRhs = sqlite3ValueNew(pParse->db); } if( pRhs==0 ) return SQLITE_NOTFOUND; - rc = whereRangeRegion(pParse, p, pRhs, 0, &iLower); - if( rc ) goto whereEqualScanEst_cancel; - rc = whereRangeRegion(pParse, p, pRhs, 1, &iUpper); - if( rc ) goto whereEqualScanEst_cancel; - WHERETRACE(("equality scan regions: %d..%d\n", iLower, iUpper)); - if( iLower>=iUpper ){ - nRowEst = p->aiRowEst[0]/(SQLITE_INDEX_SAMPLES*2); - if( nRowEst<*pnRow ) *pnRow = nRowEst; - }else{ - nRowEst = (iUpper-iLower)*p->aiRowEst[0]/SQLITE_INDEX_SAMPLES; - *pnRow = nRowEst; - } - + rc = whereKeyStats(pParse, p, pRhs, 0, a); + if( rc==SQLITE_OK ){ + WHERETRACE(("equality scan regions: %d\n", (int)a[1])); + *pnRow = a[1]; + } whereEqualScanEst_cancel: sqlite3ValueFree(pRhs); return rc; } -#endif /* defined(SQLITE_ENABLE_STAT2) */ +#endif /* defined(SQLITE_ENABLE_STAT3) */ -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 /* ** Estimate the number of rows that will be returned based on ** an IN constraint where the right-hand side of the IN operator ** is a list of values. Example: ** @@ -2773,64 +2801,29 @@ Parse *pParse, /* Parsing & code generating context */ Index *p, /* The index whose left-most column is pTerm */ ExprList *pList, /* The value list on the RHS of "x IN (v1,v2,v3,...)" */ double *pnRow /* Write the revised row estimate here */ ){ - sqlite3_value *pVal = 0; /* One value from list */ - int iLower, iUpper; /* Range of histogram regions containing pRhs */ - u8 aff; /* Column affinity */ - int rc = SQLITE_OK; /* Subfunction return code */ - double nRowEst; /* New estimate of the number of rows */ - int nSpan = 0; /* Number of histogram regions spanned */ - int nSingle = 0; /* Histogram regions hit by a single value */ - int nNotFound = 0; /* Count of values that are not constants */ - int i; /* Loop counter */ - u8 aSpan[SQLITE_INDEX_SAMPLES+1]; /* Histogram regions that are spanned */ - u8 aSingle[SQLITE_INDEX_SAMPLES+1]; /* Histogram regions hit once */ + int rc = SQLITE_OK; /* Subfunction return code */ + double nEst; /* Number of rows for a single term */ + double nRowEst = (double)0; /* New estimate of the number of rows */ + int i; /* Loop counter */ assert( p->aSample!=0 ); - aff = p->pTable->aCol[p->aiColumn[0]].affinity; - memset(aSpan, 0, sizeof(aSpan)); - memset(aSingle, 0, sizeof(aSingle)); - for(i=0; inExpr; i++){ - sqlite3ValueFree(pVal); - rc = valueFromExpr(pParse, pList->a[i].pExpr, aff, &pVal); - if( rc ) break; - if( pVal==0 || sqlite3_value_type(pVal)==SQLITE_NULL ){ - nNotFound++; - continue; - } - rc = whereRangeRegion(pParse, p, pVal, 0, &iLower); - if( rc ) break; - rc = whereRangeRegion(pParse, p, pVal, 1, &iUpper); - if( rc ) break; - if( iLower>=iUpper ){ - aSingle[iLower] = 1; - }else{ - assert( iLower>=0 && iUpper<=SQLITE_INDEX_SAMPLES ); - while( iLowernExpr; i++){ + nEst = p->aiRowEst[0]; + rc = whereEqualScanEst(pParse, p, pList->a[i].pExpr, &nEst); + nRowEst += nEst; } if( rc==SQLITE_OK ){ - for(i=nSpan=0; i<=SQLITE_INDEX_SAMPLES; i++){ - if( aSpan[i] ){ - nSpan++; - }else if( aSingle[i] ){ - nSingle++; - } - } - nRowEst = (nSpan*2+nSingle)*p->aiRowEst[0]/(2*SQLITE_INDEX_SAMPLES) - + nNotFound*p->aiRowEst[1]; if( nRowEst > p->aiRowEst[0] ) nRowEst = p->aiRowEst[0]; *pnRow = nRowEst; - WHERETRACE(("IN row estimate: nSpan=%d, nSingle=%d, nNotFound=%d, est=%g\n", - nSpan, nSingle, nNotFound, nRowEst)); + WHERETRACE(("IN row estimate: est=%g\n", nRowEst)); } - sqlite3ValueFree(pVal); return rc; } -#endif /* defined(SQLITE_ENABLE_STAT2) */ +#endif /* defined(SQLITE_ENABLE_STAT3) */ /* ** Find the best query plan for accessing a particular table. Write the ** best query plan and its cost into the WhereCost object supplied as the @@ -2873,11 +2866,11 @@ Index *pProbe; /* An index we are evaluating */ Index *pIdx; /* Copy of pProbe, or zero for IPK index */ int eqTermMask; /* Current mask of valid equality operators */ int idxEqTermMask; /* Index mask of valid equality operators */ Index sPk; /* A fake index object for the primary key */ - unsigned int aiRowEstPk[2]; /* The aiRowEst[] value for the sPk index */ + tRowcnt aiRowEstPk[2]; /* The aiRowEst[] value for the sPk index */ int aiColumnPk = -1; /* The aColumn[] value for the sPk index */ int wsFlagMask; /* Allowed flags in pCost->plan.wsFlag */ /* Initialize the cost to a worst-case value */ memset(pCost, 0, sizeof(*pCost)); @@ -2928,14 +2921,14 @@ } /* Loop over all indices looking for the best one to use */ for(; pProbe; pIdx=pProbe=pProbe->pNext){ - const unsigned int * const aiRowEst = pProbe->aiRowEst; + const tRowcnt * const aiRowEst = pProbe->aiRowEst; double cost; /* Cost of using pProbe */ double nRow; /* Estimated number of rows in result set */ - double log10N; /* base-10 logarithm of nRow (inexact) */ + double log10N = (double)1; /* base-10 logarithm of nRow (inexact) */ int rev; /* True to scan in reverse order */ int wsFlags = 0; Bitmask used = 0; /* The following variables are populated based on the properties of @@ -2971,18 +2964,16 @@ ** Set to true if there was at least one "x IN (SELECT ...)" term used ** in determining the value of nInMul. Note that the RHS of the ** IN operator must be a SELECT, not a value list, for this variable ** to be true. ** - ** estBound: - ** An estimate on the amount of the table that must be searched. A - ** value of 100 means the entire table is searched. Range constraints - ** might reduce this to a value less than 100 to indicate that only - ** a fraction of the table needs searching. In the absence of - ** sqlite_stat2 ANALYZE data, a single inequality reduces the search - ** space to 1/4rd its original size. So an x>? constraint reduces - ** estBound to 25. Two constraints (x>? AND xnColumn; nEq++){ @@ -3033,23 +3024,23 @@ nInMul *= pExpr->x.pList->nExpr; } }else if( pTerm->eOperator & WO_ISNULL ){ wsFlags |= WHERE_COLUMN_NULL; } -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 if( nEq==0 && pProbe->aSample ) pFirstTerm = pTerm; #endif used |= pTerm->prereqRight; } - /* Determine the value of estBound. */ + /* Determine the value of rangeDiv */ if( nEqnColumn && pProbe->bUnordered==0 ){ int j = pProbe->aiColumn[nEq]; if( findTerm(pWC, iCur, j, notReady, WO_LT|WO_LE|WO_GT|WO_GE, pIdx) ){ WhereTerm *pTop = findTerm(pWC, iCur, j, notReady, WO_LT|WO_LE, pIdx); WhereTerm *pBtm = findTerm(pWC, iCur, j, notReady, WO_GT|WO_GE, pIdx); - whereRangeScanEst(pParse, pProbe, nEq, pBtm, pTop, &estBound); + whereRangeScanEst(pParse, pProbe, nEq, pBtm, pTop, &rangeDiv); if( pTop ){ nBound = 1; wsFlags |= WHERE_TOP_LIMIT; used |= pTop->prereqRight; } @@ -3117,32 +3108,34 @@ if( bInEst && nRow*2>aiRowEst[0] ){ nRow = aiRowEst[0]/2; nInMul = (int)(nRow / aiRowEst[nEq]); } -#ifdef SQLITE_ENABLE_STAT2 +#ifdef SQLITE_ENABLE_STAT3 /* If the constraint is of the form x=VALUE or x IN (E1,E2,...) ** and we do not think that values of x are unique and if histogram ** data is available for column x, then it might be possible ** to get a better estimate on the number of rows based on ** VALUE and how common that value is according to the histogram. */ if( nRow>(double)1 && nEq==1 && pFirstTerm!=0 && aiRowEst[1]>1 ){ + assert( (pFirstTerm->eOperator & (WO_EQ|WO_ISNULL|WO_IN))!=0 ); if( pFirstTerm->eOperator & (WO_EQ|WO_ISNULL) ){ testcase( pFirstTerm->eOperator==WO_EQ ); testcase( pFirstTerm->eOperator==WO_ISNULL ); whereEqualScanEst(pParse, pProbe, pFirstTerm->pExpr->pRight, &nRow); - }else if( pFirstTerm->eOperator==WO_IN && bInEst==0 ){ + }else if( bInEst==0 ){ + assert( pFirstTerm->eOperator==WO_IN ); whereInScanEst(pParse, pProbe, pFirstTerm->pExpr->x.pList, &nRow); } } -#endif /* SQLITE_ENABLE_STAT2 */ +#endif /* SQLITE_ENABLE_STAT3 */ /* Adjust the number of output rows and downward to reflect rows ** that are excluded by range constraints. */ - nRow = (nRow * (double)estBound) / (double)100; + nRow = nRow/rangeDiv; if( nRow<1 ) nRow = 1; /* Experiments run on real SQLite databases show that the time needed ** to do a binary search to locate a row in a table or index is roughly ** log10(N) times the time to move from one row to the next row within @@ -3267,14 +3260,14 @@ if( nRow<2 ) nRow = 2; } WHERETRACE(( - "%s(%s): nEq=%d nInMul=%d estBound=%d bSort=%d bLookup=%d wsFlags=0x%x\n" + "%s(%s): nEq=%d nInMul=%d rangeDiv=%d bSort=%d bLookup=%d wsFlags=0x%x\n" " notReady=0x%llx log10N=%.1f nRow=%.1f cost=%.1f used=0x%llx\n", pSrc->pTab->zName, (pIdx ? pIdx->zName : "ipk"), - nEq, nInMul, estBound, bSort, bLookup, wsFlags, + nEq, nInMul, (int)rangeDiv, bSort, bLookup, wsFlags, notReady, log10N, nRow, cost, used )); /* If this index is the best we have seen so far, then record this ** index and its cost in the pCost structure. Index: test/alter.test ================================================================== --- test/alter.test +++ test/alter.test @@ -845,10 +845,11 @@ # set system_table_list {1 sqlite_master} catchsql ANALYZE ifcapable analyze { lappend system_table_list 2 sqlite_stat1 } ifcapable stat2 { lappend system_table_list 3 sqlite_stat2 } +ifcapable stat3 { lappend system_table_list 4 sqlite_stat3 } foreach {tn tbl} $system_table_list { do_test alter-15.$tn.1 { catchsql "ALTER TABLE $tbl RENAME TO xyz" } [list 1 "table $tbl may not be altered"] Index: test/analyze.test ================================================================== --- test/analyze.test +++ test/analyze.test @@ -286,11 +286,11 @@ SELECT * FROM t4 WHERE x=1234; } } {} # Verify that DROP TABLE and DROP INDEX remove entries from the -# sqlite_stat1 and sqlite_stat2 tables. +# sqlite_stat1 and sqlite_stat3 tables. # do_test analyze-5.0 { execsql { DELETE FROM t3; DELETE FROM t4; @@ -304,15 +304,15 @@ ANALYZE; SELECT DISTINCT idx FROM sqlite_stat1 ORDER BY 1; SELECT DISTINCT tbl FROM sqlite_stat1 ORDER BY 1; } } {t3i1 t3i2 t3i3 t4i1 t4i2 t3 t4} -ifcapable stat2 { +ifcapable stat3 { do_test analyze-5.1 { execsql { - SELECT DISTINCT idx FROM sqlite_stat2 ORDER BY 1; - SELECT DISTINCT tbl FROM sqlite_stat2 ORDER BY 1; + SELECT DISTINCT idx FROM sqlite_stat3 ORDER BY 1; + SELECT DISTINCT tbl FROM sqlite_stat3 ORDER BY 1; } } {t3i1 t3i2 t3i3 t4i1 t4i2 t3 t4} } do_test analyze-5.2 { execsql { @@ -319,15 +319,15 @@ DROP INDEX t3i2; SELECT DISTINCT idx FROM sqlite_stat1 ORDER BY 1; SELECT DISTINCT tbl FROM sqlite_stat1 ORDER BY 1; } } {t3i1 t3i3 t4i1 t4i2 t3 t4} -ifcapable stat2 { +ifcapable stat3 { do_test analyze-5.3 { execsql { - SELECT DISTINCT idx FROM sqlite_stat2 ORDER BY 1; - SELECT DISTINCT tbl FROM sqlite_stat2 ORDER BY 1; + SELECT DISTINCT idx FROM sqlite_stat3 ORDER BY 1; + SELECT DISTINCT tbl FROM sqlite_stat3 ORDER BY 1; } } {t3i1 t3i3 t4i1 t4i2 t3 t4} } do_test analyze-5.4 { execsql { @@ -334,15 +334,15 @@ DROP TABLE t3; SELECT DISTINCT idx FROM sqlite_stat1 ORDER BY 1; SELECT DISTINCT tbl FROM sqlite_stat1 ORDER BY 1; } } {t4i1 t4i2 t4} -ifcapable stat2 { +ifcapable stat3 { do_test analyze-5.5 { execsql { - SELECT DISTINCT idx FROM sqlite_stat2 ORDER BY 1; - SELECT DISTINCT tbl FROM sqlite_stat2 ORDER BY 1; + SELECT DISTINCT idx FROM sqlite_stat3 ORDER BY 1; + SELECT DISTINCT tbl FROM sqlite_stat3 ORDER BY 1; } } {t4i1 t4i2 t4} } # This test corrupts the database file so it must be the last test Index: test/analyze3.test ================================================================== --- test/analyze3.test +++ test/analyze3.test @@ -15,11 +15,11 @@ # set testdir [file dirname $argv0] source $testdir/tester.tcl -ifcapable !stat2 { +ifcapable !stat3 { finish_test return } #---------------------------------------------------------------------- @@ -95,14 +95,14 @@ } } {} do_eqp_test analyze3-1.1.2 { SELECT sum(y) FROM t1 WHERE x>200 AND x<300 -} {0 0 0 {SEARCH TABLE t1 USING INDEX i1 (x>? AND x? AND x0 AND x<1100 -} {0 0 0 {SCAN TABLE t1 (~111 rows)}} +} {0 0 0 {SEARCH TABLE t1 USING INDEX i1 (x>? AND x200 AND x<300 } } {199 0 14850} do_test analyze3-1.1.5 { @@ -115,21 +115,21 @@ set u [expr int(300)] sf_execsql { SELECT sum(y) FROM t1 WHERE x>$l AND x<$u } } {199 0 14850} do_test analyze3-1.1.7 { sf_execsql { SELECT sum(y) FROM t1 WHERE x>0 AND x<1100 } -} {999 999 499500} +} {2000 0 499500} do_test analyze3-1.1.8 { set l [string range "0" 0 end] set u [string range "1100" 0 end] sf_execsql { SELECT sum(y) FROM t1 WHERE x>$l AND x<$u } -} {999 999 499500} +} {2000 0 499500} do_test analyze3-1.1.9 { set l [expr int(0)] set u [expr int(1100)] sf_execsql { SELECT sum(y) FROM t1 WHERE x>$l AND x<$u } -} {999 999 499500} +} {2000 0 499500} # The following tests are similar to the block above. The difference is # that the indexed column has TEXT affinity in this case. In the tests # above the affinity is INTEGER. @@ -144,14 +144,14 @@ ANALYZE; } } {} do_eqp_test analyze3-1.2.2 { SELECT sum(y) FROM t2 WHERE x>1 AND x<2 -} {0 0 0 {SEARCH TABLE t2 USING INDEX i2 (x>? AND x? AND x0 AND x<99 -} {0 0 0 {SCAN TABLE t2 (~111 rows)}} +} {0 0 0 {SEARCH TABLE t2 USING INDEX i2 (x>? AND x12 AND x<20 } } {161 0 4760} do_test analyze3-1.2.5 { set l [string range "12" 0 end] @@ -163,21 +163,21 @@ set u [expr int(20)] sf_execsql {SELECT typeof($l), typeof($u), sum(y) FROM t2 WHERE x>$l AND x<$u} } {161 0 integer integer 4760} do_test analyze3-1.2.7 { sf_execsql { SELECT sum(y) FROM t2 WHERE x>0 AND x<99 } -} {999 999 490555} +} {1981 0 490555} do_test analyze3-1.2.8 { set l [string range "0" 0 end] set u [string range "99" 0 end] sf_execsql {SELECT typeof($l), typeof($u), sum(y) FROM t2 WHERE x>$l AND x<$u} -} {999 999 text text 490555} +} {1981 0 text text 490555} do_test analyze3-1.2.9 { set l [expr int(0)] set u [expr int(99)] sf_execsql {SELECT typeof($l), typeof($u), sum(y) FROM t2 WHERE x>$l AND x<$u} -} {999 999 integer integer 490555} +} {1981 0 integer integer 490555} # Same tests a third time. This time, column x has INTEGER affinity and # is not the leftmost column of the table. This triggered a bug causing # SQLite to use sub-optimal query plans in 3.6.18 and earlier. # @@ -191,14 +191,14 @@ ANALYZE; } } {} do_eqp_test analyze3-1.3.2 { SELECT sum(y) FROM t3 WHERE x>200 AND x<300 -} {0 0 0 {SEARCH TABLE t3 USING INDEX i3 (x>? AND x? AND x0 AND x<1100 -} {0 0 0 {SCAN TABLE t3 (~111 rows)}} +} {0 0 0 {SEARCH TABLE t3 USING INDEX i3 (x>? AND x200 AND x<300 } } {199 0 14850} do_test analyze3-1.3.5 { @@ -211,21 +211,21 @@ set u [expr int(300)] sf_execsql { SELECT sum(y) FROM t3 WHERE x>$l AND x<$u } } {199 0 14850} do_test analyze3-1.3.7 { sf_execsql { SELECT sum(y) FROM t3 WHERE x>0 AND x<1100 } -} {999 999 499500} +} {2000 0 499500} do_test analyze3-1.3.8 { set l [string range "0" 0 end] set u [string range "1100" 0 end] sf_execsql { SELECT sum(y) FROM t3 WHERE x>$l AND x<$u } -} {999 999 499500} +} {2000 0 499500} do_test analyze3-1.3.9 { set l [expr int(0)] set u [expr int(1100)] sf_execsql { SELECT sum(y) FROM t3 WHERE x>$l AND x<$u } -} {999 999 499500} +} {2000 0 499500} #------------------------------------------------------------------------- # Test that the values of bound SQL variables may be used for the LIKE # optimization. # @@ -246,11 +246,11 @@ } execsql COMMIT } {} do_eqp_test analyze3-2.2 { SELECT count(a) FROM t1 WHERE b LIKE 'a%' -} {0 0 0 {SEARCH TABLE t1 USING INDEX i1 (b>? AND b? AND b=0 AND z<=0} t1z 400 2 {z>=1 AND z<=1} t1z 300 - 3 {z>=2 AND z<=2} t1z 200 - 4 {z>=3 AND z<=3} t1z 100 - 5 {z>=4 AND z<=4} t1z 50 - 6 {z>=-1 AND z<=-1} t1z 50 - 7 {z>1 AND z<3} t1z 200 + 3 {z>=2 AND z<=2} t1z 175 + 4 {z>=3 AND z<=3} t1z 125 + 5 {z>=4 AND z<=4} t1z 1 + 6 {z>=-1 AND z<=-1} t1z 1 + 7 {z>1 AND z<3} t1z 175 8 {z>0 AND z<100} t1z 600 9 {z>=1 AND z<100} t1z 600 10 {z>1 AND z<100} t1z 300 11 {z>=2 AND z<100} t1z 300 - 12 {z>2 AND z<100} t1z 100 - 13 {z>=3 AND z<100} t1z 100 - 14 {z>3 AND z<100} t1z 50 - 15 {z>=4 AND z<100} t1z 50 - 16 {z>=-100 AND z<=-1} t1z 50 + 12 {z>2 AND z<100} t1z 125 + 13 {z>=3 AND z<100} t1z 125 + 14 {z>3 AND z<100} t1z 1 + 15 {z>=4 AND z<100} t1z 1 + 16 {z>=-100 AND z<=-1} t1z 1 17 {z>=-100 AND z<=0} t1z 400 - 18 {z>=-100 AND z<0} t1z 50 + 18 {z>=-100 AND z<0} t1z 1 19 {z>=-100 AND z<=1} t1z 700 20 {z>=-100 AND z<2} t1z 700 - 21 {z>=-100 AND z<=2} t1z 900 - 22 {z>=-100 AND z<3} t1z 900 + 21 {z>=-100 AND z<=2} t1z 875 + 22 {z>=-100 AND z<3} t1z 875 31 {z>=0.0 AND z<=0.0} t1z 400 32 {z>=1.0 AND z<=1.0} t1z 300 - 33 {z>=2.0 AND z<=2.0} t1z 200 - 34 {z>=3.0 AND z<=3.0} t1z 100 - 35 {z>=4.0 AND z<=4.0} t1z 50 - 36 {z>=-1.0 AND z<=-1.0} t1z 50 - 37 {z>1.5 AND z<3.0} t1z 200 - 38 {z>0.5 AND z<100} t1z 600 + 33 {z>=2.0 AND z<=2.0} t1z 175 + 34 {z>=3.0 AND z<=3.0} t1z 125 + 35 {z>=4.0 AND z<=4.0} t1z 1 + 36 {z>=-1.0 AND z<=-1.0} t1z 1 + 37 {z>1.5 AND z<3.0} t1z 174 + 38 {z>0.5 AND z<100} t1z 599 39 {z>=1.0 AND z<100} t1z 600 - 40 {z>1.5 AND z<100} t1z 300 + 40 {z>1.5 AND z<100} t1z 299 41 {z>=2.0 AND z<100} t1z 300 - 42 {z>2.1 AND z<100} t1z 100 - 43 {z>=3.0 AND z<100} t1z 100 - 44 {z>3.2 AND z<100} t1z 50 - 45 {z>=4.0 AND z<100} t1z 50 - 46 {z>=-100 AND z<=-1.0} t1z 50 + 42 {z>2.1 AND z<100} t1z 124 + 43 {z>=3.0 AND z<100} t1z 125 + 44 {z>3.2 AND z<100} t1z 1 + 45 {z>=4.0 AND z<100} t1z 1 + 46 {z>=-100 AND z<=-1.0} t1z 1 47 {z>=-100 AND z<=0.0} t1z 400 - 48 {z>=-100 AND z<0.0} t1z 50 + 48 {z>=-100 AND z<0.0} t1z 1 49 {z>=-100 AND z<=1.0} t1z 700 50 {z>=-100 AND z<2.0} t1z 700 - 51 {z>=-100 AND z<=2.0} t1z 900 - 52 {z>=-100 AND z<3.0} t1z 900 - - 101 {z=-1} t1z 50 - 102 {z=0} t1z 400 - 103 {z=1} t1z 300 - 104 {z=2} t1z 200 - 105 {z=3} t1z 100 - 106 {z=4} t1z 50 - 107 {z=-10.0} t1z 50 + 51 {z>=-100 AND z<=2.0} t1z 875 + 52 {z>=-100 AND z<3.0} t1z 875 + + 101 {z=-1} t1z 1 + 102 {z=0} t1z 400 + 103 {z=1} t1z 300 + 104 {z=2} t1z 175 + 105 {z=3} t1z 125 + 106 {z=4} t1z 1 + 107 {z=-10.0} t1z 1 108 {z=0.0} t1z 400 109 {z=1.0} t1z 300 - 110 {z=2.0} t1z 200 - 111 {z=3.0} t1z 100 - 112 {z=4.0} t1z 50 - 113 {z=1.5} t1z 50 - 114 {z=2.5} t1z 50 + 110 {z=2.0} t1z 175 + 111 {z=3.0} t1z 125 + 112 {z=4.0} t1z 1 + 113 {z=1.5} t1z 1 + 114 {z=2.5} t1z 1 - 201 {z IN (-1)} t1z 50 + 201 {z IN (-1)} t1z 1 202 {z IN (0)} t1z 400 203 {z IN (1)} t1z 300 - 204 {z IN (2)} t1z 200 - 205 {z IN (3)} t1z 100 - 206 {z IN (4)} t1z 50 - 207 {z IN (0.5)} t1z 50 + 204 {z IN (2)} t1z 175 + 205 {z IN (3)} t1z 125 + 206 {z IN (4)} t1z 1 + 207 {z IN (0.5)} t1z 1 208 {z IN (0,1)} t1z 700 - 209 {z IN (0,1,2)} t1z 900 + 209 {z IN (0,1,2)} t1z 875 210 {z IN (0,1,2,3)} {} 100 211 {z IN (0,1,2,3,4,5)} {} 100 - 212 {z IN (1,2)} t1z 500 + 212 {z IN (1,2)} t1z 475 213 {z IN (2,3)} t1z 300 214 {z=3 OR z=2} t1z 300 - 215 {z IN (-1,3)} t1z 150 - 216 {z=-1 OR z=3} t1z 150 + 215 {z IN (-1,3)} t1z 126 + 216 {z=-1 OR z=3} t1z 126 - 300 {y=0} {} 100 - 301 {y=1} t1y 50 - 302 {y=0.1} t1y 50 + 300 {y=0} t1y 974 + 301 {y=1} t1y 26 + 302 {y=0.1} t1y 1 400 {x IS NULL} t1x 400 } { # Verify that the expected index is used with the expected row count @@ -202,27 +190,29 @@ } # Verify that range queries generate the correct row count estimates # foreach {testid where index rows} { - 500 {x IS NULL AND u='charlie'} t1u 20 - 501 {x=1 AND u='charlie'} t1x 5 - 502 {x IS NULL} {} 100 - 503 {x=1} t1x 50 - 504 {x IS NOT NULL} t1x 25 + 500 {x IS NULL AND u='charlie'} t1u 17 + 501 {x=1 AND u='charlie'} t1x 1 + 502 {x IS NULL} t1x 995 + 503 {x=1} t1x 1 + 504 {x IS NOT NULL} t1x 2 505 {+x IS NOT NULL} {} 500 506 {upper(x) IS NOT NULL} {} 500 } { # Verify that the expected index is used with the expected row count +if {$testid==50299} {breakpoint; set sqlite_where_trace 1} do_test analyze5-1.${testid}a { set x [lindex [eqp "SELECT * FROM t1 WHERE $where"] 3] set idx {} regexp {INDEX (t1.) } $x all idx regexp {~([0-9]+) rows} $x all nrow list $idx $nrow } [list $index $rows] +if {$testid==50299} exit # Verify that the same result is achieved regardless of whether or not # the index is used do_test analyze5-1.${testid}b { set w2 [string map {y +y z +z} $where] Index: test/analyze6.test ================================================================== --- test/analyze6.test +++ test/analyze6.test @@ -15,11 +15,11 @@ # set testdir [file dirname $argv0] source $testdir/tester.tcl -ifcapable !stat2 { +ifcapable !stat3 { finish_test return } set testprefix analyze6 Index: test/analyze7.test ================================================================== --- test/analyze7.test +++ test/analyze7.test @@ -80,32 +80,34 @@ execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE b=123;} } {0 0 0 {SEARCH TABLE t1 USING INDEX t1b (b=?) (~10 rows)}} do_test analyze7-3.2.1 { execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE c=?;} } {0 0 0 {SEARCH TABLE t1 USING INDEX t1cd (c=?) (~86 rows)}} -ifcapable stat2 { - # If ENABLE_STAT2 is defined, SQLite comes up with a different estimated +ifcapable stat3 { + # If ENABLE_STAT3 is defined, SQLite comes up with a different estimated # row count for (c=2) than it does for (c=?). do_test analyze7-3.2.2 { execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE c=2;} - } {0 0 0 {SEARCH TABLE t1 USING INDEX t1cd (c=?) (~51 rows)}} + } {0 0 0 {SEARCH TABLE t1 USING INDEX t1cd (c=?) (~57 rows)}} } else { - # If ENABLE_STAT2 is not defined, the expected row count for (c=2) is the + # If ENABLE_STAT3 is not defined, the expected row count for (c=2) is the # same as that for (c=?). do_test analyze7-3.2.3 { execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE c=2;} } {0 0 0 {SEARCH TABLE t1 USING INDEX t1cd (c=?) (~86 rows)}} } do_test analyze7-3.3 { execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a=123 AND b=123} } {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} -do_test analyze7-3.4 { - execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE c=123 AND b=123} -} {0 0 0 {SEARCH TABLE t1 USING INDEX t1b (b=?) (~2 rows)}} -do_test analyze7-3.5 { - execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a=123 AND c=123} -} {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} +ifcapable {!stat3} { + do_test analyze7-3.4 { + execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE c=123 AND b=123} + } {0 0 0 {SEARCH TABLE t1 USING INDEX t1b (b=?) (~2 rows)}} + do_test analyze7-3.5 { + execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a=123 AND c=123} + } {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} +} do_test analyze7-3.6 { execsql {EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE c=123 AND d=123 AND b=123} } {0 0 0 {SEARCH TABLE t1 USING INDEX t1cd (c=? AND d=?) (~1 rows)}} finish_test ADDED test/analyze8.test Index: test/analyze8.test ================================================================== --- /dev/null +++ test/analyze8.test @@ -0,0 +1,103 @@ +# 2011 August 13 +# +# The author disclaims copyright to this source code. In place of +# a legal notice, here is a blessing: +# +# May you do good and not evil. +# May you find forgiveness for yourself and forgive others. +# May you share freely, never taking more than you give. +# +#*********************************************************************** +# +# This file implements tests for SQLite library. The focus of the tests +# in this file is testing the capabilities of sqlite_stat3. +# + +set testdir [file dirname $argv0] +source $testdir/tester.tcl + +ifcapable !stat3 { + finish_test + return +} + +set testprefix analyze8 + +proc eqp {sql {db db}} { + uplevel execsql [list "EXPLAIN QUERY PLAN $sql"] $db +} + +# Scenario: +# +# Two indices. One has mostly singleton entries, but for a few +# values there are hundreds of entries. The other has 10-20 +# entries per value. +# +# Verify that the query planner chooses the first index for the singleton +# entries and the second index for the others. +# +do_test 1.0 { + db eval { + CREATE TABLE t1(a,b,c,d); + CREATE INDEX t1a ON t1(a); + CREATE INDEX t1b ON t1(b); + CREATE INDEX t1c ON t1(c); + } + for {set i 0} {$i<1000} {incr i} { + if {$i%2==0} {set a $i} {set a [expr {($i%8)*100}]} + set b [expr {$i/10}] + set c [expr {$i/8}] + set c [expr {$c*$c*$c}] + db eval {INSERT INTO t1 VALUES($a,$b,$c,$i)} + } + db eval {ANALYZE} +} {} + +# The a==100 comparison is expensive because there are many rows +# with a==100. And so for those cases, choose the t1b index. +# +# Buf ro a==99 and a==101, there are far fewer rows so choose +# the t1a index. +# +do_test 1.1 { + eqp {SELECT * FROM t1 WHERE a=100 AND b=55} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1b (b=?) (~2 rows)}} +do_test 1.2 { + eqp {SELECT * FROM t1 WHERE a=99 AND b=55} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} +do_test 1.3 { + eqp {SELECT * FROM t1 WHERE a=101 AND b=55} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} +do_test 1.4 { + eqp {SELECT * FROM t1 WHERE a=100 AND b=56} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1b (b=?) (~2 rows)}} +do_test 1.5 { + eqp {SELECT * FROM t1 WHERE a=99 AND b=56} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} +do_test 1.6 { + eqp {SELECT * FROM t1 WHERE a=101 AND b=56} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1a (a=?) (~1 rows)}} +do_test 2.1 { + eqp {SELECT * FROM t1 WHERE a=100 AND b BETWEEN 50 AND 54} +} {0 0 0 {SEARCH TABLE t1 USING INDEX t1b (b>? AND b? AND b? AND c? AND c?" Index: tool/warnings.sh ================================================================== --- tool/warnings.sh +++ tool/warnings.sh @@ -7,13 +7,13 @@ make sqlite3.c-debug echo '********** No optimizations. Includes FTS4 and RTREE *********' gcc -c -Wshadow -Wall -Wextra -pedantic-errors -Wno-long-long -std=c89 \ -ansi -DHAVE_STDINT_H -DSQLITE_ENABLE_FTS4 -DSQLITE_ENABLE_RTREE \ sqlite3.c -echo '********** No optimizations. ENABLE_STAT2. THREADSAFE=0 *******' +echo '********** No optimizations. ENABLE_STAT3. THREADSAFE=0 *******' gcc -c -Wshadow -Wall -Wextra -pedantic-errors -Wno-long-long -std=c89 \ - -ansi -DSQLITE_ENABLE_STAT2 -DSQLITE_THREADSAFE=0 \ + -ansi -DSQLITE_ENABLE_STAT3 -DSQLITE_THREADSAFE=0 \ sqlite3.c echo '********** Optimized -O3. Includes FTS4 and RTREE ************' gcc -O3 -c -Wshadow -Wall -Wextra -pedantic-errors -Wno-long-long -std=c89 \ -ansi -DHAVE_STDINT_H -DSQLITE_ENABLE_FTS4 -DSQLITE_ENABLE_RTREE \ sqlite3.c