Given a single SQL statement, there might be dozens, hundreds, or even thousands of ways to implement that statement, depending on the complexity of the statement itself and of the underlying database schema. The task of the query planner is to select the algorithm that minimizes disk I/O and CPU overhead.

Additional background information is available in the indexing tutorial document.

With release 3.8.0 (2013-08-26), the SQLite query planner was reimplemented as the Next Generation Query Planner or "NGQP". All of the features, techniques, and algorithms described in this document are applicable to both the pre-3.8.0 legacy query planner and to the NGQP. For further information on how the NGQP differs from the legacy query planner, see the detailed description of the NGQP.

2. WHERE Clause Analysis

The WHERE clause on a query is broken up into "terms" where each term is separated from the others by an AND operator. If the WHERE clause is composed of constraints separated by the OR operator then the entire clause is considered to be a single "term" to which the OR-clause optimization is applied.

All terms of the WHERE clause are analyzed to see if they can be satisfied using indexes. To be usable by an index a term must usually be of one of the following forms:


  column = expression
  column IS expression
  column > expression
  column >= expression
  column < expression
  column <= expression
  expression = column
  expression > column
  expression >= column
  expression < column
  expression <= column
  column IN (expression-list)
  column IN (subquery)
  column IS NULL
  column LIKE pattern
  column GLOB pattern

If an index is created using a statement like this:

CREATE INDEX idx_ex1 ON ex1(a,b,c,d,e,...,y,z);

Then the index might be used if the initial columns of the index (columns a, b, and so forth) appear in WHERE clause terms. The initial columns of the index must be used with the = or IN or IS operators. The right-most column that is used can employ inequalities. For the right-most column of an index that is used, there can be up to two inequalities that must sandwich the allowed values of the column between two extremes.

It is not necessary for every column of an index to appear in a WHERE clause term in order for that index to be used. However, there cannot be gaps in the columns of the index that are used. Thus for the example index above, if there is no WHERE clause term that constrains column c, then terms that constrain columns a and b can be used with the index but not terms that constrain columns d through z. Similarly, index columns will not normally be used (for indexing purposes) if they are to the right of a column that is constrained only by inequalities. (See the skip-scan optimization below for the exception.)

In the case of indexes on expressions, whenever the word "column" is used in the foregoing text, one can substitute "indexed expression" (meaning a copy of the expression that appears in the CREATE INDEX statement) and everything will work the same.