SQLite

Check-in [b1dceef050]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Updates to the query optimizer overview document. (CVS 2651)
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: b1dceef0508ffe20ab2ff8fa5e5b5a44f4f224aa
User & Date: drh 2005-08-31 13:48:35.000
Context
2005-08-31
18:20
{quote: KeyInfo} generation moved to a common subroutine. (CVS 2652) (check-in: a25801df06 user: drh tags: trunk)
13:48
Updates to the query optimizer overview document. (CVS 2651) (check-in: b1dceef050 user: drh tags: trunk)
13:13
Explicit typecasts to silence nuisance compiler warnings. Ticket #1398. (CVS 2650) (check-in: 90712ea727 user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to www/optoverview.tcl.
1
2
3
4
5
6
7
8
9
10
11
#
# Run this TCL script to generate HTML for the goals.html file.
#
set rcsid {$Id: optoverview.tcl,v 1.2 2005/08/31 03:13:12 drh Exp $}
source common.tcl
header {The SQLite Query Optimizer Overview}

proc CODE {text} {
  puts "<blockquote><pre>"
  puts $text
  puts "</pre></blockquote>"



|







1
2
3
4
5
6
7
8
9
10
11
#
# Run this TCL script to generate HTML for the goals.html file.
#
set rcsid {$Id: optoverview.tcl,v 1.3 2005/08/31 13:48:35 drh Exp $}
source common.tcl
header {The SQLite Query Optimizer Overview}

proc CODE {text} {
  puts "<blockquote><pre>"
  puts $text
  puts "</pre></blockquote>"
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74

75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
  puts "<h$n>$num $name</h$n>"
}

HEADING 0 {The SQLite Query Optimizer Overview}

PARAGRAPH {
  This document provides a terse overview of how the query optimizer
  for SQLite works.  This is not a tutorial.  Some prior knowledge of how
  database engines operate is likely needed in order to fully understand
  this text.
}

HEADING 1 {WHERE clause analysis} where_clause

PARAGRAPH {
  The WHERE clause on a query is broken up into "terms" where each term
  is separated from the others by an AND operator.
  Parentheses are ignored when dividing the WHERE clause into terms.
}
PARAGRAPH {
  All terms of the WHERE clause are analyzed.

  Terms that cannot be satisfied through the use of indices become
  tests that are evaluated against each row of the relevant input
  tables.  No tests are done for terms that are completely satisfied by
  indices.  Sometimes
  one or more terms will provide hints to indices but still must be
  evaluated against each row of the input tables.
}

PARAGRAPH {
  Sometimes the analysis of a term will cause new "virtual" terms to
  be added to the WHERE clause.  Virtual terms can be used with
  indices to restrict a search.  But virtual terms never generate code
  that is tested against input rows.
}

PARAGRAPH {
  In order be used by an index, a term must be of one of the following







|
|
|







<


|
>









|







54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70

71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
  puts "<h$n>$num $name</h$n>"
}

HEADING 0 {The SQLite Query Optimizer Overview}

PARAGRAPH {
  This document provides a terse overview of how the query optimizer
  for SQLite works.  This is not a tutorial.  The reader is likely to
  need some prior knowledge of how database engines operate 
  in order to fully understand this text.
}

HEADING 1 {WHERE clause analysis} where_clause

PARAGRAPH {
  The WHERE clause on a query is broken up into "terms" where each term
  is separated from the others by an AND operator.

}
PARAGRAPH {
  All terms of the WHERE clause are analyzed to see if they can be
  satisfied using indices.
  Terms that cannot be satisfied through the use of indices become
  tests that are evaluated against each row of the relevant input
  tables.  No tests are done for terms that are completely satisfied by
  indices.  Sometimes
  one or more terms will provide hints to indices but still must be
  evaluated against each row of the input tables.
}

PARAGRAPH {
  The analysis of a term might cause new "virtual" terms to
  be added to the WHERE clause.  Virtual terms can be used with
  indices to restrict a search.  But virtual terms never generate code
  that is tested against input rows.
}

PARAGRAPH {
  In order be used by an index, a term must be of one of the following
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
  the *=* or *IN* operators except for
  the right-most column which can use inequalities.  For the right-most
  column of an index that is used, there can be up to two inequalities
  that must sandwich the allowed values of the column between two extremes.
}
PARAGRAPH {
  It is not necessary for every column of an index to appear in a
  WHERE clause term in order for that index to be used.  In fact, there
  are sometimes advantages if this is not the case.
  But there can not be gaps in the columns of the index that are used.
  Thus for the example index above, if there is no WHERE clause term
  that constraints column c, then terms that constraint columns a and b can
  be used with the index but not terms that constraint columns d through z.
  Similarly, no index column will be used (for indexing purposes)
  that is to the right of a 
  column that is constrained only by inequalities.
  Thus for the index above and WHERE clause like this:
}
CODE {
  ... WHERE a=5 AND b IN (1,2,3) AND c>12 AND d='hello'
}
PARAGRAPH {
  Only columns a, b, and c of the index would be usable.  The d column
  would not be usable because it occurs to the right of c and c is







|
<







|







118
119
120
121
122
123
124
125

126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
  the *=* or *IN* operators except for
  the right-most column which can use inequalities.  For the right-most
  column of an index that is used, there can be up to two inequalities
  that must sandwich the allowed values of the column between two extremes.
}
PARAGRAPH {
  It is not necessary for every column of an index to appear in a
  WHERE clause term in order for that index to be used. 

  But there can not be gaps in the columns of the index that are used.
  Thus for the example index above, if there is no WHERE clause term
  that constraints column c, then terms that constraint columns a and b can
  be used with the index but not terms that constraint columns d through z.
  Similarly, no index column will be used (for indexing purposes)
  that is to the right of a 
  column that is constrained only by inequalities.
  For the index above and WHERE clause like this:
}
CODE {
  ... WHERE a=5 AND b IN (1,2,3) AND c>12 AND d='hello'
}
PARAGRAPH {
  Only columns a, b, and c of the index would be usable.  The d column
  would not be usable because it occurs to the right of c and c is
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
  PRAGMA case_sensitive_like=ON;
}
PARAGRAPH {
  Then the LIKE operator pays attention to case and the example above would
  evaluate to false.  Note that case insensitivity only applies to
  latin1 characters - basically the upper and lower case letters of English
  in the lower 127 byte codes of ASCII.  International character sets
  are always case sensitive in SQLite unless a user-supplied collating
  sequence is used.  But if you employ a user-supplied collating sequence,
  the LIKE optimization describe here will never be taken.
}
PARAGRAPH {
  The LIKE operator is case insensitive by default because this is what
  the SQL standard requires.  You can change the default behavior at
  compile time by using the -DSQLITE_CASE_SENSITIVE_LIKE command-line option







|







232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
  PRAGMA case_sensitive_like=ON;
}
PARAGRAPH {
  Then the LIKE operator pays attention to case and the example above would
  evaluate to false.  Note that case insensitivity only applies to
  latin1 characters - basically the upper and lower case letters of English
  in the lower 127 byte codes of ASCII.  International character sets
  are case sensitive in SQLite unless a user-supplied collating
  sequence is used.  But if you employ a user-supplied collating sequence,
  the LIKE optimization describe here will never be taken.
}
PARAGRAPH {
  The LIKE operator is case insensitive by default because this is what
  the SQL standard requires.  You can change the default behavior at
  compile time by using the -DSQLITE_CASE_SENSITIVE_LIKE command-line option
385
386
387
388
389
390
391

392
393
394
395
396
397
398
399
  updates to the sqlite_stat1 table (except by running ANALYZE) is
  not recommended.
}
PARAGRAPH {
  Terms of the WHERE clause can be manually disqualified for use with
  indices by prepending a unary *+* operator to the column name.  The
  unary *+* is a no-op and will not slow down the evaluation of the test

  in any way.  But it will prevent that term from constraining an index.
  So, in the example above, if the query were rewritten as:
}
CODE {
  SELECT z FROM ex2 WHERE +x=5 AND y=6;
}
PARAGRAPH {
  The *+* operator on the *x* column would prevent that term from 







>
|







384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
  updates to the sqlite_stat1 table (except by running ANALYZE) is
  not recommended.
}
PARAGRAPH {
  Terms of the WHERE clause can be manually disqualified for use with
  indices by prepending a unary *+* operator to the column name.  The
  unary *+* is a no-op and will not slow down the evaluation of the test
  specified by the term.
  But it will prevent the term from constraining an index.
  So, in the example above, if the query were rewritten as:
}
CODE {
  SELECT z FROM ex2 WHERE +x=5 AND y=6;
}
PARAGRAPH {
  The *+* operator on the *x* column would prevent that term from 
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
HEADING 1 {ORDER BY optimizations} order_by

PARAGRAPH {
  SQLite attempts to use an index to satisfy the ORDER BY clause of a
  query when possible.
  When faced with the choice of using an index to satisfy WHERE clause
  constraints or satisfying an ORDER BY clause, SQLite does the same
  work analysis that it normally does when choosing between two indices
  and selects that one that it believes will result in the fastest answer.
  Usually this means satisfying the WHERE clause constraint.
}

HEADING 1 {Subquery flattening} flattening

PARAGRAPH {
  When a subquery occurs in the FROM clause of a SELECT, the default
  behavior is to evaluate the subquery into a transient table, then run







|
|
|







418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
HEADING 1 {ORDER BY optimizations} order_by

PARAGRAPH {
  SQLite attempts to use an index to satisfy the ORDER BY clause of a
  query when possible.
  When faced with the choice of using an index to satisfy WHERE clause
  constraints or satisfying an ORDER BY clause, SQLite does the same
  work analysis described in section 6.0
  and chooses the index that it believes will result in the fastest answer.

}

HEADING 1 {Subquery flattening} flattening

PARAGRAPH {
  When a subquery occurs in the FROM clause of a SELECT, the default
  behavior is to evaluate the subquery into a transient table, then run
500
501
502
503
504
505
506
507
508
  SELECT MAX(x) FROM table;
}
PARAGRAPH {
  In order for these optimizations to occur, they must appear in exactly
  the form shown above - changing only the name of the table and column.
  It is not permissible to add a WHERE clause or do any arithmetic on the
  result.  The result set must contain a single column.
  And the column in the MIN or MAX function must be an indexed column.
}







|

500
501
502
503
504
505
506
507
508
  SELECT MAX(x) FROM table;
}
PARAGRAPH {
  In order for these optimizations to occur, they must appear in exactly
  the form shown above - changing only the name of the table and column.
  It is not permissible to add a WHERE clause or do any arithmetic on the
  result.  The result set must contain a single column.
  The column in the MIN or MAX function must be an indexed column.
}