Documentation Source Text

Check-in [828ca81aae]

Overview
Comment: Further improvement and expansion of the NGQP documentation, including adding a checklist for preventing and dealing with bad choices by the query planner.
SHA1: 828ca81aaeb7d4c32381e17aefe08b8396223e0b
User & Date: drh 2013-06-25 16:31:40.016
Context
2013-06-26
14:35
Updates to the changes.html page. Fix typos on the NGQP documentation. (check-in: e0c107dec3 user: drh tags: trunk)
2013-06-25
16:31
Further improvement and expansion of the NGQP documentation, including adding a checklist for preventing and dealing with bad choices by the query planner. (check-in: 828ca81aae user: drh tags: trunk)
01:44
Second draft of the NGQP document. (check-in: e09abae6c3 user: drh tags: trunk)
Changes
Changes to pages/queryplanner-ng.in.
  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
hd_keywords {next generation query planner}
</tcl>
<h1 align="center">The Next Generation Query Planner</h1>

<h2>1.0 Introduction</h2>

<p>
The task of the "query planner" is to figure
out the best algorithm or "query plan" to accomplish an SQL statement.
Beginning with SQLite version 3.8.0, the query planner component has been
rewritten so that it runs faster and generates better plans.  The
rewrite is called the "next generation query planner" or "NGQP".
</p>

<p>This article overviews the importance of query planning, describes some
of the problems inherent to query planning, and outlines how the NGQP
solves those problems.</p>

<p>The NGQP is almost always better than the legacy query planner.
However, there may exist legacy applications that unknowingly depend on 
undefined and/or suboptimal behavior in the legacy query planner, and
upgrading to the NGQP on those legacy applications could cause performance
regressions.  This risk is considered and a checklist is provided
for reducing the risk and fixing any issues that do arise.</p>

<h2>2.0 Background</h2>

<p>For simple queries against a single table with few indices, there is usually
an obvious choice for the best algorithm.
But for larger and more complex queries, such as
multi-way joins with many indices
and subqueries, there can be hundreds, thousands, or millions
of reasonable algorithms for computing the result.
The job of the query planner is to choose a single "best" query plan from
this multitude of possibilities.</p>

<p>The existence of a query planner is what makes SQL database engines
so amazingly useful and powerful.
The query planner frees the programmer from the chore of selecting
a particular query plan, and thereby allows the programmer to
focus more mental energy on higher-level application issues and on 
providing more value to the end user.  For simple queries where the choice
of query plans is obvious, this is not of huge importance.
But as applications and schemas and queries grow more complex, a
clever query planner can greatly speed and simplify the work of application
development. 
There is amazing power in being able to tell
the database engine what content is desired, and then let the database
engine work out the best way to retrieve that content.</p>

<p>Writing a good query planner is an inexact science.
The query planner must work with incomplete information.
It cannot determine how long any particular plan will take
without actually running that plan.  So when comparing two
or more plans to figure out which is "best", the query planner has to make
some guesses and assumptions, and those guesses and assumptions will 
sometimes be wrong. A good query planner is one that will
find the correct solution often enough that application
programmers only rarely need to get involved.</p>

<h3>2.1 Query Planning In SQLite</h3>

<p>SQLite computes joins using nested loops, 
one loop for each table
in the join.  (Additional loops might be inserted to handle IN operators
in the WHERE clause, but those extra loops are not considered here.)
One or more indices might be used on each loop to speed the search,
or a loop might be a "full table scan" using only the original table.</p>
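
<p>For example, consider a two-way join between two hypothetical
tables.  (This is only an illustrative sketch; the table names, the
index, and the commented-out plan are invented for this example and
are not taken from any particular SQLite release.)</p>

<blockquote><pre>
CREATE TABLE artist(artistid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE track(artistid INTEGER, title TEXT);
CREATE INDEX track_i1 ON track(artistid);

SELECT artist.name, track.title
  FROM artist JOIN track ON track.artistid=artist.artistid;

-- One plausible plan, written as nested loops:
--   for each row of artist:                       -- outer loop: table scan
--     for each row of track
--         with track.artistid=artist.artistid:    -- inner loop: index track_i1
--       output one row of the result
</pre></blockquote>

<p>Thus query planning decomposes into two subtasks:</p>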
<ol>
then SQLite will not suddenly decide to start using a new and different
query plan once your application is in the field, and thereby possibly
cause a performance problem.  If your application works in the lab, it
will continue working the same way after deployment.</p>

<p>Enterprise-class client/server SQL database engines do not normally 
make this guarantee.
In client/server SQL database engines, the server keeps track of
statistics on the sizes of tables and on the quality of indices 
and the query planner uses those statistics to help select the best plans.
As content is added, deleted, or changed in the database, the statistics 
will evolve and may cause the query planner to begin using a different 
query plan for some particular query.  Usually the new plan will be better 
for the evolving structure of the data.  But sometimes the new query plan will
cause a performance reduction.  With a client/server database engine, there
is typically a Database Administrator (DBA) on hand to deal with these 
rare problems as they come up.  But DBAs are not available to fix problems 
in an embedded database like SQLite, and hence SQLite is careful to
ensure that plans do not change unexpectedly after deployment.</p>

<p>This stability guarantee applies equally to the legacy query planner in
SQLite and to the NGQP.</p>

<p>It is important to note that changing versions of SQLite might cause
changes in query plans, as new enhancements are added to the query planner.
The same version of SQLite will
<img src="images/qp/tpchq8.gif">
</center>

<p>
In the diagram, each of the 8 tables in the FROM clause of the query is
identified by a large circle with the label of the FROM-clause term:
N2, S, L, P, O, C, N1 and R.
The arcs in the graph represent the estimated cost of computing each term 
assuming that the origin of the arc is in an outer loop.  For example, the
cost of running the S loop as an inner loop to L is 2.30 whereas the
cost of running the S loop as an outer loop to L is 9.17.</p>

<p>The "cost" here is logarithmic.  With nested loops, the work
is multiplied, not added.  But it is customary to think of graphs
with additive weights and so the graph shows the logarithm of the
various costs.  The graph shows a cost advantage of S being inside of
result.  One can think of the *-costs as a short-hand notation indicating
multiple arcs with that cost, one from each of the other nodes in the
graph.  The graph is therefore "complete", meaning that there are arcs
(some explicit and some implied) in both directions between every pair of 
nodes in the graph.</p>

<p>The problem of finding the best query plan is equivalent to finding
a minimum-cost path through the graph that visits each node
exactly once.</p>

<p>(Side note:  The cost estimates in the TPC-H Q8 graph were computed
by the query planner in SQLite 3.7.16 and converted using a base-10 logarithm.)
</p>

<h3>3.2 Complications</h3>

<p>The presentation of the query planner problem above is a simplification.
The costs are estimates.  We cannot
know what the true cost of running a loop is until we actually run the loop.
SQLite makes guesses for the cost of running a loop based on
<p>Unfortunately, the query plan computed by NN for TPC-H Q8 is not optimal.
The plan computed using NN is R-N1-N2-S-C-O-L-P with a cost of 36.92.  
The notation
in the previous sentence means that the R table is run in the outer loop,
N1 is in the next inner loop, N2 is in the third loop, and so forth down
to P which is in the inner-most loop.  The shortest path through the
graph (as found via exhaustive search) is  P-L-O-C-N1-R-S-N2
with a cost of 27.38.  The difference might not seem like much, but 
remember that
the costs are logarithmic, so the shortest path is nearly 14,000 times
faster than that path found using the NN heuristic.</p>

<p>One solution to this problem is to change SQLite to do an exhaustive
search for the best path.  But an exhaustive search requires time 
proportional to
K! (where K is the number of tables in the join) and so when you get 
beyond a 10-way join, the time
to run [sqlite3_prepare()] becomes very large.</p>

<h3>3.4 The N Nearest Neighbors or "N3" Heuristic</h3>

<p>The NGQP uses a new heuristic for seeking the best path through the
graph: "N Nearest Neighbors" (hereafter "N3").  With N3, instead of
choosing just one nearest neighbor for each step, the algorithm keeps
track of the N best paths at each step for some small integer N.</p>

<p>Suppose N=4.  Then for the TPC-H Q8 graph, the first step finds
the four shortest paths to visit any single node in the graph:

<blockquote>
    R (cost: 3.56) <br>
    N1 (cost: 5.52) <br>
    N2 (cost: 5.52) <br>
    P (cost: 7.71) <br>
    R-N2-S (cost: 15.08) <br>
</blockquote>

<p>And so forth.  There are 8 nodes in the TPC-H Q8 query, 
so this process repeats a total of 8
times.  In the general case of a K-way join, the storage requirement
is O(N) and the computation time is O(K*N), which is significantly faster
than the O(2<small><sup>K</sup></small>) exact solution.</p>

<p>But what value to choose for N?  One might try N=K.  This makes the
algorithm O(K<small><sup>2</sup></small>) 
which is actually still quite efficient, since the
maximum value of K is 64 and rarely exceeds 10.  
But that is not enough for the TPC-H Q8
problem.  With N=8 on TPC-H Q8 the N3 algorithm finds 
the solution R-N1-C-O-L-S-N2-P with a cost of 29.78.  
That is a big improvement over NN, but it is still
not optimal.  N3 finds the optimal solution for TPC-H Q8 
when N is 10 or greater.</p>

<p>The initial implementation of NGQP chooses N=1 for simple queries, N=5
for two-way joins and N=10 for all joins with three or more tables.  This
formula for selecting N might change in subsequent releases.</p>

<tcl>hd_fragment hazards {hazards of upgrading to the NGQP}</tcl>
<h2>4.0 Hazards Of Upgrading To NGQP</h2>

<p>For most applications, upgrading from the legacy query planner to the NGQP
requires little thought or effort.
Simply replace the older SQLite version with the newer version of SQLite 
and recompile and the application will run faster.  
There are no API changes nor modifications
to compilation procedures.</p>

<p>But as with any query planner change, upgrading to the NGQP does carry
a small risk of introducing performance regressions.  The problem here is
not that the NGQP is incorrect or buggy or inferior to the legacy query
planner.  Given accurate cost estimates, the NGQP will always pick a better
plan than older query planners.  The problem is that cost estimates might
sometimes be inaccurate and the legacy query planner just happened to stumble 
over a good plan by stupid luck while the NGQP is not as lucky.</p>

<h3>4.1 Case Study: Upgrading Fossil to the NGQP</h3>

<p>The <a href="http://www.fossil-scm.org/">Fossil DVCS</a> is the version
control system used to track all of the SQLite source code.
A Fossil repository is an SQLite database file.
(Readers are invited to ponder this recursion as an independent exercise.)
<p><ol>
<li> The check-in has a "trunk" tag.
<li> The check-in has a child that has a "trunk" tag.
<li> The check-in has a parent that has a "trunk" tag.
</ol></p>

<p>
The first condition causes all of the trunk check-ins to be displayed and
the second and third cause check-ins that merge into or are derived from
the trunk to also be included.
The three conditions are implemented by the three OR-connected
EXISTS statements in the WHERE clause of the query.
The slowdown that occurred with the NGQP was caused by the second and
third conditions.  The problem is the same in each, so we will examine
just the second condition.
The subquery of the second condition can be rewritten (with minor
and immaterial simplifications) as follows:</p>

<blockquote><pre>
SELECT 1
  FROM plink JOIN tagxref ON tagxref.rid=plink.cid
 WHERE tagxref.tagid=$trunk
   AND plink.pid=$ckid;
</pre></blockquote>

<p>The PLINK table holds parent-child relationships between
check-ins.  The TAGXREF table maps tags into check-ins.
For reference, the relevant portions of the schemas
for these two tables are shown here:</p>

<blockquote><pre>
CREATE TABLE plink(
  pid INTEGER REFERENCES blob,
  cid INTEGER REFERENCES blob
);
CREATE UNIQUE INDEX plink_i1 ON plink(pid,cid);

CREATE TABLE tagxref(
  tagid INTEGER REFERENCES tag,
  mtime TIMESTAMP,
  rid INTEGER REFERENCES blob,
  UNIQUE(rid, tagid)
);
CREATE INDEX tagxref_i1 ON tagxref(tagid, mtime);
</pre></blockquote>

<p>There are only two reasonable ways to implement this query.
(There are many other possible algorithms, but none of the
others are contenders for being the "best" algorithm.)</p>


<ol type="1">
<li value=1><p>
Find all children of check-in $ckid and test each one to see if
it has the $trunk tag.
<li value=2><p>
Find all check-ins with the $trunk tag and test each one to see if
it is a child of $ckid.
</ol>
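
<p>Expressed as nested loops (a sketch only, not the actual code
generated by SQLite), the two algorithms differ only in which table
occupies the outer loop:</p>

<blockquote><pre>
-- algorithm-1: PLINK in the outer loop
--   for each row of plink with pid=$ckid:        -- via plink_i1; few children
--     look for a tagxref row with rid=plink.cid
--                            and tagid=$trunk    -- via UNIQUE(rid,tagid)

-- algorithm-2: TAGXREF in the outer loop
--   for each row of tagxref with tagid=$trunk:   -- via tagxref_i1; many rows
--     look for a plink row with pid=$ckid
--                           and cid=tagxref.rid  -- via plink_i1
</pre></blockquote>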

<p>
Intuitively, we humans understand that algorithm-1 is best.
Each check-in is likely to have few children (one child is
the most common case) and each child can be tested for the
$trunk tag in logarithmic time.  Indeed, algorithm-1 is the
faster choice in practice.  But the NGQP has no intuition.  The
NGQP must use hard math, and algorithm-2 is slightly
better mathematically.  This is because, in the absence of other information,
the NGQP must assume that the indices PLINK_I1 and TAGXREF_I1 are of
equal quality and are equally selective.  Algorithm-2 uses one field
of the TAGXREF_I1 index and both fields of the PLINK_I1 index whereas
<p>
Unfortunately, algorithm-2 is slower than algorithm-1 in
this application.
</p>

<p>
The problem is that the indices are not of equal quality.
A check-in is likely to only have one child.  So the first
field of PLINK_I1 will usually narrow down the search to just a single
row.  But there are thousands and thousands of check-ins tagged with "trunk", 
so the first field of TAGXREF_I1 will be
of little help in narrowing down the search.
</p>

<p>
The NGQP has no way of knowing that TAGXREF_I1 is almost useless in this
query, unless [ANALYZE] has been run on the database.  The [ANALYZE] command
gathers statistics on the quality of the various indices and stores the
result in the SQLITE_STAT1 table.  Having access to this statistical information,
the NGQP easily chooses algorithm-1 as the best algorithm, by a wide
margin.</p>

<p>Why didn't the legacy query planner choose algorithm-2?
Easy: because the NN algorithm
never even considered algorithm-2.  Graphs of the planning
problem look like this:</p>

<center>
<img src="images/qp/fqp1.gif">
</center>

<p>
In the "without ANALYZE" case on the left, the NN algorithm chooses 
loop P (PLINK) as the outer loop because 4.9 is less than 5.2, resulting
in path P-T, which is algorithm-1. NN only looks at the single best choice
at each step so it completely misses the fact that 
5.2+4.4 makes a slightly cheaper plan than 4.9+4.8. But the N3 algorithm
keeps track of the 5 best paths for a 2-way join, so it ends up
selecting path T-P because of its slightly lower overall cost.
Path T-P is algorithm-2.
</p>

<p>
Note that with ANALYZE the cost estimates are
better aligned with reality and algorithm-1 is 
selected by both NN and N3.
</p>

<p>(Side note:  The cost estimates in the two most recent graphs 
were computed by the NGQP using a base-2 logarithm and slightly different
cost assumptions compared to the legacy query planner.  
Hence, the cost estimates in
these latter two graphs are not directly comparable to the cost estimates
in the TPC-H Q8 graph.)</p>

<h3>4.2 Fixing The Problem</h3>

<p>Running [ANALYZE] on the repository database immediately fixed the
performance problem.  However, we want Fossil to be robust and to always
work quickly regardless of whether or not its repository has been analyzed.
For this reason, the query was modified to use the CROSS JOIN operator 
instead of simply JOIN.
(<a href="http://www.fossil-scm.org/fossil/fdiff?v1=c498d980579e18f8&v2=2a35f6ffed42e024&sbs=1">The
difference</a>)
SQLite will not reorder the tables of a CROSS JOIN.
This is a long-standing feature of SQLite that is specifically designed
to allow knowledgeable programmers
to enforce a particular loop nesting order.  Once the join
was changed to CROSS JOIN (the addition of a single keyword) the NGQP was
forced to choose the faster algorithm-1 regardless of whether or not
statistical information had been gathered using ANALYZE.</p>
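
<p>Applied to the subquery shown in section 4.1, the fix amounts to
the addition of that single CROSS keyword.  (The following is a sketch;
follow the "difference" link above for the actual edit.)</p>

<blockquote><pre>
SELECT 1
  FROM plink CROSS JOIN tagxref ON tagxref.rid=plink.cid
 WHERE tagxref.tagid=$trunk
   AND plink.pid=$ckid;
</pre></blockquote>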

<p>We say that algorithm-1 is "faster", but this
is not strictly true.  Algorithm-1 is faster in common repositories, but
it is possible to construct a repository in which
every check-in is on a different uniquely-named branch and
all check-ins are children of the root check-in.
In that case, TAGXREF_I1 would become more selective
than PLINK_I1 and algorithm-2 really would be the faster choice.
However such repositories are very unlikely to appear in
practice and so hard-coding the loop nesting order using the
CROSS JOIN syntax is a reasonable solution
to the problem in this case.</p>

<tcl>hd_fragment howtofix</tcl>
<h2>5.0 Checklist For Avoiding Or Fixing Query Planner Problems</h2>

<ol>
<li><p><b>Don't panic!</b>
Cases where the query planner picks an inferior plan are actually quite
rare.  You are unlikely to run across any problems in your application.
If you are not having performance issues, then you do not need to worry
about any of this.</p>

<li><p><b>Create appropriate indices.</b>
Most SQL performance problems arise not because of query planner issues
but rather due to lack of appropriate indices.  Make sure indices are
available to assist all large queries.  Most performance issues can be
resolved by one or two CREATE INDEX commands and with no changes to
application code.</p>

<li><p><b>Avoid creating low-quality indices.</b>
A low-quality index (for the purpose of this checklist) is one where
there are more than 10 or 20 rows in the table that have the same value
for the left-most column of the index.  In particular, avoid using
boolean or "enum" columns as the left-most columns of your indices.</p>

<p>The performance problem in Fossil
described in the previous section came about because there were over
ten thousand entries in the TAGXREF table with the same value for the 
left-most column (the TAGID column) of the TAGXREF_I1 index.</p>

<li><p><b>If you must use a low-quality index, be sure to run ANALYZE.</b>
Low-quality indices will not confuse the query planner as long as the
query planner knows that the indices are of low quality.  And the way
the query planner knows this is by the content of the SQLITE_STAT1 table,
normally computed by the ANALYZE command.</p>

<p>Of course, ANALYZE only works effectively if you have a significant 
amount of content in your database in the first place.  When creating a 
new database that you expect to accumulate a lot of data, you can run 
the command "ANALYZE sqlite_master" to create the SQLITE_STAT1 table,
then prepopulate the SQLITE_STAT1 table (using ordinary INSERT statements)
with content that describes a typical
database for your application - perhaps content that you extracted after
running ANALYZE on a well-populated template database in the lab.</p>
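
<p>(A sketch of this technique appears in the first example following
this checklist.)</p>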

<li><p><b>Instrument your code.</b>
Add logic that lets you know quickly and easily which queries are taking
too much time.  Then work on just those specific queries.</p>

<li><p><b>Use the CROSS JOIN syntax to enforce a particular
loop nesting order on queries that might use low-quality indices in
an unanalyzed database.</b>
SQLite treats the keyword CROSS JOIN specially, forcing the table to the left
to be an outer loop relative to the table on the right.</p>

<p>Avoid this step if possible, as it defeats one of the huge advantages
of the whole SQL language concept, specifically that the application 
programmer does not need to get involved with query planning.  If you
do use CROSS JOIN, wait until late in your development cycle to do so,
and comment the use of CROSS JOIN carefully so that you can take it out
later if possible.  Avoid using CROSS JOIN early in the development
cycle as doing so is a premature optimization, which is well known to
be <a href="http://c2.com/cgi/wiki?PrematureOptimization">the root of
all evil</a>.</p>

<li><p><b>Use the [INDEXED BY] syntax to enforce the selection of
particular indices on problem queries.</b>
As with the previous bullet, avoid this step if possible, and 
especially avoid doing this early in development as it is clearly a
premature optimization.</p>
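
<p>(A sketch of INDEXED BY appears in the second example following
this checklist.)</p>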
</ol>
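
<p>As a sketch of the ANALYZE technique from the fourth item above,
using the Fossil schema from section 4.1 (the statistics shown are
invented for illustration only):</p>

<blockquote><pre>
ANALYZE sqlite_master;  -- creates an empty SQLITE_STAT1 table

-- In each "stat" value, the first integer is the number of rows in the
-- index and each later integer is the average number of rows matched
-- as one more index column is constrained.
INSERT INTO sqlite_stat1(tbl,idx,stat)
  VALUES('tagxref','tagxref_i1','10000 3800 1');   -- low-quality index
INSERT INTO sqlite_stat1(tbl,idx,stat)
  VALUES('plink','plink_i1','9000 1 1');           -- high-quality index
</pre></blockquote>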
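
<p>And as a sketch of the INDEXED BY technique from the last item,
again using the section 4.1 schema (the statement fails with an error
if the named index cannot be used):</p>

<blockquote><pre>
SELECT mtime
  FROM tagxref INDEXED BY tagxref_i1
 WHERE tagid=$trunk;
</pre></blockquote>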

<h2>6.0 Summary</h2>

<p>The query planner in SQLite normally does a terrific job of selecting
fast algorithms for running your SQL statements.  This is true of the 
legacy query planner and even more true of the new NGQP.  There may be
an occasional situation where, due to incomplete information, the query
planner selects a suboptimal plan.  This will happen less often with the
NGQP than with the legacy query planner, but it might still happen.  Only
in those rare cases do application developers need to get involved and
help the query planner to do the right thing.  In the common case, the
NGQP is just a new enhancement to SQLite that makes the application run
a little faster and which requires no new developer thought or action.</p>