Documentation Source Text
Check-in [e1646aea15d8a7c4c15d303893c39fbb0849d30c]
Not logged in

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
SHA1 Hash:e1646aea15d8a7c4c15d303893c39fbb0849d30c
Date: 2013-06-24 19:23:40
User: drh
Comment:Work toward documenting the NGQP changes.
Tags And Properties
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/changes.in

37
38
39
40
41
42
43






44
45
46
47
48
49
50
      <a href="http://www.sqlite.org/src/timeline">
      http://www.sqlite.org/src/timeline</a>.</p>
    }
    hd_close_aux
    hd_enable_main 1
  }
}






chng {2013-05-20 (3.7.17)} {
<li>Add support for [memory-mapped I/O].
<li>Add the [sqlite3_strglob()] convenience interface.
<li>Assigned the integer at offset 68 in the [database header] as the
    [Application ID] for when SQLite is used as an [application file-format].
    Added the [PRAGMA application_id] command to query and set the Application ID.
<li>Report rollback recovery in the [error log] as SQLITE_NOTICE_RECOVER_ROLLBACK.







>
>
>
>
>
>







37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
      <a href="http://www.sqlite.org/src/timeline">
      http://www.sqlite.org/src/timeline</a>.</p>
    }
    hd_close_aux
    hd_enable_main 1
  }
}

chng {2013-08-15 (3.8.0)} {
<li>Cut-over to the [next generation query planner] for faster and better query plans.
<li>Added the [FTS4 notindexed option], allowing non-indexed columns in an FTS4 table.
}

chng {2013-05-20 (3.7.17)} {
<li>Add support for [memory-mapped I/O].
<li>Add the [sqlite3_strglob()] convenience interface.
<li>Assigned the integer at offset 68 in the [database header] as the
    [Application ID] for when SQLite is used as an [application file-format].
    Added the [PRAGMA application_id] command to query and set the Application ID.
<li>Report rollback recovery in the [error log] as SQLITE_NOTICE_RECOVER_ROLLBACK.

Changes to pages/index.in

91
92
93
94
95
96
97
98
99

100
101
102
103
104
105
106
107

</td>
<td width="20"></td><td bgcolor="#044a64" width="1"></td><td width="20"></td>
<td valign="top">
<h3>Current Status</h3>

<p><ul>
<li><a href="releaselog/3_7_17.html">Version 3.7.17</a>
of SQLite is recommended for all new development.

Upgrading from all previous SQLite versions
is recommended.</li>
</ul></p>

<h3>Common Links</h3>

<p><ul>
<li> <a href="features.html">Features</a> </li>







|

>
|







91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108

</td>
<td width="20"></td><td bgcolor="#044a64" width="1"></td><td width="20"></td>
<td valign="top">
<h3>Current Status</h3>

<p><ul>
<li><a href="releaselog/3_8_0.html">Version 3.8.0</a>
of SQLite is recommended for all new development.
Upgrading from version 3.7.17 is optional.
Upgrading from all other prior versions of SQLite
is recommended.</li>
</ul></p>

<h3>Common Links</h3>

<p><ul>
<li> <a href="features.html">Features</a> </li>

Changes to pages/queryplanner-ng.in

12
13
14
15
16
17
18
19
20

21
22
23
24
25
26
27
28


29










































































































30
31
32
33
34
35








36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74




75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92

93
94
95
96
97
98
99
100
101

102
103
104

105
106
107
108

109








110
111


112
113
114
115
116

117
118
119
120
121
122


123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

201
202



203
204
205
206
207
208
209
210
211


212
213
214

215







































































































































































































  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
hd_keywords {next generation query planner}
</tcl>
<h1 align="center">The Next Generation Query Planner</h1>

<blockquote>
<p style="border:1px solid black; color: red;"><i><b>

This article is about proposed enhancements to SQLite that have not yet
been implemented.  Everything mentioned here is preliminary and subject
to change.  If and when the plans outlined in this document are
completed, this document will be revised into something along the lines
of "How The Query Planner Works".</b></i></p>
</blockquote>

<h2>Introduction</h2>













































































































<p>
"TPC-H Q8" is a test query from the
<a href="http://www.tpc.org/tpch/">Transaction Processing Performance
Council</a>.  SQLite version 3.7.17 and earlier do not choose a good
query plan for TPC-H Q8.  This article explains why that is and what
we intend to do about it for SQLite version 3.7.18 and later.








</p>

<h2>The Query</h2>

<p>
TPC-H Q8 is an eight-way join.
As SQLite uses only loop joins, the main task of the query planner is
to figure out the best nesting order of the 8 loops in order to minimize
the amount of CPU time and I/O required to complete the join.
A simplified model of this problem for the case of TPC-H Q8 is shown
by the following (hand drawn) diagram:
</p>

<center>
<img src="images/qp/tpchq8.gif">
</center>

<p>
In the diagram, each of the 8 terms in the FROM clause of the query is
identified by a large circle with the label of the FROM-clause term:
N2, S, L, P, O, C, N1 and R.
The arcs in the graph represent the cost of computing each term assuming
that the origin of the arc is in an outer loop.  For example, the
cost of running the S loop as an inner loop to L is 2.30 whereas the
cost of running the S loop as an outer loop to L is 9.17.</p>

<p>The "cost" is these cases is logarithmic.  With nested loops, the CPU time
and I/O is multiplied, not added.  But it easier to think of a
graph with additive weights and so the graph shows the logarithm of the
various costs.  The graph shows a cost advantage of S being inside of
L of about 6.87, but this translates into the query running about
963 times faster when S loop is inside of the L loop rather than
being outside of it.</p>

<p>The arrows from the small circles labeled with "*" indicate the cost
of running each loop with no dependencies.  The outer loop must use this
*-cost.  Inner loops have the option of using the *-cost or a cost assuming
one of the other terms is in an outer loop, whichever gives the best
result.</p>





<p>The problem of finding the best query plan is equivalent to finding
a minimum-cost path through the graph shown above that visits each node
exactly once.</p>

<h3>Complications</h3>

<p>The presentation of the query planner problem above is a simplification.
Keep in mind, first of all, that the costs are merely estimates.  We cannot
know what the true cost of running a loop is until we actually run the loop.
SQLite makes guesses for the cost of running a loop based on
the availability of indices and constraints found in the WHERE
clause.  These guesses are usually pretty good, but they can sometimes be
off.  Using the [ANALYZE] command to collect additional statistical
information about the database can sometimes enable SQLite to make better
guesses about the cost.  But sometimes [ANALYZE] does not help.</p>

<p>Further, the costs are not really a single number as shown above.

SQLite computes several different costs for each loop that apply at
different times.  For example, there is a "setup" cost that is incurred
just once when the query starts.  The setup cost is the cost of computing
an [automatic indexing | automatic index] for a table that does not already
have an index.  There is an additional startup cost which is the cost of
initializing the loop for each iteration of the outer loop.  Then there
is the cost of running each step of the loop.  Finally, there is an estimate
of the number rows generated by the loop, which is information needed in
estimating the costs of inner loops.</p>


<p>A general query need not be a simple graph as shown above.
Dependencies need not be on a single other loop.  For example, you might

have a query where a particular loop depends on two or more other terms
being in outer loops, instead of just a single term.  We cannot really
draw a graph for these kinds of queries since there is no way for an
arc to originate at two or more nodes at once.</p>










<p>In the TPC-H Q8 query, the setup and startup costs are all negligible
and all dependencies are from a single other node. So for this case, at least,


the graph above is a reasonable representation of what needs to be computed.
The general case involves a lot of extra complication.
But the point of this article is to explain what is going on, and that
extra complication does not aid in the explanation, and is therefore
omitted.</p>


<h2>Finding The Best Query Plan</h2>

<p>Since its inception, SQLite has always used the rough equilvalent of
the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan
for a particular query.  The NN heuristic chooses the lowest-cost arc


to follow next.  The NN heuristic works surprisingly well in most cases.
And NN is fast, so that SQLite is able to quickly find good query plans
for even large 64-way joins.  In contrast, other SQL database engines such
as Oracle, PostgreSQL, MySQL, and SQL Server tend to bog down when the
number of tables in a join goes above 10 or 15.</p>

<p>However, the query plan computed by NN for TPC-H Q8 is not optimal.
The plan computed using NN is R-N1-N2-S-C-O-L-P with a cost of 36.92.  
The notation
in the previous sentence means that the R table is run in the outer loop,
N1 is in the next inner loop, N2 is in the third loop, and so forth down
to P which is in the inner-most loop.  But the shortest path through the
graph (as found via exhaustive search) is  P-L-O-C-N1-R-S-N2
with a cost of 27.38.  This might not seem like much, but remember that
the costs are logarithmic, so the shortest path is nearly 14,000 times
faster than that path found using the NN heuristic.</p>

<p>One solution to this problem is to change SQLite to do an exhaustive
search for the best query plan.  But the SQLite developers are unwilling
to do this because an exhaustive search requires time proportional to
K! (where K is the number of tables in the join) and so when you get 
beyond a 10-way join, then time
to run [sqlite3_prepare()] becomes very large.</p>

<h3>N Nearest Neighbors</h3>

<p>The proposed solution to this dilemma 
is to recode the SQLite query planner use a new
heuristic: "N Nearest Neighbors" (or "NNN").  With NNN, instead of
choosing just one nearest neighbor for each step, the algorithm keeps
track of the N bests paths at each step.</p>

<p>Suppose N==4.  Then for the TPC-H Q8 graph, the first step would find
the four shortest paths to visit any single node in the graph:

<blockquote>
    R (cost: 3.56) <br>
    N1 (cost: 5.52) <br>
    N2 (cost: 5.52) <br>
    P (cost: 7.71) <br>
</blockquote>

<p>The second step would find the four shortest paths to visit two nodes 
and that begin with one of the four paths from the previous step.  In the
case where two or more paths are equivalent (they have the same set of
visited nodes, though possibly in a different order) then only the
first and lowest cost path is retained.  We have:</p>

<blockquote>
    R-N1 (cost: 7.03) <br>
    R-N2 (cost: 9.08) <br>
    N2-N1 (cost: 11.04) <br>
    R-P {cost: 11.27} <br>
</blockquote>

<p>The third stop starts with the four shortest two-node paths and finds
the four shortest non-equivalent three-node paths:</p>

<blockquote>
    R-N1-N2 (cost: 12.55) <br>
    R-N1-C (cost: 13.43) <br>
    R-N1-P (cost: 14.74) <br>
    R-N2-S (cost: 15.08) <br>
</blockquote>

<p>And so forth.  There are 8 nodes in the TPC-H Q8 query, 
so this process repeats a total of 8
times.  In the general case of a K-way join, the storage requirements
is O(N) and the computation time is O(K*N).  That is a lot less than an
exact solution which requires O(2**K) time.</p>

<p>But what value to choose for N?  We might make N==K.  This makes the
algorithm O(N**2) which is actually still quite efficient, since the
maximum value of K is 64.  But that is not enough for the TPC-H Q8
problem.  With N==8 on TPC-H Q8 the NNN algorithm finds 
the solution R-N1-C-O-L-S-N2-P with a cost of 29.78.  
That is an improvement over NN, but it is still
not optimal.</p>


<p>NNN finds the optimal solution for TPC-H Q8 when N is 10 or greater.</p>




<p>The current thinking is to make N a parameter with a default
value of 15 or so.  This will likely find at least a very good plan in most
circumstances.  We think you will need to work very hard to
construct a query where NNN with N==15 does not find the optimal plan.
Remember that the all SQLite versions prior to 3.7.18 do
an excellent job with N==1.  The value of N==15 would be the compile-time
default, but can be changed using a PRAGMA.</p>



<p>Another idea begin considered is to initially attempt to find a query
plan using N==1 and then fall back and make a second attempt with a larger
N if the first attempt fails to find a provably optimal plan.  A provably

optimal plan is one that uses the lowest-cost arc into each node.</p>














































































































































































































|
<
>
|
|
|
|
|
|

|
>
>

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>



<
<
<
>
>
>
>
>
>
>
>


|



|
|
|









|







|
|
|









|
>
>
>
>





|


|






|

|
>
|



|
<


|
>

<
|
>
|
|
|
|
>

>
>
>
>
>
>
>
>
|
<
>
>

|
<
<
<
>

|

|
|
<
>
>
|
|
|
|


|




|






|
|

|


|

|
<
|



|









|
|


|








|
|











|
|

|





|
>

<
>
>
>

<
<
<
|
|
<
<

>
>
|
|
<
>
|
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
12
13
14
15
16
17
18
19

20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140



141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215

216
217
218
219
220

221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237

238
239
240
241



242
243
244
245
246
247

248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276

277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328

329
330
331
332



333
334


335
336
337
338
339

340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
  hd_puts [string trimright $txt]\n
  hd_puts "</pre></table></center>\n"
}
hd_keywords {next generation query planner}
</tcl>
<h1 align="center">The Next Generation Query Planner</h1>

<h2>1.0 Executive Summary</h2>


<p>Given an SQL statement, the task of the "query planner" is to figure
out the best algorithm to use to accomplish that statement.
Beginning with SQLite version 3.8.0, the query planner component has been
rewritten so that it runs faster and generates better query plans.  The
rewrite is called the "next generation query planner" or "NGQP".
</p>

<p>This article overviews the importance of query planning, describes some
of the problems inherent to query planning, and outlines how the NGQP
solves those problems.</p>

<p>The NGQP is almost always better than the legacy query planner.
However, there may exist legacy applications that unknowingly depend on 
undefined and/or suboptimal behavior in the legacy query planner, and
upgrading to the NGQP on those legacy applications could cause performance
regressions.  This risk is considered and suggestions for reducing the
risk are outlined.</p>

<h2>2.0 Background</h2>

<p>For simple queries against a single table with few indices, there is usually
an obvious choice for the best algorithm.
But for larger and more complex queries, multi-way joins with many indices
and subqueries, there can be hundreds, thousands, or millions
of "reasonable" algorithms for computing the result.
The job of the query planner is to chose a single "best" query plan from
this multitude of possibilities.</p>

<p>The existence of a query planner is what makes SQL database engines
so useful and powerful.
The query planner frees the programmer from the chore of selecting
a particular query plan, and thereby allows the programmer to
focus more mental energy on higher-level application issues and on 
providing more value to the end user.  For simple queries where the choice
of query plans is obvious, this may not seem like a big deal.
But as applications and schemas and queries grow more complex, a
clever query planner can greatly speed and simplify the work of application
development. 
There is amazing power in being about to tell
the database engine what content is desired, and then leave the database
engine to worry over the details of how to best retrieve that content.</p>

<p>Writing a good query planner is an inexact science.
The query planner must work with incomplete information.
It cannot determine how long any particular plan will take
without actually running that plan.  So when comparing two
or more plans to figure out which is "best", the query planner has to make
some guesses and assumption and those guesses and assumptions that will 
sometimes be wrong. A good query plan is one that will
find the correct solution often enough that application
programmers only rarely need to get involved.</p>

<h3>2.1 Query Planning In SQLite</h3>

<p>The query engine in SQLite computes joins using nested loops, 
one loop for each table
in the join (plus possibly additional loops for IN operators in the
WHERE cause).
One or more indices might be used on each loop to speed the search,
or a loop might be a "full table scan" using only the original table.
Thus the task of query planner
breaks down into two subtasks:</p>
<ol>
<li> Picking the nested order of the various loops
<li> Choosing good indices for each loop
</ol>
<p>Picking the nesting order is generally the more challenging problem.
Once the nesting order of the join is established, the choice of indices
for each loop is normally obvious.</p>

<tcl>hd_fragment qpstab {query planner stability guarantee}</tcl>
<h3>2.2 The SQLite Query Planner Stability Guarantee</h3>

<p>SQLite will always pick the exact same query plan for any
given SQL statement as long as:
<ol type="a">
<li>the database schema does not change in significant ways such as 
    adding or dropping indices,</li>
<li>the ANALYZE command is not run, </li>
<li>SQLite is not compiled with [SQLITE_ENABLE_STAT3], and</li>
<li>the same version of SQLite is used.</li>
</ol>
This stability guarantee means that if all of your queries run efficiently
during testing, and if your application does not change the schema,
then SQLite will not suddenly decide to start using a new and different
query plan once your application is in the field, and thereby possibly
causing a performance problem.  If your application works in the lab, it
will continue working the same way after deployment.</p>

<p>Enterprise-class SQL database engines do not normally make this guarantee.
In client/server SQL database engines, the server normally keeps running 
statistics on the sizes of tables and on the quality of indices and uses
those statistics to aid in query planning.  As content is added, deleted,
or changed in the database, these statistics will evolve and may cause the
query planner to suddenly start using a different query plan for some
particular query.  Usually the new plan will be better for the
evolving structure of the content.  But sometimes the new query plan will
cause a performance reduction.  With a client/server database engine, there
is typically a Database Administrator (DBA) on hand to deal with these 
rare problems as they come up.  But DBAs are not available to fix problems 
in an embedded database like SQLite, and hence SQLite goes to great care 
to make sure that 
query plans do not change unexpectedly after deployment.</p>

<p>This stability guarantee applies equally to the legacy query planner in
SQLite and to the NGQP.</p>

<p>It is important to note that any particular version of SQLite will
always pick the same query plan, but if you relink your application to use
a different version of SQLite, then query plans might change.  In rare
cases, this might lead to a performance regression.  This is one reason
you should consider statically linking your applications against SQLite 
rather than try to use a system-wide SQLite shared library which might 
change without your knowledge or control.</p>

<h2>3.0 An Insurmountable Problem</h2>

<p>
"TPC-H Q8" is a test query from the
<a href="http://www.tpc.org/tpch/">Transaction Processing Performance



Council</a>.  The query planners in SQLite versions 3.7.17 and earlier 
do not choose good plans for TPC-H Q8.  And it has been determined that 
no amount
of tweaking of the legacy query planner will fix that.  In order to find
a good solution to the TPC-H Q8 query, and to continue improving the 
quality of SQLite's query planner, it became necessary to redesign the
query planner.  This section tries to explain why this redesign was
necessary and how the NGQP is different and addresses the TPC-H Q8 problem.
</p>

<h3>3.1 The Query</h3>

<p>
TPC-H Q8 is an eight-way join.
As observed above, the main task of the query planner is
to figure out the best nesting order of the eight loops in order to minimize
the work needed to complete the join.
A simplified model of this problem for the case of TPC-H Q8 is shown
by the following (hand drawn) diagram:
</p>

<center>
<img src="images/qp/tpchq8.gif">
</center>

<p>
In the diagram, each of the 8 tables in the FROM clause of the query is
identified by a large circle with the label of the FROM-clause term:
N2, S, L, P, O, C, N1 and R.
The arcs in the graph represent the cost of computing each term assuming
that the origin of the arc is in an outer loop.  For example, the
cost of running the S loop as an inner loop to L is 2.30 whereas the
cost of running the S loop as an outer loop to L is 9.17.</p>

<p>The "cost" here is logarithmic.  With nested loops, the work
is multiplied, not added.  But it customary to think of graphs
with additive weights and so the graph shows the logarithm of the
various costs.  The graph shows a cost advantage of S being inside of
L of about 6.87, but this translates into the query running about
963 times faster when S loop is inside of the L loop rather than
being outside of it.</p>

<p>The arrows from the small circles labeled with "*" indicate the cost
of running each loop with no dependencies.  The outer loop must use this
*-cost.  Inner loops have the option of using the *-cost or a cost assuming
one of the other terms is in an outer loop, whichever gives the best
result.  One can think of the *-costs as a short-hand notation indicating
multiple arcs with that cost, one each from each of the other nodes in the
graph.  The graph is is therefore "complete", meaning that there are arcs
(some explicit and some implied) in both directions between every pair of 
nodes in the graph.</p>

<p>The problem of finding the best query plan is equivalent to finding
a minimum-cost path through the graph shown above that visits each node
exactly once.</p>

<h3>3.2 Complications</h3>

<p>The presentation of the query planner problem above is a simplification.
The costs are estimates.  We cannot
know what the true cost of running a loop is until we actually run the loop.
SQLite makes guesses for the cost of running a loop based on
the availability of indices and constraints found in the WHERE
clause.  These guesses are usually pretty good, but they can sometimes be
off.  Using the [ANALYZE] command to collect additional statistical
information about the database can sometimes enable SQLite to make better
guesses about the cost.</p>

<p>The costs are comprised of multiple numbers, not a single number as
shown in the graph.
SQLite computes several different estimated costs for each loop that apply at
different times.  For example, there is a "setup" cost that is incurred
just once when the query starts.  The setup cost is the cost of computing
an [automatic indexing | automatic index] for a table that does not already
have an index.  Then there

is the cost of running each step of the loop.  Finally, there is an estimate
of the number rows generated by the loop, which is information needed in
estimating the costs of inner loops.  Sorting costs may come into play
if the query has an ORDER BY clause.</p>


<p>In a general query, dependencies need not be on a single loop, and hence
the matrix of dependencies might not be representable as a graph.
For example, one of the WHERE clause constraints might be
S.a=L.b+P.c, implying that the S loop must be an inner loop of both
L and P.  Such dependencies cannot be drawn as a graph
since there is no way for an arc to originate at two or more nodes at
once.</p>

<p>If the query contains an ORDER BY clause or a GROUP BY clause or if
the query uses the DISTINCT keyword then it is advantageous to select a 
path through the graph that causes rows to naturally appear in sorted order, 
so that no separate sorting step is required.  Automatic elimination of 
ORDER BY clauses
can make a large performance difference, so this is another factor
that needs to be considered in a complete implementation.</p>

<p>In the TPC-H Q8 query, the setup costs are all negligible,

all dependencies are between individual nodes, and there is no ORDER BY,
GROUP BY, or DISTINCT clause. So for TPC-H Q8,
the graph above is a reasonable representation of what needs to be computed.
The general case involves a lot of extra complication, which for clarity



is neglected in the remainder of this article.</p>

<h3>3.3 Finding The Best Query Plan</h3>

<p>Since its inception, SQLite has always used the rough equivalent of
the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan.

The NN heuristic makes a single traversal of the graph, always chosing
the lowest-cost arc as the next step.  
The NN heuristic works surprisingly well in most cases.
And NN is fast, so that SQLite is able to quickly find good plans
for even large 64-way joins.  In contrast, other SQL database engines that
do more extensive searching tend to bog down when the
number of tables in a join goes above 10 or 15.</p>

<p>Unfortunately, the query plan computed by NN for TPC-H Q8 is not optimal.
The plan computed using NN is R-N1-N2-S-C-O-L-P with a cost of 36.92.  
The notation
in the previous sentence means that the R table is run in the outer loop,
N1 is in the next inner loop, N2 is in the third loop, and so forth down
to P which is in the inner-most loop.  The shortest path through the
graph (as found via exhaustive search) is  P-L-O-C-N1-R-S-N2
with a cost of 27.38.  This might not seem like much, but remember that
the costs are logarithmic, so the shortest path is nearly 14,000 times
faster than that path found using the NN heuristic.</p>

<p>One solution to this problem is to change SQLite to do an exhaustive
search for the best path.  But an exhaustive search requires time 
proportional to
K! (where K is the number of tables in the join) and so when you get 
beyond a 10-way join, the time
to run [sqlite3_prepare()] becomes very large.</p>

<h3>3.4 N Nearest Neighbors</h3>

<p>The NGQP uses a new heuristic for seeking the best path through the

graph: "N Nearest Neighbors" (or "NNN").  With NNN, instead of
choosing just one nearest neighbor for each step, the algorithm keeps
track of the N bests paths at each step.</p>

<p>Suppose N==4.  Then for the TPC-H Q8 graph, the first step finds
the four shortest paths to visit any single node in the graph:

<blockquote>
    R (cost: 3.56) <br>
    N1 (cost: 5.52) <br>
    N2 (cost: 5.52) <br>
    P (cost: 7.71) <br>
</blockquote>

<p>The second step finds the four shortest paths to visit two nodes 
beginning with one of the four paths from the previous step.  In the
case where two or more paths are equivalent (they have the same set of
visited nodes, though possibly in a different order) then only the
first and lowest-cost path is retained.  We have:</p>

<blockquote>
    R-N1 (cost: 7.03) <br>
    R-N2 (cost: 9.08) <br>
    N2-N1 (cost: 11.04) <br>
    R-P {cost: 11.27} <br>
</blockquote>

<p>The third step starts with the four shortest two-node paths and finds
the four shortest three-node paths:</p>

<blockquote>
    R-N1-N2 (cost: 12.55) <br>
    R-N1-C (cost: 13.43) <br>
    R-N1-P (cost: 14.74) <br>
    R-N2-S (cost: 15.08) <br>
</blockquote>

<p>And so forth.  There are 8 nodes in the TPC-H Q8 query, 
so this process repeats a total of 8
times.  In the general case of a K-way join, the storage requirements
is O(N) and the computation time is O(K*N), which is significantly faster
than the O(2**K) exact solution.</p>

<p>But what value to choose for N?  One might try N==K.  This makes the
algorithm O(N**2) which is actually still quite efficient, since the
maximum value of K is 64.  But that is not enough for the TPC-H Q8
problem.  With N==8 on TPC-H Q8 the NNN algorithm finds 
the solution R-N1-C-O-L-S-N2-P with a cost of 29.78.  
That is an improvement over NN, but it is still
not optimal.  NNN finds the optimal solution for TPC-H Q8 
when N is 10 or greater.</p>


<p>The initial implementation of NGQP choses N==1 for simple queries, N==5
for two-way joins and N==10 for all joins with three or more tables.  This
formula for selecting N might change in subsequent releases.</p>




<tcl>hd_fragment hazards {hazards of upgrading to the NGQP}</tcl>
<h2>4.0 Hazards Of Upgrading To NGQP</h2>



<p>For most applications, upgrading from the legacy query planner to the NGQP
is a no-brainer.  Just drop in the newer version of SQLite and recompile and
the application will run faster.  There are no API changes or changes
to compilation procedures.</p>


<p>But as with any query planner change, upgrading to the NGQP does carry
a small risk of introducing performance regressions.  The problem here is
not that the NGQP is incorrect or buggy.  More likely,
the legacy query planner merely stumbled over a better query plan
by stupid luck and the NGQP is not quite so lucky.</p>

<h3>4.1 Case Study: Upgrading Fossil to the NGQP</h3>

<p>The <a href="http://www.fossil-scm.org/">Fossil DVCS</a> is the version
control system used to track all of the SQLite source code.
A Fossil repository is an SQLite database file.
(Readers are invited to ponder this recursion as an independent exercise.)
</p>

<p>Fossil is both the version-control system for SQLite and also a test
platform for SQLite.  Whenever changes or enhancements are made to SQLite, 
Fossil is one of the first applications to test and evaluate those
changes.  So Fossil was an early adopter of the NGQP.  And NGQP caused a
performance issue in Fossil, as described in the sequel.</p>

<p>One of the many reports that Fossil makes available is a timeline of
changes to a single branch showing all merges in and out of that branch.
See [http://www.sqlite.org/src/timeline?nd&n=200&r=trunk] for a typical
example of such a report.  Generating such a report normally takes just
a few milliseconds.  But after upgrading to the NGQP we noticed that
this one report was taking closer to 10 seconds for the trunk of the
repository.</p>

<p>The core query used to generate the branch timeline is shown below.
(Readers are not expected to understand the details of this query.
Commentary will follow.)</p>

<blockquote><pre>
SELECT
     blob.rid AS blobRid,
     uuid AS uuid,
     datetime(event.mtime,'localtime') AS timestamp,
     coalesce(ecomment, comment) AS comment,
     coalesce(euser, user) AS user,
     blob.rid IN leaf AS leaf,
     bgcolor AS bgColor,
     event.type AS eventType,
     (SELECT group_concat(substr(tagname,5), ', ')
        FROM tag, tagxref
       WHERE tagname GLOB 'sym-*'
         AND tag.tagid=tagxref.tagid
         AND tagxref.rid=blob.rid
         AND tagxref.tagtype>0) AS tags,
     tagid AS tagid,
     brief AS brief,
     event.mtime AS mtime
  FROM event CROSS JOIN blob
 WHERE blob.rid=event.objid
   AND (EXISTS(SELECT 1 FROM tagxref
                WHERE tagid=11 AND tagtype>0 AND rid=blob.rid)
        OR EXISTS(SELECT 1 FROM plink JOIN tagxref ON rid=cid
                   WHERE tagid=11 AND tagtype>0 AND pid=blob.rid)
        OR EXISTS(SELECT 1 FROM plink JOIN tagxref ON rid=pid
                   WHERE tagid=11 AND tagtype>0 AND cid=blob.rid))
 ORDER BY event.mtime DESC
 LIMIT 200;
</pre></blockquote>

<p>This query is only a two-way join, but it contains four subqueries,
one in the result set and three more in the WHERE clause.  This query is not
especially complicated, but even so it replaces hundreds or 
perhaps thousands of lines of procedural code.</p>

<p>The gist of the query is this:  Scan down the EVENT table looking
for the most recent 200 check-ins that satisfy one of three conditions:
<ol>
<li> The check-in has a "trunk" tag.
<li> The check-in has a child that has a "trunk" tag.
<li> The check-in has a parent that has a "trunk" tag.
</ol>
These three conditions are implemented by the three OR-connected
EXISTS statements in the WHERE clause of the query.</p>

<p>The slowdown that occurred with the NGQP was caused by the last two
of the three conditions.  We will examine the middle condition
more closely.</p>

<p>The subquery of the second condition can be rewritten (with minor
and immaterial simplifications) as follows:</p>

<blockquote><pre>
SELECT 1
  FROM plink JOIN tagxref ON tagxref.rid=plink.cid
 WHERE tagxref.tagid=$trunk
   AND plink.pid=$ckid;
</pre></blockquote>

<p>The PLINK table is used to show parent-child relationships between
check-ins.  The TAGXREF table is used to show which tags are assigned
to which check-ins.  For reference, the relevant portions of the schemas
for these two tables is shown here:</p>

<blockquote><pre>
CREATE TABLE plink(
  pid INTEGER REFERENCES blob,
  cid INTEGER REFERENCES blob
);
CREATE UNIQUE INDEX plink_i1 ON plink(pid,cid);

CREATE TABLE tagxref(
  tagid INTEGER REFERENCES tag,
  mtime TIMESTAMP,
  rid INTEGER REFERENCE blob,
  UNIQUE(rid, tagid)
);
CREATE INDEX tagxref_i1 ON tagxref(tagid, mtime);
</pre></blockquote>

<p>There are only two reasonable ways to implement this query.
(There are many other possible algorithms, but none of the
others are contenders for being the "best" algorithm and are thus
ignored here.)<p.

<ol>
<li value=1><p>
Find all children of checkin $ckid and test each one to see if
it has the $trunk tag.
<li value=2><p>
Find all checkins with the $trunk tag and test each one to see if
it is a child of $ckid.
</ol>

<p>
Intuitively, we humans understand that algorithm (1) is best.
Each check-in is only likely to have a handful of children (having one child is
most common case) and each one of those children can be checked for the
$trunk tag in logarithmic time.  And, indeed, algorithm (1) is the
faster choice in practice.  But the NGQP is not equipped with intuition.  The
NGQP has to use hard math, and algorithm (2) works out to be slightly
better mathematically.  This is because, in the absence of other information,
the NGQP must assume that the indices PLINK_I1 and TAGXREF_I1 are of
equal quality and are equally selective.  Algorithm (2) uses one field
of the TAGXREF_I1 index and both fields of the PLINK_I1 index whereas
algorithm (1) only uses the first field of each index.  Since 
algorithm (2) uses more index material, the NGQP is correct
to judge it to be the better algorithm.  The scores are close and
algorithm (2) just barely squeaks ahead of algorithm (1).  But
algorithm (2) really is the correct choice here.
</p>

<p>
Unfortunately, algorithm (2) is many times slower than algorithm (1) in
this application.
</p>

<p>
The problem is that the indices are not of equal quality.
A check-in is likely to only have one child check-in.  There might be two
or three in some cases, but the usual number will be one.  So the first
field of PLINK_I1 will usually narrow down the search to just a single
row.  But there might be many more check-ins on the same branch, especially
if that branch is "trunk", so that the first field of TAGXREF_I1 will be
of little help in narrowing down the search.  As of this writing
(2013-06-24) on the SQLite repository, roughly 38 out of every 100 rows
in the TAGXREF table assign "trunk" tags to checkins.  Hence searching on just
the first field of the TAGXREF_I1 index is practically useless.
</p>

<p>
The NGQP has no way of knowing that TAGXREF_I1 is almost useless in this
query, unless we have run [ANALYZE] on the database.  After [ANALYZE] is
run, statistics on the quality of the various indices are stored in the
SQLITE_STAT1 table.  Having access to this statistical information,
the NGQP easily choses algorithm (1) as the best algorithm, by a wide
margin.</p>

<p>Why didn't the legacy query planner choose algorithm (2)?
Easy: because the nearest neighbor greedy algorithm used by the legacy
query planner never even considered algorithm (2).
Algorithm (1) was the only option that the legacy
query planner gave any thought to.  And hence
the legacy query planner just happened to make the faster choice by
stupid luck.</p>

<h3>4.2 Fixing The Problem</h3>

<p>Running [ANALYZE] on the repository database immediately fixed the
performance problem.  However, we want Fossil to be robust and to always
work quickly regardless of whether or not its repository has been analyzed.
For this reason, the query was modify to use the CROSS JOIN operator 
instead of simply JOIN.  SQLite will not reorder the tables of a CROSS JOIN.
This is a feature designed specifically to allow knowledgeable programmers
to enforce a particular loop nesting order.  Once the join
was changed to CROSS JOIN (the addition of a single keyword) the NGQP was
forced to chose the faster algorithm (1) regardless of whether or not
statistical information had been gathered using ANALYZE.</p>

<p>The previous sentence called algorithm (1) "faster", but this
is not strictly true.  Algorithm (1) is faster in common repositories, but
it is possible (in theory) to construct a repository composed mostly of
new branches, one check-in per branch, with many branches originating from
the same parent.  In that case, TAGXREF_I1 would become more selective
than PLINK_I1 and algorithm (2) really would be the faster choice.
However such branch-heavy repositories are unlikely to appear in actual
practice.</p>