Documentation Source Text

Check-in [fb11eda20b]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Add examples to WITH documentation showing how to control depth-first versus breath-first search of a tree using an ORDER BY clause on the recursive query.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: fb11eda20b22f82e164c237016167699dcdadae1
User & Date: drh 2014-01-24 13:20:25.547
Context
2014-01-24
23:43
Fix typos in the WITH clause documentation reported on the mailing list. (check-in: 5415be54c0 user: drh tags: trunk)
13:20
Add examples to WITH documentation showing how to control depth-first versus breath-first search of a tree using an ORDER BY clause on the recursive query. (check-in: fb11eda20b user: drh tags: trunk)
04:05
Further refinement and cleanup of the WITH and SELECT documentation. (check-in: 2941cf658c user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/lang.in.
3031
3032
3033
3034
3035
3036
3037
3038
3039
3040
3041
3042
3043
3044
3045
3046
3047
3048
3049
3050
3051
3052
<tcl>
###############################################################################
Section {WITH clause} with {{common table expressions}}

RecursiveBubbleDiagram with-clause
</tcl>

<p>Common Table Expressions or CTEs are temporary views or tables that exist
only for the duration of a single SQL statement.  There are two kinds of
common table expressions: "ordinary" and "recursive". Ordinary 
common table expressions are helpful for making
queries easier to understand by factoring
subqueries out of the main SQL statement.
Recursive common table expression
provide the ability to do a hierarchical or
recursive queries of trees and graphs, a capability
that is not otherwise available in the SQL language.

All common table expressions (ordinary and recursive) are 
created by prepending a WITH clause in front of a [SELECT], [INSERT], [DELETE],
or [UPDATE] statement.  A single WITH clause can specify one or more
common table expressions.







|






|







3031
3032
3033
3034
3035
3036
3037
3038
3039
3040
3041
3042
3043
3044
3045
3046
3047
3048
3049
3050
3051
3052
<tcl>
###############################################################################
Section {WITH clause} with {{common table expressions}}

RecursiveBubbleDiagram with-clause
</tcl>

<p>Common Table Expressions or CTEs act like temporary [views] that exist
only for the duration of a single SQL statement.  There are two kinds of
common table expressions: "ordinary" and "recursive". Ordinary 
common table expressions are helpful for making
queries easier to understand by factoring
subqueries out of the main SQL statement.
Recursive common table expression
provide the ability to do hierarchical or
recursive queries of trees and graphs, a capability
that is not otherwise available in the SQL language.

All common table expressions (ordinary and recursive) are 
created by prepending a WITH clause in front of a [SELECT], [INSERT], [DELETE],
or [UPDATE] statement.  A single WITH clause can specify one or more
common table expressions.
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
3082
3083
3084
3085
3086

3087
3088
3089
3090
3091
3092
3093
3094
3095
3096
3097
3098
3099
3100
3101
3102
3103
3104
3105
3106
3107
3108
3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121


3122
3123
3124
3125
3126
3127
3128
3129
3130
3131
3132
3133
3134
3135
3136

<p>A recursive common table expression can be used to write a query that
walks a tree or graph.  A recursive common table expression has the same
basic syntax as an ordinary common table expression, but with the following
additional features:

<ol>
<li> The "select-stmt" after the AS keyword in the common table expression
     must be a [compound select] where the right-most compound operator is
     either UNION or UNION ALL.
<li> The table named on the left-hand side of the AS keyword must appear
     exactly once in the FROM clause of the right-most SELECT statement
     of the compound select, and nowhere else.
</ol>

<p>To put it another way, a recursive common table expression must
look like the following:

<tcl>RecursiveBubbleDiagram recursive-cte</tcl>


<p>Call the cte-table-name for a recursive common table expression
the "recursive table".  In the bubble diagram above, the recursive
table must appear exactly once in the FROM clause of the recursive-select
and must not appear anywhere else in either the initial-select or the
recursive-select, including subqueries.  The initial-select may be
a compound-select, but it may not include an ORDER BY, LIMIT, or OFFSET.
The recursive-select must be a simple select, not a compound.  The
recursive-select is allowed to include an ORDER BY, LIMIT, and/or OFFSET.

<p>The basic algorithm for computing the content of the recursive table
is as follows:

<ol>
<li> Run the initial-select and add the results to a queue.
<li> While the queue is not empty:
<ol type="a">
<li> Extract a single row from the queue.
<li> Write that single row into the recursive table
<li> Pretend that the single row just extracted is the only
     row in the recursive table and run the recursive-select,
     adding all results to the queue.
</ol>
</ol>

<p>The basic procedure above may modified by the following additional rules:

<ul>
<li><p>
  If the the initial-select and the recursive-select are connected by
  UNION, then only add rows to the queue if no identical row has been
  previously added to the queue.  Repeated rows are discarded before being
  added to the queue even if the repeated rows have already been extracted
  from the queue by the recursion stop.  If the operator is UNION ALL,
  then all rows generated by both the initial-select and the
  recursive-select are always added to the queue even if there are repeats.


<li><p>
  The LIMIT clause, if present, restricts which rows are added
  to the recursive table in step 2b.  If a LIMIT clause is present, it 
  restricts the total number of rows that will be added to the recursive table.
  Once the limit is reached, the recursion stops.  (As is always the case 
  with LIMIT, a limit of zero means that no rows are ever added to the
  recursive table, and a negative limit means an unlimited number of rows
  may be added to the recursive table.)
<li><p>
  The OFFSET clause, if it is present and has a positive value N, prevents the
  first N rows from being added to the recursive table.
  The first N rows are still processed by the recursive-select; they
  just are not added to the recursive table.  Rows are not counted toward
  fulfilling the LIMIT until all OFFSET rows have been skipped.
<li><p>







|
|











>
|
|



|











|










|
|
|

|

|
>
>

|
|
<
|
|

|







3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
3082
3083
3084
3085
3086
3087
3088
3089
3090
3091
3092
3093
3094
3095
3096
3097
3098
3099
3100
3101
3102
3103
3104
3105
3106
3107
3108
3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121
3122
3123
3124
3125
3126
3127

3128
3129
3130
3131
3132
3133
3134
3135
3136
3137
3138

<p>A recursive common table expression can be used to write a query that
walks a tree or graph.  A recursive common table expression has the same
basic syntax as an ordinary common table expression, but with the following
additional features:

<ol>
<li> The "select-stmt" after the AS keyword
     must be a [compound select] where the right-most [compound-operator] is
     either UNION or UNION ALL.
<li> The table named on the left-hand side of the AS keyword must appear
     exactly once in the FROM clause of the right-most SELECT statement
     of the compound select, and nowhere else.
</ol>

<p>To put it another way, a recursive common table expression must
look like the following:

<tcl>RecursiveBubbleDiagram recursive-cte</tcl>

<p>We refer to the table named by the cte-table-name in a recursive
common table expression as the "recursive table".
In the recursive-cte bubble diagram above, the recursive
table must appear exactly once in the FROM clause of the recursive-select
and must not appear anywhere else in either the initial-select or the
recursive-select, including subqueries.  The initial-select may be
a [compound select], but it may not include an ORDER BY, LIMIT, or OFFSET.
The recursive-select must be a simple select, not a compound.  The
recursive-select is allowed to include an ORDER BY, LIMIT, and/or OFFSET.

<p>The basic algorithm for computing the content of the recursive table
is as follows:

<ol>
<li> Run the initial-select and add the results to a queue.
<li> While the queue is not empty:
<ol type="a">
<li> Extract a single row from the queue.
<li> Insert that single row into the recursive table
<li> Pretend that the single row just extracted is the only
     row in the recursive table and run the recursive-select,
     adding all results to the queue.
</ol>
</ol>

<p>The basic procedure above may modified by the following additional rules:

<ul>
<li><p>
  If a UNION operator connects the the initial-select with the
  recursive-select, then only add rows to the queue if no identical row has
  been previously added to the queue.  Repeated rows are discarded before being
  added to the queue even if the repeated rows have already been extracted
  from the queue by the recursion step.  If the operator is UNION ALL,
  then all rows generated by both the initial-select and the
  recursive-select are always added to the queue even if they are repeats.
  When determining if a row is repeated, NULL values compare
  equal to one another and not equal to any other value.
<li><p>
  The LIMIT clause, if present, determines the maximum number of rows that
  will ever be added to the recursive table in step 2b.

  Once the limit is reached, the recursion stops.
  A limit of zero means that no rows are ever added to the
  recursive table, and a negative limit means an unlimited number of rows
  may be added to the recursive table.
<li><p>
  The OFFSET clause, if it is present and has a positive value N, prevents the
  first N rows from being added to the recursive table.
  The first N rows are still processed by the recursive-select; they
  just are not added to the recursive table.  Rows are not counted toward
  fulfilling the LIMIT until all OFFSET rows have been skipped.
<li><p>
3148
3149
3150
3151
3152
3153
3154
3155
3156
3157
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
3173
3174
3175
3176
3177
3178
3179
3180
3181
3182
3183
3184
3185
3186
3187
3188
3189
3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
3206
3207
3208
3209
3210
3211
3212

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<1000000)
SELECT x FROM cnt;
</pre></blockquote>

<p>Consider how this query works.  The initial-select, which is this
case is just "VALUES(1)", runs first and returns a single row
with a single column "1".  This one row is added to the queue.  In
step 2a, that one row is extracted from the queue and added to "cnt".
Then the recursive query is run in accordance with step 2c generating
a single new row with value "2" to add to the queue.  The queue still
has one row, so step 2 repeats.  The 2 is extracted and added to the
recursive table by steps 2a and 2b.  Then the row containing 2 is used 
as if where the complete content of the recursive table and the 
recursive-select is run again, resulting in a value of 3 being added
to the queue.  This repeats 999999 times until finally at step 2a the
only value on the queue is a row containing 1000000.  That row is
extracted and added to the recursive table.  But this time, the
WHERE clause causes the recursive-select to return no rows, so the
queue remains empty and the recursion stops.

<p><b>Optimization note:</b>
In the discussion above, when we say things like "write the row into
the recursive table", we mean than conceptually, not in reality.
Read literally, it sounds as if SQLite is accumulating a huge table
containing one million rows, then going back and scanning that table
from top to bottom to generate the result.  What really happens though,
in this case, is that the query optimizer realizes that values in the
"cnt" recursive table are only used once.  So as each row is added to
the recursive table, it is immediately returned as a result of the
SELECT statement and then discarded.  SQLite does <em>not</em> accumulate
a temporary table containing a million rows.  Very little memory is
needed to run the above example.  However, if the example had used
UNION instead of UNION ALL, then SQLite would have had to keep around
all previously generated content in order to check for duplicates.
For this reason, programmers should strive to use UNION ALL instead
of UNION when feasible.

<p>Here is a variation on the same example:

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (
     SELECT 1
     UNION ALL
     SELECT x+1 FROM cnt LIMIT 1000000
  )
SELECT x FROM cnt;
</pre></blockquote>

<p>There are two differences in this variation.  The initial-select is
"SELECT 1" instead of "VALUES(1)".  But those are really just different
syntaxes for saying exactly the same thing.  The other change is that the
recursion is stopped by a LIMIT rather than a WHERE clause.  The use of
LIMIT means that when the one-millionth row is added to the "cnt" table
(and immediately returned by the SELECT, thanks to the query optimizer)
then the recursion stops immediately regardless of how many rows might be
left in the queue.  On more complex queries, it can sometimes be
difficult to ensure that the WHERE clause will eventually cause the
queue to drain and the recursion to terminate.  But the LIMIT clause will
always stop the recursion.  So it is good practice to always include a
LIMIT clause as a safety if an upper bound on the size of the recursion 
is known.







|
|


|

|


|







|
|
|

|
|

|








|












|



|







3150
3151
3152
3153
3154
3155
3156
3157
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
3173
3174
3175
3176
3177
3178
3179
3180
3181
3182
3183
3184
3185
3186
3187
3188
3189
3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
3206
3207
3208
3209
3210
3211
3212
3213
3214

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<1000000)
SELECT x FROM cnt;
</pre></blockquote>

<p>Consider how this query works.  The initial-select
runs first and returns a single row
with a single column "1".  This one row is added to the queue.  In
step 2a, that one row is extracted from the queue and added to "cnt".
Then the recursive-select is run in accordance with step 2c generating
a single new row with value "2" to add to the queue.  The queue still
has one row, so step 2 repeats.  The "2" row is extracted and added to the
recursive table by steps 2a and 2b.  Then the row containing 2 is used 
as if where the complete content of the recursive table and the 
recursive-select is run again, resulting in a row with value "3" being added
to the queue.  This repeats 999999 times until finally at step 2a the
only value on the queue is a row containing 1000000.  That row is
extracted and added to the recursive table.  But this time, the
WHERE clause causes the recursive-select to return no rows, so the
queue remains empty and the recursion stops.

<p><b>Optimization note:</b>
In the discussion above, statements like "insert the row into
the recursive table" should be understood conceptually, not literally.
It sounds as if SQLite is accumulating a huge table
containing one million rows, then going back and scanning that table
from top to bottom to generate the result.  What really happens
is that the query optimizer sees that values in the
"cnt" recursive table are only used once.  So as each row is added to
the recursive table, that row is immediately returned as a result of the main
SELECT statement and then discarded.  SQLite does <em>not</em> accumulate
a temporary table containing a million rows.  Very little memory is
needed to run the above example.  However, if the example had used
UNION instead of UNION ALL, then SQLite would have had to keep around
all previously generated content in order to check for duplicates.
For this reason, programmers should strive to use UNION ALL instead
of UNION when feasible.

<p>Here is a variation on the previous example:

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (
     SELECT 1
     UNION ALL
     SELECT x+1 FROM cnt LIMIT 1000000
  )
SELECT x FROM cnt;
</pre></blockquote>

<p>There are two differences in this variation.  The initial-select is
"SELECT 1" instead of "VALUES(1)".  But those are just different
syntaxes for saying exactly the same thing.  The other change is that the
recursion is stopped by a LIMIT rather than a WHERE clause.  The use of
LIMIT means that when the one-millionth row is added to the "cnt" table
(and returned by the main SELECT, thanks to the query optimizer)
then the recursion stops immediately regardless of how many rows might be
left in the queue.  On more complex queries, it can sometimes be
difficult to ensure that the WHERE clause will eventually cause the
queue to drain and the recursion to terminate.  But the LIMIT clause will
always stop the recursion.  So it is good practice to always include a
LIMIT clause as a safety if an upper bound on the size of the recursion 
is known.
3223
3224
3225
3226
3227
3228
3229
3230
3231
3232
3233
3234
3235
3236
3237
3238
3239
3240
3241
3242
3243
3244
3245
3246
3247
3248
3249
3250
3251
3252
3253
3254
3255
3256
3257
3258
3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
  boss TEXT REFERENCES org,
  height INT,
  -- other content omitted
);
</pre></blockquote>

<p>Every member in the organization has a name, and most members have
a single boss.  The head of the organization has no boss and so the 
boss field would be NULL for that one row.  The rows of the "org" table
form a tree.

<p>Here is a query that computes the average height over everyone
in Alice's organization, including Alice:

<blockquote><pre>
WITH RECURSIVE
  works_for_alice(n) AS (
    VALUES('Alice')
    UNION
    SELECT name FROM org, works_for_alice
     WHERE org.boss=works_for_alice.n
  )
SELECT avg(height) FROM org
 WHERE org.name IN works_for_alice;
</pre></blockquote>

<p>Next we consider an example that uses two 
common table expressions in a single WITH clause.  
The following table records a family tree:

<blockquote><pre>
CREATE TABLE family(
  name TEXT PRIMARY KEY,
  mom TEXT REFERENCES family,
  dad TEXT REFERENCES family,
  born DATETIME,
  died DATETIME, -- NULL if still alive
  -- other content
);
</pre></blockquote>

<p>The "family" table is similar to the earlier "org" table except that 
now there are two parents to each member (as there normally is in real life).
We want to know all living ancestors of Alice, from oldest to youngest.
An ordinary common table expression, "parent_of", is defined first.  That
ordinary CTE is a view that can be used to find all parents of any
individual.  That ordinary CTE is then used in the "ancestor_of_alice"
recursive CTE.  The recursive CTE is then used in the final query:

<blockquote><pre>







|
|
<
















|















|







3225
3226
3227
3228
3229
3230
3231
3232
3233

3234
3235
3236
3237
3238
3239
3240
3241
3242
3243
3244
3245
3246
3247
3248
3249
3250
3251
3252
3253
3254
3255
3256
3257
3258
3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
3273
  boss TEXT REFERENCES org,
  height INT,
  -- other content omitted
);
</pre></blockquote>

<p>Every member in the organization has a name, and most members have
a single boss.  (The head of the whole organization has a NULL
"boss" field.) The rows of the "org" table form a tree.


<p>Here is a query that computes the average height over everyone
in Alice's organization, including Alice:

<blockquote><pre>
WITH RECURSIVE
  works_for_alice(n) AS (
    VALUES('Alice')
    UNION
    SELECT name FROM org, works_for_alice
     WHERE org.boss=works_for_alice.n
  )
SELECT avg(height) FROM org
 WHERE org.name IN works_for_alice;
</pre></blockquote>

<p>The next example uses two 
common table expressions in a single WITH clause.  
The following table records a family tree:

<blockquote><pre>
CREATE TABLE family(
  name TEXT PRIMARY KEY,
  mom TEXT REFERENCES family,
  dad TEXT REFERENCES family,
  born DATETIME,
  died DATETIME, -- NULL if still alive
  -- other content
);
</pre></blockquote>

<p>The "family" table is similar to the earlier "org" table except that 
now there are two parents to each member.
We want to know all living ancestors of Alice, from oldest to youngest.
An ordinary common table expression, "parent_of", is defined first.  That
ordinary CTE is a view that can be used to find all parents of any
individual.  That ordinary CTE is then used in the "ancestor_of_alice"
recursive CTE.  The recursive CTE is then used in the final query:

<blockquote><pre>
3306
3307
3308
3309
3310
3311
3312
3313
3314
3315
3316
3317
3318
3319
3320
3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350



















































































3351
3352
3353
3354
3355
3356
3357
3358
  PRIMARY KEY(xfrom,xto)
);
CREATE INDEX derivedfrom_back ON derivedfrom(xto,xfrom);
</pre></blockquote>

<p>This graph is acyclic.  And we assume that the mtime of every
child checkin is no less than the mtime of all its parents.  But
unlike the earlier examples, there might be multiple paths of
differing lengths between any two checkins.

<p>We want to know the twenty most recent ancestors in time (out of
the thousands and thousands of ancestors in the whole DAG) for
checkin "@BASELINE".  (A query similar to this is used
by the <a href="http://www.fossil-scm.org/">Fossil</a> VCS to
show the 20 most recent ancestors of a check.  For example:
<a href="http://www.sqlite.org/src/timeline?p=trunk&n=20">http://www.sqlite.org/src/timeline?p=trunk&n=20</a>.)

<blockquote><pre>
WITH RECURSIVE
  ancestor(id,mtime) AS (
    SELECT id, mtime FROM checkin WHERE id=@BASELINE
    UNION
    SELECT derivedfrom.xfrom, checkin.mtime
      FROM ancestor, derivedfrom, checkin
     WHERE ancestor.id=derivedfrom.xto
       AND checkin.id=derivedfrom.xfrom
     ORDER BY checkin.mtime DESC
     LIMIT 20
  )
SELECT * FROM checkin JOIN ancestor ON id=xfrom;
</pre></blockquote>

<p>
The "ORDERBY checkin.mtime DESC" term in the recursive-select makes
the query run much faster by preventing it from following
branches that merge checkins
from long ago.  The ORDER BY forces the recursive-select to focus
on the most recent checkins, the ones we want.  Without the ORDER BY
on the recursive-select, we would be forced to find all ancestors
of the desired check-in, sort them all by mtime, then take the top
twenty.  The ORDER BY essentially sets up a priority queue that
forces the recursive query to look at the most recent ancestors first,
allowing us to insert a LIMIT close and restrict the scope of the
query to just the checkins of interest.




















































































<tcl>hd_fragment rcex3</tcl>
<h4>Outlandish Recursive Query Examples</h4>

<p>The following query computes an approximation of the Mandelbrot Set
and outputs the result as ASCII-art:

<blockquote><pre>
WITH RECURSIVE







|






|
|













|








|
|
|

|


>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|







3307
3308
3309
3310
3311
3312
3313
3314
3315
3316
3317
3318
3319
3320
3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
3399
3400
3401
3402
3403
3404
3405
3406
3407
3408
3409
3410
3411
3412
3413
3414
3415
3416
3417
3418
3419
3420
3421
3422
3423
3424
3425
3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
3441
3442
  PRIMARY KEY(xfrom,xto)
);
CREATE INDEX derivedfrom_back ON derivedfrom(xto,xfrom);
</pre></blockquote>

<p>This graph is acyclic.  And we assume that the mtime of every
child checkin is no less than the mtime of all its parents.  But
unlike the earlier examples, this graph might have multiple paths of
differing lengths between any two checkins.

<p>We want to know the twenty most recent ancestors in time (out of
the thousands and thousands of ancestors in the whole DAG) for
checkin "@BASELINE".  (A query similar to this is used
by the <a href="http://www.fossil-scm.org/">Fossil</a> VCS to
show the N most recent ancestors of a check.  For example:
<a href="http://www.sqlite.org/src/timeline?p=trunk&n=30">http://www.sqlite.org/src/timeline?p=trunk&n=30</a>.)

<blockquote><pre>
WITH RECURSIVE
  ancestor(id,mtime) AS (
    SELECT id, mtime FROM checkin WHERE id=@BASELINE
    UNION
    SELECT derivedfrom.xfrom, checkin.mtime
      FROM ancestor, derivedfrom, checkin
     WHERE ancestor.id=derivedfrom.xto
       AND checkin.id=derivedfrom.xfrom
     ORDER BY checkin.mtime DESC
     LIMIT 20
  )
SELECT * FROM checkin JOIN ancestor USING(id);
</pre></blockquote>

<p>
The "ORDERBY checkin.mtime DESC" term in the recursive-select makes
the query run much faster by preventing it from following
branches that merge checkins
from long ago.  The ORDER BY forces the recursive-select to focus
on the most recent checkins, the ones we want.  Without the ORDER BY
on the recursive-select, one would be forced to compute the complete set of
thousands of ancestors, sort them by all mtime, then take the top twenty.
The ORDER BY essentially sets up a priority queue that
forces the recursive query to look at the most recent ancestors first,
allowing the use of a LIMIT clause to restrict the scope of the
query to just the checkins of interest.

<tcl>hd_fragment withorderby</tcl>
<h4>Controlling Depth-First Versus Breath-First Search Of a Tree
Using ORDER BY</h4>

<p>An ORDER BY clause on the recursive-select can be used to control
whether the search of a tree is depth-first or breath-first.  To
illustrate, we will use a variation on the "org" table from an example
above, without the "height" column, and with some real data inserted:

<blockquote><pre>
CREATE TABLE org(
  name TEXT PRIMARY KEY,
  boss TEXT REFERENCES org
) WITHOUT ROWID;
INSERT INTO org VALUES('Alice',NULL);
INSERT INTO org VALUES('Bob','Alice');
INSERT INTO org VALUES('Cindy','Alice');
INSERT INTO org VALUES('Dave','Bob');
INSERT INTO org VALUES('Emma','Bob');
INSERT INTO org VALUES('Fred','Cindy');
INSERT INTO org VALUES('Gail','Cindy');
</pre></blockquote>

<p>Here is a query to show the tree structure in a breath-first pattern:

<blockquote><pre>
WITH RECURSIVE
  under_alice(name,level) AS (
    VALUES('Alice',0)
    UNION ALL
    SELECT org.name, under_alice.level+1
      FROM org JOIN under_alice ON org.name=under_alice.boss
     ORDER BY 2
  )
SELECT substr('..........',1,level*3) || name FROM under_alice;
</pre></blockquote>

<p>The "ORDER BY 2" (which means the same as "ORDER BY under_alice.level+1")
causes higher levels in the organization chart (with smaller "level" values)
to be processed first, resulting in a breath-first search.  The output is:

<blockquote><pre>
Alice
...Bob
...Cindy
......Dave
......Emma
......Fred
......Gail
</pre></blockquote>

<p>But if we change the ORDER BY clause to add the "DESC" modifier, that will
cause lower levels in the organization (with larger "level" values) to be
processed first by the recursive-select, resulting in a depth-first search:

<blockquote><pre>
WITH RECURSIVE
  under_alice(name,level) AS (
    VALUES('Alice',0)
    UNION ALL
    SELECT org.name, under_alice.level+1
      FROM org JOIN under_alice ON org.name=under_alice.boss
     ORDER BY 2 <b>DESC</b>
  )
SELECT substr('..........',1,level*3) || name FROM under_alice;
</pre></blockquote>

<p>The output of this revised query is:

<blockquote><pre>
Alice
...Bob
......Dave
......Emma
...Cindy
......Fred
......Gail
</pre></blockquote>

<p>When the ORDER BY clause is omitted from the recursive-select, the
queue behaves as a FIFO, which results in a breath-first search.


<tcl>hd_fragment mandelbrot</tcl>
<h4>Outlandish Recursive Query Examples</h4>

<p>The following query computes an approximation of the Mandelbrot Set
and outputs the result as ASCII-art:

<blockquote><pre>
WITH RECURSIVE
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
</pre></blockquote>

<p>In this query, the "xaxis" and "yaxis" CTEs define the grid of points for
which the Mandelbrot Set will be approximated.  Each row in the
"m(iter,cx,cy,x,y)" CTE means that after "iter" iterations, the Mandelbrot
iteration starting at cx,cy has reached point x,y.  The number of iterations
in this example is limited to 28 (which severely limits the resolution of
the computation, but this is just an example, after all).
The "m2(iter,cx,cy)" CTE holds the maximum number of iterations reached when
starting at point cx,cy.
Finally, the entry in the "a(t)" CTE holds a string 
which is a single line of the output ASCII-art.
The SELECT statement at the end just queries the "a" CTE to
retrieve all lines of ASCII-art, one by one.

<p>If you copy/paste the query above into an SQLite [command-line shell],
the output should be the following:

<blockquote><pre>
                                    ....#
                                   ..#*..
                                 ..+####+.
                            .......+####....   +
                           ..##+*##########+.++++







|


|




|
|







3459
3460
3461
3462
3463
3464
3465
3466
3467
3468
3469
3470
3471
3472
3473
3474
3475
3476
3477
3478
3479
3480
3481
3482
</pre></blockquote>

<p>In this query, the "xaxis" and "yaxis" CTEs define the grid of points for
which the Mandelbrot Set will be approximated.  Each row in the
"m(iter,cx,cy,x,y)" CTE means that after "iter" iterations, the Mandelbrot
iteration starting at cx,cy has reached point x,y.  The number of iterations
in this example is limited to 28 (which severely limits the resolution of
the computation, but is sufficient for low-resolution ASCII-art output).
The "m2(iter,cx,cy)" CTE holds the maximum number of iterations reached when
starting at point cx,cy.
Finally, each row in the "a(t)" CTE holds a string 
which is a single line of the output ASCII-art.
The SELECT statement at the end just queries the "a" CTE to
retrieve all lines of ASCII-art, one by one.

<p>Running the query above into an SQLite [command-line shell] results
in the following output:

<blockquote><pre>
                                    ....#
                                   ..#*..
                                 ..+####+.
                            .......+####....   +
                           ..##+*##########+.++++
3411
3412
3413
3414
3415
3416
3417
3418
3419
3420
3421
3422
3423
3424
3425
                            .......+####....   +
                                 ..+####+.
                                   ..#*..
                                    ....#
                                    +.
</pre></blockquote>

<tcl>hd_fragment mandelbrot</tcl>
<p>This next query solves a Sudoku puzzle.  The state of the puzzle is
defined by an 81-character string formed by reading entries from the
puzzle box row by row from left to right and then from top to bottom.
Blank squares in the puzzle are denoted by a "." character.  
Thus the input string

<blockquote>







|







3495
3496
3497
3498
3499
3500
3501
3502
3503
3504
3505
3506
3507
3508
3509
                            .......+####....   +
                                 ..+####+.
                                   ..#*..
                                    ....#
                                    +.
</pre></blockquote>

<tcl>hd_fragment sudoku</tcl>
<p>This next query solves a Sudoku puzzle.  The state of the puzzle is
defined by an 81-character string formed by reading entries from the
puzzle box row by row from left to right and then from top to bottom.
Blank squares in the puzzle are denoted by a "." character.  
Thus the input string

<blockquote>
3489
3490
3491
3492
3493
3494
3495
3496
3497
3498
3499
3500
3501
3502
3503

3504
3505
3506
3507
3508
3509
3510
position.  The complicated "NOT EXISTS" subquery is the magic that
figures out whether or not each candidate "s" string is a valid
sudoku puzzle or not.

<p>The final answer is found by looking for a string with ind==0.
If the original sudoku problem did not have a unique solution, then
the query will return all possible solutions.  If the original problem
was unsolvable, then no rows will be returned.  In this case, the
answer of:

<blockquote>
534678912672195348198342567859761423426853791713924856961537284287419635345286179
</blockquote>

<p>Is computed in about 300 milliseconds.


<h3>Limitations And Caveats</h3>

<ul>
<li><p>
The WITH clause cannot be used within a [CREATE TRIGGER].
<li><p>







|
|





|
>







3573
3574
3575
3576
3577
3578
3579
3580
3581
3582
3583
3584
3585
3586
3587
3588
3589
3590
3591
3592
3593
3594
3595
position.  The complicated "NOT EXISTS" subquery is the magic that
figures out whether or not each candidate "s" string is a valid
sudoku puzzle or not.

<p>The final answer is found by looking for a string with ind==0.
If the original sudoku problem did not have a unique solution, then
the query will return all possible solutions.  If the original problem
was unsolvable, then no rows will be returned.  In this case, the unique
answer is:

<blockquote>
534678912672195348198342567859761423426853791713924856961537284287419635345286179
</blockquote>

<p>The solution was computed in less than 300 milliseconds on a modern
workstation.

<h3>Limitations And Caveats</h3>

<ul>
<li><p>
The WITH clause cannot be used within a [CREATE TRIGGER].
<li><p>