Documentation Source Text

Check-in [5415be54c0]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Fix typos in the WITH clause documentation reported on the mailing list.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 5415be54c097d7c5251ccc328b21dc1e0901f896
User & Date: drh 2014-01-24 23:43:28.779
Context
2014-01-29
13:52
Typo fix. CAST(...TO...) should be CAST(...AS...). (check-in: 64427669a7 user: drh tags: trunk)
2014-01-24
23:43
Fix typos in the WITH clause documentation reported on the mailing list. (check-in: 5415be54c0 user: drh tags: trunk)
13:20
Add examples to WITH documentation showing how to control depth-first versus breath-first search of a tree using an ORDER BY clause on the recursive query. (check-in: fb11eda20b user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/lang.in.
3050
3051
3052
3053
3054
3055
3056
3057
3058
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
created by prepending a WITH clause in front of a [SELECT], [INSERT], [DELETE],
or [UPDATE] statement.  A single WITH clause can specify one or more
common table expressions.

<tcl>hd_fragment ordinarycte {ordinary common table expressions}</tcl>
<h3>Ordinary Common Table Expressions</h3>

<p>An ordinary common table expression works as if it where a [view] that
exists for the duration of a single statement.  Ordinary common table
expressions are useful for factoring out subqueries and making the overall
SQL statement easier to read and understand.

<tcl>
hd_fragment recursivecte {recursive common table expressions} \
{recursive query}
</tcl>
<h3>Recursive Common Table Expressions</h3>

<p>A recursive common table expression can be used to write a query that
walks a tree or graph.  A recursive common table expression has the same
basic syntax as an ordinary common table expression, but with the following
additional features:

<ol>
<li> The "select-stmt" after the AS keyword
     must be a [compound select] where the right-most [compound-operator] is
     either UNION or UNION ALL.
<li> The table named on the left-hand side of the AS keyword must appear
     exactly once in the FROM clause of the right-most SELECT statement
     of the compound select, and nowhere else.
</ol>








|
















|







3050
3051
3052
3053
3054
3055
3056
3057
3058
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
created by prepending a WITH clause in front of a [SELECT], [INSERT], [DELETE],
or [UPDATE] statement.  A single WITH clause can specify one or more
common table expressions.

<tcl>hd_fragment ordinarycte {ordinary common table expressions}</tcl>
<h3>Ordinary Common Table Expressions</h3>

<p>An ordinary common table expression works as if it were a [view] that
exists for the duration of a single statement.  Ordinary common table
expressions are useful for factoring out subqueries and making the overall
SQL statement easier to read and understand.

<tcl>
hd_fragment recursivecte {recursive common table expressions} \
{recursive query}
</tcl>
<h3>Recursive Common Table Expressions</h3>

<p>A recursive common table expression can be used to write a query that
walks a tree or graph.  A recursive common table expression has the same
basic syntax as an ordinary common table expression, but with the following
additional features:

<ol>
<li> The "select-stmt"
     must be a [compound select] where the right-most [compound-operator] is
     either UNION or UNION ALL.
<li> The table named on the left-hand side of the AS keyword must appear
     exactly once in the FROM clause of the right-most SELECT statement
     of the compound select, and nowhere else.
</ol>

3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121
3122
3123
</ol>
</ol>

<p>The basic procedure above may modified by the following additional rules:

<ul>
<li><p>
  If a UNION operator connects the the initial-select with the
  recursive-select, then only add rows to the queue if no identical row has
  been previously added to the queue.  Repeated rows are discarded before being
  added to the queue even if the repeated rows have already been extracted
  from the queue by the recursion step.  If the operator is UNION ALL,
  then all rows generated by both the initial-select and the
  recursive-select are always added to the queue even if they are repeats.
  When determining if a row is repeated, NULL values compare







|







3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121
3122
3123
</ol>
</ol>

<p>The basic procedure above may modified by the following additional rules:

<ul>
<li><p>
  If a UNION operator connects the initial-select with the
  recursive-select, then only add rows to the queue if no identical row has
  been previously added to the queue.  Repeated rows are discarded before being
  added to the queue even if the repeated rows have already been extracted
  from the queue by the recursion step.  If the operator is UNION ALL,
  then all rows generated by both the initial-select and the
  recursive-select are always added to the queue even if they are repeats.
  When determining if a row is repeated, NULL values compare
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
runs first and returns a single row
with a single column "1".  This one row is added to the queue.  In
step 2a, that one row is extracted from the queue and added to "cnt".
Then the recursive-select is run in accordance with step 2c generating
a single new row with value "2" to add to the queue.  The queue still
has one row, so step 2 repeats.  The "2" row is extracted and added to the
recursive table by steps 2a and 2b.  Then the row containing 2 is used 
as if where the complete content of the recursive table and the 
recursive-select is run again, resulting in a row with value "3" being added
to the queue.  This repeats 999999 times until finally at step 2a the
only value on the queue is a row containing 1000000.  That row is
extracted and added to the recursive table.  But this time, the
WHERE clause causes the recursive-select to return no rows, so the
queue remains empty and the recursion stops.








|







3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
runs first and returns a single row
with a single column "1".  This one row is added to the queue.  In
step 2a, that one row is extracted from the queue and added to "cnt".
Then the recursive-select is run in accordance with step 2c generating
a single new row with value "2" to add to the queue.  The queue still
has one row, so step 2 repeats.  The "2" row is extracted and added to the
recursive table by steps 2a and 2b.  Then the row containing 2 is used 
as if it were the complete content of the recursive table and the 
recursive-select is run again, resulting in a row with value "3" being added
to the queue.  This repeats 999999 times until finally at step 2a the
only value on the queue is a row containing 1000000.  That row is
extracted and added to the recursive table.  But this time, the
WHERE clause causes the recursive-select to return no rows, so the
queue remains empty and the recursion stops.

3190
3191
3192
3193
3194
3195
3196
3197

3198
3199
3200
3201
3202
3203
3204
<p>Here is a variation on the previous example:

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (
     SELECT 1
     UNION ALL
     SELECT x+1 FROM cnt LIMIT 1000000

  )
SELECT x FROM cnt;
</pre></blockquote>

<p>There are two differences in this variation.  The initial-select is
"SELECT 1" instead of "VALUES(1)".  But those are just different
syntaxes for saying exactly the same thing.  The other change is that the







|
>







3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
<p>Here is a variation on the previous example:

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (
     SELECT 1
     UNION ALL
     SELECT x+1 FROM cnt
      LIMIT 1000000
  )
SELECT x FROM cnt;
</pre></blockquote>

<p>There are two differences in this variation.  The initial-select is
"SELECT 1" instead of "VALUES(1)".  But those are just different
syntaxes for saying exactly the same thing.  The other change is that the
3213
3214
3215
3216
3217
3218
3219
3220
3221
3222
3223
3224
3225
3226
3227
LIMIT clause as a safety if an upper bound on the size of the recursion 
is known.

<tcl>hd_fragment rcex2</tcl>
<h4>Hierarchical Query Examples</h4>

<p>Consider a table that describes the members of an organization as
well as the chain-of-command within that organization.

<blockquote><pre>
CREATE TABLE org(
  name TEXT PRIMARY KEY,
  boss TEXT REFERENCES org,
  height INT,
  -- other content omitted







|







3214
3215
3216
3217
3218
3219
3220
3221
3222
3223
3224
3225
3226
3227
3228
LIMIT clause as a safety if an upper bound on the size of the recursion 
is known.

<tcl>hd_fragment rcex2</tcl>
<h4>Hierarchical Query Examples</h4>

<p>Consider a table that describes the members of an organization as
well as the chain-of-command within that organization:

<blockquote><pre>
CREATE TABLE org(
  name TEXT PRIMARY KEY,
  boss TEXT REFERENCES org,
  height INT,
  -- other content omitted
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
     ORDER BY checkin.mtime DESC
     LIMIT 20
  )
SELECT * FROM checkin JOIN ancestor USING(id);
</pre></blockquote>

<p>
The "ORDERBY checkin.mtime DESC" term in the recursive-select makes
the query run much faster by preventing it from following
branches that merge checkins
from long ago.  The ORDER BY forces the recursive-select to focus
on the most recent checkins, the ones we want.  Without the ORDER BY
on the recursive-select, one would be forced to compute the complete set of
thousands of ancestors, sort them by all mtime, then take the top twenty.
The ORDER BY essentially sets up a priority queue that
forces the recursive query to look at the most recent ancestors first,
allowing the use of a LIMIT clause to restrict the scope of the
query to just the checkins of interest.

<tcl>hd_fragment withorderby</tcl>
<h4>Controlling Depth-First Versus Breath-First Search Of a Tree
Using ORDER BY</h4>

<p>An ORDER BY clause on the recursive-select can be used to control
whether the search of a tree is depth-first or breath-first.  To
illustrate, we will use a variation on the "org" table from an example
above, without the "height" column, and with some real data inserted:

<blockquote><pre>
CREATE TABLE org(
  name TEXT PRIMARY KEY,
  boss TEXT REFERENCES org
) WITHOUT ROWID;
INSERT INTO org VALUES('Alice',NULL);
INSERT INTO org VALUES('Bob','Alice');
INSERT INTO org VALUES('Cindy','Alice');
INSERT INTO org VALUES('Dave','Bob');
INSERT INTO org VALUES('Emma','Bob');
INSERT INTO org VALUES('Fred','Cindy');
INSERT INTO org VALUES('Gail','Cindy');
</pre></blockquote>

<p>Here is a query to show the tree structure in a breath-first pattern:

<blockquote><pre>
WITH RECURSIVE
  under_alice(name,level) AS (
    VALUES('Alice',0)
    UNION ALL
    SELECT org.name, under_alice.level+1
      FROM org JOIN under_alice ON org.name=under_alice.boss
     ORDER BY 2
  )
SELECT substr('..........',1,level*3) || name FROM under_alice;
</pre></blockquote>

<p>The "ORDER BY 2" (which means the same as "ORDER BY under_alice.level+1")
causes higher levels in the organization chart (with smaller "level" values)
to be processed first, resulting in a breath-first search.  The output is:

<blockquote><pre>
Alice
...Bob
...Cindy
......Dave
......Emma







|





|






|



|

















|















|







3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
3388
3389
3390
3391
3392
3393
3394
3395
3396
3397
3398
3399
     ORDER BY checkin.mtime DESC
     LIMIT 20
  )
SELECT * FROM checkin JOIN ancestor USING(id);
</pre></blockquote>

<p>
The "ORDER BY checkin.mtime DESC" term in the recursive-select makes
the query run much faster by preventing it from following
branches that merge checkins
from long ago.  The ORDER BY forces the recursive-select to focus
on the most recent checkins, the ones we want.  Without the ORDER BY
on the recursive-select, one would be forced to compute the complete set of
thousands of ancestors, sort them all by mtime, then take the top twenty.
The ORDER BY essentially sets up a priority queue that
forces the recursive query to look at the most recent ancestors first,
allowing the use of a LIMIT clause to restrict the scope of the
query to just the checkins of interest.

<tcl>hd_fragment withorderby</tcl>
<h4>Controlling Depth-First Versus Breadth-First Search Of a Tree
Using ORDER BY</h4>

<p>An ORDER BY clause on the recursive-select can be used to control
whether the search of a tree is depth-first or breadth-first.  To
illustrate, we will use a variation on the "org" table from an example
above, without the "height" column, and with some real data inserted:

<blockquote><pre>
CREATE TABLE org(
  name TEXT PRIMARY KEY,
  boss TEXT REFERENCES org
) WITHOUT ROWID;
INSERT INTO org VALUES('Alice',NULL);
INSERT INTO org VALUES('Bob','Alice');
INSERT INTO org VALUES('Cindy','Alice');
INSERT INTO org VALUES('Dave','Bob');
INSERT INTO org VALUES('Emma','Bob');
INSERT INTO org VALUES('Fred','Cindy');
INSERT INTO org VALUES('Gail','Cindy');
</pre></blockquote>

<p>Here is a query to show the tree structure in a breadth-first pattern:

<blockquote><pre>
WITH RECURSIVE
  under_alice(name,level) AS (
    VALUES('Alice',0)
    UNION ALL
    SELECT org.name, under_alice.level+1
      FROM org JOIN under_alice ON org.name=under_alice.boss
     ORDER BY 2
  )
SELECT substr('..........',1,level*3) || name FROM under_alice;
</pre></blockquote>

<p>The "ORDER BY 2" (which means the same as "ORDER BY under_alice.level+1")
causes higher levels in the organization chart (with smaller "level" values)
to be processed first, resulting in a breadth-first search.  The output is:

<blockquote><pre>
Alice
...Bob
...Cindy
......Dave
......Emma
3425
3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
......Emma
...Cindy
......Fred
......Gail
</pre></blockquote>

<p>When the ORDER BY clause is omitted from the recursive-select, the
queue behaves as a FIFO, which results in a breath-first search.


<tcl>hd_fragment mandelbrot</tcl>
<h4>Outlandish Recursive Query Examples</h4>

<p>The following query computes an approximation of the Mandelbrot Set
and outputs the result as ASCII-art:







|







3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
......Emma
...Cindy
......Fred
......Gail
</pre></blockquote>

<p>When the ORDER BY clause is omitted from the recursive-select, the
queue behaves as a FIFO, which results in a breadth-first search.


<tcl>hd_fragment mandelbrot</tcl>
<h4>Outlandish Recursive Query Examples</h4>

<p>The following query computes an approximation of the Mandelbrot Set
and outputs the result as ASCII-art:
3500
3501
3502
3503
3504
3505
3506
3507
3508
3509
3510
3511
3512
3513
3514
</pre></blockquote>

<tcl>hd_fragment sudoku</tcl>
<p>This next query solves a Sudoku puzzle.  The state of the puzzle is
defined by an 81-character string formed by reading entries from the
puzzle box row by row from left to right and then from top to bottom.
Blank squares in the puzzle are denoted by a "." character.  
Thus the input string

<blockquote>
53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79
</blockquote>

<p>Corresponds to a puzzle like this:








|







3501
3502
3503
3504
3505
3506
3507
3508
3509
3510
3511
3512
3513
3514
3515
</pre></blockquote>

<tcl>hd_fragment sudoku</tcl>
<p>This next query solves a Sudoku puzzle.  The state of the puzzle is
defined by an 81-character string formed by reading entries from the
puzzle box row by row from left to right and then from top to bottom.
Blank squares in the puzzle are denoted by a "." character.  
Thus the input string:

<blockquote>
53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79
</blockquote>

<p>Corresponds to a puzzle like this:

3557
3558
3559
3560
3561
3562
3563
3564
3565
3566
3567
3568
3569
3570
3571
         )
  )
SELECT s FROM x WHERE ind=0;
</pre></blockquote>

<p>The "input" CTE defines the input puzzle.
The "digits" CTE defines a table that holds all digits between 1 and 9.
The work of solving the puzzle is undertaking by the "x" CTE.
An entry in x(s,ind) means that the 81-character string "s" is a valid
sudoku puzzle (it has no conflicts) and that the first unknown character
is at position "ind", or ind==0 if all character positions are filled in.
The goal, then, is to compute entries for "x" with an "ind" of 0.

<p>The solver works by adding new entries to the "x" recursive table.
Given prior entries, the recursive-select tries to fill in a single new







|







3558
3559
3560
3561
3562
3563
3564
3565
3566
3567
3568
3569
3570
3571
3572
         )
  )
SELECT s FROM x WHERE ind=0;
</pre></blockquote>

<p>The "input" CTE defines the input puzzle.
The "digits" CTE defines a table that holds all digits between 1 and 9.
The work of solving the puzzle is undertaken by the "x" CTE.
An entry in x(s,ind) means that the 81-character string "s" is a valid
sudoku puzzle (it has no conflicts) and that the first unknown character
is at position "ind", or ind==0 if all character positions are filled in.
The goal, then, is to compute entries for "x" with an "ind" of 0.

<p>The solver works by adding new entries to the "x" recursive table.
Given prior entries, the recursive-select tries to fill in a single new