Documentation Source Text

Check-in [52484ca596]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Incremental checkin of work on the common-table-expression documentation.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 52484ca596d71ae91758706b648004c9892a177e
User & Date: drh 2014-01-23 22:25:07.380
Context
2014-01-24
01:20
First cut at WITH documentation. (check-in: 2636279279 user: drh tags: trunk)
2014-01-23
22:25
Incremental checkin of work on the common-table-expression documentation. (check-in: 52484ca596 user: drh tags: trunk)
18:37
Refactoring many of the bubble syntax diagrams, especially related to SELECT. Attempting to make the bubble diagrams easier to follow. (check-in: 9aa1b8df9b user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/lang.in.
26
27
28
29
30
31
32

33
34
35
36
37
38
39
    {{DROP TABLE} droptable}
    {{DROP INDEX} dropindex}
    {INSERT insert}
    {REPLACE replace}
    {DELETE delete}
    {UPDATE update}
    {SELECT select}

    {comment comment}
    {EXPLAIN explain}
    {expression expr}
    {{BEGIN TRANSACTION} transaction}
    {{COMMIT TRANSACTION} transaction COMMIT}
    {{END TRANSACTION} transaction COMMIT}
    {{ROLLBACK TRANSACTION} transaction ROLLBACK}







>







26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
    {{DROP TABLE} droptable}
    {{DROP INDEX} dropindex}
    {INSERT insert}
    {REPLACE replace}
    {DELETE delete}
    {UPDATE update}
    {SELECT select}
    {{WITH clause} withclause}
    {comment comment}
    {EXPLAIN explain}
    {expression expr}
    {{BEGIN TRANSACTION} transaction}
    {{COMMIT TRANSACTION} transaction COMMIT}
    {{END TRANSACTION} transaction COMMIT}
    {{ROLLBACK TRANSACTION} transaction ROLLBACK}
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
The EXPLAIN QUERY PLAN command is described in 
[explain query plan|more detail here].

<tcl>
##############################################################################
Section expression expr {*expression {expression syntax}}

RecursiveBubbleDiagram expr 1
</tcl>

<p>This section is different from the others.  Most other sections of
this document talks about a particular SQL command.  This section does
not talk about a standalone command but about "expressions" which are 
subcomponents of most other commands.</p>








|







1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
The EXPLAIN QUERY PLAN command is described in 
[explain query plan|more detail here].

<tcl>
##############################################################################
Section expression expr {*expression {expression syntax}}

RecursiveBubbleDiagram expr
</tcl>

<p>This section is different from the others.  Most other sections of
this document talks about a particular SQL command.  This section does
not talk about a standalone command but about "expressions" which are 
subcomponents of most other commands.</p>

3023
3024
3025
3026
3027
3028
3029






































































































































































































3030
3031
3032
3033
3034
3035
3036
</tcl>

<p>^The REPLACE command is an alias for the "[ON CONFLICT | INSERT OR REPLACE]"
variant of the [INSERT] command.  
This alias is provided for compatibility other SQL database engines.  See the 
[INSERT] command documentation for additional information.</p>  







































































































































































































<tcl>
###############################################################################
Section SELECT select {SELECT query}

RecursiveBubbleDiagram select-stmt
</tcl>








>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







3024
3025
3026
3027
3028
3029
3030
3031
3032
3033
3034
3035
3036
3037
3038
3039
3040
3041
3042
3043
3044
3045
3046
3047
3048
3049
3050
3051
3052
3053
3054
3055
3056
3057
3058
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
3070
3071
3072
3073
3074
3075
3076
3077
3078
3079
3080
3081
3082
3083
3084
3085
3086
3087
3088
3089
3090
3091
3092
3093
3094
3095
3096
3097
3098
3099
3100
3101
3102
3103
3104
3105
3106
3107
3108
3109
3110
3111
3112
3113
3114
3115
3116
3117
3118
3119
3120
3121
3122
3123
3124
3125
3126
3127
3128
3129
3130
3131
3132
3133
3134
3135
3136
3137
3138
3139
3140
3141
3142
3143
3144
3145
3146
3147
3148
3149
3150
3151
3152
3153
3154
3155
3156
3157
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
3173
3174
3175
3176
3177
3178
3179
3180
3181
3182
3183
3184
3185
3186
3187
3188
3189
3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
3206
3207
3208
3209
3210
3211
3212
3213
3214
3215
3216
3217
3218
3219
3220
3221
3222
3223
3224
3225
3226
3227
3228
3229
3230
3231
3232
3233
3234
3235
</tcl>

<p>^The REPLACE command is an alias for the "[ON CONFLICT | INSERT OR REPLACE]"
variant of the [INSERT] command.  
This alias is provided for compatibility other SQL database engines.  See the 
[INSERT] command documentation for additional information.</p>  

<tcl>
###############################################################################
Section {WITH clause} with {{common table expressions}}

RecursiveBubbleDiagram with-clause
</tcl>

<p>Common Table Expressions or CTEs are temporary views or tables that exist
only for the duration of a single SQL statement.  There are two kinds of
common table expressions: "ordinary" and "recursive". Ordinary 
common table expressions are
a syntactic convenience that make queries easier to read by separating out
subqueries.  Recursive common table expression
provide the ability to do a hiearchical or
recursive query of a tree or graph.

All common table expressions (ordinary and recursive) are 
created by prepending a WITH clause in front of a [SELECT], [INSERT], [DELETE],
or [UPDATE] statement.  A single WITH clause can specify one or more
common table expressions.

<tcl>hd_fragment ordinarycte {ordinary common table expressions}</tcl>
<h3>Ordinary Common Table Expressions</h3>

<p>An ordinary common table expression works as if it where a [view] that
exists for the duration of a single statement.  Ordinary common table
expressions are useful for factoring out subqueries and making the overall
SQL statement easier to read and understand.

<tcl>
hd_fragment recursivecte {recursive common table expressions} \
{recursive query}
</tcl>
<h3>Recursive Common Table Expressions</h3>

<p>A recursive common table expression can be used to write a query that
walks a tree or graph.  A recursive common table expression has the same
basic syntax as an ordinary common table expression, but with the following
additional features:

<ol>
<li> The "select-stmt" after the AS keyword in the common table expression
     must be a [compound select] where the right-most compound operator is
     either UNION or UNION ALL.
<li> The table named on the left-hand side of the AS keyword must appear
     exactly once in the FROM clause of the right-most SELECT statement
     of the compound select, and nowhere else.
</ol>

<p>To put it another way, a recursive common table expression must
look like the following:

RecursiveBubbleDiagram recursive-cte

<p>Call the cte-table-name for a recursive common table expression
the "recursive table".  In the bubble diagram above, the recursive
table must appear exactly once in the FROM clause of the recursive-select
and must not appear anywhere else in either the initial-select or the
recursive-select, including subqueries.  The initial-select may be
a compound-select, but it may not include an ORDER BY, LIMIT, or OFFSET.
The recursive-select must be a simple select (not a compound).  The
recursive-select is allowed to include an ORDER BY, LIMIT, and/or OFFSET.

<p>The basic algorithm for computing the content of the recursive table
is as follows:

<ol>
<li> Run the initial-select and add the results to a queue.
<li> While the queue is not empty:
<ol type="a">
<li> Extract a single row from the queue.
<li> Write that single row into the recursive table
<li> Pretend that the single row just extracted is the only
     row in the recursive table and run the recursive-select,
     adding all results to the queue.
</ol>
</ol>

<p>The basic procedure above may modified by the following additional rules:

<ul>
<li><p>
  If the the initial-select and the recursive-select are connected by
  UNION, then only add rows to the queue if no identical row has been
  previously added to the queue.  Repeated rows are discarded before being
  added to the queue even if the repeated rows have already been extracted
  from the queue by the recursion stop.  If the operator is UNION ALL,
  then all rows generated by both the initial-select and the
  recursive-select are always added to the queue even if there are repeats.
<li><p>
  The LIMIT clause, if present, restrict which rows are added
  to the recursive table in step 2a.  If a LIMIT clause is present, it 
  restricts the total number of rows that will be added to the recursive table.
  Once the limit is reached, the recursion stops.  (As is always the case 
  with LIMIT, a limit of zero means that no rows are ever added to the
  recursive table, and a negative limit means an unlimited number of rows
  may be added to the recursive table.)
<li><p>
  The OFFSET clause, if it present and has a positive value N, prevents the
  first N rows from being added to the recursive table.
  The first N rows are still processed by the recursive-select; they
  just are not added to the recursive table.
<li><p>
  If an ORDER BY clause is present, it determines the order in which rows
  are extracted from the queue in step 2a.  If there is no ORDER BY clause,
  then the order in which rows are extracted is undefined.  (In the current
  implementation, the queue becomes a FIFO if the ORDER BY clause is omitted,
  but applications should not depend on that fact since it might change.)
</ul>

<h4>Recursive Query Examples</h4>

<p>The following query returns all integers between 1 and 1000000:

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<1000000)
SELECT x FROM cnt;
</pre></blockquote>

<p>Consider how this query works.  The initial-select, which is this
case is just "VALUES(1") runs first (step 1) and returns a single row
with a single column "1".  This one row is added to the queue.  In
step 2a, that one row is extracted from the queue and added to "cnt".
Then the recursive query is run in accordance with step 2c generating
a single new row with value "2" to add to the queue.  The queue still
has one row, so step 2 repeats.  The 2 is extracted and added to the
recursive table by steps 2a and 2b.  Then the row containing 2 is used 
as if where the complete content of the recursive table and the 
recursive select is run again, resulting in a value of 3 being added
to the queue.  This repeats 999999 times until finally at step 2a the
only value on the queue is a row containing 1000000.  That row is
extracted and added to the recursive table.  But this time, the
WHERE clause of the recursive select prevents it from generating any
new rows.  On the next cycle, the queue is empty and the recursion
stops.

<p><b>Optimization note:</b>
In the discussion above, when we say things like "write the row into
the recursive table", we mean than conceptually, not in reality.
Read literally, it sounds as if SQLite is accumulating a huge table
containing one million rows, then going back and scanning that table
from top to bottom to generate the result.  What really happens though,
in this case, is that the query optimizer realizes that values in the
"cnt" recursive table are only used once.  So as each row is added to
the recursive table, it is immediately returned as a result of the
SELECT statement and then discarded.  SQLite does <em>not</em> accumulate
a temporary table containing a million rows.  Very little memory is
needed to run the above example.  However, if the example had used
UNION instead of UNION ALL, then SQLite would have had to keep around
all previously generated content in order to check for duplicates.
For this reason, programmers should strive to use UNION ALL instead
of UNION when feasible.

<p>Here is a variation on the same example:

<blockquote><pre>
WITH RECURSIVE
  cnt(x) AS (
     SELECT 1
     UNION ALL
     SELECT x+1 FROM cnt LIMIT 1000000
  )
SELECT x FROM cnt;
</pre></blockquote>

<p>There are two differences in this variation.  The initial-select is
"SELECT 1" instead of "VALUES(1)".  But those are really just different
syntaxes for saying exactly the same thing.  The other change is that the
recursion is stopped by a LIMIT rather than a WHERE clause.  The use of
LIMIT means that when the one-millionth row is added to the "cnt" table
(and immediately returned by the SELECT, thanks to the query optimizer)
then the recursion stops immediately regardless of how many rows might be
left in the queue.  On more complex queries, it can sometimes be
difficult to ensure that the WHERE clause will eventually cause the
queue to drain and the recursion to terminate.  But the LIMIT clause will
aways stop the recursions.  So it is good practice to always include a
LIMIT clause as a safety if an upper bound on the size of the recursion 
is known.




<h3>Limitations And Caveats</h3>

<ul>
<li><p>
The WITH clause cannot be used within a [CREATE TRIGGER].
<li><p>
The WITH clause must appear at the beginning of a top-level [SELECT] statement
or at the beginning of a subquery.  The WITH clause cannot be prepended to
the second or subsequent SELECT statement of a [compound select].
<li><p>
The SQL:1999 spec requires that the RECURSIVE keyword follow WITH in any
WITH clause that includs a recursive common table expression.  However, for
compatibility with SqlServer and Oracle, SQLite does not enforce this rule.
</ul>

<tcl>
###############################################################################
Section SELECT select {SELECT query}

RecursiveBubbleDiagram select-stmt
</tcl>

3667
3668
3669
3670
3671
3672
3673
3674
3675
3676
3677
3678
3679
3680
3681
   CAST
   CHECK
   COLLATE
   COLUMN
   COMMIT
   CONFLICT
   CONSTRAINT
   COVERING
   CREATE
   CROSS
   CURRENT_DATE
   CURRENT_TIME
   CURRENT_TIMESTAMP
   DATABASE
   DEFAULT







<







3866
3867
3868
3869
3870
3871
3872

3873
3874
3875
3876
3877
3878
3879
   CAST
   CHECK
   COLLATE
   COLUMN
   COMMIT
   CONFLICT
   CONSTRAINT

   CREATE
   CROSS
   CURRENT_DATE
   CURRENT_TIME
   CURRENT_TIMESTAMP
   DATABASE
   DEFAULT
3734
3735
3736
3737
3738
3739
3740

3741
3742
3743
3744
3745
3746
3747
   ORDER
   OUTER
   PLAN
   PRAGMA
   PRIMARY
   QUERY
   RAISE

   REFERENCES
   REGEXP
   REINDEX
   RELEASE
   RENAME
   REPLACE
   RESTRICT







>







3932
3933
3934
3935
3936
3937
3938
3939
3940
3941
3942
3943
3944
3945
3946
   ORDER
   OUTER
   PLAN
   PRAGMA
   PRIMARY
   QUERY
   RAISE
   RECURSIVE
   REFERENCES
   REGEXP
   REINDEX
   RELEASE
   RENAME
   REPLACE
   RESTRICT
3762
3763
3764
3765
3766
3767
3768
3769
3770
3771
3772


3773
3774
3775
3776
3777
3778
3779
   UNIQUE
   UPDATE
   USING
   VACUUM
   VALUES
   VIEW
   VIRTUAL
   WHEN
   WHERE
   WITH
   WITHOUT


}]

hd_puts {<DIV class="pdf_section">}
Section {SQLite Keywords} keywords {{*SQL keyword} {SQL keywords}}
hd_puts {</DIV>}
</tcl>








<
<


>
>







3961
3962
3963
3964
3965
3966
3967


3968
3969
3970
3971
3972
3973
3974
3975
3976
3977
3978
   UNIQUE
   UPDATE
   USING
   VACUUM
   VALUES
   VIEW
   VIRTUAL


   WITH
   WITHOUT
   WHEN
   WHERE
}]

hd_puts {<DIV class="pdf_section">}
Section {SQLite Keywords} keywords {{*SQL keyword} {SQL keywords}}
hd_puts {</DIV>}
</tcl>

3827
3828
3829
3830
3831
3832
3833
3834
3835
3836
3837
3838
3839
3840
3841
is allowed, then the token is understood to be a string literal instead
of an identifier.</p></li>
</ul>

<p>Programmers are cautioned not to use the two exceptions described in
the previous bullets.  We emphasize that they exist only so that old
and ill-formed SQL statements will run correctly.  Future versions of
SQLite might change to raise errors instead of accepting the malformed
statements covered by the exceptions above.</p>

<p>
SQLite adds new keywords from time to time when it takes on new features.
So to prevent your code from being broken by future enhancements, you should
normally quote any identifier that is an English language word, even if
you do not have to.







|







4026
4027
4028
4029
4030
4031
4032
4033
4034
4035
4036
4037
4038
4039
4040
is allowed, then the token is understood to be a string literal instead
of an identifier.</p></li>
</ul>

<p>Programmers are cautioned not to use the two exceptions described in
the previous bullets.  We emphasize that they exist only so that old
and ill-formed SQL statements will run correctly.  Future versions of
SQLite might raise errors instead of accepting the malformed
statements covered by the exceptions above.</p>

<p>
SQLite adds new keywords from time to time when it takes on new features.
So to prevent your code from being broken by future enhancements, you should
normally quote any identifier that is an English language word, even if
you do not have to.