Documentation Source Text

Check-in [2441ba87c8]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Changes to lang_select.html to make it more testable.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 2441ba87c8e2403757750c387132149f99e09ac6
User & Date: dan 2010-09-03 19:07:58
Context
2010-09-04
18:36
Fix a couple of errors in the recent changes to lang_select.html. check-in: b47f9e77a6 user: dan tags: trunk
2010-09-03
19:07
Changes to lang_select.html to make it more testable. check-in: 2441ba87c8 user: dan tags: trunk
2010-09-02
23:37
Tweaks to the formatting of the requirements traceability matrix. check-in: e4a472e7f6 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/lang.in.

2582
2583
2584
2585
2586
2587
2588
2589

2590
2591
2592
2593




















































































































2594
2595

2596

2597
2598
2599




2600
2601
2602
2603
2604




2605
2606
2607
2608
2609
2610
2611
2612
2613
2614








2615
2616
2617




2618
2619
2620
2621







2622
2623
2624







2625











































2626
2627
2628
2629
2630
2631
2632
....
2665
2666
2667
2668
2669
2670
2671
2672
2673
2674
2675
2676
2677
2678
2679
2680
2681
2682
2683
2684
2685
2686
2687
2688
2689
2690
2691
BubbleDiagram join-source
BubbleDiagram single-source
BubbleDiagram join-op
BubbleDiagram join-constraint
BubbleDiagram ordering-term
BubbleDiagram compound-operator
</tcl>


<p>The SELECT statement is used to query the database.  ^The
result of a SELECT is zero or more rows of data where each row
has a fixed number of columns.  ^The number of columns in the
result is specified by the expression list in between the




















































































































SELECT and FROM keywords.  ^Any arbitrary expression can be used
as a result.  ^If a result expression is }

hd_puts "[Operator *] then all columns of all tables are substituted\n"

hd_puts "for that one expression.  ^(If the expression is the name of\n"
hd_puts "a table followed by [Operator .*] then the result is all columns\n"
hd_puts {in that one table.</p>)^





<p>^The DISTINCT keyword causes a subset of result rows to be returned, 
in which each result row is different.  ^NULL values are not treated as 
distinct from each other.  ^The default behavior is that all result rows 
be returned, which can be made explicit with the keyword ALL.</p>





<p>^The query is executed against one or more tables specified after
the FROM keyword.  ^If multiple tables names are separated by commas,
then the query is against the cross join of the various tables.
^The full SQL-92 join syntax can also be used to specify joins.
^A sub-query
in parentheses may be substituted for any table name in the FROM clause.
^The entire FROM clause may be omitted, in which case the result is a
single row consisting of the values of the expression list.
</p>









<p>^The WHERE clause can be used to limit the number of rows over
which the query operates.</p>





<p>^The GROUP BY clause causes one or more rows of the result to
be combined into a single row of output.  This is especially useful
when the result contains aggregate functions.  ^The expressions in







the GROUP BY clause do <em>not</em> have to be expressions that
appear in the result.  ^The HAVING clause is similar to WHERE except
that HAVING applies after grouping has occurred.  ^The HAVING expression







may refer to values, even aggregate functions, that are not in the result.</p>












































<p>^The ORDER BY clause causes the output rows to be sorted.  
^The argument to ORDER BY is a list of expressions that are used as the
key for the sort.  ^The expressions do not have to be part of the
result for a simple SELECT, but in a compound SELECT each sort
expression must exactly match one of the result columns.  ^Each
sort expression may be optionally followed by a COLLATE keyword and
................................................................................
limit is the first number and the offset is the second number.  ^If a
comma is used instead of the OFFSET keyword, then the offset is the
first number and the limit is the second number.  This seeming
contradiction is intentional - it maximizes compatibility with legacy
SQL database systems.
</p>

<p>^A compound SELECT is formed from two or more simple SELECTs connected
by one of the operators UNION, UNION ALL, INTERSECT, or EXCEPT.  ^In
a compound SELECT, all the constituent SELECTs must specify the
same number of result columns.  ^There may be only a single ORDER BY
clause at the end of the compound SELECT.  ^The UNION and UNION ALL
operators combine the results of the SELECTs to the right and left into
a single big table.  ^The difference is that in UNION all result rows
are distinct where in UNION ALL there may be duplicates.
^The INTERSECT operator takes the intersection of the results of the
left and right SELECTs.  ^EXCEPT takes the result of left SELECT after
removing the results of the right SELECT.  ^When three or more SELECTs
are connected into a compound, they group from left to right.</p>


<tcl>
##############################################################################
Section UPDATE update {UPDATE *UPDATEs}

BubbleDiagram update-stmt 1
BubbleDiagram qualified-table-name








>


|
<
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|
|
>
|
>
|
<
<
>
>
>
>

<
<
<
<
>
>
>
>

<
<
<
<
<
<
<
<
<
>
>
>
>
>
>
>
>

<
<
>
>
>
>

<
<
<
>
>
>
>
>
>
>
|
|
<
>
>
>
>
>
>
>
|
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







 







<
<
<
<
<
<
<
<
<
<
<
<
<







2582
2583
2584
2585
2586
2587
2588
2589
2590
2591
2592
2593

2594
2595
2596
2597
2598
2599
2600
2601
2602
2603
2604
2605
2606
2607
2608
2609
2610
2611
2612
2613
2614
2615
2616
2617
2618
2619
2620
2621
2622
2623
2624
2625
2626
2627
2628
2629
2630
2631
2632
2633
2634
2635
2636
2637
2638
2639
2640
2641
2642
2643
2644
2645
2646
2647
2648
2649
2650
2651
2652
2653
2654
2655
2656
2657
2658
2659
2660
2661
2662
2663
2664
2665
2666
2667
2668
2669
2670
2671
2672
2673
2674
2675
2676
2677
2678
2679
2680
2681
2682
2683
2684
2685
2686
2687
2688
2689
2690
2691
2692
2693
2694
2695
2696
2697
2698
2699
2700
2701
2702
2703
2704
2705
2706
2707
2708
2709
2710
2711
2712
2713
2714
2715


2716
2717
2718
2719
2720




2721
2722
2723
2724
2725









2726
2727
2728
2729
2730
2731
2732
2733
2734


2735
2736
2737
2738
2739



2740
2741
2742
2743
2744
2745
2746
2747
2748

2749
2750
2751
2752
2753
2754
2755
2756
2757
2758
2759
2760
2761
2762
2763
2764
2765
2766
2767
2768
2769
2770
2771
2772
2773
2774
2775
2776
2777
2778
2779
2780
2781
2782
2783
2784
2785
2786
2787
2788
2789
2790
2791
2792
2793
2794
2795
2796
2797
2798
2799
2800
2801
2802
2803
2804
2805
2806
....
2839
2840
2841
2842
2843
2844
2845













2846
2847
2848
2849
2850
2851
2852
BubbleDiagram join-source
BubbleDiagram single-source
BubbleDiagram join-op
BubbleDiagram join-constraint
BubbleDiagram ordering-term
BubbleDiagram compound-operator
</tcl>


<p>The SELECT statement is used to query the database.  ^The
result of a SELECT is zero or more rows of data where each row
has a fixed number of columns.  


<p>The SELECT statement is the most complicated command in the SQL language.
To make the description easier to follow, some of the passages below describe
the way the data returned by a SELECT statement is determined as a series of
steps. It is important to keep in mind that this is purely illustrative -
in practice neither SQLite nor any other SQL engine is required to follow 
this or any other specific process.

<h3>Simple Select Processing</h3>

<p>The syntax for a simple SELECT statement is depicted in the 
[select-core syntax diagram]. Generating the results of a simple SELECT
statement is presented as a four step process in the description below:

<ol>
  <li> The input dataset is determined by joining data from the tables and
       subqueries specified as part of the FROM clause.
  <li> The input dataset is filtered using the WHERE clause expression.
  <li> The set of result rows is computed by aggregating and calculating
       the result-set expressions for the rows of the filtered input dataset.
  <li> If the query is a "SELECT DISTINCT" query, duplicate rows are removed
       from the set of result rows.
</ol>

<p>^There are two types of simple SELECT statement - aggregate and 
non-aggregate queries. ^A simple SELECT statement is an aggregate query if
it contains either a GROUP BY clause or one or more aggregate functions
in the result-set. ^Otherwise, if a simple SELECT contains no aggregate
functions or a GROUP BY clause, it is a non-aggregate query.

<p><b>1. Calculation of input dataset.</b>

<p>^If a FROM clause is specified, the data on which a simple SELECT query
operates comes from the one or more tables or subqueries (SELECT statements
in parenthesis) specified following the FROM keyword. ^If there is only a
single table or subquery following the FROM clause, then the SELECT statement
operates on the contents of the table or the results of the subquery. If
there is more than one table or subquery following the FROM clause, then the
data from each table or subquery is combined into a single dataset for the
simple SELECT statement to operate on. Exactly how the data is combined
depends on the specific [join-op|join-ops] and 
[join-constraint|join-constraints] used to connect the tables or subqueries
together, as follows:

<ul>
  <li> <p>^If the join-op is a comma (","), then the composite dataset is
       the cartesian product of the sets of records from the left and right
       sides of the join-op.

  <li> <p>^If the join-op is a "CROSS JOIN" or "INNER JOIN", then the composite
       dataset is created in the same way as for the comma join-op and
       the result filtered based on the ON() and USING join-constraint clauses
       (if any).

       <p>^If there is an ON clause specified, then the ON expression is
       evaluated for each row of the cartesian product and the result cast to 
       an integer as if by a [CAST expression]. All rows for which the
       expression evaluates to NULL or zero (integer value 0) are excluded from
       the composite dataset.

       <p>^If there is a USING clause specified as part of the join-constraint,
       then each of the column names specified must exist in the datasets to 
       both the left and right of the join-op. ^(For each pair of namesake
       columns, the expression "lhs.X = rhs.X" is evaluated and the result
       cast to an integer. All rows for which one or more of the expressions
       evaluates to NULL or zero are excluded from the result set.)^ ^When
       comparing values as a result of a USING clause, the normal rules for
       handling affinities, collation sequences and NULL values in comparisons
       apply.

       <p>^For each pair of columns identified by a USING clause, only a
       single result column appears in the output dataset. ^This is the only
       difference between a USING clause and its equivalent ON constraint.

  <li> <p>^If the join-op is a "LEFT JOIN" or "LEFT OUTER JOIN", then the
       composite dataset is created as for a "CROSS JOIN". ^Except, after
       the ON and USING filtering clauses have been applied, an extra row is 
       added to the output for each row in the original left-hand input 
       dataset (if any) that corresponds to no rows at all in the composite
       dataset. ^The added rows contain NULL values in the columns that would
       normally contain values copied from the right-hand input dataset.

  <li> <p>^If the NATURAL keyword is added to any of the join-ops, then an
       implicit USING clause is added to the join-constraints. The implicit
       USING clause contains each of the column names that appear in both
       the left and right-hand input datasets. If the left and right-hand
       input datasets feature no common column names, then the NATURAL keyword
       has no effect on the results of the join.
</ul>

<p>^(When more than two tables are joined together as part of a FROM clause,
the join operations are processed in order from left to right. In other 
words, the FROM clause (A join-op-1 B join-op-2 C) is processed as 
((A join-op-1 B) join-op-2 C).)^
       
<p>^If the FROM clause is omitted from a SELECT statement, then the query
operates as if there were an implicit FROM clause that returns a dataset
consisting of a single row with zero columns.

<p><b>2. WHERE clause filtering.</b>

<p>^If a WHERE clause is specified, the WHERE expression is evaluated for 
each row in the dataset computed by the explicit or implicit FROM clause and
the result cast to an integer value. ^All rows for which the WHERE clause
expression evaluates to a NULL value or to zero (integer value 0) are excluded
from the dataset before continuing.

<p><b>3. Generation of the set of result rows.</b>

<p>Once the input dataset from the FROM clause has been filtered by the
WHERE clause expression (if any), the set of result rows for the simple 
SELECT are calculated. Exactly how this is done depends on whether the simple 
SELECT is an aggregate or non-aggregate query, and whether or not a GROUP
BY clause was specified.

<p> ^The number of columns in the result rows is determined by the expression
list in between the SELECT and FROM keywords. ^Any arbitrary expression can be
used as a result row column.  ^If a result expression is 
} 
hd_puts "[Operator *]" 
hd_puts { then all columns of the input dataset are substituted for that one
expression.  ^(If the expression is the alias of a table or subquery in the


FROM clause followed by } 
hd_puts "[Operator .*]" 
hd_puts { then all columns from the named table or subquery are substituted for
the single expression.)^





<ul>
  <li><p>^If the SELECT statement is <b>a non-aggregate query</b>, then 
    each expression in the result-set is evaluated for each row in the dataset
    filtered by the WHERE clause.










  <li><p>^If the SELECT statement is <b>an aggregate query without a GROUP
    BY</b> clause, then each aggregate expression in the result-set is 
    evaluated once across the entire dataset. ^Each non-aggregate expression
    in the result-set is evaluated once for an arbitrarily selected row of
    the dataset. ^The same arbitrarily selected row is used for each
    non-aggregate expression. Or, if the dataset is empty, then each
    non-aggregate expression is evaluated against a row consisting entirely
    of NULL values.



   <p>^The single row of result-set data created by evaluating the aggregate
    and non-aggregate expressions in the result-set forms the result of an
    aggregate query without a GROUP BY clause. ^An aggregate query without a
    GROUP BY clause always returns exactly one row of data.




  <li><p>^If the SELECT statement is <b>an aggregate query with a GROUP
    BY</b> clause, then each of the expressions specified as part of the
    GROUP BY clause is evaluated for each row of the filtered input dataset.
    ^Each row is grouped with all other rows for which all GROUP BY 
    expressions evaluate to the same values. For the purposes of grouping 
    rows, NULL values are considered equal. The usual rules for selecting
    a collation sequence with which to compare text values apply.
    ^The expressions in the GROUP BY clause do <em>not</em> have to be
    expressions that appear in the result.  


    <p>^If a HAVING clause is specified, it is evaluated once for each group 
    of rows and cast to an integer value. ^If the HAVING clause is an aggregate
    expression, it is evaluated across all rows in the group. ^If a HAVING 
    clause is a non-aggregate expression, it is evaluated with respect to an
    arbitrarily selected row from the group. ^If the result of evaluating the
    HAVING clause is NULL or zero (integer value 0), the group is discarded.
    ^The HAVING expression may refer to values, even aggregate functions, that
    are not in the result.</p>

    <p>^Each expression in the result-set is then evaluated once for each
    group of rows. ^If the expression is an aggregate expression, it is 
    evaluated across all rows in the group. ^Otherwise, it is evaluated against
    a single arbitrarily chosen row from within the group. ^If there is more
    than one non-aggregate expression in the result-set, then all such
    expressions are evaluated for the same row.

    <p>^Each group of input dataset rows contributes a single row to the 
    set of result rows. ^Subject to filtering associated with the DISTINCT
    keyword, the number of rows returned by an aggregate query with a GROUP
    BY clause is the same as the number of groups of rows produced by applying
    the GROUP BY and HAVING clauses to the filtered input dataset.
</ul>

<p><b>4. Removal of duplicate rows.</b>

<p>^(One of the ALL or DISTINCT keywords may follow the SELECT keyword in a 
simple SELECT statement. If neither of these are present, then the behaviour
is as if ALL were specified.)^ ^If the simple SELECT is a SELECT ALL, then the
entire set of result rows are returned by the SELECT. ^If the simple SELECT
is a SELECT DISTINCT, then duplicate rows are removed from the set of result
rows before it is returned. ^For the purposes of detecting duplicate rows,
two NULL values are considered to be equal. ^The normal rules for selecting
a collation sequence to compare text values with apply.

<h3>Compound Select Statements</h3>

<p>^A compound SELECT is formed from two or more simple SELECTs connected
by one of the operators UNION, UNION ALL, INTERSECT, or EXCEPT.  ^In
a compound SELECT, all the constituent SELECTs must specify the
same number of result columns.  ^There may be only a single ORDER BY
clause at the end of the compound SELECT.  ^The UNION and UNION ALL
operators combine the results of the SELECTs to the right and left into
a single big table.  ^The difference is that in UNION all result rows
are distinct where in UNION ALL there may be duplicates.
^The INTERSECT operator takes the intersection of the results of the
left and right SELECTs.  ^EXCEPT takes the result of left SELECT after
removing the results of the right SELECT.  ^When three or more SELECTs
are connected into a compound, they group from left to right.</p>

<h3>ORDER BY and LIMIT/OFFSET Clauses</h3>

<p>^The ORDER BY clause causes the output rows to be sorted.  
^The argument to ORDER BY is a list of expressions that are used as the
key for the sort.  ^The expressions do not have to be part of the
result for a simple SELECT, but in a compound SELECT each sort
expression must exactly match one of the result columns.  ^Each
sort expression may be optionally followed by a COLLATE keyword and
................................................................................
limit is the first number and the offset is the second number.  ^If a
comma is used instead of the OFFSET keyword, then the offset is the
first number and the limit is the second number.  This seeming
contradiction is intentional - it maximizes compatibility with legacy
SQL database systems.
</p>















<tcl>
##############################################################################
Section UPDATE update {UPDATE *UPDATEs}

BubbleDiagram update-stmt 1
BubbleDiagram qualified-table-name