Documentation Source Text

Check-in [17093bc7d6]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Enhancements to the description of how the COLLATE operator works.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 17093bc7d60d46189912cd8f6d9afa4619aa3afd
User & Date: drh 2010-08-13 17:44:04.000
Context
2010-08-14
12:25
Break a big testable statement in lang_expr.html into several smaller blocks. (check-in: 61d45eaf94 user: dan tags: trunk)
2010-08-13
17:44
Enhancements to the description of how the COLLATE operator works. (check-in: 17093bc7d6 user: drh tags: trunk)
2010-08-12
17:55
Initial identification of requirements in the fileformat2.html document. (check-in: 8925c8c2e1 user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/datatype3.in.
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524

<h3>6.1 Assigning Collating Sequences from SQL</h3>

<p>
^Every column of every
table has an associated collating function.  ^If no collating function
is explicitly defined, then the collating function defaults to BINARY.
^The COLLATE clause of the [column-def | column definition] is used
to define alternative collating functions for a column.
</p>  

<p>
^(The rules for determining which collating function to use for a
binary comparison operator (=, &lt;, &gt;, &lt;=, &gt;=, !=, IS, and
IS NOT) are as follows and in the order shown:)^







|







510
511
512
513
514
515
516
517
518
519
520
521
522
523
524

<h3>6.1 Assigning Collating Sequences from SQL</h3>

<p>
^Every column of every
table has an associated collating function.  ^If no collating function
is explicitly defined, then the collating function defaults to BINARY.
^The COLLATE clause of the [column definition] is used
to define alternative collating functions for a column.
</p>  

<p>
^(The rules for determining which collating function to use for a
binary comparison operator (=, &lt;, &gt;, &lt;=, &gt;=, !=, IS, and
IS NOT) are as follows and in the order shown:)^
535
536
537
538
539
540
541














542
543
544
545
546
547
548
preceded by one or more unary "+" operators is still considered a column name.
</p></li>

<li><p>^Otherwise, the BINARY collating function is used for comparison.
</p></li>
</ol>















<p>
^The expression "x BETWEEN y and z" is logically
equivalent to two comparisons "x &gt;= y AND x &lt;= z" and works with
respect to collating functions as if it were two separate comparisons.
^(The expression "x IN (SELECT y ...)" is handled in the same way as the
expression "x = y" for the purposes of determining the collating sequence.)^
^(The collating sequence used for expressions of the form 







>
>
>
>
>
>
>
>
>
>
>
>
>
>







535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
preceded by one or more unary "+" operators is still considered a column name.
</p></li>

<li><p>^Otherwise, the BINARY collating function is used for comparison.
</p></li>
</ol>

<p>
^An operand of a comparison is considered to have an explicit
collating function assignment (rule 1 above) 
if any subexpression of the operand uses
the postfix [COLLATE operator].  ^Thus, if a [COLLATE operator] is used
anywhere in a comparision expression, the collating function defined
by that operator is used for string comparison regardless of what 
table columns might be a part of that expression.  ^If two or more
[COLLATE operator] subexpressions appear anywhere in a comparison, the 
left most explicit collating function is used regardless of how deeply the
COLLATE operators are nested in the expression and regardless of
how the expression is parenthesized.
</p>

<p>
^The expression "x BETWEEN y and z" is logically
equivalent to two comparisons "x &gt;= y AND x &lt;= z" and works with
respect to collating functions as if it were two separate comparisons.
^(The expression "x IN (SELECT y ...)" is handled in the same way as the
expression "x = y" for the purposes of determining the collating sequence.)^
^(The collating sequence used for expressions of the form 
Changes to pages/fileformat2.in.
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
of the two records.  The sort order of individual columns is as
follows:</p>

<ol>
<li>^NULL values (serial type 0) sort first
<li>^Numeric values (serial types 1 through 9) sort next and in numeric order
<li>^Text values (even serial types 12 and larger) sort next in the order
    determined by the columns collating function
<li>^BLOB values (odd serial types 13 and larger) sort last in order 
    determined by memcmp().
</ol>

<p>A collating function for each column is necessary in order to compute
the order of text fields.  SQLite defines three built-in collating functions:
</p>

<blockquote><table border=0 cellspacing=10>
<tr>^<td valign=top>BINARY
    <td>Strings are compared byte by byte using the memcmp() function
        from the standard C library.
<tr>^<td valign=top>NOCASE







|




|
|







902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
of the two records.  The sort order of individual columns is as
follows:</p>

<ol>
<li>^NULL values (serial type 0) sort first
<li>^Numeric values (serial types 1 through 9) sort next and in numeric order
<li>^Text values (even serial types 12 and larger) sort next in the order
    determined by the columns [collating function]
<li>^BLOB values (odd serial types 13 and larger) sort last in order 
    determined by memcmp().
</ol>

<p>A [collating function] for each column is necessary in order to compute
the order of text fields.  ^SQLite defines three built-in collating functions:
</p>

<blockquote><table border=0 cellspacing=10>
<tr>^<td valign=top>BINARY
    <td>Strings are compared byte by byte using the memcmp() function
        from the standard C library.
<tr>^<td valign=top>NOCASE
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
</table></blockquote>

<p>^Additional application-specific collating functions can be added to
SQLite using the [sqlite3_create_collation()] interface.</p>

<p>^The default collating function for all strings is BINARY.
^Alternative collating functions for table columns can be specified in the
[CREATE TABLE] statement using the COLLATE clause on the column definition.
^When a column is indexed, the same collating function specified in the
[CREATE TABLE] statement is used for the column in the index, by default,
though this can be overridden using a COLLATE clause in the 
[CREATE INDEX] statement.

<h3>2.3 Representation Of SQL Tables</h3>








|







930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
</table></blockquote>

<p>^Additional application-specific collating functions can be added to
SQLite using the [sqlite3_create_collation()] interface.</p>

<p>^The default collating function for all strings is BINARY.
^Alternative collating functions for table columns can be specified in the
[CREATE TABLE] statement using the COLLATE clause on the [column definition].
^When a column is indexed, the same collating function specified in the
[CREATE TABLE] statement is used for the column in the index, by default,
though this can be overridden using a COLLATE clause in the 
[CREATE INDEX] statement.

<h3>2.3 Representation Of SQL Tables</h3>

Changes to pages/lang.in.
615
616
617
618
619
620
621

622
623
624
625
626
627
628
<p>^A CREATE TABLE statement is basically the keywords "CREATE TABLE"
followed by the name of a new table and a parenthesized list of column
definitions and constraints.  
^Tables names that begin with "<b>sqlite_</b>" are reserved
for use by SQLite itself and cannot normally
appear in a CREATE TABLE statement.</p>


<p>^Each column definition is the name of the column optionally followed by the
[datatype] for that column, then one or more optional column constraints.
^SQLite uses [dynamic typing]; 
the datatype for the column does not restrict what data may be put
in that column.
^The UNIQUE constraint causes an unique index to be created on the specified
columns.  ^All NULL values are considered different from each other and from







>







615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
<p>^A CREATE TABLE statement is basically the keywords "CREATE TABLE"
followed by the name of a new table and a parenthesized list of column
definitions and constraints.  
^Tables names that begin with "<b>sqlite_</b>" are reserved
for use by SQLite itself and cannot normally
appear in a CREATE TABLE statement.</p>

<tcl>hd_fragment {tablecoldef} {column definition} {column definitions}</tcl>
<p>^Each column definition is the name of the column optionally followed by the
[datatype] for that column, then one or more optional column constraints.
^SQLite uses [dynamic typing]; 
the datatype for the column does not restrict what data may be put
in that column.
^The UNIQUE constraint causes an unique index to be created on the specified
columns.  ^All NULL values are considered different from each other and from
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
the virtual table.  ^The &lt;module-name&gt; must be registered with
the SQLite database connection using
[sqlite3_create_module()] or [sqlite3_create_module_v2()]
prior to issuing the CREATE VIRTUAL TABLE statement.
^The module takes zero or more comma-separated arguments.
^The arguments can be just about any text as long as it has balanced
parentheses.  The argument syntax is sufficiently general that the
arguments can be made to appear as column definitions in a traditional
[CREATE TABLE] statement.  
^SQLite passes the module arguments directly
to the [xCreate] and [xConnect] methods of the module implementation
without any interpretation.  It is the responsibility
of the module implementation to parse and interpret its own arguments.</p>

<p>^A virtual table is destroyed using the ordinary







|







1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
the virtual table.  ^The &lt;module-name&gt; must be registered with
the SQLite database connection using
[sqlite3_create_module()] or [sqlite3_create_module_v2()]
prior to issuing the CREATE VIRTUAL TABLE statement.
^The module takes zero or more comma-separated arguments.
^The arguments can be just about any text as long as it has balanced
parentheses.  The argument syntax is sufficiently general that the
arguments can be made to appear as [column definitions] in a traditional
[CREATE TABLE] statement.  
^SQLite passes the module arguments directly
to the [xCreate] and [xConnect] methods of the module implementation
without any interpretation.  It is the responsibility
of the module implementation to parse and interpret its own arguments.</p>

<p>^A virtual table is destroyed using the ordinary
1254
1255
1256
1257
1258
1259
1260
1261

1262
1263





1264
1265
1266
1267
1268
1269
1270
1271
<p>^(Supported unary prefix operators are these:</p>

<blockquote><pre>
<font color="#2c2cf0"><big>-    +    ~    NOT</big></font>
</pre></blockquote>)^

<tcl>hd_fragment collateop {COLLATE operator}</tcl>
<p>^The COLLATE operator can be thought of as a unary postfix

operator.  ^The COLLATE operator has the highest precedence.
^It always binds more tightly than any prefix unary operator or





any binary operator.</p>

<tcl>hd_puts "
<p>^The unary operator [Operator +] is a no-op.  ^It can be applied
to strings, numbers, blobs or NULL and it always returns a result
with the same value as the operand.</p>"</tcl>

<p>Note that there are two variations of the equals and not equals







|
>
|
|
>
>
>
>
>
|







1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
<p>^(Supported unary prefix operators are these:</p>

<blockquote><pre>
<font color="#2c2cf0"><big>-    +    ~    NOT</big></font>
</pre></blockquote>)^

<tcl>hd_fragment collateop {COLLATE operator}</tcl>
<p>^The COLLATE operator is a unary postfix
operator that assigns a [collating sequence] to an expression.
^The COLLATE operator has a higher precedence (binds more tightly) than any
prefix unary operator or any binary operator.
^The collating sequence set by the COLLATE operator overrides the
collating sequence determined by the COLLATE clause in a table
[column definition].
See the [collating sequence| detailed discussion on collating sequences]
in the [datatype | Datatype In SQLite3] document for additional information.
</p>

<tcl>hd_puts "
<p>^The unary operator [Operator +] is a no-op.  ^It can be applied
to strings, numbers, blobs or NULL and it always returns a result
with the same value as the operand.</p>"</tcl>

<p>Note that there are two variations of the equals and not equals
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598

<tcl>hd_fragment castexpr {CAST expression}</tcl>
<h3>CAST expressions</h3>

<p>^A CAST expression changes the [datatype] of the <expr> into the
type specified by &lt;type&gt;. 
^(&lt;type&gt; can be any non-empty type name that is valid
for the type in a column definition of a [CREATE TABLE] statement.)^
^The &lt;type&gt; name is interpreted according to the 
[rules for determining column affinity].

<p>^An explicit cast is stronger than affinity; with the CAST expression
the datatype conversion is forced even if it is lossy and irrreversible.
<ul>
<li><p>







|







1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605

<tcl>hd_fragment castexpr {CAST expression}</tcl>
<h3>CAST expressions</h3>

<p>^A CAST expression changes the [datatype] of the <expr> into the
type specified by &lt;type&gt;. 
^(&lt;type&gt; can be any non-empty type name that is valid
for the type in a [column definition] of a [CREATE TABLE] statement.)^
^The &lt;type&gt; name is interpreted according to the 
[rules for determining column affinity].

<p>^An explicit cast is stronger than affinity; with the CAST expression
the datatype conversion is forced even if it is lossy and irrreversible.
<ul>
<li><p>
Changes to pages/vtab.in.
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
can consist of keywords, string literals, identifiers, numbers, and 
punctuation. The arguments are passed as written (as text) into the
[xCreate | constructor method] of the virtual table implementation 
when the virtual 
table is created and that constructor is responsible for parsing and 
interpreting the arguments. The argument syntax is sufficiently general 
that a virtual table implementation can, if it wants to, interpret its
arguments as column definitions in an ordinary [CREATE TABLE] statement. 
The implementation could also impose some other interpretation on the 
arguments.

<p>Once a virtual table has been created, it can be used like any other 
table with the exceptions noted above and imposed by specific virtual
table implementations. A virtual table is destroyed using the ordinary
[DROP TABLE] syntax.







|







72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
can consist of keywords, string literals, identifiers, numbers, and 
punctuation. The arguments are passed as written (as text) into the
[xCreate | constructor method] of the virtual table implementation 
when the virtual 
table is created and that constructor is responsible for parsing and 
interpreting the arguments. The argument syntax is sufficiently general 
that a virtual table implementation can, if it wants to, interpret its
arguments as [column definitions] in an ordinary [CREATE TABLE] statement. 
The implementation could also impose some other interpretation on the 
arguments.

<p>Once a virtual table has been created, it can be used like any other 
table with the exceptions noted above and imposed by specific virtual
table implementations. A virtual table is destroyed using the ordinary
[DROP TABLE] syntax.