# Documentation Source Text

Check-in [17093bc7d6]

Overview
Comment: Enhancements to the description of how the COLLATE operator works. Tarball | ZIP archive | SQL archive family | ancestors | descendants | both | trunk files | file ages | folders 17093bc7d60d46189912cd8f6d9afa4619aa3afd drh 2010-08-13 17:44:04
Context
 2010-08-14 12:25 Break a big testable statement in lang_expr.html into several smaller blocks. (check-in: 61d45eaf94 user: dan tags: trunk) 2010-08-13 17:44 Enhancements to the description of how the COLLATE operator works. (check-in: 17093bc7d6 user: drh tags: trunk) 2010-08-12 17:55 Initial identification of requirements in the fileformat2.html document. (check-in: 8925c8c2e1 user: drh tags: trunk)
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/datatype3.in.

 ```510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 ... 535 536 537 538 539 540 541 542 543 544 545 546 547 548 ``` ```

6.1 Assigning Collating Sequences from SQL

^Every column of every table has an associated collating function. ^If no collating function is explicitly defined, then the collating function defaults to BINARY. ^The COLLATE clause of the [column-def | column definition] is used to define alternative collating functions for a column.

^(The rules for determining which collating function to use for a binary comparison operator (=, <, >, <=, >=, !=, IS, and IS NOT) are as follows and in the order shown:)^ ................................................................................ preceded by one or more unary "+" operators is still considered a column name.

• ^Otherwise, the BINARY collating function is used for comparison.

• ^The expression "x BETWEEN y and z" is logically equivalent to two comparisons "x >= y AND x <= z" and works with respect to collating functions as if it were two separate comparisons. ^(The expression "x IN (SELECT y ...)" is handled in the same way as the expression "x = y" for the purposes of determining the collating sequence.)^ ^(The collating sequence used for expressions of the form ``` ``` | > > > > > > > > > > > > > > ``` ```510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 ... 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 ``` ```

6.1 Assigning Collating Sequences from SQL

^Every column of every table has an associated collating function. ^If no collating function is explicitly defined, then the collating function defaults to BINARY. ^The COLLATE clause of the [column definition] is used to define alternative collating functions for a column.

^(The rules for determining which collating function to use for a binary comparison operator (=, <, >, <=, >=, !=, IS, and IS NOT) are as follows and in the order shown:)^ ................................................................................ preceded by one or more unary "+" operators is still considered a column name.

• ^Otherwise, the BINARY collating function is used for comparison.

• ^An operand of a comparison is considered to have an explicit collating function assignment (rule 1 above) if any subexpression of the operand uses the postfix [COLLATE operator]. ^Thus, if a [COLLATE operator] is used anywhere in a comparision expression, the collating function defined by that operator is used for string comparison regardless of what table columns might be a part of that expression. ^If two or more [COLLATE operator] subexpressions appear anywhere in a comparison, the left most explicit collating function is used regardless of how deeply the COLLATE operators are nested in the expression and regardless of how the expression is parenthesized.

^The expression "x BETWEEN y and z" is logically equivalent to two comparisons "x >= y AND x <= z" and works with respect to collating functions as if it were two separate comparisons. ^(The expression "x IN (SELECT y ...)" is handled in the same way as the expression "x = y" for the purposes of determining the collating sequence.)^ ^(The collating sequence used for expressions of the form ```

Changes to pages/fileformat2.in.

 ```902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 ... 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 ``` ```of the two records. The sort order of individual columns is as follows:

1. ^NULL values (serial type 0) sort first
2. ^Numeric values (serial types 1 through 9) sort next and in numeric order
3. ^Text values (even serial types 12 and larger) sort next in the order determined by the columns collating function
4. ^BLOB values (odd serial types 13 and larger) sort last in order determined by memcmp().

A collating function for each column is necessary in order to compute the order of text fields. SQLite defines three built-in collating functions:

BINARY Strings are compared byte by byte using the memcmp() function from the standard C library.
NOCASE ................................................................................

^Additional application-specific collating functions can be added to SQLite using the [sqlite3_create_collation()] interface.

^The default collating function for all strings is BINARY. ^Alternative collating functions for table columns can be specified in the [CREATE TABLE] statement using the COLLATE clause on the column definition. ^When a column is indexed, the same collating function specified in the [CREATE TABLE] statement is used for the column in the index, by default, though this can be overridden using a COLLATE clause in the [CREATE INDEX] statement.

2.3 Representation Of SQL Tables

``` ``` | | | | ``` ```902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 ... 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 ``` ```of the two records. The sort order of individual columns is as follows:

1. ^NULL values (serial type 0) sort first
2. ^Numeric values (serial types 1 through 9) sort next and in numeric order
3. ^Text values (even serial types 12 and larger) sort next in the order determined by the columns [collating function]
4. ^BLOB values (odd serial types 13 and larger) sort last in order determined by memcmp().

A [collating function] for each column is necessary in order to compute the order of text fields. ^SQLite defines three built-in collating functions:

^^
BINARY Strings are compared byte by byte using the memcmp() function from the standard C library.
NOCASE ................................................................................

^Additional application-specific collating functions can be added to SQLite using the [sqlite3_create_collation()] interface.

^The default collating function for all strings is BINARY. ^Alternative collating functions for table columns can be specified in the [CREATE TABLE] statement using the COLLATE clause on the [column definition]. ^When a column is indexed, the same collating function specified in the [CREATE TABLE] statement is used for the column in the index, by default, though this can be overridden using a COLLATE clause in the [CREATE INDEX] statement.

2.3 Representation Of SQL Tables

```
^^

Changes to pages/lang.in.

 ```615 616 617 618 619 620 621 622 623 624 625 626 627 628 .... 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 .... 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 .... 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 ``` ```

^A CREATE TABLE statement is basically the keywords "CREATE TABLE" followed by the name of a new table and a parenthesized list of column definitions and constraints. ^Tables names that begin with "sqlite_" are reserved for use by SQLite itself and cannot normally appear in a CREATE TABLE statement.

^Each column definition is the name of the column optionally followed by the [datatype] for that column, then one or more optional column constraints. ^SQLite uses [dynamic typing]; the datatype for the column does not restrict what data may be put in that column. ^The UNIQUE constraint causes an unique index to be created on the specified columns. ^All NULL values are considered different from each other and from ................................................................................ the virtual table. ^The <module-name> must be registered with the SQLite database connection using [sqlite3_create_module()] or [sqlite3_create_module_v2()] prior to issuing the CREATE VIRTUAL TABLE statement. ^The module takes zero or more comma-separated arguments. ^The arguments can be just about any text as long as it has balanced parentheses. The argument syntax is sufficiently general that the arguments can be made to appear as column definitions in a traditional [CREATE TABLE] statement. ^SQLite passes the module arguments directly to the [xCreate] and [xConnect] methods of the module implementation without any interpretation. It is the responsibility of the module implementation to parse and interpret its own arguments.

^A virtual table is destroyed using the ordinary ................................................................................

^(Supported unary prefix operators are these:

-    +    ~    NOT
)^ hd_fragment collateop {COLLATE operator}

^The COLLATE operator can be thought of as a unary postfix operator. ^The COLLATE operator has the highest precedence. ^It always binds more tightly than any prefix unary operator or any binary operator.

hd_puts "

^The unary operator [Operator +] is a no-op. ^It can be applied to strings, numbers, blobs or NULL and it always returns a result with the same value as the operand.

"

Note that there are two variations of the equals and not equals ................................................................................ hd_fragment castexpr {CAST expression}

CAST expressions

^A CAST expression changes the [datatype] of the into the type specified by <type>. ^(<type> can be any non-empty type name that is valid for the type in a column definition of a [CREATE TABLE] statement.)^ ^The <type> name is interpreted according to the [rules for determining column affinity].

^An explicit cast is stronger than affinity; with the CAST expression the datatype conversion is forced even if it is lossy and irrreversible.

• ``` ``` > | | | | | > > > > > > | ``` ```615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 .... 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 .... 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 .... 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 ``` ```

^A CREATE TABLE statement is basically the keywords "CREATE TABLE" followed by the name of a new table and a parenthesized list of column definitions and constraints. ^Tables names that begin with "sqlite_" are reserved for use by SQLite itself and cannot normally appear in a CREATE TABLE statement.

hd_fragment {tablecoldef} {column definition} {column definitions}

^Each column definition is the name of the column optionally followed by the [datatype] for that column, then one or more optional column constraints. ^SQLite uses [dynamic typing]; the datatype for the column does not restrict what data may be put in that column. ^The UNIQUE constraint causes an unique index to be created on the specified columns. ^All NULL values are considered different from each other and from ................................................................................ the virtual table. ^The <module-name> must be registered with the SQLite database connection using [sqlite3_create_module()] or [sqlite3_create_module_v2()] prior to issuing the CREATE VIRTUAL TABLE statement. ^The module takes zero or more comma-separated arguments. ^The arguments can be just about any text as long as it has balanced parentheses. The argument syntax is sufficiently general that the arguments can be made to appear as [column definitions] in a traditional [CREATE TABLE] statement. ^SQLite passes the module arguments directly to the [xCreate] and [xConnect] methods of the module implementation without any interpretation. It is the responsibility of the module implementation to parse and interpret its own arguments.

^A virtual table is destroyed using the ordinary ................................................................................

^(Supported unary prefix operators are these:

-    +    ~    NOT
)^ hd_fragment collateop {COLLATE operator}

^The COLLATE operator is a unary postfix operator that assigns a [collating sequence] to an expression. ^The COLLATE operator has a higher precedence (binds more tightly) than any prefix unary operator or any binary operator. ^The collating sequence set by the COLLATE operator overrides the collating sequence determined by the COLLATE clause in a table [column definition]. See the [collating sequence| detailed discussion on collating sequences] in the [datatype | Datatype In SQLite3] document for additional information.

hd_puts "

^The unary operator [Operator +] is a no-op. ^It can be applied to strings, numbers, blobs or NULL and it always returns a result with the same value as the operand.

"

Note that there are two variations of the equals and not equals ................................................................................ hd_fragment castexpr {CAST expression}

CAST expressions

^A CAST expression changes the [datatype] of the into the type specified by <type>. ^(<type> can be any non-empty type name that is valid for the type in a [column definition] of a [CREATE TABLE] statement.)^ ^The <type> name is interpreted according to the [rules for determining column affinity].

^An explicit cast is stronger than affinity; with the CAST expression the datatype conversion is forced even if it is lossy and irrreversible.

• ```

Changes to pages/vtab.in.

 ```72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 ``` ```can consist of keywords, string literals, identifiers, numbers, and punctuation. The arguments are passed as written (as text) into the [xCreate | constructor method] of the virtual table implementation when the virtual table is created and that constructor is responsible for parsing and interpreting the arguments. The argument syntax is sufficiently general that a virtual table implementation can, if it wants to, interpret its arguments as column definitions in an ordinary [CREATE TABLE] statement. The implementation could also impose some other interpretation on the arguments.

Once a virtual table has been created, it can be used like any other table with the exceptions noted above and imposed by specific virtual table implementations. A virtual table is destroyed using the ordinary [DROP TABLE] syntax. ``` ``` | ``` ```72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 ``` ```can consist of keywords, string literals, identifiers, numbers, and punctuation. The arguments are passed as written (as text) into the [xCreate | constructor method] of the virtual table implementation when the virtual table is created and that constructor is responsible for parsing and interpreting the arguments. The argument syntax is sufficiently general that a virtual table implementation can, if it wants to, interpret its arguments as [column definitions] in an ordinary [CREATE TABLE] statement. The implementation could also impose some other interpretation on the arguments.

Once a virtual table has been created, it can be used like any other table with the exceptions noted above and imposed by specific virtual table implementations. A virtual table is destroyed using the ordinary [DROP TABLE] syntax. ```