*** DRAFT ***

SQLite C Interface

Determine if a virtual table query is DISTINCT

int sqlite3_vtab_distinct(sqlite3_index_info*);

This API may only be used from within an xBestIndex method of a virtual table implementation. The result of calling this interface from outside of xBestIndex() is undefined and probably harmful.

The sqlite3_vtab_distinct() interface returns an integer between 0 and 3. The integer returned by sqlite3_vtab_distinct() gives the virtual table additional information about how the query planner wants the output to be ordered. As long as the virtual table can meet the ordering requirements of the query planner, it may set the "orderByConsumed" flag.

  1. If the sqlite3_vtab_distinct() interface returns 0, that means that the query planner needs the virtual table to return all rows in the sort order defined by the "nOrderBy" and "aOrderBy" fields of the sqlite3_index_info object. This is the default expectation. If the virtual table outputs all rows in sorted order, then it is always safe for the xBestIndex method to set the "orderByConsumed" flag, regardless of the return value from sqlite3_vtab_distinct().

  2. If the sqlite3_vtab_distinct() interface returns 1, that means that the query planner does not need the rows to be returned in sorted order as long as all rows with the same values in all columns identified by the "aOrderBy" field are adjacent. This mode is used when the query planner is doing a GROUP BY.

  3. If the sqlite3_vtab_distinct() interface returns 2, that means that the query planner does not need the rows returned in any particular order, as long as rows with the same values in all columns identified by "aOrderBy" are adjacent. Furthermore, when two or more rows contain the same values for all columns identified by "colUsed", all but one such row may optionally be omitted from the result. The virtual table is not required to omit rows that are duplicates over the "colUsed" columns, but if the virtual table can do that without too much extra effort, it could potentially help the query to run faster. This mode is used for a DISTINCT query.

  4. If the sqlite3_vtab_distinct() interface returns 3, that means the virtual table must return rows in the order defined by "aOrderBy" as if the sqlite3_vtab_distinct() interface had returned 0. However, if two or more rows in the result have the same values for all columns identified by "colUsed", then all but one such row may optionally be omitted. As when the return value is 2, the virtual table is not required to omit rows that are duplicates over the "colUsed" columns, but if the virtual table can do that without too much extra effort, it could potentially help the query to run faster. This mode is used for queries that have both DISTINCT and ORDER BY clauses.
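For return values 2 and 3, a virtual table that chooses to drop duplicates needs some way to tell whether two rows agree on every "colUsed" column. The following is a minimal sketch of that check, not part of the SQLite API. It assumes a hypothetical row layout of one int per column (so it does not model NULL), and the helper names column_is_used() and same_over_col_used() are invented for illustration. It relies on the documented layout of the "colUsed" bitmask in sqlite3_index_info, where bit i covers column i for the first 63 columns and the top bit covers all remaining columns.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if column iCol is marked as used in a colUsed-style bitmask.
** Bit i covers column i for i < 63; bit 63 covers every column at
** index 63 or higher. (Helper name is hypothetical.) */
static int column_is_used(uint64_t colUsed, int iCol){
  assert( iCol>=0 );
  if( iCol>=63 ) iCol = 63;
  return (int)((colUsed >> iCol) & 1);
}

/* Return 1 if rows a and b hold equal values in every used column.
** Rows are plain int arrays here, an assumption made for illustration;
** a real virtual table would compare its own value representation. */
static int same_over_col_used(const int *a, const int *b,
                              int nCol, uint64_t colUsed){
  int i;
  for(i=0; i<nCol; i++){
    if( column_is_used(colUsed, i) && a[i]!=b[i] ) return 0;
  }
  return 1;
}
```

A row for which same_over_col_used() reports a match against a row already returned is a candidate for omission. Skipping it is optional, so an implementation that finds this check expensive can return every row.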

The following table summarizes the conditions under which the virtual table is allowed to set the "orderByConsumed" flag based on the value returned by sqlite3_vtab_distinct(). This table is a restatement of the previous four paragraphs:

sqlite3_vtab_distinct()   Rows are returned    Rows with the same value in all   Duplicates over all colUsed
return value              in aOrderBy order    aOrderBy columns are adjacent     columns may be omitted
0                         yes                  yes                               no
1                         no                   yes                               no
2                         no                   yes                               yes
3                         yes                  yes                               yes
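The ordering columns of the table can be expressed as a small decision function. The sketch below is illustrative only: the OutputShape struct and may_consume_order_by() are hypothetical names, not part of SQLite, and the function assumes the caller already knows what ordering guarantees its own output provides.

```c
#include <assert.h>

/* Hypothetical description of what a virtual table's output guarantees. */
typedef struct OutputShape {
  int sorted;   /* rows come out in aOrderBy order */
  int grouped;  /* rows with equal aOrderBy values are adjacent */
} OutputShape;

/* Return 1 if "orderByConsumed" may be set for the given
** sqlite3_vtab_distinct() return value, else 0. */
static int may_consume_order_by(int eDistinct, OutputShape out){
  assert( eDistinct>=0 && eDistinct<=3 );
  if( out.sorted ) return 1;       /* sorted output satisfies every mode */
  switch( eDistinct ){
    case 1:
    case 2:
      return out.grouped ? 1 : 0;  /* adjacency of equal keys is enough */
    default:
      return 0;                    /* modes 0 and 3 need full sort order */
  }
}
```

Inside an xBestIndex method, eDistinct would be the value returned by sqlite3_vtab_distinct(), and a result of 1 would permit setting "orderByConsumed" in the sqlite3_index_info object. Omitting duplicates in modes 2 and 3 is optional, so it does not enter into this decision.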

When comparing virtual table output values to decide whether they are the same for sorting purposes, two NULL values are considered to be the same. In other words, the comparison operator is "IS" (or "IS NOT DISTINCT FROM") and not "==".
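The effect of "IS" semantics on the adjacency requirement can be sketched as follows. This is an illustration under stated assumptions rather than SQLite code: a key is modeled as a pointer to int with a NULL pointer standing in for SQL NULL, and is_same_value() and equal_keys_adjacent() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Compare two nullable keys with "IS" semantics: two NULLs match,
** and NULL never matches a non-NULL value. */
static int is_same_value(const int *a, const int *b){
  if( a==NULL || b==NULL ) return a==b;
  return *a==*b;
}

/* Return 1 if every run of equal keys is contiguous, that is, no key
** reappears after a different key has been seen. Quadratic time,
** intended only as a check on small inputs. */
static int equal_keys_adjacent(const int *const *aKey, int n){
  int i, j;
  assert( n>=0 );
  for(i=1; i<n; i++){
    if( is_same_value(aKey[i], aKey[i-1]) ) continue;  /* same run */
    for(j=0; j<i-1; j++){
      if( is_same_value(aKey[i], aKey[j]) ) return 0;  /* seen in an earlier run */
    }
  }
  return 1;
}
```

Under this comparison the key sequence NULL, 1, NULL is not grouped, because the two NULLs count as the same value and are separated by a different key. Under "==" semantics the NULLs would never compare equal, and the violation would go unnoticed.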

If a virtual table implementation is unable to meet the requirements specified above, then it must not set the "orderByConsumed" flag in the sqlite3_index_info object or an incorrect answer may result.

A virtual table implementation is always free to return rows in any order it wants, as long as the "orderByConsumed" flag is not set. When the "orderByConsumed" flag is unset, the query planner will add extra bytecode to ensure that the final results returned by the SQL query are ordered correctly. The use of the "orderByConsumed" flag and the sqlite3_vtab_distinct() interface is merely an optimization. Careful use of the sqlite3_vtab_distinct() interface and the "orderByConsumed" flag might help queries against a virtual table to run faster. Being overly aggressive and setting the "orderByConsumed" flag when it is not valid to do so, on the other hand, might cause SQLite to return incorrect results.
