Documentation Source Text

Check-in [d6de4c860f]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update fts5.in to note that FTS5 is now included in the amalgamation. And that it supports the table-valued function syntax.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: d6de4c860f2eebe2b9fcab98ad9d2ca569d536de
User & Date: dan 2015-10-13 16:08:00
Context
2015-10-14
12:27
Change the date of release for 3.9.0. check-in: afadc19de4 user: drh tags: trunk
2015-10-13
16:08
Update fts5.in to note that FTS5 is now included in the amalgamation. And that it supports the table-valued function syntax. check-in: d6de4c860f user: dan tags: trunk
2015-10-12
18:25
Update test statistics. check-in: be7c2233a2 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fts5.in.

35
36
37
38
39
40
41
42


43




44
45
46


47
48
49
50
51


52
53
54
55
56
57
58
..
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
..
92
93
94
95
96
97
98
99

100
101
102
103














104
105
106
107
108
109
110
111
112
113
114
...
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
...
706
707
708
709
710
711
712
713
714
715

716
717
718
719
720
721
722
...
931
932
933
934
935
936
937
938

939
940
941

942
943
944
945
946
947
948
949
950
951
952
953
954











955
956
957
958
959
960
961
[FTS5 CREATE TABLE Options | various options] that may be provided to FTS5 as
part of the CREATE VIRTUAL TABLE statement to configure various aspects of the
new table. These may be used to modify the way in which the FTS5 table extracts
terms from documents and queries, to create extra indexes on disk to speed up
prefix queries, or to create an FTS5 table that acts as an index on content
stored elsewhere.

<p>Once populated, a full-text query may be executed on the contents of an


FTS5 table by adding a MATCH constraint to the WHERE clause of a SELECT 




query. The expression to the left of the MATCH operator must be the name
of the FTS5 table. The expression on the right must be a text value specifying
the term to search for. For example: 



<codeblock>
  <i>-- Query for all rows that contain at least once instance of the term</i>
  <i>-- "fts5" (in any column).</i>
  SELECT * FROM email WHERE email MATCH 'fts5';


</codeblock>

<p> By default, FTS5 full-text searches are case-independent. Like any other
SQL query that does not contain an ORDER BY clause, the example above returns
results in an arbitrary order. To sort results by relevance (most to least
relevant), an ORDER BY may be added to a full-text query as follows:

................................................................................
term surrounded by html &lt;b&gt;&lt;/b&gt; tags. Auxiliary functions are
invoked in the same way as SQLite [corefunc | scalar functions], except that the name
of the FTS5 table is specified as the first argument. For example:

<codeblock>
  <i>-- Query for rows that match "fts5". Return a copy of the "body" column</i>
  <i>-- of each row with the matches surrounded by &lt;b&gt;&lt;/b&gt; tags.</i>
  SELECT highlight(email, 2, '&lt;b&gt;', '&lt;/b&gt;') FROM email WHERE email MATCH 'fts5'
</codeblock>

<p>A description of the available auxiliary functions, and more details
regarding configuration of the special "rank" column, are 
[FTS5 auxiliary functions | available below]. [FTS5 custom auxiliary functions|
Custom auxiliary functions] may also be implemented in C and registered with
FTS5, just as custom SQL functions may be registered with the SQLite core.
................................................................................
       document for it to match the query, 
  <li> sets of terms, prefix terms or phrases that appear within a specified
       proximity of each other (these are called "NEAR queries"), or
  <li> boolean combinations of any of the above.
</ul>

<p> Such advanced searches are requested by providing a more complicated 
FTS5 query string as the text to the right of the MATCH operator. The full

query syntax is [FTS5 query syntax | described here].

<h1 tags="FTS5 building">Compiling and Using FTS5</h1>















<p>The FTS5 source code consists of a series of *.c and other files in the
"ext/fts5" directory of the SQLite source tree. A build process reduces this
to just two files - "fts5.c" and "fts5.h" - which are used to build an SQLite
loadable extension or statically linked into an application.

<ol>
  <li> Obtain the latest SQLite code from fossil.
  <li> Create a Makefile as described in [How To Compile SQLite].
  <li> Build the "fts5.c" target. Which also creates fts5.h.
</ol>

................................................................................
<codeblock>
  $ wget -c http://www.sqlite.org/src/tarball/SQLite-trunk.tgz?uuid=trunk -O SQLite-trunk.tgz
  .... output ...
  $ tar -xzf SQLite-trunk.tgz
  $ cd SQLite-trunk
  $ ./configure && make fts5.c
  ... lots of output ...
  $ ls fts5.& 91;ch]
  fts5.c        fts5.h
</codeblock>

<p>
  The code in "fts5.c" may then be compiled into a loadable extension or
  statically linked into an application as described in 
  [Compiling Loadable Extensions]. There are two entry points defined, both
................................................................................
       configured with columnsize=0.

  <li> <p>Each inserted row must be accompanied by an explicitly specified rowid
       value. If a contentless table is configured with columnsize=0,
       attempting to insert a NULL value into the rowid is an SQLITE_MISMATCH
       error.

  <li> <p>All queries on the table must use the MATCH operator. Attempting to
       SELECT data from the table without a MATCH operator in the WHERE
       clause is an error.

</ul>

<p> The name of the table in which the xColumnSize values are stored
(unless columnsize=0 is specified) is "&lt;name&gt;_docsize", where 
&lt;name&gt; is the name of the FTS5 table itself. The 
<a href=https://www.sqlite.org/download.html>sqlite3_analyzer</a>
tool may be used on an existing database in order to determine how much
................................................................................
</codeblock>

<p> Instead of using bm25() with no trailing arguments, the specific auxiliary
function mapped to the rank column may be configured either on a per-query
basis, or by setting a different persistent default for the FTS table.

<p> In order to change the mapping of the rank column for a single query, 
a term similar to the following is added to the WHERE clause of a query:


<codeblock>
  rank MATCH 'auxiliary-function-name(arg1, arg2, ...)'

</codeblock>

<p> The right-hand-side of the MATCH clause must be a constant expression that
evaluates to a string consisting of the auxiliary function to invoke, followed
by zero or more comma separated arguments within parenthesis. Arguments must
be SQL literals. For example:

<codeblock>
  <i>-- The following queries are logically equivalent. But the second may</i>
  <i>-- be faster. See above. </i>
  SELECT * FROM fts WHERE fts MATCH ? ORDER BY bm25(fts, 10.0, 5.0);
  SELECT * FROM fts WHERE fts MATCH ? AND rank MATCH 'bm25(10.0, 5.0)' ORDER BY rank;
</codeblock>












<p> The default mapping of the rank column for a table may be modified 
using the [FTS5 rank configuration option].

<h1>Special INSERT Commands</h1>

<h2 tags="FTS5 automerge option">The 'automerge' Configuration Option</h2>







|
>
>
|
>
>
>
>
|
|
|
>
>



|

>
>







 







|







 







|
>
|



>
>
>
>
>
>
>
>
>
>
>
>
>
>
|
|
|
|







 







|







 







|
|
|
>







 







|
>



>


|
|
|
|







>
>
>
>
>
>
>
>
>
>
>







35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
..
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
...
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
...
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
...
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
...
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
[FTS5 CREATE TABLE Options | various options] that may be provided to FTS5 as
part of the CREATE VIRTUAL TABLE statement to configure various aspects of the
new table. These may be used to modify the way in which the FTS5 table extracts
terms from documents and queries, to create extra indexes on disk to speed up
prefix queries, or to create an FTS5 table that acts as an index on content
stored elsewhere.

<p>Once populated, there are three ways to execute a full-text query against
the contents of an FTS5 table:

<ul><li> Using a MATCH operator in the WHERE clause of a SELECT statement, or
    <li> Using an equals ("=") operator in the WHERE clause of a SELECT statement, or
    <li> using the [table-valued function] syntax.
</ul>

<p>If using the MATCH or = operators, the expression to the left of the MATCH
   operator must be the name of the FTS5 table. The expression on the right
   must be a text value specifying the term to search for. For the table-valued
   function syntax, the term to search for is specified as the first table argument.
   For example:

<codeblock>
  <i>-- Query for all rows that contain at least once instance of the term</i>
  <i>-- "fts5" (in any column). The following three queries are equivalent.</i>
  SELECT * FROM email WHERE email MATCH 'fts5';
  SELECT * FROM email WHERE email = 'fts5';
  SELECT * FROM email('fts5');
</codeblock>

<p> By default, FTS5 full-text searches are case-independent. Like any other
SQL query that does not contain an ORDER BY clause, the example above returns
results in an arbitrary order. To sort results by relevance (most to least
relevant), an ORDER BY may be added to a full-text query as follows:

................................................................................
term surrounded by html &lt;b&gt;&lt;/b&gt; tags. Auxiliary functions are
invoked in the same way as SQLite [corefunc | scalar functions], except that the name
of the FTS5 table is specified as the first argument. For example:

<codeblock>
  <i>-- Query for rows that match "fts5". Return a copy of the "body" column</i>
  <i>-- of each row with the matches surrounded by &lt;b&gt;&lt;/b&gt; tags.</i>
  SELECT highlight(email, 2, '&lt;b&gt;', '&lt;/b&gt;') FROM email('fts5');
</codeblock>

<p>A description of the available auxiliary functions, and more details
regarding configuration of the special "rank" column, are 
[FTS5 auxiliary functions | available below]. [FTS5 custom auxiliary functions|
Custom auxiliary functions] may also be implemented in C and registered with
FTS5, just as custom SQL functions may be registered with the SQLite core.
................................................................................
       document for it to match the query, 
  <li> sets of terms, prefix terms or phrases that appear within a specified
       proximity of each other (these are called "NEAR queries"), or
  <li> boolean combinations of any of the above.
</ul>

<p> Such advanced searches are requested by providing a more complicated 
FTS5 query string as the text to the right of the MATCH operator (or =
operator, or as the first argument to a table-valued function syntax). The 
full query syntax is [FTS5 query syntax | described here].

<h1 tags="FTS5 building">Compiling and Using FTS5</h1>

<h2>Building FTS5 as part of SQLite</h2>

<p>As of [version 3.9.0], FTS5 is included as part of the SQLite [amalgamation].
It is disabled by default. If using the two autoconf build system, it is
enabled by specifying the "--enable-fts5" option when running the configure
script. 

<p>Or, if sqlite3.c is compiled using some other build system, by arranging for
the SQLITE_ENABLE_FTS5 pre-processor symbol to be defined.

<h2>Building a Loadable Extension</h2>

<p>Alternatively, FTS5 may be built as a loadable extension.

<p>The canonical FTS5 source code consists of a series of *.c and other files
in the "ext/fts5" directory of the SQLite source tree. A build process reduces
this to just two files - "fts5.c" and "fts5.h" - which may be used to build an
SQLite loadable extension.

<ol>
  <li> Obtain the latest SQLite code from fossil.
  <li> Create a Makefile as described in [How To Compile SQLite].
  <li> Build the "fts5.c" target. Which also creates fts5.h.
</ol>

................................................................................
<codeblock>
  $ wget -c http://www.sqlite.org/src/tarball/SQLite-trunk.tgz?uuid=trunk -O SQLite-trunk.tgz
  .... output ...
  $ tar -xzf SQLite-trunk.tgz
  $ cd SQLite-trunk
  $ ./configure && make fts5.c
  ... lots of output ...
  $ ls fts5.&#91;ch]
  fts5.c        fts5.h
</codeblock>

<p>
  The code in "fts5.c" may then be compiled into a loadable extension or
  statically linked into an application as described in 
  [Compiling Loadable Extensions]. There are two entry points defined, both
................................................................................
       configured with columnsize=0.

  <li> <p>Each inserted row must be accompanied by an explicitly specified rowid
       value. If a contentless table is configured with columnsize=0,
       attempting to insert a NULL value into the rowid is an SQLITE_MISMATCH
       error.

  <li> <p>All queries on the table must be full-text queries. In other words,
       they must use the MATCH or = operator with the table-name column as the
       left-hand operand, or else use the table-valued function syntax. Any
       query that is not a full-text query results in an error.
</ul>

<p> The name of the table in which the xColumnSize values are stored
(unless columnsize=0 is specified) is "&lt;name&gt;_docsize", where 
&lt;name&gt; is the name of the FTS5 table itself. The 
<a href=https://www.sqlite.org/download.html>sqlite3_analyzer</a>
tool may be used on an existing database in order to determine how much
................................................................................
</codeblock>

<p> Instead of using bm25() with no trailing arguments, the specific auxiliary
function mapped to the rank column may be configured either on a per-query
basis, or by setting a different persistent default for the FTS table.

<p> In order to change the mapping of the rank column for a single query, 
a term similar to either of the following is added to the WHERE clause of a 
query:

<codeblock>
  rank MATCH 'auxiliary-function-name(arg1, arg2, ...)'
  rank = 'auxiliary-function-name(arg1, arg2, ...)'
</codeblock>

<p> The right-hand-side of the MATCH or = operator must be a constant
expression that evaluates to a string consisting of the auxiliary function to
invoke, followed by zero or more comma separated arguments within parenthesis.
Arguments must be SQL literals. For example:

<codeblock>
  <i>-- The following queries are logically equivalent. But the second may</i>
  <i>-- be faster. See above. </i>
  SELECT * FROM fts WHERE fts MATCH ? ORDER BY bm25(fts, 10.0, 5.0);
  SELECT * FROM fts WHERE fts MATCH ? AND rank MATCH 'bm25(10.0, 5.0)' ORDER BY rank;
</codeblock>

<p> The table-valued function syntax may also be used to specify an alternative
ranking function. In this case the text describing the ranking function should
be specified as the second table-valued function argument. The following three
queries are equivalent:

<codeblock>
  SELECT * FROM fts WHERE fts MATCH ? AND rank MATCH 'bm25(10.0, 5.0)' ORDER BY rank;
  SELECT * FROM fts WHERE fts = ? AND rank = 'bm25(10.0, 5.0)' ORDER BY rank;
  SELECT * FROM fts WHERE fts(?, 'bm25(10.0, 5.0)') ORDER BY rank;
</codeblock>

<p> The default mapping of the rank column for a table may be modified 
using the [FTS5 rank configuration option].

<h1>Special INSERT Commands</h1>

<h2 tags="FTS5 automerge option">The 'automerge' Configuration Option</h2>