Documentation Source Text

Check-in [22655b73bd]

Overview
Comment: Update the "Differences from FTS3/4" section of fts5.html.
SHA1: 22655b73bd7daf4d89e3431f8cfda784a3ce9fa4
User & Date: dan 2015-06-10 16:28:34.719
Context
2015-06-17 15:59: Update the change log. Better fragments for matchinfo() flags. (check-in: dcccfa67e1 user: drh tags: trunk)
2015-06-10 16:28: Update the "Differences from FTS3/4" section of fts5.html. (check-in: 22655b73bd user: dan tags: trunk)
2015-06-10 13:01: Add documentation for the columnsize=0 option to fts5.html. (check-in: e30da64703 user: dan tags: trunk)
Changes
Changes to pages/fts5.in.
Old version, lines 99-115:

<p> Such advanced searches are requested by providing a more complicated 
FTS5 query string as the text to the right of the MATCH operator. The full
query syntax is [FTS5 query syntax | described here].

<h2>Differences between FTS5 and FTS3/4</h2>

<p> Also available is the similar but more mature [fts3 | FTS3/4] module. 
Apart from the exciting new name, FTS5 differs from FTS3/4 in the following
ways:

<ul>
  <li> <p>FTS5 supports "ORDER BY rank" for returning results in order of
       decreasing relevancy.

  <li> <p>FTS5 features an API allowing users to create custom auxiliary 
       functions for advanced ranking and text processing applications. The







New version, lines 99-165:

<p> Such advanced searches are requested by providing a more complicated 
FTS5 query string as the text to the right of the MATCH operator. The full
query syntax is [FTS5 query syntax | described here].

<h2>Differences between FTS5 and FTS3/4</h2>

<p> Also available is the similar but more mature [fts3 | FTS3/4] module.
FTS5 is similar to FTS3/4 in that the primary task of each is to maintain
an index mapping from each unique token to a list of instances of that token 
within a set of documents, where each instance is identified by the document 
in which it appears and its position within that document. For example:

<codeblock>
  <i>-- Given the following SQL:</i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b);
  INSERT INTO ft(rowid, a, b) VALUES(1, 'X Y', 'Y Z');
  INSERT INTO ft(rowid, a, b) VALUES(2, 'A Z', 'Y Y');

  <i>-- The FTS5 module creates the following mapping on disk:</i>
  A --&gt; (2, 0, 0)
  X --&gt; (1, 0, 0)
  Y --&gt; (1, 0, 1) (1, 1, 0) (2, 1, 0) (2, 1, 1)
  Z --&gt; (1, 1, 1) (2, 0, 1)
</codeblock>

<p>In the example above, each triple identifies the location of a token
instance by rowid, column number (columns are numbered sequentially
starting at 0 from left to right) and position within the column value (the
first token in a column value is 0, the second is 1, and so on). Using this
index, FTS5 is able to provide timely answers to queries such as "the set
of all documents that contain the token 'A'", or "the set of all documents
that contain the sequence 'Y Z'". The list of instances associated with a
single token is called an "instance-list".
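
<p>The mapping described above can be sketched in a few lines of Python. This
is an illustration only: it assumes a plain whitespace tokenizer and an
in-memory dictionary, and the helper name build_instance_lists is
hypothetical. It says nothing about how FTS5 actually encodes the index on
disk.

```python
from collections import defaultdict


def build_instance_lists(rows):
    """Map each token to a sorted list of (rowid, column, position) triples.

    rows is an iterable of (rowid, [column values]). Tokens are split on
    whitespace, which is an assumption made for this sketch.
    """
    index = defaultdict(list)
    for rowid, columns in rows:
        for col, text in enumerate(columns):
            for pos, token in enumerate(text.split()):
                index[token].append((rowid, col, pos))
    # Sort tokens and instances so the result is deterministic.
    return {token: sorted(instances) for token, instances in sorted(index.items())}


if __name__ == "__main__":
    # Same two rows as the SQL example: columns a and b.
    rows = [(1, ["X Y", "Y Z"]), (2, ["A Z", "Y Y"])]
    for token, instances in build_instance_lists(rows).items():
        print(token, "-->", " ".join(str(i) for i in instances))
```

<p>Each list in the returned dictionary plays the role of one instance-list:
looking up a token yields every place it occurs without scanning the documents.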

<p>The principal difference between FTS3/4 and FTS5 is that in FTS3/4,
each instance-list is stored as a single large database record, whereas
in FTS5 large instance-lists are divided between multiple database records.
This has the following implications for dealing with large databases that
contain large lists:

<ul>
  <li> <p>FTS5 is able to load instance-lists into memory incrementally in
       order to reduce memory usage and peak allocation size. FTS3/4 very
       often loads entire instance-lists into memory.

  <li> <p>When processing queries that feature more than one token, FTS5 is
       sometimes able to determine that the query can be answered by
       inspecting a subset of a large instance-list. FTS3/4 almost always
       has to traverse entire instance-lists.

  <li> <p>If an instance-list grows so large that it exceeds
       the [SQLITE_MAX_LENGTH] limit, FTS3/4 is unable to handle it. FTS5
       does not have this problem. 
</ul>

<p>For these reasons, many complex queries may use less memory and run faster 
using FTS5.
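
<p>The effect of dividing an instance-list between several records can be
sketched as follows. This is a simplified model, not the FTS5 record format:
the function names and the fixed per_record size are assumptions made for
illustration. The point is only that a lookup over a sorted, chunked list can
stop after reading some of the records.

```python
def split_into_records(instances, per_record):
    """Divide one sorted instance-list into fixed-size records (hypothetical layout)."""
    return [instances[i:i + per_record] for i in range(0, len(instances), per_record)]


def contains_rowid(records, rowid):
    """Return (found, records_loaded) for a sorted, chunked instance-list.

    Records are visited one at a time, so only the records read so far need
    to be held in memory, and the scan stops as soon as the answer is known.
    """
    loaded = 0
    for record in records:
        loaded += 1
        for instance_rowid, _col, _pos in record:
            if instance_rowid == rowid:
                return True, loaded
            if instance_rowid > rowid:
                # The list is sorted, so the rowid cannot appear later.
                return False, loaded
    return False, loaded


if __name__ == "__main__":
    # One token that occurs once in each of 100 documents.
    instances = [(rowid, 0, 0) for rowid in range(1, 101)]
    records = split_into_records(instances, per_record=10)
    print(contains_rowid(records, 15))
    print(contains_rowid(records, 200))
```

<p>With a single large record, as in the FTS3/4 layout described above, the
whole list would have to be read before the first comparison could be made.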

<p>Some other ways in which FTS5 differs from FTS3/4 are:

<ul>
  <li> <p>FTS5 supports "ORDER BY rank" for returning results in order of
       decreasing relevancy.

  <li> <p>FTS5 features an API allowing users to create custom auxiliary 
       functions for advanced ranking and text processing applications. The

Old version, lines 128-144:
       b-trees that make up its full-text index within an INSERT, UPDATE or
       DELETE statement executed by the user. This means that any operation
       on an FTS3/4 table may turn out to be surprisingly slow, as FTS3/4 
       may unpredictably choose to merge together two or more large b-trees
       within it. FTS5 uses incremental merging by default, which limits
       the amount of processing that may take place within any given 
       INSERT, UPDATE or DELETE operation.

  <li> <p>FTS5 uses significantly less memory when one or more terms in
       a query match a very large number of documents. 
</ul>

<h1 tags="FTS5 query syntax">Full-text Query Syntax</h1>

<p>
The following block contains a summary of the FTS query syntax in BNF form.
A detailed explanation follows.







New version, lines 178-191:
       b-trees that make up its full-text index within an INSERT, UPDATE or
       DELETE statement executed by the user. This means that any operation
       on an FTS3/4 table may turn out to be surprisingly slow, as FTS3/4 
       may unpredictably choose to merge together two or more large b-trees
       within it. FTS5 uses incremental merging by default, which limits
       the amount of processing that may take place within any given 
       INSERT, UPDATE or DELETE operation.



</ul>
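
<p>As a usage sketch for the "ORDER BY rank" item in the list above, the
following Python fragment runs a ranked query through the standard sqlite3
module. It assumes the SQLite library linked into Python was built with FTS5;
where it was not, the helper returns None. The helper name search_by_rank and
the sample rows are hypothetical.

```python
import sqlite3


def search_by_rank(docs, query):
    """Return matching rowids, most relevant first, or None if FTS5 is unavailable.

    docs is a list of (rowid, body) pairs.
    """
    conn = sqlite3.connect(":memory:")
    try:
        try:
            conn.execute("CREATE VIRTUAL TABLE ft USING fts5(body)")
        except sqlite3.OperationalError:
            return None  # This SQLite build does not include the FTS5 module.
        conn.executemany("INSERT INTO ft(rowid, body) VALUES(?, ?)", docs)
        cursor = conn.execute(
            "SELECT rowid FROM ft WHERE ft MATCH ? ORDER BY rank", (query,)
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    docs = [
        (1, "sqlite and several other unrelated filler words"),
        (2, "sqlite sqlite sqlite"),
        (3, "nothing relevant"),
    ]
    print(search_by_rank(docs, "sqlite"))
```

<p>Rows that do not match the query are omitted, and the remaining rows come
back in the order chosen by the default ranking function.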

<h1 tags="FTS5 query syntax">Full-text Query Syntax</h1>

<p>
The following block contains a summary of the FTS query syntax in BNF form.
A detailed explanation follows.