Documentation Source Text

Check-in [984c3483df]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Further enhancements to fts5.in.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 984c3483dfdaeeca9a3dff93abab0e1d9fce1db4
User & Date: dan 2015-03-06 19:29:24
Context
2015-03-07
11:16
Improve the documentation of fts5 auxiliary functions. check-in: a358d9babf user: dan tags: trunk
2015-03-06
19:29
Further enhancements to fts5.in. check-in: 984c3483df user: dan tags: trunk
2015-03-05
21:00
Add a couple of details to fts5.in. check-in: 23989471bd user: dan tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fts5.in.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

19
20

21
22
23

24
25



26
27
28


















29
30
31
32



33
34
35




36






























37
38
39
40
41
42
43
44






45
46








































47
48
49
50
51
52
53
54
...
222
223
224
225
226
227
228






































229
230
231
232
233
234
235
236
...
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
...
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
...
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
...
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
...
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
...
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
...
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
...
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
...
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969

<tcl>hd_keywords *fts5 FTS5</tcl>
<title>SQLite FTS5 Extension</title>

<table_of_contents>

<h2 style="margin-left:1.0em" notoc> Overview</h2>

<h1>CREATE TABLE Arguments</h1>

<p>Each argument specified as part of a "CREATE VIRTUAL TABLE ... USING fts5 
..." statement is either a column name or a configuration option. A column
name consists of a single FTS5 bareword or a single string literal quoted
in any manner acceptable to SQLite. A configuration option consists of an
FTS5 bareword - the option name - followed by an "=" character, followed by
the option value. The option value is specified using either a single FTS5 
bareword or a string literal, again quoted in any manner acceptable to the
SQLite core. Anything else is a syntax error.


<h2>Column Names</h2>


<p>It is an error to attempt to name an fts5 table column "rowid" or "rank". 
This is not supported.


<h2>Configuration Options</h2>




<p> A configuration option consists of an FTS5 bareword - the option name -
followed by an "=" character, followed by a either an FTS5 bareword or a


















string literal. For example:

<codeblock>
  CREATE VIRTUAL TABLE mail USING fts5(sender, title, body, tokenize = 'porter ascii');



</codeblock>

<p> There are currently the following configuration options:



































<ul>
  <li> The "tokenize" option, used to configure a [FTS5 tokenizers | custom tokenizer].
  <li> The "prefix" option, used to add [FTS5 prefix indexes | prefix indexes]
       to an FTS5 table.
  <li> The "content" option, used to make the FTS5 table an 
       [FTS5 content option | external content or contentless table].
  <li> The "content_rowid" option, used to set the rowid field of an 
  [FTS5 external content tables | external content table].






</ul>









































<h1>Full-text Query Syntax</h1>

<p>
The following block contains a summary of the FTS query syntax in BNF form.
A detailed explanation follows.

<codeblock>
&lt;phrase&gt;    := string &#91;*]
................................................................................
  ... MATCH 'NEAR(one two) three'   <i>-- 'NEAR(one two) AND three'</i>
  ... MATCH 'one OR two three'      <i>-- 'one OR two AND three'</i>

  ... MATCH '(one OR two) three'    <i>-- Syntax error!</i>
  ... MATCH 'func(one two)'         <i>-- Syntax error!</i>
</codeblock>







































<h1 tags="FTS5 prefix indexes">Prefix Indexes</h1>

<p> By default, FTS5 maintains a single index recording the location of each
token instance within the document set. This means that querying for complete
tokens is fast, as it requires a single lookup, but querying for a prefix 
token can be slow, as it requires a range scan. For example, to query for
the prefix token "abc*" requires a range scan of all tokens greater than
or equal to "abc" and less than "abd".
................................................................................
<codeblock>
  <i>-- Two ways to create an FTS5 table that maintains prefix indexes for
  -- two and three character prefix tokens.</i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b, prefix='2 3');
  CREATE VIRTUAL TABLE ft USING fts5(a, b, prefix=2, prefix=3);
</codeblock>

<h1 tags="FTS5 tokenizers">Tokenizers</h1>

<p> The CREATE VIRTUAL TABLE "tokenize" option is used to configure the
specific tokenizer used by the FTS5 table. The option argument must be either
an FTS5 bareword, or an SQL text literal. The text of the argument is itself
treated as a white-space series of one or more FTS5 barewords or SQL text
literals. The first of these is the name of the tokenizer to use. The second
and subsequent list elements, if they exist, are arguments passed to the
................................................................................

  <li> The <b>porter</b> tokenizer, which implements the 
<a href=http://tartarus.org/martin/PorterStemmer/>porter stemming algorithm</a>.
</ul>

<p> It is also possible to create custom tokenizers for FTS5. The API for doing so is [custom tokenizers | described here].

<h2>Unicode61 Tokenizer</h2>

<p> The unicode tokenizer classifies all unicode characters as either 
"separator" or "token" characters. By default all space and punctuation
characters, as defined by Unicode 6.1, are considered separators, and all 
other characters as token characters. Each contiguous run of one or more 
token characters is considered to be a token. The tokenizer is case-insensitive
according to the rules defined by Unicode 6.1.
................................................................................
  -- script characters, and that considers hyphens and underscore characters
  -- to be part of tokens. </i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b, 
      tokenize = "unicode61 remove_diacritics 0 tokenchars '-_'"
  );
</codeblock>

<h2>Ascii Tokenizer</h2>

<p> The Ascii tokenizer is similar to the Unicode61 tokenizer, except that:

<ul>
  <li> All non-ASCII characters (those with codepoints greater than 127) are
  always considered token characters. If any non-ASCII characters are specified
  as part of the separators option, they are ignored.  
................................................................................
  <i>-- Create an FTS5 table that uses the ascii tokenizer, but does not
  -- consider numeric characters to be part of tokens.</i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b, 
      tokenize = "ascii separators '0123456789'"
  );
</codeblock>

<h2>Porter Tokenizer</h2>

<p> The porter tokenizer is a wrapper tokenizer. It takes the output of some
other tokenizer and applies the 
<a href=http://tartarus.org/martin/PorterStemmer/>porter stemming algorithm</a>
to each token before it returns it to FTS5. This allows search terms like
"correction" to match similar words such as "corrected" or "correcting". The
porter stemmer algorithm is designed for use with English language terms 
................................................................................
  CREATE VIRTUAL TABLE t1 USING fts5(x, tokenize = 'porter unicode61');

  <i>-- A porter tokenizer used to stem the output of the unicode61 tokenizer,
  -- with diacritics removed before stemming.</i>
  CREATE VIRTUAL TABLE t1 USING fts5(x, tokenize = 'porter unicode61 remove_diacritics 1');
</codeblock>

<h1 tags="FTS5 content option">External Content and Contentless Tables</h1>

<p>
Normally, when a row is inserted into an FTS5 table, as well as the various
full-text index entries and other data a copy of the row is stored in a private
table managed by the FTS5 module. When column values are requested from the
FTS5 table by the user or by an auxiliary function implementation, they are
read from this private table. The "content" option may be used to create an
................................................................................
       values. This is known as an "external content" table. In this case all
       FTS5 functionality may be used, but it is the responsibility of the user
       to ensure that the contents of the full-text index are consistent with
       the named database object. If they are not, query results may be
       unpredictable.  
</ul>

<h2 tags="FTS5 contentless tables">Contentless Tables</h2>

<p> A contentless FTS5 table is created by setting the "content" option to
an empty string. For example:

<codeblock>
  CREATE VIRTUAL TABLE f1 USING fts5(a, b, c, content='');
</codeblock>
................................................................................
<p> Contentless FTS5 tables do not support UPDATE or DELETE statements, or
INSERT statements that do not supply a non-NULL value for the rowid field.
Rows may be deleted from a contentless table using an [FTS5 delete command].

<p> Attempting to read any column value except the rowid from a contentless
FTS5 table returns an SQL NULL value.

<h2 tags="FTS5 external content tables">External Content Tables</h2>

<p> An external content FTS5 table is created by setting the content 
option to the name of a table, virtual table or view (hereafter the "content
table") within the same database. Whenever column values are required by
FTS5, it queries the content table as follows, with the rowid of the row
for which values are required bound to the SQL variable:

................................................................................
  CREATE TRIGGER tbl_au AFTER UPDATE ON tbl BEGIN
    INSERT INTO fts_idx(fts_idx, rowid, b, c) VALUES('delete', old.a, old.b, old.c);
    INSERT INTO fts_idx(rowid, b, c) VALUES (new.a, new.b, new.c);
  END;
</codeblock>


<h1> Auxiliary Functions </h1>

<h2>Built-in Auxiliary Functions</h2>

<h3>The bm25() function</h3>

<p> The built-in auxiliary function bm25() returns a real value indicating
how well the current row matches the full-text query. The better the match,
................................................................................
<tclscript>
  set res ""
  set ::extract_api_docs_mode tokenizer_api
  catch { set res [source [file join $::SRC ext/fts5/extract_api_docs.tcl]] }
  set res
</tclscript>

<h2>Custom Auxiliary Functions</h2>

<p> Implementing a custom auxiliary function is similar to implementing an
[application-defined SQL function | scalar SQL function]. The implementation
should be a C function of type fts5_extension_function, defined as follows:

<codeblock>
<tclscript>






|

|

|
|
|
|
|
|
|
|
>

<
>

<
|
>

<
>
>
>

<
<
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|


<
>
>
>


<
>
>
>
>

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

<
<
<
<
<
<
<
>
>
>
>
>
>


>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|







 







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







 







|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

21
22

23
24
25

26
27
28
29


30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50

51
52
53
54
55

56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91







92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
...
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
...
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
...
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
...
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
...
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
...
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
...
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
...
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
...
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
....
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100

<tcl>hd_keywords *fts5 FTS5</tcl>
<title>SQLite FTS5 Extension</title>

<table_of_contents>

<h2 style="margin-left:1.0em" notoc> Overview</h2> 

<h1>Overview of FTS5</h1>

<p>FTS5 is an SQLite [virtual table module] that provides 
<a href=http://en.wikipedia.org/wiki/Full_text_search>full-text search</a>
functionality to database applications. In their most elementary form, 
full-text search engines allow the user to efficiently search a large 
collection of documents for the subset that contain one or more instances of a
search term. The search functionality provided to world wide web users by
<a href=www.google.com>Google</a> is, amongst other things, a full-text search
engine, as it allows users to search for all documents on the web that contain,
for example, the term "fts5".


<h2>Overview of Functionality</h2>


<p>To use FTS5, the user creates an FTS5 virtual table with one or more
columns. For example:


<codeblock>
  CREATE VIRTUAL TABLE email USING fts5(sender, title, body);
</codeblock>



<p>It is an error to add types, constraints or [PRIMARY KEY] declarations to 
a CREATE VIRTUAL TABLE statement used to create an FTS5 table. Once created,
an FTS5 table may be populated using [INSERT], [UPDATE] or [DELETE] statements
like any other table. Like any other table with no PRIMARY KEY declaration, an
FTS5 table has an implicit INTEGER PRIMARY KEY field named rowid. 

<p>Not shown in the example above is that there are also 
[FTS5 CREATE TABLE Options | various options] that may be provided to FTS5 as
part of the CREATE VIRTUAL TABLE statement to configure various aspects of the
new table. These may be used to modify the way in which the FTS5 table extracts
terms from documents and queries, to create extra indexes on disk to speed up
prefix queries, or to create an FTS5 table that acts as an index on content
stored elsewhere.

<p>Once populated, a full-text query may be executed on the contents of an
FTS5 table by adding a MATCH constraint to the WHERE clause of a SELECT 
query. The expression to the right of the MATCH operator must be the name
of the FTS5 table. The expression on the left must be a text value specifying
the term to search for. For example: 

<codeblock>

  <i>-- Query for all rows that contain at least once instance of the term</i>
  <i>-- "fts5" (in any column).</i>
  SELECT * FROM email WHERE email MATCH 'fts5';
</codeblock>


<p> By default, FTS5 full-text searches are case-independent. Like any other
SQL query that does not contain an ORDER BY clause, the example above returns
results in an arbitrary order. To sort results by relevance (most to least
relevant), an ORDER BY may be added to a full-text query as follows:

<codeblock>
  <i>-- Query for all rows that contain at least once instance of the term</i>
  <i>-- "fts5" (in any column). Return results in order from best to worst</i>
  <i>-- match.  </i>
  SELECT * FROM email WHERE email MATCH 'fts5' ORDER BY rank;
</codeblock>

<p> As well as the column values and rowid of a matching row, an application
may use [FTS5 auxiliary functions] to retrieve extra information regarding
the matched row. For example, an auxiliary function may be used to retrieve
a copy of a column value for a matched row with all instances of the matched
term surrounded by html &lt;b&gt;&lt;/b&gt; tags. Auxiliary functions are
invoked in the same way as SQLite [corefunc | scalar functions], except that the name
of the FTS5 table is specified as the first argument. For example:

<codeblock>
  <i>-- Query for rows that match "fts5". Return a copy of the "body" column</i>
  <i>-- of each row with the matches surrounded by &lt;b&gt;&lt;/b&gt; tags.</i>
  SELECT highlight(email, 2, '&lt;b&gt;', '&lt;/b&gt;') FROM email WHERE email MATCH 'fts5'
</codeblock>

<p>A description of the available auxiliary functions, and more details
regarding configuration of the special "rank" column, are 
[FTS5 auxiliary functions | available below]. [FTS5 custom auxiliary functions|
Custom auxiliary functions] may also be implemented in C and registered with
FTS5, just as custom SQL functions may be registered with the SQLite core.

<p> As well as seaching for all rows that contain a term, FTS5 allows 
the user to search for rows that contain:

<ul>







  <li> any terms that begin with a specified prefix,
  <li> "phrases" - sequences of terms or prefix terms that must feature in a
       document for it to match the query, 
  <li> sets of terms, prefix terms or phrases that appear within a specified
       proximity of each other (these are called "NEAR queries"), or
  <li> boolean combinations of any of the above.
</ul>

<p> Such advanced searches are requested by providing a more complicated 
FTS5 query string as the text to the right of the MATCH operator. The full
query syntax is [FTS5 query syntax | described here].

<h2>Differences between FTS5 and FTS3/4</h2>

<p> Also available is the similar but more mature [fts3 | FTS3/4] module. 
Apart from the exciting new name, FTS5 differs from FTS3/4 in the following
ways:

<ul>
  <li> <p>FTS5 supports "ORDER BY rank" for returning results in order of
       decreasing relevancy.

  <li> <p>FTS5 features an API allowing users to create custom auxiliary 
       functions for advanced ranking and text processing applications. The
       special "rank" column may be mapped to a custom auxiliary function
       so that adding "ORDER BY rank" to a query works as expected.

  <li> <p>FTS5 recognizes unicode separator characters and case equivalence by
       default. This is also possible using FTS3/4, but must be explicitly
       enabled.

  <li> <p>The query syntax has been revised where necessary to remove
       ambiguities and to make it possible to escape special characters
       in query terms.

  <li> <p>By default, FTS3/4 occasionally merges together two or more of the
       b-trees that make up its full-text index within an INSERT, UPDATE or
       DELETE statement executed by the user. This means that any operation
       on an FTS3/4 table may turn out to be surprisingly slow, as FTS3/4 
       may unpredictably choose to merge together two or more large b-trees
       within it. FTS5 uses incremental merging by default, which limits
       the amount of processing that may take place within any given 
       INSERT, UPDATE or DELETE operation.

  <li> <p>FTS5 uses significantly less memory when one or more terms in
       a query match a very large number of documents. 
</ul>

<h1 tags="FTS5 query syntax">Full-text Query Syntax</h1>

<p>
The following block contains a summary of the FTS query syntax in BNF form.
A detailed explanation follows.

<codeblock>
&lt;phrase&gt;    := string &#91;*]
................................................................................
  ... MATCH 'NEAR(one two) three'   <i>-- 'NEAR(one two) AND three'</i>
  ... MATCH 'one OR two three'      <i>-- 'one OR two AND three'</i>

  ... MATCH '(one OR two) three'    <i>-- Syntax error!</i>
  ... MATCH 'func(one two)'         <i>-- Syntax error!</i>
</codeblock>

<h1 tags="FTS5 CREATE TABLE Options">FTS5 Table Creation and Initialization</h1>

<p>Each argument specified as part of a "CREATE VIRTUAL TABLE ... USING fts5 
..." statement is either a column name or a configuration option. A column
name consists of a single FTS5 bareword or a single string literal quoted
in any manner acceptable to SQLite. A configuration option consists of an
FTS5 bareword - the option name - followed by an "=" character, followed by
the option value. The option value is specified using either a single FTS5 
bareword or a string literal, again quoted in any manner acceptable to the
SQLite core. Anything else is a syntax error.

<p>It is an error to attempt to name an fts5 table column "rowid" or "rank",
or to assign the same name to a column as is used by the table itself. This 
is not supported.

<p>It is also an error to add a type, constraint specification or PRIMARY KEY
clause to a column name in an FTS5 CREATE VIRTUAL TABLE statement.

<p> A configuration option consists of an FTS5 bareword - the option name -
followed by an "=" character, followed by a either an FTS5 bareword or a
string literal. For example:

<codeblock>
  CREATE VIRTUAL TABLE mail USING fts5(sender, title, body, tokenize = 'porter ascii');
</codeblock>

<p> There are currently the following configuration options:

<ul>
  <li> The "tokenize" option, used to configure a [FTS5 tokenizers | custom tokenizer].
  <li> The "prefix" option, used to add [FTS5 prefix indexes | prefix indexes]
       to an FTS5 table.
  <li> The "content" option, used to make the FTS5 table an 
       [FTS5 content option | external content or contentless table].
  <li> The "content_rowid" option, used to set the rowid field of an 
  [FTS5 external content tables | external content table].
</ul>

<h2 tags="FTS5 prefix indexes">Prefix Indexes</h2>

<p> By default, FTS5 maintains a single index recording the location of each
token instance within the document set. This means that querying for complete
tokens is fast, as it requires a single lookup, but querying for a prefix 
token can be slow, as it requires a range scan. For example, to query for
the prefix token "abc*" requires a range scan of all tokens greater than
or equal to "abc" and less than "abd".
................................................................................
<codeblock>
  <i>-- Two ways to create an FTS5 table that maintains prefix indexes for
  -- two and three character prefix tokens.</i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b, prefix='2 3');
  CREATE VIRTUAL TABLE ft USING fts5(a, b, prefix=2, prefix=3);
</codeblock>

<h2 tags="FTS5 tokenizers">Tokenizers</h2>

<p> The CREATE VIRTUAL TABLE "tokenize" option is used to configure the
specific tokenizer used by the FTS5 table. The option argument must be either
an FTS5 bareword, or an SQL text literal. The text of the argument is itself
treated as a white-space series of one or more FTS5 barewords or SQL text
literals. The first of these is the name of the tokenizer to use. The second
and subsequent list elements, if they exist, are arguments passed to the
................................................................................

  <li> The <b>porter</b> tokenizer, which implements the 
<a href=http://tartarus.org/martin/PorterStemmer/>porter stemming algorithm</a>.
</ul>

<p> It is also possible to create custom tokenizers for FTS5. The API for doing so is [custom tokenizers | described here].

<h3>Unicode61 Tokenizer</h3>

<p> The unicode tokenizer classifies all unicode characters as either 
"separator" or "token" characters. By default all space and punctuation
characters, as defined by Unicode 6.1, are considered separators, and all 
other characters as token characters. Each contiguous run of one or more 
token characters is considered to be a token. The tokenizer is case-insensitive
according to the rules defined by Unicode 6.1.
................................................................................
  -- script characters, and that considers hyphens and underscore characters
  -- to be part of tokens. </i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b, 
      tokenize = "unicode61 remove_diacritics 0 tokenchars '-_'"
  );
</codeblock>

<h3>Ascii Tokenizer</h3>

<p> The Ascii tokenizer is similar to the Unicode61 tokenizer, except that:

<ul>
  <li> All non-ASCII characters (those with codepoints greater than 127) are
  always considered token characters. If any non-ASCII characters are specified
  as part of the separators option, they are ignored.  
................................................................................
  <i>-- Create an FTS5 table that uses the ascii tokenizer, but does not
  -- consider numeric characters to be part of tokens.</i>
  CREATE VIRTUAL TABLE ft USING fts5(a, b, 
      tokenize = "ascii separators '0123456789'"
  );
</codeblock>

<h3>Porter Tokenizer</h3>

<p> The porter tokenizer is a wrapper tokenizer. It takes the output of some
other tokenizer and applies the 
<a href=http://tartarus.org/martin/PorterStemmer/>porter stemming algorithm</a>
to each token before it returns it to FTS5. This allows search terms like
"correction" to match similar words such as "corrected" or "correcting". The
porter stemmer algorithm is designed for use with English language terms 
................................................................................
  CREATE VIRTUAL TABLE t1 USING fts5(x, tokenize = 'porter unicode61');

  <i>-- A porter tokenizer used to stem the output of the unicode61 tokenizer,
  -- with diacritics removed before stemming.</i>
  CREATE VIRTUAL TABLE t1 USING fts5(x, tokenize = 'porter unicode61 remove_diacritics 1');
</codeblock>

<h2 tags="FTS5 content option">External Content and Contentless Tables</h2>

<p>
Normally, when a row is inserted into an FTS5 table, as well as the various
full-text index entries and other data a copy of the row is stored in a private
table managed by the FTS5 module. When column values are requested from the
FTS5 table by the user or by an auxiliary function implementation, they are
read from this private table. The "content" option may be used to create an
................................................................................
       values. This is known as an "external content" table. In this case all
       FTS5 functionality may be used, but it is the responsibility of the user
       to ensure that the contents of the full-text index are consistent with
       the named database object. If they are not, query results may be
       unpredictable.  
</ul>

<h3 tags="FTS5 contentless tables">Contentless Tables</h3>

<p> A contentless FTS5 table is created by setting the "content" option to
an empty string. For example:

<codeblock>
  CREATE VIRTUAL TABLE f1 USING fts5(a, b, c, content='');
</codeblock>
................................................................................
<p> Contentless FTS5 tables do not support UPDATE or DELETE statements, or
INSERT statements that do not supply a non-NULL value for the rowid field.
Rows may be deleted from a contentless table using an [FTS5 delete command].

<p> Attempting to read any column value except the rowid from a contentless
FTS5 table returns an SQL NULL value.

<h3 tags="FTS5 external content tables">External Content Tables</h3>

<p> An external content FTS5 table is created by setting the content 
option to the name of a table, virtual table or view (hereafter the "content
table") within the same database. Whenever column values are required by
FTS5, it queries the content table as follows, with the rowid of the row
for which values are required bound to the SQL variable:

................................................................................
  CREATE TRIGGER tbl_au AFTER UPDATE ON tbl BEGIN
    INSERT INTO fts_idx(fts_idx, rowid, b, c) VALUES('delete', old.a, old.b, old.c);
    INSERT INTO fts_idx(rowid, b, c) VALUES (new.a, new.b, new.c);
  END;
</codeblock>


<h1 tags="FTS5 auxiliary functions"> Auxiliary Functions </h1>

<h2>Built-in Auxiliary Functions</h2>

<h3>The bm25() function</h3>

<p> The built-in auxiliary function bm25() returns a real value indicating
how well the current row matches the full-text query. The better the match,
................................................................................
<tclscript>
  set res ""
  set ::extract_api_docs_mode tokenizer_api
  catch { set res [source [file join $::SRC ext/fts5/extract_api_docs.tcl]] }
  set res
</tclscript>

<h2 tags="FTS5 custom auxiliary functions">Custom Auxiliary Functions</h2>

<p> Implementing a custom auxiliary function is similar to implementing an
[application-defined SQL function | scalar SQL function]. The implementation
should be a C function of type fts5_extension_function, defined as follows:

<codeblock>
<tclscript>