Documentation Source Text

Check-in [8e18201f14]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update the description of fts4aux in fts3.in to account for recent changes.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 8e18201f14e50069195607c228fe4f573eba8b01
User & Date: dan 2011-02-05 17:00:49.038
Context
2011-02-10
20:25
Update pragma documentation to describe the parameters to the wal_checkpoint pragma that follow from the addition of sqlite3_wal_checkpoint_v2(). (check-in: aefb07aa90 user: drh tags: trunk)
2011-02-05
17:00
Update the description of fts4aux in fts3.in to account for recent changes. (check-in: 8e18201f14 user: dan tags: trunk)
2011-02-04
00:52
Update the ATTACH documentation to allow an expression as the filename parameter. Core ticket http://www.sqlite.org/src/info/9013e13dba. (check-in: be0c879c12 user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/fts3.in.
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333

1334
1335
1336
1337
1338
1339
1340
1341





1342



1343
1344




1345
1346



1347



1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361

1362

1363


1364


1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377


1378
1379
1380
1381
1382
1383
1384
  matchinfo() function efficiently.

<h1 id=fts4aux tags="fts4aux">Fts4aux - Direct Access to the Full-Text Index</h1>

<p>
  As of version 3.7.6, SQLite includes a new virtual table module called 
  "fts4aux", which can be used to inspect the full-text index of an exiting
  FTS3 or FTS4 table directly (despite its name, fts4aux works just as well
  with FTS3 tables as it does with FTS4 tables). Fts4aux tables are read-only.
  The only way to modify the contents of an fts4aux table is by modifying the
  contents of the associated FTS table. The fts4aux module is automatically
  included in all [compile fts|builds that include FTS].

<p>
  An fts4aux virtual table is constructed with a single argument - the 
  unqualified name of the FTS table that it will be used to access.
  For example:

<codeblock>
  <i>-- Create an FTS4 table</i>
  CREATE VIRTUAL TABLE ft USING fts4;

  <i>-- Create an fts4aux table to access the full-text index for table "ft"</i>
  CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);
</codeblock>

<p>

  There is one row in an fts4aux table for each distinct term in the
  associated FTS table. An fts4aux table always has the same three columns, 
  as follows, from left to right:

<table striped=1>
  <tr><th>Column Name<th>Column Contents
  <tr><td>term<td> 
    Contains the text of the term for this row.





  <tr><td>documents<td>



    Contains the number of rows of the FTS table that contain at 
    least one instance of the term (in any column). This column always 




    contains an integer value.
  <tr><td>occurrences<td>



    Contains the total number of instances of the term in all rows of the 



    FTS table. This column also always contains an integer value.
</table>

<p>
  For example, using the tables created above:

<codeblock>
  INSERT INTO ft VALUES('Apple banana Cherry');
  INSERT INTO ft VALUES('Banana Date Fig');
  INSERT INTO ft VALUES('Cherry Grapefruit Cherry');

  <i>-- The following query returns this data:</i>
  <i>--</i>
  <i>--     apple       |  1  |  1</i>

  <i>--     banana      |  2  |  2</i>

  <i>--     cherry      |  2  |  3</i>


  <i>--     date        |  1  |  1</i>


  <i>--     fig         |  1  |  1</i>
  <i>--     grapefruit  |  1  |  1</i>
  <i>--</i>
  SELECT term, documents, occurrences FROM ft_terms;
</codeblock>

<p>
  In the example, the values in the "term" column are all lower case, even
  though they were inserted into table "ft" in mixed case. This is because
  an fts3aux table contains the terms as extracted from the document text
  by the [tokenizer]. In this case, since table "ft" uses the 
  [tokenizer|simple tokenizer], this means all terms have been folded to
  lower case.



<p>
  During a transaction, some of the data written to an FTS table may be 
  cached in memory and written to the database only when the transaction is 
  committed. However the implementation of the fts4aux module is only able 
  to read data from the database. In practice this means that if an fts4aux 
  table is queried from within a transaction in which the associated 







|
|
|










|






>
|
|






>
>
>
>
>

>
>
>
|
|
>
>
>
>
|

>
>
>
|
>
>
>
|






|
|
|



|
>
|
>
|
>
>
|
>
>
|
|

|



|
|



|
>
>







1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
  matchinfo() function efficiently.

<h1 id=fts4aux tags="fts4aux">Fts4aux - Direct Access to the Full-Text Index</h1>

<p>
  As of version 3.7.6, SQLite includes a new virtual table module called 
  "fts4aux", which can be used to inspect the full-text index of an exiting
  FTS table directly. Despite its name, fts4aux works just as well with FTS3
  tables as it does with FTS4 tables. Fts4aux tables are read-only. The only
  way to modify the contents of an fts4aux table is by modifying the
  contents of the associated FTS table. The fts4aux module is automatically
  included in all [compile fts|builds that include FTS].

<p>
  An fts4aux virtual table is constructed with a single argument - the 
  unqualified name of the FTS table that it will be used to access.
  For example:

<codeblock>
  <i>-- Create an FTS4 table</i>
  CREATE VIRTUAL TABLE ft USING fts4(x, y);

  <i>-- Create an fts4aux table to access the full-text index for table "ft"</i>
  CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);
</codeblock>

<p>
  For each term present in the FTS table, there are between 2 and N+1 rows
  in the fts4aux table, where N is the number of user-defined columns in
  the associated FTS table. An fts4aux table always has the same four columns, 
  as follows, from left to right:

<table striped=1>
  <tr><th>Column Name<th>Column Contents
  <tr><td>term<td> 
    Contains the text of the term for this row.
  <tr><td>col<td> 
    This column may contain either the text value '*' (i.e. a single 
    character, UTF codepoint 42) or an integer between 0 and N-1, where N is
    again the number of user-defined columns in the corresponing FTS table.

  <tr><td>documents<td>
    This column always contains an integer value greater than zero.
    <br><br>
    If the "col" column contains the value '*', then this column
    contains the number of rows of the FTS table that contain at least one
    instance of the term (in any column). If col contains an integer
    value, then this column contains the number of rows of the FTS table that
    contain at least one instance of the term in the column identified by
    the col value. As usual, the columns of the FTS table are numbered
    from left to right, starting with zero.

  <tr><td>occurrences<td>
    This column also always contains an integer value greater than zero.
    <br><br>
    If the "col" column contains the value '*', then this column
    contains the total number of instances of the term in all rows of the 
    FTS table (in any column). Otherwise, if col contains an integer
    value, then this column contains the total number of instances of the
    term that appear in the FTS table column identified by the col
    value.
</table>

<p>
  For example, using the tables created above:

<codeblock>
  INSERT INTO ft(x, y) VALUES('Apple banana', 'Cherry');
  INSERT INTO ft(x, y) VALUES('Banana Date Date', 'cherry');
  INSERT INTO ft(x, y) VALUES('Cherry Elderberry', 'Elderberry');

  <i>-- The following query returns this data:</i>
  <i>--</i>
  <i>--     apple       |  *  |  1  |  1</i>
  <i>--     apple       |  0  |  1  |  1</i>
  <i>--     banana      |  *  |  2  |  2</i>
  <i>--     banana      |  0  |  2  |  2</i>
  <i>--     cherry      |  *  |  3  |  3</i>
  <i>--     cherry      |  0  |  1  |  1</i>
  <i>--     cherry      |  1  |  2  |  2</i>
  <i>--     date        |  *  |  1  |  2</i>
  <i>--     date        |  0  |  1  |  2</i>
  <i>--     elderberry  |  *  |  1  |  2</i>
  <i>--     elderberry  |  1  |  1  |  1</i>
  <i>--     elderberry  |  1  |  1  |  1</i>
  <i>--</i>
  SELECT term, col, documents, occurrences FROM ft_terms;
</codeblock>

<p>
  In the example, the values in the "term" column are all lower case, 
  even though they were inserted into table "ft" in mixed case. This is because
  an fts3aux table contains the terms as extracted from the document text
  by the [tokenizer]. In this case, since table "ft" uses the 
  [tokenizer|simple tokenizer], this means all terms have been folded to
  lower case. Also, there is (for example) no row with column "term"
  set to "apple" and column "col" set to 1. Since there are no instances
  of the term "apple" in column 1, no row is present in the fts4aux table.

<p>
  During a transaction, some of the data written to an FTS table may be 
  cached in memory and written to the database only when the transaction is 
  committed. However the implementation of the fts4aux module is only able 
  to read data from the database. In practice this means that if an fts4aux 
  table is queried from within a transaction in which the associated