SQLite

Check-in [64e8a116f3]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Add a comment to explain how the FTS integrity-check works.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | fts4-incr-merge
Files: files | file ages | folders
SHA1: 64e8a116f39434a3b7347f01a47f88eef3276742
User & Date: dan 2012-03-26 10:47:03.798
Context
2012-03-26
10:57
Modify the FTS integrity-check so that the checksums do not depend on the results of signed integer overflow, which is undefined in C. (check-in: f907fc3fb3 user: dan tags: fts4-incr-merge)
10:47
Add a comment to explain how the FTS integrity-check works. (check-in: 64e8a116f3 user: dan tags: fts4-incr-merge)
10:36
Add an experimental integrity-check function to FTS. (check-in: 40fc880474 user: dan tags: fts4-incr-merge)
Changes
Unified Diff Ignore Whitespace Patch
Changes to ext/fts3/fts3_write.c.
4897
4898
4899
4900
4901
4902
4903























4904
4905
4906
4907
4908
4909
4910
/*
** Run the integrity-check. If no error occurs and the current contents of
** the FTS index are correct, return SQLITE_OK. Or, if the contents of the
** FTS index are incorrect, return SQLITE_CORRUPT_VTAB.
**
** Or, if an error (e.g. an OOM or IO error) occurs, return an SQLite 
** error code.























*/
static int fts3DoIntegrityCheck(
  Fts3Table *p                    /* FTS3 table handle */
){
  int rc;
  int bOk = 0;
  rc = fts3IntegrityCheck(p, &bOk);







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







4897
4898
4899
4900
4901
4902
4903
4904
4905
4906
4907
4908
4909
4910
4911
4912
4913
4914
4915
4916
4917
4918
4919
4920
4921
4922
4923
4924
4925
4926
4927
4928
4929
4930
4931
4932
4933
/*
** Run the integrity-check. If no error occurs and the current contents of
** the FTS index are correct, return SQLITE_OK. Or, if the contents of the
** FTS index are incorrect, return SQLITE_CORRUPT_VTAB.
**
** Or, if an error (e.g. an OOM or IO error) occurs, return an SQLite 
** error code.
**
** The integrity-check works as follows. For each token and indexed token
** prefix in the document set, a 64-bit checksum is calculated (by code
** in fts3ChecksumEntry()) based on the following:
**
**     + The index number (0 for the main index, 1 for the first prefix
**       index etc.),
**     + The token (or token prefix) text itself, 
**     + The language-id of the row it appears in,
**     + The docid of the row it appears in,
**     + The column it appears in, and
**     + The tokens position within that column.
**
** The checksums for all entries in the index are XORed together to create
** a single checksum for the entire index.
**
** The integrity-check code calculates the same checksum in two ways:
**
**     1. By scanning the contents of the FTS index, and 
**     2. By scanning and tokenizing the content table.
**
** If the two checksums are identical, the integrity-check is deemed to have
** passed.
*/
static int fts3DoIntegrityCheck(
  Fts3Table *p                    /* FTS3 table handle */
){
  int rc;
  int bOk = 0;
  rc = fts3IntegrityCheck(p, &bOk);