Documentation Source Text

Check-in [b3c69199b5]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Fix typos in Appendix A of fts3.html.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: b3c69199b579dfb639a58de9cb480f51b748581a
User & Date: dan 2011-03-07 13:40:36
Context
2011-03-12
05:01
Added comments for SQLITE_OMIT_UNIQUE_ENFORCEMENT. check-in: 21beb4719d user: shaneh tags: trunk
2011-03-07
13:40
Fix typos in Appendix A of fts3.html. check-in: b3c69199b5 user: dan tags: trunk
2011-03-06
03:32
Updates to the testing page: talk of signed-integer overflow and verifying that it is not used. check-in: 3793e03ba1 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fts3.in.

1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
....
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
....
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
....
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
  <i>-- Assuming the application has supplied an SQLite user function named "countintegers"</i>
  <i>-- that returns the number of space-separated integers contained in its only argument,</i>
  <i>-- the following query could be used to return the titles of the 10 documents that contain</i>
  <i>-- the greatest number of instances of the users query terms. Hopefully, these 10</i>
  <i>-- documents will be those that the users considers more or less the most "relevant".</i>
  SELECT title FROM documents 
    WHERE documents MATCH &lt;query&gt;
    ORDER BY countintegers(offsets(document)) DESC
    LIMIT 10 OFFSET 0
</codeblock>

<p>
  The query above could be made to run faster by using the FTS [matchinfo]
  function to determine the number of query term instances that appear in each
  result. The matchinfo function is much more efficient than the offsets 
................................................................................
<codeblock>
  <i>-- If the application supplies an SQLite user function called "rank" that</i>
  <i>-- interprets the blob of data returned by matchinfo and returns a numeric</i>
  <i>-- relevancy based on it, then the following SQL may be used to return the</i>
  <i>-- titles of the 10 most relevant documents in the dataset for a users query.</i>
  SELECT title FROM documents 
    WHERE documents MATCH &lt;query&gt;
    ORDER BY rank(matchinfo(document)) DESC
    LIMIT 10 OFFSET 0
</codeblock>

<p>
  The SQL query in the example above uses less CPU than the first example
  in this section, but still has a non-obvious performance problem. SQLite
  satisfies this query by retrieving the value of the "title" column and
................................................................................
  corresponding to the ten most relevant documents, then load only the title
  and content information for those 10 documents only. Because both the matchinfo
  and docid values are gleaned entirely from the full-text index, this results
  in dramatically less data being loaded from the database into memory.

<codeblock>
  SELECT title FROM documents JOIN ( 
      SELECT docid, rank(matchinfo(document)) AS rank 
      FROM documents
      WHERE documents MATCH &lt;query&gt;
      ORDER BY rank DESC 
      LIMIT 10 OFFSET 0
  ) AS ranktable USING(docid)
  ORDER BY ranktable.rank DESC
</codeblock>
................................................................................
  <i>--      that the snippet function may be used, the "WHERE ... MATCH ..." clause from</i>
  <i>--      the sub-query is duplicated in the outer query.</i>
  <i>--</i>
  <i>--   2. The sub-query joins the documents table with the document_data table, so that</i>
  <i>--      implementation of the rank function has access to the static weight assigned</i>
  <i>--      to each document.</i>
  SELECT title, snippet(documents) FROM documents JOIN ( 
      SELECT docid, rank(matchinfo(document), documents_data.weight) AS rank
      FROM documents JOIN documents_data USING(docid)
      WHERE documents MATCH &lt;query&gt;
      ORDER BY rank DESC 
      LIMIT 10 OFFSET 0
  ) AS ranktable USING(docid)
  WHERE documents MATCH &lt;query&gt;
  ORDER BY ranktable.rank DESC







|







 







|







 







|







 







|







1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
....
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
....
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
....
2063
2064
2065
2066
2067
2068
2069
2070
2071
2072
2073
2074
2075
2076
2077
  <i>-- Assuming the application has supplied an SQLite user function named "countintegers"</i>
  <i>-- that returns the number of space-separated integers contained in its only argument,</i>
  <i>-- the following query could be used to return the titles of the 10 documents that contain</i>
  <i>-- the greatest number of instances of the users query terms. Hopefully, these 10</i>
  <i>-- documents will be those that the users considers more or less the most "relevant".</i>
  SELECT title FROM documents 
    WHERE documents MATCH &lt;query&gt;
    ORDER BY countintegers(offsets(documents)) DESC
    LIMIT 10 OFFSET 0
</codeblock>

<p>
  The query above could be made to run faster by using the FTS [matchinfo]
  function to determine the number of query term instances that appear in each
  result. The matchinfo function is much more efficient than the offsets 
................................................................................
<codeblock>
  <i>-- If the application supplies an SQLite user function called "rank" that</i>
  <i>-- interprets the blob of data returned by matchinfo and returns a numeric</i>
  <i>-- relevancy based on it, then the following SQL may be used to return the</i>
  <i>-- titles of the 10 most relevant documents in the dataset for a users query.</i>
  SELECT title FROM documents 
    WHERE documents MATCH &lt;query&gt;
    ORDER BY rank(matchinfo(documents)) DESC
    LIMIT 10 OFFSET 0
</codeblock>

<p>
  The SQL query in the example above uses less CPU than the first example
  in this section, but still has a non-obvious performance problem. SQLite
  satisfies this query by retrieving the value of the "title" column and
................................................................................
  corresponding to the ten most relevant documents, then load only the title
  and content information for those 10 documents only. Because both the matchinfo
  and docid values are gleaned entirely from the full-text index, this results
  in dramatically less data being loaded from the database into memory.

<codeblock>
  SELECT title FROM documents JOIN ( 
      SELECT docid, rank(matchinfo(documents)) AS rank 
      FROM documents
      WHERE documents MATCH &lt;query&gt;
      ORDER BY rank DESC 
      LIMIT 10 OFFSET 0
  ) AS ranktable USING(docid)
  ORDER BY ranktable.rank DESC
</codeblock>
................................................................................
  <i>--      that the snippet function may be used, the "WHERE ... MATCH ..." clause from</i>
  <i>--      the sub-query is duplicated in the outer query.</i>
  <i>--</i>
  <i>--   2. The sub-query joins the documents table with the document_data table, so that</i>
  <i>--      implementation of the rank function has access to the static weight assigned</i>
  <i>--      to each document.</i>
  SELECT title, snippet(documents) FROM documents JOIN ( 
      SELECT docid, rank(matchinfo(documents), documents_data.weight) AS rank
      FROM documents JOIN documents_data USING(docid)
      WHERE documents MATCH &lt;query&gt;
      ORDER BY rank DESC 
      LIMIT 10 OFFSET 0
  ) AS ranktable USING(docid)
  WHERE documents MATCH &lt;query&gt;
  ORDER BY ranktable.rank DESC