Documentation Source Text

Check-in [2dad241898]

Overview
Comment: Update fts3.html with a description of the matchinfo 'b' option.
SHA1: 2dad241898147dd53fae898f881ce9158dd4ca58
User & Date: dan 2015-05-11 19:12:13.527
Context
2015-05-15 14:48: Increased information about TH3. (check-in: 7d4f8c9ada user: drh tags: trunk)
2015-05-11 19:12: Update fts3.html with a description of the matchinfo 'b' option. (check-in: 2dad241898 user: dan tags: trunk)
2015-05-11 17:09: Updates to the dbstat documentation to explain how it can be used to get information about attached databases other than "main". (check-in: d33510f222 user: drh tags: trunk)
Changes
Changes to pages/fts3.in.
Old version, lines 1085-1099:
  of integers in the returned array depends on both the query and the value
  of the second argument (if any) passed to the matchinfo function.

<p>
  The matchinfo function is called with either one or two arguments. As for
  all auxiliary functions, the first argument must be the special 
  [FTS hidden column]. The second argument, if it is specified, must be a text value
  comprised only of the characters 'p', 'c', 'n', 'a', 'l', 's', 'x', and 'y'.
  If no second argument is explicitly supplied, it defaults to "pcx". The
  second argument is referred to as the "format string" below.

<p>
  Characters in the matchinfo format string are processed from left to right. 
  Each character in the format string causes one or more 32-bit unsigned
  integer values to be added to the returned array. The "values" column in







New version, lines 1085-1099:
  of integers in the returned array depends on both the query and the value
  of the second argument (if any) passed to the matchinfo function.

<p>
  The matchinfo function is called with either one or two arguments. As for
  all auxiliary functions, the first argument must be the special 
  [FTS hidden column]. The second argument, if it is specified, must be a text value
  comprised only of the characters 'p', 'c', 'n', 'a', 'l', 's', 'x', 'y' and 'b'.
  If no second argument is explicitly supplied, it defaults to "pcx". The
  second argument is referred to as the "format string" below.

<p>
  Characters in the matchinfo format string are processed from left to right. 
  Each character in the format string causes one or more 32-bit unsigned
  integer values to be added to the returned array. The "values" column in
Old version, lines 1165-1178:

<pre>
          hits_for_phrase_p_column_c  = array&#91;c + p*cols&#93;
</pre>
      For queries that use OR expressions, or those that use LIMIT or return
      many rows, the 'y' matchinfo option may incur significantly less overhead
      than 'x'.
































  <tr><td>n <td>1 <td>The number of rows in the FTS4 table. This value is
    only available when querying FTS4 tables, not FTS3.
  <tr><td>a <td><i>cols</i> <td>For each column, the average number of
    tokens in the text values stored in the column (considering all rows in
    the FTS4 table). This value is only available when querying FTS4 tables,
    not FTS3.  







New version, lines 1165-1209:

<pre>
          hits_for_phrase_p_column_c  = array&#91;c + p*cols&#93;
</pre>
      For queries that use OR expressions, or those that use LIMIT or return
      many rows, the 'y' matchinfo option may incur significantly less overhead
      than 'x'.
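<p style="margin-left:0;margin-right:0">
  As a rough illustration of the indexing formula above, the following Python
  sketch looks up one hit count in the output of the 'y' option. The function
  name hits_for_phrase and the sample values are hypothetical, and the sketch
  assumes the matchinfo blob has already been decoded into a flat list of
  integers that holds only the 'y' output.

```python
def hits_for_phrase(y_values, cols, p, c):
    """Return the number of hits for phrase p in column c.

    y_values: integers produced by the 'y' option, as a flat list
              (assumed to be already unpacked from the matchinfo blob).
    cols:     number of columns in the FTS table.
    """
    return y_values[c + p * cols]


# Hypothetical 'y' output for a 3-column table and a 2-phrase query.
sample = [
    2, 0, 1,  # phrase 0: hits in columns 0, 1, 2
    0, 0, 4,  # phrase 1: hits in columns 0, 1, 2
]

print(hits_for_phrase(sample, cols=3, p=1, c=2))
```
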

<tr><td><a name="matchinfo-b">b</a> <td style="white-space:nowrap"><i>(cols+31)/32</i> * <i>phrases</i> 
<td>

  The 'b' flag provides similar information to the 'y' flag, but in a more
  compact form. Instead of the precise number of hits, 'b' provides a single
  boolean flag for each phrase/column combination. If the phrase is present in
  the column at least once (i.e. if the corresponding integer output of 'y' would
  be non-zero), the corresponding flag is set. Otherwise, it is cleared.

<p style="margin-left:0;margin-right:0">
  If the table has 32 or fewer columns, a single unsigned integer is output for
  each phrase in the query. The least significant bit of the integer is set if the
  phrase appears at least once in column 0. The second least significant bit is
  set if the phrase appears once or more in column 1. And so on.

<p style="margin-left:0;margin-right:0">
  If the table has more than 32 columns, an extra integer is added to the output
  of each phrase for each extra 32 columns or part thereof. Integers
  corresponding to the same phrase are clumped together. For example, if a table
  with 45 columns is queried for two phrases, 4 integers are output. The first
  corresponds to phrase 0 and columns 0-31 of the table. The second integer
  contains data for phrase 0 and columns 32-44, and so on.

<p style="margin-left:0;margin-right:0">
  If nCol is the number of columns in the table, the following expression
  is non-zero if phrase p is present in column c:

<pre>
    p_is_in_c = array&#91;p * ((nCol+31)/32) + c/32&#93; & (1 &lt;&lt; (c % 32))
</pre>
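<p style="margin-left:0;margin-right:0">
  The layout described above can be sketched in Python as follows. This is a
  minimal illustration, not part of FTS itself: the function name
  phrase_in_column and the sample values are hypothetical, and the input is
  assumed to be the 'b' output already decoded into a flat list of integers.
  The sketch adds c // 32 to the index so that, for tables with more than 32
  columns, it selects the right integer within each phrase's group.

```python
def phrase_in_column(b_values, n_col, p, c):
    """Return True if phrase p appears at least once in column c.

    b_values: integers produced by the 'b' option, as a flat list
              (assumed to be already unpacked from the matchinfo blob).
    n_col:    number of columns in the FTS table.
    """
    ints_per_phrase = (n_col + 31) // 32  # one integer per 32 columns, rounded up
    word = b_values[p * ints_per_phrase + c // 32]
    return bool(word & (1 << (c % 32)))


# Hypothetical 'b' output for a 45-column table queried for two phrases:
# two integers per phrase, four integers in total.
sample = [
    0x00000001,  # phrase 0, columns 0-31: present in column 0
    0x00001000,  # phrase 0, columns 32-44: bit 12 set, so column 44
    0x80000000,  # phrase 1, columns 0-31: present in column 31
    0x00000000,  # phrase 1, columns 32-44: not present
]

print(phrase_in_column(sample, n_col=45, p=0, c=44))
```
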

  <tr><td>n <td>1 <td>The number of rows in the FTS4 table. This value is
    only available when querying FTS4 tables, not FTS3.
  <tr><td>a <td><i>cols</i> <td>For each column, the average number of
    tokens in the text values stored in the column (considering all rows in
    the FTS4 table). This value is only available when querying FTS4 tables,
    not FTS3.