Overview
Comment: | Update fts3.html for recent changes to FTS. |
---|---|
SHA1: | ae994ce63aa67a8ee18e86ef77ac0edf |
User & Date: | dan 2014-05-16 11:28:32.378 |
Context
2014-05-19
19:55 | Mention the automerge enhancement in the release notes. (check-in: 5b3b975982 user: drh tags: trunk) |
2014-05-16
11:28 | Update fts3.html for recent changes to FTS. (check-in: ae994ce63a user: dan tags: trunk) |
2014-05-09
22:27 | Fix typo in VALUES clause documentation in lang.html. (check-in: fee01c2d5b user: drh tags: trunk) |
Changes
Changes to pages/fts3.in.
︙ | ︙ | |||
are supported:

<ul>
<li><p>INSERT INTO xyz(xyz) VALUES('optimize');</p>
<li><p>INSERT INTO xyz(xyz) VALUES('rebuild');</p>
<li><p>INSERT INTO xyz(xyz) VALUES('integrity-check');</p>
<li><p>INSERT INTO xyz(xyz) VALUES('merge=X,Y');</p>
<li><p>INSERT INTO xyz(xyz) VALUES('automerge=N');</p>
</ul>

<tcl>hd_fragment *fts4optcmd {FTS4 "optimize" command} \
     {"optimize" command}</tcl>
<h2 id=optimize>The "optimize" command</h2>

<p>
︙ | ︙ | |||
for X in the range of 100 to 300. The idle thread that is running
the merge commands can know when it is done by checking the difference
in [sqlite3_total_changes()] before and after
each "merge=X,Y" command and stopping the loop when the difference
drops below two.

<tcl>hd_fragment *fts4automergecmd {FTS4 "automerge" command} \
     {"automerge" command}</tcl>
<h2 id=automerge>The "automerge=N" command</h2>

<p>
  The "automerge=N" command (where N is an integer between 0 and 15,
  inclusive) is used to configure an FTS3/4 table's "automerge" parameter,
  which controls automatic incremental inverted index merging. The default
  automerge value for new tables is 0, meaning that automatic incremental
  merging is completely disabled. If the value of the automerge parameter
  is modified using the "automerge=N" command, the new parameter value is
  stored persistently in the database and is used by all subsequently
  established database connections.

<p>
  Setting the automerge parameter to a non-zero value enables automatic
  incremental merging. This causes SQLite to do a small amount of inverted
  index merging after every INSERT operation. The amount of merging
  performed is designed so that the FTS3/4 table never reaches a point
  where it has 16 segments at the same level and hence has to do a large
  merge in order to complete an insert. In other words, automatic
  incremental merging is designed to prevent spiky INSERT performance.

<p>
  The downside of automatic incremental merging is that it makes
  every INSERT, UPDATE, and DELETE operation on an FTS3/4 table run
  a little slower, since extra time must be used to do the incremental
  merge. For maximum performance, it is recommended that applications
  disable automatic incremental merge and instead use the
  ["merge" command] in an idle process to keep the inverted indices
  well merged. But if the structure of an application does not easily
  allow for idle processes, the use of automatic incremental merge is
  a very reasonable fallback solution.

<p>
  The actual value of the automerge parameter determines the number of
  index segments merged simultaneously by an automatic inverted index
  merge. If the value is set to N, the system waits until there are at
  least N segments on a single level before beginning to incrementally
  merge them. Setting a lower value of N causes segments to be merged more
  quickly, which may speed up full-text queries and, if the workload
  contains UPDATE or DELETE operations as well as INSERTs, reduce the space
  on disk consumed by the full-text index. However, it also increases the
  amount of data written to disk.

<p>
  For general use in cases where the workload contains few UPDATE or DELETE
  operations, a good choice for automerge is 8. If the workload contains
  many UPDATE or DELETE commands, or if query speed is a concern, it may be
  advantageous to reduce it to 2.

<p>
  For reasons of backwards compatibility, the "automerge=1" command sets
  the automerge parameter to 8, not 1 (a value of 1 would make no sense
  anyway, as merging data from a single segment is a no-op).

<h1 id=tokenizer tags="tokenizer">Tokenizers</h1>

<p>
  An FTS tokenizer is a set of rules for extracting terms from a document
  or basic FTS full-text query.
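The "automerge=N" rules described in this change (N between 0 and 15, with "automerge=1" stored as 8) can be illustrated with a small sketch. The helper name `automerge_command` is hypothetical and not part of SQLite. Rejecting out-of-range values with an exception is a choice made for this sketch, not documented FTS behavior. The table name is assumed to be a trusted identifier, since it is interpolated into the SQL text without quoting.

```python
from typing import Tuple


def automerge_command(table: str, n: int) -> Tuple[str, int]:
    """Build the special INSERT for "automerge=N" on an FTS3/4 table.

    Returns (sql_text, stored_value), where stored_value is the parameter
    value the text above says ends up persisted in the database.
    """
    # The documented range for N is 0..15 inclusive. Raising here is this
    # sketch's own policy (an assumption), not SQLite's.
    if not 0 <= n <= 15:
        raise ValueError("automerge N must be between 0 and 15, inclusive")

    # For backwards compatibility, "automerge=1" sets the parameter to 8.
    stored_value = 8 if n == 1 else n

    # Same shape as the other special commands: INSERT INTO xyz(xyz) ...
    sql_text = f"INSERT INTO {table}({table}) VALUES('automerge={n}');"
    return sql_text, stored_value
```

For a table named `xyz`, `automerge_command("xyz", 2)` yields the statement for the lower setting the text suggests for workloads with many UPDATE or DELETE operations, while `automerge_command("xyz", 0)` yields the statement that disables automatic incremental merging.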
︙ | ︙ | |||
belongs to this segment b-tree. Or zero if the entire segment b-tree
fits on the root node. If it exists, this node is always a leaf node.
<tr><td>leaves_end_block <td>
  The blockid that corresponds to the leaf node with the largest blockid
  that belongs to this segment b-tree. Or zero if the entire segment
  b-tree fits on the root node.
<tr><td>end_block <td>
  This field may contain either an integer or a text field consisting of
  two integers separated by a space character (unicode codepoint 0x20).

  <p style="margin-left:0;margin-right:0">
  The first, or only, integer is the blockid that corresponds to the
  interior node with the largest blockid that belongs to this segment
  b-tree. Or zero if the entire segment b-tree fits on the root node.
  If it exists, this node is always an interior node.

  <p style="margin-left:0;margin-right:0;margin-bottom:0">
  The second integer, if it is present, is the aggregate size of all data
  stored on leaf pages in bytes. If the value is negative, then the segment
  is the output of an unfinished incremental-merge operation, and the
  absolute value is the current size in bytes.
<tr><td>root <td>
  Blob containing the root node of the segment b-tree.
</table>

<p>
Apart from the root node, the nodes that make up a single segment b-tree
are always stored using a contiguous sequence of blockids. Furthermore, the
︙ | ︙ |
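The two forms of the end_block field described in the chunk above (a bare integer, or text holding two space-separated integers) could be decoded along these lines. This is only a sketch: the names `EndBlock` and `parse_end_block` are hypothetical, and the function assumes the value has already been read out of the database as a Python `int` or `str`.

```python
from typing import NamedTuple, Optional, Union


class EndBlock(NamedTuple):
    end_blockid: int            # 0 if the whole segment fits on the root node
    leaf_bytes: Optional[int]   # aggregate leaf data size, if recorded
    merge_unfinished: bool      # True if the recorded size was negative


def parse_end_block(value: Union[int, str]) -> EndBlock:
    """Decode an end_block value in either of its two documented forms."""
    # Form 1: a plain integer carries only the blockid.
    if isinstance(value, int):
        return EndBlock(value, None, False)

    # Form 2: text with one or two integers separated by a single space.
    parts = value.split(" ")
    if len(parts) == 1:
        return EndBlock(int(parts[0]), None, False)
    if len(parts) != 2:
        raise ValueError(f"unexpected end_block value: {value!r}")

    blockid, size = int(parts[0]), int(parts[1])
    # A negative size marks the output of an unfinished incremental merge;
    # its absolute value is the current size in bytes.
    return EndBlock(blockid, abs(size), size < 0)
```

A named tuple keeps the three pieces of information separate, so a caller does not need to re-derive the merge state from the sign of the size.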