Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Enforce the 80-character line limit on the new documentation source text for FTS4 commands. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
f1d54965d02ae8bcb722f8c646f07e9f |
User & Date: | drh 2012-04-02 15:00:15.574 |
Context
2012-04-02
| ||
15:49 | Updates to the atomic commit document in order to reference WAL and PSOW and to improve clarity of presentation. (check-in: aaf3ea1155 user: drh tags: trunk) | |
15:00 | Enforce the 80-character line limit on the new documentation source text for FTS4 commands. (check-in: f1d54965d0 user: drh tags: trunk) | |
14:47 | Updates to the FTS4 document to describe recent merge=X,Y, automerge=B, and integrity-check commands. Updates to the change log for the same and for other recent changes. (check-in: 2c33993bdc user: drh tags: trunk) | |
Changes
Changes to pages/fts3.in.
︙ | ︙ | |||
1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 | <i>-- The following two queries are both partially optimized using the prefix</i> <i>-- indexes. The optimization is not as pronounced as it is for the queries</i> <i>-- above, but still an improvement over no prefix indexes at all.</i> SELECT * FROM t1 WHERE t1 MATCH 'a*'; SELECT * FROM t1 WHERE t1 MATCH 'abc*'; </codeblock> <h1 id=commands tags="commands">Special Commands For FTS3 and FTS4</h1> <p> Special INSERT operates can be used to issue commands to FTS3 and FTS4 tables. Every FTS3 and FTS4 has a hidden, read-only column which is the same name as the table itself. INSERTs into this hidden column are interpreted as commands to the FTS3/4 table. For a table with the name "xyz" the following commands are supported: <ul> <li><p>INSERT INTO xyz(xyz) VALUES('optimize');</p> <li><p>INSERT INTO xyz(xyz) VALUES('rebuild');</p> <li><p>INSERT INTO xyz(xyz) VALUES('integrity-check');</p> <li><p>INSERT INTO xyz(xyz) VALUES('merge=X,Y');</p> <li><p>INSERT INTO xyz(xyz) VALUES('automerge=B');</p> </ul> | > | > | > | | | | > | | > | | > | > | | > | | > | > | | > | | > | | | > | | | | > | | | > | > | | > | | | > | > | | > | | > | | > | | | > | | | | | > | > | | > | | > | | > | | | > | | > | | > | | 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 | <i>-- The following two queries are both partially optimized using the prefix</i> <i>-- indexes. The optimization is not as pronounced as it is for the queries</i> <i>-- above, but still an improvement over no prefix indexes at all.</i> SELECT * FROM t1 WHERE t1 MATCH 'a*'; SELECT * FROM t1 WHERE t1 MATCH 'abc*'; </codeblock> <tcl>hd_fragment *cmds {FTS4 commands}</tcl> <h1 id=commands tags="commands">Special Commands For FTS3 and FTS4</h1> <p> Special INSERT operates can be used to issue commands to FTS3 and FTS4 tables. Every FTS3 and FTS4 has a hidden, read-only column which is the same name as the table itself. INSERTs into this hidden column are interpreted as commands to the FTS3/4 table. For a table with the name "xyz" the following commands are supported: <ul> <li><p>INSERT INTO xyz(xyz) VALUES('optimize');</p> <li><p>INSERT INTO xyz(xyz) VALUES('rebuild');</p> <li><p>INSERT INTO xyz(xyz) VALUES('integrity-check');</p> <li><p>INSERT INTO xyz(xyz) VALUES('merge=X,Y');</p> <li><p>INSERT INTO xyz(xyz) VALUES('automerge=B');</p> </ul> <tcl>hd_fragment *fts4optcmd {FTS4 "optimize" command} \ {"optimize" command}</tcl> <h2 id=optimize>The "optimize" command</h2> <p> The "optimize" command causes FTS3/4 to merge together all of its inverted index b-trees into one large and complete b-tree. Doing an optimize will make subsequent queries run faster since there are fewer b-trees to search, and it may reduce disk usage by coalescing redundant entries. However, for a large FTS table, running optimize can be as expensive as running [VACUUM]. The optimize command essentially has to read and write the entire FTS table, resulting in a large transaction. <p> In batch-mode operation, where an FTS table is initially built up using a large number of INSERT operations, then queried repeatedly without further changes, it is often a good idea to run "optimize" after the last INSERT and before the first query. <tcl>hd_fragment *fts4rebuidcmd {FTS4 "rebuild" command} \ {"rebuild" command}</tcl> <h2 id=rebuild>The "rebuild" command</h2> <p> The "rebuild" command causes SQLite to discard the entire FTS3/4 table and then rebuild it again from original text. The concept is similar to [REINDEX], only that it applies to an FTS3/4 table instead of an ordinary index. <p> The "rebuild" command should be run whenever the implementation of a custom tokenizer changes, so that all content can be retokenized. The "rebuild" command is also useful when using the [FTS4 content option] after changes have been made to the original content table. <tcl>hd_fragment *fts4ickcmd {FTS4 "integrity-check" command}</tcl> <h2 id=integcheck>The "integrity-check" command</h2> <p> The "integrity-check" command causes SQLite to read and verify the accuracy of all inverted indices in an FTS3/4 table by comparing those inverted indices against the original content. The "integrity-check" command silently succeeds if the inverted indices are all ok, but will fail with an SQLITE_CORRUPT error if any problems are found. <p> The "integrity-check" command is similar in concept to [PRAGMA integrity_check]. In a working system, the "integrity-command" should aways be successful. Possible causes of integrity-check failures include: <ul> <li> The application has made changes to the [FTS shadow tables] directly, without using the FTS3/4 virtual table, causing the shadow tables to become out of sync with each other. <li> Using the [FTS4 content option] and failing to manually keep the content in sync with the FTS4 inverted indices. <li> Bugs in the FTS3/4 virtual table. (The "integrity-check" command was original conceived as part of the test suite for FTS3/4.) <li> Corruption to the underlying SQLite database file. (See documentation on [how to corrupt] and SQLite database for additional information.) </ul> <tcl>hd_fragment *fts4mergecmd {FTS4 "merge" command} {"merge" command}</tcl> <h2 id="mergecmd">The "merge=X,Y" command</h2> <p> The "merge=X,Y" command (where X and Y are integers) causes SQLite to do a limited amount of work toward merging the various inverted index b-trees of an FTS3/4 table together into one large b-tree. The X value is the target number of "blocks" to be merged, and Y is the minimum number of b-tree segments on a level required before merging will be applied to that level. The value of Y should be between 2 and 16 with a recommended value of 8. The value of X can be any positive integer but values on the order of 100 to 300 are recommended. <p> When an FTS table accumulates 16 b-tree segments at the same level, the next INSERT into that table will cause all 16 segments to be merged into a single b-tree segment at the next higher level. The effect of these level merges is that most INSERTs into an FTS table are very fast and take minimal memory, but an occasional INSERT is slow and generates a large transaction because of the need to do merging. This results in "spiky" performance of INSERTs. <p> To avoid spiky INSERT performance, an application can run the "merge=X,Y" command periodically, possibly in an idle thread or idle process, to ensure that the FTS table never accumulates too many b-tree segments at the same level. INSERT performance spikes can generally be avoided, and performance of FTS3/4 can be maximized, by running "merge=X,Y" after every few thousand document inserts. Each "merge=X,Y" command will run in a separate transaction (unless they are grouped together using [BEGIN]...[COMMIT], of course). The transactions can be kept small by choosing a value for X in the range of 100 to 300. The idle thread that is running the merge commands can know when it is done by checking the difference in [sqlite3_total_changes()] before and after each "merge=X,Y" command and stopping the loop when the difference drops below two. <tcl>hd_fragment *fts4automergecmd {FTS4 "automerge" command} \ {"automerge" command}</tcl> <h2 id=automerge">The "automerge=B" command</h2> <p> The "automerge=B" command (where B is either "1" or "0") disables or enables automatic incremental inverted index merging for an FTS3/4 table. The default for new tables is for automatic incremental merging to be disabled. The "automerge=B" command changes this setting. The change is persistent and continues to be in effect for all subsequent database connections to the same database. <p> Enabling automatic incremental merge causes SQLite to do a small amount of inverted index merging after every INSERT operation. The amount of merging performed is designed so that the FTS3/4 table never reaches a point where it has 16 segments at the same level and hence has to do a large merge in order to complete an insert. In other words, automatic incremental merging is designed to prevent spiky INSERT performance. <p> The downside of automatic incremental merging is that it makes every INSERT, UPDATE, and DELETE operation on an FTS3/4 table run a little slower, since extra time must be used to do the incremental merge. For maximum performance, it is recommended that applications disable automatic incremental merge and instead use the ["merge" command] in an idle process to keep the inverted indices well merged. But if the structure of an application does not easily allow for idle processes, the use of automatic incremental merge is a very reasonable fallback solution. <h1 id=tokenizer tags="tokenizer">Tokenizers</h1> <p> An FTS tokenizer is a set of rules for extracting terms from a document or basic FTS full-text query. |
︙ | ︙ |