Overview
Comment: | Document the byte-order-mark limitation of fts3/4. |
---|---|
SHA3-256: | 7e55864b0a74d16f4d0c86bee88a1856 |
User & Date: | dan 2020-01-29 20:19:32.773 |
Context
2020-04-14
14:08 | Fix a typo in limits.in. (check-in: 2664eaab37 user: dan tags: trunk) |
2020-01-29
21:29 | Add the SQLITE_OMIT_AUTOINIT compile-time option to the set of recommended compile-time options. (check-in: f250d55692 user: drh tags: trunk) |
20:19 | Document the byte-order-mark limitation of fts3/4. (check-in: 7e55864b0a user: dan tags: trunk) |
2020-01-27
20:02 | Version 3.31.1 (check-in: 2ab23690d8 user: drh tags: trunk, release, version-3.31.1) |
Changes
Changes to pages/fts3.in.
︙ | ︙ | |||
Old lines 2790-2803, new lines 2790-2840 (added lines marked ">"):

  <p>
    For doclists for which the term appears in more than one column of the FTS
    virtual table, term-offset lists within the doclist are stored in column
    number order. This ensures that the term-offset list associated with
    column 0 (if any) is always first, allowing the first two fields of the
    term-offset list to be omitted in this case.

> <h1 tags="bugs">Limitations</h1>
>
> <h2> UTF-16 byte-order-mark problem </h2>
>
> For UTF-16 databases, when using the "simple" tokenizer, it is possible to use
> malformed unicode strings to cause the
> integrity-check to falsely report corruption, or for auxiliary functions to
> return incorrect results. More specifically, the bug can be triggered by any
> of the following:
>
> <ul>
> <li><p>A UTF-16 byte-order-mark is embedded at the beginning of an SQL string
>        literal value inserted into an FTS3 table. For example:
>
> <codeblock>
>   INSERT INTO fts_table(col) VALUES('<b>{BOM}</b>text...');
> </codeblock>
>
> <p>where {BOM} is a UTF-16 byte-order-mark, a 16-bit integer value 0xFFFE in
>    either big or little endian format.
>
> <li><p>Malformed UTF-8 that SQLite converts to a UTF-16 byte-order-mark is
>        embedded at the beginning of an SQL string literal value inserted
>        into an FTS3 table.
>
> <li><p>A text value created by casting a blob that begins with the two
>        bytes 0xFF and 0xFE, in either possible order, is inserted into an
>        FTS3 table. For example:
>
> <codeblock>
>   INSERT INTO fts_table(col) VALUES(CAST(X'FEFF' AS TEXT));
> </codeblock>
> </ul>
>
> No problems occur if all unicode strings used with FTS3/4 are well-formed.
> UTF-16 byte-order-marks may be safely used at the start of strings passed to
> [sqlite3_bind_text16()], [sqlite3_prepare16()] and other similar APIs.
>
  <h1 id=appendix_a nonumber tags="search application tips">
  Appendix A: Search Application Tips
  </h1>

  <p>
    FTS is primarily designed to support Boolean full-text queries - queries
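The blob-cast case described in the added text can be guarded against on the application side by checking for a leading byte-order-mark before the bytes are cast to text and inserted. The following is a minimal sketch in C under stated assumptions: `starts_with_utf16_bom()` and `skip_utf16_bom()` are hypothetical helpers, not part of SQLite, and they cover only buffers that begin with the bytes 0xFF and 0xFE in either order. They do not detect malformed UTF-8 that SQLite would convert to a byte-order-mark.

```c
#include <stddef.h>

/* Hypothetical helper (not part of SQLite): return 1 if the buffer begins
** with the two bytes 0xFF 0xFE or 0xFE 0xFF, i.e. the two bytes named in
** the limitation, in either order. Return 0 otherwise. */
static int starts_with_utf16_bom(const unsigned char *p, size_t n){
  if( n<2 ) return 0;
  return (p[0]==0xFF && p[1]==0xFE) || (p[0]==0xFE && p[1]==0xFF);
}

/* Hypothetical helper: if the buffer begins with a byte-order-mark, return
** a pointer to the first byte after it and reduce *pn by 2. Otherwise
** return p unchanged and leave *pn as it was. */
static const unsigned char *skip_utf16_bom(const unsigned char *p, size_t *pn){
  if( starts_with_utf16_bom(p, *pn) ){
    *pn -= 2;
    return p + 2;
  }
  return p;
}
```

An application might run `skip_utf16_bom()` over a blob before casting it to text for an FTS3/4 table. Whether dropping the mark is acceptable depends on the data, so this is one possible approach rather than a recommended fix.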
︙ | ︙ | |||
Old lines 3060-3066, new lines 3097-3105 (added lines marked ">"):

    return;

  <i>  /* Jump here if the wrong number of arguments are passed to this function */</i>
  wrong_number_args:
    sqlite3_result_error(pCtx, "wrong number of arguments to function rank()", -1);
  }
  </codeblock>
>
>