History of ext/fts3/unicode/mkunicode.tcl

2019-03-20
05:45
Fix various harmless compiler warnings seen with MSVC. file: [bf7fcaa6] check-in: [1c0fe5b5] user: mistachkin branch: noWarnings, size: 25443
2019-01-02
23:49
Fix harmless compiler warnings in the unicode2 logic of FTS3 and FTS5. file: [49499f79] check-in: [703029ac] user: drh branch: trunk, size: 25439
2018-12-28
07:37
Fix problems in fts5 found by ASAN. file: [2315b3f8] check-in: [c564bf87] user: dan branch: trunk, size: 25422
2018-12-03
17:40
Remove the unused sqlite3Fts5UnicodeNCat() function. file: [2ea30d81] check-in: [7149dacf] user: drh branch: trunk, size: 25353
16:14
Add the "remove_diacritics=2" option to the unicode61 tokenizer in both FTS5 and FTS3/4. file: [106bb4ff] check-in: [06177f3f] user: dan branch: trunk, size: 25420
2018-07-13
19:52
Add the "categories" option to the unicode61 tokenizer in fts5. file: [0069320b] check-in: [80d2b9e6] user: dan branch: trunk, size: 25078
2017-03-20
18:53
Fix some problems in fts3 found by address-sanitizer. file: [ab0543a3] check-in: [16a8e84f] user: dan branch: trunk, size: 18339
2016-02-12
18:48
Fix a potential buffer overread provoked by invalid utf-8 in fts5. file: [2debed3f] check-in: [a049fbbd] user: dan branch: trunk, size: 18325
2015-07-02
15:52
Remove "#ifdef SQLITE_ENABLE_FTS5" from individual fts5 source files. Add a single "#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5)" to fts5.c. file: [95cf7ec1] check-in: [7819002e] user: dan branch: trunk, size: 18297
2015-05-23
15:43
Avoid making redundant copies of position-lists within the fts5 code. file: [ed0534dd] check-in: [5165de54] user: dan branch: fts5, size: 18368
2015-05-22
06:08
Improve test coverage of fts5_unicode2.c. file: [b321eea0] check-in: [fea8a4db] user: dan branch: fts5, size: 18386
2015-02-02
11:32
Fix some problems with building fts5 and fts3 together using the amalgamation. file: [159c1194] check-in: [fb10bbb9] user: dan branch: fts5, size: 22145
2015-01-01
18:03
Merge latest trunk changes with this branch. file: [4199cb88] check-in: [4b365167] user: dan branch: fts5, size: 22028
16:46
Add a version of the unicode61 tokenizer to fts5. file: [2fa92b91] check-in: [d09f7800] user: dan branch: fts5, size: 22032
2014-08-06
18:50
A couple more harmless compiler warnings eliminated. file: [a2567f9d] check-in: [bcf6d775] user: drh branch: trunk, size: 21588
17:49
Fix two more harmless compiler warnings. Make sure the fts3_unicode2.c file is in sync with mkunicode.tcl. file: [ddeb6629] check-in: [a2a60307] user: drh branch: trunk, size: 21585
2013-06-05
16:17
Up until now the fts4 "unicode61" tokenizer has treated all private use codepoints except the first and last of each of the three ranges as alphanumeric (eligible to be part of tokens). This commit fixes this so that all private use codepoints are considered alphanumeric. In other words, it fixes the handling of codepoints 0xE000, 0xF8FF, 0xF0000, 0xFFFFD, 0x100000 and 0x10FFFD. file: [dc6f268e] check-in: [6cfd9af5] user: dan branch: trunk, size: 21592
2012-06-06
19:30
Have the FTS unicode61 tokenizer strip out diacritics when tokenizing text. This can be disabled by specifying the tokenizer option "remove_diacritics=0". file: [7a9bc018] check-in: [790f76a5] user: dan branch: trunk, size: 21525
2012-05-28
12:22
Omit the fts3 unicode character class routines from the build if fts3/4 is disabled. file: [2029991c] check-in: [c00bb5d4] user: drh branch: trunk, size: 15497
2012-05-26
18:28
If SQLITE_DISABLE_FTS3_UNICODE is defined, do not build the "unicode61" tokenizer. file: [de64862a] check-in: [e71495a8] user: dan branch: fts4-unicode, size: 15339
17:57
Change the format of the tables used by sqlite3FtsUnicodeTolower() to make them a little smaller. file: [27752800] check-in: [b89d3834] user: dan branch: fts4-unicode, size: 15228
2012-05-25
19:50
Add special fast paths to sqlite3FtsUnicodeTolower() and Isalnum() for codepoints in the ASCII range. file: [a7214d17] check-in: [cf7b25d4] user: dan branch: fts4-unicode, size: 14223
18:48
Fix comments in generated file fts3_unicode2.c. file: [3ff244e4] check-in: [3dc567ef] user: dan branch: fts4-unicode, size: 13497
17:50
Add an experimental tokenizer to fts4 - "unicode". This tokenizer works in the same way except that it understands unicode "simple case folding" and recognizes all characters not classified as "Letters" or "Numbers" by unicode as token separators. file: [1f50ed00] check-in: [0c13570e] user: dan branch: fts4-unicode, size: 11733 Added
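
Several entries above describe tokenizer behavior: characters that are not Unicode "Letters" or "Numbers" act as token separators (2012-05-25), diacritics are stripped unless disabled (2012-06-06), and private use codepoints count as alphanumeric (2013-06-05). The sketch below illustrates those three rules only. It is not the SQLite implementation: it relies on Python's `unicodedata` module rather than the tables generated by mkunicode.tcl, it uses `str.lower()` as a rough stand-in for Unicode simple case folding, and the function names `is_token_char`, `strip_diacritics`, and `tokenize` are hypothetical.

```python
import unicodedata


def is_token_char(ch):
    """True for Letters (L*), Numbers (N*) and private use (Co) codepoints.

    Assumption: this mirrors the rule described in the check-in comments,
    using Python's Unicode database instead of SQLite's generated tables.
    """
    cat = unicodedata.category(ch)
    return cat[0] in ("L", "N") or cat == "Co"


def strip_diacritics(text):
    """Decompose to NFD and drop combining marks (a rough approximation)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def tokenize(text, remove_diacritics=True):
    """Split on non-token characters, then lowercase each token.

    Limitation: a standalone combining mark in the input is treated as a
    separator here, so only precomposed characters are handled.
    """
    tokens, current = [], []
    for ch in text:
        if is_token_char(ch):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))

    result = []
    for tok in tokens:
        tok = tok.lower()  # stand-in for simple case folding
        if remove_diacritics:
            tok = strip_diacritics(tok)
        result.append(tok)
    return result
```

Under these assumptions, `tokenize("Café au-lait 2019")` yields `['cafe', 'au', 'lait', '2019']`, and passing `remove_diacritics=False` keeps the accent, giving `['café', ...]`. The boundary codepoints named in the 2013-06-05 entry (for example U+E000 and U+10FFFD) are accepted by `is_token_char` because Python reports them in category `Co`.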