Overview
Comment: | Just don't run tolower() on hi-bit characters. This shouldn't cause us to break any UTF-8 code points, unless they were already broken in the input. (CVS 3376) |
---|---|
SHA1: | 6c77c2d5e15e9d3efed3e274bc93cd5a |
User & Date: | shess 2006-08-30 21:40:30.000 |
Context
2006-08-31
15:07 | Refactor the FTS1 module so that its name is "fts1" instead of "fulltext", so that all symbols with external linkage begin with "sqlite3Fts1", and so that all filenames begin with "fts1". (CVS 3377) (check-in: e1891f0dc5 user: drh tags: trunk)
2006-08-30
21:40 | Just don't run tolower() on hi-bit characters. This shouldn't cause us to break any UTF-8 code points, unless they were already broken in the input. (CVS 3376) (check-in: 6c77c2d5e1 user: shess tags: trunk)
2006-08-29
18:46 | Bug fix: Get INSERT INTO ... SELECT working when the target is a virtual table. (CVS 3375) (check-in: 7cdc41e748 user: drh tags: trunk)
Changes
Changes to ext/fts1/simple_tokenizer.c.
︙
Lines 58-75 (new version):

      ** track such information in the database, then we'd only want this
      ** information on the initial create.
      */
      if( argc>1 ){
        t->zDelim = string_dup(argv[1]);
      } else {
        /* Build a string excluding alphanumeric ASCII characters */
        char zDelim[0x80];               /* nul-terminated, so nul not a member */
        int i, j;
        for(i=1, j=0; i<0x80; i++){
          if( !isalnum(i) ){
            zDelim[j++] = i;
          }
        }
        zDelim[j++] = '\0';
        assert( j<=sizeof(zDelim) );
        t->zDelim = string_dup(zDelim);
      }
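The delimiter-building loop in this hunk can be tried in isolation. The following is a minimal sketch, not code from the check-in: the helper name `build_ascii_delims` is hypothetical, and it assumes the default "C" locale so that `isalnum()` accepts only the ASCII digits and letters.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper, not part of simple_tokenizer.c: fill zDelim with
** every 7-bit ASCII character that is not alphanumeric, then nul-terminate.
** Returns the number of bytes written, including the terminator. */
static int build_ascii_delims(char zDelim[0x80]){
  int i, j;
  for(i=1, j=0; i<0x80; i++){    /* start at 1: nul cannot be a member */
    if( !isalnum(i) ){
      zDelim[j++] = (char)i;
    }
  }
  zDelim[j++] = '\0';
  assert( j<=0x80 );             /* the buffer holds at most 0x80 bytes */
  return j;
}
```

Because the loop stops at 0x80, no byte with the high bit set ever becomes a delimiter, so a `strcspn()` scan over UTF-8 input with this delimiter set does not split inside a multi-byte sequence.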
︙
Lines 130-148 (new version):

      while( c->pCurrent-c->pInput<c->nBytes ){
        int n = (int) strcspn(c->pCurrent, t->zDelim);
        if( n>0 ){
          if( n+1>c->nTokenAllocated ){
            c->zToken = realloc(c->zToken, n+1);
          }
          for(ii=0; ii<n; ii++){
            /* TODO(shess) This needs expansion to handle UTF-8
            ** case-insensitivity.
            */
            char ch = c->pCurrent[ii];
            c->zToken[ii] = (unsigned char)ch<0x80 ? tolower(ch) : ch;
          }
          c->zToken[n] = '\0';
          *ppToken = c->zToken;
          *pnBytes = n;
          *piStartOffset = (int) (c->pCurrent-c->pInput);
          *piEndOffset = *piStartOffset+n;
          *piPosition = c->iToken++;
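The case-folding rule in this hunk relies on a property of UTF-8: every byte of a multi-byte sequence has its high bit set, so folding only bytes below 0x80 leaves those sequences untouched. A minimal standalone sketch of that rule follows; the helper name `fold_ascii_lower` and its signature are hypothetical and do not appear in the check-in.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of simple_tokenizer.c: copy n bytes from
** zIn to zOut, lowercasing ASCII letters and passing bytes >= 0x80 through
** unchanged. zOut must have room for n+1 bytes. */
static void fold_ascii_lower(const char *zIn, size_t n, char *zOut){
  size_t ii;
  for(ii=0; ii<n; ii++){
    unsigned char ch = (unsigned char)zIn[ii];
    /* Only 7-bit values reach tolower(); hi-bit bytes are copied as-is. */
    zOut[ii] = ch<0x80 ? (char)tolower(ch) : (char)ch;
  }
  zOut[n] = '\0';
}
```

For example, folding the five bytes of "CAF\xC3\x89" (the UTF-8 encoding of "CAFÉ") lowercases the three ASCII letters and copies the two bytes of "É" unchanged. As the TODO in the hunk notes, this is not full UTF-8 case-insensitivity: "É" is not mapped to "é".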
︙