Overview
Comment: Convert fts2 to store data in a way which allows for much faster updates. Groups of documents form segments which are encoded in a btree layered over a table of blocks, with various tricks to make merges fast. This performs 20x-25x faster than fts1 when loading the Enron corpus, and is only slightly slower for queries. (CVS 3474)
SHA1: 85272b2f5394e37916afb1d509e72968
User & Date: shess 2006-10-12 23:15:25.000
Context
2006-10-13
11:55  Removing debugging printf from the porter stemmer code. Ticket #2016. (CVS 3475) (check-in: 7a08c6272f user: drh tags: trunk)
2006-10-12
23:15  Convert fts2 to store data in a way which allows for much faster updates. Groups of documents form segments which are encoded in a btree layered over a table of blocks, with various tricks to make merges fast. This performs 20x-25x faster than fts1 when loading the Enron corpus, and is only slightly slower for queries. (CVS 3474) (check-in: 85272b2f53 user: shess tags: trunk)
2006-10-11
17:19  Bug fix: named local variable lockStyle as lockingStyle in SQLITE_ENABLE_LOCKING_STYLE block in allocateUnixFile (CVS 3473) (check-in: aa0b96c3df user: aswift tags: trunk)
Changes
Changes to ext/fts2/fts2.c.
︙
**
**   * The FTS2 module is being built as an extension
**     (in which case SQLITE_CORE is not defined), or
**
**   * The FTS2 module is being built into the core of
**     SQLite (in which case SQLITE_ENABLE_FTS2 is defined).
*/

/* TODO(shess) Consider exporting this comment to an HTML file or the
** wiki.
*/
/* The full-text index is stored in a series of b+tree (-like)
** structures called segments which map terms to doclists.  The
** structures are like b+trees in layout, but are constructed from the
** bottom up in optimal fashion and are not updatable.  Since trees
** are built from the bottom up, things will be described from the
** bottom up.
**
**
**** Varints ****
** The basic unit of encoding is a variable-length integer called a
** varint.  We encode variable-length integers in little-endian order
** using seven bits per byte as follows:
**
** KEY:
**         A = 0xxxxxxx    7 bits of data and one flag bit
**         B = 1xxxxxxx    7 bits of data and one flag bit
**
**   7 bits - A
**  14 bits - BA
**  21 bits - BBA
** and so on.
**
** This is identical to how sqlite encodes varints (see util.c).
**
**
**** Document lists ****
** A doclist (document list) holds a docid-sorted list of hits for a
** given term.  Doclists hold docids, and can optionally associate
** token positions and offsets with docids.
**
** A DL_POSITIONS_OFFSETS doclist is stored like this:
**
** array {
**   varint docid;
**   array {                (position list for column 0)
**     varint position;     (delta from previous position plus POS_BASE)
**     varint startOffset;  (delta from previous startOffset)
**     varint endOffset;    (delta from startOffset)
**   }
**   array {
**     varint POS_COLUMN;   (marks start of position list for new column)
**     varint column;       (index of new column)
**     array {
**       varint position;   (delta from previous position plus POS_BASE)
**       varint startOffset;(delta from previous startOffset)
**       varint endOffset;  (delta from startOffset)
**     }
**   }
**   varint POS_END;        (marks end of positions for this document.)
** }
**
** Here, array { X } means zero or more occurrences of X, adjacent in
** memory.  A "position" is an index of a token in the token stream
** generated by the tokenizer, while an "offset" is a byte offset,
** both based at 0.  Note that POS_END and POS_COLUMN occur in the
** same logical place as the position element, and act as sentinels
** ending a position list array.
**
** A DL_POSITIONS doclist omits the startOffset and endOffset
** information.  A DL_DOCIDS doclist omits both the position and
** offset information, becoming an array of varint-encoded docids.
**
** On-disk data is stored as type DL_DEFAULT, so we don't serialize
** the type.  Due to how deletion is implemented in the segmentation
** system, on-disk doclists MUST store at least positions.
**
** TODO(shess) Delta-encode docids.  This provides a 10% win versus
** DL_POSITIONS_OFFSETS on the first 100,000 documents of the Enron
** corpus, greater versus DL_POSITIONS.
**
**
**** Segment leaf nodes ****
** Segment leaf nodes store terms and doclists, ordered by term.  Leaf
** nodes are written using LeafWriter, and read using LeafReader (to
** iterate through a single leaf node's data) and LeavesReader (to
** iterate through a segment's entire leaf layer).  Leaf nodes have
** the format:
**
** varint iHeight;             (height from leaf level, always 0)
** varint nTerm;               (length of first term)
** char pTerm[nTerm];          (content of first term)
** varint nDoclist;            (length of term's associated doclist)
** char pDoclist[nDoclist];    (content of doclist)
** array {
**                             (further terms are delta-encoded)
**   varint nPrefix;           (length of prefix shared with previous term)
**   varint nSuffix;           (length of unshared suffix)
**   char pTermSuffix[nSuffix];(unshared suffix of next term)
**   varint nDoclist;          (length of term's associated doclist)
**   char pDoclist[nDoclist];  (content of doclist)
** }
**
** Here, array { X } means zero or more occurrences of X, adjacent in
** memory.
**
** Leaf nodes are broken into blocks which are stored contiguously in
** the %_segments table in sorted order.  This means that when the end
** of a node is reached, the next term is in the node with the next
** greater node id.
**
** New data is spilled to a new leaf node when the current node
** exceeds LEAF_MAX bytes (default 2048).  New data which itself is
** larger than STANDALONE_MIN (default 1024) is placed in a standalone
** node (a leaf node with a single term and doclist).  The goal of
** these settings is to pack together groups of small doclists while
** making it efficient to directly access large doclists.  The
** assumption is that large doclists represent terms which are more
** likely to be query targets.
**
** TODO(shess) It may be useful for blocking decisions to be more
** dynamic.  For instance, it may make more sense to have a 2.5k leaf
** node rather than splitting into 2k and .5k nodes.  My intuition is
** that this might extend through 2x or 4x the pagesize.
**
**
**** Segment interior nodes ****
** Segment interior nodes store blockids for subtree nodes and terms
** to describe what data is stored by each subtree.  Interior nodes
** are written using InteriorWriter, and read using InteriorReader.
** InteriorWriters are created as needed when SegmentWriter creates
** new leaf nodes, or when an interior node itself grows too big and
** must be split.  The format of interior nodes:
**
** varint iHeight;           (height from leaf level, always >0)
** varint iBlockid;          (block id of node's leftmost subtree)
** array {
**   varint nTerm;           (length of term)
**   char pTerm[nTerm];      (content of term)
** }
**
** Here, array { X } means zero or more occurrences of X, adjacent in
** memory.
**
** An interior node encodes n terms separating n+1 subtrees.  The
** subtree blocks are contiguous, so only the first subtree's blockid
** is encoded.  The subtree at iBlockid will contain all terms less
** than the first term encoded (or all terms if no term is encoded).
** Otherwise, for terms greater than or equal to pTerm[i] but less
** than pTerm[i+1], the subtree for that term will be rooted at
** iBlockid+i.
**
** New data is spilled to a new interior node at the same height when
** the current node exceeds INTERIOR_MAX bytes (default 2048).  The
** interior nodes at a given height are naturally tracked by interior
** nodes at height+1, and so on.
**
**
**** Segment directory ****
** The segment directory in table %_segdir stores meta-information for
** merging and deleting segments, and also the root node of the
** segment's tree.
**
** The root node is the top node of the segment's tree after encoding
** the entire segment, restricted to ROOT_MAX bytes (default 1024).
** This could be either a leaf node or an interior node.  If the top
** node requires more than ROOT_MAX bytes, it is flushed to %_segments
** and a new root interior node is generated (which should always fit
** within ROOT_MAX because it only needs space for 2 varints, the
** height and the blockid of the previous root).
**
** The meta-information in the segment directory is:
**   level               - segment level (see below)
**   idx                 - index within level
**                       - (level,idx uniquely identify a segment)
**   start_block         - first leaf node
**   leaves_end_block    - last leaf node
**   end_block           - last block (including interior nodes)
**   root                - contents of root node
**
** If the root node is a leaf node, then start_block,
** leaves_end_block, and end_block are all 0.
**
**
**** Segment merging ****
** To amortize update costs, segments are grouped into levels and
** merged in batches.  Each increase in level represents exponentially
** more documents.
**
** New documents (actually, document updates) are tokenized and
** written individually (using LeafWriter) to a level 0 segment, with
** incrementing idx.  When idx reaches MERGE_COUNT (default 16), all
** level 0 segments are merged into a single level 1 segment.  Level 1
** is populated like level 0, and eventually MERGE_COUNT level 1
** segments are merged to a single level 2 segment (representing
** MERGE_COUNT^2 updates), and so on.
**
** A segment merge traverses all segments at a given level in
** parallel, performing a straightforward sorted merge.  Since segment
** leaf nodes are written into the %_segments table in order, this
** merge traverses the underlying sqlite disk structures efficiently.
** After the merge, all segment blocks from the merged level are
** deleted.
**
** MERGE_COUNT controls how often we merge segments.  16 seems to be
** somewhat of a sweet spot for insertion performance.  32 and 64 show
** very similar performance numbers to 16 on insertion, though they're
** a tiny bit slower (perhaps due to more overhead in merge-time
** sorting).  8 is about 20% slower than 16, 4 about 50% slower than
** 16, 2 about 66% slower than 16.
**
** At query time, high MERGE_COUNT increases the number of segments
** which need to be scanned and merged.  For instance, with 100k docs
** inserted:
**
**    MERGE_COUNT   segments
**       16           25
**        8           12
**        4           10
**        2            6
**
** This appears to have only a moderate impact on queries for very
** frequent terms (which are somewhat dominated by segment merge
** costs), and infrequent and non-existent terms still seem to be fast
** even with many segments.
**
** TODO(shess) That said, it would be nice to have a better query-side
** argument for MERGE_COUNT of 16.  Also, it's possible/likely that
** optimizations to things like doclist merging will swing the sweet
** spot around.
**
**
**** Handling of deletions and updates ****
** Since we're using a segmented structure, with no docid-oriented
** index into the term index, we clearly cannot simply update the term
** index when a document is deleted or updated.  For deletions, we
** write an empty doclist (varint(docid) varint(POS_END)), for updates
** we simply write the new doclist.  Segment merges overwrite older
** data for a particular docid with newer data, so deletes or updates
** will eventually overtake the earlier data and knock it out.  The
** query logic likewise merges doclists so that newer data knocks out
** older data.
**
** TODO(shess) Provide a VACUUM type operation to clear out all
** deletions and duplications.  This would basically be a forced merge
** into a single segment.
*/

#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS2)

#if defined(SQLITE_ENABLE_FTS2) && !defined(SQLITE_CORE)
# define SQLITE_CORE 1
#endif

#include <assert.h>
#if !defined(__APPLE__)
#include <malloc.h>
#endif
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#include "fts2.h"
#include "fts2_hash.h"
#include "fts2_tokenizer.h"
#include "sqlite3.h"
#include "sqlite3ext.h"
SQLITE_EXTENSION_INIT1

/* TODO(shess) MAN, this thing needs some refactoring.  At minimum, it
** would be nice to order the file better, perhaps something along the
** lines of:
**
**  - utility functions
**  - table setup functions
**  - table update functions
**  - table query functions
**
** Put the query functions last because they're likely to reference
** typedefs or functions from the table update section.
*/

#if 0
# define TRACE(A)  printf A; fflush(stdout)
#else
# define TRACE(A)
#endif
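To make the varint encoding concrete, here is a minimal standalone sketch of the scheme the comment describes: seven data bits per byte, little-endian, flag bit set on every byte except the last. This is an illustration, not part of the check-in; fts2's own putVarint()/getVarint() additionally handle the full signed 64-bit range.

/* Sketch of the varint coding described above.  Assumes v>=0. */
typedef long long i64;

static int varintPut(char *p, i64 v){
  int n = 0;
  do{
    p[n++] = (char)((v & 0x7f) | 0x80);  /* 7 data bits, flag bit set */
    v >>= 7;
  }while( v!=0 );
  p[n-1] &= 0x7f;                        /* clear the flag on the final byte */
  return n;
}

static int varintGet(const char *p, i64 *pv){
  i64 v = 0;
  int n = 0, shift = 0;
  unsigned char c;
  do{
    c = (unsigned char)p[n++];
    v |= (i64)(c & 0x7f) << shift;       /* little-endian: low bits first */
    shift += 7;
  }while( c & 0x80 );
  *pv = v;
  return n;
}

For example, 1000 (0x3E8) encodes as the two bytes 0xE8 0x07: the low seven bits 0x68 with the flag set, then the remaining bits 0x07 with the flag clear.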
︙
  sb->len += nFrom;
  sb->s[sb->len] = 0;
}
static void append(StringBuffer *sb, const char *zFrom){
  nappend(sb, zFrom, strlen(zFrom));
}

/* Helper functions for certain common memory-allocation idioms:
**
** data_dup()     - malloc+memcpy to duplicate a buffer
** data_replace() - realloc+memcpy to dup a buffer over an existing buffer
** data_append()  - realloc+memcpy to append data to an existing buffer
** data_append2() - shorthand for calling data_append() twice.
*/
/* TODO(shess) There is a "block of binary data on the heap" construct
** in here which could be shared with (at least) the StringBuffer and
** DocList constructs.
*/
static void data_replace(char **ppTarget, int *pnTarget,
                         const char *pSource, int nSource){
  *ppTarget = realloc(*ppTarget, nSource);
  memcpy(*ppTarget, pSource, nSource);
  *pnTarget = nSource;
}

static void data_dup(char **ppTarget, int *pnTarget,
                     const char *pSource, int nSource){
  *ppTarget = malloc(nSource);
  memcpy(*ppTarget, pSource, nSource);
  *pnTarget = nSource;
}

static void data_append(char **ppTarget, int *pnTarget,
                        const char *pSource, int nSource){
  *ppTarget = realloc(*ppTarget, *pnTarget+nSource);
  memcpy(*ppTarget+*pnTarget, pSource, nSource);
  *pnTarget += nSource;
}

static void data_append2(char **ppTarget, int *pnTarget,
                         const char *pSource1, int nSource1,
                         const char *pSource2, int nSource2){
  *ppTarget = realloc(*ppTarget, *pnTarget+nSource1+nSource2);
  memcpy(*ppTarget+*pnTarget, pSource1, nSource1);
  memcpy(*ppTarget+*pnTarget+nSource1, pSource2, nSource2);
  *pnTarget += nSource1+nSource2;
}

/* We may need up to VARINT_MAX bytes to store an encoded 64-bit integer. */
#define VARINT_MAX 10

/* Write a 64-bit variable-length integer to memory starting at p[0].
 * The length of data written will be between 1 and VARINT_MAX bytes.
 * The number of bytes written is returned. */
︙
  sqlite_int64 i;
  int ret = getVarint(p, &i);
  *pi = (int) i;
  assert( *pi==i );
  return ret;
}

typedef enum DocListType {
  DL_DOCIDS,              /* docids only */
  DL_POSITIONS,           /* docids + positions */
  DL_POSITIONS_OFFSETS    /* docids + positions + offsets */
} DocListType;

/*
** By default, only positions and not offsets are stored in the doclists.
** To change this so that offsets are stored too, compile with
**
**          -DDL_DEFAULT=DL_POSITIONS_OFFSETS
**
** If DL_DEFAULT is set to DL_DOCIDS, your table can only be inserted
** into (no deletes or updates).
*/
#ifndef DL_DEFAULT
# define DL_DEFAULT DL_POSITIONS
#endif

typedef struct DocList {
  char *pData;
  int nData;
  DocListType iType;
  int iLastColumn;    /* the last column written */
  int iLastPos;       /* the last position written */
  int iLastOffset;    /* the last start offset written */
} DocList;

enum {
  POS_END = 0,        /* end of this position list */
  POS_COLUMN,         /* followed by new column number */
  POS_BASE
};

/* TODO(shess) I think it might be time to refactor the doclist
** manipulation.  Broadly put, there are four fairly discrete clients,
** tokenization, insert-time segment merging, query-time segment
** merging and query-time analysis.  The breakdown I think might be
** reasonable would be:
**
** DocListReader - Wraps const char *pData, int nData.
**                 Used to traverse doclists
** DocListWriter - Starts empty, can add complete doclist elements.
**                 Used in merging doclists.
** DocBuilder    - Used when tokenizing documents.
*/
static void docListCoreInit(DocList *d, DocListType iType,
                            char *pData, int nData){
  d->nData = nData;
  d->pData = pData;
  d->iType = iType;
  d->iLastColumn = 0;
  d->iLastPos = d->iLastOffset = 0;
}

/* Initialize a new DocList pointing to static data.  Don't call
** docListDestroy() to release, just free(d) (if you originally
** malloced d).
*/
static void docListStaticInit(DocList *d, DocListType iType,
                              const char *pData, int nData){
  docListCoreInit(d, iType, (char *)pData, nData);
}

/* Initialize a new DocList to hold a copy of the given data. */
static void docListInit(DocList *d, DocListType iType,
                        const char *pData, int nData){
  char *p = 0;
  if( nData>0 ){
    p = malloc(nData);
    memcpy(p, pData, nData);
  }
  docListCoreInit(d, iType, p, nData);
}

/* Create a new dynamically-allocated DocList. */
static DocList *docListNew(DocListType iType){
  DocList *d = (DocList *) malloc(sizeof(DocList));
  docListInit(d, iType, 0, 0);
  return d;
}
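As a usage note on the three initializers above: they differ only in who owns pData. A sketch, not part of the check-in (docListDestroy() and docListDelete() are defined further down in fts2.c):

/* Sketch: ownership conventions for the DocList initializers above. */
static void docListOwnershipSketch(const char *pEncoded, int nEncoded){
  DocList aliased, copied, *heap;

  /* Points at caller-owned bytes; no per-DocList cleanup required. */
  docListStaticInit(&aliased, DL_DEFAULT, pEncoded, nEncoded);

  /* Holds a private malloced copy; release with docListDestroy(&copied). */
  docListInit(&copied, DL_DEFAULT, pEncoded, nEncoded);

  /* Struct and data both heap-allocated; release with docListDelete(heap). */
  heap = docListNew(DL_DEFAULT);
  (void)aliased; (void)heap;
}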
︙
  sqlite_int64 d = 0;
  while( !atEnd(pReader) && (d=peekDocid(pReader))<iDocid ){
    skipDocument(pReader);
  }
  return !atEnd(pReader) && d==iDocid;
}

#ifdef SQLITE_DEBUG
/*
** This routine is used for debugging purposes only.
**
** Write the content of a doclist to standard output.
*/
static void printDoclist(DocList *p){
︙
    }
  }
  docListDestroy(in);
  *in = out;
}

/* Efficiently merge left and right into out, with duplicated docids
** from right overwriting those in left (left is effectively older
** than right).  The previous code had a memmove() which introduced an
** O(N^2) into merges, while this code should be O(N).
*/
static void docListMerge(DocList *out, DocList *left, DocList *right){
  DocListReader leftReader, rightReader;
  int iData = 0;
#ifndef NDEBUG
  /* Track these to make certain that every byte is processed. */
  int nLeftProcessed = 0, nRightProcessed = 0;
#endif

  assert( left->iType==right->iType );

  /* Handle edge cases. */
  /* TODO(shess) Consider simply forbidding edge cases, in the
  ** interests of saving copies.
  */
  if( left->nData==0 ){
    docListInit(out, right->iType, right->pData, right->nData);
    return;
  }else if( right->nData==0 ){
    docListInit(out, left->iType, left->pData, left->nData);
    return;
  }

  docListInit(out, left->iType, 0, 0);

  /* At this time, the sum of the space of the inputs is a strict
  ** upper bound.  *out can end up smaller if elements of *right
  ** overwrite elements of *left, but never larger.
  */
  out->nData = left->nData+right->nData;
  out->pData = malloc(out->nData);

  readerInit(&leftReader, left);
  readerInit(&rightReader, right);

  while( !atEnd(&leftReader) && !atEnd(&rightReader) ){
    sqlite_int64 iLeftDocid = peekDocid(&leftReader);
    sqlite_int64 iRightDocid = peekDocid(&rightReader);
    const char *pStart, *pEnd;

    if( iLeftDocid<iRightDocid ){
      /* Copy from *left where less than iRightDocid. */
      pStart = leftReader.p;
      skipToDocid(&leftReader, iRightDocid);
      pEnd = leftReader.p;
#ifndef NDEBUG
      nLeftProcessed += pEnd-pStart;
#endif
    }else{
      /* Copy from *right where less than iLeftDocid, plus the element
      ** matching iLeftDocid, if present.  Also drop the matching
      ** element from *left.
      */
      pStart = rightReader.p;
      if( skipToDocid(&rightReader, iLeftDocid) ){
#ifndef NDEBUG
        const char *pLeftStart = leftReader.p;
#endif
        skipDocument(&leftReader);
        skipDocument(&rightReader);
#ifndef NDEBUG
        nLeftProcessed += leftReader.p-pLeftStart;
#endif
      }
      pEnd = rightReader.p;
#ifndef NDEBUG
      nRightProcessed += pEnd-pStart;
#endif
    }
    assert( pEnd>pStart );
    assert( iData+pEnd-pStart<=out->nData );
    memcpy(out->pData+iData, pStart, pEnd-pStart);
    iData += pEnd-pStart;
  }

  if( !atEnd(&leftReader) ){
    int n = left->nData-(leftReader.p-left->pData);
    assert( atEnd(&rightReader) );
    memcpy(out->pData+iData, leftReader.p, n);
    iData += n;
#ifndef NDEBUG
    nLeftProcessed += n;
#endif
  }else if( !atEnd(&rightReader) ){
    int n = right->nData-(rightReader.p-right->pData);
    memcpy(out->pData+iData, rightReader.p, n);
    iData += n;
#ifndef NDEBUG
    nRightProcessed += n;
#endif
  }
  out->nData = iData;
  out->pData = realloc(out->pData, out->nData);

  assert( nLeftProcessed==left->nData );
  assert( nRightProcessed==right->nData );
}

/*
** Read the next docid off of pIn.  Return 0 if we reach the end.
*
* TODO: This assumes that docids are never 0, but they may actually be 0 since
* users can choose docids when inserting into a full-text table.  Fix this.
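To see the newer-knocks-out-older behavior of docListMerge() concretely: the bytes below are hand-encoded DL_POSITIONS doclists following the format comment at the top of the file (varint docid, then position deltas offset by POS_BASE, then POS_END). This is an illustrative sketch, not part of the check-in; cleanup calls are omitted.

/* Sketch: merging an older doclist with a newer one that deletes docid 3. */
static void docListMergeSketch(void){
  /* Older data: docid 1 at position 0, docid 3 at position 0. */
  static const char aOld[] = { 1, 2, 0,  3, 2, 0 };
  /* Newer data: an empty doclist for docid 3, i.e. a deletion marker. */
  static const char aNew[] = { 3, 0 };
  DocList left, right, out;

  docListInit(&left, DL_POSITIONS, aOld, sizeof(aOld));
  docListInit(&right, DL_POSITIONS, aNew, sizeof(aNew));

  /* right is newer, so its docid-3 element replaces left's. */
  docListMerge(&out, &left, &right);
  /* out is now { 1, 2, 0, 3, 0 }: docid 1's hit survives, while docid 3
  ** carries only the empty (deletion) doclist. */
}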
︙
typedef enum QueryType {
  QUERY_GENERIC,   /* table scan */
  QUERY_ROWID,     /* lookup by rowid */
  QUERY_FULLTEXT   /* QUERY_FULLTEXT + [i] is a full-text search for column i*/
} QueryType;

typedef enum fulltext_statement {
  CONTENT_INSERT_STMT,
  CONTENT_SELECT_STMT,
  CONTENT_UPDATE_STMT,
  CONTENT_DELETE_STMT,

  BLOCK_INSERT_STMT,
  BLOCK_SELECT_STMT,
  BLOCK_DELETE_STMT,

  SEGDIR_MAX_INDEX_STMT,
  SEGDIR_SET_STMT,
  SEGDIR_SELECT_STMT,
  SEGDIR_SPAN_STMT,
  SEGDIR_DELETE_STMT,
  SEGDIR_SELECT_ALL_STMT,

  MAX_STMT                     /* Always at end! */
} fulltext_statement;

/* These must exactly match the enum above. */
/* TODO(shess): Is there some risk that a statement will be used in two
** cursors at once, e.g. if a query joins a virtual table to itself?
** If so perhaps we should move some of these to the cursor object.
*/
static const char *const fulltext_zStatement[MAX_STMT] = {
  /* CONTENT_INSERT */ NULL,  /* generated in contentInsertStatement() */
  /* CONTENT_SELECT */ "select * from %_content where rowid = ?",
  /* CONTENT_UPDATE */ NULL,  /* generated in contentUpdateStatement() */
  /* CONTENT_DELETE */ "delete from %_content where rowid = ?",

  /* BLOCK_INSERT */ "insert into %_segments values (?)",
  /* BLOCK_SELECT */ "select block from %_segments where rowid = ?",
  /* BLOCK_DELETE */ "delete from %_segments where rowid between ? and ?",

  /* SEGDIR_MAX_INDEX */ "select max(idx) from %_segdir where level = ?",
  /* SEGDIR_SET */ "insert into %_segdir values (?, ?, ?, ?, ?, ?)",
  /* SEGDIR_SELECT */
  "select start_block, leaves_end_block, root from %_segdir "
  " where level = ? order by idx",
  /* SEGDIR_SPAN */
  "select min(start_block), max(end_block) from %_segdir "
  " where level = ? and start_block <> 0",
  /* SEGDIR_DELETE */ "delete from %_segdir where level = ?",
  /* SEGDIR_SELECT_ALL */ "select root from %_segdir order by level desc, idx",
};

/* MERGE_COUNT controls how often we merge segments (see comment at
** top of file).
*/
#define MERGE_COUNT 16

/*
** A connection to a fulltext index is an instance of the following
** structure.  The xCreate and xConnect methods create an instance
** of this structure and xDestroy and xDisconnect free that instance.
** All other methods receive a pointer to the structure as one of their
** arguments.
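Given the merge policy described in the top-of-file comment, the number of live segments after a series of single-document updates is the digit sum of the update count written in base MERGE_COUNT: each level keeps its remainder of unmerged segments while the quotient is promoted upward. A small sketch, not part of the check-in:

/* Sketch: estimate live segments after nDocs single-document updates. */
static int estimateSegmentCount(int nDocs){
  int nSeg = 0;
  while( nDocs>0 ){
    nSeg += nDocs % MERGE_COUNT;   /* unmerged segments left at this level */
    nDocs /= MERGE_COUNT;          /* merges promoted to the next level */
  }
  return nSeg;
}

With MERGE_COUNT of 16, estimateSegmentCount(100000) is 25 (100000 is 0x186A0, digit sum 1+8+6+10+0), matching the 25 segments reported for 100k docs in the table in the top-of-file comment; the base-8, base-4, and base-2 digit sums likewise give 12, 10, and 6.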
︙
  char **azContentColumn;     /* column names in content table; malloced */
  sqlite3_tokenizer *pTokenizer;  /* tokenizer for inserts and queries */

  /* Precompiled statements which we keep as long as the table is
  ** open.
  */
  sqlite3_stmt *pFulltextStatements[MAX_STMT];

  /* Precompiled statements used for segment merges.  We run a
  ** separate select across the leaf level of each tree being merged.
  */
  sqlite3_stmt *pLeafSelectStmts[MERGE_COUNT];
  /* The statement used to prepare pLeafSelectStmts. */
#define LEAF_SELECT \
  "select block from %_segments where rowid between ? and ? order by rowid"
};

/*
** When the core wants to do a query, it creates a cursor using a
** call to xOpen.  This structure is an instance of a cursor.  It
** is destroyed by xClose.
*/
︙
*/
static int sql_single_step_statement(fulltext_vtab *v,
                                     fulltext_statement iStmt,
                                     sqlite3_stmt **ppStmt){
  int rc = sql_step_statement(v, iStmt, ppStmt);
  return (rc==SQLITE_DONE) ? SQLITE_OK : rc;
}

/* Like sql_get_statement(), but for special replicated LEAF_SELECT
** statements.
*/
/* TODO(shess) Write version for generic statements and then share
** that between the cached-statement functions.
*/
static int sql_get_leaf_statement(fulltext_vtab *v, int idx,
                                  sqlite3_stmt **ppStmt){
  assert( idx>=0 && idx<MERGE_COUNT );
  if( v->pLeafSelectStmts[idx]==NULL ){
    int rc = sql_prepare(v->db, v->zName, &v->pLeafSelectStmts[idx],
                         LEAF_SELECT);
    if( rc!=SQLITE_OK ) return rc;
  }else{
    int rc = sqlite3_reset(v->pLeafSelectStmts[idx]);
    if( rc!=SQLITE_OK ) return rc;
  }

  *ppStmt = v->pLeafSelectStmts[idx];
  return SQLITE_OK;
}

/* Like sql_step_statement(), but for special replicated LEAF_SELECT
** statements.
*/
/* TODO(shess) Write version for generic statements and then share
** that between the cached-statement functions.
*/
static int sql_step_leaf_statement(fulltext_vtab *v, int idx,
                                   sqlite3_stmt **ppStmt){
  int rc;
  sqlite3_stmt *s = *ppStmt;

  while( (rc=sqlite3_step(s))!=SQLITE_DONE && rc!=SQLITE_ROW ){
    sqlite3_stmt *pNewStmt;

    if( rc==SQLITE_BUSY ) continue;
    if( rc!=SQLITE_ERROR ) return rc;

    rc = sqlite3_reset(s);
    if( rc!=SQLITE_SCHEMA ) return SQLITE_ERROR;

    v->pLeafSelectStmts[idx] = NULL;   /* Still in s */
    rc = sql_get_leaf_statement(v, idx, &pNewStmt);
    if( rc!=SQLITE_OK ) goto err;
    *ppStmt = pNewStmt;

    rc = sqlite3_transfer_bindings(s, pNewStmt);
    if( rc!=SQLITE_OK ) goto err;

    rc = sqlite3_finalize(s);
    if( rc!=SQLITE_OK ) return rc;
    s = pNewStmt;
  }
  return rc;

 err:
  sqlite3_finalize(s);
  return rc;
}

/* insert into %_content (rowid, ...) values ([rowid], [pValues]) */
static int content_insert(fulltext_vtab *v, sqlite3_value *rowid,
                          sqlite3_value **pValues){
  sqlite3_stmt *s;
  int i;
  int rc = sql_get_statement(v, CONTENT_INSERT_STMT, &s);
︙
  rc = sqlite3_bind_int64(s, 1, iRow);
  if( rc!=SQLITE_OK ) return rc;

  return sql_single_step_statement(v, CONTENT_DELETE_STMT, &s);
}

/* insert into %_segments values ([pData])
**   returns assigned rowid in *piBlockid
*/
static int block_insert(fulltext_vtab *v, const char *pData, int nData,
                        sqlite_int64 *piBlockid){
  sqlite3_stmt *s;
  int rc = sql_get_statement(v, BLOCK_INSERT_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_blob(s, 1, pData, nData, SQLITE_STATIC);
  if( rc!=SQLITE_OK ) return rc;

  rc = sql_step_statement(v, BLOCK_INSERT_STMT, &s);
  if( rc==SQLITE_ROW ) return SQLITE_ERROR;
  if( rc!=SQLITE_DONE ) return rc;

  *piBlockid = sqlite3_last_insert_rowid(v->db);
  return SQLITE_OK;
}

/* delete from %_segments
**   where rowid between [iStartBlockid] and [iEndBlockid]
**
** Deletes the range of blocks, inclusive, used to delete the blocks
** which form a segment.
*/
static int block_delete(fulltext_vtab *v,
                        sqlite_int64 iStartBlockid, sqlite_int64 iEndBlockid){
  sqlite3_stmt *s;
  int rc = sql_get_statement(v, BLOCK_DELETE_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int64(s, 1, iStartBlockid);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int64(s, 2, iEndBlockid);
  if( rc!=SQLITE_OK ) return rc;

  return sql_single_step_statement(v, BLOCK_DELETE_STMT, &s);
}

/* Returns SQLITE_ROW with *pidx set to the maximum segment idx found
** at iLevel.  Returns SQLITE_DONE if there are no segments at
** iLevel.  Otherwise returns an error.
*/
static int segdir_max_index(fulltext_vtab *v, int iLevel, int *pidx){
  sqlite3_stmt *s;
  int rc = sql_get_statement(v, SEGDIR_MAX_INDEX_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int(s, 1, iLevel);
  if( rc!=SQLITE_OK ) return rc;

  rc = sql_step_statement(v, SEGDIR_MAX_INDEX_STMT, &s);
  /* Should always get at least one row due to how max() works. */
  if( rc==SQLITE_DONE ) return SQLITE_DONE;
  if( rc!=SQLITE_ROW ) return rc;

  /* NULL means that there were no inputs to max(). */
  if( SQLITE_NULL==sqlite3_column_type(s, 0) ){
    rc = sqlite3_step(s);
    if( rc==SQLITE_ROW ) return SQLITE_ERROR;
    return rc;
  }

  *pidx = sqlite3_column_int(s, 0);

  /* We expect only one row.  We must execute another sqlite3_step()
   * to complete the iteration; otherwise the table will remain locked. */
  rc = sqlite3_step(s);
  if( rc==SQLITE_ROW ) return SQLITE_ERROR;
  if( rc!=SQLITE_DONE ) return rc;
  return SQLITE_ROW;
}

/* insert into %_segdir values (
**   [iLevel], [idx],
**   [iStartBlockid], [iLeavesEndBlockid], [iEndBlockid],
**   [pRootData]
** )
*/
static int segdir_set(fulltext_vtab *v, int iLevel, int idx,
                      sqlite_int64 iStartBlockid,
                      sqlite_int64 iLeavesEndBlockid,
                      sqlite_int64 iEndBlockid,
                      const char *pRootData, int nRootData){
  sqlite3_stmt *s;
  int rc = sql_get_statement(v, SEGDIR_SET_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int(s, 1, iLevel);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int(s, 2, idx);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int64(s, 3, iStartBlockid);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int64(s, 4, iLeavesEndBlockid);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int64(s, 5, iEndBlockid);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_blob(s, 6, pRootData, nRootData, SQLITE_STATIC);
  if( rc!=SQLITE_OK ) return rc;

  return sql_single_step_statement(v, SEGDIR_SET_STMT, &s);
}

/* Queries %_segdir for the block span of the segments in level
** iLevel.  Returns SQLITE_DONE if there are no blocks for iLevel,
** SQLITE_ROW if there are blocks, else an error.
*/
static int segdir_span(fulltext_vtab *v, int iLevel,
                       sqlite_int64 *piStartBlockid,
                       sqlite_int64 *piEndBlockid){
  sqlite3_stmt *s;
  int rc = sql_get_statement(v, SEGDIR_SPAN_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int(s, 1, iLevel);
  if( rc!=SQLITE_OK ) return rc;

  rc = sql_step_statement(v, SEGDIR_SPAN_STMT, &s);
  if( rc==SQLITE_DONE ) return SQLITE_DONE;  /* Should never happen */
  if( rc!=SQLITE_ROW ) return rc;

  /* This happens if all segments at this level are entirely inline. */
  if( SQLITE_NULL==sqlite3_column_type(s, 0) ){
    /* We expect only one row.  We must execute another sqlite3_step()
     * to complete the iteration; otherwise the table will remain locked. */
    int rc2 = sqlite3_step(s);
    if( rc2==SQLITE_ROW ) return SQLITE_ERROR;
    return rc2;
  }

  *piStartBlockid = sqlite3_column_int64(s, 0);
  *piEndBlockid = sqlite3_column_int64(s, 1);

  /* We expect only one row.  We must execute another sqlite3_step()
   * to complete the iteration; otherwise the table will remain locked. */
  rc = sqlite3_step(s);
  if( rc==SQLITE_ROW ) return SQLITE_ERROR;
  if( rc!=SQLITE_DONE ) return rc;
  return SQLITE_ROW;
}

/* Delete the segment blocks and segment directory records for all
** segments at iLevel.
*/
static int segdir_delete(fulltext_vtab *v, int iLevel){
  sqlite3_stmt *s;
  sqlite_int64 iStartBlockid, iEndBlockid;
  int rc = segdir_span(v, iLevel, &iStartBlockid, &iEndBlockid);
  if( rc!=SQLITE_ROW && rc!=SQLITE_DONE ) return rc;

  if( rc==SQLITE_ROW ){
    rc = block_delete(v, iStartBlockid, iEndBlockid);
    if( rc!=SQLITE_OK ) return rc;
  }

  /* Delete the segment directory itself. */
  rc = sql_get_statement(v, SEGDIR_DELETE_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int64(s, 1, iLevel);
  if( rc!=SQLITE_OK ) return rc;

  return sql_single_step_statement(v, SEGDIR_DELETE_STMT, &s);
}

/*
** Free the memory used to contain a fulltext_vtab structure.
*/
static void fulltext_vtab_destroy(fulltext_vtab *v){
  int iStmt, i;

  TRACE(("FTS2 Destroy %p\n", v));
  for( iStmt=0; iStmt<MAX_STMT; iStmt++ ){
    if( v->pFulltextStatements[iStmt]!=NULL ){
      sqlite3_finalize(v->pFulltextStatements[iStmt]);
      v->pFulltextStatements[iStmt] = NULL;
    }
  }

  for( i=0; i<MERGE_COUNT; i++ ){
    if( v->pLeafSelectStmts[i]!=NULL ){
      sqlite3_finalize(v->pLeafSelectStmts[i]);
      v->pLeafSelectStmts[i] = NULL;
    }
  }

  if( v->pTokenizer!=NULL ){
    v->pTokenizer->pModule->xDestroy(v->pTokenizer);
    v->pTokenizer = NULL;
  }

  free(v->azColumn);
︙
  if( rc!=SQLITE_OK ) return rc;

  rc = constructVtab(db, &spec, ppVTab, pzErr);
  clearTableSpec(&spec);
  return rc;
}

/* The %_content table holds the text of each document, with
** the rowid used as the docid.
*/
/* TODO(shess) This comment needs elaboration to match the updated
** code.  Work it into the top-of-file comment at that time.
*/
static int fulltextCreate(sqlite3 *db, void *pAux,
                          int argc, const char * const *argv,
                          sqlite3_vtab **ppVTab, char **pzErr){
  int rc;
  TableSpec spec;
  StringBuffer schema;
  TRACE(("FTS2 Create\n"));

  rc = parseSpec(&spec, argc, argv, pzErr);
  if( rc!=SQLITE_OK ) return rc;

  initStringBuffer(&schema);
  append(&schema, "CREATE TABLE %_content(");
  appendList(&schema, spec.nColumn, spec.azContentColumn);
  append(&schema, ")");
  rc = sql_exec(db, spec.zName, schema.s);
  free(schema.s);
  if( rc!=SQLITE_OK ) goto out;

  rc = sql_exec(db, spec.zName, "create table %_segments(block blob);");
  if( rc!=SQLITE_OK ) goto out;

  rc = sql_exec(db, spec.zName,
                "create table %_segdir("
                "  level integer,"
                "  idx integer,"
                "  start_block integer,"
                "  leaves_end_block integer,"
                "  end_block integer,"
                "  root blob,"
                "  primary key(level, idx)"
                ");");
  if( rc!=SQLITE_OK ) goto out;

  rc = constructVtab(db, &spec, ppVTab, pzErr);

out:
  clearTableSpec(&spec);
  return rc;
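For context, fulltextCreate() runs when a virtual table is created, so a statement like the sketch below (table and column names hypothetical, not from the check-in) produces the three shadow tables built above: t_content, t_segments, and t_segdir.

/* Sketch: creating an fts2 table, which invokes fulltextCreate(). */
static int createFts2Example(sqlite3 *db){
  return sqlite3_exec(db,
      "CREATE VIRTUAL TABLE t USING fts2(title, body);",
      NULL, NULL, NULL);
}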
︙
static int fulltextDestroy(sqlite3_vtab *pVTab){
  fulltext_vtab *v = (fulltext_vtab *)pVTab;
  int rc;

  TRACE(("FTS2 Destroy %p\n", pVTab));
  rc = sql_exec(v->db, v->zName,
                "drop table %_content;"
                "drop table %_segments;"
                "drop table %_segdir;"
                );
  if( rc!=SQLITE_OK ) return rc;

  fulltext_vtab_destroy((fulltext_vtab *)pVTab);
  return SQLITE_OK;
}

static int fulltextOpen(sqlite3_vtab *pVTab, sqlite3_vtab_cursor **ppCursor){
︙
      return SQLITE_OK;
    }

    /* an error occurred; abort */
    return rc==SQLITE_DONE ? SQLITE_ERROR : rc;
  }
}

/* TODO(shess) If we pushed LeafReader to the top of the file, or to
** another file, term_select() could be pushed above
** docListOfTerm().
*/
static int termSelect(fulltext_vtab *v, int iColumn,
                      const char *pTerm, int nTerm,
                      DocList *out);

/* Return a DocList corresponding to the query term *pTerm.  If *pTerm
** is the first term of a phrase query, go ahead and evaluate the phrase
** query and return the doclist for the entire phrase query.
**
** The result is stored in pTerm->doclist.
*/
static int docListOfTerm(
  fulltext_vtab *v,   /* The full text index */
  int iColumn,        /* column to restrict to.  No restriction if >=nColumn */
  QueryTerm *pQTerm,  /* Term we are looking for, or 1st term of a phrase */
  DocList **ppResult  /* Write the result here */
){
  DocList *pLeft, *pRight, *pNew;
  int i, rc;

  pLeft = docListNew(DL_POSITIONS);
  rc = termSelect(v, iColumn, pQTerm->pTerm, pQTerm->nTerm, pLeft);
  if( rc ) return rc;
  for(i=1; i<=pQTerm->nPhrase; i++){
    pRight = docListNew(DL_POSITIONS);
    rc = termSelect(v, iColumn, pQTerm[i].pTerm, pQTerm[i].nTerm, pRight);
    if( rc ){
      docListDelete(pLeft);
      return rc;
    }
    pNew = docListNew(i<pQTerm->nPhrase ? DL_POSITIONS : DL_DOCIDS);
    docListPhraseMerge(pLeft, pRight, pNew);
    docListDelete(pLeft);
2951 2952 2953 2954 2955 2956 2957 | ** though one could argue that failure there means that the data is ** not durable. *ponder* */ pTokenizer->pModule->xClose(pCursor); return rc; } | < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < | > > > | | 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 | ** though one could argue that failure there means that the data is ** not durable. *ponder* */ pTokenizer->pModule->xClose(pCursor); return rc; } /* Add doclists for all terms in [pValues] to the hash table [terms]. */ static int insertTerms(fulltext_vtab *v, fts2Hash *terms, sqlite_int64 iRowid, sqlite3_value **pValues){ int i; for(i = 0; i < v->nColumn ; ++i){ char *zText = (char*)sqlite3_value_text(pValues[i]); int rc = buildTerms(v, terms, iRowid, zText, i); if( rc!=SQLITE_OK ) return rc; } return SQLITE_OK; } /* Add empty doclists for all terms in the given row's content to the hash * table [pTerms]. */ static int deleteTerms(fulltext_vtab *v, fts2Hash *pTerms, sqlite_int64 iRowid){ const char **pValues; int i, rc; /* TODO(shess) Should we allow such tables at all? */ if( DL_DEFAULT==DL_DOCIDS ) return SQLITE_ERROR; rc = content_select(v, iRowid, &pValues); if( rc!=SQLITE_OK ) return rc; for(i = 0 ; i < v->nColumn; ++i) { rc = buildTerms(v, pTerms, iRowid, pValues[i], -1); if( rc!=SQLITE_OK ) break; } |
︙ | ︙ | |||
3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 | /* Now add positions for terms which appear in the updated row. */ rc = insertTerms(v, pTerms, iRow, pValues); if( rc!=SQLITE_OK ) return rc; return content_update(v, pValues, iRow); /* execute an SQL UPDATE */ } /* This function implements the xUpdate callback; it's the top-level entry * point for inserting, deleting or updating a row in a full-text table. */ static int fulltextUpdate(sqlite3_vtab *pVtab, int nArg, sqlite3_value **ppArg, sqlite_int64 *pRowid){ fulltext_vtab *v = (fulltext_vtab *) pVtab; fts2Hash terms; /* maps term string -> PosList */ | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 
3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 
4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 | /* Now add positions for terms which appear in the updated row. */ rc = insertTerms(v, pTerms, iRow, pValues); if( rc!=SQLITE_OK ) return rc; return content_update(v, pValues, iRow); /* execute an SQL UPDATE */ } /*******************************************************************/ /* InteriorWriter is used to collect terms and block references into ** interior nodes in %_segments. See commentary at top of file for ** format. */ /* How large interior nodes can grow. */ #define INTERIOR_MAX 2048 /* ROOT_MAX controls how much data is stored inline in the segment ** directory. */ /* TODO(shess) Push ROOT_MAX down to whoever is writing things. It's ** only here so that interiorWriterRootInfo() and leafWriterRootInfo() ** can both see it, but if the caller passed it in, we wouldn't even ** need a define. */ #define ROOT_MAX 1024 #if ROOT_MAX<VARINT_MAX*2 # error ROOT_MAX must have enough space for a header. #endif /* InteriorBlock stores a linked-list of interior blocks while a lower ** layer is being constructed. */ typedef struct InteriorBlock { char *pTerm; /* Leftmost term in block's subtree. */ int nTerm; char *pData; int nData; struct InteriorBlock *next; } InteriorBlock; static InteriorBlock *interiorBlockNew(int iHeight, sqlite_int64 iChildBlock, const char *pTerm, int nTerm){ InteriorBlock *block = calloc(1, sizeof(InteriorBlock)); char c[VARINT_MAX+VARINT_MAX]; int n; data_dup(&block->pTerm, &block->nTerm, pTerm, nTerm); n = putVarint(c, iHeight); n += putVarint(c+n, iChildBlock); data_dup(&block->pData, &block->nData, c, n); return block; } typedef struct InteriorWriter { int iHeight; /* from 0 at leaves. */ InteriorBlock *first, *last; struct InteriorWriter *parentWriter; #ifndef NDEBUG sqlite_int64 iLastChildBlock; /* for consistency checks. */ #endif } InteriorWriter; /* Initialize an interior node where pTerm[nTerm] marks the leftmost ** term in the tree. iChildBlock is the leftmost child block at the ** next level down the tree. 
*/ static void interiorWriterInit(int iHeight, const char *pTerm, int nTerm, sqlite_int64 iChildBlock, InteriorWriter *pWriter){ InteriorBlock *block; assert( iHeight>0 ); memset(pWriter, 0, sizeof(*pWriter)); pWriter->iHeight = iHeight; #ifndef NDEBUG pWriter->iLastChildBlock = iChildBlock; #endif block = interiorBlockNew(iHeight, iChildBlock, pTerm, nTerm); pWriter->last = pWriter->first = block; } /* Append the child node rooted at iChildBlock to the interior node, ** with pTerm[nTerm] as the leftmost term in iChildBlock's subtree. */ static void interiorWriterAppend(InteriorWriter *pWriter, const char *pTerm, int nTerm, sqlite_int64 iChildBlock){ char c[VARINT_MAX+VARINT_MAX]; int n = putVarint(c, nTerm); #ifndef NDEBUG pWriter->iLastChildBlock++; #endif assert( pWriter->iLastChildBlock==iChildBlock ); if( pWriter->last->nData+n+nTerm>INTERIOR_MAX ){ /* Overflow to a new block. */ pWriter->last->next = interiorBlockNew(pWriter->iHeight, iChildBlock, pTerm, nTerm); pWriter->last = pWriter->last->next; }else{ InteriorBlock *last = pWriter->last; data_append2(&last->pData, &last->nData, c, n, pTerm, nTerm); } } /* Free the space used by pWriter, including the linked-list of ** InteriorBlocks. */ static int interiorWriterDestroy(InteriorWriter *pWriter){ InteriorBlock *block = pWriter->first; while( block!=NULL ){ InteriorBlock *b = block; block = block->next; free(b->pData); free(b->pTerm); free(b); } #ifndef NDEBUG memset(pWriter, 0x55, sizeof(pWriter)); #endif return SQLITE_OK; } /* If pWriter can fit entirely in ROOT_MAX, return it as the root info ** directly, leaving *piEndBlockid unchanged. Otherwise, flush ** pWriter to %_segments, building a new layer of interior nodes, and ** recursively ask for their root into. */ static int interiorWriterRootInfo(fulltext_vtab *v, InteriorWriter *pWriter, char **ppRootInfo, int *pnRootInfo, sqlite_int64 *piEndBlockid){ InteriorBlock *block = pWriter->first; sqlite_int64 iBlockid = 0; int rc; /* If we can fit the segment inline */ if( block==pWriter->last && block->nData<ROOT_MAX ){ *ppRootInfo = block->pData; *pnRootInfo = block->nData; return SQLITE_OK; } /* Flush the first block to %_segments, and create a new level of ** interior node. */ rc = block_insert(v, block->pData, block->nData, &iBlockid); if( rc!=SQLITE_OK ) return rc; *piEndBlockid = iBlockid; pWriter->parentWriter = malloc(sizeof(*pWriter->parentWriter)); interiorWriterInit(pWriter->iHeight+1, block->pTerm, block->nTerm, iBlockid, pWriter->parentWriter); /* Flush additional blocks and append to the higher interior ** node. */ for(block=block->next; block!=NULL; block=block->next){ rc = block_insert(v, block->pData, block->nData, &iBlockid); if( rc!=SQLITE_OK ) return rc; *piEndBlockid = iBlockid; interiorWriterAppend(pWriter->parentWriter, block->pTerm, block->nTerm, iBlockid); } /* Parent node gets the chance to be the root. */ return interiorWriterRootInfo(v, pWriter->parentWriter, ppRootInfo, pnRootInfo, piEndBlockid); } /****************************************************************/ /* InteriorReader is used to read off the data from an interior node ** (see comment at top of file for the format). InteriorReader does ** not own its data, so interiorReaderDestroy() is a formality. 
*/
typedef struct InteriorReader {
  const char *pData;
  int nData;

  sqlite_int64 iBlockid;
} InteriorReader;

static void interiorReaderDestroy(InteriorReader *pReader){
#ifndef NDEBUG
  memset(pReader, 0x55, sizeof(*pReader));
#endif
}

static void interiorReaderInit(const char *pData, int nData,
                               InteriorReader *pReader){
  int n;

  /* Require at least the leading flag byte */
  assert( nData>0 );
  assert( pData[0]!='\0' );

  memset(pReader, '\0', sizeof(*pReader));

  /* Decode the base blockid (skipping the height varint, which this
  ** reader assumes fits in a single byte), and set the cursor to the
  ** first term.
  */
  n = getVarint(pData+1, &pReader->iBlockid);
  assert( 1+n<=nData );
  pReader->pData = pData+1+n;
  pReader->nData = nData-(1+n);
}

static int interiorReaderAtEnd(InteriorReader *pReader){
  return pReader->nData<=0;
}

static sqlite_int64 interiorReaderCurrentBlockid(InteriorReader *pReader){
  return pReader->iBlockid;
}

static int interiorReaderTermBytes(InteriorReader *pReader){
  int nTerm;
  assert( !interiorReaderAtEnd(pReader) );
  getVarint32(pReader->pData, &nTerm);
  return nTerm;
}
static const char *interiorReaderTerm(InteriorReader *pReader){
  int n, nTerm;
  assert( !interiorReaderAtEnd(pReader) );
  n = getVarint32(pReader->pData, &nTerm);
  return pReader->pData+n;
}

/* Step forward to the next term in the node. */
static void interiorReaderStep(InteriorReader *pReader){
  int n, nTerm;
  assert( !interiorReaderAtEnd(pReader) );
  n = getVarint32(pReader->pData, &nTerm);
  assert( n+nTerm<=pReader->nData );
  pReader->pData += n+nTerm;
  pReader->nData -= n+nTerm;
  pReader->iBlockid++;
}

/* Compare the current term to pTerm[nTerm], returning strcmp-style
** results.
*/
static int interiorReaderTermCmp(InteriorReader *pReader,
                                 const char *pTerm, int nTerm){
  const char *pReaderTerm = interiorReaderTerm(pReader);
  int nReaderTerm = interiorReaderTermBytes(pReader);
  int c, n = nReaderTerm<nTerm ? nReaderTerm : nTerm;

  if( n==0 ){
    if( nReaderTerm>0 ) return -1;
    if( nTerm>0 ) return 1;
    return 0;
  }

  c = memcmp(pReaderTerm, pTerm, n);
  if( c!=0 ) return c;
  return nReaderTerm - nTerm;
}

/****************************************************************/
/* LeafWriter is used to collect terms and associated doclist data
** into leaf blocks in %_segments (see top of file for format info).
*/

/* Put terms with data this big in their own block. */
#define STANDALONE_MIN 1024

/* Keep leaf blocks below this size. */
#define LEAF_MAX 2048

typedef struct LeafWriter {
  int iLevel;
  int idx;
  sqlite_int64 iStartBlockid;     /* needed to create the root info */
  sqlite_int64 iEndBlockid;       /* when we're done writing. */

  char *pTerm;                    /* previous encoded term */
  int nTerm;

  char *pData;                    /* encoding buffer */
  int nData;

  InteriorWriter parentWriter;    /* if we overflow */
  int has_parent;
} LeafWriter;

static void leafWriterInit(int iLevel, int idx, LeafWriter *pWriter){
  memset(pWriter, 0, sizeof(*pWriter));
  pWriter->iLevel = iLevel;
  pWriter->idx = idx;

  /* Start out with a reasonably sized block, though it can grow. */
  pWriter->pData = malloc(LEAF_MAX);
  pWriter->nData = putVarint(pWriter->pData, 0);
}

/* Flush the current leaf node to %_segments, adding the resulting
** blockid and the starting term to the interior node which will
** contain it.
*/
static int leafWriterInternalFlush(fulltext_vtab *v, LeafWriter *pWriter){
  sqlite_int64 iBlockid = 0;
  const char *pStartingTerm;
  int nStartingTerm, rc, n;

  /* Must have the leading varint(0) flag, plus at least some data.
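  **
  ** For reference (a restatement of the leaf format, not new
  ** behavior), the buffer at this point is expected to look like:
  **
  **   00 varint(nTerm) term varint(nDoclist) doclist ...
  **
  ** so the leading flag byte plus a non-empty first term entry
  ** already accounts for more than two bytes.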
  */
  assert( pWriter->nData>2 );

  rc = block_insert(v, pWriter->pData, pWriter->nData, &iBlockid);
  if( rc!=SQLITE_OK ) return rc;
  assert( iBlockid!=0 );

  /* Reconstruct the first term in the leaf for purposes of building
  ** the interior node.
  */
  n = getVarint32(pWriter->pData+1, &nStartingTerm);
  pStartingTerm = pWriter->pData+1+n;
  assert( pWriter->nData>1+n+nStartingTerm );

  if( pWriter->has_parent ){
    interiorWriterAppend(&pWriter->parentWriter,
                         pStartingTerm, nStartingTerm, iBlockid);
  }else{
    interiorWriterInit(1, pStartingTerm, nStartingTerm, iBlockid,
                       &pWriter->parentWriter);
    pWriter->has_parent = 1;
  }

  /* Track the span of this segment's leaf nodes. */
  if( pWriter->iEndBlockid==0 ){
    pWriter->iEndBlockid = pWriter->iStartBlockid = iBlockid;
  }else{
    pWriter->iEndBlockid++;
    assert( iBlockid==pWriter->iEndBlockid );
  }

  /* Re-initialize the output buffer. */
  pWriter->nData = putVarint(pWriter->pData, 0);
  pWriter->nTerm = 0;

  return SQLITE_OK;
}

/* Fetch the root info for the segment.  If the entire leaf fits
** within ROOT_MAX, then it will be returned directly, otherwise it
** will be flushed and the root info will be returned from the
** interior node.  *piEndBlockid is set to the blockid of the last
** interior or leaf node written to disk (0 if none are written at
** all).
*/
static int leafWriterRootInfo(fulltext_vtab *v, LeafWriter *pWriter,
                              char **ppRootInfo, int *pnRootInfo,
                              sqlite_int64 *piEndBlockid){
  /* we can fit the segment entirely inline */
  if( !pWriter->has_parent && pWriter->nData<ROOT_MAX ){
    *ppRootInfo = pWriter->pData;
    *pnRootInfo = pWriter->nData;
    *piEndBlockid = 0;
    return SQLITE_OK;
  }

  /* Flush remaining leaf data. */
  if( pWriter->nData>1 ){
    int rc = leafWriterInternalFlush(v, pWriter);
    if( rc!=SQLITE_OK ) return rc;
  }

  /* We must have flushed a leaf at some point. */
  assert( pWriter->has_parent );

  /* Tentatively set the end leaf blockid as the end blockid.  If the
  ** interior node can be returned inline, this will be the final
  ** blockid, otherwise it will be overwritten by
  ** interiorWriterRootInfo().
  */
  *piEndBlockid = pWriter->iEndBlockid;

  return interiorWriterRootInfo(v, &pWriter->parentWriter,
                                ppRootInfo, pnRootInfo, piEndBlockid);
}

/* Collect the rootInfo data and store it into the segment directory.
** This has the effect of flushing the segment's leaf data to
** %_segments, and also flushing any interior nodes to %_segments.
*/
static int leafWriterFlush(fulltext_vtab *v, LeafWriter *pWriter){
  sqlite_int64 iEndBlockid;
  char *pRootInfo;
  int rc, nRootInfo;

  rc = leafWriterRootInfo(v, pWriter, &pRootInfo, &nRootInfo, &iEndBlockid);
  if( rc!=SQLITE_OK ) return rc;

  return segdir_set(v, pWriter->iLevel, pWriter->idx,
                    pWriter->iStartBlockid, pWriter->iEndBlockid,
                    iEndBlockid, pRootInfo, nRootInfo);
}

static void leafWriterDestroy(LeafWriter *pWriter){
  if( pWriter->has_parent ) interiorWriterDestroy(&pWriter->parentWriter);
  free(pWriter->pTerm);
  free(pWriter->pData);
}

/* Push pTerm[nTerm] along with the doclist data to the leaf layer of
** %_segments.
*/
static int leafWriterStep(fulltext_vtab *v, LeafWriter *pWriter,
                          const char *pTerm, int nTerm, DocList *doclist){
  char c[VARINT_MAX+VARINT_MAX];
  int rc, n;

  /* Flush existing data if this item won't fit well.
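  **
  ** An illustrative reading of the thresholds (not a spec): with the
  ** defaults above, a term whose doclist plus term bytes exceed
  ** STANDALONE_MIN (1024) always gets a leaf block to itself, while
  ** smaller entries accumulate until the block would pass LEAF_MAX
  ** (2048).  So a single 1500-byte doclist forces a flush, but ten
  ** 150-byte entries can share one block.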
  */
  if( pWriter->nData>1 &&
      (doclist->nData+nTerm>STANDALONE_MIN ||
       pWriter->nData+doclist->nData+nTerm>LEAF_MAX) ){
    rc = leafWriterInternalFlush(v, pWriter);
    if( rc!=SQLITE_OK ) return rc;
  }

  if( pWriter->nTerm==0 ){
    /* Encode the entire leading term as:
    **  varint(nTerm)
    **  char pTerm[nTerm]
    */
    n = putVarint(c, nTerm);
    assert( pWriter->nData==1 );
    data_append2(&pWriter->pData, &pWriter->nData,
                 c, n, pTerm, nTerm);
  }else{
    /* Delta-encode the term as:
    **  varint(nPrefix)
    **  varint(nSuffix)
    **  char pTermSuffix[nSuffix]
    */
    int nPrefix = 0;

    while( nPrefix<nTerm && nPrefix<pWriter->nTerm &&
           pTerm[nPrefix]==pWriter->pTerm[nPrefix] ){
      nPrefix++;
    }

    n = putVarint(c, nPrefix);
    n += putVarint(c+n, nTerm-nPrefix);

    data_append2(&pWriter->pData, &pWriter->nData,
                 c, n, pTerm+nPrefix, nTerm-nPrefix);
  }
  data_replace(&pWriter->pTerm, &pWriter->nTerm, pTerm, nTerm);

  /* Encode the doclist as:
  **  varint(nDoclist)
  **  char pDoclist[nDoclist]
  */
  n = putVarint(c, doclist->nData);
  data_append2(&pWriter->pData, &pWriter->nData,
               c, n, doclist->pData, doclist->nData);

  /* Flush standalone blocks right out */
  if( doclist->nData+nTerm>STANDALONE_MIN ){
    rc = leafWriterInternalFlush(v, pWriter);
    if( rc!=SQLITE_OK ) return rc;
  }

  return SQLITE_OK;
}

/****************************************************************/
/* LeafReader is used to iterate over an individual leaf node. */
typedef struct LeafReader {
  char *pTerm;          /* copy of current term. */
  int nTerm;

  const char *pData;    /* data for current term. */
  int nData;
} LeafReader;

static void leafReaderDestroy(LeafReader *pReader){
  free(pReader->pTerm);
#ifndef NDEBUG
  memset(pReader, 0x55, sizeof(*pReader));
#endif
}

static int leafReaderAtEnd(LeafReader *pReader){
  return pReader->nData<=0;
}

/* Access the current term. */
static int leafReaderTermBytes(LeafReader *pReader){
  return pReader->nTerm;
}
static const char *leafReaderTerm(LeafReader *pReader){
  assert( pReader->nTerm>0 );
  return pReader->pTerm;
}

/* Access the doclist data for the current term. */
static int leafReaderDataBytes(LeafReader *pReader){
  int nData;
  assert( pReader->nTerm>0 );
  getVarint32(pReader->pData, &nData);
  return nData;
}
static const char *leafReaderData(LeafReader *pReader){
  int n, nData;
  assert( pReader->nTerm>0 );
  n = getVarint32(pReader->pData, &nData);
  return pReader->pData+n;
}

static void leafReaderInit(const char *pData, int nData,
                           LeafReader *pReader){
  int nTerm, n;

  assert( nData>0 );
  assert( pData[0]=='\0' );

  memset(pReader, '\0', sizeof(*pReader));

  /* Read the first term, skipping the header byte. */
  n = getVarint32(pData+1, &nTerm);
  data_dup(&pReader->pTerm, &pReader->nTerm, pData+1+n, nTerm);

  /* Position after the first term. */
  assert( 1+n+nTerm<nData );
  pReader->pData = pData+1+n+nTerm;
  pReader->nData = nData-1-n-nTerm;
}

/* Step the reader forward to the next term. */
static void leafReaderStep(LeafReader *pReader){
  int n, nData, nPrefix, nSuffix;
  assert( !leafReaderAtEnd(pReader) );

  /* Skip previous entry's data block. */
  n = getVarint32(pReader->pData, &nData);
  assert( n+nData<=pReader->nData );
  pReader->pData += n+nData;
  pReader->nData -= n+nData;

  if( !leafReaderAtEnd(pReader) ){
    /* Construct the new term using a prefix from the old term plus a
    ** suffix from the leaf data.
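    **
    ** A worked example (hypothetical bytes, mirroring the encoding
    ** in leafWriterStep() above): if the previous term was "linden"
    ** and the next entry begins
    **
    **   03 02 'u' 'x'
    **
    ** then nPrefix=3 and nSuffix=2, so the new term is the shared
    ** prefix "lin" plus the suffix "ux", i.e. "linux".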
    */
    n = getVarint32(pReader->pData, &nPrefix);
    n += getVarint32(pReader->pData+n, &nSuffix);

    assert( n+nSuffix<pReader->nData );
    pReader->nTerm = nPrefix;
    data_append(&pReader->pTerm, &pReader->nTerm,
                pReader->pData+n, nSuffix);

    pReader->pData += n+nSuffix;
    pReader->nData -= n+nSuffix;
  }
}

/* strcmp-style comparison of pReader's current term against pTerm. */
static int leafReaderTermCmp(LeafReader *pReader,
                             const char *pTerm, int nTerm){
  int c, n = pReader->nTerm<nTerm ? pReader->nTerm : nTerm;
  if( n==0 ){
    if( pReader->nTerm>0 ) return -1;
    if( nTerm>0 ) return 1;
    return 0;
  }

  c = memcmp(pReader->pTerm, pTerm, n);
  if( c!=0 ) return c;
  return pReader->nTerm - nTerm;
}

/****************************************************************/
/* LeavesReader wraps LeafReader to allow iterating over the entire
** leaf layer of the tree.
*/
typedef struct LeavesReader {
  int idx;                  /* Index within the segment. */

  sqlite3_stmt *pStmt;      /* Statement we're streaming leaves from. */
  int eof;                  /* we've seen SQLITE_DONE from pStmt. */

  LeafReader leafReader;    /* reader for the current leaf. */
  char *pRootData;          /* root data for inline. */
} LeavesReader;

/* Access the current term. */
static int leavesReaderTermBytes(LeavesReader *pReader){
  assert( !pReader->eof );
  return leafReaderTermBytes(&pReader->leafReader);
}
static const char *leavesReaderTerm(LeavesReader *pReader){
  assert( !pReader->eof );
  return leafReaderTerm(&pReader->leafReader);
}

/* Access the doclist data for the current term. */
static int leavesReaderDataBytes(LeavesReader *pReader){
  assert( !pReader->eof );
  return leafReaderDataBytes(&pReader->leafReader);
}
static const char *leavesReaderData(LeavesReader *pReader){
  assert( !pReader->eof );
  return leafReaderData(&pReader->leafReader);
}

static int leavesReaderAtEnd(LeavesReader *pReader){
  return pReader->eof;
}

static void leavesReaderDestroy(LeavesReader *pReader){
  leafReaderDestroy(&pReader->leafReader);
  if( pReader->pRootData!=0 ) free(pReader->pRootData);
#ifndef NDEBUG
  memset(pReader, 0x55, sizeof(*pReader));
#endif
}

/* Initialize pReader with the given root data (if iStartBlockid==0
** the leaf data was entirely contained in the root), or from the
** stream of blocks between iStartBlockid and iEndBlockid, inclusive.
*/
static int leavesReaderInit(fulltext_vtab *v,
                            int idx,
                            sqlite_int64 iStartBlockid,
                            sqlite_int64 iEndBlockid,
                            const char *pRootData, int nRootData,
                            LeavesReader *pReader){
  memset(pReader, 0, sizeof(*pReader));
  pReader->idx = idx;

  if( iStartBlockid==0 ){
    /* Entire leaf level fit in root data. */
    int n;
    data_dup(&pReader->pRootData, &n, pRootData, nRootData);
    leafReaderInit(pReader->pRootData, nRootData, &pReader->leafReader);
  }else{
    sqlite3_stmt *s;
    int rc = sql_get_leaf_statement(v, idx, &s);
    if( rc!=SQLITE_OK ) return rc;

    rc = sqlite3_bind_int64(s, 1, iStartBlockid);
    if( rc!=SQLITE_OK ) return rc;

    rc = sqlite3_bind_int64(s, 2, iEndBlockid);
    if( rc!=SQLITE_OK ) return rc;

    rc = sql_step_leaf_statement(v, idx, &s);
    if( rc==SQLITE_DONE ){
      pReader->eof = 1;
      return SQLITE_OK;
    }
    if( rc!=SQLITE_ROW ) return rc;

    pReader->pStmt = s;
    leafReaderInit(sqlite3_column_blob(pReader->pStmt, 0),
                   sqlite3_column_bytes(pReader->pStmt, 0),
                   &pReader->leafReader);
  }
  return SQLITE_OK;
}

/* Step the current leaf forward to the next term.  If we reach the
** end of the current leaf, step forward to the next leaf block.
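**
** (A note on the streaming model, as a reading of the code rather
** than a spec: leafWriterInternalFlush() assigns a segment's leaf
** nodes consecutive blockids, so the statement bound in
** leavesReaderInit() can walk the blocks in [iStartBlockid,
** iEndBlockid] and hand each one to leafReaderInit() in term order.)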
*/
static int leavesReaderStep(fulltext_vtab *v, LeavesReader *pReader){
  assert( !leavesReaderAtEnd(pReader) );
  leafReaderStep(&pReader->leafReader);

  if( leafReaderAtEnd(&pReader->leafReader) ){
    int rc;
    if( pReader->pRootData ){
      pReader->eof = 1;
      return SQLITE_OK;
    }
    rc = sql_step_leaf_statement(v, pReader->idx, &pReader->pStmt);
    if( rc!=SQLITE_ROW ){
      pReader->eof = 1;
      return rc==SQLITE_DONE ? SQLITE_OK : rc;
    }
    leafReaderDestroy(&pReader->leafReader);
    leafReaderInit(sqlite3_column_blob(pReader->pStmt, 0),
                   sqlite3_column_bytes(pReader->pStmt, 0),
                   &pReader->leafReader);
  }
  return SQLITE_OK;
}

/* Order LeavesReaders by their term, ignoring idx.  Readers at eof
** always sort to the end.
*/
static int leavesReaderTermCmp(LeavesReader *lr1, LeavesReader *lr2){
  if( leavesReaderAtEnd(lr1) ){
    if( leavesReaderAtEnd(lr2) ) return 0;
    return 1;
  }
  if( leavesReaderAtEnd(lr2) ) return -1;

  return leafReaderTermCmp(&lr1->leafReader,
                           leavesReaderTerm(lr2), leavesReaderTermBytes(lr2));
}

/* Similar to leavesReaderTermCmp(), with additional ordering by idx
** so that older segments sort before newer segments.
*/
static int leavesReaderCmp(LeavesReader *lr1, LeavesReader *lr2){
  int c = leavesReaderTermCmp(lr1, lr2);
  if( c!=0 ) return c;

  return lr1->idx-lr2->idx;
}

/* Assume that pLr[1]..pLr[nLr-1] are sorted.  Bubble pLr[0] into its
** sorted position.
*/
static void leavesReaderReorder(LeavesReader *pLr, int nLr){
  while( nLr>1 && leavesReaderCmp(pLr, pLr+1)>0 ){
    LeavesReader tmp = pLr[0];
    pLr[0] = pLr[1];
    pLr[1] = tmp;
    nLr--;
    pLr++;
  }
}

/* Initializes pReaders with the segments from level iLevel, returning
** the number of segments in *piReaders.  Leaves pReaders in sorted
** order.
*/
static int leavesReadersInit(fulltext_vtab *v, int iLevel,
                             LeavesReader *pReaders, int *piReaders){
  sqlite3_stmt *s;
  int i, rc = sql_get_statement(v, SEGDIR_SELECT_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  rc = sqlite3_bind_int(s, 1, iLevel);
  if( rc!=SQLITE_OK ) return rc;

  i = 0;
  while( (rc = sql_step_statement(v, SEGDIR_SELECT_STMT, &s))==SQLITE_ROW ){
    sqlite_int64 iStart = sqlite3_column_int64(s, 0);
    sqlite_int64 iEnd = sqlite3_column_int64(s, 1);
    const char *pRootData = sqlite3_column_blob(s, 2);
    int nRootData = sqlite3_column_bytes(s, 2);

    assert( i<MERGE_COUNT );
    rc = leavesReaderInit(v, i, iStart, iEnd, pRootData, nRootData,
                          &pReaders[i]);
    if( rc!=SQLITE_OK ) break;

    i++;
  }
  if( rc!=SQLITE_DONE ){
    while( i-->0 ){
      leavesReaderDestroy(&pReaders[i]);
    }
    return rc;
  }

  *piReaders = i;

  /* Leave our results sorted by term, then age. */
  while( i-- ){
    leavesReaderReorder(pReaders+i, *piReaders-i);
  }

  return SQLITE_OK;
}

/* Merge doclists from pReaders[nReaders] into a single doclist, which
** is written to pWriter.  Assumes pReaders is ordered oldest to
** newest.
*/
/* TODO(shess) I have a version of this that merges the doclists
** pairwise, and is thus much faster, but is also more intricate.  So
** I'll throw that in as a standalone change.  N-way merge would be
** even faster.
*/
static int leavesReadersMerge(fulltext_vtab *v,
                              LeavesReader *pReaders, int nReaders,
                              LeafWriter *pWriter){
  const char *pTerm = leavesReaderTerm(pReaders);
  int i, rc, nTerm = leavesReaderTermBytes(pReaders);
  DocList doclist;

  /* No need to merge, insert directly.
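  **
  ** (An illustrative note on merge semantics, inferred from the
  ** ordering contract above: because pReaders runs oldest to newest,
  ** each docListMerge() below lets the newer doclist win for any
  ** docid present in both.  E.g. if an older segment has hits for
  ** docid 7 and a newer segment carries an empty doclist for docid 7
  ** recording its deletion, the newer, empty entry survives.)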
  */
  if( nReaders==1 ){
    docListStaticInit(&doclist, DL_DEFAULT,
                      leavesReaderData(pReaders),
                      leavesReaderDataBytes(pReaders));
  }else{
    docListInit(&doclist, DL_DEFAULT,
                leavesReaderData(pReaders),
                leavesReaderDataBytes(pReaders));

    for(i=1; i<nReaders; i++){
      DocList new, merged;
      docListStaticInit(&new, DL_DEFAULT,
                        leavesReaderData(pReaders+i),
                        leavesReaderDataBytes(pReaders+i));
      docListMerge(&merged, &doclist, &new);
      docListDestroy(&doclist);
      doclist = merged;
    }
  }

  /* Insert the new doclist */
  rc = leafWriterStep(v, pWriter, pTerm, nTerm, &doclist);

  if( nReaders>1 ) docListDestroy(&doclist);
  return rc;
}

/* Forward ref due to mutual recursion with segdirNextIndex(). */
static int segmentMerge(fulltext_vtab *v, int iLevel);

/* Put the next available index at iLevel into *pidx.  If iLevel
** already has MERGE_COUNT segments, they are merged to a higher
** level to make room.
*/
static int segdirNextIndex(fulltext_vtab *v, int iLevel, int *pidx){
  int rc = segdir_max_index(v, iLevel, pidx);
  if( rc==SQLITE_DONE ){              /* No segments at iLevel. */
    *pidx = 0;
  }else if( rc==SQLITE_ROW ){
    if( *pidx==(MERGE_COUNT-1) ){
      rc = segmentMerge(v, iLevel);
      if( rc!=SQLITE_OK ) return rc;
      *pidx = 0;
    }else{
      (*pidx)++;
    }
  }else{
    return rc;
  }
  return SQLITE_OK;
}

/* Merge MERGE_COUNT segments at iLevel into a new segment at
** iLevel+1.  If iLevel+1 is already full of segments, those will be
** merged to make room.
*/
static int segmentMerge(fulltext_vtab *v, int iLevel){
  LeafWriter writer;
  LeavesReader lrs[MERGE_COUNT];
  int i, rc, idx = 0;

  /* Determine the next available segment index at the next level,
  ** merging as necessary.
  */
  rc = segdirNextIndex(v, iLevel+1, &idx);
  if( rc!=SQLITE_OK ) return rc;

  /* TODO(shess) This assumes that we'll always see exactly
  ** MERGE_COUNT segments to merge at a given level.  That will be
  ** broken if we allow the developer to request preemptive or
  ** deferred merging.
  */
  memset(&lrs, '\0', sizeof(lrs));
  rc = leavesReadersInit(v, iLevel, lrs, &i);
  if( rc!=SQLITE_OK ) return rc;
  assert( i==MERGE_COUNT );

  leafWriterInit(iLevel+1, idx, &writer);

  /* Since leavesReaderReorder() pushes readers at eof to the end,
  ** when the first reader is empty, all will be empty.
  */
  while( !leavesReaderAtEnd(lrs) ){
    /* Figure out how many readers share their next term. */
    for(i=1; i<MERGE_COUNT && !leavesReaderAtEnd(lrs+i); i++){
      if( 0!=leavesReaderTermCmp(lrs, lrs+i) ) break;
    }

    rc = leavesReadersMerge(v, lrs, i, &writer);
    if( rc!=SQLITE_OK ) goto err;

    /* Step forward those that were merged. */
    while( i-->0 ){
      rc = leavesReaderStep(v, lrs+i);
      if( rc!=SQLITE_OK ) goto err;

      /* Reorder by term, then by age. */
      leavesReaderReorder(lrs+i, MERGE_COUNT-i);
    }
  }

  for(i=0; i<MERGE_COUNT; i++){
    leavesReaderDestroy(&lrs[i]);
  }

  rc = leafWriterFlush(v, &writer);
  leafWriterDestroy(&writer);
  if( rc!=SQLITE_OK ) return rc;

  /* Delete the merged segment data. */
  return segdir_delete(v, iLevel);

 err:
  for(i=0; i<MERGE_COUNT; i++){
    leavesReaderDestroy(&lrs[i]);
  }
  leafWriterDestroy(&writer);
  return rc;
}

/* Read pData[nData] as a leaf node, and if the doclist for
** pTerm[nTerm] is present, merge it over *out (any duplicate doclists
** read from pData will overwrite those in *out).
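**
** (A sketch of the scan below, restating rather than extending it:
** leaf terms appear in ascending order, so the loop can stop as soon
** as the current term compares greater than or equal to pTerm; on an
** exact match it first folds the matching doclist into *out.)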
*/
static int loadSegmentLeaf(fulltext_vtab *v,
                           const char *pData, int nData,
                           const char *pTerm, int nTerm, DocList *out){
  LeafReader reader;
  assert( nData>1 );
  assert( *pData=='\0' );
  leafReaderInit(pData, nData, &reader);
  while( !leafReaderAtEnd(&reader) ){
    int c = leafReaderTermCmp(&reader, pTerm, nTerm);
    if( c==0 ){
      DocList new, doclist;
      docListStaticInit(&new, DL_DEFAULT,
                        leafReaderData(&reader), leafReaderDataBytes(&reader));
      docListMerge(&doclist, out, &new);
      docListDestroy(out);
      *out = doclist;
    }
    if( c>=0 ) break;
    leafReaderStep(&reader);
  }
  leafReaderDestroy(&reader);
  return SQLITE_OK;
}

/* Traverse the tree represented by pData[nData] looking for
** pTerm[nTerm], merging its doclist over *out if found (any duplicate
** doclists read from the segment rooted at pData will overwrite those
** in *out).
*/
static int loadSegment(fulltext_vtab *v,
                       const char *pData, int nData,
                       const char *pTerm, int nTerm, DocList *out){
  int rc;
  sqlite3_stmt *s = NULL;

  assert( nData>1 );

  /* Process data as an interior node until we reach a leaf. */
  while( *pData!='\0' ){
    sqlite_int64 iBlockid;
    InteriorReader reader;

    /* Scan the node data until we find a term greater than our term.
    ** Our target child will be in the blockid under that term, or in
    ** the last blockid in the node if we never find such a term.
    */
    interiorReaderInit(pData, nData, &reader);
    while( !interiorReaderAtEnd(&reader) ){
      if( interiorReaderTermCmp(&reader, pTerm, nTerm)>0 ) break;
      interiorReaderStep(&reader);
    }

    /* Grab the child blockid before calling sql_get_statement(),
    ** because sql_get_statement() may reset our data out from under
    ** us.
    */
    iBlockid = interiorReaderCurrentBlockid(&reader);
    interiorReaderDestroy(&reader);

    rc = sql_get_statement(v, BLOCK_SELECT_STMT, &s);
    if( rc!=SQLITE_OK ) return rc;

    rc = sqlite3_bind_int64(s, 1, iBlockid);
    if( rc!=SQLITE_OK ) return rc;

    rc = sql_step_statement(v, BLOCK_SELECT_STMT, &s);
    if( rc==SQLITE_DONE ) return SQLITE_ERROR;
    if( rc!=SQLITE_ROW ) return rc;

    pData = sqlite3_column_blob(s, 0);
    nData = sqlite3_column_bytes(s, 0);
  }

  rc = loadSegmentLeaf(v, pData, nData, pTerm, nTerm, out);
  if( rc!=SQLITE_OK ) return rc;

  /* If we selected a child node, we need to finish that select. */
  if( s!=NULL ){
    /* We expect only one row.  We must execute another sqlite3_step()
     * to complete the iteration; otherwise the table will remain
     * locked. */
    rc = sqlite3_step(s);
    if( rc==SQLITE_ROW ) return SQLITE_ERROR;
    if( rc!=SQLITE_DONE ) return rc;
  }

  return SQLITE_OK;
}

/* Scan the database and merge together the posting lists for the term
** into *out.
*/
static int termSelect(fulltext_vtab *v, int iColumn,
                      const char *pTerm, int nTerm, DocList *out){
  DocList doclist;
  sqlite3_stmt *s;
  int rc = sql_get_statement(v, SEGDIR_SELECT_ALL_STMT, &s);
  if( rc!=SQLITE_OK ) return rc;

  docListInit(&doclist, DL_DEFAULT, 0, 0);

  /* Traverse the segments from oldest to newest so that newer doclist
  ** elements for given docids overwrite older elements.
  */
  while( (rc=sql_step_statement(v, SEGDIR_SELECT_ALL_STMT, &s))==SQLITE_ROW ){
    rc = loadSegment(v, sqlite3_column_blob(s, 0), sqlite3_column_bytes(s, 0),
                     pTerm, nTerm, &doclist);
    if( rc!=SQLITE_OK ) goto err;
  }
  if( rc==SQLITE_DONE ){
    *out = doclist;

    /* TODO(shess) The old term_select_all() code applied the column
    ** restrict as we merged segments, leading to smaller buffers.
    ** This is probably worthwhile to bring back, once the new storage
    ** system is checked in.
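    **
    ** (A hypothetical illustration: a query restricted to column 1
    ** currently merges full multi-column doclists and only applies
    ** docListRestrictColumn() below, after all segments are read;
    ** restricting inside loadSegment() would keep the intermediate
    ** buffers proportional to the single column of interest.)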
    */
    if( iColumn<v->nColumn ){   /* querying a single column */
      docListRestrictColumn(out, iColumn);
    }
    docListDiscardEmpty(out);
    return SQLITE_OK;
  }

 err:
  docListDestroy(&doclist);
  return rc;
}

/****************************************************************/
/* Used to hold hashtable data for sorting. */
typedef struct TermData {
  const char *pTerm;
  int nTerm;
  DocList *pDoclist;
} TermData;

/* Orders TermData elements in strcmp fashion (<0 for less-than, 0
** for equal, >0 for greater-than).
*/
static int termDataCmp(const void *av, const void *bv){
  const TermData *a = (const TermData *)av;
  const TermData *b = (const TermData *)bv;
  int n = a->nTerm<b->nTerm ? a->nTerm : b->nTerm;
  int c = memcmp(a->pTerm, b->pTerm, n);
  if( c!=0 ) return c;
  return a->nTerm-b->nTerm;
}

/* Order pTerms data by term, then write a new level 0 segment using
** LeafWriter.
*/
static int writeZeroSegment(fulltext_vtab *v, fts2Hash *pTerms){
  fts2HashElem *e;
  int idx, rc, i, n;
  TermData *pData;
  LeafWriter writer;

  /* Determine the next index at level 0, merging as necessary. */
  rc = segdirNextIndex(v, 0, &idx);
  if( rc!=SQLITE_OK ) return rc;

  n = fts2HashCount(pTerms);
  pData = malloc(n*sizeof(TermData));

  for(i = 0, e = fts2HashFirst(pTerms); e; i++, e = fts2HashNext(e)){
    assert( i<n );
    pData[i].pTerm = fts2HashKey(e);
    pData[i].nTerm = fts2HashKeysize(e);
    pData[i].pDoclist = fts2HashData(e);
  }
  assert( i==n );

  /* TODO(shess) Should we allow user-defined collation sequences,
  ** here?  I think we only need that once we support prefix searches.
  */
  if( n>1 ) qsort(pData, n, sizeof(*pData), termDataCmp);

  leafWriterInit(0, idx, &writer);
  for(i=0; i<n; i++){
    rc = leafWriterStep(v, &writer,
                        pData[i].pTerm, pData[i].nTerm, pData[i].pDoclist);
    if( rc!=SQLITE_OK ) goto err;
  }
  rc = leafWriterFlush(v, &writer);

 err:
  free(pData);
  leafWriterDestroy(&writer);
  return rc;
}

/* This function implements the xUpdate callback; it's the top-level entry
 * point for inserting, deleting or updating a row in a full-text table. */
static int fulltextUpdate(sqlite3_vtab *pVtab, int nArg, sqlite3_value **ppArg,
                          sqlite_int64 *pRowid){
  fulltext_vtab *v = (fulltext_vtab *) pVtab;
  fts2Hash terms;   /* maps term string -> PosList */
︙ | ︙ | |||
     * ppArg[2..2+v->nColumn-1] = values
     * ppArg[2+v->nColumn] = value for magic column (we ignore this)
     */
    assert( nArg==2+v->nColumn+1 );

    rc = index_insert(v, ppArg[1], &ppArg[2], pRowid, &terms);
  }

  if( rc==SQLITE_OK ) rc = writeZeroSegment(v, &terms);

  /* clean up */
  for(e=fts2HashFirst(&terms); e; e=fts2HashNext(e)){
    DocList *p = fts2HashData(e);
    docListDelete(p);
  }
  fts2HashClear(&terms);
︙ | ︙ |