SQLite4
Check-in [f3ac136843]
Overview
Comment: Allow an fts5 tokenizer to split a single document into multiple streams (i.e. sub-fields within a single column value). Modify the matchinfo APIs so that a ranking function may handle streams and/or columns separately or otherwise.
SHA1: f3ac136843205f618826cb50635631dbf238e2bd
User & Date: dan 2013-01-04 18:37:37
Context
2013-01-07 19:52  Add an implementation of snippet() and its associated mi APIs to fts5. check-in: 8d94102cd3 user: dan tags: matchinfo
2013-01-04 18:37  Allow an fts5 tokenizer to split a single document into multiple streams (i.e. sub-fields within a single column value). Modify the matchinfo APIs so that a ranking function may handle streams and/or columns separately or otherwise. check-in: f3ac136843 user: dan tags: matchinfo
2013-01-03 20:35  Add comment describing format of row and global size records. check-in: 7cfa40b5c1 user: dan tags: matchinfo
Changes

Changes to src/fts5.c.

*/

#include "sqliteInt.h"
#include "vdbeInt.h"

/* 
** Stream numbers must be lower than this.
*/
#define SQLITE4_FTS5_NSTREAM 60

/*
** Records stored within the index:
**
** Row size record:
**   There is one "row size" record in the index for each row in the
**   indexed table. The "row size" record contains the number of tokens
................................................................................
**
**   The data for this record is a series of varint values. The first 
**   varint is the total number of rows in the table. The subsequent
**   varints make up a "row size" record containing the total number of
**   tokens for each S/C combination in all rows of the table.
**
** FTS index records:
**
**   The FTS index records implement the following mapping:
**
**       (token, document-pk) -> (list of instances)
*/

/*
** Default distance value for NEAR operators.
*/
#define FTS5_DEFAULT_NEAR 10

................................................................................
typedef struct Fts5Expr Fts5Expr;
typedef struct Fts5ExprNode Fts5ExprNode;
typedef struct Fts5List Fts5List;
typedef struct Fts5Parser Fts5Parser;
typedef struct Fts5ParserToken Fts5ParserToken;
typedef struct Fts5Phrase Fts5Phrase;
typedef struct Fts5Prefix Fts5Prefix;

typedef struct Fts5Str Fts5Str;
typedef struct Fts5Token Fts5Token;


struct Fts5ParserToken {
  int eType;                      /* Token type */
  int n;                          /* Size of z[] in bytes */
................................................................................
  char *zExpr;                    /* Full text of MATCH expression */
  KVByteArray *aKey;              /* Buffer for primary key */
  int nKeyAlloc;                  /* Bytes allocated at aKey[] */

  KVCursor *pCsr;                 /* Cursor used to retrieve values */
  Mem *aMem;                      /* Array of column values */

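  /* Array of nPhrase*nCol integers. See sqlite4_mi_row_count() for details. */
  int *anRow;

  /* Global record: aGlobal[0] is the row count, aGlobal[i+1] the total
  ** number of tokens in column i (see fts5LoadGlobal()). */
  i64 *aGlobal;

  /* Size of each column of current row (in tokens). */
  int bSzValid;
  int *aSz;
};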

/*
** This type is used when reading (decoding) an instance-list.
*/
typedef struct InstanceList InstanceList;
struct InstanceList {
  u8 *aList;
  int nList;
  int iList;

  /* The current entry */
  int iCol;
  int iWeight;
  int iOff;
};

/*
** Return true for EOF, or false if the next entry is valid.
*/
static int fts5InstanceListNext(InstanceList *p){
................................................................................
    u32 iVal;
    i += getVarint32(&p->aList[i], iVal);
    if( (iVal & 0x03)==0x01 ){
      p->iCol = (iVal>>2);
      p->iOff = 0;
    }
    else if( (iVal & 0x03)==0x03 ){
      p->iWeight = (iVal>>2);
    }
    else{
      p->iOff += (iVal>>1);
      bRet = 0;
    }
  }
  if( bRet ){
................................................................................
static int fts5InstanceListEof(InstanceList *p){
  return (p->aList==0);
}

static void fts5InstanceListAppend(
  InstanceList *p,                /* Instance list to append to */
  int iCol,                       /* Column of new entry */
  int iWeight,                    /* Weight of new entry */
  int iOff                        /* Offset of new entry */
){
  assert( iCol>=p->iCol );
  assert( iCol>p->iCol || iOff>=p->iOff );

  if( iCol!=p->iCol ){
    p->iList += putVarint32(&p->aList[p->iList], (iCol<<2)|0x01);
    p->iCol = iCol;
    p->iOff = 0;
  }

  if( iWeight!=p->iWeight ){
    p->iList += putVarint32(&p->aList[p->iList], (iWeight<<2)|0x03);
    p->iWeight = iWeight;
  }

  p->iList += putVarint32(&p->aList[p->iList], (iOff-p->iOff)<<1);
  p->iOff = iOff;

  assert( p->iList<=p->nList );
}
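
The instance-list format used by fts5InstanceListAppend() and fts5InstanceListNext() above can be tried out in isolation. The following is a minimal sketch, not the code from this check-in: the `List` type, `list_append()`, `list_rewind()` and `list_next()` are hypothetical names, the writer is assumed to start with column, weight and offset all zero, and the varint helpers use a simple 7-bits-per-byte stand-in encoding rather than the SQLite4 varint format. Only the tagging scheme is taken from the source: low bits 01 mark a column change, 11 a weight change, and an even value carries an offset delta shifted left by one.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in varint: 7 bits per byte, low bits first. This is an assumption
** made for the sketch; it is NOT the SQLite4 varint encoding. */
static int put_varint(uint8_t *a, uint32_t v){
  int n = 0;
  while( v>=0x80 ){ a[n++] = (uint8_t)(v | 0x80); v >>= 7; }
  a[n++] = (uint8_t)v;
  return n;
}
static int get_varint(const uint8_t *a, uint32_t *pv){
  uint32_t v = 0;
  int n = 0, shift = 0;
  do{
    v |= (uint32_t)(a[n] & 0x7f) << shift;
    shift += 7;
  }while( a[n++] & 0x80 );
  *pv = v;
  return n;
}

/* Hypothetical combined writer/reader state */
typedef struct List List;
struct List {
  uint8_t a[64];                  /* Encoded bytes */
  int n;                          /* Bytes written to a[] */
  int i;                          /* Read position within a[] */
  int iCol, iWeight, iOff;        /* Current entry */
};

/* Append one (column, weight, offset) entry. Entries must be added in
** ascending (column, offset) order. */
static void list_append(List *p, int iCol, int iWeight, int iOff){
  assert( iCol>p->iCol || (iCol==p->iCol && iOff>=p->iOff) );
  if( iCol!=p->iCol ){
    p->n += put_varint(&p->a[p->n], ((uint32_t)iCol<<2) | 0x01);
    p->iCol = iCol;
    p->iOff = 0;                  /* Offsets restart in each column */
  }
  if( iWeight!=p->iWeight ){
    p->n += put_varint(&p->a[p->n], ((uint32_t)iWeight<<2) | 0x03);
    p->iWeight = iWeight;
  }
  p->n += put_varint(&p->a[p->n], (uint32_t)(iOff - p->iOff) << 1);
  p->iOff = iOff;
  assert( p->n<=(int)sizeof(p->a) );
}

/* Reset the state so that the list can be read from the start */
static void list_rewind(List *p){
  p->i = 0;
  p->iCol = p->iWeight = p->iOff = 0;
}

/* Decode the next entry. Return 1 at EOF, or 0 if an entry was read. */
static int list_next(List *p){
  while( p->i<p->n ){
    uint32_t v;
    p->i += get_varint(&p->a[p->i], &v);
    if( (v & 0x03)==0x01 ){
      p->iCol = (int)(v>>2);
      p->iOff = 0;
    }else if( (v & 0x03)==0x03 ){
      p->iWeight = (int)(v>>2);
    }else{
      p->iOff += (int)(v>>1);
      return 0;
    }
  }
  return 1;
}
```

Appending the entries (0,0,3), (0,0,7) and (2,1,5) to a zeroed `List` and then calling `list_rewind()` followed by repeated `list_next()` yields the same three entries back. Because offsets are stored as deltas and column or weight markers are written only when the value changes, runs of instances in one column cost one small varint each.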
................................................................................
}

/*
** Callback for fts5CountTokens().
*/
static int fts5CountTokensCb(
  void *pCtx, 
  int iWeight, 
  int iOff, 
  const char *z, int n,
  int iSrc, int nSrc
){
  (*((int *)pCtx))++;
  return 0;
}
................................................................................
struct AppendTokensCtx {
  Fts5Parser *pParse;
  Fts5Str *pStr;
};

static int fts5AppendTokensCb(
  void *pCtx, 
  int iWeight, 
  int iOff, 
  const char *z, int n, 
  int iSrc, int nSrc
){
  struct AppendTokensCtx *p = (struct AppendTokensCtx *)pCtx;
  Fts5Parser *pParse = p->pParse;
  Fts5Token *pToken;
................................................................................
** sqlite4DbRealloc().
*/
typedef struct TokenizeCtx TokenizeCtx;
typedef struct TokenizeTerm TokenizeTerm;
struct TokenizeCtx {
  int rc;
  int iCol;

  sqlite4 *db;
  int nMax;
  int *aSz;                       /* Number of tokens in each column */

  Hash hash;
};
struct TokenizeTerm {
  int iWeight;                    /* Weight of previous entry */
  int iCol;                       /* Column containing previous entry */
  int iOff;                       /* Token offset of previous entry */
  int nToken;                     /* Size of token in bytes */
  int nData;                      /* Bytes of data in value */
  int nAlloc;                     /* Bytes of data allocated */
};

................................................................................
  a = &(((unsigned char *)&pTerm[1])[pTerm->nToken+pTerm->nData]);
  pTerm->nData += putVarint32(a, iVal);
  return pTerm;
}

static int fts5TokenizeCb(
  void *pCtx, 
  int iWeight, 
  int iOff,
  const char *zToken, 
  int nToken, 
  int iSrc, 
  int nSrc
){
  TokenizeCtx *p = (TokenizeCtx *)pCtx;

  TokenizeTerm *pTerm = 0;
  TokenizeTerm *pOrig = 0;

  if( nToken>p->nMax ) p->nMax = nToken;
  p->aSz[p->iCol]++;

  pTerm = (TokenizeTerm *)sqlite4HashFind(&p->hash, zToken, nToken);
  if( pTerm==0 ){
    /* Size the initial allocation so that it fits in the lookaside buffer */
    int nAlloc = sizeof(TokenizeTerm) + nToken + 32;

    pTerm = sqlite4DbMallocZero(p->db, nAlloc);
................................................................................
        pTerm = 0;
      }
      if( pTerm==0 ) goto tokenize_cb_out;
    }
  }
  pOrig = pTerm;

  if( iWeight!=pTerm->iWeight ){
    pTerm = fts5TokenizeAppendInt(p, pTerm, (iWeight << 2) | 0x00000003);
    if( !pTerm ) goto tokenize_cb_out;
    pTerm->iWeight = iWeight;
  }

  if( pTerm && p->iCol!=pTerm->iCol ){
    pTerm = fts5TokenizeAppendInt(p, pTerm, (p->iCol << 2) | 0x00000001);
    if( !pTerm ) goto tokenize_cb_out;
    pTerm->iCol = p->iCol;
    pTerm->iOff = 0;
................................................................................
    p->rc = SQLITE4_NOMEM;
    return 1;
  }

  return 0;
}

static int fts5LoadGlobal(sqlite4 *db, Fts5Info *pInfo, i64 *aVal){
  int rc;
  int nVal = pInfo->nCol + 1;
  u8 aKey[10];                    /* Global record key */
  int nKey;                       /* Bytes in key aKey */
  KVCursor *pCsr = 0;             /* Cursor used to read global record */

  nKey = putVarint32(aKey, pInfo->iRoot);
  aKey[nKey++] = 0x00;

  rc = sqlite4KVStoreOpenCursor(db->aDb[pInfo->iDb].pKV, &pCsr);
  if( rc==SQLITE4_OK ){
    rc = sqlite4KVCursorSeek(pCsr, aKey, nKey, 0);
    if( rc==SQLITE4_NOTFOUND ){
      rc = SQLITE4_OK;
      memset(aVal, 0, sizeof(i64)*nVal);
    }else if( rc==SQLITE4_OK ){
      const u8 *aData = 0;
      int nData = 0;
      rc = sqlite4KVCursorData(pCsr, 0, -1, &aData, &nData);
      if( rc==SQLITE4_OK ){
        int i;
        int iOff = 0;

        for(i=0; i<nVal; i++){
          iOff += sqlite4GetVarint(&aData[iOff], (u64 *)&aVal[i]);
        }
      }
    }
    sqlite4KVCursorClose(pCsr);
  }

  return rc;
}

static int fts5CsrLoadGlobal(Fts5Cursor *pCsr){
  int rc = SQLITE4_OK;
  if( pCsr->aGlobal==0 ){
    int nByte = sizeof(i64) * (pCsr->pInfo->nCol + 1);
    pCsr->aGlobal = (i64 *)sqlite4DbMallocZero(pCsr->db, nByte);
    if( pCsr->aGlobal==0 ){
      rc = SQLITE4_NOMEM;
    }else{
      rc = fts5LoadGlobal(pCsr->db, pCsr->pInfo, pCsr->aGlobal);
    }
  }
  return rc;
}

static int fts5CsrLoadSz(Fts5Cursor *pCsr){
  sqlite4 *db = pCsr->db;
  Fts5Info *pInfo = pCsr->pInfo;
  int nVal = pInfo->nCol;
  int rc;
  u8 *aKey;
  int nKey = 0;
  int nPk = pCsr->pExpr->pRoot->nPk;
  KVCursor *pKVCsr = 0;           /* Cursor used to read row size record */

  aKey = (u8 *)sqlite4DbMallocZero(db, 10 + nPk);
  if( !aKey ) return SQLITE4_NOMEM;

  nKey = putVarint32(aKey, pInfo->iRoot);
  aKey[nKey++] = 0x00;
  memcpy(&aKey[nKey], pCsr->pExpr->pRoot->aPk, nPk);
  nKey += nPk;

  rc = sqlite4KVStoreOpenCursor(db->aDb[pInfo->iDb].pKV, &pKVCsr);
  if( rc==SQLITE4_OK ){
    rc = sqlite4KVCursorSeek(pKVCsr, aKey, nKey, 0);
    if( rc==SQLITE4_NOTFOUND ){
      rc = SQLITE4_CORRUPT_BKPT;
    }else if( rc==SQLITE4_OK ){
      const u8 *aData = 0;
      int nData = 0;
      rc = sqlite4KVCursorData(pKVCsr, 0, -1, &aData, &nData);
      if( rc==SQLITE4_OK ){
        int i;
        int iOff = 0;
        for(i=0; i<nVal; i++){
          iOff += getVarint32(&aData[iOff], pCsr->aSz[i]);
        }
      }
      pCsr->bSzValid = 1;
    }
    sqlite4KVCursorClose(pKVCsr);
  }

  return rc;
}


/*
................................................................................
  int bDel,                       /* True for a delete, false for insert */
  char **pzErr                    /* OUT: Error message */
){
  int i;
  int rc = SQLITE4_OK;
  KVStore *pStore;
  TokenizeCtx sCtx;
  u8 *aKey = 0;
  int nKey = 0;
  int nTnum = 0;
  u32 dummy = 0;

  const u8 *pPK;
  int nPK;
  HashElem *pElem;

  pStore = db->aDb[pInfo->iDb].pKV;
  sCtx.rc = SQLITE4_OK;
  sCtx.db = db;
  sCtx.nMax = 0;

  sqlite4HashInit(db->pEnv, &sCtx.hash, 1);

  pPK = (const u8 *)sqlite4_value_blob(pKey);
  nPK = sqlite4_value_bytes(pKey);
  
  nTnum = getVarint32(pPK, dummy);
  nPK -= nTnum;
  pPK += nTnum;

  sCtx.aSz = (int *)sqlite4DbMallocZero(db, pInfo->nCol * sizeof(int));
  if( sCtx.aSz==0 ) rc = SQLITE4_NOMEM;

  for(i=0; rc==SQLITE4_OK && i<pInfo->nCol; i++){
    sqlite4_value *pArg = (sqlite4_value *)(&aArg[i]);
    if( pArg->flags & MEM_Str ){
      const char *zText;
      int nText;

      zText = (const char *)sqlite4_value_text(pArg);
................................................................................
      sCtx.iCol = i;
      rc = pInfo->pTokenizer->xTokenize(
          &sCtx, pInfo->p, zText, nText, fts5TokenizeCb
      );
    }
  }

  nKey = sqlite4VarintLen(pInfo->iRoot)+2+sCtx.nMax+nPK + 10*(pInfo->nCol+1);

  aKey = sqlite4DbMallocRaw(db, nKey);
  if( aKey==0 ) rc = SQLITE4_NOMEM;

  for(pElem=sqliteHashFirst(&sCtx.hash); pElem; pElem=sqliteHashNext(pElem)){
    TokenizeTerm *pTerm = (TokenizeTerm *)sqliteHashData(pElem);
    if( rc==SQLITE4_OK ){
      int nToken = sqliteHashKeysize(pElem);
      char *zToken = (char *)sqliteHashKey(pElem);

      nKey = putVarint32(aKey, pInfo->iRoot);
      aKey[nKey++] = 0x24;
      memcpy(&aKey[nKey], zToken, nToken);
      nKey += nToken;
      aKey[nKey++] = 0x00;
      memcpy(&aKey[nKey], pPK, nPK);
................................................................................
        aData += pTerm->nToken;
        rc = sqlite4KVStoreReplace(pStore, aKey, nKey, aData, pTerm->nData);
      }
    }
    sqlite4DbFree(db, pTerm);
  }

  /* Write the "sizes" record into the db */
  if( rc==SQLITE4_OK ){
    nKey = putVarint32(aKey, pInfo->iRoot);
    aKey[nKey++] = 0x00;
    memcpy(&aKey[nKey], pPK, nPK);
    nKey += nPK;

    if( bDel ){
      rc = sqlite4KVStoreReplace(pStore, aKey, nKey, 0, -1);
    }else{
      u8 *aData = &aKey[nKey];
      int nData = 0;
      for(i=0; i<pInfo->nCol; i++){
        nData += putVarint32(&aData[nData], sCtx.aSz[i]);
      }
      rc = sqlite4KVStoreReplace(pStore, aKey, nKey, aData, nData);
    }
  }

  /* Update the global record */
  if( rc==SQLITE4_OK ){
    i64 *aGlobal = (i64 *)aKey;
    u8 *aData = (u8 *)&aGlobal[pInfo->nCol+1];
    int nData = 0;

    rc = fts5LoadGlobal(db, pInfo, aGlobal);
    if( rc==SQLITE4_OK ){
      u8 aDbKey[10];
      int nDbKey;
      nDbKey = putVarint32(aDbKey, pInfo->iRoot);
      aDbKey[nDbKey++] = 0x00;

      nData += sqlite4PutVarint(&aData[nData], aGlobal[0] + (bDel?-1:1));
      for(i=0; i<pInfo->nCol; i++){
        i64 iNew = aGlobal[i+1] + (i64)sCtx.aSz[i] * (bDel?-1:1);
        nData += sqlite4PutVarint(&aData[nData], iNew);
      }

      rc = sqlite4KVStoreReplace(pStore, aDbKey, nDbKey, aData, nData);
    }
  }
  
  sqlite4DbFree(db, aKey);
  sqlite4DbFree(db, sCtx.aSz);
  sqlite4HashClear(&sCtx.hash);
  return rc;
}
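
The global-record bookkeeping at the end of the function above reduces to simple counter arithmetic once the varint and key-value store layers are set aside. The sketch below is an illustration under that assumption: `global_apply()` and `avg_column_size()` are hypothetical helpers that do not appear in fts5.c, and the counters live in a plain array instead of being read from and written back to the store. The layout follows the source: element 0 is the number of rows, element i+1 the total number of tokens in column i.

```c
#include <assert.h>
#include <stdint.h>

/* Apply one row insert (bDel==0) or delete (bDel!=0) to the counters.
** aGlobal[0] is the row count; aGlobal[i+1] is the total number of
** tokens in column i. aSz[i] is the token count of the row's column i. */
static void global_apply(int64_t *aGlobal, int nCol, const int *aSz, int bDel){
  int64_t sign = bDel ? -1 : 1;
  int i;
  aGlobal[0] += sign;
  for(i=0; i<nCol; i++){
    aGlobal[i+1] += (int64_t)aSz[i] * sign;
  }
}

/* Average size of column iCol in tokens, the kind of figure a ranking
** function could derive from the totals. Returns 0.0 for an empty table. */
static double avg_column_size(const int64_t *aGlobal, int iCol){
  if( aGlobal[0]<=0 ) return 0.0;
  return (double)aGlobal[iCol+1] / (double)aGlobal[0];
}
```

For a two-column table, inserting rows with sizes {3,5} and {1,7} leaves the counters at {2,4,12}; deleting the first row again leaves {1,1,7}. Keeping totals instead of averages means an insert and its matching delete cancel exactly.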

static Fts5Info *fts5InfoCreate(Parse *pParse, Index *pIdx, int bCol){
  sqlite4 *db = pParse->db;
................................................................................
**   * the weight assigned to the instance,
**   * the column number, and
**   * the term offset.
*/
static i64 fts5TermInstanceCksum(
  const u8 *aTerm, int nTerm,
  const u8 *aPk, int nPk,
  int iWeight,
  int iCol,
  int iOff
){
  int i;
  i64 cksum = 0;

  /* Add the term to the checksum */
................................................................................

  /* Add the primary key blob to the checksum */
  for(i=0; i<nPk; i++){
    cksum += (cksum << 3) + aPk[i];
  }

  /* Add the weight, column number and offset (in that order) to the checksum */
  cksum += (cksum << 3) + iWeight;
  cksum += (cksum << 3) + iCol;
  cksum += (cksum << 3) + iOff;

  return cksum;
}
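
The checksum scheme can be reproduced outside the library to see why it suits an index integrity check. This is a hedged sketch: `mix()`, `instance_cksum()` and `combine()` are hypothetical names, the loop over the term bytes is assumed to mirror the primary-key loop shown above (that part of the function is elided in this diff), and unsigned 64-bit arithmetic is used so that overflow is well defined. Each step computes `h = h + (h<<3) + x`, that is `9*h + x`, and the per-instance values are combined with XOR as the callers in this file do.

```c
#include <assert.h>
#include <stdint.h>

/* One mixing step: h = 9*h + x, with wrap-around on overflow */
static uint64_t mix(uint64_t h, uint64_t x){
  return h + (h<<3) + x;
}

/* Checksum of a single term instance: term bytes, primary key bytes,
** then weight, column and offset, in that order. */
static uint64_t instance_cksum(
  const uint8_t *aTerm, int nTerm,
  const uint8_t *aPk, int nPk,
  int iWeight, int iCol, int iOff
){
  uint64_t h = 0;
  int i;
  for(i=0; i<nTerm; i++) h = mix(h, aTerm[i]);
  for(i=0; i<nPk; i++) h = mix(h, aPk[i]);
  h = mix(h, (uint64_t)iWeight);
  h = mix(h, (uint64_t)iCol);
  h = mix(h, (uint64_t)iOff);
  return h;
}

/* Fold one instance checksum into a running total. XOR is commutative,
** so the total does not depend on the order instances are visited in. */
static uint64_t combine(uint64_t total, uint64_t v){
  return total ^ v;
}
```

Because XOR is order independent, a checksum built by walking the index records (sorted by token) can be compared with one built by re-tokenizing the rows (visited in document order). The XOR combination also means that an instance counted twice cancels itself out, which is a known weakness of this style of check.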


................................................................................
  nToken = sqlite4Strlen30((const char *)aToken);
  aPk = &aToken[nToken+1];
  nPk = (&aKey[nKey] - aPk);

  fts5InstanceListInit((u8 *)aVal, nVal, &sList);
  while( 0==fts5InstanceListNext(&sList) ){
    i64 v = fts5TermInstanceCksum(
        aPk, nPk, aToken, nToken, sList.iWeight, sList.iCol, sList.iOff
    );
    cksum = cksum ^ v;
  }

  *piCksum = cksum;
  return SQLITE4_OK;
}
................................................................................
  int nPK;
  int iCol;
  i64 cksum;
};

static int fts5CksumCb(
  void *pCtx, 
  int iWeight, 
  int iOff,
  const char *zToken, 
  int nToken, 
  int iSrc, 
  int nSrc
){
  CksumCtx *p = (CksumCtx *)pCtx;
  i64 cksum;

  cksum = fts5TermInstanceCksum(p->pPK, p->nPK, 
      (const u8 *)zToken, nToken, iWeight, p->iCol, iOff
  );

  p->cksum = (p->cksum ^ cksum);
  return 0;
}

int sqlite4Fts5RowCksum(
................................................................................
      fts5InstanceListNext(&in2);
    }else if( in1.iCol<in2.iCol || (in1.iCol==in2.iCol && in1.iOff<in2.iOff) ){
      pAdv = &in1;
    }else{
      pAdv = &in2;
    }

    fts5InstanceListAppend(&out, pAdv->iCol, pAdv->iWeight, pAdv->iOff);
    fts5InstanceListNext(pAdv);
  }

  if( bFree ){
    sqlite4DbFree(db, p1->aData);
    sqlite4DbFree(db, p2->aData);
  }
................................................................................
  while( rc==SQLITE4_OK && bEof==0 ){
    for(i=1; i<pStr->nToken; i++){
      int bMatch = fts5TokenAdvanceToMatch(&aIn[i], &aIn[0], i, &bEof);
      if( bMatch==0 || bEof ) break;
    }
    if( i==pStr->nToken && (iCol<0 || aIn[0].iCol==iCol) ){
      /* Record a match here */
      fts5InstanceListAppend(&out, aIn[0].iCol, aIn[0].iWeight, aIn[0].iOff);
    }
    bEof = fts5InstanceListNext(&aIn[0]);
  }

  pStr->nList = out.iList;
  sqlite4DbFree(db, aIn);

................................................................................

    while( bEof==0 ){
      if( fts5IsNear(&near, &in, nTrail) 
       || fts5IsNear(&in, &near, nLead)
      ){
        /* The current position is a match. Append an entry to the output
        ** and advance the input cursor. */
        fts5InstanceListAppend(&out, in.iCol, in.iWeight, in.iOff);
        bEof = fts5InstanceListNext(&in);
      }else{
        if( near.iCol<in.iCol || (near.iCol==in.iCol && near.iOff<in.iOff) ){
          bEof = fts5InstanceListNext(&near);
        }else if( near.iCol==in.iCol && near.iOff==in.iOff ){
          bEof = fts5InstanceListNext(&in);
          if( fts5IsNear(&near, &in, nTrail) ){
            fts5InstanceListAppend(&out, near.iCol, near.iWeight, near.iOff);
          }
        }else{
          bEof = fts5InstanceListNext(&in);
        }
      }
    }

................................................................................
  }

  assert( rc!=SQLITE4_NOTFOUND );
  return rc;
}

int sqlite4Fts5Next(Fts5Cursor *pCsr){
  pCsr->bSzValid = 0;
  return fts5ExprAdvance(pCsr->db, pCsr->pExpr->pRoot, 0);
}

int sqlite4Fts5Open(
  sqlite4 *db,                    /* Database handle */
  Fts5Info *pInfo,                /* Index description */
  const char *zMatch,             /* Match expression */
................................................................................
  memcpy(&pCsr->aKey[i], aPk, nPk);

  *paKey = pCsr->aKey;
  *pnKey = nReq;
  return SQLITE4_OK;
}

int sqlite4_mi_column_count(sqlite4_context *pCtx, int *pnCol){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
    *pnCol = pCtx->pFts->pInfo->nCol;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

int sqlite4_mi_column_size(sqlite4_context *pCtx, int iCol, int *pnToken){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;

  if( pCsr==0 ){
    rc = SQLITE4_MISUSE;
  }else if( iCol>=pCsr->pInfo->nCol ){
    rc = SQLITE4_ERROR;
  }else{
    if( pCsr->aSz==0 ){
      pCsr->aSz = (int *)sqlite4DbMallocZero(
          pCsr->db, sizeof(int)*pCsr->pInfo->nCol
      );
      if( pCsr->aSz==0 ) rc = SQLITE4_NOMEM;
    }
    if( rc==SQLITE4_OK && pCsr->bSzValid==0 ){
      rc = fts5CsrLoadSz(pCsr);
    }

    if( rc==SQLITE4_OK ){
      assert( pCsr->bSzValid );
      if( iCol>=0 ){
        *pnToken = pCsr->aSz[iCol];
      }else{
        int i;
        int nToken = 0;
        for(i=0; i<pCsr->pInfo->nCol; i++){
          nToken += pCsr->aSz[i];
        }
        *pnToken = nToken;
      }
    }
  }

  return rc;
}


int sqlite4_mi_column_value(
  sqlite4_context *pCtx, 
  int iCol, 
  sqlite4_value **ppVal
){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

int sqlite4_mi_phrase_count(sqlite4_context *pCtx, int *pnPhrase){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
    *pnPhrase = pCtx->pFts->pExpr->nPhrase;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

static Fts5Str *fts5FindStr(Fts5ExprNode *p, int *piStr){
  Fts5Str *pRet = 0;
................................................................................
    if( pRet==0 ) pRet = fts5FindStr(p->pRight, piStr);
  }
  return pRet;
}

int sqlite4_mi_match_count(
  sqlite4_context *pCtx, 
  int iCol,
  int iPhrase,
  int *pnMatch
){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;
  if( pCsr ){
    int nMatch = 0;
    Fts5Str *pStr;
    int iCopy = iPhrase;          /* fts5FindStr() takes a phrase index */
    InstanceList sList;

    pStr = fts5FindStr(pCsr->pExpr->pRoot, &iCopy);
    assert( pStr );

    fts5InstanceListInit(pStr->aList, pStr->nList, &sList);
    while( 0==fts5InstanceListNext(&sList) ){
      if( iCol<0 || sList.iCol==iCol ) nMatch++;
    }
    *pnMatch = nMatch;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}
................................................................................
  int *pnMatch,
  int *pnDoc,
  int *pnRelevant
){
  return SQLITE4_OK;
}

int sqlite4_mi_total_size(sqlite4_context *pCtx, int iCol, int *pnToken){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
    Fts5Cursor *pCsr = pCtx->pFts;
    int nCol = pCsr->pInfo->nCol;

    if( iCol>=nCol ){
      rc = SQLITE4_ERROR;
    }else{
      rc = fts5CsrLoadGlobal(pCsr);
      if( rc==SQLITE4_OK ){
        if( iCol<0 ){
          int i;
          int nToken = 0;
          for(i=0; i<nCol; i++){
            nToken += pCsr->aGlobal[i+1];
          }
          *pnToken = nToken;
        }else{
          *pnToken = pCsr->aGlobal[iCol+1];
        }
      }
    }
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

static void fts5StrLoadRowcounts(Fts5Str *pStr, int *anRow){
  InstanceList sList;

  fts5InstanceListInit(pStr->aList, pStr->nList, &sList);
  while( 0==fts5InstanceListNext(&sList) ){
    anRow[sList.iCol]++;
  }
}


static int fts5ExprLoadRowcounts(
  sqlite4 *db, 
  Fts5Info *pInfo,
  Fts5ExprNode *pNode, 
  int **panRow
){
  int rc = SQLITE4_OK;

  if( pNode ){
    if( pNode->eType==TOKEN_PRIMITIVE ){
      int *anRow = *panRow;
      Fts5Phrase *pPhrase = pNode->pPhrase;

      rc = fts5ExprAdvance(db, pNode, 1);
      while( rc==SQLITE4_OK ){
        int i;
        for(i=0; i<pPhrase->nStr; i++){
          fts5StrLoadRowcounts(&pPhrase->aStr[i], &anRow[i*pInfo->nCol]);
        }
        rc = fts5ExprAdvance(db, pNode, 0);
      }

      *panRow = &anRow[pInfo->nCol * pPhrase->nStr];
    }

    /* Visit both sub-trees: left first, then right */
    if( rc==SQLITE4_OK ){
      rc = fts5ExprLoadRowcounts(db, pInfo, pNode->pLeft, panRow);
    }
    if( rc==SQLITE4_OK ){
      rc = fts5ExprLoadRowcounts(db, pInfo, pNode->pRight, panRow);
    }
  }

  return rc;
}

static int fts5CsrLoadRowcounts(Fts5Cursor *pCsr){
  int rc = SQLITE4_OK;

  if( pCsr->anRow==0 ){

    sqlite4 *db = pCsr->db;
    Fts5Expr *pCopy;
    Fts5Expr *pExpr = pCsr->pExpr;
    Fts5Info *pInfo = pCsr->pInfo;
    int *anRow;

    pCsr->anRow = anRow = (int *)sqlite4DbMallocZero(db, 
        pExpr->nPhrase * pInfo->nCol * sizeof(int)
    );
    if( !anRow ) return SQLITE4_NOMEM;

    rc = fts5ParseExpression(db, pInfo->pTokenizer, pInfo->p, 
        pInfo->iRoot, pInfo->azCol, pInfo->nCol, pCsr->zExpr, &pCopy, 0
    );
    if( rc==SQLITE4_OK ){
      rc = fts5OpenExprCursors(db, pInfo, pCopy->pRoot);
    }

    if( rc==SQLITE4_OK ){
      rc = fts5ExprLoadRowcounts(db, pInfo, pCopy->pRoot, &anRow);
    }

    fts5ExpressionFree(db, pCopy);
  }

  return rc;
}

int sqlite4_mi_row_count(
  sqlite4_context *pCtx,          /* Context object passed to mi function */
  int iCol,                       /* Specific column (or -1) */
  int iPhrase,                    /* Specific phrase (or -1) */
  int *pnRow                      /* Total number of rows */
){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
    Fts5Cursor *pCsr = pCtx->pFts;
    Fts5Expr *pExpr = pCsr->pExpr;
    int nCol = pCsr->pInfo->nCol;
    int nPhrase = pExpr->nPhrase;

    if( iCol>=nCol || iPhrase>=nPhrase ){
      rc = SQLITE4_ERROR;
    }else if( iPhrase>=0 ){
      int iIdx = iPhrase * pCsr->pInfo->nCol;

      rc = fts5CsrLoadRowcounts(pCsr);
      if( rc==SQLITE4_OK ){
        if( iCol>=0 ){
          *pnRow = pCsr->anRow[iIdx + iCol];
        }else{
          int i;
          int nRow = 0;
          for(i=0; i<pCsr->pInfo->nCol; i++){
            nRow += pCsr->anRow[iIdx + i];
          }
          *pnRow = nRow;
        }
      }
    }else{
      /* Total number of rows in table... */
      rc = fts5CsrLoadGlobal(pCsr);
      if( rc==SQLITE4_OK ){
        *pnRow = (int)pCsr->aGlobal[0];
      }
    }
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

/**************************************************************************
***************************************************************************
** Below this point is test code.
*/







>
>
>
>
>
>

|







 







<



>
>
>
>
>
>
>







 







>







 







|
>
>

>
>





>
>
>
>
>
>
>
>
>
>













|







 







|







 







|











|
|
|







 







|







 







|







 







>


|
>



|







 







|







>



>
>

<
>
>
>
>
>
>
>
>
>
>







 







|
|

|







 







|
|
|
|
|
|
|
|
|
>
>





|
<





|
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|
>
|
|
>
>
>






>



>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


|
<
<
<
<
<
<
<
>
>
>
>
>
>
>





>
>
|
|
<
<



<









|
|
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<







 







<
<


>
>
>






<
>
>

<
>









<
<
<







 







>
>
>
>
>
>
>
>
>
>
>
|
>
|
|






>
>







 







|

>
>
>





|
|
>
>
>
>

<
<
<
<
<
|





>
|
|
|

|
>
>
>
>

<
|
<
<
>
>
>
>
>
>
|
<
<
<
<

>
>
|
<
|
>


|







 







|







 







|







 







|







 







|










|







 







|







 







|







 







|







|







 







>
|







 







|


|






|

|
>
>
>
|
>
>
>
>
>
>
|
>
>
>

>
>
>
>
>
>
>
>
>
>
>
|
<
>
|
<
<
<
<
>
>
>
>
>
>
>
>
>
|
>
|
<
>
>
>
|
>
>

>
>
>
>
>
|
<
<
<
>
>
>
>
|
<
<
<
<
>
>
>
|
<
|
>
|
|
>
>
>
>
>
>
>
>
>


>









<
<
<
<
<
<
<
<
<
<







 







|
>








|







|







 







<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
|
>
>




>
>
|
>
>
|
|
|




>












>


|




|



|


|










>







|









<

|










|
>
|
|


<
|
|
<
<
<
<
|
>
>
>
|
<
<
<
<
<
|
<
<
<
|
|
>
|
<
>
|
>
>
>
>
>
>
>
>
|
|
|
<
<
<
<
<
>
|
|
<
<
<







*/

#include "sqliteInt.h"
#include "vdbeInt.h"

/* 
** Stream numbers must be lower than this.
**
** For optimization purposes, it is assumed that a given tokenizer uses
** a set of contiguous stream numbers starting with 0. And that most
** tokens belong to stream 0.
**
** The hard limit is 63 (due to the format of "row size" records).
*/
#define SQLITE4_FTS5_NSTREAM 32

/*
** Records stored within the index:
**
** Row size record:
**   There is one "row size" record in the index for each row in the
**   indexed table. The "row size" record contains the number of tokens
................................................................................
**
**   The data for this record is a series of varint values. The first 
**   varint is the total number of rows in the table. The subsequent
**   varints make up a "row size" record containing the total number of
**   tokens for each S/C combination in all rows of the table.
**
** FTS index records:
**   The FTS index records implement the following mapping:
**
**       (token, document-pk) -> (list of instances)
**
**   The key for each index record is in the same format as the keys for
**   regular text indexes. An 0x24 byte, followed by the utf-8 representation
**   of the token, followed by 0x00, followed by the PK blob for the table
**   row.
**
**   TODO: Describe value format.
*/

/*
** Default distance value for NEAR operators.
*/
#define FTS5_DEFAULT_NEAR 10

................................................................................
typedef struct Fts5Expr Fts5Expr;
typedef struct Fts5ExprNode Fts5ExprNode;
typedef struct Fts5List Fts5List;
typedef struct Fts5Parser Fts5Parser;
typedef struct Fts5ParserToken Fts5ParserToken;
typedef struct Fts5Phrase Fts5Phrase;
typedef struct Fts5Prefix Fts5Prefix;
typedef struct Fts5Size Fts5Size;
typedef struct Fts5Str Fts5Str;
typedef struct Fts5Token Fts5Token;


struct Fts5ParserToken {
  int eType;                      /* Token type */
  int n;                          /* Size of z[] in bytes */
................................................................................
  char *zExpr;                    /* Full text of MATCH expression */
  KVByteArray *aKey;              /* Buffer for primary key */
  int nKeyAlloc;                  /* Bytes allocated at aKey[] */

  KVCursor *pCsr;                 /* Cursor used to retrieve values */
  Mem *aMem;                      /* Array of column values */

  Fts5Size *pSz;                  /* Local size data */
  Fts5Size *pGlobal;              /* Global size data */
  i64 nGlobal;                    /* Total number of rows in table */
  int *anRow;                     /* Row counts for each phrase/column/stream */

#if 1
  i64 *aGlobal;

  /* Size of each column of current row (in tokens). */
  int bSzValid;
  int *aSz;
#endif
};

/*
** A deserialized 'size record' (see above).
*/
struct Fts5Size {
  int nCol;                       /* Number of columns in indexed table */
  int nStream;                    /* Number of streams */
  i64 *aSz;                       /* Token count for each C/S */
};

/*
** This type is used when reading (decoding) an instance-list.
*/
typedef struct InstanceList InstanceList;
struct InstanceList {
  u8 *aList;
  int nList;
  int iList;

  /* The current entry */
  int iCol;
  int iStream;
  int iOff;
};

/*
** Return true for EOF, or false if the next entry is valid.
*/
static int fts5InstanceListNext(InstanceList *p){
................................................................................
    u32 iVal;
    i += getVarint32(&p->aList[i], iVal);
    if( (iVal & 0x03)==0x01 ){
      p->iCol = (iVal>>2);
      p->iOff = 0;
    }
    else if( (iVal & 0x03)==0x03 ){
      p->iStream = (iVal>>2);
    }
    else{
      p->iOff += (iVal>>1);
      bRet = 0;
    }
  }
  if( bRet ){
................................................................................
static int fts5InstanceListEof(InstanceList *p){
  return (p->aList==0);
}

static void fts5InstanceListAppend(
  InstanceList *p,                /* Instance list to append to */
  int iCol,                       /* Column of new entry */
  int iStream,                    /* Stream of new entry */
  int iOff                        /* Offset of new entry */
){
  assert( iCol>=p->iCol );
  assert( iCol>p->iCol || iOff>=p->iOff );

  if( iCol!=p->iCol ){
    p->iList += putVarint32(&p->aList[p->iList], (iCol<<2)|0x01);
    p->iCol = iCol;
    p->iOff = 0;
  }

  if( iStream!=p->iStream ){
    p->iList += putVarint32(&p->aList[p->iList], (iStream<<2)|0x03);
    p->iStream = iStream;
  }

  p->iList += putVarint32(&p->aList[p->iList], (iOff-p->iOff)<<1);
  p->iOff = iOff;

  assert( p->iList<=p->nList );
}
................................................................................
}

/*
** Callback for fts5CountTokens().
*/
static int fts5CountTokensCb(
  void *pCtx, 
  int iStream, 
  int iOff, 
  const char *z, int n,
  int iSrc, int nSrc
){
  (*((int *)pCtx))++;
  return 0;
}
................................................................................
struct AppendTokensCtx {
  Fts5Parser *pParse;
  Fts5Str *pStr;
};

static int fts5AppendTokensCb(
  void *pCtx, 
  int iStream, 
  int iOff, 
  const char *z, int n, 
  int iSrc, int nSrc
){
  struct AppendTokensCtx *p = (struct AppendTokensCtx *)pCtx;
  Fts5Parser *pParse = p->pParse;
  Fts5Token *pToken;
................................................................................
** sqlite4DbRealloc().
*/
typedef struct TokenizeCtx TokenizeCtx;
typedef struct TokenizeTerm TokenizeTerm;
struct TokenizeCtx {
  int rc;
  int iCol;
  int nCol;                       /* Number of columns in table */
  sqlite4 *db;
  int nMax;
  i64 *aSz;                       /* Number of tokens in each column/stream */
  int nStream;                    /* Number of streams in document */
  Hash hash;
};
struct TokenizeTerm {
  int iStream;                    /* Stream of previous entry */
  int iCol;                       /* Column containing previous entry */
  int iOff;                       /* Token offset of previous entry */
  int nToken;                     /* Size of token in bytes */
  int nData;                      /* Bytes of data in value */
  int nAlloc;                     /* Bytes of data allocated */
};

................................................................................
  a = &(((unsigned char *)&pTerm[1])[pTerm->nToken+pTerm->nData]);
  pTerm->nData += putVarint32(a, iVal);
  return pTerm;
}

static int fts5TokenizeCb(
  void *pCtx, 
  int iStream, 
  int iOff,
  const char *zToken, 
  int nToken, 
  int iSrc, 
  int nSrc
){
  TokenizeCtx *p = (TokenizeCtx *)pCtx;
  sqlite4 *db = p->db;
  TokenizeTerm *pTerm = 0;
  TokenizeTerm *pOrig = 0;

  /* TODO: Error here if iStream is out of range */

  if( nToken>p->nMax ) p->nMax = nToken;


  if( iStream>=p->nStream ){
    int nOld = p->nStream;
    int nNew = 4;
    while( nNew<=iStream ) nNew = nNew*2;
    p->aSz = (i64*)sqlite4DbReallocOrFree(db, p->aSz, nNew*p->nCol*sizeof(i64));
    if( p->aSz==0 ) goto tokenize_cb_out;
    memset(&p->aSz[p->nStream * p->nCol], 0, (nNew-nOld)*p->nCol*sizeof(i64));
    p->nStream = nNew;            /* Record the new allocation size */
  }
  p->aSz[iStream*p->nCol + p->iCol]++;

  pTerm = (TokenizeTerm *)sqlite4HashFind(&p->hash, zToken, nToken);
  if( pTerm==0 ){
    /* Size the initial allocation so that it fits in the lookaside buffer */
    int nAlloc = sizeof(TokenizeTerm) + nToken + 32;

    pTerm = sqlite4DbMallocZero(p->db, nAlloc);
................................................................................
        pTerm = 0;
      }
      if( pTerm==0 ) goto tokenize_cb_out;
    }
  }
  pOrig = pTerm;

  if( iStream!=pTerm->iStream ){
    pTerm = fts5TokenizeAppendInt(p, pTerm, (iStream << 2) | 0x00000003);
    if( !pTerm ) goto tokenize_cb_out;
    pTerm->iStream = iStream;
  }

  if( pTerm && p->iCol!=pTerm->iCol ){
    pTerm = fts5TokenizeAppendInt(p, pTerm, (p->iCol << 2) | 0x00000001);
    if( !pTerm ) goto tokenize_cb_out;
    pTerm->iCol = p->iCol;
    pTerm->iOff = 0;
................................................................................
    p->rc = SQLITE4_NOMEM;
    return 1;
  }

  return 0;
}

static int fts5LoadSizeRecord(
  sqlite4 *db,                    /* Database handle */
  u8 *aKey, int nKey,             /* KVStore key */
  int nMinStream,                 /* Space for at least this many streams */
  Fts5Info *pInfo,                /* Info record */
  i64 *pnRow,                     /* non-NULL when reading global record */
  Fts5Size **ppSz                 /* OUT: Loaded size record */
){
  Fts5Size *pSz = 0;              /* Size object */
  KVCursor *pCsr = 0;             /* Cursor used to read global record */
  int rc;

  rc = sqlite4KVStoreOpenCursor(db->aDb[pInfo->iDb].pKV, &pCsr);
  if( rc==SQLITE4_OK ){
    rc = sqlite4KVCursorSeek(pCsr, aKey, nKey, 0);
    if( rc==SQLITE4_NOTFOUND ){
      rc = SQLITE4_CORRUPT_BKPT;

    }else if( rc==SQLITE4_OK ){
      const u8 *aData = 0;
      int nData = 0;
      rc = sqlite4KVCursorData(pCsr, 0, -1, &aData, &nData);
      if( rc==SQLITE4_OK ){
        int iOff = 0;
        int nStream = 0;
        int nAlloc;

        /* If pnRow is not NULL, then this is the global record. Read the
        ** number of documents in the table from the start of the record. */
        if( pnRow ){
          iOff += sqlite4GetVarint(&aData[iOff], (u64 *)pnRow);
        }
        iOff += getVarint32(&aData[iOff], nStream);
        nAlloc = (nStream < nMinStream ? nMinStream : nStream);

        pSz = sqlite4DbMallocZero(db, 
            sizeof(Fts5Size) + sizeof(i64) * pInfo->nCol * nAlloc
        );
        if( pSz==0 ){
          rc = SQLITE4_NOMEM;
        }else{
          int iCol = 0;
          pSz->aSz = (i64 *)&pSz[1];    /* Array follows the struct in memory */
          pSz->nCol = pInfo->nCol;
          pSz->nStream = nAlloc;
          while( iOff<nData ){
            int i;
            i64 *aSz = &pSz->aSz[iCol*nAlloc];
            for(i=0; i<nStream; i++){
              iOff += sqlite4GetVarint(&aData[iOff], (u64*)&aSz[i]);
            }
            iCol++;
          }
        }
      }
    }
    sqlite4KVCursorClose(pCsr);
  }

  *ppSz = pSz;
  return rc;
}

static int fts5StoreSizeRecord(
  KVStore *p,
  u8 *aKey, int nKey,
  Fts5Size *pSz, 
  i64 nRow, 
  u8 *a                           /* Space to serialize record in */
){
  int iOff = 0;
  int iCol;

  if( nRow>=0 ){
    iOff += sqlite4PutVarint(&a[iOff], nRow);
  }
  iOff += sqlite4PutVarint(&a[iOff], pSz->nStream);
  for(iCol=0; iCol<pSz->nCol; iCol++){
    int i;
    for(i=0; i<pSz->nStream; i++){
      iOff += sqlite4PutVarint(&a[iOff], pSz->aSz[iCol*pSz->nStream+i]);
    }
  }

  return sqlite4KVStoreReplace(p, aKey, nKey, a, iOff);
}

static int fts5CsrLoadGlobal(Fts5Cursor *pCsr){
  int rc = SQLITE4_OK;
  if( pCsr->pGlobal==0 ){
    int nKey;
    u8 aKey[10];
    nKey = putVarint32(aKey, pCsr->pInfo->iRoot);
    aKey[nKey++] = 0x00;
    rc = fts5LoadSizeRecord(
        pCsr->db, aKey, nKey, 0, pCsr->pInfo, &pCsr->nGlobal, &pCsr->pGlobal
    );
  }
  return rc;
}

static int fts5CsrLoadSz(Fts5Cursor *pCsr){
  int rc = SQLITE4_OK;
  if( pCsr->pSz==0 ){
    sqlite4 *db = pCsr->db;
    Fts5Info *pInfo = pCsr->pInfo;
    u8 *aKey;
    int nKey = 0;
    int nPk = pCsr->pExpr->pRoot->nPk;

    aKey = (u8 *)sqlite4DbMallocZero(db, 10 + nPk);
    if( !aKey ) return SQLITE4_NOMEM;

    nKey = putVarint32(aKey, pInfo->iRoot);
    aKey[nKey++] = 0x00;
    memcpy(&aKey[nKey], pCsr->pExpr->pRoot->aPk, nPk);
    nKey += nPk;

    rc = fts5LoadSizeRecord(pCsr->db, aKey, nKey, 0, pInfo, 0, &pCsr->pSz);
    sqlite4DbFree(db, aKey);
  }

  return rc;
}


/*
................................................................................
  int bDel,                       /* True for a delete, false for insert */
  char **pzErr                    /* OUT: Error message */
){
  int i;
  int rc = SQLITE4_OK;
  KVStore *pStore;
  TokenizeCtx sCtx;


  int nTnum = 0;
  u32 dummy = 0;

  u8 *aSpace = 0;
  int nSpace = 0;

  const u8 *pPK;
  int nPK;
  HashElem *pElem;

  pStore = db->aDb[pInfo->iDb].pKV;


  memset(&sCtx, 0, sizeof(sCtx));
  sCtx.db = db;

  sCtx.nCol = pInfo->nCol;
  sqlite4HashInit(db->pEnv, &sCtx.hash, 1);

  pPK = (const u8 *)sqlite4_value_blob(pKey);
  nPK = sqlite4_value_bytes(pKey);
  
  nTnum = getVarint32(pPK, dummy);
  nPK -= nTnum;
  pPK += nTnum;




  for(i=0; rc==SQLITE4_OK && i<pInfo->nCol; i++){
    sqlite4_value *pArg = (sqlite4_value *)(&aArg[i]);
    if( pArg->flags & MEM_Str ){
      const char *zText;
      int nText;

      zText = (const char *)sqlite4_value_text(pArg);
................................................................................
      sCtx.iCol = i;
      rc = pInfo->pTokenizer->xTokenize(
          &sCtx, pInfo->p, zText, nText, fts5TokenizeCb
      );
    }
  }

  /* Allocate enough space to serialize all the stuff that needs to
  ** be inserted into the database. Specifically:
  **
  **   * Space for index record keys,
  **   * space for the size record and key for this document, and
  **   * space for the updated global size record for the document set.
  **
  ** To make it easier, the below allocates enough space to simultaneously
  ** store the largest index record key and the largest possible global
  ** size record.
  */
  nSpace = (sqlite4VarintLen(pInfo->iRoot) + 2 + sCtx.nMax + nPK) + 
           (9 * (2 + pInfo->nCol * sCtx.nStream));
  aSpace = sqlite4DbMallocRaw(db, nSpace);
  if( aSpace==0 ) rc = SQLITE4_NOMEM;

  for(pElem=sqliteHashFirst(&sCtx.hash); pElem; pElem=sqliteHashNext(pElem)){
    TokenizeTerm *pTerm = (TokenizeTerm *)sqliteHashData(pElem);
    if( rc==SQLITE4_OK ){
      int nToken = sqliteHashKeysize(pElem);
      char *zToken = (char *)sqliteHashKey(pElem);
      u8 *aKey = aSpace;
      int nKey;

      nKey = putVarint32(aKey, pInfo->iRoot);
      aKey[nKey++] = 0x24;
      memcpy(&aKey[nKey], zToken, nToken);
      nKey += nToken;
      aKey[nKey++] = 0x00;
      memcpy(&aKey[nKey], pPK, nPK);
................................................................................
        aData += pTerm->nToken;
        rc = sqlite4KVStoreReplace(pStore, aKey, nKey, aData, pTerm->nData);
      }
    }
    sqlite4DbFree(db, pTerm);
  }

  /* Write the size record into the db */
  if( rc==SQLITE4_OK ){
    u8 *aKey = aSpace;
    int nKey;

    nKey = putVarint32(aKey, pInfo->iRoot);
    aKey[nKey++] = 0x00;
    memcpy(&aKey[nKey], pPK, nPK);
    nKey += nPK;

    if( bDel==0 ){
      Fts5Size sSz;
      sSz.nCol = pInfo->nCol;
      sSz.nStream = sCtx.nStream;
      sSz.aSz = sCtx.aSz;
      rc = fts5StoreSizeRecord(pStore, aKey, nKey, &sSz, -1, &aKey[nKey]);
    }else{
      rc = sqlite4KVStoreReplace(pStore, aKey, nKey, 0, -1);
    }
  }

  /* Update the global record */
  if( rc==SQLITE4_OK ){
    Fts5Size *pSz;                /* Deserialized global size record */
    i64 nRow;                     /* Number of rows in indexed table */
    u8 *aKey = aSpace;            /* Space to format the global record key */
    int nKey;                     /* Size of global record key in bytes */

    nKey = putVarint32(aKey, pInfo->iRoot);
    aKey[nKey++] = 0x00;
    rc = fts5LoadSizeRecord(db, aKey, nKey, sCtx.nStream, pInfo, &nRow, &pSz);
    assert( rc!=SQLITE4_OK || pSz->nStream>=sCtx.nStream );

    if( rc==SQLITE4_OK ){
      int iCol;
      for(iCol=0; iCol<pSz->nCol; iCol++){
        int iStr;
        i64 *aIn = &sCtx.aSz[iCol * sCtx.nStream];
        i64 *aOut = &pSz->aSz[iCol * pSz->nStream];
        for(iStr=0; iStr<sCtx.nStream; iStr++){
          aOut[iStr] += (aIn[iStr] * (bDel?-1:1));
        }
      }
      nRow += (bDel?-1:1);
      rc = fts5StoreSizeRecord(pStore, aKey, nKey, pSz, nRow, &aKey[nKey]);
    }


    sqlite4DbFree(db, pSz);
  }
  
  sqlite4DbFree(db, aSpace);
  sqlite4DbFree(db, sCtx.aSz);
  sqlite4HashClear(&sCtx.hash);
  return rc;
}

static Fts5Info *fts5InfoCreate(Parse *pParse, Index *pIdx, int bCol){
  sqlite4 *db = pParse->db;
................................................................................
**   * the stream number assigned to the instance,
**   * the column number, and
**   * the term offset.
*/
static i64 fts5TermInstanceCksum(
  const u8 *aTerm, int nTerm,
  const u8 *aPk, int nPk,
  int iStream,
  int iCol,
  int iOff
){
  int i;
  i64 cksum = 0;

  /* Add the term to the checksum */
................................................................................

  /* Add the primary key blob to the checksum */
  for(i=0; i<nPk; i++){
    cksum += (cksum << 3) + aPk[i];
  }

  /* Add the stream, column number and offset (in that order) to the checksum */
  cksum += (cksum << 3) + iStream;
  cksum += (cksum << 3) + iCol;
  cksum += (cksum << 3) + iOff;

  return cksum;
}


................................................................................
  nToken = sqlite4Strlen30((const char *)aToken);
  aPk = &aToken[nToken+1];
  nPk = (&aKey[nKey] - aPk);

  fts5InstanceListInit((u8 *)aVal, nVal, &sList);
  while( 0==fts5InstanceListNext(&sList) ){
    i64 v = fts5TermInstanceCksum(
        aPk, nPk, aToken, nToken, sList.iStream, sList.iCol, sList.iOff
    );
    cksum = cksum ^ v;
  }

  *piCksum = cksum;
  return SQLITE4_OK;
}
................................................................................
  int nPK;
  int iCol;
  i64 cksum;
};

static int fts5CksumCb(
  void *pCtx, 
  int iStream, 
  int iOff,
  const char *zToken, 
  int nToken, 
  int iSrc, 
  int nSrc
){
  CksumCtx *p = (CksumCtx *)pCtx;
  i64 cksum;

  cksum = fts5TermInstanceCksum(p->pPK, p->nPK, 
      (const u8 *)zToken, nToken, iStream, p->iCol, iOff
  );

  p->cksum = (p->cksum ^ cksum);
  return 0;
}

int sqlite4Fts5RowCksum(
................................................................................
      fts5InstanceListNext(&in2);
    }else if( in1.iCol<in2.iCol || (in1.iCol==in2.iCol && in1.iOff<in2.iOff) ){
      pAdv = &in1;
    }else{
      pAdv = &in2;
    }

    fts5InstanceListAppend(&out, pAdv->iCol, pAdv->iStream, pAdv->iOff);
    fts5InstanceListNext(pAdv);
  }

  if( bFree ){
    sqlite4DbFree(db, p1->aData);
    sqlite4DbFree(db, p2->aData);
  }
................................................................................
  while( rc==SQLITE4_OK && bEof==0 ){
    for(i=1; i<pStr->nToken; i++){
      int bMatch = fts5TokenAdvanceToMatch(&aIn[i], &aIn[0], i, &bEof);
      if( bMatch==0 || bEof ) break;
    }
    if( i==pStr->nToken && (iCol<0 || aIn[0].iCol==iCol) ){
      /* Record a match here */
      fts5InstanceListAppend(&out, aIn[0].iCol, aIn[0].iStream, aIn[0].iOff);
    }
    bEof = fts5InstanceListNext(&aIn[0]);
  }

  pStr->nList = out.iList;
  sqlite4DbFree(db, aIn);

................................................................................

    while( bEof==0 ){
      if( fts5IsNear(&near, &in, nTrail) 
       || fts5IsNear(&in, &near, nLead)
      ){
        /* The current position is a match. Append an entry to the output
        ** and advance the input cursor. */
        fts5InstanceListAppend(&out, in.iCol, in.iStream, in.iOff);
        bEof = fts5InstanceListNext(&in);
      }else{
        if( near.iCol<in.iCol || (near.iCol==in.iCol && near.iOff<in.iOff) ){
          bEof = fts5InstanceListNext(&near);
        }else if( near.iCol==in.iCol && near.iOff==in.iOff ){
          bEof = fts5InstanceListNext(&in);
          if( fts5IsNear(&near, &in, nTrail) ){
            fts5InstanceListAppend(&out, near.iCol, near.iStream, near.iOff);
          }
        }else{
          bEof = fts5InstanceListNext(&in);
        }
      }
    }

................................................................................
  }

  assert( rc!=SQLITE4_NOTFOUND );
  return rc;
}

int sqlite4Fts5Next(Fts5Cursor *pCsr){
  sqlite4DbFree(pCsr->db, pCsr->pSz);
  pCsr->pSz = 0;
  return fts5ExprAdvance(pCsr->db, pCsr->pExpr->pRoot, 0);
}

int sqlite4Fts5Open(
  sqlite4 *db,                    /* Database handle */
  Fts5Info *pInfo,                /* Index description */
  const char *zMatch,             /* Match expression */
................................................................................
  memcpy(&pCsr->aKey[i], aPk, nPk);

  *paKey = pCsr->aKey;
  *pnKey = nReq;
  return SQLITE4_OK;
}

int sqlite4_mi_column_count(sqlite4_context *pCtx, int *pn){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
    *pn = pCtx->pFts->pInfo->nCol;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

int sqlite4_mi_phrase_count(sqlite4_context *pCtx, int *pn){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
    *pn = pCtx->pFts->pExpr->nPhrase;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

int sqlite4_mi_stream_count(sqlite4_context *pCtx, int *pn){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;
  if( pCsr ){
    rc = fts5CsrLoadGlobal(pCtx->pFts);
    if( rc==SQLITE4_OK ) *pn = pCsr->pGlobal->nStream;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

static int fts5GetSize(Fts5Size *pSz, int iC, int iS){
  int nToken = 0;
  int i;

  if( iC<0 && iS<0 ){
    int nFin = pSz->nCol * pSz->nStream;
    for(i=0; i<nFin; i++) nToken += pSz->aSz[i];
  }else if( iC<0 ){
    for(i=0; i<pSz->nCol; i++) nToken += pSz->aSz[i*pSz->nStream + iS];
  }else if( iS<0 ){
    for(i=0; i<pSz->nStream; i++) nToken += pSz->aSz[pSz->nStream*iC + i];
  }else if( iC<pSz->nCol && iS<pSz->nStream ){
    nToken = pSz->aSz[iC * pSz->nStream + iS];
  }

  return nToken;
}

int sqlite4_mi_size(sqlite4_context *pCtx, int iC, int iS, int *pn){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;


  if( pCsr==0 ){
    rc = SQLITE4_MISUSE;
  }else{
    rc = fts5CsrLoadSz(pCsr);
    if( rc==SQLITE4_OK ){
      *pn = fts5GetSize(pCsr->pSz, iC, iS);
    }
  }
  return rc;
}

int sqlite4_mi_total_size(sqlite4_context *pCtx, int iC, int iS, int *pn){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;

  if( pCsr==0 ){
    rc = SQLITE4_MISUSE;
  }else{
    rc = fts5CsrLoadGlobal(pCsr);
    if( rc==SQLITE4_OK ){
      *pn = fts5GetSize(pCsr->pGlobal, iC, iS);
    }
  }
  return rc;
}

int sqlite4_mi_total_rows(sqlite4_context *pCtx, int *pn){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;
  if( pCsr==0 ){
    rc = SQLITE4_MISUSE;
  }else{
    rc = fts5CsrLoadGlobal(pCsr);
    if( rc==SQLITE4_OK ) *pn = pCsr->nGlobal;
  }
  return rc;
}


int sqlite4_mi_column_value(
  sqlite4_context *pCtx, 
  int iCol, 
  sqlite4_value **ppVal
){
  int rc = SQLITE4_OK;
  if( pCtx->pFts ){
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}

static Fts5Str *fts5FindStr(Fts5ExprNode *p, int *piStr){
  Fts5Str *pRet = 0;
................................................................................
    if( pRet==0 ) pRet = fts5FindStr(p->pRight, piStr);
  }
  return pRet;
}

int sqlite4_mi_match_count(
  sqlite4_context *pCtx, 
  int iC,
  int iS,
  int iPhrase,
  int *pnMatch
){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;
  if( pCsr ){
    int nMatch = 0;
    Fts5Str *pStr;
    int iCopy = iPhrase;
    InstanceList sList;

    pStr = fts5FindStr(pCsr->pExpr->pRoot, &iCopy);
    assert( pStr );

    fts5InstanceListInit(pStr->aList, pStr->nList, &sList);
    while( 0==fts5InstanceListNext(&sList) ){
      if( (iC<0 || sList.iCol==iC) && (iS<0 || sList.iStream==iS) ) nMatch++;
    }
    *pnMatch = nMatch;
  }else{
    rc = SQLITE4_MISUSE;
  }
  return rc;
}
................................................................................
  int *pnMatch,
  int *pnDoc,
  int *pnRelevant
){
  return SQLITE4_OK;
}

static void fts5StrLoadRowcounts(Fts5Str *pStr, int nStream, int *anRow){
  u32 mask = 0;
  int iPrevCol = 0;
  InstanceList sList;

  fts5InstanceListInit(pStr->aList, pStr->nList, &sList);
  while( 0==fts5InstanceListNext(&sList) ){
    if( sList.iCol!=iPrevCol ) mask = 0;
    if( (mask & (1<<sList.iStream))==0 ){
      anRow[sList.iCol * nStream + sList.iStream]++;
      mask |= (1<<sList.iStream);
      iPrevCol = sList.iCol;
    }
  }
}

static int fts5ExprLoadRowcounts(
  sqlite4 *db, 
  Fts5Info *pInfo,
  int nStream,
  Fts5ExprNode *pNode, 
  int **panRow
){
  int rc = SQLITE4_OK;

  if( pNode ){
    if( pNode->eType==TOKEN_PRIMITIVE ){
      int *anRow = *panRow;
      Fts5Phrase *pPhrase = pNode->pPhrase;

      rc = fts5ExprAdvance(db, pNode, 1);
      while( rc==SQLITE4_OK ){
        int nIncr =  pInfo->nCol * nStream;      /* Values for each Fts5Str */
        int i;
        for(i=0; i<pPhrase->nStr; i++){
          fts5StrLoadRowcounts(&pPhrase->aStr[i], nStream, &anRow[i*nIncr]);
        }
        rc = fts5ExprAdvance(db, pNode, 0);
      }

      *panRow = &anRow[pInfo->nCol * nStream * pPhrase->nStr];
    }

    if( rc==SQLITE4_OK ){
      rc = fts5ExprLoadRowcounts(db, pInfo, nStream, pNode->pLeft, panRow);
    }
    if( rc==SQLITE4_OK ){
      rc = fts5ExprLoadRowcounts(db, pInfo, nStream, pNode->pRight, panRow);
    }
  }

  return rc;
}

static int fts5CsrLoadRowcounts(Fts5Cursor *pCsr){
  int rc = SQLITE4_OK;

  if( pCsr->anRow==0 ){
    int nStream = pCsr->pGlobal->nStream;
    sqlite4 *db = pCsr->db;
    Fts5Expr *pCopy;
    Fts5Expr *pExpr = pCsr->pExpr;
    Fts5Info *pInfo = pCsr->pInfo;
    int *anRow;

    pCsr->anRow = anRow = (int *)sqlite4DbMallocZero(db, 
        pExpr->nPhrase * pInfo->nCol * pCsr->pGlobal->nStream * sizeof(int)
    );
    if( !anRow ) return SQLITE4_NOMEM;

    rc = fts5ParseExpression(db, pInfo->pTokenizer, pInfo->p, 
        pInfo->iRoot, pInfo->azCol, pInfo->nCol, pCsr->zExpr, &pCopy, 0
    );
    if( rc==SQLITE4_OK ){
      rc = fts5OpenExprCursors(db, pInfo, pCopy->pRoot);
    }

    if( rc==SQLITE4_OK ){
      rc = fts5ExprLoadRowcounts(db, pInfo, nStream, pCopy->pRoot, &anRow);
    }

    fts5ExpressionFree(db, pCopy);
  }

  return rc;
}

int sqlite4_mi_row_count(
  sqlite4_context *pCtx,          /* Context object passed to mi function */
  int iC,                         /* Specific column (or -ve for all columns) */
  int iS,                         /* Specific stream (or -ve for all streams) */
  int iP,                         /* Specific phrase */
  int *pn                         /* Total number of rows containing C/S/P */
){
  int rc = SQLITE4_OK;
  Fts5Cursor *pCsr = pCtx->pFts;

  if( pCsr==0 ){
    rc = SQLITE4_MISUSE;
  }else{
    rc = fts5CsrLoadGlobal(pCsr);
    if( rc==SQLITE4_OK ) rc = fts5CsrLoadRowcounts(pCsr);

    if( rc==SQLITE4_OK ){
      int i;
      int nRow = 0;
      int nStream = pCsr->pGlobal->nStream;
      int nCol = pCsr->pInfo->nCol;
      int *aRow = &pCsr->anRow[iP * nStream * nCol];

      if( iC<0 && iS<0 ){
        int nFin = nCol * nStream;
        for(i=0; i<nFin; i++) nRow += aRow[i];
      }else if( iC<0 ){
        for(i=0; i<nCol; i++) nRow += aRow[i*nStream + iS];
      }else if( iS<0 ){
        for(i=0; i<nStream; i++) nRow += aRow[nStream*iC + i];
      }else if( iC<nCol && iS<nStream ){
        nRow = aRow[iC * nStream + iS];
      }

      *pn = nRow;
    }
  }

  return rc;
}

/**************************************************************************
***************************************************************************
** Below this point is test code.
*/
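
The instance-list format read by fts5InstanceListNext() and written by fts5InstanceListAppend() in the diff above tags each varint with its low bits: 0x01 marks a column change, 0x03 a stream change, and a clear low bit an offset delta. The following is a minimal standalone sketch of that tagging scheme, not the SQLite4 code itself. The names DemoList, demoAppend(), demoNext(), demoPutVarint() and demoGetVarint() are hypothetical, and the varint helpers use a simple 7-bits-per-byte encoding as a stand-in for putVarint32()/getVarint32(), whose real byte format is assumed to differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for putVarint32()/getVarint32(): 7 bits per
** byte, low bits first. Only the tagging scheme below follows the diff. */
static int demoPutVarint(uint8_t *a, uint32_t v){
  int n = 0;
  while( v>=0x80 ){
    a[n++] = (uint8_t)((v & 0x7f) | 0x80);
    v >>= 7;
  }
  a[n++] = (uint8_t)v;
  return n;
}
static int demoGetVarint(const uint8_t *a, uint32_t *pv){
  uint32_t v = 0;
  int n = 0;
  int shift = 0;
  for(;;){
    uint8_t c = a[n++];
    v |= (uint32_t)(c & 0x7f) << shift;
    if( (c & 0x80)==0 ) break;
    shift += 7;
  }
  *pv = v;
  return n;
}

typedef struct DemoList DemoList;
struct DemoList {
  uint8_t *aList;                 /* Buffer holding the encoded list */
  int nList;                      /* Capacity (writer) or data size (reader) */
  int iList;                      /* Current byte offset into aList[] */
  int iCol;                       /* Column of current entry */
  int iStream;                    /* Stream of current entry */
  int iOff;                       /* Token offset of current entry */
};

/* Append one (column, stream, offset) entry. Entries must arrive in
** (column, offset) order, as asserted in the original. */
static void demoAppend(DemoList *p, int iCol, int iStream, int iOff){
  assert( iCol>p->iCol || (iCol==p->iCol && iOff>=p->iOff) );
  if( iCol!=p->iCol ){
    p->iList += demoPutVarint(&p->aList[p->iList], ((uint32_t)iCol<<2)|0x01);
    p->iCol = iCol;
    p->iOff = 0;                  /* Offsets restart in each column */
  }
  if( iStream!=p->iStream ){
    p->iList += demoPutVarint(&p->aList[p->iList], ((uint32_t)iStream<<2)|0x03);
    p->iStream = iStream;
  }
  p->iList += demoPutVarint(&p->aList[p->iList], (uint32_t)(iOff-p->iOff)<<1);
  p->iOff = iOff;
  assert( p->iList<=p->nList );
}

/* Advance to the next entry. Return 1 at EOF, or 0 if an entry was read. */
static int demoNext(DemoList *p){
  while( p->iList<p->nList ){
    uint32_t v;
    p->iList += demoGetVarint(&p->aList[p->iList], &v);
    if( (v & 0x03)==0x01 ){
      p->iCol = (int)(v>>2);
      p->iOff = 0;
    }else if( (v & 0x03)==0x03 ){
      p->iStream = (int)(v>>2);
    }else{
      p->iOff += (int)(v>>1);     /* Offset delta: this completes an entry */
      return 0;
    }
  }
  return 1;
}
```

Because most tokens are expected to belong to stream 0, a list that never leaves stream 0 carries no stream tags at all, so the extra field costs nothing for tokenizers that do not use streams.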

Changes to src/fts5func.c.
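
The calls changed in this file pass -1 for both the column and the stream argument of the reworked matchinfo functions, which the fts5.c side treats as "all columns" and "all streams". Here is a minimal standalone sketch of that wildcard lookup over a size array laid out as aSz[iCol*nStream + iStream]. DemoSize and demoGetSize() are hypothetical names rather than SQLite4 APIs, and returning 0 for any out-of-range index is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct DemoSize DemoSize;
struct DemoSize {
  int nCol;                       /* Number of columns */
  int nStream;                    /* Number of streams */
  const int64_t *aSz;             /* Token counts: aSz[iCol*nStream + iStream] */
};

/* Return the token count for column iC and stream iS. A negative iC or
** iS means "sum over all columns" or "sum over all streams". */
static int64_t demoGetSize(const DemoSize *p, int iC, int iS){
  int64_t n = 0;
  int i;
  assert( p->nCol>0 && p->nStream>0 );
  if( iC>=p->nCol || iS>=p->nStream ) return 0;   /* Assumed behaviour */
  if( iC<0 && iS<0 ){
    for(i=0; i<p->nCol*p->nStream; i++) n += p->aSz[i];
  }else if( iC<0 ){
    for(i=0; i<p->nCol; i++) n += p->aSz[i*p->nStream + iS];
  }else if( iS<0 ){
    for(i=0; i<p->nStream; i++) n += p->aSz[iC*p->nStream + i];
  }else{
    n = p->aSz[iC*p->nStream + iS];
  }
  return n;
}
```

With this shape, a ranking function that ignores streams passes -1 for the stream argument, while one that wants to weight a sub-field separately names its stream number explicitly.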

      int ni;                     /* Number of docs with phrase i */

      p->db = db;
      p->nPhrase = nPhrase;
      p->aIdf = (double *)&p[1];

      /* Determine the IDF weight for each phrase in the query. */
      rc = sqlite4_mi_row_count(pCtx, -1, -1, &N);
      for(i=0; rc==SQLITE4_OK && i<nPhrase; i++){
        rc = sqlite4_mi_row_count(pCtx, -1, i, &ni);
        if( rc==SQLITE4_OK ){
          p->aIdf[i] = log((0.5 + N - ni) / (0.5 + ni));
        }
      }

      /* Determine the average document length */
      if( rc==SQLITE4_OK ){
        int nTotal;
        rc = sqlite4_mi_total_size(pCtx, -1, &nTotal);
        if( rc==SQLITE4_OK ){
          p->avgdl = (double)nTotal / (double)N;
        }
      }
    }
  }

................................................................................
    int dl;                     /* Tokens in this row (document length) */
    double L;                   /* Normalized document length */
    double prank;               /* Contribution to rank of this phrase */

    /* Set variable tf to the total number of occurrences of phrase iPhrase
    ** in this row (within any column). And dl to the number of tokens in
    ** the current row (again, in any column).  */
    rc = sqlite4_mi_match_count(pCtx, -1, i, &tf); 
    if( rc==SQLITE4_OK ) rc = sqlite4_mi_column_size(pCtx, -1, &dl); 

    /* Calculate the normalized document length */
    L = (double)dl / p->avgdl;

    /* Calculate the contribution to the rank made by this phrase. Then
    ** add it to variable rank.  */
    prank = (p->aIdf[i] * tf) / (k1 * ( (1.0 - b) + b * L) + tf);







After (lines 101-126 and 129-144):
      int ni;                     /* Number of docs with phrase i */

      p->db = db;
      p->nPhrase = nPhrase;
      p->aIdf = (double *)&p[1];

      /* Determine the IDF weight for each phrase in the query. */
      rc = sqlite4_mi_total_rows(pCtx, &N);
      for(i=0; rc==SQLITE4_OK && i<nPhrase; i++){
        rc = sqlite4_mi_row_count(pCtx, -1, -1, i, &ni);
        if( rc==SQLITE4_OK ){
          p->aIdf[i] = log((0.5 + N - ni) / (0.5 + ni));
        }
      }

      /* Determine the average document length */
      if( rc==SQLITE4_OK ){
        int nTotal;
        rc = sqlite4_mi_total_size(pCtx, -1, -1, &nTotal);
        if( rc==SQLITE4_OK ){
          p->avgdl = (double)nTotal / (double)N;
        }
      }
    }
  }

................................................................................
    int dl;                     /* Tokens in this row (document length) */
    double L;                   /* Normalized document length */
    double prank;               /* Contribution to rank of this phrase */

    /* Set variable tf to the total number of occurrences of phrase iPhrase
    ** in this row (within any column). And dl to the number of tokens in
    ** the current row (again, in any column).  */
    rc = sqlite4_mi_match_count(pCtx, -1, -1, i, &tf); 
    if( rc==SQLITE4_OK ) rc = sqlite4_mi_size(pCtx, -1, -1, &dl); 

    /* Calculate the normalized document length */
    L = (double)dl / p->avgdl;

    /* Calculate the contribution to the rank made by this phrase. Then
    ** add it to variable rank.  */
    prank = (p->aIdf[i] * tf) / (k1 * ( (1.0 - b) + b * L) + tf);

Changes to src/sqlite.h.in.

Before (lines 4419-4493):

/*
** Special functions that may be called from within matchinfo UDFs. All
** return an SQLite error code - SQLITE4_OK if successful, or some other
** error code otherwise.
**
** sqlite4_mi_column_count():
**   Set *pnCol to the number of columns in the queried table.
**
** sqlite4_mi_column_size():
**   Set *pnToken to the number of tokens in the value stored in column iCol 
**   of the current row.
**
** sqlite4_mi_column_value():
**   Set *ppVal to point to an sqlite4_value object containing the value
**   read from column iCol of the current row. This object is valid until
**   the function callback returns.
**
** sqlite4_mi_phrase_count():
**   Set *pnPhrase to the number of phrases in the query.
**
** sqlite4_mi_match_count():
**   Set *pn to the number of occurrences of phrase iPhrase in column iCol of
**   the current row.
**
** sqlite4_mi_total_match_count():
**   Set *pnMatch to the total number of occurrences of phrase iPhrase
**   in column iCol of all rows in the indexed table. Set *pnDoc to the
**   number of rows that contain at least one match for phrase iPhrase in
**   column iCol.
**
** sqlite4_mi_match_offset():
**   Set *piOff to the token offset of the iMatch'th instance of phrase
**   iPhrase in column iCol of the current row. If any parameter is out
**   of range (i.e. too large) it is not an error. In this case *piOff is 
**   set to -1 before returning.
**   
** sqlite4_mi_total_size():
**   Set *pnToken to the total number of tokens in column iCol of all rows
**   in the indexed table.
**
** sqlite4_mi_row_count():
**   If parameter iPhrase is negative, this function sets the output 
**   parameter to the total number of documents in the collection (rows 
**   in the indexed table).
**
**   Otherwise, if iPhrase is not negative, then the output is set to the
**   total number of rows that contain at least one instance of phrase iPhrase
**   in column iCol, or in any column if iCol is negative.
**
**   If parameter iPhrase is equal to or greater than the number of phrases
**   in the current query, or if iCol is equal to or greater than the number
**   of columns in the indexed table, SQLITE4_MISUSE is returned. The value
**   of the output parameter is undefined in this case.
*/

int sqlite4_mi_column_count(sqlite4_context *, int *pnCol);
int sqlite4_mi_phrase_count(sqlite4_context *, int *pnPhrase);

int sqlite4_mi_column_size(sqlite4_context *, int iCol, int *pnToken);
int sqlite4_mi_match_count(sqlite4_context *, int iCol, int iPhrase, int *pn);
int sqlite4_mi_total_size(sqlite4_context *, int iCol, int *pnToken);
int sqlite4_mi_row_count(sqlite4_context *, int iCol, int iPhrase, int *pnRow);

int sqlite4_mi_column_value(sqlite4_context *, int iCol, sqlite4_value **ppVal);
int sqlite4_mi_match_detail(sqlite4_context *, 
    int iCol, int iPhrase, int iMatch, int *piOff, int *piWeight
);






/*
** Undo the hack that converts floating point types to integer for
** builds on processors without floating point support.
*/
#ifdef SQLITE4_OMIT_FLOATING_POINT







After (lines 4419-4497):

/*
** Special functions that may be called from within matchinfo UDFs. All
** return an SQLite error code - SQLITE4_OK if successful, or some other
** error code otherwise.
**
** sqlite4_mi_column_count():
**   Set *pn to the number of columns in the queried table.
**
** sqlite4_mi_phrase_count():
**   Set *pn to the number of phrases in the query.
**
** sqlite4_mi_stream_count():
**   Set *pn to the number of streams in the FTS index.
**
** sqlite4_mi_size():
**   Set *pn to the number of tokens belonging to stream iS in the value 
**   stored in column iC of the current row. 
**
**   Either or both of iS and iC may be negative. If iC is negative, then the
**   output value is the total number of tokens for the specified stream (or
**   streams) across all table columns. Similarly, if iS is negative, the 
**   output value is the total number of tokens in the specified column or 
**   columns, regardless of stream.
**
** sqlite4_mi_total_size():
**   Similar to sqlite4_mi_size(), except the output parameter is set to
**   the total number of tokens belonging to the specified column(s) 
**   and stream(s) in all rows of the table, not just the current row.
**
** sqlite4_mi_total_rows():
**   Set *pn to the total number of rows in the indexed table.
**
** sqlite4_mi_row_count():
**   Set the output parameter to the total number of rows in the table that
**   contain at least one instance of the phrase identified by parameter
**   iP in the column(s) and stream(s) identified by parameters iC and iS.
**
** sqlite4_mi_match_count():
**   Set the output parameter to the total number of occurrences of phrase
**   iP in the current row that belong to the column(s) and stream(s) 
**   identified by parameters iC and iS.
**
**   Parameter iP may also be negative. In this case, the output value is
**   set to the total number of occurrences of all query phrases in the
**   current row, subject to the constraints imposed by iC and iS.
**
** sqlite4_mi_match_detail():
**   This function may be used to iterate through all matches in the
**   current row in order of occurrence.
**
** sqlite4_mi_column_value():
**   Set *ppVal to point to an sqlite4_value object containing the value
**   read from column iCol of the current row. This object is valid until
**   the function callback returns.
*/
int sqlite4_mi_column_count(sqlite4_context *, int *pn);
int sqlite4_mi_phrase_count(sqlite4_context *, int *pn);
int sqlite4_mi_stream_count(sqlite4_context *, int *pn);

int sqlite4_mi_total_size(sqlite4_context *, int iC, int iS, int *pn);
int sqlite4_mi_total_rows(sqlite4_context *, int *pn);

int sqlite4_mi_row_count(sqlite4_context *, int iC, int iS, int iP, int *pn);

int sqlite4_mi_size(sqlite4_context *, int iC, int iS, int *pn);
int sqlite4_mi_match_count(sqlite4_context *, int iC, int iS, int iP, int *pn);
int sqlite4_mi_match_detail(
    sqlite4_context *, int iMatch, int *piOff, int *piC, int *piS, int *piP
);
int sqlite4_mi_column_value(sqlite4_context *, int iCol, sqlite4_value **ppVal);



/*
** Undo the hack that converts floating point types to integer for
** builds on processors without floating point support.
*/
#ifdef SQLITE4_OMIT_FLOATING_POINT