Documentation Source Text

Check-in [30e7de5540]

Overview
Comment: Update the fts3_tokenizer() documentation to describe how it now wants parameters to be bound.
SHA3-256: 30e7de5540a1bfcf7cefa69994d6a2362f6e12883407fa1dbe93887159ad1395
User & Date: drh 2019-04-02 01:06:29.033
Context
2019-04-02
13:26 Add the four new keywords: EXCLUDE GROUPS OTHERS TIES (check-in: 8db10efb63 user: drh tags: trunk)
01:06 Update the fts3_tokenizer() documentation to describe how it now wants parameters to be bound. (check-in: 30e7de5540 user: drh tags: trunk)
2019-04-01
14:45 Merge changes from the 3.27 branch. (check-in: 50513b6f28 user: drh tags: trunk)
Changes
Changes to pages/changes.in.
     files that are already in the archive and are unchanged.  Add the
     new --insert option that works like --update used to work.
</ol>
<li> Added the [https://sqlite.org/src/file/ext/misc/fossildelta.c|fossildelta.c]
     extension that can create, apply, and deconstruct the 
 [https://fossil-scm.org/fossil/doc/trunk/www/delta_format.wiki|Fossil DVCS file delta format]
     that is used by the [RBU extension].
<li> The [fts3_tokenizer()] function is restricted to return only NULL
     values unless application-defined FTS3 tokenizers are enabled using
     the [sqlite3_db_config]([SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER])
     setting.
<li> Improved robustness against corrupt database files.
<li> Miscellaneous performance enhancements
<li> Established a Git mirror of the official SQLite source tree.
     The canonical sources for SQLite are maintained using the
     [https://fossil-scm.org/|Fossil DVCS] at [https://sqlite.org/src].
     The Git mirror can be seen at [https://github.com/sqlite/sqlite].
}

     files that are already in the archive and are unchanged.  Add the
     new --insert option that works like --update used to work.
</ol>
<li> Added the [https://sqlite.org/src/file/ext/misc/fossildelta.c|fossildelta.c]
     extension that can create, apply, and deconstruct the 
 [https://fossil-scm.org/fossil/doc/trunk/www/delta_format.wiki|Fossil DVCS file delta format]
     that is used by the [RBU extension].
<li> Added the [sqlite3_value_frombind()] API for determining if the argument
     to an SQL function is from a [bound parameter].
<li> Security and compatibility enhancements to [fts3_tokenizer()]:
<ol type="a">
<li> The [fts3_tokenizer()] function always returns NULL
     unless either the legacy application-defined FTS3 tokenizer interface
     is enabled using
     the [sqlite3_db_config]([SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER])
     setting, or the first argument to fts3_tokenizer() is a [bound parameter].
<li> The two-argument version of [fts3_tokenizer()] accepts a pointer to the
     tokenizer method object even without
     the [sqlite3_db_config]([SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER]) setting
     if the second argument is a [bound parameter].
</ol>
<li> Improved robustness against corrupt database files.
<li> Miscellaneous performance enhancements
<li> Established a Git mirror of the official SQLite source tree.
     The canonical sources for SQLite are maintained using the
     [https://fossil-scm.org/|Fossil DVCS] at [https://sqlite.org/src].
     The Git mirror can be seen at [https://github.com/sqlite/sqlite].
}
Changes to pages/fts3.in.

<codeblock>
    SELECT fts3_tokenizer(&lt;tokenizer-name&gt;);
    SELECT fts3_tokenizer(&lt;tokenizer-name&gt;, &lt;sqlite3_tokenizer_module ptr&gt;);
</codeblock>

<p>
  Where &lt;tokenizer-name&gt; is a string identifying the tokenizer and
  &lt;sqlite3_tokenizer_module ptr&gt; is a pointer to an sqlite3_tokenizer_module
  structure encoded as an SQL blob. If the second argument is present,
  it is registered as tokenizer &lt;tokenizer-name&gt; and a copy of it
  is returned. If only one argument is passed, a pointer to the tokenizer
  implementation currently registered as &lt;tokenizer-name&gt; is returned,
  encoded as a blob. Or, if no such tokenizer exists, an SQL exception
  (error) is raised.

<p>
  Because of security concerns, SQLite [version 3.11.0] ([dateof:3.11.0])
  and later only enable the
  second form of the fts3_tokenizer() function when the library is compiled
  with the [SQLITE_ENABLE_FTS3_TOKENIZER | -DSQLITE_ENABLE_FTS3_TOKENIZER]
  option. In earlier versions it was
  always available.  Beginning with SQLite [version 3.12.0]
  ([dateof:3.12.0]), the second form of
  fts3_tokenizer() can also be activated at run-time by calling
  [sqlite3_db_config](db,[SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER],1,0).

<p>
  <b>SECURITY WARNING</b>:
  If a version of the fts3/4 extension that supports the two-argument form of
  fts3_tokenizer() is deployed in an environment where malicious users can
  run arbitrary SQL, then those users should be prevented from invoking the
  two-argument fts3_tokenizer() function.
  This can be done using the [sqlite3_set_authorizer()|authorization callback],
  or by disabling the two-argument fts3_tokenizer() interface using a
  call to
  [sqlite3_db_config](db,[SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER],0,0).

<p>
  The following block contains an example of calling the fts3_tokenizer()
  function from C code:

<codeblock>
  <i>/*
  ** Register a tokenizer implementation with FTS3 or FTS4.
  */</i>
  int registerTokenizer(
    sqlite3 *db,
    char *zName,
    const sqlite3_tokenizer_module *p
  ){
    int rc;
    sqlite3_stmt *pStmt;
    const char *zSql = "SELECT fts3_tokenizer(?, ?)";

    rc = sqlite3_prepare_v2(db, zSql, -1, &pStmt, 0);
    if( rc!=SQLITE_OK ){
      return rc;
    }

    sqlite3_bind_text(pStmt, 1, zName, -1, SQLITE_STATIC);

<codeblock>
    SELECT fts3_tokenizer(&lt;tokenizer-name&gt;);
    SELECT fts3_tokenizer(&lt;tokenizer-name&gt;, &lt;sqlite3_tokenizer_module ptr&gt;);
</codeblock>

<p>
  Where &lt;tokenizer-name&gt; is a [parameter] to which a string is bound using
  [sqlite3_bind_text()], where the string identifies the tokenizer, and
  &lt;sqlite3_tokenizer_module ptr&gt; is a [parameter] to which a BLOB is
  bound using [sqlite3_bind_blob()], where the value of the BLOB is a
  pointer to an sqlite3_tokenizer_module structure.
  If the second argument is present,
  it is registered as tokenizer &lt;tokenizer-name&gt; and a copy of it
  is returned. If only one argument is passed, a pointer to the tokenizer
  implementation currently registered as &lt;tokenizer-name&gt; is returned,
  encoded as a blob. Or, if no such tokenizer exists, an SQL exception
  (error) is raised.
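
<p>
  As a rough illustration of what "a BLOB whose value is a pointer" means, the
  sketch below copies a pointer value into a byte buffer and recovers it again.
  It does not call SQLite. The type demo_tokenizer_module and the helpers
  encodePointerAsBlob() and decodePointerFromBlob() are hypothetical names
  invented for this example; real code would presumably pass the address and
  size of the pointer variable to [sqlite3_bind_blob()] rather than copying by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
** Hypothetical stand-in for sqlite3_tokenizer_module. The real structure
** has more fields; only its address matters for this sketch.
*/
typedef struct demo_tokenizer_module demo_tokenizer_module;
struct demo_tokenizer_module {
  int iVersion;
};

/*
** Copy the pointer value itself (not the structure it points to) into
** aBlob. The caller must supply a buffer of at least sizeof(p) bytes.
*/
void encodePointerAsBlob(
  const demo_tokenizer_module *p,
  unsigned char *aBlob
){
  assert( aBlob!=0 );
  memcpy(aBlob, &p, sizeof(p));
}

/*
** Recover the pointer from a blob of nBlob bytes. Return 0 if the blob
** is not exactly the size of a pointer.
*/
const demo_tokenizer_module *decodePointerFromBlob(
  const unsigned char *aBlob,
  size_t nBlob
){
  const demo_tokenizer_module *p = 0;
  if( aBlob==0 || nBlob!=sizeof(p) ) return 0;
  memcpy(&p, aBlob, sizeof(p));
  return p;
}
```

<p>
  The encoded value is only sizeof(pointer) bytes long and is meaningful only
  inside the process that produced it, which is one way to see why accepting
  such a value from untrusted SQL text is a security concern.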

<p>
  Prior to SQLite [version 3.11.0] ([dateof:3.11.0]), the arguments to
  fts3_tokenizer() could be literal strings or BLOBs. They did not have to
  be [bound parameters].  But that could lead to security problems in the
  event of an SQL injection.  Hence, the legacy behavior is now disabled
  by default.  But the legacy behavior can be enabled, for backwards
  compatibility in applications that really need it,
  by calling
  [sqlite3_db_config](db,[SQLITE_DBCONFIG_ENABLE_FTS3_TOKENIZER],1,0).

<p>
  The following block contains an example of calling the fts3_tokenizer()
  function from C code:

<codeblock>
  <i>/*
  ** Register a tokenizer implementation with FTS3 or FTS4.
  */</i>
  int registerTokenizer(
    sqlite3 *db,
    char *zName,
    const sqlite3_tokenizer_module *p
  ){
    int rc;
    sqlite3_stmt *pStmt;
    const char *zSql = "SELECT fts3_tokenizer(?1, ?2)";

    rc = sqlite3_prepare_v2(db, zSql, -1, &pStmt, 0);
    if( rc!=SQLITE_OK ){
      return rc;
    }

    sqlite3_bind_text(pStmt, 1, zName, -1, SQLITE_STATIC);