Documentation Source Text

Check-in [27ee6d0233]

Overview
Comment: Update fts3 documentation to mention that fts3_tokenizer(x,y) is only available if SQLITE_ENABLE_FTS3_TOKENIZER is defined at compile time.
SHA1: 27ee6d023315774f4dd803799b068b09302c708d
User & Date: dan 2016-02-08 19:55:42.153
Context
2016-02-09
14:43
Update the 3.11.0 change log. (check-in: 0dc9452382 user: drh tags: trunk)
2016-02-08
19:55
Update fts3 documentation to mention that fts3_tokenizer(x,y) is only available if SQLITE_ENABLE_FTS3_TOKENIZER is defined at compile time. (check-in: 27ee6d0233 user: dan tags: trunk)
2016-02-03
19:47
Update the documentation for PRAGMA synchronous=EXTRA. (check-in: bcf687da9c user: drh tags: trunk)
Changes
Changes to pages/fts3.in.
  case-sensitive. In the example above, specifying that "X" is a separator
  character does not affect the way "x" is handled.

<h2>Custom (User Implemented) Tokenizers</h2>

<p>
  As well as the built-in "simple", "porter" and (possibly) "icu" and
  "unicode61" tokenizers,
  FTS exports an interface that allows users to implement custom tokenizers
  using C. The interface used to create a new tokenizer is defined and
  described in the fts3_tokenizer.h source file.

<p>
  Registering a new FTS tokenizer is similar to registering a new
  virtual table module with SQLite. The user passes a pointer to a
  structure containing pointers to various callback functions that
  make up the implementation of the new tokenizer type. For tokenizers,
  the structure (defined in fts3_tokenizer.h) is called







  case-sensitive. In the example above, specifying that "X" is a separator
  character does not affect the way "x" is handled.

<h2>Custom (User Implemented) Tokenizers</h2>

<p>
  As well as the built-in "simple", "porter" and (possibly) "icu" and
  "unicode61" tokenizers, if the library is compiled with the following 
  compiler option:

<codeblock>
  -DSQLITE_ENABLE_FTS3_TOKENIZER
</codeblock>

<p>
  then FTS exports an interface that allows users to implement custom
  tokenizers using C. The interface used to create a new tokenizer is defined
  and described in the fts3_tokenizer.h source file.

<p>
  Registering a new FTS tokenizer is similar to registering a new
  virtual table module with SQLite. The user passes a pointer to a
  structure containing pointers to various callback functions that
  make up the implementation of the new tokenizer type. For tokenizers,
  the structure (defined in fts3_tokenizer.h) is called








  it is registered as tokenizer &lt;tokenizer-name&gt; and a copy of it
  returned. If only one argument is passed, a pointer to the tokenizer
  implementation currently registered as &lt;tokenizer-name&gt; is returned,
  encoded as a blob. Or, if no such tokenizer exists, an SQL exception
  (error) is raised.

<p>
  <b>SECURITY WARNING</b>: If the fts3/4 extension is used in an environment
  where potentially malicious users may execute arbitrary SQL, they should
  be prevented from invoking the fts3_tokenizer() function, possibly using
  the [sqlite3_set_authorizer()|authorization callback].

<p>
  The following block contains an example of calling the fts3_tokenizer()
  function from C code:

<codeblock>
  <i>/*







  it is registered as tokenizer &lt;tokenizer-name&gt; and a copy of it
  returned. If only one argument is passed, a pointer to the tokenizer
  implementation currently registered as &lt;tokenizer-name&gt; is returned,
  encoded as a blob. Or, if no such tokenizer exists, an SQL exception
  (error) is raised.

<p>
  As of SQLite version 3.11.0, the second form of the fts3_tokenizer() function
  is only available if the library is compiled with the
  -DSQLITE_ENABLE_FTS3_TOKENIZER compiler switch. In earlier versions it was
  always available.

<p>
  <b>SECURITY WARNING</b>: 
  If a version of the fts3/4 extension that supports the second form of
  fts3_tokenizer() is deployed in an environment where potentially malicious
  users may execute arbitrary SQL, they should be prevented from invoking the
  fts3_tokenizer() function, possibly using the 
  [sqlite3_set_authorizer()|authorization callback].

<p>
  The following block contains an example of calling the fts3_tokenizer()
  function from C code:

<codeblock>
  <i>/*