Documentation Source Text

Check-in [2ed85cc08c]

Overview
Comment: Documentation on the unicode61 tokenizer and the ability to use shared-cache with in-memory databases.
SHA1: 2ed85cc08c697a33cfa03f6ae665c3ab0731dfb4
User & Date: drh 2012-05-28 17:05:12
Context
2012-05-28 17:30  Further enhancements to the in-memory shared-cache documentation. check-in: bc46aa4246 user: drh tags: trunk
2012-05-28 17:05  Documentation on the unicode61 tokenizer and the ability to use shared-cache with in-memory databases. check-in: 2ed85cc08c user: drh tags: trunk
2012-05-22 02:49  Update SHA1 sums for version 3.7.12.1. check-in: dbe5954269 user: drh tags: trunk
Changes

Changes to pages/changes.in.

      <a href="http://www.sqlite.org/src/timeline">
      http://www.sqlite.org/src/timeline</a>.</p>
    }
    hd_close_aux
    hd_enable_main 1
  }
}

chng {date unknown (3.7.13)} {
<li>Add support for the [unicode61] tokenizer in [FTS3].
<li>[in-memory database | In-memory databases] that are specified using
    [URI filenames] are allowed to use [shared cache], so that the same
    in-memory database can be accessed from multiple database connections.
<li>Recognize and use the [coreqp | mode=memory] query parameter in
    [URI filenames].
}

chng {2012 May 22 (3.7.12.1)} {
<li>Fix a bug 
    [http://www.sqlite.org/src/info/c2ad16f997ee9c | (ticket c2ad16f997)]
    in the 3.7.12 release that can cause a segfault for certain
    obscure nested aggregate queries.
<li>Fix various other minor test script problems.

Changes to pages/compile.in.

COMPILE_OPTION {SQLITE_DISABLE_DIRSYNC} {
  If this C-preprocessor macro is defined, directory syncs
  are disabled.  SQLite typically attempts to sync the parent
  directory when a file is deleted to ensure the directory
  entries are updated immediately on disk.
}

COMPILE_OPTION {SQLITE_DISABLE_FTS3_UNICODE} {
  If this C-preprocessor macro is defined, the [unicode61] tokenizer
  in [FTS3] is omitted from the build and is unavailable to 
  applications.
}
</tcl>

<tcl>
  hd_fragment "omitfeatures"
  hd_keywords "omitfeatures"
</tcl>
<h2>1.6 Options To Omit Features</h2>

Changes to pages/fts3.in.

  text according to the ICU rules for finding word boundaries and discards
  any tokens that consist entirely of white-space. This may be suitable
  for some applications in some locales, but not all. If more complex
  processing is required, for example to implement stemming or
  discard punctuation, this can be done by creating a tokenizer
  implementation that uses the ICU tokenizer as part of its implementation.

<tcl>hd_fragment unicode61 unicode61</tcl>
<p>
  The "unicode61" tokenizer is available beginning with SQLite [version 3.7.13].
  Unicode61 works very much like "simple" except that it does full Unicode
  case folding according to the rules in Unicode Version 6.1, and it recognizes
  Unicode space and punctuation characters and uses those to separate tokens.
  The simple tokenizer only does case folding of ASCII characters and only
  recognizes ASCII space and punctuation characters as token separators.
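
The difference between the two tokenizers can be checked with a short script. This is a minimal sketch using Python's standard sqlite3 module rather than the C API; it assumes the linked SQLite library is 3.7.13 or later and was built with FTS3 and the unicode61 tokenizer enabled. The helper name match_count and the table name docs are made up for illustration.

```python
import sqlite3


def match_count(tokenizer, text, query):
    """Index `text` in a one-row FTS3 table and count rows matching `query`.

    `tokenizer` is interpolated into the SQL, so pass only trusted names
    such as "simple" or "unicode61".
    """
    db = sqlite3.connect(":memory:")
    try:
        db.execute(
            f"CREATE VIRTUAL TABLE docs USING fts3(body, tokenize={tokenizer})"
        )
        db.execute("INSERT INTO docs(body) VALUES (?)", (text,))
        row = db.execute(
            "SELECT count(*) FROM docs WHERE body MATCH ?", (query,)
        ).fetchone()
        return row[0]
    finally:
        db.close()


if __name__ == "__main__":
    # Upper-case E-acute in the document, lower-case e-acute in the query.
    print(match_count("unicode61", "\u00c9COLE", "\u00e9cole"))
    print(match_count("simple", "\u00c9COLE", "\u00e9cole"))
```

Under those assumptions, unicode61 should find the row because it folds the non-ASCII capital, while simple should not, since it only folds ASCII letters.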

<h2>Custom (User Implemented) Tokenizers</h2>

<p>
  As well as the built-in "simple", "porter" and (possibly) "icu" tokenizers,
  FTS exports an interface that allows users to implement custom tokenizers
  using C. The interface used to create a new tokenizer is defined and 
  described in the fts3_tokenizer.h source file.

Changes to pages/inmemorydb.in.

in memory is to open the database using the special filename
"<b>:memory:</b>".  In other words, instead of passing the name of
a real disk file into one of the [sqlite3_open()], [sqlite3_open16()], or
[sqlite3_open_v2()] functions, pass in the string ":memory:".  For
example:</p>

<blockquote><pre>
rc = sqlite3_open(":memory:", &db);
</pre></blockquote>

<p>When this is done, no disk file is opened.  
Instead, a new database is created
purely in memory.  The database ceases to exist as soon as the database
connection is closed.  Every :memory: database is distinct from every
other.  So, opening two database connections each with the filename
":memory:" will create two independent in-memory databases.</p>

<p>The special filename ":memory:" can be used anywhere that a database
filename is permitted.  For example, it can be used as the
<i>filename</i> in an [ATTACH] command:</p>

<blockquote>
<b>ATTACH DATABASE ':memory:' AS aux1;</b>
</blockquote>

<p>Note that in order for the special ":memory:" name to apply and to
create a pure in-memory database, there must be no additional text in the
filename.  Thus, a disk-based database can be created in a file by prepending
a pathname, like this:  "./:memory:".</p>

<tcl>hd_fragment temp_db {temporary tables} {temporary databases}</tcl>
<h2>Temporary Databases</h2>

<p>When the name of the database file handed to [sqlite3_open()] or to
[ATTACH] is an empty string, then a new temporary file is created to hold
the database.</p>

<blockquote><pre>
rc = sqlite3_open("", &db);
</pre></blockquote>

<blockquote><b>
ATTACH DATABASE '' AS aux2;
</b></blockquote>

<p>A different temporary file is created each time, so that, just as
with the special ":memory:" string, two database connections to temporary
databases each have their own private database.  Temporary databases are
automatically deleted when the connection that created them closes.</p>

<p>Even though a disk file is allocated for each temporary database, in







in memory is to open the database using the special filename
"<b>:memory:</b>".  In other words, instead of passing the name of
a real disk file into one of the [sqlite3_open()], [sqlite3_open16()], or
[sqlite3_open_v2()] functions, pass in the string ":memory:".  For
example:</p>

<blockquote><pre>
rc = sqlite3_open(":memory:", &amp;db);
</pre></blockquote>

<p>When this is done, no disk file is opened.  
Instead, a new database is created
purely in memory.  The database ceases to exist as soon as the database
connection is closed.  Every :memory: database is distinct from every
other.  So, opening two database connections each with the filename
":memory:" will create two independent in-memory databases.</p>
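
The independence of plain ":memory:" databases can be seen in a few lines. This is a minimal sketch using Python's standard sqlite3 module instead of the C API shown above; the function name and table name are hypothetical.

```python
import sqlite3

LIST_TABLES = "SELECT name FROM sqlite_master WHERE type='table'"


def private_memory_dbs():
    """Open two ":memory:" connections and list the tables each one sees."""
    a = sqlite3.connect(":memory:")
    b = sqlite3.connect(":memory:")
    try:
        # The table is created through connection `a` only.
        a.execute("CREATE TABLE t(x)")
        return a.execute(LIST_TABLES).fetchall(), b.execute(LIST_TABLES).fetchall()
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    tables_a, tables_b = private_memory_dbs()
    print(tables_a, tables_b)
```

Connection `a` sees its table, while connection `b` sees an empty schema because it has its own separate database.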

<p>The special filename ":memory:" can be used anywhere that a database
filename is permitted.  For example, it can be used as the
<i>filename</i> in an [ATTACH] command:</p>

<blockquote><pre>
ATTACH DATABASE ':memory:' AS aux1;
</pre></blockquote>

<p>Note that in order for the special ":memory:" name to apply and to
create a pure in-memory database, there must be no additional text in the
filename.  Thus, a disk-based database can be created in a file by prepending
a pathname, like this:  "./:memory:".</p>

<p>The special ":memory:" filename also works when using [URI filenames].
For example:

<blockquote><pre>
rc = sqlite3_open("file::memory:", &amp;db);
</pre></blockquote>

Or,

<blockquote><pre>
ATTACH DATABASE 'file::memory:' AS aux1;
</pre></blockquote>

<tcl>hd_fragment sharedmemdb {in-memory shared cache database}</tcl>
<h2>In-memory Databases And Shared Cache</h2>

<p>In-memory databases are allowed to use [shared cache] if they are
opened using a [URI filename].  If the unadorned ":memory:" name is used
to specify the in-memory database, then that database always has a private
cache and is thus only visible to the database connection that originally
opened it.  However, the same in-memory database can be opened by two or
more database connections as follows:

<blockquote><pre>
rc = sqlite3_open("file::memory:?cache=shared", &amp;db);
</pre></blockquote>

Or,

<blockquote><pre>
ATTACH DATABASE 'file::memory:?cache=shared' AS aux1;
</pre></blockquote>

<p>This allows separate database connections to share the same
in-memory database.  Of course, all database connections sharing the
in-memory database need to be in the same process.  The database is
automatically deleted and memory is reclaimed when the last connection
to the database closes.
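
The sharing behavior can be sketched as follows, using Python's standard sqlite3 module in place of the C API. It assumes the linked SQLite library is 3.7.13 or later with shared cache compiled in; the function and table names are made up. The write is committed before the second connection reads, because shared-cache table locks would otherwise block the reader.

```python
import sqlite3

SHARED_URI = "file::memory:?cache=shared"


def shared_memory_demo():
    """Write through one connection, read the same data through another."""
    writer = sqlite3.connect(SHARED_URI, uri=True)
    reader = sqlite3.connect(SHARED_URI, uri=True)
    try:
        writer.execute("CREATE TABLE t(x)")
        writer.execute("INSERT INTO t VALUES (42)")
        writer.commit()  # release the write lock before reading elsewhere
        return reader.execute("SELECT x FROM t").fetchall()
    finally:
        # Once the last connection closes, the database is deleted.
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(shared_memory_demo())
```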

<p>If two or more distinct but shareable in-memory databases are needed
in a single process, then the [coreqp | mode=memory] query parameter can
be used with a [URI filename] to create a named in-memory database:

<blockquote><pre>
rc = sqlite3_open("file:memdb1?mode=memory&amp;cache=shared", &amp;db);
</pre></blockquote>

Or,

<blockquote><pre>
ATTACH DATABASE 'file:memdb1?mode=memory&amp;cache=shared' AS aux1;
</pre></blockquote>

<p>When an in-memory database is named in this way, it will only share its
cache with another connection that uses exactly the same name.
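
Named in-memory databases can be sketched the same way. This example uses Python's standard sqlite3 module rather than the C API and assumes SQLite 3.7.13 or later with shared cache compiled in; open_named and the table name are hypothetical. Note that the `&amp;` in the HTML source above is a plain `&` in the actual URI.

```python
import sqlite3

LIST_TABLES = "SELECT name FROM sqlite_master WHERE type='table'"


def open_named(name):
    """Open a named, shareable in-memory database (hypothetical helper)."""
    return sqlite3.connect(f"file:{name}?mode=memory&cache=shared", uri=True)


def named_memory_demo():
    """Return the tables seen by a second memdb1 connection and by memdb2."""
    db1_a = open_named("memdb1")
    db1_b = open_named("memdb1")
    db2 = open_named("memdb2")
    try:
        db1_a.execute("CREATE TABLE only_in_memdb1(x)")
        db1_a.commit()
        return db1_b.execute(LIST_TABLES).fetchall(), db2.execute(LIST_TABLES).fetchall()
    finally:
        for conn in (db1_a, db1_b, db2):
            conn.close()


if __name__ == "__main__":
    print(named_memory_demo())
```

The second "memdb1" connection sees the table, while "memdb2" stays empty because its name differs.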



<tcl>hd_fragment temp_db {temporary tables} {temporary databases}</tcl>
<h2>Temporary Databases</h2>

<p>When the name of the database file handed to [sqlite3_open()] or to
[ATTACH] is an empty string, then a new temporary file is created to hold
the database.</p>

<blockquote><pre>
rc = sqlite3_open("", &amp;db);
</pre></blockquote>

<blockquote><pre>
ATTACH DATABASE '' AS aux2;
</pre></blockquote>

<p>A different temporary file is created each time, so that, just as
with the special ":memory:" string, two database connections to temporary
databases each have their own private database.  Temporary databases are
automatically deleted when the connection that created them closes.</p>

<p>Even though a disk file is allocated for each temporary database, in

Changes to pages/uri.in.

the first query parameters, each key and value, and each subsequent key
from the prior value.
^The list of query parameters appended to the xOpen filename
is terminated by a single zero-length key.
Note that the value of a query parameter can be an empty string.
</p>


<tcl>hd_fragment coreqp {query parameters with special meaning to SQLite}</tcl>
<h2>3.3 Recognized Query Parameters</h2>

<p>
Some query parameters are interpreted by the SQLite core and used to 
modify the characteristics of the new connection.  ^All query parameters
are always passed through into the xOpen method of the [VFS] even if
they are previously read and interpreted by the SQLite core.
................................................................................
<dt><b>vfs=</b><i>NAME</i></dt>
<dd><p>^The vfs query parameter causes the database connection to be opened
using the [VFS] called <i>NAME</i>.
^The open attempt fails if <i>NAME</i> is not the name of a [VFS] that
is built into SQLite or that has been previously registered using
[sqlite3_vfs_register()].</dd>

<dt><b>mode=ro<br>mode=rw<br>mode=rwc</b></dt>
<dd><p>^The mode query parameter determines if the new database is opened
read-only, read-write, or read-write and created if it does not exist.
</dd>

<dt><b>cache=shared<br>cache=private</b></dt>
<dd><p>^The cache query parameter determines if the new database is opened
using [shared cache mode] or with a private cache.
</dd>








the first query parameters, each key and value, and each subsequent key
from the prior value.
^The list of query parameters appended to the xOpen filename
is terminated by a single zero-length key.
Note that the value of a query parameter can be an empty string.
</p>
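
The layout described above can be illustrated by walking a byte buffer by hand. This is a sketch of the format only, written in Python rather than C, and is not SQLite code; the function names, the path, and the parameter values are made up. It assumes single zero bytes separate the path, keys, and values, with a zero-length key ending the list.

```python
def read_cstring(buf, pos):
    """Return the zero-terminated string starting at `pos` and the next offset."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end], end + 1


def parse_xopen_filename(buf):
    """Split an xOpen-style buffer into the path and (key, value) pairs."""
    path, pos = read_cstring(buf, 0)
    params = []
    while True:
        key, pos = read_cstring(buf, pos)
        if key == b"":  # a zero-length key terminates the list
            break
        value, pos = read_cstring(buf, pos)  # a value may be empty
        params.append((key.decode(), value.decode()))
    return path.decode(), params


if __name__ == "__main__":
    # Made-up buffer: path, mode=ro, cache with an empty value, terminator.
    sample = b"/tmp/test.db\x00mode\x00ro\x00cache\x00\x00\x00"
    print(parse_xopen_filename(sample))
```

The sample shows why an empty value and the terminating zero-length key are distinguishable: the terminator is only recognized in key position.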

<tcl>hd_fragment coreqp *coreqp\
    {query parameters with special meaning to SQLite}</tcl>
<h2>3.3 Recognized Query Parameters</h2>

<p>
Some query parameters are interpreted by the SQLite core and used to 
modify the characteristics of the new connection.  ^All query parameters
are always passed through into the xOpen method of the [VFS] even if
they are previously read and interpreted by the SQLite core.
................................................................................
<dt><b>vfs=</b><i>NAME</i></dt>
<dd><p>^The vfs query parameter causes the database connection to be opened
using the [VFS] called <i>NAME</i>.
^The open attempt fails if <i>NAME</i> is not the name of a [VFS] that
is built into SQLite or that has been previously registered using
[sqlite3_vfs_register()].</dd>

<dt><b>mode=ro<br>mode=rw<br>mode=rwc<br>mode=memory</b></dt>
<dd><p>^The mode query parameter determines whether the new database is opened
read-only, read-write, read-write and created if it does not exist, or
as a pure in-memory database that never interacts with disk, respectively.
The <b>mode=memory</b> option was added in
[version 3.7.13].
</dd>

<dt><b>cache=shared<br>cache=private</b></dt>
<dd><p>^The cache query parameter determines if the new database is opened
using [shared cache mode] or with a private cache.
</dd>