Documentation Source Text

Check-in [a5dfd4a26b]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:First cut at OTA documentation.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: a5dfd4a26bdf15534d63c34a72f01b4db8ba3146
User & Date: drh 2015-07-13 22:07:31
Context
2015-07-14
13:46
Fix a minor syntax error in the tclsqlite document. check-in: 0462837fe7 user: drh tags: trunk
2015-07-13
22:07
First cut at OTA documentation. check-in: a5dfd4a26b user: drh tags: trunk
2015-07-10
17:04
Add an "application porting guide" describing the steps required to port fts3/4 applciations to fts5. Fix some other deficiencies in fts5.html. check-in: f650ee44b0 user: dan tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/changes.in.

14
15
16
17
18
19
20
21
22

23

24
25
26
27
28
29
30
31
32
33





34
35
36
37
38
39
40
set nChng 0
proc chng {date desc {options {}}} {
  global nChng aChng
  set aChng($nChng) [list $date $desc $options]
  incr nChng
}

chng {2015-00-00 (3.8.11)} {
<li>Added the [FTS5] extension.

<li>Added the [sqlite3_value_dup()] and [sqlite3_value_free()] interfaces.

<li>The [IS operator] is now able to drive indexes.
<li>Enhance the query planner to permit [automatic indexing] on FROM-clause
    subqueries that are implemented by co-routine.
<li>Disallow the use of "rowid" in [common table expressions].
<li>Added the [PRAGMA cell_size_check] command for better and earlier
    detection of database file corruption.
<li>Added the [matchinfo 'b' flag] to the [matchinfo()] function in [FTS3].
<li>Improved fuzz-testing of database files, with fixes for problems found.
<li>Add the fuzzcheck test program and automatically run this program
    using both SQL and database test cases on "make test".





<li>Miscellaneous micro-optimizations result in 22.3% more work for the same
    number of CPU cycles relative to the previous release. 
    SQLite now runs twice as fast as [version 3.8.0] and three times as
    fast as [version 3.3.9].
    (Measured using 
    [http://valgrind.org/docs/manual/cg-manual.html|cachegrind] on the
    [http://www.sqlite.org/src/artifact/83f6b3318f7ee|speedtest1.c] workload on







|

>

>










>
>
>
>
>







14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
set nChng 0
proc chng {date desc {options {}}} {
  global nChng aChng
  set aChng($nChng) [list $date $desc $options]
  incr nChng
}

chng {2015-07-31 (3.8.11)} {
<li>Added the [FTS5] extension.
<li>Added the [OTA] extension.
<li>Added the [sqlite3_value_dup()] and [sqlite3_value_free()] interfaces.
<li>Enhance the [spellfix1] extension to support [ON CONFLICT] clauses.
<li>The [IS operator] is now able to drive indexes.
<li>Enhance the query planner to permit [automatic indexing] on FROM-clause
    subqueries that are implemented by co-routine.
<li>Disallow the use of "rowid" in [common table expressions].
<li>Added the [PRAGMA cell_size_check] command for better and earlier
    detection of database file corruption.
<li>Added the [matchinfo 'b' flag] to the [matchinfo()] function in [FTS3].
<li>Improved fuzz-testing of database files, with fixes for problems found.
<li>Add the fuzzcheck test program and automatically run this program
    using both SQL and database test cases on "make test".
<li>Added the [SQLITE_MUTEX_STATIC_VFS1] static mutex and use it in the
    Windows [VFS].
<li>Enhance the page cache so that it can preallocate a block of memory to
    use for the initial set page cache lines.  Set the default preallocation
    to 100 pages.  Yields about a 5% performance increase on common workloads.
<li>Miscellaneous micro-optimizations result in 22.3% more work for the same
    number of CPU cycles relative to the previous release. 
    SQLite now runs twice as fast as [version 3.8.0] and three times as
    fast as [version 3.3.9].
    (Measured using 
    [http://valgrind.org/docs/manual/cg-manual.html|cachegrind] on the
    [http://www.sqlite.org/src/artifact/83f6b3318f7ee|speedtest1.c] workload on

Changes to pages/optoverview.in.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
..
86
87
88
89
90
91
92

93
94
95
96
97
98
99
...
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
...
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
<title>The SQLite Query Optimizer Overview</title>
<tcl>hd_keywords {optimizer} {query planner} {SQLite query planner}</tcl>

<tcl>
proc CODE {text} {
  hd_puts "<blockquote><pre>"
  hd_puts $text
  hd_puts "</pre></blockquote>"
}
proc SYNTAX {text} {
  hd_puts "<blockquote><pre>"
  set t2 [string map {& &amp; < &lt; > &gt;} $text]
  regsub -all "/(\[^\n/\]+)/" $t2 {</b><i>\1</i><b>} t3
  hd_puts "<b>$t3</b>"
  hd_puts "</pre></blockquote>"
}
................................................................................
  ^All terms of the WHERE clause are analyzed to see if they can be
  satisfied using indices.
  ^(To be usable by an index a term must be of one of the following
  forms:
}
SYNTAX {
  /column/ = /expression/

  /column/ > /expression/
  /column/ >= /expression/
  /column/ < /expression/
  /column/ <= /expression/
  /expression/ = /column/
  /expression/ > /column/
  /expression/ >= /column/
................................................................................
CODE {
  CREATE INDEX idx_ex1 ON ex1(a,b,c,d,e,...,y,z);
}
PARAGRAPH {
  Then the index might be used if the initial columns of the index
  (columns a, b, and so forth) appear in WHERE clause terms.)^
  ^The initial columns of the index must be used with
  the *=* or *IN* or *IS NULL* operators.  
  ^The right-most column that is used can employ inequalities.  
  ^For the right-most
  column of an index that is used, there can be up to two inequalities
  that must sandwich the allowed values of the column between two extremes.
}
PARAGRAPH {
  ^It is not necessary for every column of an index to appear in a
  WHERE clause term in order for that index to be used. 
  ^But there can not be gaps in the columns of the index that are used.
  ^Thus for the example index above, if there is no WHERE clause term
  that constraints column c, then terms that constrain columns a and b can
  be used with the index but not terms that constraint columns d through z.
  ^Similarly, index columns will not normally be used (for indexing purposes)
  if they are to the right of a 
  column that is constrained only by inequalities.
  (See the [skip-scan optimization] below for the exception.)
................................................................................
            UNION SELECT rowid FROM /table/ WHERE /expr2/
            UNION SELECT rowid FROM /table/ WHERE /expr3/)
}
PARAGRAPH {
  The rewritten expression above is conceptual; WHERE clauses containing
  OR are not really rewritten this way.
  The actual implementation of the OR clause uses a mechanism that is
  more efficient than subqueries and which works even 
  for tables where the "rowid" column name has been 
  overloaded for other uses and no longer refers to the real rowid.
  But the essence of the implementation is captured by the statement
  above:  Separate indices are used to find candidate result rows
  from each OR clause term and the final result is the union of
  those rows.
}
PARAGRAPH {
  Note that in most cases, SQLite will only use a single index for each




<
<
<
<
<







 







>







 







|








|







 







|
|
<







1
2
3
4





5
6
7
8
9
10
11
..
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
...
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
...
267
268
269
270
271
272
273
274
275

276
277
278
279
280
281
282
<title>The SQLite Query Optimizer Overview</title>
<tcl>hd_keywords {optimizer} {query planner} {SQLite query planner}</tcl>

<tcl>





proc SYNTAX {text} {
  hd_puts "<blockquote><pre>"
  set t2 [string map {& &amp; < &lt; > &gt;} $text]
  regsub -all "/(\[^\n/\]+)/" $t2 {</b><i>\1</i><b>} t3
  hd_puts "<b>$t3</b>"
  hd_puts "</pre></blockquote>"
}
................................................................................
  ^All terms of the WHERE clause are analyzed to see if they can be
  satisfied using indices.
  ^(To be usable by an index a term must be of one of the following
  forms:
}
SYNTAX {
  /column/ = /expression/
  /column/ IS /expression/
  /column/ > /expression/
  /column/ >= /expression/
  /column/ < /expression/
  /column/ <= /expression/
  /expression/ = /column/
  /expression/ > /column/
  /expression/ >= /column/
................................................................................
CODE {
  CREATE INDEX idx_ex1 ON ex1(a,b,c,d,e,...,y,z);
}
PARAGRAPH {
  Then the index might be used if the initial columns of the index
  (columns a, b, and so forth) appear in WHERE clause terms.)^
  ^The initial columns of the index must be used with
  the *=* or *IN* or *IS* operators.  
  ^The right-most column that is used can employ inequalities.  
  ^For the right-most
  column of an index that is used, there can be up to two inequalities
  that must sandwich the allowed values of the column between two extremes.
}
PARAGRAPH {
  ^It is not necessary for every column of an index to appear in a
  WHERE clause term in order for that index to be used. 
  ^But there cannot be gaps in the columns of the index that are used.
  ^Thus for the example index above, if there is no WHERE clause term
  that constraints column c, then terms that constrain columns a and b can
  be used with the index but not terms that constraint columns d through z.
  ^Similarly, index columns will not normally be used (for indexing purposes)
  if they are to the right of a 
  column that is constrained only by inequalities.
  (See the [skip-scan optimization] below for the exception.)
................................................................................
            UNION SELECT rowid FROM /table/ WHERE /expr2/
            UNION SELECT rowid FROM /table/ WHERE /expr3/)
}
PARAGRAPH {
  The rewritten expression above is conceptual; WHERE clauses containing
  OR are not really rewritten this way.
  The actual implementation of the OR clause uses a mechanism that is
  more efficient and that works even for [WITHOUT ROWID] tables or 
  tables in which the "rowid" is inaccessible.

  But the essence of the implementation is captured by the statement
  above:  Separate indices are used to find candidate result rows
  from each OR clause term and the final result is the union of
  those rows.
}
PARAGRAPH {
  Note that in most cases, SQLite will only use a single index for each

Added pages/ota.in.





































































































































































































































































































































































































































































































































































>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
<title>The OTA Extension</title>
<tcl>
proc CODE {text} {
  hd_puts "<blockquote><pre>"
  hd_puts $text
  hd_puts "</pre></blockquote>"
}
hd_keywords {OTA}
</tcl>
<h1 align='center'>The OTA Extension</h1>

<p>The OTA extension is an add-on for SQLite that facilitates 
rapid bulk updates of large SQLite database files on low-power
devices at the edge of a network.

<p>The OTA name stands for "Over-the-Air" since its original use-case
was updating maps in low-power navigation devices via
wireless.  However, the name is overly specific, since the change set
can be sent to the edge device by any available channel.

<p>Updating an SQLite database file on a remote device can normally
be accomplished simply by send the text of various [INSERT], [DELETE],
and [UPDATE] commands to the device and evaluating them all inside of
a transaction.  OTA provides some advantages over this simple approach:

<ol>
<li><b>OTA runs faster</b>

<p>The most efficient way to apply changes to a B-Tree is to make
the changes in row order.  But if an SQL table has indexes, the row
order for the indexes will all be different from each other and from
the row order of the original table.  OTA works around this by applying
all changes to the table in one pass, then going back in and separately
applying changes to each index in separate passes, thus updating all
B-Trees in the optimal sequence.  For a large database file (one that
does not fit in the OS disk cache) this procedure can result in
two orders of magnitude faster updates.

<li><b>OTA runs in the background</b>

<p>The changes can be applied to the database file by a background
process that does not interfere with read access to the database
file.

<li><b>OTA runs incrementally</b>

<p>The changes can be applied to the database incrementally, with
intervening power outages and/or system resets.  And yet the original
unmodified data remains visible to the device until the moment that
entire change set commits.  
</ol>

<h2>Limitations</h2>

<p>The following limitations apply to OTA updates:

<ul>
<li><p>The changes must consist of [INSERT], [UPDATE], and [DELETE]
    operations only.  CREATE and DROP operations are not
    supported.</p></li>
<li><p>[INSERT] statements may not use default values.</p></li>
<li><p>[UPDATE] and [DELETE] statements must identify the target rows
    by rowid or by non-NULL PRIMARY KEY values.</p></li>
<li><p>[UPDATE] statements may not modify PRIMARY KEY or rowid values.
    </p></li>
<li><p>The OTA update will not fire any triggers.</p></li>
<li><p>The OTA update will not detect or prevent foreign key or
       CHECK constraint violations.</p></li>
<li><p>All OTA updates us the "OR ROLLBACK" constraint handling mechanism.
    </p></li>
</ul>


<h2>Preparing An OTA Update File</h2>

<p>All changes to be applied by OTA are stored in a separate SQLite database
called the "OTA database".  The database that is to be modifed is called
the "target database".

<p>
For each table in the target database, the OTA database should contain a table
named "data_<target name>" with the all the same columns as the
target table, plus one additional column named "ota_control".
The data_% table should have no PRIMARY KEY or UNIQUE constraints, but
each column should have the same type as the corresponding column in
the target database.
The ota_control column should have no type at all. For example, if
the target database contains:

<tcl>CODE {
CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT, c UNIQUE);
}</tcl>

<p>Then the OTA database should contain:

<tcl>CODE {
CREATE TABLE data_t1(a INTEGER, b TEXT, c, ota_control);
}</tcl>

<p>The order of the columns in the data_% table does not matter.

<p>If the target database table is a virtual table or a table that has no
PRIMARY KEY declaration, the data_% table must also contain a column 
named "ota_rowid". The ota_rowid column is mapped to the tables [ROWID].
For example, if the target database contains either of the following:

<tcl>CODE {
CREATE VIRTUAL TABLE x1 USING fts3(a, b);
CREATE TABLE x1(a, b);
}</tcl>

<p>then the OTA database should contain:

<tcl>CODE {
CREATE TABLE data_x1(a, b, ota_rowid, ota_control);
}</tcl>

<p>Virtual tables for which the "rowid" column does 
not function like a primary key value cannot be updated using OTA.

<p>
All non-hidden columns (i.e. all columns matched by "SELECT *") of the
target table must be present in the input table. For virtual tables,
hidden columns are optional - they are updated by OTA if present in
the input table, or not otherwise. For example, to write to an fts4
table with a hidden languageid column such as:

<tcl>CODE {
CREATE VIRTUAL TABLE ft1 USING fts4(a, b, languageid='langid');
}</tcl>

<p>Either of the following input table schemas may be used:

<tcl>CODE {
CREATE TABLE data_ft1(a, b, langid, ota_rowid, ota_control);
CREATE TABLE data_ft1(a, b, ota_rowid, ota_control);
}</tcl>

<p>For each row to INSERT into the target database as part of the OTA 
update, the corresponding data_% table should contain a single record
with the "ota_control" column set to contain integer value 0. The
other columns should be set to the values that make up the new record 
to insert. 

<p>If the target database table has an INTEGER PRIMARY KEY, it is not 
possible to insert a NULL value into the IPK column. Attempting to 
do so results in an SQLITE_MISMATCH error.

<p>For each row to DELETE from the target database as part of the OTA 
update, the corresponding data_% table should contain a single record
with the "ota_control" column set to contain integer value 1. The
real primary key values of the row to delete should be stored in the
corresponding columns of the data_% table. The values stored in the
other columns are not used.

<p>For each row to UPDATE from the target database as part of the OTA 
update, the corresponding data_% table should contain a single record
with the "ota_control" column set to contain a value of type text.
The real primary key values identifying the row to update should be 
stored in the corresponding columns of the data_% table row, as should
the new values of all columns being update. The text value in the 
"ota_control" column must contain the same number of characters as
there are columns in the target database table, and must consist entirely
of 'x' and '.' characters (or in some special cases 'd' - see below). For 
each column that is being updated, the corresponding character is set to
'x'. For those that remain as they are, the corresponding character of the
ota_control value should be set to '.'. For example, given the tables 
above, the update statement:

<tcl>CODE {
UPDATE t1 SET c = 'usa' WHERE a = 4;
}</tcl>

<p>is represented by the data_t1 row created by:

<tcl>CODE {
INSERT INTO data_t1(a, b, c, ota_control) VALUES(4, NULL, 'usa', '..x');
}</tcl>

<p>Instead of an 'x' character, characters of the ota_control value specified
for UPDATEs may also be set to 'd'. In this case, instead of updating the
target table with the value stored in the corresponding data_% column, the
user-defined SQL function "ota_delta()" is invoked and the result stored in
the target table column. ota_delta() is invoked with two arguments - the
original value currently stored in the target table column and the 
value specified in the data_xxx table.

<p>For example, this row:

<tcl>CODE {
INSERT INTO data_t1(a, b, c, ota_control) VALUES(4, NULL, 'usa', '..d');
}</tcl>


<p>is similar to an UPDATE statement such as: 

<tcl>CODE {
UPDATE t1 SET c = ota_delta(c, 'usa') WHERE a = 4;
}</tcl>

<p>If the target database table is a virtual table or a table with no PRIMARY
KEY, the ota_control value should not include a character corresponding 
to the ota_rowid value. For example, this:

<tcl>CODE {
INSERT INTO data_ft1(a, b, ota_rowid, ota_control) 
  VALUES(NULL, 'usa', 12, '.x');
}</tcl>


<p>causes a result similar to:

<tcl>CODE {
UPDATE ft1 SET b = 'usa' WHERE rowid = 12;
}</tcl>

<p>The data_xxx tables themselves should have no PRIMARY KEY declarations.
However, OTA is more efficient if reading the rows in from each data_xxx
table in "rowid" order is roughly the same as reading them sorted by
the PRIMARY KEY of the corresponding target database table. In other 
words, rows should be sorted using the destination table PRIMARY KEY 
fields before they are inserted into the data_xxx tables.

<h2>C/C++ Interface</h2>

<p>The API declared below allows an application to apply an OTA update 
stored on disk to an existing target database. Essentially, the 
application:

<ol>
<li><p>
Opens an OTA handle using the sqlite3ota_open() function.

<li><p>
Registers any required virtual table modules with the database
handle returned by sqlite3ota_db(). Also, if required, register
the ota_delta() implementation.

<li><p>
Calls the sqlite3ota_step() function one or more times on
the new handle. Each call to sqlite3ota_step() performs a single
b-tree operation, so thousands of calls may be required to apply 
a complete update.

<li><p>
Calls sqlite3ota_close() to close the OTA update handle. If
sqlite3ota_step() has been called enough times to completely
apply the update to the target database, then the OTA database
is marked as fully applied. Otherwise, the state of the OTA 
update application is saved in the OTA database for later 
resumption.
</ol>

<p>If an update is only partially applied to the target database by the
time sqlite3ota_close() is called, state information is saved 
within the OTA database. This allows subsequent processes to automatically
resume the OTA update from where it left off.

<p>To remove all OTA extension state information, returning an OTA database 
to its original contents, it is sufficient to drop all tables that begin
with the prefix "ota_"

<h2>Locking Constraints</h2>

<p>An OTA update may not be applied to a database in WAL mode. Attempting
to do so is an error (SQLITE_ERROR).

<p>While an OTA handle is open, a SHARED lock may be held on the target
database file. This means it is possible for other clients to read the
database, but not to write it.

<p>If an OTA update is started and then suspended before it is completed,
then an external client writes to the database, then attempting to resume
the suspended OTA update is also an error (SQLITE_BUSY).