Documentation Source Text

Check-in [2906c27967]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Add documentation for RBU vacuum to rbu.in.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 2906c27967e0705a0a22c19f06bc75a8621d2fa8
User & Date: dan 2016-04-26 15:37:45
Context
2016-04-26
17:59
Add automatically generated tables of contents to rbu.html, cli.html and datatype3.html. check-in: bd775e82aa user: dan tags: trunk
15:37
Add documentation for RBU vacuum to rbu.in. check-in: 2906c27967 user: dan tags: trunk
2016-04-23
16:24
Add a new hyperlink on the cli.html page. check-in: 03e1bee6c8 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/fancyformat.tcl.

176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
...
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
    append ret " (C: $::ffreq_children($reqid))"
  } 
  append ret "</p>"
}

set ::Requirements [list]

proc Code {txt} {
  set txt [string trim $txt "\n"]
  set    out {<div class=codeblock style="margin:0 15ex">}
  append out {<table width=100% style="border:1px solid #80a796;padding:0 1ex;background-color:#EEEEEE"><tr><td><pre style="font-size:1.1em">}

  foreach line [split $txt "\n"] {
    if {![string is space $line]} {
      set nSpace [expr {
................................................................................
    /title { 
      set G(inTitle) 0
      append G(doc) [formattag $tag $details]
    }

    codeblock  { set G(inCodeblock) 1 }
    /codeblock { 
      append G(doc) [Code $G(codeblock)]
      set G(codeblock) "" 
      set G(inCodeblock) 0 
    }

    table {
      catch {array unset D} 
      array set D $details







|







 







|







176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
...
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
    append ret " (C: $::ffreq_children($reqid))"
  } 
  append ret "</p>"
}

set ::Requirements [list]

proc CodeBlock {txt} {
  set txt [string trim $txt "\n"]
  set    out {<div class=codeblock style="margin:0 15ex">}
  append out {<table width=100% style="border:1px solid #80a796;padding:0 1ex;background-color:#EEEEEE"><tr><td><pre style="font-size:1.1em">}

  foreach line [split $txt "\n"] {
    if {![string is space $line]} {
      set nSpace [expr {
................................................................................
    /title { 
      set G(inTitle) 0
      append G(doc) [formattag $tag $details]
    }

    codeblock  { set G(inCodeblock) 1 }
    /codeblock { 
      append G(doc) [CodeBlock $G(codeblock)]
      set G(codeblock) "" 
      set G(inCodeblock) 0 
    }

    table {
      catch {array unset D} 
      array set D $details

Changes to pages/rbu.in.

1
2
3
4
5
6
7

8
9

10
11
12
13
14

15









16
17
18
19
20
21





22

23
24

25
26
27
28






29
30
31
32
33
34
35













36
37
38

39
40
41
42
43
44
45
46
47


48
49





50
51
52
53
54
55
56
57
..
64
65
66
67
68
69
70

71
72
73
74
75
76
77
...
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
...
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
...
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
...
457
458
459
460
461
462
463
464





























































































































<title>The RBU Extension</title>
<tcl>
hd_keywords {RBU} {RBU extension}
proc CODE {text} {
  hd_puts "<blockquote><pre>"
  hd_puts $text
  hd_puts "</pre></blockquote>"

}
</tcl>

<h1 align='center'>The RBU Extension</h1>

<p>The RBU extension is an add-on for SQLite that facilitates 
rapid incremental updates of large SQLite database files 
on low-power devices at the edge of a network.











<p>The RBU name stands for "Resumable Bulk Update".

<p>Updating an SQLite database file on a remote device can normally
be accomplished simply by sending the text of various [INSERT], [DELETE],
and [UPDATE] commands to the device and evaluating them all inside of
a transaction.  RBU provides some advantages over this simple approach:







<ol>
<li><b>RBU runs faster</b>


<p>The most efficient way to apply changes to a B-Tree is to make
the changes in row order.  But if an SQL table has indexes, the row
order for the indexes will all be different from each other and from






the row order of the original table.  RBU works around this by applying
all changes to the table in one pass, then
applying changes to each index in separate passes, thus updating each
B-Trees in its optimal sequence.  For a large database file (one that
does not fit in the OS disk cache) this procedure can result in
two orders of magnitude faster updates.














<li><b>RBU runs in the background</b>

<p>The changes can be applied to the database file by a background

process that does not interfere with read access to the database
file.

<li><b>RBU runs incrementally</b>

<p>The changes can be applied to the database incrementally, with
intervening power outages and/or system resets.  And yet the original
unmodified data remains visible to the device until the moment that
entire change set commits.  


</ol>






<h2>Limitations</h2>

<p>The following limitations apply to RBU updates:

<ul>
<li><p>The changes must consist of [INSERT], [UPDATE], and [DELETE]
    operations only.  CREATE and DROP operations are not
    supported.</p></li>
................................................................................
       named "rbu_control".</p></li>
<li><p>The RBU update will not fire any triggers.</p></li>
<li><p>The RBU update will not detect or prevent foreign key or
       CHECK constraint violations.</p></li>
<li><p>All RBU updates use the "OR ROLLBACK" constraint handling mechanism.
    </p></li>
<li><p>The target database may not be in [WAL mode].</p></li>

<li><p>No other writes may occur on the target database while the
       RBU update is being applied.  A read-lock is held on the target
       database to prevent this.</p></li>
</ul>


<h2>Preparing An RBU Update File</h2>
................................................................................
  <a href=http://fossil-scm.org>Fossil source-code management system</a>, or

  <li> In a custom format defined by the RBU application.
</ul>

<p> The fossil delta format may only be used to update BLOB values. Instead
of storing the new BLOB within the data_% table, the fossil delta is stored
instead. And instead of specifying an 'x' as part of the ota_control string
for the column to be updated, an 'f' character is stored. When processing
an 'f' update, RBU loads the original BLOB data from disk, applies the fossil
delta to it and stores the results back into the database file. The RBU
databases generated by [sqldiff --rbu] make use of fossil deltas wherever
doing so would save space in the RBU database.

<p> To use a custom delta format, the RBU application must register a
................................................................................
  the data0_ccc_fts view and applied to FTS table ccc_fts. Because
  "data0_ccc_fts" is smaller than "data_ccc", the FTS table will be updated
  first, as required.

<p>
  Cases in which the underlying content table has an explicit INTEGER PRIMARY
  KEY column are slightly more difficult, as the text values stored in the
  ota_control column are slightly different for the FTS index and its
  underlying content table. For the underlying content table, a character
  must be included in any ota_control text values for the explicit IPK, but
  for the FTS table itself, which has an implicit rowid, it should not. This
  is inconvenient, but can be solved using a more complicated view, as follows:

<tcl>CODE {
  -- Target database schema
  CREATE TABLE ddd(i INTEGER PRIMARY KEY, k TEXT);
  CREATE VIRTUAL TABLE ddd_fts USING fts4(k, content=ddd);
................................................................................
  [FTS shadow tables | underlying database tables] that they create in order to
  store data within the database are not, and [sqldiff] will include add these
  to any RBU database. For this reason, users attempting to use sqldiff to
  create RBU updates to apply to target databases with one or more virtual
  tables will likely have to run sqldiff using the --table option separately
  for each table to update in the target database.

<h2>C/C++ Interface</h2>

<p>Enable the RBU extension by compiling the [amalgamation] with the
[SQLITE_ENABLE_RBU] compile-time option.

<p>The RBU extension interface allows an application to apply an RBU update 
stored in an RBU database to an existing target database.
The procedures is as follows:

<ol>
<li><p>
Open an RBU handle using the sqlite3rbu_open(T,A,S) function.

<p>The T argument is the name of the target database file.
The A argument is the name of the RBU database file.
................................................................................
This allows subsequent processes to automatically
resume the RBU update from where it left off.
If state information is stored in the RBU database, it can be removed
by dropping all tables whose names begin with "rbu_".

<p>For more details, refer to the comments in 
<a href=http://sqlite.org/src/doc/trunk/ext/rbu/sqlite3rbu.h>header file
sqlite3ota.h<a>.




































































































































>


>


|
<
|
>

>
>
>
>
>
>
>
>
>
|

<
<
<
<
>
>
>
>
>

>

<
>

|
<
<
>
>
>
>
>
>
|
<
|
|
|
|

>
>
>
>
>
>
>
>
>
>
>
>
>


<
>
|
<



|
|
|
|
>
>


>
>
>
>
>
|







 







>







 







|







 







|

|







 







|
<
<
<



|







 







|
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
1
2
3
4
5
6
7
8
9
10
11
12
13
14

15
16
17
18
19
20
21
22
23
24
25
26
27
28




29
30
31
32
33
34
35
36

37
38
39


40
41
42
43
44
45
46

47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66

67
68

69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
..
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
...
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
...
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
...
432
433
434
435
436
437
438
439



440
441
442
443
444
445
446
447
448
449
450
...
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
<title>The RBU Extension</title>
<tcl>
hd_keywords {RBU} {RBU extension}
proc CODE {text} {
  hd_puts "<blockquote><pre>"
  hd_puts $text
  hd_puts "</pre></blockquote>"
  #hd_puts [CodeBlock $text]
}
</tcl>
<!table_of_contents>
<h1 align='center'>The RBU Extension</h1>

<p>The RBU extension is an add-on for SQLite designed for use with large 

SQLite database files on low-power devices at the edge of a network. RBU
may be used for two separate tasks:

<ul>
  <li> <b>RBU Update operations</b>. An [RBU Update] is a bulk update of a
          database file that may include many insert, update and delete
          operations on one or more tables.
  <li> <b>RBU Vacuum operations</b>. An [RBU Vacuum] optimizes and rebuilds an
          entire database file, with results similar to SQLite's native VACUUM
          command.  
</ul>

<p>The acronym RBU stands for "Resumable Bulk Update".





<p>Both of the RBU functions may be accomplished using SQLite's built-in 
SQL commands - RBU update via a series of [INSERT], [DELETE] and 
[UPDATE] commands within a single transaction, and RBU vacuum by a single
[VACUUM] command.  The RBU module provides the following advantages over
these simpler approaches:


<ol>

<li><b>RBU may be more efficient</b>

<p>The most efficient way to apply changes to a B-Tree (the data structure


that SQLite uses to store each table and index on disk) is to make the
changes in key order. But if an SQL table has one or more indexes, the key
order for each index may be different from the main table and the other
auxiliary indexes. As a result, when executing a series of [INSERT],
[UPDATE] and [DELETE] statements it is not generally possible to order the
operations so that all b-trees are updated in key order. The RBU update
process works around this by applying all changes to the main table in one 

pass, then applying changes to each index in separate passes, ensuring each
B-Tree is updated optimally. For a large database file (one that does not
fit in the OS disk cache) this procedure can result in two orders of
magnitude faster updates.

<p>An RBU Vacuum operation requires less temporary disk space and writes
less data to disk than an SQLite VACUUM. An SQLite VACUUM requires roughly
twice the size of the final database file in temporary disk space to run.
The total amount of data written is around three times the size of the
final database file. By contrast, an RBU Vacuum requires roughly the size
of the final database file in temporary disk space and writes a total of
twice that to disk.

<p>On the other hand, an RBU Vacuum uses more CPU than a regular SQLite
VACUUM - in one test as much as five times as much. For this reason, an RBU
Vaccum is often signicantly slower than an SQLite VACUUM under the same
conditions.

<li><b>RBU runs in the background</b>


<p>An ongoing RBU operation (either an update or a vacuum) does not
interfere with read access to the database file.


<li><b>RBU runs incrementally</b>

<p>RBU operations may be suspended and then later resumed, perhaps with
intervening power outages and/or system resets. For an RBU update, the
original database content remains visible to all database readers until 
the entire update has been applied - even if the update is suspended and
then later resumed.

</ol>

<p>The RBU extension is not enabled by default. To enable it, compile the
[amalgamation] with the [SQLITE_ENABLE_RBU] compile-time option.

<tcl>hd_fragment rbu_updates {RBU Update}</tcl>
<h1>RBU Updates</h1>
<h2>RBU Update Limitations</h2>

<p>The following limitations apply to RBU updates:

<ul>
<li><p>The changes must consist of [INSERT], [UPDATE], and [DELETE]
    operations only.  CREATE and DROP operations are not
    supported.</p></li>
................................................................................
       named "rbu_control".</p></li>
<li><p>The RBU update will not fire any triggers.</p></li>
<li><p>The RBU update will not detect or prevent foreign key or
       CHECK constraint violations.</p></li>
<li><p>All RBU updates use the "OR ROLLBACK" constraint handling mechanism.
    </p></li>
<li><p>The target database may not be in [WAL mode].</p></li>
<li><p>The target database may not contain [indexes on expressions].
<li><p>No other writes may occur on the target database while the
       RBU update is being applied.  A read-lock is held on the target
       database to prevent this.</p></li>
</ul>


<h2>Preparing An RBU Update File</h2>
................................................................................
  <a href=http://fossil-scm.org>Fossil source-code management system</a>, or

  <li> In a custom format defined by the RBU application.
</ul>

<p> The fossil delta format may only be used to update BLOB values. Instead
of storing the new BLOB within the data_% table, the fossil delta is stored
instead. And instead of specifying an 'x' as part of the rbu_control string
for the column to be updated, an 'f' character is stored. When processing
an 'f' update, RBU loads the original BLOB data from disk, applies the fossil
delta to it and stores the results back into the database file. The RBU
databases generated by [sqldiff --rbu] make use of fossil deltas wherever
doing so would save space in the RBU database.

<p> To use a custom delta format, the RBU application must register a
................................................................................
  the data0_ccc_fts view and applied to FTS table ccc_fts. Because
  "data0_ccc_fts" is smaller than "data_ccc", the FTS table will be updated
  first, as required.

<p>
  Cases in which the underlying content table has an explicit INTEGER PRIMARY
  KEY column are slightly more difficult, as the text values stored in the
  rbu_control column are slightly different for the FTS index and its
  underlying content table. For the underlying content table, a character
  must be included in any rbu_control text values for the explicit IPK, but
  for the FTS table itself, which has an implicit rowid, it should not. This
  is inconvenient, but can be solved using a more complicated view, as follows:

<tcl>CODE {
  -- Target database schema
  CREATE TABLE ddd(i INTEGER PRIMARY KEY, k TEXT);
  CREATE VIRTUAL TABLE ddd_fts USING fts4(k, content=ddd);
................................................................................
  [FTS shadow tables | underlying database tables] that they create in order to
  store data within the database are not, and [sqldiff] will include add these
  to any RBU database. For this reason, users attempting to use sqldiff to
  create RBU updates to apply to target databases with one or more virtual
  tables will likely have to run sqldiff using the --table option separately
  for each table to update in the target database.

<h2>RBU Update C/C++ Programming</h2>




<p>The RBU extension interface allows an application to apply an RBU update 
stored in an RBU database to an existing target database.
The procedure is as follows:

<ol>
<li><p>
Open an RBU handle using the sqlite3rbu_open(T,A,S) function.

<p>The T argument is the name of the target database file.
The A argument is the name of the RBU database file.
................................................................................
This allows subsequent processes to automatically
resume the RBU update from where it left off.
If state information is stored in the RBU database, it can be removed
by dropping all tables whose names begin with "rbu_".

<p>For more details, refer to the comments in 
<a href=http://sqlite.org/src/doc/trunk/ext/rbu/sqlite3rbu.h>header file
sqlite3rbu.h</a>.

<tcl>hd_fragment rbu_vacuum {RBU Vacuum}</tcl>
<h1>RBU Vacuum</h1>

<h2>RBU Vacuum Limitations</h2>

<p>When compared with SQLite's built-in VACUUM command, RBU Vacuum has the
following limitations:

<ul>
  <li><p>It may not be used on a database that contains [indexes on expressions].
  <li><p>The database being vacuumed may not be in [WAL mode].
</ul>

<h2>RBU Vacuum C/C++ Programming</h2>

<p> This section provides an overview of and example code demonstrating the
    integration of RBU Vacuum into an application program.  For full details,
    refer to the comments in 
    <a href=http://sqlite.org/src/doc/trunk/ext/rbu/sqlite3rbu.h>header file
    sqlite3rbu.h</a>.

<p> RBU Vacuum applications all implement some variation of the following
procedure:

<ol>
  <li><p> An RBU handle is created by calling sqlite3rbu_vacuum(T, S).

      <p> Argument T is the name of the database file to vacuum. Argument S is
      the name of a database in which the RBU module will save its state if the
      vacuum operation is suspended.

      <p> If state database S does not exist when sqlite3rbu_vacuum() is
      invoked, it is automatically created and populated with the single table
      used to store the state of an RBU vacuum - "rbu_state". If an ongoing RBU
      vacuum is suspended, this table is populated with state data. The next
      time sqlite3rbu_vacuum() is called with the same S parameter, it detects
      this data and attempts to resume the suspended vacuum operation. When
      an RBU vacuum operation is completed or encounters an error, RBU 
      automatically deletes the contents of the rbu_state table. In this case,
      the next call to sqlite3rbu_vacuum() starts an entirely new vacuum
      operation from scratch.

      <p> It is a good idea to establish a convention for determining the RBU
      vacuum state database name based on the target database name. The
      example code below uses "&lt;target&gt;-vacuum", where &lt;target&gt; is
      the name of the database being vacuumed.

  <li><p> Any custom collation sequences used by indexes within the database
      being vacuumed are registered with both of the database handles returned
      by the sqlite3rbu_db() function.

  <li><p> Function sqlite3rbu_step() is called on the RBU handle until either
      the RBU vacuum is finished, an error occurs or the application wishes to
      suspend the RBU vacuum.

      <p> Each call to sqlite3rbu_step() does a small amount of work towards
      completing the vacuum operation. Depending on the size of the database, a
      single vacuum may require thousands of calls to sqlite3rbu_step().
      sqlite3rbu_step() returns SQLITE_DONE if the vacuum operation has
      finished, SQLITE_OK if the vacuum operation has not finished but no error
      has occurred, and an SQLite error code if an error is encountered. If
      an error does occur, all subsequent calls to sqlite3rbu_step() immediately
      return the same error code.

  <li><p> Finally, sqlite3rbu_close() is called to close the RBU handle. If the
      application stopped calling sqlite3rbu_step() before either the vacuum
      finished or an error occurred, the state of the vacuum is saved in the
      state database so that it may be resumed later on.

      <p> Like sqlite3rbu_step(), if the vacuum operation has finished,
      sqlite3rbu_close() returns SQLITE_DONE. If the vacuum has not finished
      but no error has occurred, SQLITE_OK is returned. Or, if an error has
      occurred, an SQLite error code is returned. If an error occurred as part
      of a prior call to sqlite3rbu_step(), sqlite3rbu_close() returns the
      same error code.
</ol>

<p>The following example code illustrates the techniques described above.  

<tcl>CODE {

<i>/*</i>
<i>** Either start a new RBU vacuum or resume a suspended RBU vacuum on </i>
<i>** database zTarget. Return when either an error occurs, the RBU </i>
<i>** vacuum is finished or when the application signals an interrupt</i>
<i>** (code not shown).</i>
<i>**</i>
<i>** If the RBU vacuum is completed successfully, return SQLITE_DONE.</i>
<i>** If an error occurs, return SQLite error code. Or, if the application</i>
<i>** signals an interrupt, suspend the RBU vacuum operation so that it</i>
<i>** may be resumed by a subsequent call to this function and return</i>
<i>** SQLITE_OK.</i>
<i>**</i>
<i>** This function uses the database named "&lt;zTarget&gt;-vacuum" for</i>
<i>** the state database, where &lt;zTarget&gt; is the name of the database </i>
<i>** being vacuumed.</i>
<i>*/</i>
int do_rbu_vacuum(const char *zTarget){
  int rc;
  char *zState;                   <i>/* Name of state database */</i>
  sqlite3rbu *pRbu;               <i>/* RBU vacuum handle */</i>

  zState = sqlite3_mprintf("%s-vacuum", zTarget);
  if( zState==0 ) return SQLITE_NOMEM;
  pRbu = sqlite3rbu_vacuum(zTarget, zState);
  sqlite3_free(zState);

  if( pRbu ){
    sqlite3 *dbTarget = sqlite3rbu_db(pRbu, 0);
    sqlite3 *dbState = sqlite3rbu_db(pRbu, 1);

    <i>/* Any custom collation sequences used by the target database must</i>
    <i>** be registered with both database handles here.  */</i>

    while( sqlite3rbu_step(pRbu)==SQLITE_OK ){
      if( <i>&lt;application has signalled interrupt&gt;</i> ) break;
    }
  }
  rc = sqlite3rbu_close(pRbu);
  return rc;
}

}</tcl>