Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Update rbu.in with more details on using RBU with FTS tables and the sqldiff tool. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
95c97dad08e9c12766517b2c0310d8e9 |
User & Date: | dan 2015-08-01 18:13:19.759 |
Context
2015-08-11
| ||
15:47 | Enhancements to the arguments in favor of using SQLite as an application file format. Added the undoredo.html document. (check-in: 1e1b1d8101 user: drh tags: trunk) | |
2015-08-01
| ||
18:13 | Update rbu.in with more details on using RBU with FTS tables and the sqldiff tool. (check-in: 95c97dad08 user: dan tags: trunk) | |
2015-07-31
| ||
18:32 | Fix the link to the first mailing list archive. (check-in: 319676d3b8 user: drh tags: trunk) | |
Changes
Changes to pages/fancyformat.tcl.
︙ | ︙ | |||
274 275 276 277 278 279 280 | } catch { unset D(nonumber) } catch { unset D(notoc) } # If there is a "tags" attribute, then add an [hd_fragment] command # to the output. if {[info exists D(tags)]} { | | | | < > > | 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 | } catch { unset D(nonumber) } catch { unset D(notoc) } # If there is a "tags" attribute, then add an [hd_fragment] command # to the output. if {[info exists D(tags)]} { #append G(doc) "<tcl>[list set ::hd(fragment) $D(id)]</tcl>" #foreach t [split $D(tags) ,] { # append G(doc) "<tcl>[list hd_keywords [string trim $t]]</tcl>" #} append G(doc) "<tcl>[list hd_fragment $D(id) $D(tags)]</tcl>" unset D(tags) } append G(doc) [formattag $tag [array get D]] append G(doc) "$HN " } |
︙ | ︙ |
Changes to pages/fts3.in.
︙ | ︙ | |||
1581 1582 1583 1584 1585 1586 1587 | </ul> <p> Because the indexed documents themselves are usually much larger than the full-text index, the content option can be used to achieve significant space savings. | | | 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 | </ul> <p> Because the indexed documents themselves are usually much larger than the full-text index, the content option can be used to achieve significant space savings. <h3 tags="contentless fts4 tables"> Contentless FTS4 Tables </h3> <p> In order to create an FTS4 table that does not store a copy of the indexed documents at all, the content option should be set to an empty string. For example, the following SQL creates such an FTS4 table with three columns - "a", "b", and "c": |
︙ | ︙ | |||
1637 1638 1639 1640 1641 1642 1643 | <p> Errors related to attempting to retrieve column values other than docid are runtime errors that occur within sqlite3_step(). In some cases, for example if the MATCH expression in a SELECT query matches zero rows, there may be no error at all even if a statement does refer to column values other than docid. | | | 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 | <p> Errors related to attempting to retrieve column values other than docid are runtime errors that occur within sqlite3_step(). In some cases, for example if the MATCH expression in a SELECT query matches zero rows, there may be no error at all even if a statement does refer to column values other than docid. <h3 tags="external content fts4 tables"> External Content FTS4 Tables </h3> <p> An "external content" FTS4 table is similar to a contentless table, except that if evaluation of a query requires the value of a column other than docid, FTS4 attempts to retrieve that value from a table (or view, or virtual table) nominated by the user (hereafter referred to as the "content table"). The FTS4 module never writes to the content table, and writing |
︙ | ︙ | |||
2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 | <p> This section describes at a high-level the way the FTS module stores its index and content in the database. It is <b>not necessary to read or understand the material in this section in order to use FTS</b> in an application. However, it may be useful to application developers attempting to analyze and understand FTS performance characteristics, or to developers contemplating enhancements to the existing FTS feature set. <tcl>hd_fragment *shadowtab {FTS shadow tables} {shadow tables}</tcl> <h2 tags="shadowtabs">Shadow Tables</h2> <p> For each FTS virtual table in a database, three to five real (non-virtual) tables are created to store the underlying data. These real tables are called "shadow tables". The real tables are named "%_content", | > | 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 | <p> This section describes at a high-level the way the FTS module stores its index and content in the database. It is <b>not necessary to read or understand the material in this section in order to use FTS</b> in an application. However, it may be useful to application developers attempting to analyze and understand FTS performance characteristics, or to developers contemplating enhancements to the existing FTS feature set. </p> <tcl>hd_fragment *shadowtab {FTS shadow tables} {shadow tables}</tcl> <h2 tags="shadowtabs">Shadow Tables</h2> <p> For each FTS virtual table in a database, three to five real (non-virtual) tables are created to store the underlying data. These real tables are called "shadow tables". The real tables are named "%_content", |
︙ | ︙ |
Changes to pages/rbu.in.
︙ | ︙ | |||
76 77 78 79 80 81 82 83 84 | <h2>Preparing An RBU Update File</h2> <p>All changes to be applied by RBU are stored in a separate SQLite database called the "RBU database". The database that is to be modified is called the "target database". <p> For each table in the target database, the RBU database should contain a table | > > > > > > > > > > > > > | > > > > > > > > > > | | | | | < | | 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 | <h2>Preparing An RBU Update File</h2> <p>All changes to be applied by RBU are stored in a separate SQLite database called the "RBU database". The database that is to be modified is called the "target database". <p>For each table in the target database that will be modified by the update, a corresponding table is created within the RBU database. The RBU database table schema is not the same as that of the target database, but is derived from it as [RBU Database Tables | described below]. <p>The RBU database table contains a single row for each target database row inserted, updated or deleted by the update. Populating the RBU database tables is described in [RBU Database Contents | the following section]. </p> <tcl>hd_fragment database_tables {RBU Database Tables}</tcl> <h3>The RBU Database Schema</h3> <p> For each table in the target database, the RBU database should contain a table named "data<<i>integer</i>>_<<i>target-table-name</i>>" where <<i>target-table-name</i>> is the name of the table in the target database and <<i>integer</i>> is any sequence of zero or more numeric characters (0-9). Tables within the RBU database are processed in order by name (from smallest to largest according to the BINARY collation sequence), so the order in which target tables are updated is influenced by the selection of the <<i>integer</i>> portion of the data_% table name. While this can be useful when using RBU to update [RBU FTS3/4 Tables | certain types of virtual tables], there is normally no reason to use anything other than an empty string in place of <<i>integer</i>>. <p>The data_% table must have all the same columns as the target table, plus one additional column named "rbu_control". The data_% table should have no PRIMARY KEY or UNIQUE constraints, but each column should have the same type as the corresponding column in the target database. The rbu_control column should have no type at all. For example, if the target database contains: <tcl>CODE { CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT, c UNIQUE); }</tcl> <p>Then the RBU database should contain: |
︙ | ︙ | |||
135 136 137 138 139 140 141 142 143 144 145 146 147 148 | <p>Either of the following input table schemas may be used: <tcl>CODE { CREATE TABLE data_ft1(a, b, langid, rbu_rowid, rbu_control); CREATE TABLE data_ft1(a, b, rbu_rowid, rbu_control); }</tcl> <p>For each row to INSERT into the target database as part of the RBU update, the corresponding data_% table should contain a single record with the "rbu_control" column set to contain integer value 0. The other columns should be set to the values that make up the new record to insert. | > > > | 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 | <p>Either of the following input table schemas may be used: <tcl>CODE { CREATE TABLE data_ft1(a, b, langid, rbu_rowid, rbu_control); CREATE TABLE data_ft1(a, b, rbu_rowid, rbu_control); }</tcl> <tcl>hd_fragment database_contents {RBU Database Contents}</tcl> <h3>RBU Database Contents</h3> <p>For each row to INSERT into the target database as part of the RBU update, the corresponding data_% table should contain a single record with the "rbu_control" column set to contain integer value 0. The other columns should be set to the values that make up the new record to insert. |
︙ | ︙ | |||
177 178 179 180 181 182 183 | <p>is represented by the data_t1 row created by: <tcl>CODE { INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..x'); }</tcl> | > > > > | > > > | > > | > > > > > > > > > > | | | > > > > | > | < | 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 | <p>is represented by the data_t1 row created by: <tcl>CODE { INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..x'); }</tcl> <p>If RBU is used to update a large BLOB value within a target database, it may be be more efficient to store a patch or delta that can be used to modify the existing BLOB instead of an entirely new value within the RBU database. RBU allows deltas to be specified in two ways: <ul> <li> In the "fossil delta" format - the format used for blob deltas by the <a href=http://fossil-scm.org>Fossil source-code management system</a>, or <li> In a custom format defined by the RBU application. </ul> <p> The fossil delta format may only be used to update BLOB values. Instead of storing the new BLOB within the data_% table, the fossil delta is stored instead. And instead of specifying an 'x' as part of the ota_control string for the column to be updated, an 'f' character is stored. When processing an 'f' update, RBU loads the original BLOB data from disk, applies the fossil delta to it and stores the results back into the database file. The RBU databases generated by [sqldiff --rbu] make use of fossil deltas wherever doing so would save space in the RBU database. <p> To use a custom delta format, the RBU application must register a user-defined SQL function named "rbu_delta" before beginning to process the update. rbu_delta() will be invoked with two arguments - the original value stored in the target table column and the delta value provided as part of the RBU update. It should return the result of applying the delta to the original value. To use the custom delta function, the character of the rbu_control value corresponding to the target column to update must be set to 'd' instead of 'x'. Then, instead of updating the target table with the value stored in the corresponding data_% column, RBU invokes the user-defined SQL function "rbu_delta()" and the store in the target table column. <p>For example, this row: <tcl>CODE { INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..d'); }</tcl> <p>causes RBU to update the target database table in a way similar to: <tcl>CODE { UPDATE t1 SET c = rbu_delta(c, 'usa') WHERE a = 4; }</tcl> <p>If the target database table is a virtual table or a table with no PRIMARY KEY, the rbu_control value should not include a character corresponding |
︙ | ︙ | |||
214 215 216 217 218 219 220 | <p>causes a result similar to: <tcl>CODE { UPDATE ft1 SET b = 'usa' WHERE rowid = 12; }</tcl> | | | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 | <p>causes a result similar to: <tcl>CODE { UPDATE ft1 SET b = 'usa' WHERE rowid = 12; }</tcl> <p>The data_% tables themselves should have no PRIMARY KEY declarations. However, RBU is more efficient if reading the rows in from each data_% table in "rowid" order is roughly the same as reading them sorted by the PRIMARY KEY of the corresponding target database table. In other words, rows should be sorted using the destination table PRIMARY KEY fields before they are inserted into the data_% tables. <tcl>hd_fragment fts4_tables {RBU FTS3/4 Tables}</tcl> <h3>Using RBU with FTS3/4 Tables</h3> <p>Usually, an [FTS3 | FTS3 or FTS4] table is an example of a virtual table with a rowid that works like a PRIMARY KEY. So, for the following FTS4 tables: <tcl>CODE { CREATE VIRTUAL TABLE ft1 USING fts4(addr, text); CREATE VIRTUAL TABLE ft2 USING fts4; -- implicit "content" column }</tcl> <p>The data_% tables may be created as follows: <tcl>CODE { CREATE TABLE data_ft1 USING fts4(addr, text, rbu_rowid, rbu_control); CREATE TABLE data_ft2 USING fts4(content, rbu_rowid, rbu_control); }</tcl> <p>And populated as if the target table were an ordinary SQLite table with no explicit PRIMARY KEY columns. <p>[contentless fts4 tables | Contentless FTS4 tables] are handled similarly, except that any attempt to update or delete rows will cause an error when applying the update. <p>[external content fts4 tables | External content FTS4 tables] may also be updated using RBU. In this case the user is required to configure the RBU database so that the same set of UPDATE, DELETE and INSERT operations are applied to the FTS4 index as to the underlying content table. As for all updates of external content FTS4 tables, the user is also required to ensure that any UPDATE or DELETE operations are applied to the FTS4 index before they are applied to the underlying content table (refer to FTS4 documentation for a detailed explanation). In RBU, this is done by ensuring that the name of the data_% table used to write to the FTS4 table sorts before the name of the data_% table used to update the underlying content table using the [BINARY] collation sequence. In order to avoid duplicating data within the RBU database, an SQL view may be used in place of one of the data_% tables. For example, for the target database schema: <tcl>CODE { CREATE TABLE ccc(addr, text); CREATE VIRTUAL TABLE ccc_fts USING fts4(addr, text, content=ccc); }</tcl> <p> The following RBU database schema may be used: <tcl>CODE { CREATE TABLE data_ccc(addr, text, rbu_rowid, rbu_control); CREATE VIEW data0_ccc_fts AS SELECT * FROM data_ccc; }</tcl> <p> The data_ccc table may then be populated as normal with the updates intended for target database table ccc. The same updates will be read by RBU from the data0_ccc_fts view and applied to FTS table ccc_fts. Because "data0_ccc_fts" is smalle than "data_ccc", the FTS table will be updated first, as required. <p> Cases in which the underlying content table has an explicit INTEGER PRIMARY KEY column are slightly more difficult, as the text values stored in the ota_control column are slightly different for the FTS index and its underlying content table. For the underlying content table, a character must be included in any ota_control text values for the explicit IPK, but for the FTS table itself, which has an implicit rowid, it should not. This is inconvenient, but can be solved using a more complicated view, as follows: <tcl>CODE { -- Target database schema CREATE TABLE ddd(i INTEGER PRIMARY KEY, k TEXT); CREATE VIRTUAL TABLE ddd_fts USING fts4(k, content=ddd); -- RBU database schema CREATE TABLE data_ccc(i, k, rbu_control); CREATE VIEW data0_ccc_fts AS SELECT i AS rbu_rowid, k, CASE WHEN rbu_control IN (0,1) THEN rbu_control ELSE substr(rbu_control, 2) END FROM data_ccc; }</tcl> <p> The substr() function in the SQL view above returns the text of the rbu_control argument with the first character (the one corresponding to column "i", which is not required by the FTS table) removed. <tcl>hd_fragment sqldiff {sqldiff --rbu}</tcl> <h3>Automatically Generating RBU Updates with sqldiff</h3> <p> As of SQLite version 3.8.12, the [sqldiff] utility is able to generate RBU databases representing the difference between two databases with identical schemas. For example, the following command: <tcl>CODE { sqldiff --rbu t1.db t2.db }</tcl> <p> Outputs an SQL script to create an RBU database which, if used to update database t1.db, patches it so that its contents are identical to that of database t2.db. <p> By default, sqldiff attempts to process all non-virtual tables within the two databases provided to it. If any table appears in one database but not the other, or if any table has a slightly different schema in one database it is an error. The "--table" option may be useful if this causes a problem <p> Virtual tables are ignored by default by sqldiff. However, it is possible to explicitly create an RBU data_% table for a virtual table that features a rowid that functions like a primary key using a command such as: <tcl>CODE { sqldiff --rbu --table <<i>virtual-table-name</i>> t1.db t2.db }</tcl> <p> Unfortunately, even though virtual tables are ignored by default, any [FTS shadow tables | underlying database tables] that they create in order to store data within the database are not, and [sqldiff] will include add these to any RBU database. For this reason, users attempting to use sqldiff to create RBU updates to apply to target databases with one or more virtual tables will likely have to run sqldiff using the --table option separately for each table to update in the target database. <h2>C/C++ Interface</h2> <p>Enable the RBU extension by compiling the [amalgamation] with the [SQLITE_ENABLE_RBU] compile-time option. <p>The RBU extension interface allows an application to apply an RBU update |
︙ | ︙ | |||
279 280 281 282 283 284 285 | <p>If an update is only partially applied to the target database by the time sqlite3rbu_close() is called, state information is saved within the state database if it exists, or otherwise in the RBU database. This allows subsequent processes to automatically resume the RBU update from where it left off. If state information is stored in the RBU database, it can be removed by dropping all tables whose names begin with "rbu_". | > > > > | 454 455 456 457 458 459 460 461 462 463 464 | <p>If an update is only partially applied to the target database by the time sqlite3rbu_close() is called, state information is saved within the state database if it exists, or otherwise in the RBU database. This allows subsequent processes to automatically resume the RBU update from where it left off. If state information is stored in the RBU database, it can be removed by dropping all tables whose names begin with "rbu_". <p>For more details, refer to the comments in <a href=http://sqlite.org/src/doc/trunk/ext/rbu/sqlite3rbu.h>header file sqlite3ota.h<a>. |