Documentation Source Text

Check-in [daa211f9cd]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update docs for the r-tree auxiliary columns.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256:daa211f9cdc33065b4d33b0b28a2c1aa3612801bd7909bdc4fe3bad22f9b2ffb
User & Date: drh 2018-05-18 18:02:09
Context
2018-05-18
18:02
Merge fixes from the 3.23 branch. check-in: 45753832a8 user: drh tags: trunk
18:02
Update docs for the r-tree auxiliary columns. check-in: daa211f9cd user: drh tags: trunk
2018-05-17
20:40
Updates to the change log and other documentation pages. check-in: 4c3f0e6adb user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to pages/changes.in.

19
20
21
22
23
24
25

26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
  set aChng($nChng) [list $date $desc $options]
  set xrefChng($date) $nChng
  incr nChng
}

chng {2018-07-00 (3.24.0)} {
<li> Add support for PostgreSQL-style [UPSERT].

<li> Added new C-language APIs for discovering SQL keywords used by
     SQLite: [sqlite3_keyword_count()], [sqlite3_keyword_name()], and
     [sqlite3_keyword_check()].
<li> Added new C-language APIs for dynamic strings based on the
     [sqlite3_str] object.
<li> Enhance [ALTER TABLE] so that it recognizes "true" and "false" as
     valid arguments to DEFAULT.
<li> Added the sorter-reference optimization as a compile-time option.
     Only available if compiled with SQLITE_ENABLE_SORTER_REFERENCES.
<li> Improve the format of the [EXPLAIN QUERY PLAN] raw output, so that
     it gives better information about the query plan and about the
     relationships between the various components of the plan.
<li> Added the [SQLITE_DBCONFIG_RESET_DATABASE] option to the
     [sqlite3_db_config()] API.
<p><b>[CLI] Enhancements:</b>







>
|


|



|







19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
  set aChng($nChng) [list $date $desc $options]
  set xrefChng($date) $nChng
  incr nChng
}

chng {2018-07-00 (3.24.0)} {
<li> Add support for PostgreSQL-style [UPSERT].
<li> Add support for [auxiliary columns in r-tree tables].
<li> Add C-language APIs for discovering SQL keywords used by
     SQLite: [sqlite3_keyword_count()], [sqlite3_keyword_name()], and
     [sqlite3_keyword_check()].
<li> Add C-language APIs for dynamic strings based on the
     [sqlite3_str] object.
<li> Enhance [ALTER TABLE] so that it recognizes "true" and "false" as
     valid arguments to DEFAULT.
<li> Add the sorter-reference optimization as a compile-time option.
     Only available if compiled with SQLITE_ENABLE_SORTER_REFERENCES.
<li> Improve the format of the [EXPLAIN QUERY PLAN] raw output, so that
     it gives better information about the query plan and about the
     relationships between the various components of the plan.
<li> Added the [SQLITE_DBCONFIG_RESET_DATABASE] option to the
     [sqlite3_db_config()] API.
<p><b>[CLI] Enhancements:</b>

Changes to pages/rtree.in.

240
241
242
243
244
245
246

247
248
249
250
251
252
253
254
...
285
286
287
288
289
290
291

292
293
294
295
296
297
298
...
320
321
322
323
324
325
326










































327
328
329
330
331
332
333
bounding box.  To guard against this, applications should expand their
contained-within query boxes slightly (by 0.000012%) by rounding down the
lower coordinates and rounding up the top coordinates, in each dimension.

<h1>Using R*Trees Effectively</h1>

<p>

The only information that an R*Tree index stores about an object is
its integer ID and its bounding box.  Additional information needs to
be stored in separate tables and related to the R*Tree index using
the primary key.  ^(For the example above, one might create an auxiliary
table as follows:
</p>

<blockquote><pre>
................................................................................
"overlaps" accepting two demo_data.boundary objects and return true or false.
One may assume that "contained_in" and "overlaps" are relatively slow
functions that we do not want to invoke too frequently.
^(Then an efficient way to find the name of all objects located within
the North Carolina 12th District, one may be to run a query like this:
</p>


<blockquote><pre>
SELECT objname FROM demo_data, demo_index
 WHERE demo_data.id=demo_index.id
   AND contained_in(demo_data.boundary, :boundary)
   AND minX&gt;=-81.0 AND maxX&lt;=-79.6
   AND minY&gt;=35.0 AND maxY&gt;=36.2;
</pre></blockquote>)^
................................................................................

<p>The problem with this latter query is that it must apply the
contained_in() function to millions of entries in the demo_data table.
The use of the R*Tree in the penultimate query reduces the number of
calls to contained_in() function to a small subset of the entire table.
The R*Tree index did not find the exact answer itself, it merely
limited the search space.</p>











































<tcl>hd_fragment {intrtree} {integer-valued r-trees}</tcl>
<h1>Integer-Valued R-Trees</h1>

<p>
The default virtual table ("rtree") normally stores coordinates as
single-precision (4-byte) floating point numbers.  If integer coordinates







>
|







 







>







 







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
...
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
...
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
bounding box.  To guard against this, applications should expand their
contained-within query boxes slightly (by 0.000012%) by rounding down the
lower coordinates and rounding up the top coordinates, in each dimension.

<h1>Using R*Trees Effectively</h1>

<p>
For SQLite versions prior to 3.24.0 ([dateof:3.24.0]),
the only information that an R*Tree index stores about an object is
its integer ID and its bounding box.  Additional information needs to
be stored in separate tables and related to the R*Tree index using
the primary key.  ^(For the example above, one might create an auxiliary
table as follows:
</p>

<blockquote><pre>
................................................................................
"overlaps" accepting two demo_data.boundary objects and return true or false.
One may assume that "contained_in" and "overlaps" are relatively slow
functions that we do not want to invoke too frequently.
^(Then an efficient way to find the name of all objects located within
the North Carolina 12th District, one may be to run a query like this:
</p>

<a name="diquery"></a>
<blockquote><pre>
SELECT objname FROM demo_data, demo_index
 WHERE demo_data.id=demo_index.id
   AND contained_in(demo_data.boundary, :boundary)
   AND minX&gt;=-81.0 AND maxX&lt;=-79.6
   AND minY&gt;=35.0 AND maxY&gt;=36.2;
</pre></blockquote>)^
................................................................................

<p>The problem with this latter query is that it must apply the
contained_in() function to millions of entries in the demo_data table.
The use of the R*Tree in the penultimate query reduces the number of
calls to contained_in() function to a small subset of the entire table.
The R*Tree index did not find the exact answer itself, it merely
limited the search space.</p>

<tcl>hd_fragment {auxcol} {auxiliary columns in r-tree tables}</tcl>
<h2>Auxiliary Columns</h2>

<p>
Beginning with SQLite version 3.24.0 ([dateof:3.24.0]), r-tree tables
can have auxiliary columns that store arbitrary data.
Auxiliary columns can be used in place of
secondary tables such as "demo_data".

<p>
Auxiliary columns are marked with a "+" symbol before the column name.
Auxiliary columns must come after all of the coordinate boundary columns.
There is a limit of no more than 100 auxiliary columns.
The following example shows an r-tree table with auxiliary columns that
is equivalent to the two tables "demo_index" and "demo_data" above:

^(<blockquote><pre>
CREATE VIRTUAL TABLE demo_index2 USING rtree(
   id,              -- Integer primary key
   minX, maxX,      -- Minimum and maximum X coordinate
   minY, maxY,      -- Minimum and maximum Y coordinate
   +objname TEXT,   -- name of the object
   +objtype TEXT,   -- object type
   +boundary BLOB   -- detailed boundary of object
);
</pre></blockquote>)^

<p>
By combining location data and related information into the same
table, auxiliary columns can provide a cleaner model
and reduce the need to joins.
For example, the earlier
<a href="#diquery">join between demo_index and demo_data</a> can now
be written as a simple query, like this:

^(<blockquote><pre>
SELECT objname FROM demo_index2
 WHERE contained_in(boundary, :boundary)
   AND minX&gt;=-81.0 AND maxX&lt;=-79.6
   AND minY&gt;=35.0 AND maxY&gt;=36.2;
</pre></blockquote>)^

<tcl>hd_fragment {intrtree} {integer-valued r-trees}</tcl>
<h1>Integer-Valued R-Trees</h1>

<p>
The default virtual table ("rtree") normally stores coordinates as
single-precision (4-byte) floating point numbers.  If integer coordinates