Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Convert the RTree documentation to fancy-format. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
cb9d13f9313bb61e01e8f81df58c51b1 |
User & Date: | drh 2016-07-13 14:17:01.539 |
Context
2016-07-15
| ||
02:59 | Fix typo in the when-to-use document. (check-in: faf1f46a56 user: drh tags: trunk) | |
2016-07-13
| ||
14:17 | Convert the RTree documentation to fancy-format. (check-in: cb9d13f931 user: drh tags: trunk) | |
13:57 | Add documentation on the dbhash.exe utility and an entry in the change-log. (check-in: f80207b028 user: drh tags: trunk) | |
Changes
Changes to pages/rtree.in.
1 2 | <title>The SQLite R*Tree Module</title> <tcl>hd_keywords *rtree *RTREE {R-Tree extension} {R-Trees}</tcl> | | | | 1 2 3 4 5 6 7 8 9 10 11 12 | <title>The SQLite R*Tree Module</title> <tcl>hd_keywords *rtree *RTREE {R-Tree extension} {R-Trees}</tcl> <table_of_contents> <h1>Overview</h1> <p> An [http://en.wikipedia.org/wiki/R-tree | R-Tree] is a special index that is designed for doing range queries. R-Trees are most commonly used in geospatial systems where each entry is a rectangle with minimum and maximum X and Y coordinates. Given a query rectangle, an R-Tree is able to quickly find all entries that are contained within the query rectangle |
︙ | ︙ | |||
29 30 31 32 33 34 35 | The implementation found in SQLite is a refinement of Guttman's original idea, commonly called "R*Trees", that was described by Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: <em>The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles.</em> SIGMOD Conference 1990: 322-331. </p> | | | | 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 | The implementation found in SQLite is a refinement of Guttman's original idea, commonly called "R*Trees", that was described by Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: <em>The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles.</em> SIGMOD Conference 1990: 322-331. </p> <h1>Compiling The R*Tree Module</h1> <p> The source code to the SQLite R*Tree module is included as part of the [amalgamation] but is disabled by default. To enable the R*Tree module, simply compile with the [SQLITE_ENABLE_RTREE] C-preprocessor macro defined. With many compilers, this is accomplished by adding the option "-DSQLITE_ENABLE_RTREE=1" to the compiler command-line. </p> <h1>Using the R*Tree Module</h1> <p> The SQLite R*Tree module is implemented as a [sqlite3_create_module | virtual table]. ^Each R*Tree index is a virtual table with an odd number of columns between 3 and 11. ^The first column is always a 64-bit signed integer primary key. ^The other columns are pairs, one pair per dimension, containing the |
︙ | ︙ | |||
74 75 76 77 78 79 80 | "rtree" virtual tables or as 32-bit signed integers in "rtree_i32" virtual tables. ^Unlike regular SQLite tables which can store data in a variety of datatypes and formats, the R*Tree rigidly enforce these storage types. If any other type of value is inserted into such a column, the r-tree module silently converts it to the required type before writing the new record to the database. | | | 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 | "rtree" virtual tables or as 32-bit signed integers in "rtree_i32" virtual tables. ^Unlike regular SQLite tables which can store data in a variety of datatypes and formats, the R*Tree rigidly enforce these storage types. If any other type of value is inserted into such a column, the r-tree module silently converts it to the required type before writing the new record to the database. <h2>Creating An R*Tree Index</h2> ^(<p> A new R*Tree index is created as follows: </p> <blockquote><pre> CREATE VIRTUAL TABLE <em><name></em> USING rtree(<em><column-names></em>); |
︙ | ︙ | |||
121 122 123 124 125 126 127 | CREATE VIRTUAL TABLE demo_index USING rtree( id, -- Integer primary key minX, maxX, -- Minimum and maximum X coordinate minY, maxY -- Minimum and maximum Y coordinate ); </pre></blockquote>)^ | | | 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 | CREATE VIRTUAL TABLE demo_index USING rtree( id, -- Integer primary key minX, maxX, -- Minimum and maximum X coordinate minY, maxY -- Minimum and maximum Y coordinate ); </pre></blockquote>)^ <h2>Populating An R*Tree Index</h2> <p> ^The usual [INSERT], [UPDATE], and [DELETE] commands work on an R*Tree index just like on regular tables. ^(So to insert some data into our sample R*Tree index, we can do something like this: </p> |
︙ | ︙ | |||
149 150 151 152 153 154 155 | <p> The entries above might represent (for example) a bounding box around the main office for SQLite.org and bounding box around the 12th Congressional District of North Carolina (prior to the 2011 redistricting) in which SQLite.org was located. </p> | | | 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 | <p> The entries above might represent (for example) a bounding box around the main office for SQLite.org and bounding box around the 12th Congressional District of North Carolina (prior to the 2011 redistricting) in which SQLite.org was located. </p> <h2>Querying An R*Tree Index</h2> <p> ^Any valid query will work against an R*Tree index. But the R*Tree implementation is designed to make two kinds of queries especially efficient. ^(First, queries against the primary key are efficient: </p> |
︙ | ︙ | |||
215 216 217 218 219 220 221 | <p> But, generally speaking, the more constraints that the R*Tree module has to work with, and the smaller the bounding box, the faster the results will come back. </p> | | | 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 | <p> But, generally speaking, the more constraints that the R*Tree module has to work with, and the smaller the bounding box, the faster the results will come back. </p> <h2>Roundoff Error</h2> <p> By default, coordinates are stored in an R*Tree using 32-bit floating point values. When a coordinate cannot be exactly represented by a 32-bit floating point number, the lower-bound coordinates are rounded down and the upper-bound coordinates are rounded up. Thus, bounding boxes might be slightly larger than specified, but will never be any smaller. This |
︙ | ︙ | |||
237 238 239 240 241 242 243 | <p>However, for a "contained-within" style query, rounding the bounding boxes outward might cause some entries to be excluded from the result set if the edge of the entry bounding box corresponds to the edge of the query bounding box. To guard against this, applications should expand their contained-within query boxes slightly (by 0.000012%) by rounding down the lower coordinates and rounding up the top coordinates, in each dimension. | | | 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 | <p>However, for a "contained-within" style query, rounding the bounding boxes outward might cause some entries to be excluded from the result set if the edge of the entry bounding box corresponds to the edge of the query bounding box. To guard against this, applications should expand their contained-within query boxes slightly (by 0.000012%) by rounding down the lower coordinates and rounding up the top coordinates, in each dimension. <h1>Using R*Trees Effectively</h1> <p> The only information that an R*Tree index stores about an object is its integer ID and its bounding box. Additional information needs to be stored in separate tables and related to the R*Tree index using the primary key. ^(For the example above, one might create an auxiliary table as follows: |
︙ | ︙ | |||
322 323 324 325 326 327 328 | contained_in() function to millions of entries in the demo_data table. The use of the R*Tree in the penultimate query reduces the number of calls to contained_in() function to a small subset of the entire table. The R*Tree index did not find the exact answer itself, it merely limited the search space.</p> <tcl>hd_fragment {intrtree} {integer-valued r-trees}</tcl> | | | 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 | contained_in() function to millions of entries in the demo_data table. The use of the R*Tree in the penultimate query reduces the number of calls to contained_in() function to a small subset of the entire table. The R*Tree index did not find the exact answer itself, it merely limited the search space.</p> <tcl>hd_fragment {intrtree} {integer-valued r-trees}</tcl> <h1>Integer-Valued R-Trees</h1> <p> The default virtual table ("rtree") normally stores coordinates as single-precision (4-byte) floating point numbers. If integer coordinates are desired, declare the table using "rtree_i32" instead: <blockquote><pre> |
︙ | ︙ | |||
344 345 346 347 348 349 350 | it might be desirable to have a pure integer implementation of r-trees. This is accomplished by compiling with the [SQLITE_RTREE_INT_ONLY] option. When SQLITE_RTREE_INT_ONLY is used, both the "rtree" and the "rtree_i32" virtual tables store 32-bit integers and only integer values are used for internal computations. <tcl>hd_fragment {customquery} {custom r-tree queries}</tcl> | | | 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 | it might be desirable to have a pure integer implementation of r-trees. This is accomplished by compiling with the [SQLITE_RTREE_INT_ONLY] option. When SQLITE_RTREE_INT_ONLY is used, both the "rtree" and the "rtree_i32" virtual tables store 32-bit integers and only integer values are used for internal computations. <tcl>hd_fragment {customquery} {custom r-tree queries}</tcl> <h1>Custom R-Tree Queries</h1> <p>By using standard SQL expressions in the WHERE clause of a SELECT query, a programmer can query for all R*Tree entries that have a spatial relationship (they intersect with or are contained within) a particular bounding-box. Custom R*Tree queries, using the MATCH operator in the WHERE clause of a SELECT, allow the programmer to query for the set of R*Tree entries that intersect any arbitrary region or shape, not |
︙ | ︙ | |||
401 402 403 404 405 406 407 | </blockquote></pre>)^ <p>^The SQL syntax for custom queries is the same regardless of which interface, sqlite3_rtree_geometry_callback() or sqlite3_rtree_query_callback(), is used to register the SQL function. However, the newer query-style callbacks give the application greater control over how the query proceeds. | | | 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 | </blockquote></pre>)^ <p>^The SQL syntax for custom queries is the same regardless of which interface, sqlite3_rtree_geometry_callback() or sqlite3_rtree_query_callback(), is used to register the SQL function. However, the newer query-style callbacks give the application greater control over how the query proceeds. <h2>The Legacy xGeom Callback</h2> <p>^The legacy xGeom callback is invoked with four arguments. ^The first argument is a pointer to an sqlite3_rtree_geometry structure which provides information about how the SQL function was invoked. ^The second argument is the number of coordinates in each r-tree entry, and is always the same for any given R*Tree. ^The number of coordinates is 2 for a 1-dimensional R*Tree, 4 for a 2-dimensional R*Tree, 6 for a 3-dimensional R*Tree, and so forth. |
︙ | ︙ | |||
460 461 462 463 464 465 466 | value of the pUser variable as the only argument. In other words, xDelUser may be set to a destructor function for the pUser value. <p>^The xGeom callback always does a depth-first search of the r-tree. <tcl>hd_fragment xquery {xQueryFunc R*Tree callback} {sqlite3_rtree_query_callback} </tcl> | | | 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 | value of the pUser variable as the only argument. In other words, xDelUser may be set to a destructor function for the pUser value. <p>^The xGeom callback always does a depth-first search of the r-tree. <tcl>hd_fragment xquery {xQueryFunc R*Tree callback} {sqlite3_rtree_query_callback} </tcl> <h2>The New xQueryFunc Callback</h2> <p>The newer xQueryFunc callback receives more information from the r-tree query engine on each call, and it sends more information back to the query engine before it returns. To help keep the interface manageable, the xQueryFunc callback sends and receives information from the query engine as fields in the sqlite3_rtree_query_info structure: |
︙ | ︙ | |||
552 553 554 555 556 557 558 | is the rowid (the first of the 3 to 11 columns in the R*Tree) for the element being considered. iRowid is only valid for leaves.)^ ^The eParentWithin and rParentScore values are copies of the eWithin and rScore values from the containing subtree of the current row. ^The anQueue field is an array of mxLevel+1 unsigned integers that tell the current number of elements in the priority queue at each level. | | | 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 | is the rowid (the first of the 3 to 11 columns in the R*Tree) for the element being considered. iRowid is only valid for leaves.)^ ^The eParentWithin and rParentScore values are copies of the eWithin and rScore values from the containing subtree of the current row. ^The anQueue field is an array of mxLevel+1 unsigned integers that tell the current number of elements in the priority queue at each level. <h2>Additional Considerations for Custom Queries</h2> <p> ^The MATCH operator of a custom R*Tree query function must be a top-level AND-connected term of the WHERE clause, or else it will not be usable by the R*Tree query optimizer and the query will not be runnable. ^If the MATCH operator is connected to other terms of the WHERE clause via an OR operator, for example, the query will fail with an error. |
︙ | ︙ |