Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Additional work on the R-Tree documentation. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
010e1a52d7a452b87be8a26708b46951 |
User & Date: | drh 2008-06-27 18:56:49.000 |
Context
2008-06-28
| ||
13:31 | Copy the date+time function documentation out of the wiki. (check-in: 5178724f27 user: drh tags: trunk) | |
2008-06-27
| ||
18:56 | Additional work on the R-Tree documentation. (check-in: 010e1a52d7 user: drh tags: trunk) | |
17:20 | Initial check-in of the rtree documentation. Very incomplete. (check-in: a7b39daebb user: drh tags: trunk) | |
Changes
Changes to pages/35to36.in.
︙ | ︙ | |||
257 258 259 260 261 262 263 | <li><p>The [sqlite3_next_stmt()] interface allows an application to discover all [prepared statements] associated with a [database connection].</p></li> <li><p>Added the [page_count] PRAGMA for returning the size of the underlying database file in pages.</p></li> | | < | 257 258 259 260 261 262 263 264 265 266 267 268 | <li><p>The [sqlite3_next_stmt()] interface allows an application to discover all [prepared statements] associated with a [database connection].</p></li> <li><p>Added the [page_count] PRAGMA for returning the size of the underlying database file in pages.</p></li> <li><p>Added a new [rtree | R*Tree index extension].</p></li> </ol> } </tcl> |
Changes to pages/amalgamation.in.
1 2 3 4 5 6 7 | <title>The SQLite Amalgamation</title> <tcl>hd_keywords {amalgamation} {the amalgamation}</tcl> <h2>The SQLite Amalgamation</h2> <p>The core SQLite library consists of about 76 files of C code (as of version 3.6.0) in the core with 11 additional files | | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | <title>The SQLite Amalgamation</title> <tcl>hd_keywords {amalgamation} {the amalgamation}</tcl> <h2>The SQLite Amalgamation</h2> <p>The core SQLite library consists of about 76 files of C code (as of version 3.6.0) in the core with 11 additional files in the FTS3 and [RTREE] extensions. Most of these are "source" files in the sense that they are stored in the configuration management system and are maintained directly. But 6 of the core C files are generated automatically during the compilation process. Of the 87 code files, 66 are C code and 21 are C header files.</p> <p>The standard makefiles for SQLite have a target for building an object we call the "amalgamation". The amalgamation is a single C code file, named "sqlite3.c", that contains all C code for the core SQLite library and the FTS3 and [RTREE] extensions. This file contains about 91K lines of code (55K if you omit blank lines and comments) and is over 3 megabytes in size.</p> <p>The amalgamation contains everything you need to integrate SQLite into a larger project. Just copy the amalgamation into your source directory and compile it along with the other C code files in your project. |
︙ | ︙ |
Changes to pages/changes.in.
︙ | ︙ | |||
64 65 66 67 68 69 70 | <li>Added the [sqlite3_config()] interface for doing run-time configuration of the entire SQLite library.</li> <li>Added the [sqlite3_status()] interface used for querying run-time status information about the overall SQLite library and its subsystems.</li> <li>Added the [sqlite3_initialize()] and [sqlite3_shutdown()] interfaces.</li> <li>Added the [PRAGMA page_count] command.</li> <li>Added the [sqlite3_next_stmt()] interface.</li> | | | 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 | <li>Added the [sqlite3_config()] interface for doing run-time configuration of the entire SQLite library.</li> <li>Added the [sqlite3_status()] interface used for querying run-time status information about the overall SQLite library and its subsystems.</li> <li>Added the [sqlite3_initialize()] and [sqlite3_shutdown()] interfaces.</li> <li>Added the [PRAGMA page_count] command.</li> <li>Added the [sqlite3_next_stmt()] interface.</li> <li>Added a new [rtree | R*Tree virtual table]</li> } chng {2008 May 14 (3.5.9)} { <li>Added <em>experimental</em> support for the [journal_mode] PRAGMA and persistent journal.</li> <li>[journal_mode | Journal mode PERSIST] is the default behavior in [locking_mode | exclusive locking mode].</li> |
︙ | ︙ |
Changes to pages/compile.in.
︙ | ︙ | |||
225 226 227 228 229 230 231 | option is not used, the [sqlite3_release_memory()] interface is a no-op. Since [sqlite3_soft_heap_limit()] depends on [sqlite3_release_memory()], this option is also necessary for the correct operation of [sqlite3_soft_heap_limit()]. } COMPILE_OPTION {SQLITE_ENABLE_RTREE} { | | | | 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 | option is not used, the [sqlite3_release_memory()] interface is a no-op. Since [sqlite3_soft_heap_limit()] depends on [sqlite3_release_memory()], this option is also necessary for the correct operation of [sqlite3_soft_heap_limit()]. } COMPILE_OPTION {SQLITE_ENABLE_RTREE} { This option causes SQLite to include support for the [rtree | R*Tree index extension]. } </tcl> <a name="omitfeatures"></a> <h2>1.5 Options To Omit Features</h2> <p>The following options can used to reduce the size of the compiled |
︙ | ︙ |
Changes to pages/rtree.in.
1 | <title>The SQLite R*Tree Module</title> | | | 1 2 3 4 5 6 7 8 9 | <title>The SQLite R*Tree Module</title> <tcl>hd_keywords rtree RTREE</tcl> <h1>1.0 Overview</h1> <p> An <a href="http://en.wikipedia.org/wiki/R-tree">R-Tree</a> is a special index that is designed for doing range queries. R-Trees are most commonly used in geospatial systems where each entry is a rectangle with minimum and |
︙ | ︙ | |||
163 164 165 166 167 168 169 | you can efficiently do inequality queries against the coordinate ranges. To find all elements of the index that are contained within the vicinity of Charlotte, North Carolina, one might do: </p> <blockquote><pre> SELECT id FROM demo_index | | | | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 | you can efficiently do inequality queries against the coordinate ranges. To find all elements of the index that are contained within the vicinity of Charlotte, North Carolina, one might do: </p> <blockquote><pre> SELECT id FROM demo_index WHERE minX>=-81.08 AND maxX<=-80.58 AND minY>=35.00 AND maxY<=35.44; </pre></blockquote> <p> The query above would very quickly locate id of 1 even if the R*Tree contained millions of entries. The previous is an example of a "contained-within" query. The R*Tree also supports "overlapping" queries. For example, to find all bounding boxes that overlap the Charlotte area: </p> <blockquote><pre> SELECT id FROM demo_index WHERE maxX>=-81.08 AND minX<=-80.58 AND maxY>=35.00 AND minY<=35.44; </pre></blockquote> <p> This second query would find both entry 1 (the SQLite.org office) which is entirely contained within the query box and also Mel Watt's Congressional District which extends well outside the query box but still overlaps the query box. </p> <p> Note that is not necessary for all coordinates in an R*Tree index to be constrained in order for the index search to be efficient. One might, for example, want to query all objects that overlap with the 35th parallel: </p> <blockquote><pre> SELECT id FROM demo_index WHERE maxY>=35.0 AND minY<=35.0; </pre></blockquote> <p> But, generally speaking, the more constraints that the R*Tree module has to work with, and the smaller the bounding box, the faster the results will come back. </p> <h1>4.0 Using R*Trees Effectively</h1> <p> The only information that an R*Tree index stores about an object is its integer ID and its bounding box. Additional information needs to be stored in separate tables and related to the R*Tree index using the primary key. For the example above, one might create an auxiliary table as follows: </p> <blockquote><pre> CREATE TABLE demo_data( id INTEGER PRIMARY KEY, -- primary key objname TEXT, -- name of the object objtype TEXT, -- object type boundary BLOB -- detailed boundary of object ); </pre></blockquote> <p> In this example, the demo_data.boundary field is intended to hold some kind of binary representation of the precise boundaries of the object. The R*Tree index only holds an axis-aligned rectanglar boundary for the object. The R*Tree boundary is just an approximation of the true object boundary. So what typically happens is that the R*Tree index is used to narrow a search down to a list of candidate objects and then more detailed and expensive computations are done on each candidate to find if the candidate truely meets the search criteria. </p> <blockquote><p> <b>Key Point:</b> An R*Tree index does not normally provide the exact answer but merely reduces the set of potential answers from millions to dozens. </p></blockquote> <p> Suppose the demo_data.boundary field holds some proprietary data description of a complex two-dimensional boundary for an object and suppose that the application has used the [sqlite3_create_function()] interface to created application-defined functions "contained_in" and "overlaps" accept two demo_data.boundary objects and return true or false. One may assume that "contained_in" and "overlaps" are relatively slow functions that we do not want to invoke too frequently. Then an efficient way to find the name of all objects located within the North Carolina 12th District, one my run a query like this: </p> <blockquote><pre> SELECT objname FROM demo_data, demo_index WHERE demo_data.id=demo_index.id AND contained_in(demo_data.boundary, :boundary) AND minX>=-81.0 AND max<=-79.6 AND minY>=35.0 AND maxY<=36.2; </pre></blockquote> <p>In the query above, one would presumably bind the binary BLOB description of the precise boundary of the 12th district to the ":boundary" parameter.</p> <p>Notice how the query above works: The R*Tree index runs in the outer loop to find entries that are contained within the bounding box of longitude -81..-79.6 and latitude 35.0..36.2. For each object identifer found, SQLite looks up the corresponding entry in the demo_data table. It then uses the boundary field from the demo_data table as a parameter to the contained_in() function and if that function returns true, the objname field from the demo_data table is returned as the next row of query result.</p> <p>One would get the same answer without the use of the R*Tree index using the following simpler query:</p> <blockquote><pre> SELECT objname FROM demo_data WHERE contained_in(demo_data.boundary, :boundary); </pre></blockquote> <p>The problem with this latter query is that it must apply the contained_in() function to millions of entires in the demo_data table. The use of the R*Tree in the penultimate query reduces the number of calls to contained_in() function to a small subset of the entire table. The R*Tree index did not find the exact answer itself, it merely limited the search space.</p> |