Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Include all changes since 3.7.17 in the 3.8.0.1 change log. Typos fixed in the NGQP document. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | branch-3.8.0 |
Files: | files | file ages | folders |
SHA1: |
3f23dd6ba9f80282688d647366e04b66 |
User & Date: | drh 2013-08-30 14:03:39.847 |
Context
2013-09-02
| ||
14:20 | In the change log for patch releases, repeat the changes of the main release for the page containing changes of the just the patch release, but do not repeat the main release changes in the overall change log. (check-in: affc1a3589 user: drh tags: branch-3.8.0) | |
2013-08-30
| ||
14:28 | Added a 3.8.1 change log. Added documentation on SQLITE_MINIMUM_FILE_DESCRIPTOR. (check-in: 4e286cee85 user: drh tags: trunk) | |
14:03 | Include all changes since 3.7.17 in the 3.8.0.1 change log. Typos fixed in the NGQP document. (check-in: 3f23dd6ba9 user: drh tags: branch-3.8.0) | |
2013-08-29
| ||
17:44 | Add the source-id and sha1sum for 3.8.0.1 to changes.in. (check-in: d22f4d2e9a user: dan tags: branch-3.8.0) | |
Changes
Changes to pages/changes.in.
︙ | ︙ | |||
44 45 46 47 48 49 50 51 52 53 54 55 56 57 | if {$nChng==1 && [file exists $DEST/$filename]} { file copy -force $DEST/$filename $DEST/releaselog/current.html } } } chng {2013-08-29 (3.8.0.1)} { <li>Fix an off-by-one error that caused quoted empty string at the end of a CRNL-terminated line of CSV input to be misread by the command-line shell. <li>Fix a query planner bug involving a LEFT JOIN with a BETWEEN or LIKE/GLOB constraint and then another INNER JOIN to the right that involves an OR constraint. <li>Fix a query planner bug that could result in a segfault when querying tables with a UNIQUE or PRIMARY KEY constraint with more than four columns. | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 | if {$nChng==1 && [file exists $DEST/$filename]} { file copy -force $DEST/$filename $DEST/releaselog/current.html } } } chng {2013-08-29 (3.8.0.1)} { <li>Add support for [partial indexes]</li> <li>Cut-over to the [next generation query planner] for faster and better query plans. <li>The [EXPLAIN QUERY PLAN] output no longer shows an estimate of the number of rows generated by each loop in a join. <li>Added the [FTS4 notindexed option], allowing non-indexed columns in an FTS4 table. <li>Added the [SQLITE_STMTSTATUS_VM_STEP] option to [sqlite3_stmt_status()]. <li>Added the [cache_spill pragma]. <li>Added the [query_only pragma]. <li>Added the [defer_foreign_keys pragma] and the [sqlite3_db_status](db, [SQLITE_DBSTATUS_DEFERRED_FKS],...) C-language interface. <li>Added the "percentile()" function as a [loadable extension] in the ext/misc subdirectory of the source tree. <li>Added the [SQLITE_ALLOW_URI_AUTHORITY] compile-time option. <li>Add the [sqlite3_cancel_auto_extension(X)] interface. <li>A running SELECT statement that lacks a FROM clause (or any other statement that never reads or writes from any database file) will not prevent a read transaction from closing. <li>Add the [SQLITE_DEFAULT_AUTOMATIC_INDEX] compile-time option. Setting this option to 0 disables automatic indices by default. <li>Issue an [SQLITE_WARNING_AUTOINDEX] warning on the [SQLITE_CONFIG_LOG] whenever the query planner uses an automatic index. <li>Added the [SQLITE_FTS3_MAX_EXPR_DEPTH] compile-time option. <li>Added an optional 5th parameter defining the collating sequence to the next_char() extension SQL function. <li>The [SQLITE_BUSY_SNAPSHOT] extended error code is returned in WAL mode when a read transaction cannot be upgraded to a write transaction because the read is on an older snapshot. <li>Enhancements to the sqlite3_analyzer utility program to provide size information separately for each individual index of a table, in addition to the aggregate size. <li>Allow read transactions to be freely opened and closed by SQL statements run from within the implementation of [application-defined SQL functions] if the function is called by a SELECT statement that does not access any database table. <li>Disable the use of posix_fallocate() on all (unix) systems unless the HAVE_POSIX_FALLOCATE compile-time option is used. <li>Update the ".import" command in the [command-line shell] to support multi-line fields and correct RFC-4180 quoting and to issue warning and/or error messages if the input text is not strictly RFC-4180 compliant. <li>Bug fix: In the [unicode61] tokenizer of [FTS4], treat all private code points as identifier symbols. <li>Bug fix: Bare identifiers in ORDER BY clauses bind more tightly to output column names, but identifiers in expressions bind more tightly to input column names. Identifiers in GROUP BY clauses always prefer output column names, however. <li>Bug fixes: Multiple problems in the legacy query optimizer were fixed by the move to [NGQP]. </ul><p>The above are changes since [version 3.7.17]. The differences between 3.8.0 and 3.8.0.1 are as follows:</p><ul> <li>Fix an off-by-one error that caused quoted empty string at the end of a CRNL-terminated line of CSV input to be misread by the command-line shell. <li>Fix a query planner bug involving a LEFT JOIN with a BETWEEN or LIKE/GLOB constraint and then another INNER JOIN to the right that involves an OR constraint. <li>Fix a query planner bug that could result in a segfault when querying tables with a UNIQUE or PRIMARY KEY constraint with more than four columns. |
︙ | ︙ |
Changes to pages/queryplanner-ng.in.
︙ | ︙ | |||
201 202 203 204 205 206 207 | nodes in the graph.</p> <p>The problem of finding the best query plan is equivalent to finding a minimum-cost path through the graph that visits each node exactly once.</p> <p>(Side note: The costs estimates in the TPC-H Q8 graph were computed | | | 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 | nodes in the graph.</p> <p>The problem of finding the best query plan is equivalent to finding a minimum-cost path through the graph that visits each node exactly once.</p> <p>(Side note: The costs estimates in the TPC-H Q8 graph were computed by the query planner in SQLite 3.7.16 and converted using a natural logarithm.) </p> <h3>3.2 Complications</h3> <p>The presentation of the query planner problem above is a simplification. The costs are estimates. We cannot know what the true cost of running a loop is until we actually run the loop. |
︙ | ︙ | |||
253 254 255 256 257 258 259 | GROUP BY, or DISTINCT clause. So for TPC-H Q8, the graph above is a reasonable representation of what needs to be computed. The general case involves a lot of extra complication, which for clarity is neglected in the remainder of this article.</p> <h3>3.3 Finding The Best Query Plan</h3> | | | | 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 | GROUP BY, or DISTINCT clause. So for TPC-H Q8, the graph above is a reasonable representation of what needs to be computed. The general case involves a lot of extra complication, which for clarity is neglected in the remainder of this article.</p> <h3>3.3 Finding The Best Query Plan</h3> <p>Prior to version 3.8.0, SQLite always used the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan. The NN heuristic makes a single traversal of the graph, always choosing the lowest-cost arc as the next step. The NN heuristic works surprisingly well in most cases. And NN is fast, so that SQLite is able to quickly find good plans for even large 64-way joins. In contrast, other SQL database engines that do more extensive searching tend to bog down when the number of tables in a join goes above 10 or 15.</p> <p>Unfortunately, the query plan computed by NN for TPC-H Q8 is not optimal. The plan computed using NN is R-N1-N2-S-C-O-L-P with a cost of 36.92. The notation in the previous sentence means that the R table is run in the outer loop, N1 is in the next inner loop, N2 is in the third loop, and so forth down to P which is in the inner-most loop. The shortest path through the graph (as found via exhaustive search) is P-L-O-C-N1-R-S-N2 with a cost of 27.38. The difference might not seem like much, but remember that the costs are logarithmic, so the shortest path is nearly 750 times faster than that path found using the NN heuristic.</p> <p>One solution to this problem is to change SQLite to do an exhaustive search for the best path. But an exhaustive search requires time proportional to K! (where K is the number of tables in the join) and so when you get beyond a 10-way join, the time |
︙ | ︙ | |||
560 561 562 563 564 565 566 | <center> <img src="images/qp/fqp1.gif"> </center> <p> In the "without ANALYZE" case on the left, the NN algorithm chooses loop P (PLINK) as the outer loop because 4.9 is less than 5.2, resulting | | | 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 | <center> <img src="images/qp/fqp1.gif"> </center> <p> In the "without ANALYZE" case on the left, the NN algorithm chooses loop P (PLINK) as the outer loop because 4.9 is less than 5.2, resulting in path P-T which is algorithm-1. NN only looks at the single best choice at each step so it completely misses the fact that 5.2+4.4 makes a slightly cheaper plan than 4.9+4.8. But the N3 algorithm keeps track of the 5 best paths for a 2-way join, so it ends up selecting path T-P because of its slightly lower overall cost. Path T-P is algorithm-2. </p> |
︙ | ︙ | |||
586 587 588 589 590 591 592 | in the TPC-H Q8 graph.)</p> <h3>4.2 Fixing The Problem</h3> <p>Running [ANALYZE] on the repository database immediately fixed the performance problem. However, we want Fossil to be robust and to always work quickly regardless of whether or not its repository has been analyzed. | | | 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 | in the TPC-H Q8 graph.)</p> <h3>4.2 Fixing The Problem</h3> <p>Running [ANALYZE] on the repository database immediately fixed the performance problem. However, we want Fossil to be robust and to always work quickly regardless of whether or not its repository has been analyzed. For this reason, the query was modified to use the CROSS JOIN operator instead of the plain JOIN operator. SQLite will not reorder the tables of a CROSS JOIN. This is a long-standing feature of SQLite that is specifically designed to allow knowledgeable programmers to enforce a particular loop nesting order. Once the join was changed to CROSS JOIN (the addition of a single keyword) the NGQP was forced to chose the faster algorithm-1 regardless of whether or not |
︙ | ︙ |