Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Updates to the next-generation-query-planner document. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
28554e75969d632ff9111f5750d0db22 |
User & Date: | drh 2013-04-30 22:20:00.775 |
Context
2013-05-01
| ||
00:14 | Fix a typo in intern-v-extern-blob.html (check-in: c8b280b8eb user: drh tags: trunk) | |
2013-04-30
| ||
22:20 | Updates to the next-generation-query-planner document. (check-in: 28554e7596 user: drh tags: trunk) | |
16:49 | Add the Next Generation Query Planner document. (check-in: 74fbc7b392 user: drh tags: trunk) | |
Changes
Changes to pages/queryplanner-ng.in.
︙ | ︙ | |||
162 163 164 165 166 167 168 | R (cost: 3.56) <br> N1 (cost: 5.52) <br> N2 (cost: 5.52) <br> P (cost: 7.71) <br> </blockquote> <p>The second step would find the four shortest paths to visit two nodes | | > > > < | > | | | | | | | | > > > > > | 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 | R (cost: 3.56) <br> N1 (cost: 5.52) <br> N2 (cost: 5.52) <br> P (cost: 7.71) <br> </blockquote> <p>The second step would find the four shortest paths to visit two nodes and that begin with one of the four paths from the previous step. In the case where two or more paths are equivalent (they have the same set of visited nodes, though possibly in a different order) then only the first and lowest cost path is retained. We have:</p> <blockquote> R-N1 (cost: 7.03) <br> R-N2 (cost: 9.08) <br> N2-N1 (cost: 11.04) <br> R-P {cost: 11.27} <br> </blockquote> <p>The third stop starts with the four shortest two-node paths and finds the four shortest non-equivalent three-node paths:</p> <blockquote> R-N1-N2 (cost: 12.55) <br> R-N1-C (cost: 13.43) <br> R-N1-P (cost: 14.74) <br> R-N2-S (cost: 15.08) <br> </blockquote> <p>And so forth. There are 8 nodes in the TPC-H Q8 query, so this process repeats a total of 8 times. In the general case of a K-way join, the storage requirements is O(N) and the computation time is O(K*N). That is a lot less than an exact solution which requires O(2**K) time.</p> <p>But what value to choose for N? We might make N==K. This makes the algorithm O(N**2) which is actually still quite efficient, since the maximum value of K is 64. But that is not enough for the TPC-H Q8 problem. With N==8 on TPC-H Q8 the NNN algorithm finds the solution R-N1-C-O-L-S-N2-P with a cost of 29.78. That is an improvement over NN, but it is still not optimal.</p> <p>NNN finds the optimal solution for TPC-H Q8 when N is 10 or greater.</p> <p>The current thinking is to make N a parameter with a default value of 15 or so. This will likely find at least a very good plan in most circumstances. We think you will need to work very hard to construct a query where NNN with N==15 does not find the optimal plan. Remember that the all SQLite versions prior to 3.7.18 do an excellent job with N==1. The value of N==15 would be the compile-time default, but can be changed using a PRAGMA.</p> <p>Another idea begin considered is to initially attempt to find a query plan using N==1 and then fall back and make a second attempt with a larger N if the first attempt fails to find a provably optimal plan. A provably optimal plan is one that uses the lowest-cost arc into each node.</p> |