Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Update the performance graphs for version 3.20.0. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
5f75ab8f7e2ae3a0d635cf59db3db35d |
User & Date: | drh 2017-07-24 14:17:31.727 |
Context
2017-07-24
| ||
15:15 | Fix typos in the pointer-passing document. (check-in: 2c7e1e0eff user: drh tags: trunk) | |
14:17 | Update the performance graphs for version 3.20.0. (check-in: 5f75ab8f7e user: drh tags: trunk) | |
13:24 | Add requirements marks on the deterministic.html document. (check-in: c705444999 user: drh tags: trunk) | |
Changes
Changes to images/cpu-usage.jpg.
cannot compute difference between binary files
Changes to misc/speed-size-graph.ods.
cannot compute difference between binary files
Changes to pages/changes.in.
︙ | ︙ | |||
94 95 96 97 98 99 100 | <li> Add the -withoutnulls option to the [TCL interface eval method]. <li> Enhance the [sqlite3_analyzer.exe] utility program so that it shows the number of bytes of metadata on btree pages. <li> The [SQLITE_DBCONFIG_ENABLE_QPSG] run-time option and the [SQLITE_ENABLE_QPSG] compile-time option enable the [query planner stability guarantee]. See also ticket [https://www.sqlite.org/src/info/892fc34f173e99d8|892fc34f173e99d8] | | | 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 | <li> Add the -withoutnulls option to the [TCL interface eval method]. <li> Enhance the [sqlite3_analyzer.exe] utility program so that it shows the number of bytes of metadata on btree pages. <li> The [SQLITE_DBCONFIG_ENABLE_QPSG] run-time option and the [SQLITE_ENABLE_QPSG] compile-time option enable the [query planner stability guarantee]. See also ticket [https://www.sqlite.org/src/info/892fc34f173e99d8|892fc34f173e99d8] <li> Miscellaneous optimizations result in a 2% reduction in [CPU cycles used]. <p><b>Bug Fixes:</b> <li> Fix the query planner so that it knows not to use [automatic indexes] on the right table of LEFT JOIN if the WHERE clause uses the [IS operator]. Fix for [https://sqlite.org/src/info/ce68383bf6aba|ce68383bf6aba]. <li> Ensure that the query planner knows that any column of a [flattening optimization|flattened] LEFT JOIN can be NULL even if that column is labeled with "NOT NULL". Fix for ticket |
︙ | ︙ |
Changes to pages/cpu.in.
1 2 3 4 5 6 7 | <title>Measuring and Reducing CPU Usage in SQLite</title> <tcl>hd_keywords {CPU cycles used} {CPU performance measurement}</tcl> <table_of_contents> <h1>Overview</h1> | | | | > | | > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | <title>Measuring and Reducing CPU Usage in SQLite</title> <tcl>hd_keywords {CPU cycles used} {CPU performance measurement}</tcl> <table_of_contents> <h1>Overview</h1> <p>The graph below shows the number of CPU cycles used by SQLite on a standard workload, for all versions of SQLite going back about 5.5 years. The number of CPU cycles used by SQLite has been halved in just the past three years. <p> This article describes how the SQLite developers measure CPU usage, what those measurements actually mean, and the techniques used by SQLite developers on their continuing quest to further reduce the CPU usage of the SQLite library. </p> <center> <hr> <div class="imgcontainer"> <img src="./images/cpu-usage.jpg"></img></div><br> Measured using cachegrind on Ubuntu 16.04 on x64 with gcc 5.4.0 and -Os.<br> Values normalized so that version 3.20.0 is 100%. <hr> </center> <h1>Measuring Performance</h1> <p>In brief, the CPU performance of SQLite is measured as follows: |
︙ | ︙ | |||
57 58 59 60 61 62 63 | -O2 option creates so much code movement that it is difficult to associate specific CPU instructions to C source code lines. <h2>Workload</h2> <p> The "typical" workload is generated by the | | > > > > > > > > > > > > > > > | 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 | -O2 option creates so much code movement that it is difficult to associate specific CPU instructions to C source code lines. <h2>Workload</h2> <p> The "typical" workload is generated by the [https://sqlite.org/src/file/test/speedtest1.c|speedtest1.c] program in the canonical SQLite source tree. This program strives to exercise the SQLite library in a way that is typical of real-world applications. Of course, every application is different, and so no test program can exactly mirror the behavior of all applications. <p> The speedtest1.c program is updated from time to time as the SQLite developers' understanding of what constitutes "typical" usage evolves. <p> The [https://sqlite.org/src/file/tool/speed-check.sh|speed-check.sh] shell script, also in the canonical source tree, is used to run the speedtest1.c program. To replicate the performance measurements, collect the following files into a single directory: <ul> <li> the "speed-check.sh" script, <li> the "speedtest1.c" test program, and <li> the [amalgamation|SQLite amalgamation] source files "sqlite3.c" and "sqlite3.h" </ul> <p> Then run "sh speed-check.sh trunk". <h2>Performance Measurement</h2> <p> [http://valgrind.org/docs/manual/cg-manual.html|Cachegrind] is used to measure performance because it gives answers that are repeatable to 7 or more significant digits. In comparison, actual (wall-clock) |
︙ | ︙ | |||
92 93 94 95 96 97 98 | real-world timings. But hundreds or thousands of microoptimizations add up, resulting in measurable real-world performance gains. <h1>Performance Measurement Workflow</h1> <p> As SQLite developers edit the SQLite source code, they run the | | | | 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 | real-world timings. But hundreds or thousands of microoptimizations add up, resulting in measurable real-world performance gains. <h1>Performance Measurement Workflow</h1> <p> As SQLite developers edit the SQLite source code, they run the [https://sqlite.org/src/file/tool/speed-check.sh | speed-check.sh] shell script to track the performance impact of changes. This script compiles the speedtest1.c program, runs it under cachegrind, processes the cachegrind output using the [https://sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] TCL script, then saves the results in a series of text files. Typical output from the speed-check.sh script looks like this: <blockquote><pre> ==16429== I refs: <b><font color="red">1,291,005,499</font></b> ==16429== I1 misses: 24,688,182 ==16429== LLi misses: 5,027 |
︙ | ︙ | |||
127 128 129 130 131 132 133 | <p>The important parts of the output (the parts that the developers pay the most attention to) are shown in red. Basically, the developers want to know the size of the compiled SQLite library and how many CPU cycles were needed to run the performance test. <p>The output from the | | | 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 | <p>The important parts of the output (the parts that the developers pay the most attention to) are shown in red. Basically, the developers want to know the size of the compiled SQLite library and how many CPU cycles were needed to run the performance test. <p>The output from the [https://sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] script shows the number of CPU cycles spent on each line of code. The report is approximately 80,000 lines long. The following is a brief snippet taken from the middle of the report to show what it looks like: <hr> <blockquote><pre> . SQLITE_PRIVATE int sqlite3BtreeNext(BtCursor *pCur, int *pRes){ |
︙ | ︙ |