Documentation Source Text

Check-in [5f75ab8f7e]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update the performance graphs for version 3.20.0.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 5f75ab8f7e2ae3a0d635cf59db3db35d2ce3410e46dfb5390cfc94c53134e295
User & Date: drh 2017-07-24 14:17:31
Context
2017-07-24
15:15
Fix typos in the pointer-passing document. check-in: 2c7e1e0eff user: drh tags: trunk
14:17
Update the performance graphs for version 3.20.0. check-in: 5f75ab8f7e user: drh tags: trunk
13:24
Add requirements marks on the deterministic.html document. check-in: c705444999 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to images/cpu-usage.jpg.

cannot compute difference between binary files

Changes to misc/speed-size-graph.ods.

cannot compute difference between binary files

Changes to pages/changes.in.

94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
<li> Add the -withoutnulls option to the [TCL interface eval method].
<li> Enhance the [sqlite3_analyzer.exe] utility program so that it shows
     the number of bytes of metadata on btree pages.
<li> The [SQLITE_DBCONFIG_ENABLE_QPSG] run-time option and the
     [SQLITE_ENABLE_QPSG] compile-time option enable the
     [query planner stability guarantee].  See also ticket
     [https://www.sqlite.org/src/info/892fc34f173e99d8|892fc34f173e99d8]
<li> Miscellaneous optimizations result in a 2% reduction in CPU cycles.
<p><b>Bug Fixes:</b>
<li> Fix the query planner so that it knows not to use [automatic indexes]
     on the right table of LEFT JOIN if the WHERE clause uses the [IS operator].
     Fix for [https://sqlite.org/src/info/ce68383bf6aba|ce68383bf6aba].
<li> Ensure that the query planner knows that any column of a 
     [flattening optimization|flattened] LEFT JOIN can be NULL even 
     if that column is labeled with "NOT NULL". Fix for ticket 







|







94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
<li> Add the -withoutnulls option to the [TCL interface eval method].
<li> Enhance the [sqlite3_analyzer.exe] utility program so that it shows
     the number of bytes of metadata on btree pages.
<li> The [SQLITE_DBCONFIG_ENABLE_QPSG] run-time option and the
     [SQLITE_ENABLE_QPSG] compile-time option enable the
     [query planner stability guarantee].  See also ticket
     [https://www.sqlite.org/src/info/892fc34f173e99d8|892fc34f173e99d8]
<li> Miscellaneous optimizations result in a 2% reduction in [CPU cycles used].
<p><b>Bug Fixes:</b>
<li> Fix the query planner so that it knows not to use [automatic indexes]
     on the right table of LEFT JOIN if the WHERE clause uses the [IS operator].
     Fix for [https://sqlite.org/src/info/ce68383bf6aba|ce68383bf6aba].
<li> Ensure that the query planner knows that any column of a 
     [flattening optimization|flattened] LEFT JOIN can be NULL even 
     if that column is labeled with "NOT NULL". Fix for ticket 

Changes to pages/cpu.in.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

22
23

24
25
26
27
28
29
30
..
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73















74
75
76
77
78
79
80
..
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
...
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
<title>Measuring and Reducing CPU Usage in SQLite</title>
<tcl>hd_keywords {CPU cycles used} {CPU performance measurement}</tcl>

<table_of_contents>

<h1>Overview</h1>

<p>The graph belows charts the number of CPU cycles used by SQLite on a
standard workload, for all versions of SQLite going back about 8 years.
As can be seen, the number of CPU cycles used by SQLite has been
halved in just the past three years.

<p>
This article describes how the SQLite developers measure CPU usage,
what those measurements actually mean, and the techniques used by
SQLite developers on their continuing quest to further reduce the
CPU usage of the SQLite library.
</p>

<center>
<hr>

<div class="imgcontainer"><img src="./images/cpu-usage.jpg"></img></div><br>
Measured using cachegrind on Ubuntu 16.04 on x64 with gcc 5.4.0 and -Os

<hr>
</center>

<h1>Measuring Performance</h1>

<p>In brief, the CPU performance of SQLite is measured as follows:

................................................................................
-O2 option creates so much code movement that it is difficult to associate
specific CPU instructions to C source code lines.

<h2>Workload</h2>

<p>
The "typical" workload is generated by the
[https://www.sqlite.org/src/file/test/speedtest1.c|speedtest1.c]
program in the canonical SQLite source tree.  This program strives to
exercise the SQLite library in a way that is typical of real-world
applications.  Of course, every application is different, and so
no test program can exactly mirror the behavior of all applications.

<p>
The speedtest1.c program is updated from time to time as the SQLite
developers' understanding of what constitutes "typical" usage evolves.

















<h2>Performance Measurement</h2>

<p>
[http://valgrind.org/docs/manual/cg-manual.html|Cachegrind] is used to
measure performance because it gives answers that are repeatable to 
7 or more significant digits.  In comparison, actual (wall-clock)
................................................................................
real-world timings.  But hundreds or thousands of microoptimizations
add up, resulting in measurable real-world performance gains.

<h1>Performance Measurement Workflow</h1>

<p>
As SQLite developers edit the SQLite source code, they run the
[https://www.sqlite.org/src/file/tool/speed-check.sh | speed-check.sh]
shell script to track the performance impact of changes.  This
script compiles the speedtest1.c program, runs it under cachegrind,
processes the cachegrind output using the
[https://www.sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] TCL
script, then saves the results in a series of text files.
Typical output from the speed-check.sh script looks like this:

<blockquote><pre>
==16429== I   refs:      <b><font color="red">1,291,005,499</font></b>
==16429== I1  misses:       24,688,182
==16429== LLi misses:            5,027
................................................................................

<p>The important parts of the output (the parts that the developers pay
the most attention to) are shown in red.
Basically, the developers want to know the size of the compiled SQLite
library and how many CPU cycles were needed to run the performance test.

<p>The output from the 
[https://www.sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] script
shows the number of CPU cycles spent on each line of code.
The report is approximately 80,000 lines long.  The following is a brief
snippet taken from the middle of the report to show what it looks like:

<hr>
<blockquote><pre>
         .  SQLITE_PRIVATE int sqlite3BtreeNext(BtCursor *pCur, int *pRes){







|
|
|











>
|
|
>







 







|









>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







 







|



|







 







|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
..
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
...
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
...
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
<title>Measuring and Reducing CPU Usage in SQLite</title>
<tcl>hd_keywords {CPU cycles used} {CPU performance measurement}</tcl>

<table_of_contents>

<h1>Overview</h1>

<p>The graph below shows the number of CPU cycles used by SQLite on a
standard workload, for all versions of SQLite going back about 5.5 years.
The number of CPU cycles used by SQLite has been
halved in just the past three years.

<p>
This article describes how the SQLite developers measure CPU usage,
what those measurements actually mean, and the techniques used by
SQLite developers on their continuing quest to further reduce the
CPU usage of the SQLite library.
</p>

<center>
<hr>
<div class="imgcontainer">
<img src="./images/cpu-usage.jpg"></img></div><br>
Measured using cachegrind on Ubuntu 16.04 on x64 with gcc 5.4.0 and -Os.<br>
Values normalized so that version 3.20.0 is 100%.
<hr>
</center>

<h1>Measuring Performance</h1>

<p>In brief, the CPU performance of SQLite is measured as follows:

................................................................................
-O2 option creates so much code movement that it is difficult to associate
specific CPU instructions to C source code lines.

<h2>Workload</h2>

<p>
The "typical" workload is generated by the
[https://sqlite.org/src/file/test/speedtest1.c|speedtest1.c]
program in the canonical SQLite source tree.  This program strives to
exercise the SQLite library in a way that is typical of real-world
applications.  Of course, every application is different, and so
no test program can exactly mirror the behavior of all applications.

<p>
The speedtest1.c program is updated from time to time as the SQLite
developers' understanding of what constitutes "typical" usage evolves.

<p>
The 
[https://sqlite.org/src/file/tool/speed-check.sh|speed-check.sh] shell
script, also in the canonical source tree, is used to run the speedtest1.c
program.  To replicate the performance measurements, collect the following
files into a single directory:
<ul>
<li> the "speed-check.sh" script,
<li> the "speedtest1.c" test program, and
<li> the [amalgamation|SQLite amalgamation] source files "sqlite3.c" and
     "sqlite3.h"
</ul>
<p>
Then run "sh speed-check.sh trunk".


<h2>Performance Measurement</h2>

<p>
[http://valgrind.org/docs/manual/cg-manual.html|Cachegrind] is used to
measure performance because it gives answers that are repeatable to 
7 or more significant digits.  In comparison, actual (wall-clock)
................................................................................
real-world timings.  But hundreds or thousands of microoptimizations
add up, resulting in measurable real-world performance gains.

<h1>Performance Measurement Workflow</h1>

<p>
As SQLite developers edit the SQLite source code, they run the
[https://sqlite.org/src/file/tool/speed-check.sh | speed-check.sh]
shell script to track the performance impact of changes.  This
script compiles the speedtest1.c program, runs it under cachegrind,
processes the cachegrind output using the
[https://sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] TCL
script, then saves the results in a series of text files.
Typical output from the speed-check.sh script looks like this:

<blockquote><pre>
==16429== I   refs:      <b><font color="red">1,291,005,499</font></b>
==16429== I1  misses:       24,688,182
==16429== LLi misses:            5,027
................................................................................

<p>The important parts of the output (the parts that the developers pay
the most attention to) are shown in red.
Basically, the developers want to know the size of the compiled SQLite
library and how many CPU cycles were needed to run the performance test.

<p>The output from the 
[https://sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] script
shows the number of CPU cycles spent on each line of code.
The report is approximately 80,000 lines long.  The following is a brief
snippet taken from the middle of the report to show what it looks like:

<hr>
<blockquote><pre>
         .  SQLITE_PRIVATE int sqlite3BtreeNext(BtCursor *pCur, int *pRes){