Documentation Source Text

Check-in [5f75ab8f7e]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update the performance graphs for version 3.20.0.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 5f75ab8f7e2ae3a0d635cf59db3db35d2ce3410e46dfb5390cfc94c53134e295
User & Date: drh 2017-07-24 14:17:31
Context
2017-07-24
15:15
Fix typos in the pointer-passing document. check-in: 2c7e1e0eff user: drh tags: trunk
14:17
Update the performance graphs for version 3.20.0. check-in: 5f75ab8f7e user: drh tags: trunk
13:24
Add requirements marks on the deterministic.html document. check-in: c705444999 user: drh tags: trunk
Changes
Hide Diffs Side-by-Side Diffs Ignore Whitespace Patch

Changes to images/cpu-usage.jpg.

cannot compute difference between binary files

Changes to misc/speed-size-graph.ods.

cannot compute difference between binary files

Changes to pages/changes.in.

    94     94   <li> Add the -withoutnulls option to the [TCL interface eval method].
    95     95   <li> Enhance the [sqlite3_analyzer.exe] utility program so that it shows
    96     96        the number of bytes of metadata on btree pages.
    97     97   <li> The [SQLITE_DBCONFIG_ENABLE_QPSG] run-time option and the
    98     98        [SQLITE_ENABLE_QPSG] compile-time option enable the
    99     99        [query planner stability guarantee].  See also ticket
   100    100        [https://www.sqlite.org/src/info/892fc34f173e99d8|892fc34f173e99d8]
   101         -<li> Miscellaneous optimizations result in a 2% reduction in CPU cycles.
          101  +<li> Miscellaneous optimizations result in a 2% reduction in [CPU cycles used].
   102    102   <p><b>Bug Fixes:</b>
   103    103   <li> Fix the query planner so that it knows not to use [automatic indexes]
   104    104        on the right table of LEFT JOIN if the WHERE clause uses the [IS operator].
   105    105        Fix for [https://sqlite.org/src/info/ce68383bf6aba|ce68383bf6aba].
   106    106   <li> Ensure that the query planner knows that any column of a 
   107    107        [flattening optimization|flattened] LEFT JOIN can be NULL even 
   108    108        if that column is labeled with "NOT NULL". Fix for ticket 

Changes to pages/cpu.in.

     1      1   <title>Measuring and Reducing CPU Usage in SQLite</title>
     2      2   <tcl>hd_keywords {CPU cycles used} {CPU performance measurement}</tcl>
     3      3   
     4      4   <table_of_contents>
     5      5   
     6      6   <h1>Overview</h1>
     7      7   
     8         -<p>The graph belows charts the number of CPU cycles used by SQLite on a
     9         -standard workload, for all versions of SQLite going back about 8 years.
    10         -As can be seen, the number of CPU cycles used by SQLite has been
            8  +<p>The graph below shows the number of CPU cycles used by SQLite on a
            9  +standard workload, for all versions of SQLite going back about 5.5 years.
           10  +The number of CPU cycles used by SQLite has been
    11     11   halved in just the past three years.
    12     12   
    13     13   <p>
    14     14   This article describes how the SQLite developers measure CPU usage,
    15     15   what those measurements actually mean, and the techniques used by
    16     16   SQLite developers on their continuing quest to further reduce the
    17     17   CPU usage of the SQLite library.
    18     18   </p>
    19     19   
    20     20   <center>
    21     21   <hr>
    22         -<div class="imgcontainer"><img src="./images/cpu-usage.jpg"></img></div><br>
    23         -Measured using cachegrind on Ubuntu 16.04 on x64 with gcc 5.4.0 and -Os
           22  +<div class="imgcontainer">
           23  +<img src="./images/cpu-usage.jpg"></img></div><br>
           24  +Measured using cachegrind on Ubuntu 16.04 on x64 with gcc 5.4.0 and -Os.<br>
           25  +Values normalized so that version 3.20.0 is 100%.
    24     26   <hr>
    25     27   </center>
    26     28   
    27     29   <h1>Measuring Performance</h1>
    28     30   
    29     31   <p>In brief, the CPU performance of SQLite is measured as follows:
    30     32   
................................................................................
    57     59   -O2 option creates so much code movement that it is difficult to associate
    58     60   specific CPU instructions to C source code lines.
    59     61   
    60     62   <h2>Workload</h2>
    61     63   
    62     64   <p>
    63     65   The "typical" workload is generated by the
    64         -[https://www.sqlite.org/src/file/test/speedtest1.c|speedtest1.c]
           66  +[https://sqlite.org/src/file/test/speedtest1.c|speedtest1.c]
    65     67   program in the canonical SQLite source tree.  This program strives to
    66     68   exercise the SQLite library in a way that is typical of real-world
    67     69   applications.  Of course, every application is different, and so
    68     70   no test program can exactly mirror the behavior of all applications.
    69     71   
    70     72   <p>
    71     73   The speedtest1.c program is updated from time to time as the SQLite
    72     74   developers' understanding of what constitutes "typical" usage evolves.
    73     75   
           76  +<p>
           77  +The 
           78  +[https://sqlite.org/src/file/tool/speed-check.sh|speed-check.sh] shell
           79  +script, also in the canonical source tree, is used to run the speedtest1.c
           80  +program.  To replicate the performance measurements, collect the following
           81  +files into a single directory:
           82  +<ul>
           83  +<li> the "speed-check.sh" script,
           84  +<li> the "speedtest1.c" test program, and
           85  +<li> the [amalgamation|SQLite amalgamation] source files "sqlite3.c" and
           86  +     "sqlite3.h"
           87  +</ul>
           88  +<p>
           89  +Then run "sh speed-check.sh trunk".
           90  +
    74     91   
    75     92   <h2>Performance Measurement</h2>
    76     93   
    77     94   <p>
    78     95   [http://valgrind.org/docs/manual/cg-manual.html|Cachegrind] is used to
    79     96   measure performance because it gives answers that are repeatable to 
    80     97   7 or more significant digits.  In comparison, actual (wall-clock)
................................................................................
    92    109   real-world timings.  But hundreds or thousands of microoptimizations
    93    110   add up, resulting in measurable real-world performance gains.
    94    111   
    95    112   <h1>Performance Measurement Workflow</h1>
    96    113   
    97    114   <p>
    98    115   As SQLite developers edit the SQLite source code, they run the
    99         -[https://www.sqlite.org/src/file/tool/speed-check.sh | speed-check.sh]
          116  +[https://sqlite.org/src/file/tool/speed-check.sh | speed-check.sh]
   100    117   shell script to track the performance impact of changes.  This
   101    118   script compiles the speedtest1.c program, runs it under cachegrind,
   102    119   processes the cachegrind output using the
   103         -[https://www.sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] TCL
          120  +[https://sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] TCL
   104    121   script, then saves the results in a series of text files.
   105    122   Typical output from the speed-check.sh script looks like this:
   106    123   
   107    124   <blockquote><pre>
   108    125   ==16429== I   refs:      <b><font color="red">1,291,005,499</font></b>
   109    126   ==16429== I1  misses:       24,688,182
   110    127   ==16429== LLi misses:            5,027
................................................................................
   127    144   
   128    145   <p>The important parts of the output (the parts that the developers pay
   129    146   the most attention to) are shown in red.
   130    147   Basically, the developers want to know the size of the compiled SQLite
   131    148   library and how many CPU cycles were needed to run the performance test.
   132    149   
   133    150   <p>The output from the 
   134         -[https://www.sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] script
          151  +[https://sqlite.org/src/file/tool/cg_anno.tcl | cg_anno.tcl] script
   135    152   shows the number of CPU cycles spent on each line of code.
   136    153   The report is approximately 80,000 lines long.  The following is a brief
   137    154   snippet taken from the middle of the report to show what it looks like:
   138    155   
   139    156   <hr>
   140    157   <blockquote><pre>
   141    158            .  SQLITE_PRIVATE int sqlite3BtreeNext(BtCursor *pCur, int *pRes){