Documentation Source Text

Artifact [be05eb81da]

Artifact be05eb81da3684ddbb90c383a4795e3976c221cb:

<title>How SQLite Is Tested</title>
<tcl>hd_keywords testing *tested {test suite}</tcl>

# This document contains many size statistics about SQLite, statistics
# that change frequently.  We want the document to be up-to-date.  To
# facilitate that, all the size values are defined by variables here
# which are then used as needed through the document.
# NOTE:  Also update the version number in the text!!!

# sloc sqlite3.c
set stat(coreSLOC)    84298  ;# Non-comment lines of amalgamation code 
# sloc test*.c
set stat(tclcSLOC)    24160  ;# Non-comment lines of test C code
# ls test*.c tclsqlite.c | wc
set stat(tclcNfile)      44  ;# Number of files of TCL C testcode + tclsqlite.c
# ls -l test*.c tclsqlite.c | awk '{sum+=$5}END{print sum}'
set stat(tclcNByte) 1060069  ;# Number of bytes of TCL C testcode + tclsqlite.c
# sloc *.test *.tcl
set stat(tclsSLOC)   246201  ;# Non-comment lines of TCL test script
# ls *.test *.tcl | wc
set stat(tclsNFile)     760  ;# Number of files of TCL test script
# ls -l *.test *.tcl | awk '{sum+=$5}END{print sum}'
set stat(tclsNByte) 10803982 ;# Number of bytes of TCL test script
# grep do_test *.test | wc; grep do_execsql_test *.test | wc
set stat(tclNTest)    30295  ;# Number of test cases in the TCL test suite
set stat(tclNEval)  1298536  ;# Number of test case evaluations
set stat(nSqlFuzz)   114899  ;# Number of SQL fuzz tests
set stat(vqNEval)    202234  ;# Number of test evaluations for veryquick.test
#  set stat(vqStmtCov)   97.23  ;# veryquick statement coverage
#  set stat(vqBrCov)     92.57  ;# veryquick branch coverage
#  set stat(allStmtCov)  99.50  ;# all.test statement coverage
#  set stat(allBrCov)    97.41  ;# all.test condition/decision coverage
# tclsh mkth3.tcl cfg/*.cfg */*.test >th3.c; sloc th3.c
set stat(th3SLOC)    691250  ;# Non-comment lines in full th3.c
# ls -l th3.c
set stat(th3NByte) 51188527  ;# Number of bytes in full th3.c
# grep th3testBegin */*.test
# grep th3oomBegin */*.test
# grep th3ioerrBegin */*.test
# grep '^--testcase' */*.test
set stat(th3NTest)      35211  ;# Number of test cases
# from output of a min.rc test run.
set stat(th3NECov)     836678  ;# Number of test case evals for coverage
#set stat(th3NETest)  7247055  ;# Number of test case evaluations
#set stat(th3NEExt) 589175483  ;# Number of test case evals extended
#set stat(th3NERel) 2500000000 ;# Number of test case evals release
set stat(th3StmtCov) 100.00  ;# TH3 statement coverage
set stat(th3BrCov)   100.00  ;# TH3 branch coverage
# wc `find . -name '*.test'` | awk '{x+=$1}END{print x}'
set stat(sltsSLOC)  90489494 ;# Non-comment lines of SLT test script
# ls -l `find . -name '*.test'` | awk '{sum+=$5}END{print sum}'
set stat(sltsNByte) 1116800308 ;# Bytes of SLT test script
# find . -name '*.test' | wc
set stat(sltsNFile)        622 ;# Files of SLT test script
# sloc md5.c slt_*.c sqllogictest.c
set stat(sltcSLOC)        1404 ;# Non-comment lines of SLT C code
# grep '^query' `fossil ls | awk '/\.test$/{print $2}'` | wc
set stat(sltNTest)     7195342 ;# Number of test cases in SLT
# grep 'assert(' sqlite3.c | wc
set stat(nAssert)         3691 ;# Number of assert statements
# grep 'testcase(' sqlite3.c | grep -v define | wc
set stat(nTestcase)        695 ;# Number of testcase statements

set stat(totalSLOC) [expr {$stat(tclcSLOC)+$stat(tclsSLOC)+

proc GB {expr} {
  set n [uplevel #0 expr $expr]
  hd_puts [format %.2f [expr {$n/(1000.0*1000.0*1000.0)}]]
proc MiB {expr} {
  set n [uplevel #0 expr $expr]
  hd_puts [format %.1f [expr {$n/(1024.0*1024.0)}]]
proc MB {expr} {
  set n [uplevel #0 expr $expr]
  hd_puts [format %.1f [expr {$n/(1000.0*1000.0)}]]
proc KiB {expr} {
  set n [uplevel #0 expr $expr]
  hd_puts [format %.1f [expr {$n/(1024.0)}]]
proc KB {expr} {
  set n [uplevel #0 expr $expr]
  hd_puts [format %.1f [expr {$n/(1000.0)}]]
proc N {expr} {
  hd_puts [uplevel #0 expr $expr]
proc version {} {
  hd_puts $::stat(version)


<h1 align="center">How SQLite Is Tested</h1>

<h2>1.0 Introduction</h2>

<p>The reliability and robustness of SQLite is achieved in part
by thorough and careful testing.</p>

<p>As of [version 3.8.0],
the SQLite library consists of approximately
<tcl>KB {$stat(coreSLOC)}</tcl> KSLOC of C code.
(KSLOC means thousands of "Source Lines Of Code" or, in other words,
lines of code excluding blank lines and comments.)
By comparison, the project has
hd_puts "[expr {int($stat(totalSLOC)/$stat(coreSLOC))}] times as much"
test code and test scripts - 
<tcl>KB {$stat(totalSLOC)}</tcl> KSLOC.</p>

<h3>1.1 Executive Summary</h3>

<li> Three independently developed test harnesses
<li> 100% branch test coverage in an as-deployed configuration
<li> Millions and millions of test cases
<li> Out-of-memory tests
<li> I/O error tests
<li> Crash and power loss tests
<li> Fuzz tests
<li> Boundary value tests
<li> Disabled optimization tests
<li> Regression tests
<li> Malformed database tests
<li> Extensive use of assert() and run-time checks
<li> Valgrind analysis
<li> Signed-integer overflow checks
<li> Checklists

<tcl>hd_fragment {harnesses} {test harness} {three test harnesses}</tcl>
<h2>2.0 Test Harnesses</h2>

<p>There are three independent test harnesses used for testing the 
core SQLite library.
Each test harness is designed, maintained, and managed separately
from the others.

<tcl>hd_fragment tcl {TCL test suite}</tcl>
The <b>TCL Tests</b> are the oldest set of tests for SQLite.  
They are contained in the same source tree as the
SQLite core and like the SQLite core are in the public domain.  The
TCL tests are the primary tests used during development.
The TCL tests are written using the 
[ | TCL scripting language].
The TCL test harness itself consists of <tcl>KB {$stat(tclcSLOC)}</tcl> KSLOC 
of C code used to create the TCL interface.  The test scripts are contained
in <tcl>N {$stat(tclsNFile)}</tcl> files totaling 
<tcl>MiB {$stat(tclsNByte)}</tcl>MB in size.  There are
<tcl>N {$stat(tclNTest)}</tcl> distinct test cases, but many of the test
cases are parameterized and run multiple times (with different parameters)
so that on a full test run millions of
separate tests are performed.

The <b>[TH3]</b> test harness is a set of proprietary tests, written in
C that provide 100% branch test coverage (and 100% MC/DC test coverage) to
the core SQLite library.  The TH3 tests are designed to run
on embedded and specialized platforms that would not easily support
TCL or other workstation services.  TH3 tests use only the published 
SQLite interfaces.  TH3 is free to [SQLite Consortium] members 
and is available by license to others.  TH3 consists of about
<tcl>MB {$stat(th3NByte)}</tcl> MB or <tcl>KB {$stat(th3SLOC)}</tcl> KSLOC
of C code implementing <tcl>N {$stat(th3NTest)}</tcl> distinct test cases.
TH3 tests are heavily parameterized, though, so a full-coverage test runs
about <tcl>MB {$stat(th3NECov)}</tcl> million different test
instances.  The cases that provide 100% branch test coverage constitute
a subset of the total TH3 test suite.  A soak test
prior to release does hundreds of millions of tests.
Additional information on TH3 is [TH3 | available separately].</p></li>

<tcl>hd_fragment slt {SLT} {SQL Logic Tests}</tcl>
The <a href=""><b>SQL Logic Test</b></a>
or SLT test harness is used to run huge numbers
of SQL statements against both SQLite and several other SQL database engines
and verify that they all get the same answers.  SLT currently compares
SQLite against PostgreSQL, MySQL, Microsoft SQL Server, and Oracle 10g.
SLT runs <tcl>MB {$stat(sltNTest)}</tcl> million queries comprising
<tcl>GB {$stat(sltsNByte)}</tcl>GB of test data.

<p>All of the tests above must run successfully, on multiple platforms
and under multiple compile-time configurations,
before each release of SQLite.</p>

<p>Prior to each check-in to the SQLite source tree, developers
typically run a subset (called "veryquick") of the Tcl tests
consisting of about 
<tcl>KB {$stat(vqNEval)}</tcl> thousand test cases.
The veryquick tests include everything except the anomaly, fuzz, and 
soak tests.  The idea behind the veryquick tests are that they are
sufficient to catch most errors, but also run in only a few minutes
instead of a few hours.</p>

<tcl>hd_fragment anomoly</tcl>
<h2>3.0 Anomaly Testing</h2>

<p>Anomaly tests are tests designed to verify the correct behavior
of SQLite when something goes wrong.  It is (relatively) easy to build
an SQL database engine that behaves correctly on well-formed inputs
on a fully functional computer.  It is more difficult to build a system
that responds sanely to invalid inputs and continues to function following
system malfunctions.  The anomaly tests are designed to verify the latter

<tcl>hd_fragment oomtesting</tcl>
<h3>3.1 Out-Of-Memory Testing</h3>

<p>SQLite, like all SQL database engines, makes extensive use of
malloc()  (See the separate report on
[memory allocation | dynamic memory allocation in SQLite] for
additional detail.)
On servers and workstations, malloc() never fails in practice and so correct
handling of out-of-memory (OOM) errors is not particularly important.
But on embedded devices, OOM errors are frighteningly common and since
SQLite is frequently used on embedded devices, it is important that
SQLite be able to gracefully handle OOM errors.</p>

<p>OOM testing is accomplished by simulating OOM errors.
SQLite allows an application to substitute an alternative malloc()
implementation using the [sqlite3_config]([SQLITE_CONFIG_MALLOC],...)
interface.  The TCL and TH3 test harnesses are both capable of
inserting a modified version of malloc() that can be rigged to fail 
after a certain number of allocations.  These instrumented mallocs
can be set to fail only once and then start working again, or to
continue failing after the first failure.  OOM tests are done in a
loop.  On the first iteration of the loop, the instrumented malloc
is rigged to fail on the first allocation.  Then some SQLite operation
is carried out and checks are done to make sure SQLite handled the
OOM error correctly.  Then the time-to-failure counter
on the instrumented malloc is increased by one and the test is
repeated.  The loop continues until the entire operation runs to
completion without ever encountering a simulated OOM failure.
Tests like this are run twice, once with the instrumented malloc
set to fail only once, and again with the instrumented malloc set
to fail continuously after the first failure.</p>

<tcl>hd_fragment ioerrtesting</tcl>
<h3>3.2 I/O Error Testing</h3>

<p>I/O error testing seeks to verify that SQLite responds sanely
to failed I/O operations.  I/O errors might result from a full disk drive,
malfunctioning disk hardware, network outages when using a network
file system, system configuration or permission changes that occur in the 
middle of an SQL operation, or other hardware or operating system 
malfunctions.  Whatever the cause, it is important that SQLite be able
to respond correctly to these errors and I/O error testing seeks to
verify that it does.</p>

<p>I/O error testing is similar in concept to OOM testing; I/O errors
are simulated and checks are made to verify that SQLite responds
correctly to the simulated errors.  I/O errors are simulated in both
the TCL and TH3 test harnesses by inserting a new
[sqlite3_vfs | Virtual File System object] that is specially rigged
to simulate an I/O error after a set number of I/O operations.
As with OOM error testing, the I/O error simulators can be set to
fail just once, or to fail continuously after the first failure.
Tests are run in a loop, slowly increasing the point of failure until
the test case runs to completion without error.  The loop is run twice,
once with the I/O error simulator set to simulate only a single failure
and a second time with it set to fail all I/O operations after the first

<p>In I/O error tests, after the I/O error simulation failure mechanism
is disabled, the database is examined using
[PRAGMA integrity_check] to make sure that the I/O error has not
introduced database corruption.</p>

<tcl>hd_fragment crashtesting</tcl>
<h3>3.3 Crash Testing</h3>

<p>Crash testing seeks to demonstrate that an SQLite database will not
go corrupt if the application or operating system crashes or if there
is a power failure in the middle of a database update.  A separate
white-paper titled
<a href="atomiccommit.html">Atomic Commit in SQLite</a> describes the
defensive measure SQLite takes to prevent database corruption following
a crash.  Crash tests strive to verify that those defensive measures
are working correctly.</p>

<p>It is impractical to do crash testing using real power failures, of
course, and so crash testing is done in simulation.  An alternative
[sqlite3_vfs | Virtual File System] is inserted that allows the test
harness to simulate the state of the database file following a crash.</p>

<p>In the TCL test harness, the crash simulation is done in a separate
process.  The main testing process spawns a child process which runs
some SQLite operation and randomly crashes somewhere in the middle of
a write operation.  A special [VFS] randomly reorders and corrupts
the unsynchronized
write operations to simulate the effect of buffered filesystems.  After
the child dies, the original test process opens and reads the test
database and verifies that the changes attempted by the child either
completed successfully or else were completely rolled back.  The
[integrity_check] [PRAGMA] is used to make sure no database corruption

<p>The TH3 test harness needs to run on embedded systems that do not
necessarily have the ability to spawn child processes, so it uses
an in-memory [VFS] to simulate crashes.  The in-memory [VFS] can be rigged
to make a snapshot of the entire filesystem after a set number of I/O
operations.  Crash tests run in a loop.  On each iteration of the loop,
the point at which a snapshot is made is advanced until the SQLite
operations being tested run to completion without ever hitting a
snapshot.  Within the loop, after the SQLite operation under test has
completed, the filesystem is reverted to the snapshot and random file
damage is introduced that is characteristic of the kinds of damage
one expects to see following a power loss.  Then the database is opened
and checks are made to ensure that it is well-formed and that the
transaction either ran to completion or was completely rolled back.
The interior of the loop is repeated multiple times for each
snapshot with different random damage each time.</p>

<tcl>hd_fragment multifail</tcl>
<h3>3.4 Compound failure tests</h3>

<p>The test suites for SQLite also explore the result of stacking
multiple failures.  For example, tests are run to ensure correct behavior
when an I/O error or OOM fault occurs while trying to recover from a
prior crash.

<tcl>hd_fragment fuzztesting</tcl>
<h2>4.0 Fuzz Testing</h2>

<p>[ | Fuzz testing]
seeks to establish that SQLite responds correctly to invalid, out-of-range,
or malformed inputs.</p>

<h3>4.1 SQL Fuzz</h3>

<p>SQL fuzz testing consists of creating syntactically correct yet
wildly nonsensical SQL statements and feeding them to SQLite to see
what it will do with them.  Usually some kind of error is returned
(such as "no such table").  Sometimes, purely by chance, the SQL
statement also happens to be semantically correct.  In that case, the
resulting prepared statement is run to make sure it gives a reasonable

<p>The SQL fuzz generator tests are part of the TCL test suite.
During a full test run, about <tcl>KB {$stat(nSqlFuzz)}</tcl> 
thousand fuzz SQL statements are
generated and tested.</p>

<h3>4.2 Malformed Database Files</h3>

<p>There are numerous test cases that verify that SQLite is able to
deal with malformed database files.
These tests first build a well-formed database file, then add
corruption by changing one or more bytes in the file by some means
other than SQLite.  Then SQLite is used to read the database.
In some cases, the bytes changes are in the middle of data.
This causes the content of the database to change while keeping the
database well-formed.
In other cases, unused bytes of the file are modified, which has
no effect on the integrity of the database.
The interesting cases are when bytes of the file that
define database structure get changed.  The malformed database tests
verify that SQLite finds the file format errors and reports them
using the [SQLITE_CORRUPT] return code without overflowing
buffers, dereferencing NULL pointers, or performing other
unwholesome actions.</p>

<h3>4.3 Boundary Value Tests</h3>

<p>SQLite defines certain [limits] on its operation, such as the
maximum number of columns in a table, the maximum length of an 
SQL statement, or the maximum value of an integer.  The TCL and TH3 test
suites both contains numerous tests that push SQLite right to the edge
of its defined limits and verify that it performs correctly for
all allowed values.  Additional tests go beyond the defined limits
and verify that SQLite correctly returns errors.  The source code
contains [testcase macros] to verify that both sides of each boundary
have been tested.</p>

<tcl>hd_fragment regressiontesting</tcl>
<h2>5.0 Regression Testing</h2>

<p>Whenever a bug is reported against SQLite, that bug is not considered
fixed until new test cases that would exhibit the bug have been added 
to either the TCL or TH3 test suites.
Over the years,
this has resulted in thousands and thousands of new tests.
These regression tests ensure that bugs that have
been fixed in the past are not reintroduced into future versions of

<tcl>hd_fragment leakcheck</tcl>
<h2>6.0 Automatic Resource Leak Detection</h2>

<p>Resource leak occurs when system resources
are allocated and never freed.  The most troublesome resource leaks
in many applications are memory leaks - when memory is allocated using
malloc() but never released using free().  But other kinds of resources
can also be leaked:  file descriptors, threads, mutexes, etc.</p>

<p>Both the TCL and TH3 test harnesses automatically track system
resources and report resource leaks on <u>every</u> test run.
No special configuration or setup is required.   The test harnesses
are especially vigilant with regard to memory leaks.  If a change
causes a memory leak, the test harnesses will recognize this
quickly.  SQLite is designed to never leak memory, even after
an exception such as an OOM error or disk I/O error.  The test
harnesses are zealous to enforce this.</p>

<tcl>hd_fragment coverage {test coverage}</tcl>
<h2>7.0 Test Coverage</h2>

<p>The SQLite core has 100% branch test coverage under [TH3] in
its default configuration as measured by
[ | gcov].</p>

<p>The "SQLite core" in the previous paragraph excludes the
operating-system dependent [VFS] backends, since it is
not possible to write cross-platform tests for those modules.  Extensions
such as FTS3 and RTree are also excluded from the analysis.</p>

<tcl>hd_fragment stmtvbr</tcl>
<h3>7.1 Statement versus branch coverage</h3>

<p>There are many ways to measure test coverage.  The most popular
metric is "statement coverage".  When you hear someone say that their
program as "XX% test coverage" without further explanation, they usually
mean statement coverage.  Statement coverage measures what percentage
of lines of code are executed at least once by the test suite.</p>

<p>Branch coverage is more rigorous than statement coverage.  Branch
coverage measures the number of machine-code branch instructions that
are evaluated at least once on both directions.</p>

<p>To illustrate the difference between statement coverage and
branch coverage, consider the following hypothetical
line of C code:</p>

if( a>b && c!=25 ){ d++; }

<p>Such a line of C code might generate a dozen separate machine code
instructions.  If any one of those instructions is ever evaluated, then
we say that the statement has been tested.  So, for example, it might
be the case that the conditional expression is
always false and the "d" variable is
never incremented.  Even so, statement coverage counts this line of
code as having been tested.</p>

<p>Branch coverage is more strict.  With branch coverage, each test and
each subblock within the statement is considered separately.  In order
to achieve 100% branch coverage in the example above, there must be at
least three test cases:</p>

<li> a<=b
<li> a>b && c==25
<li> a>b && c!=25

<p>Any one of the above test cases would provide 100% statement coverage
but all three are required for 100% branch coverage.  Generally speaking,
100% branch coverage implies 100% statement coverage, but the converse is
not true.  To reemphasize, the
[TH3] test harness for SQLite provides the stronger form of
test coverage - 100% branch test coverage.</p>

<tcl>hd_fragment defensivecode</tcl>
<h3>7.2 Coverage testing of defensive code</h3>

<p>A well-written C program will typically contain some defensive tests
which in practice are always true or always false.  This leads to a 
programming dilemma:  Does one remove defensive code in order to obtain
100% branch coverage?</p>

<p>In SQLite, the answer to the previous question is "no".
For testing purposes, the SQLite source code defines
macros called ALWAYS() and NEVER().   The ALWAYS() macro
surrounds conditions
which are expected to always evaluate as true and NEVER() surrounds
conditions that are always evaluated to false.  These macros serve as
comments to indicate that the conditions are defensive code.
For standard builds, these macros are pass-throughs:</p>

#define ALWAYS(X)  (X)
#define NEVER(X)   (X)

<p>During most testing, however, these macros will throw an assertion
fault if their argument does not have the expected truth value.  This
alerts the developers quickly to incorrect design assumptions.

#define ALWAYS(X)  ((X)?1:assert(0),0)
#define NEVER(X)   ((X)?assert(0),1:0)

<p>When measuring test coverage, these macros are defined to be constant
truth values so that they do not generate assembly language branch
instructions, and hence do not come into play when calculating the
branch coverage:</p>

#define ALWAYS(X)  (1)
#define NEVER(X)   (0)

<p>The test suite is designed to be run three times, once for each of
the ALWAYS() and NEVER() definitions shown above.  All three test runs
should yield exactly the same result.  There is a run-time test using
the [sqlite3_test_control]([SQLITE_TESTCTRL_ALWAYS], ...) interface that
can be used to verify that the macros are correctly set to the first
form (the pass-through form) for deployment.</p>

<tcl>hd_fragment {testcase} {testcase macros}</tcl>
<h3>7.3 Forcing coverage of boundary values and boolean vector tests</h3>

<p>Another macro used in conjunction with test coverage measurement is
the <tt>testcase()</tt> macro.  The argument is a condition for which
we want test cases that evaluate to both true and false.
In non-coverage builds (that is to say, in release builds) the
<tt>testcase()</tt> macro is a no-op:</p>

#define testcase(X)

<p>But in a coverage measuring build, the <tt>testcase()</tt> macro
generates code that evaluates the conditional expression in its argument.  
Then during analysis, a check
is made to ensure tests exist that evaluate the conditional to both true
and false.  <tt>Testcase()</tt> macros are used, for example, to help verify
that boundary values are tested.  For example:</p>

testcase( a==b );
testcase( a==b+1 );
if( a>b && c!=25 ){ d++; }

<p>Testcase macros are also used when two or more cases of a switch
statement go to the same block of code, to make sure that the code was
reached for all cases:</p>

switch( op ){
  case OP_Add:
  case OP_Subtract: {
    testcase( op==OP_Add );
    testcase( op==OP_Subtract );
    /* ... */
  /* ... */

<p>For bitmask tests, <tt>testcase()</tt> macros are used to verify that every
bit of the bitmask affects the test.  For example, in the following block
of code, the condition is true if the mask contains either of two bits
indicating either a MAIN_DB or a TEMP_DB is being opened.  
The <tt>testcase()</tt>
macros that precede the if statement verify that both cases are tested:</p>

testcase( mask & SQLITE_OPEN_MAIN_DB );
testcase( mask & SQLITE_OPEN_TEMP_DB );
if( (mask & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB))!=0 ){ ... }

<p>The SQLite source code contains <tcl>N {$stat(nTestcase)}</tcl>
uses of the <tt>testcase()</tt> macro.</p>

<tcl>hd_fragment {mcdc} *MC/DC {MC/DC testing}</tcl>
<h3>7.4 Branch coverage versus MC/DC</h3>

<p>Two methods of measuring test coverage were described above:
"statement" and "branch" coverage.  There are many other test coverage
metrics besides these two.  Another popular metric is "Modified
Condition/Decision Coverage" or MC/DC.  
[ | Wikipedia]
defines MC/DC as follows:</p>

<li> Each decision tries every possible outcome.
<li> Each condition in a decision takes on every possible outcome.
<li> Each entry and exit point is invoked.
<li> Each condition in a decision is shown to independently
     affect the outcome of the decision.

<p>In the C programming language 
where <b><tt>&amp;&amp;</tt></b> and <b><tt>||</tt></b>
are "short-circuit" operators, MC/DC and branch coverage are very nearly
the same thing.  The primary difference is in boolean vector tests.
One can test for any of several bits in bit-vector and still obtain
100% branch test coverage even though the second element of MC/DC - the
requirement that each condition in a decision take on every possible outcome -
might not be satisfied.</p>

<p>SQLite uses <tt>testcase()</tt> macros as described in the previous
subsection to make sure that every condition in a bit-vector decision takes
on every possible outcome.  In this way, SQLite also achieves 100% MC/DC
in addition to 100% branch coverage.</p>

<tcl>hd_fragment thoughts1</tcl>
<h3>7.5 Experience with full test coverage</h3>

<p>The developers of SQLite have found that full coverage testing is an
extremely productive method for preventing the introduction of new bugs
as the system evolves.  Because every single branch
instruction in SQLite core code is covered by test cases, the developers
can be confident that changes they make in one part of the code
do not have unintended consequences in other parts of the code.
It would be extremely difficult maintain the quality of SQLite without such

<tcl>hd_fragment dynamicanalysis</tcl>
<h2>8.0 Dynamic Analysis</h2>

<p>Dynamic analysis refers to internal and external checks on the
SQLite code which are performed while the code is live and running.
Dynamic analysis has proven to be a great help in maintaining the
quality of SQLite.</p>

<tcl>hd_fragment asserts</tcl>
<h3>8.1 Assert</h3>

<p>The SQLite core contains <tcl>N {$stat(nAssert)}</tcl> <tt>assert()</tt>
statements that verify function preconditions and postconditions and
loop invariants.  Assert() is a macro which is a standard part of
ANSI-C.  The argument is a boolean value that is assumed to always be
true.  If the assertion is false, the program prints an error message
and halts.</p>

<p>Assert() macros are disabled by compiling with the NDEBUG macro defined.
In most systems, asserts are enabled by default.  But in SQLite, the
asserts are so numerous and are in such performance critical places, that
the database engine runs about three times slower when asserts are enabled.
Hence, the default (production) build of SQLite disables asserts.  
Assert statements are only enabled when SQLite is compiled with the
SQLITE_DEBUG preprocessor macro defined.</p>

<tcl>hd_fragment valgrind</tcl>
<h3>8.2 Valgrind</h3>

<p>[ | Valgrind] is perhaps the most amazing
and useful developer tool in the world.  Valgrind is a simulator - it simulates
an x86 running a Linux binary.  (Ports of Valgrind for platforms other
than Linux are in development, but as of this writing, Valgrind only
works reliably on Linux, which in the opinion of the SQLite developers 
means that Linux should be the preferred platform for all software development.)
As Valgrind runs a Linux binary, it looks for all kinds of interesting
errors such as array overruns, reading from uninitialized memory,
stack overflows, memory leaks, and so forth.  Valgrind finds problems
that can easily slip through all of the other tests run against SQLite.
And, when Valgrind does find an error, it can dump the developer directly
into a symbolic debugger at the exact point where the error occur, to
facilitate a quick fix.</p>

<p>Because it is a simulator, running a binary in Valgrind is slower than 
running it on native hardware.  (To a first approximation, an application
running in Valgrind on a workstation will perform about the same as it
would running natively on a smartphone.)  So it is impractical to run the full
SQLite test suite through Valgrind.  However, the veryquick tests and
the coverage of the TH3 tests are run through Valgrind prior to every

<tcl>hd_fragment memtesting</tcl>
<h3>8.3 Memsys2</h3>

<p>SQLite contains a pluggable
[memory allocation | memory allocation subsystem].
The default implementation uses system malloc() and free(). 
However, if SQLite is compiled with [SQLITE_MEMDEBUG], an alternative
memory allocation wrapper ([memsys2])
is inserted that looks for memory allocation
errors at run-time.  The memsys2 wrapper checks for memory leaks, of
course, but also looks for buffer overruns, uses of uninitialized memory,
and attempts to use memory after it has been freed.  These same checks
are also done by valgrind (and, indeed, Valgrind does them better)
but memsys2 has the advantage of being much faster than Valgrind, which
means the checks can be done more often and for longer tests.</p>

<tcl>hd_fragment mutextesting</tcl>
<h3>8.4 Mutex Asserts</h3>

<p>SQLite contains a pluggable mutex subsystem.  Depending on 
compile-time options, the default mutex system contains interfaces
[sqlite3_mutex_held()] and [sqlite3_mutex_notheld()] that detect
whether or not a particular mutex is held by the calling thread.
These two interfaces are used extensively within assert() statements
in SQLite to verify mutexes are held and released at all the right
moments, in order to double-check that SQLite does work correctly
in multi-threaded applications.</p>

<tcl>hd_fragment journaltest</tcl>
<h3>8.5 Journal Tests</h3>

<p>One of the things that SQLite does to ensure that transactions
are atomic across system crashes and power failures is to write
all changes into the rollback journal file prior to changing the
database.  The TCL test harness contains an alternative
[OS backend] implementation that helps to
verify this is occurring correctly.  The "journal-test VFS" monitors
all disk I/O traffic between the database file and rollback journal,
checking to make sure that nothing is written into the database
file which has not first been written and synced to the rollback journal.
If any discrepancies are found, an assertion fault is raised.</p>

<p>The journal tests are an additional double-check over and above
the crash tests to make sure that SQLite transactions will be atomic
across system crashes and power failures.</p>

<tcl>hd_fragment intoverflow</tcl>
<h3>8.6 Signed-Integer Overflow Checks</h3>

<p>The various C language standards say that the signed-integer overflow
behavior is undefined.  In other words, when you add a value to a signed
integer such that the result is too large to fit in that integer, the
value does not necessarily wrap around to a negative number, as most
programmers expect.  It might do that.  But it might do something completely
different.  See, for example,
<a href="">here</a>
and <a href="">here</a>.  Even the
same compiler might do something different with signed integer overflow
in different places in the code or at different optimizations settings.</p>

<p>SQLite never overflows a signed integer.
To verify this, the test suites are run at least once when compiled with
the -ftrapv option to GCC.  The -ftrapv option causes GCC to generate code
that will panic() on a signed integer overflow.  In addition, there are
many test cases the strive to provoke integer overflows
using boundary value calculations such as
 "<b>SELECT -1*(-9223372036854775808);</b>".

<tcl>hd_fragment disopttest</tcl>
<h2>9.0 Disabled Optimization Tests</h2>

<p>The [sqlite3_test_control]([SQLITE_TESTCTRL_OPTIMIZATIONS], ...) interface
allows selected SQL statement optimizations to be disabled at run-time.
SQLite should always generate exactly the same answer with optimizations
enabled and with optimizations disabled; the answer simply arrives quicker
with the optimizations turned on.  So in a production environment, one always
leaves the optimizations turned on (the default setting).</p>

<p>One verification technique used on SQLite is to run an entire test suite
twice, once with optimizations left on and a second time with optimizations
turned off, and verify that the same output is obtained both times.  This
shows that the optimizations do not introduce errors.</p>

<p>Not all test cases can be handled this way.  Some test cases check
to verify that the optimizations really are reducing the amount of
computation by counting the number of disk accesses, sort operations, 
full-scan steps, or other processing steps that occur during queries.
Those test cases will appear to fail when optimizations are disabled.
But the majority of test cases simply check that the correct answer
was obtained, and all of those cases can be run successfully with and
without the optimizations, in order to show that the optimizations do not
cause malfunctions.</p>

<tcl>hd_fragment cklist</tcl>
<h2>10.0 Checklists</h2>

<p>The SQLite developers use an on-line checklist to coordinate testing
activity and to verify that all tests pass prior each SQLite release.
<a href="">Past checklists</a>
are retained for historical reference.
The use of checklists for SQLite testing and other development activites
is inspired by <i>
[ | The Checklist Manifesto]

<p>The latest checklists contain approximately 200 items that are
individually checked for each release.  Some checklist items only take
a few seconds to verify and mark off.  Others involve test suites
that run for many hours.</p>

<p>The release checklist is not automated: developers run each item on
the checklist manually.  We find that it is important to keep a human in
the loop.  Sometimes problems are found while running a checklist item
even though the test itself passed.  It is important to have a human
reviewing the test output at the highest level, and constantly asking
"Is this really right?"</p>

<p>The release checklist is continuously evolving.  As new problems or
potential problems are discovered, new checklist items are added to
make sure those problems do not appear in subsequent releases.  The
release checklist has proven to be an invaluable tool in helping to
ensure that nothing is overlooked during the testing process.</p>

<tcl>hd_fragment staticanalysis</tcl>
<h2>11.0 Static Analysis</h2>

<p>Static analysis means analyzing code at or before compile-time to
check for correctness.  Static analysis includes looking at compiler
warning messages and running the code through more in-depth
analysis engines such as the
[ | Clang Static Analyzer].
SQLite compiles without warnings on GCC and Clang using 
the -Wall and -Wextra flags on Linux and Mac and on MSVC on Windows.
No warnings are generated by the Clang Static Analyzer tool "scan-build"
either.  Nevertheless, some warnings might be generated by other
static analyzers.  Users are encouraged not to stress over these
warnings and to instead take solace in the intense testing of SQLite
described above. 

<p>Static analysis has not proven to be especially helpful in finding
bugs in SQLite.  Static analysis has found a few bugs in SQLite, but
those are the exceptions.  More bugs have been
introduced into SQLite while trying to get it to compile without 
warnings than have been found by static analysis.</p>

<tcl>hd_fragment summary</tcl>
<h2>12.0 Summary</h2>

<p>SQLite is open source.  This gives many people the idea that
it is not well tested as commercial software and is perhaps unreliable.
But that impression is false.  
SQLite has exhibited very high reliability in the field and
a very low defect rate, especially considering how rapidly it is evolving.
The quality of SQLite is achieved in part by careful code design and
implementation.  But extensive testing also plays a vital role in
maintaining and improving the quality of SQLite.  This document has
summarized the testing procedures that every release of SQLite undergoes
with the hopes of inspiring the reader to understand that SQLite is
suitable for use in mission-critical applications.</p>