sqllogictest
Artifact Content
Not logged in

Artifact bfa450e7590366e1b7f9afcfffabfc6ed3ad814f:


<h1>About Sqllogictest</h1>

Sqllogictest is a program designed to verify that an SQL database engine
computes correct results by comparing the results to identical
queries from other SQL database engines.  Sqllogictest was originally
designed to test [http://www.sqlite.org/ | SQLite], but it is database engine
neutral and can just as easily be used to test other database products.

Sqllogictest is concerned only with correct results.  
No attention is paid to performance,
optimal use of indices, disk and memory usage, transactional behavior,
or concurrancy and locking issues.  
The sqllogictest program seeks to answer just one question:

<blockquote><i>
Does the database engine compute the correct answer.
</i></blockquote>

Every SQL database engine has test vectors used to validate its operation.
These manually generated test vectors are important.  But generating
test vectors is tedious, since the correct solutions must be computed
and verifed by hand.  The sqllogictest program is designed to sidestep
this tedium by using independently developed database engines to generate
the reference test results automatically.  This allows millions of test 
vectors to be producted by simple scripts, which in turn provides much 
more thorough and complete testing of the database engine.

<h2>Operation</h2>

The sqllogictest program is driven by test scripts containing SQL statements
and queries and, sometimes, query results.  A test script that omits the
results is called a "prototype script".  A test script that includes results
is a "full script".

The sqllogictest program operates in two modes:  test script completion mode
and test script validation mode.  In test script completion mode, the
sqllogictest program reads a prototype script and runs the statements 
and queries against a reference database engine.  The output is a 
full script that is a copy of the prototype script with result inserted.  
In validation mode, 
the sqllogictest program reads a full script and runs the statements and
queries contained therein against a database engine under test.  The results
received back from the database engine are compared against the results
in the full script to validate the output of the database engine.

For example, to verify that SQLite gets the same answer as MySQL on
a particular set of queries, one might execute commands as follows.
First complete the prototype script using MySQL accessed through ODBC
as the reference database engine:

<blockquote><pre>
sqllogictest -odbc DSN=mysqlslt MySQL prototype.test >full.test
</pre></blockquote>

In the command above, you would, of course, substitute whatever DSN
string is appropriate for your installation.
Afterwards, verify the results using the built-in copy of SQLite:

<blockquote><pre>
sqllogictest -verify full.test
</pre></blockquote>

The second command will display any discrepencies between the output
generated by SQLite and the reference data that was generated by MySQL
in the first command.  Notice that the default mode of operation for
sqllogictest is completion mode.  The -verify command-line option is used
to activate validation mode.

Another approach to validation is to run the competion step separately
for each database engine and save the output in separate test scripts.
Then compare the two test scripts using a file comparison utility.
For example:

<blockquote><pre>
sqllogictest -odbc DSN=mysqlslt prototype.test >full-1.test
sqllogictest prototype.test >full-2.test
diff full-1.test full-2.test
</pre></blockquote>

Note that in completion mode, the sqllogictest program will ignore any
result value contained in its input test script.  Or, in other words, it is
safe to pass a full test script into sqllogictest running in completion
mode.  So the previous test could have been run this way:

<blockquote><pre>
sqllogictest -odbc DSN=mysqlslt MySQL prototype.test >full-1.test
sqllogictest full-1.test >full-2.test
diff full-1.test full-2.test
</pre></blockquote>

The long-term vision of the sqllogictest project is to amass a huge 
collection of prototype scripts.  Many of the prototype scripts will
be very large, being automatically generated by some dynamic language
like Tcl or Perl.  Prototype scripts will try queries using all kinds
and combinations of constraints, column orders, joins, indexes, and
so forth in order to exercize as many paths through the logic of the 
database engine as possible.

<h2>Test-Script Format</h2>

Test scripts are line-oriented ASCII text files.  
No provision is made for Unicode; the purpose
of sqllogictest is to test the query and join logic of the database engine, not
its support for localization and internationalization.

Test scripts consist of zero or more records.  A record is a
single statement or query or a control record.  Each record
is separated from its neighbors by one or more blank line.  Records
are evaluated in order, starting from the beginning of the
script and working toward the end.

Lines of the test script that begin with the sharp character
("#", ASCII code 35)
are comment lines and are ignored.  Comment lines are not considered blank
lines and cannot be used to separate records.  Comments
typically occur at the beginning of a record, but they are
allowed to occur in the middle of a record.  Comments
that occurs in the middle of an SQL statement are stripped from the
statement prior to the statement being sent to the database engine for
evaluation.  Comments are logically removed from the script by a preprocessor.
Hence, when we speak of the "first line of a record" we
really mean the "first non-comment line of a record".

Most records are either a statement or a query.  A statement is an SQL
command that is to be evaluated but from which we do not expect to get
results (other than success or failure).  A statement might be a
CREATE TABLE or an INSERT or an UPDATE or a DROP INDEX.  A query is an
SQL command from which we expect to receive results.  The result set
might be empty.

A statement record begins with one of the following two lines:

<blockquote><pre>
statement ok
statement error
</pre></blockquote>

The SQL command to be evaluated is found on the second and all subsequent
liens of the record.  Only a single SQL command is allowed per statement.
The SQL should not have a semicolon or other terminator at the end;  any
required terminators will be added by the database engine interface module.

The SQL command is expected to succeed if the "ok" argument is used and is
expected to fail if the "error" argument is used.  Most statements are
expected to succeed.  But some statements can deliberately fail.  For
example, an INSERT statement that violates a UNIQUE or CHECK or NOT NULL
constraint might fail.  

A query record begins with a line of the following form:

<blockquote>
<tt>query</tt> <i>&lt;type-string&gt; &lt;sort-mode&gt; &lt;label&gt;</i>
</blockquote>

The SQL for the query is found on second an subsequent lines of the 
record up to first line of the form "----" or until the end of the
record.  Lines following the "----" are expected results of the query,
one value per line.  If the "----" and/or the results are omitted, then
the query is expected to return an empty set.  The "----" and results
are also omitted from prototype scripts and are always ignored when
the sqllogictest program is operating in completion mode.  Another way
of thinking about completion mode is that it copies the script from
input to output, replacing all "----" lines and subsequent result values
with the actual results from running the query.

The &lt;type-string&gt; argument to the query statement is a short string
that specifies the number of result columns and the expected datatype
of each result column.  There is one character in the &lt;type-string&gt;
for each result column.  The characters codes are "T" for a text result,
"I" for an integer result, and "R" for a floating-point result.

The &lt;sort-mode&gt; argument is optional.  If included, it must be
one of "nosort", "rowsort", or "valuesort".  The default is "nosort".
In nosort mode, the results appear in exactly the order in which they
were received from the database engine.  The nosort mode should only
be used on queries that have an ORDER BY clause or which only have a
single row of result, since otherwise the order of results is undefined
and might vary from one database engine to another.  The "rowsort" mode
gathers all output from the database engine then sorts it by rows on
the client side.  Sort comparisons use strcmp() on the rendered ASCII
text representation of the values.  Hence, "9" sorts after "10", not before.
The "valuesort" mode works like rowsort except that it does not honor
row groupings.  Each individual result value is sorted on its own.

The &lt;label&gt; argument is also optional.  If included, sqllogictest
stores a hash of the results of this query under the given label.  If
the label is reused, then sqllogictest verifies that the results are the
same.  This can be used to verify that two or more queries in the
same test script that are logically equivalent always generate the same
output.

In the results section, integer values are rendered as if by
printf("%d").  Floating point values are rendered as if by
printf("%.3f").  NULL values are rendered as "NULL".  Empty
strings are rendered as "(empty)".  Within non-empty strings,
all control characters and unprintable characters are rendered as "@".

<h3>Control Records</h3>

The test script might also contain control records.  A control record is
one of the following:

<blockquote>
<tt>halt</tt><br>
<tt>hash-threshold</td> &lt;max-result-set-size&gt;
</blockquote>

A "halt" record is intended for debugging use only.  A halt record merely
causes sqllogictest to ignore the rest of the test script.  A halt record
can be inserted after a query that is giving an anomalous result, causing
the database to be left in the state where it gives the unexpected answer.
After sqllogictest exist, manually debugging can then proceed.

The "hash-threshold" record sets a limit on the number of values that can
appear in a result set.  If the number of values exceeds this, then instead
of recording each individual value in the full test script, an MD5 hash of
all values is computed in stored.  This makes the full test scripts much
shorter, but at the cost of obscuring the results.  If the hash-threshold
is 0, then results are never hashed.  A hash-threshold of 10 or 20 is
recommended.  During debugging, it is advantage to set the hash-threshold
to zero so that all results can be seen.

<h3>Conditional Records</h3>

Statement and query records can be prefixed with zero or more conditionals
of the following form:

<blockquote>
<tt>skipif</tt>  &lt;database-name&gt;<br>
<tt>onlyif</tt>  &lt;database-name&gt;
</blockquote>

The statement or query is not evaluated if a skipif record for the
target database engine is seen in the prefix.  The statement or query
is also skipped if an onlyif record for a different database engine
is seen.

When a skipif or onlyif causes a statement or query to be skipped,
it is still transferred unchanged to standard output in completion
mode.  In validation mode, if the record is a query with &lt;label&gt;
argument, then the result found in the input script is still checked
against the results of other queries with the same label, even though
the SQL is not run.

The skipif and onlyif prefixes can be used to implement test cases where
the SQL syntax varies from one database engine to another.  For example,
PostgreSQL has the syntax quirk that the AS keyword is required prior to
alias names in the Select-list items section of a query.  On all other
database engines, and in the SQL standard, the AS keyword is optional.
The way to work around this is as follows:

<verbatim>
     query III rowsort label-xyzzy
     SELECT a AS x, b AS y, c AS z FROM t1

     skipif postgresql
     query III rowsort label-xyzzy
     SELECT a x, b y, c z FROM t1
</verbatim>

Thus, there are two queries that are identical in every way except that
the second omits the AS keywords.  The second query is omitted from
PostgreSQL runs.  We use the label-xyzzy on both queries
in order to verify that they yield identical results.

Another example is MySQL in which the "/" operator is always floating-point
division even when both operands are integers.  To get integer division in
MySQL you have to use the "DIV" operator.  To test this behavior, one
could do something like the following:

<verbatim>
     skipif mysql
     query I rowsort label-plover
     SELECT a/b FROM t1

     onlyif mysql
     query I rowsort label-plover
     SELECT a DIV b FROM t1
</verbatim>

Once completion runs are made on all database engines, the query label
will operate on subsequent validation runs to
ensure that the same results are obtained from all database engines, 
even though the query syntax is slightly different.

<h2>Suggestions For Generating Test-Scripts</h2>

When sqllogictest runs a test script, it begins with a completely
empty database.  So the first few records of any test script will
typically be CREATE statements of various kinds and expecially
CREATE TABLE statements.  In order to maximize the portability of
scripts across database engines, it is suggested that test scripts stick 
to the basic CREATE TABLE syntax.  Use only a few common datatypes
such as:

  *  INTEGER
  *  VARCHAR(30)
  *  REAL

Remember, the purpose of sqllogictest is to validate the logic behind
the evaluation of SQL statements, not the ability to handle extreme values.
So keep content in a reasonable range:  small integers, short strings, and
floating point numbers that use only the most significant bits of an
a 32-bit IEEE float.

After creating one or more tables and populating them with test data,
use a dynamic language (TCL, Perl, Python, Ruby) to implement a templating 
scheme that will generate thousands or millions of separate queries.  
Use a pseudo-random number
generator (PRNG) to fill in the templates at random.  Seed the PRNG
with a constant at the beginning of the dynamic-language program
so that rerunning the program will generate the same test-script 
every time.

In the dynamic-language programs that generate test scripts, it is useful
to have subroutines that compute elements such as the following:

  *  Randomly permute the elements of a result set.

  *  Choose a random subset of columns in a table to be updated.

  *  Generate a random WHERE clause.

  *  Generate a random string literal of some maximum length.

  *  Generate a random identifier which is not a keyword.

Segregate queries that use LIMIT and OFFSET into separate test scripts
which are only run on database engines that support LIMIT and OFFSET.

All queries should use either an ORDER BY clause so that
the order of values in the output is deterministic, or else
the "rowsort" or "valuesort" modifiers at the beginning
of the query record to ensure that the output appears in the same order
on all database engines.

A typical test script will begin with some CREATE statements followed by
some INSERT statements to add initial data.  This is followed by
thousands of randomly generate UPDATE, DELETE, and INSERT statements.
Several SELECT statements typical follow each UPDATE, DELETE, or INSERT
in order to verify that the content of the database is as expected.

It is useful to includes some NULL values in the initial data in order to
test the NULL handling logic.  Be careful, however, in that different
database engines interpret NULLs in a UNIQUE constraint differently.
SQLite, PostgreSQL, Oracle, MySQL, and Firebird do it one way.  Informix,
DB2, and MS-SQL do it another.  So it is best to avoid using NULLs in
UNIQUE column. Also, NULLs sort differently on different
engines.  SQLite sorts NULL values first.  PostgreSQL and MySQL and most
other database engiens sort NULL values last.  So one should avoid ORDER
BY and LIMIT clauses and use the "rowsort" or "valuesort" parameter on
queries that might return NULLs.

One might have a large set of DELETE, INSERT, SELECT, and UPDATE statements
that are repeated multiple times, but with various CREATE INDEX and
DROP INDEX statements in between each iteration.  Such tests seeks to
prove that the same results appear regardless of whether or not indices
are present.  Remember, the purpose of sqllogictest is to verify that the
database engine gets correct results, not that it makes effective use
of indices.