About Sqllogictest

Sqllogictest is a program designed to verify that an SQL database engine computes correct results by comparing the results to identical queries from other SQL database engines. Sqllogictest was originally designed to test SQLite, but it is database engine neutral and can just as easily be used to test other database products.

Sqllogictest is concerned only with correct results. No attention is paid to performance, optimal use of indices, disk and memory usage, transactional behavior, or concurrancy and locking issues. The sqllogictest program seeks to answer just one question:

Does the database engine compute the correct answer.

Every SQL database engine has test vectors used to validate its operation. These manually generated test vectors are important. But generating test vectors is tedious, since the correct solutions must be computed and verifed by hand. The sqllogictest program is designed to sidestep this tedium by using independently developed database engines to generate the reference test results automatically. This allows millions of test vectors to be producted by simple scripts, which in turn provides much more thorough and complete testing of the database engine.

Operation

The sqllogictest program is driven by test scripts containing SQL statements and queries and, sometimes, query results. A test script that omits the results is called a "prototype script". A test script that includes results is a "full script".

The sqllogictest program operates in two modes: test script completion mode and test script validation mode. In test script completion mode, the sqllogictest program reads a prototype script and runs the statements and queries against a reference database engine. The output is a full script that is a copy of the prototype script with result inserted. In validation mode, the sqllogictest program reads a full script and runs the statements and queries contained therein against a database engine under test. The results received back from the database engine are compared against the results in the full script to validate the output of the database engine.

For example, to verify that SQLite gets the same answer as MySQL on a particular set of queries, one might execute commands as follows. First complete the prototype script using MySQL accessed through ODBC as the reference database engine:

sqllogictest -odbc DSN=mysqlslt MySQL prototype.test >full.test

In the command above, you would, of course, substitute whatever DSN string is appropriate for your installation. Afterwards, verify the results using the built-in copy of SQLite:

sqllogictest -verify full.test

The second command will display any discrepencies between the output generated by SQLite and the reference data that was generated by MySQL in the first command. Notice that the default mode of operation for sqllogictest is completion mode. The -verify command-line option is used to activate validation mode.

Another approach to validation is to run the competion step separately for each database engine and save the output in separate test scripts. Then compare the two test scripts using a file comparison utility. For example:

sqllogictest -odbc DSN=mysqlslt prototype.test >full-1.test
sqllogictest prototype.test >full-2.test
diff full-1.test full-2.test

Note that in completion mode, the sqllogictest program will ignore any result value contained in its input test script. Or, in other words, it is safe to pass a full test script into sqllogictest running in completion mode. So the previous test could have been run this way:

sqllogictest -odbc DSN=mysqlslt MySQL prototype.test >full-1.test
sqllogictest full-1.test >full-2.test
diff full-1.test full-2.test

The long-term vision of the sqllogictest project is to amass a huge collection of prototype scripts. Many of the prototype scripts will be very large, being automatically generated by some dynamic language like Tcl or Perl. Prototype scripts will try queries using all kinds and combinations of constraints, column orders, joins, indexes, and so forth in order to exercize as many paths through the logic of the database engine as possible.

Test-Script Format

Test scripts are line-oriented ASCII text files. No provision is made for Unicode; the purpose of sqllogictest is to test the query and join logic of the database engine, not its support for localization and internationalization.

Test scripts consist of zero or more records. A record is a single statement or query or a control record. Each record is separated from its neighbors by one or more blank line. Records are evaluated in order, starting from the beginning of the script and working toward the end.

Lines of the test script that begin with the sharp character ("#", ASCII code 35) are comment lines and are ignored. Comment lines are not considered blank lines and cannot be used to separate records. Comments typically occur at the beginning of a record, but they are allowed to occur in the middle of a record. Comments that occurs in the middle of an SQL statement are stripped from the statement prior to the statement being sent to the database engine for evaluation. Comments are logically removed from the script by a preprocessor. Hence, when we speak of the "first line of a record" we really mean the "first non-comment line of a record".

Most records are either a statement or a query. A statement is an SQL command that is to be evaluated but from which we do not expect to get results (other than success or failure). A statement might be a CREATE TABLE or an INSERT or an UPDATE or a DROP INDEX. A query is an SQL command from which we expect to receive results. The result set might be empty.

A statement record begins with one of the following two lines:

statement ok
statement error

The SQL command to be evaluated is found on the second and all subsequent liens of the record. Only a single SQL command is allowed per statement. The SQL should not have a semicolon or other terminator at the end; any required terminators will be added by the database engine interface module.

The SQL command is expected to succeed if the "ok" argument is used and is expected to fail if the "error" argument is used. Most statements are expected to succeed. But some statements can deliberately fail. For example, an INSERT statement that violates a UNIQUE or CHECK or NOT NULL constraint might fail.

A query record begins with a line of the following form:

query <type-string> <sort-mode> <label>

The SQL for the query is found on second an subsequent lines of the record up to first line of the form "----" or until the end of the record. Lines following the "----" are expected results of the query, one value per line. If the "----" and/or the results are omitted, then the query is expected to return an empty set. The "----" and results are also omitted from prototype scripts and are always ignored when the sqllogictest program is operating in completion mode. Another way of thinking about completion mode is that it copies the script from input to output, replacing all "----" lines and subsequent result values with the actual results from running the query.

The <type-string> argument to the query statement is a short string that specifies the number of result columns and the expected datatype of each result column. There is one character in the <type-string> for each result column. The characters codes are "T" for a text result, "I" for an integer result, and "R" for a floating-point result.

The <sort-mode> argument is optional. If included, it must be one of "nosort", "rowsort", or "valuesort". The default is "nosort". In nosort mode, the results appear in exactly the order in which they were received from the database engine. The nosort mode should only be used on queries that have an ORDER BY clause or which only have a single row of result, since otherwise the order of results is undefined and might vary from one database engine to another. The "rowsort" mode gathers all output from the database engine then sorts it by rows on the client side. Sort comparisons use strcmp() on the rendered ASCII text representation of the values. Hence, "9" sorts after "10", not before. The "valuesort" mode works like rowsort except that it does not honor row groupings. Each individual result value is sorted on its own.

The <label> argument is also optional. If included, sqllogictest stores a hash of the results of this query under the given label. If the label is reused, then sqllogictest verifies that the results are the same. This can be used to verify that two or more queries in the same test script that are logically equivalent always generate the same output.

In the results section, integer values are rendered as if by printf("%d"). Floating point values are rendered as if by printf("%.3f"). NULL values are rendered as "NULL". Empty strings are rendered as "(empty)". Within non-empty strings, all control characters and unprintable characters are rendered as "@".

Control Records

The test script might also contain control records. A control record is one of the following:

halt
hash-threshold <max-result-set-size>

A "halt" record is intended for debugging use only. A halt record merely causes sqllogictest to ignore the rest of the test script. A halt record can be inserted after a query that is giving an anomalous result, causing the database to be left in the state where it gives the unexpected answer. After sqllogictest exist, manually debugging can then proceed.

The "hash-threshold" record sets a limit on the number of values that can appear in a result set. If the number of values exceeds this, then instead of recording each individual value in the full test script, an MD5 hash of all values is computed in stored. This makes the full test scripts much shorter, but at the cost of obscuring the results. If the hash-threshold is 0, then results are never hashed. A hash-threshold of 10 or 20 is recommended. During debugging, it is advantage to set the hash-threshold to zero so that all results can be seen.

Conditional Records

Statement and query records can be prefixed with zero or more conditionals of the following form:

skipif <database-name>
onlyif <database-name>

The statement or query is not evaluated if a skipif record for the target database engine is seen in the prefix. The statement or query is also skipped if an onlyif record for a different database engine is seen.

When a skipif or onlyif causes a statement or query to be skipped, it is still transferred unchanged to standard output in completion mode. In validation mode, if the record is a query with <label> argument, then the result found in the input script is still checked against the results of other queries with the same label, even though the SQL is not run.

The skipif and onlyif prefixes can be used to implement test cases where the SQL syntax varies from one database engine to another. For example, PostgreSQL has the syntax quirk that the AS keyword is required prior to alias names in the Select-list items section of a query. On all other database engines, and in the SQL standard, the AS keyword is optional. The way to work around this is as follows:

     query III rowsort label-xyzzy
     SELECT a AS x, b AS y, c AS z FROM t1

     skipif postgresql
     query III rowsort label-xyzzy
     SELECT a x, b y, c z FROM t1

Thus, there are two queries that are identical in every way except that the second omits the AS keywords. The second query is omitted from PostgreSQL runs. We use the label-xyzzy on both queries in order to verify that they yield identical results.

Another example is MySQL in which the "/" operator is always floating-point division even when both operands are integers. To get integer division in MySQL you have to use the "DIV" operator. To test this behavior, one could do something like the following:

     skipif mysql
     query I rowsort label-plover
     SELECT a/b FROM t1

     onlyif mysql
     query I rowsort label-plover
     SELECT a DIV b FROM t1

Once completion runs are made on all database engines, the query label will operate on subsequent validation runs to ensure that the same results are obtained from all database engines, even though the query syntax is slightly different.

Suggestions For Generating Test-Scripts

When sqllogictest runs a test script, it begins with a completely empty database. So the first few records of any test script will typically be CREATE statements of various kinds and expecially CREATE TABLE statements. In order to maximize the portability of scripts across database engines, it is suggested that test scripts stick to the basic CREATE TABLE syntax. Use only a few common datatypes such as:

INTEGER
VARCHAR(30)
REAL

Remember, the purpose of sqllogictest is to validate the logic behind the evaluation of SQL statements, not the ability to handle extreme values. So keep content in a reasonable range: small integers, short strings, and floating point numbers that use only the most significant bits of an a 32-bit IEEE float.

After creating one or more tables and populating them with test data, use a dynamic language (TCL, Perl, Python, Ruby) to implement a templating scheme that will generate thousands or millions of separate queries. Use a pseudo-random number generator (PRNG) to fill in the templates at random. Seed the PRNG with a constant at the beginning of the dynamic-language program so that rerunning the program will generate the same test-script every time.

In the dynamic-language programs that generate test scripts, it is useful to have subroutines that compute elements such as the following:

Randomly permute the elements of a result set.

Choose a random subset of columns in a table to be updated.

Generate a random WHERE clause.

Generate a random string literal of some maximum length.

Generate a random identifier which is not a keyword.

Segregate queries that use LIMIT and OFFSET into separate test scripts which are only run on database engines that support LIMIT and OFFSET.

All queries should use either an ORDER BY clause so that the order of values in the output is deterministic, or else the "rowsort" or "valuesort" modifiers at the beginning of the query record to ensure that the output appears in the same order on all database engines.

A typical test script will begin with some CREATE statements followed by some INSERT statements to add initial data. This is followed by thousands of randomly generate UPDATE, DELETE, and INSERT statements. Several SELECT statements typical follow each UPDATE, DELETE, or INSERT in order to verify that the content of the database is as expected.

It is useful to includes some NULL values in the initial data in order to test the NULL handling logic. Be careful, however, in that different database engines interpret NULLs in a UNIQUE constraint differently. SQLite, PostgreSQL, Oracle, MySQL, and Firebird do it one way. Informix, DB2, and MS-SQL do it another. So it is best to avoid using NULLs in UNIQUE column. Also, NULLs sort differently on different engines. SQLite sorts NULL values first. PostgreSQL and MySQL and most other database engiens sort NULL values last. So one should avoid ORDER BY and LIMIT clauses and use the "rowsort" or "valuesort" parameter on queries that might return NULLs.