- 2004-Nov-09 07:21 anonymous
- 2004-Aug-06 23:35 drh
- 2004-Aug-06 22:04 anonymous
- 2003-Oct-06 20:58 anonymous
- 2003-Sep-06 23:06 drh
- 2003-Jul-23 23:29 anonymous
- 2003-Jul-23 23:26 anonymous
- 2003-Jul-22 10:04 anonymous
- 2003-Apr-29 12:30 drh
SQLite 3 and recent versions of SQLite 2.8 support pre-compiled SQL.
Many people have asked for the ability to precompile commonly used SQL statements. The idea is that if the same SQL statement is reused two or more times, the compilation process can be skipped for the second and subsequent uses of that statement, and thus make the statement execute faster.
SQLite does not currently have the ability to precompile. But perhaps this will be added as a future enhancement.
How much of a speed advantage can be expected? To test this, I profiled version 2.8.1 on a script of 51000 INSERT statements. When writing to an in-memory database, the execution of the statements required about 36.3% of the time and the parsing and code generation required 43.1% of the time. The remaining 20.6% was used to read the test script off of disk. For a disk-based database, the numbers were 59.0% for execution, 33.3% for parsing and code generation, and 7.3% for reading the script. These measurements suggest a speed increase of between 150% to 225% can be achieved by reusing a previously compiled program.
The results above are for individual INSERT statements that add a single row of data to the database. For complex SELECT statements or INSERT statements that take their data from a SELECT, many rows of data are visited and the speed advantage of precompiling is greatly reduced.
There is now an experimental API in SQLite CVS (checkin  ) that can be used for pre-compiled queries: (This API has changed as of 2003-09-06. See below for additional information.)
sqlite_reset( sqlite_vm *, const char **, sqlite_vm ** ppVm);
This call is exactly the same as sqlite_finalize() except a new virtual machine is created and stored in *ppVm. The new virtual machine is the same as the original was when it was returned by sqlite_compile() - same query, same callback, same callback context pointer. So if you need to, you can re-execute the query without the overhead of going through the parsing and compilation stage.
Most real applications will want parameters to pre-compiled queries, for example to pre-compile "SELECT val FROM values WHERE oid = %1" and then change the value of %1 for each execution. To simulate this in SQLite, you could create a function using sqlite_create_function() to retrieve the query "parameter" from your program. ie. Instead of "SELECT value FROM values WHERE oid = %1", use "SELECT value FROM values WHERE oid = get_query_parameter(1)". Or whatever, you get the idea.
It would be good if people could post any performance measurements they make here.
Some performance results for pre-compiled queries. The temp db was using the in-memory backend.
Insert 10000 records (real db) - 2036 ms Insert 10000 records (real db, pre-compile) - 1080 ms
Insert 10000 records (temp db) - 1563 ms Insert 10000 records (temp db, pre-compile) - 706 ms
Select 10000 records (real db) - 5430 ms Select 10000 records (real db, pre-compile) - 3733 ms
Select 10000 records (temp db) - 1766 ms Select 10000 records (temp db, pre-compile) - 460 ms
The schema for the test is:
CREATE TABLE tbl( key PRIMARY KEY, val );
Each insert test is:
BEGIN; INSERT INTO tbl VALUES(1, <string about 40 chars long>); INSERT INTO tbl VALUES(2, <string about 40 chars long>); ... INSERT INTO tbl VALUES(9999, <string about 40 chars long>); INSERT INTO tbl VALUES(10000, <string about 40 chars long>); COMMIT;
Each select test is:
SELECT * FROM tbl WHERE key = 1; SELECT * FROM tbl WHERE key = 2; ... SELECT * FROM tbl WHERE key = 9999; SELECT * FROM tbl WHERE key = 10000;
If you use an INTEGER PRIMARY KEY instead of a PRIMARY KEY things are slightly faster all round. The relative benefit of pre-compiled queries is slightly increased (5-10%) as well.
The program to run this test is precomp_test.c. It builds against 2.8.5.
The experimental sqlite_reset() API has changed to be the following:
int sqlite_reset(sqlite_vm *pVM, char **pzErrMsg);
Use this routine like sqlite_finalize() after you finish with a run of the virtual machine. It returns an error code and writes any error message into pzErrMsg. But instead of deleting the virtual machine, it resets it so that it is ready to run again. The same VM can be run over and over.
The SQL that you pass to sqlite_compile() can now contain "variables" represented by question marks. For example:
INSERT INTO tab1 VALUES(?,?,?)
After calling sqlite_compile() or sqlite_reset(), you can fill in the values for the question marks using the new sqlite_bind() API:
int sqlite_bind(sqlite_vm *pVm, int idx, const char *zVal, int len, int copy);
The first parameter is the virtual machine you want to bind values to. The second parameter "idx" specifies which variable (or "?") is to receive the value. The variables are numbered from left to right beginning with 1. So in the example above, the variables are 1, 2, and 3. An SQLITE_RANGE error is returned if the "idx" value is zero, negative, or greater than the number of variables. The third parameter is the value to assign to the variable. The fourth parameter is the length of the data in zVal, including the \000 terminator. You can use -1 here and sqlite_bind() will figure out the length for itself using strlen(). The fifth and last parameter is a flag that is TRUE if you want SQLite to make a copy of the value for itself in memory obtained from malloc(). If the fifth parameter is false, SQLite assumes that zVal is unchanging and that the virtual machine can just whatever memory zVal points to without having to make its own copy.
You can bind the same variable multiple times. If zVal==0, that is the same as setting the variable to NULL. Unbound variables are interpreted as NULL.
The "len" parameter (the 4th parameter) is intended to let you insert binary data into the database. zVal does not really have to be \000 terminated - it can be any sequence of bytes you want. "len" is just the number of bytes in that sequence. Such is the intent of "len" - but it has not test and any attempts to use it like that at this point will likely encounter bugs. You have been warned.
Precompiled statements interact with transactions because the VMs may or may not contain instructions to begin and end automatic transactions. To ensure correct operation you must heed the following advice from D. Richard Hipp:
To run a query within a transaction, you also have to compile it within a transaction. To run a query outside of a transaction it should be compiled outside of the transaction.
This is true for version 2.8 only. In SQLite version 3.0, statements compile outside of a transaction can be run within a transaction and vice versa.