Index: www/arch.tcl ================================================================== --- www/arch.tcl +++ www/arch.tcl @@ -1,9 +1,9 @@ # # Run this Tcl script to generate the sqlite.html file. # -set rcsid {$Id: arch.tcl,v 1.14 2004/07/17 21:56:10 drh Exp $} +set rcsid {$Id: arch.tcl,v 1.15 2004/09/08 13:06:21 drh Exp $} source common.tcl header {Architecture of SQLite} puts {

The Architecture Of SQLite

@@ -22,67 +22,44 @@ A block diagram showing the main components of SQLite and how they interrelate is shown at the right. The text that follows will provide a quick overview of each of these components.

-

History

- -

-There are two main C interfaces to the SQLite library: -sqlite_exec() and sqlite_compile(). Prior to -version 2.8.0 (2003-Feb-16) only sqlite_exec() was supported. -For version 2.8.0, the sqlite_exec and sqlite_compile methods -existed as peers. Beginning with version 2.8.13, the sqlite_compile -method is the primary interface, and sqlite_exec is implemented -using sqlite_compile. Externally, this change is an enhancement -that maintains backwards compatibility. But internally, -the plumbing is very different. The diagram at the right shows -the structure of SQLite for version 2.8.13 and following. -

- -

-This document describes the structure for SQLite version 2.X. -SQLite version 3.0.0 introduces many new features and capabilities. -The basic architecture of the library remains the same. However, -some of the details described here are different. For example, -the code was in the file os.c has now been split out into -several file, on for each operating system. And -the prefix on the names of API routines changed from sqlite_ -to sqlite3_. + +

+This document describes SQLite version 3.0. Version 2.8 and +earlier are similar but the details differ.

Interface

Much of the public interface to the SQLite library is implemented by -functions found in the main.c source file though some routines are +functions found in the main.c, legacy.c, and +vdbeapi.c source files +though some routines are scattered about in other files where they can have access to data structures with file scope. The -sqlite_get_table() routine is implemented in table.c. -sqlite_step() is found in vdbe.c. -sqlite_mprintf() is found in printf.c. +sqlite3_get_table() routine is implemented in table.c. +sqlite3_mprintf() is found in printf.c. +sqlite3_complete() is in tokenize.c. The Tcl interface is implemented by tclsqlite.c. More information on the C interface to SQLite is -available separately.

+available separately.

To avoid name collisions with other software, all external -symbols in the SQLite library begin with the prefix sqlite. +symbols in the SQLite library begin with the prefix sqlite3. Those symbols that are intended for external use (in other words, those symbols which form the API for SQLite) begin -with sqlite_.

- -

SQL Command Processor

- -

+with sqlite3_.

Tokenizer

When a string containing SQL statements is to be executed, the interface passes that string to the tokenizer. The job of the tokenizer is to break the original string up into tokens and pass those tokens -one by one to the parser. The tokenizer is hand-coded in C. -All of the code for the tokenizer -is contained in the tokenize.c source file.

+one by one to the parser. The tokenizer is hand-coded in C in +the file tokenize.c.

Note that in this design, the tokenizer calls the parser. People who are familiar with YACC and BISON may be used to doing things the other way around -- having the parser call the tokenizer. The author of SQLite @@ -111,24 +88,34 @@

Code Generator

After the parser assembles tokens into complete SQL statements, it calls the code generator to produce virtual machine code that will do the work that the SQL statements request. There are many -files in the code generator: build.c, copy.c, +files in the code generator: +attach.c, +auth.c, +build.c, delete.c, -expr.c, insert.c, pragma.c, -select.c, trigger.c, update.c, vacuum.c +expr.c, +insert.c, +pragma.c, +select.c, +trigger.c, +update.c, +vacuum.c and where.c. In these files is where most of the serious magic happens. expr.c handles code generation for expressions. where.c handles code generation for WHERE clauses on -SELECT, UPDATE and DELETE statements. The files copy.c, +SELECT, UPDATE and DELETE statements. The files attach.c, delete.c, insert.c, select.c, trigger.c update.c, and vacuum.c handle the code generation for SQL statements with the same names. (Each of these files calls routines in expr.c and where.c as necessary.) All other -SQL statements are coded out of build.c.

+SQL statements are coded out of build.c. +The auth.c file implements the functionality of +sqlite3_set_authorizer().

Virtual Machine

The program generated by the code generator is executed by the virtual machine. Additional information about the virtual @@ -144,30 +131,46 @@ its own header files: vdbe.h that defines an interface between the virtual machine and the rest of the SQLite library and vdbeInt.h which defines structure private the virtual machine. The vdbeaux.c file contains utilities used by the virtual machine and interface modules used by the rest of the library to -construct VM programs.

+construct VM programs. The vdbeapi.c file contains external +interfaces to the virtual machine such as the +sqlite3_bind_... family of functions. Individual values +(strings, integers, floating point numbers, and BLOBs) are stored +in an internal object named "Mem" which is implemented by +vdbemem.c.

+ +

+SQLite implements SQL functions using callbacks to C-language routines. +Even the built-in SQL functions are implemented this way. Most of +the built-in SQL functions (ex: coalesce(), count(), +substr(), and so forth) can be found in func.c. +Date and time conversion functions are found in date.c. +

B-Tree

An SQLite database is maintained on disk using a B-tree implementation found in the btree.c source file. A separate B-tree is used for each table and index in the database. All B-trees are stored in the -same disk file.

+same disk file. Details of the file format are recorded in a large +comment at the beginning of btree.c.

The interface to the B-tree subsystem is defined by the header file btree.h.

Page Cache

-

The B-tree module requests information from the disk in 1024 byte -chunks. The page cache is reponsible for reading, writing, and +

The B-tree module requests information from the disk in fixed-size +chunks. The default chunk size is 1024 bytes but can vary between 512 +and 65536 bytes. +The page cache is responsible for reading, writing, and caching these chunks. The page cache also provides the rollback and atomic commit abstraction -and takes care of reader/writer locking of the database file. The +and takes care of locking of the database file. The B-tree driver requests particular pages from the page cache and notifies the page cache when it wants to modify pages or commit or rollback changes and the page cache handles all the messy details of making sure the requests are handled quickly, safely, and efficiently.
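As a rough illustration of the fixed-size chunks described above, the sketch below uses Python's standard `sqlite3` module rather than the C API. It assumes a scratch database file (the name `pagesize_demo.db` is hypothetical) and picks 8192 bytes arbitrarily from the range the text gives. The default reported by a current build may differ from the 1024 bytes stated here.

```python
import os
import sqlite3
import tempfile

# Hypothetical scratch database used only for this sketch.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "pagesize_demo.db")

conn = sqlite3.connect(path)
# The page size has to be chosen before the first page is written.
conn.execute("PRAGMA page_size = 8192")
conn.execute("CREATE TABLE t(x)")
conn.commit()

# Read back the page size the library is actually using.
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
conn.close()

# The database file is a whole number of pages.
file_size = os.path.getsize(path)
print(page_size, file_size % page_size)
```

The file size being an exact multiple of the page size reflects the page cache reading and writing whole pages only.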

@@ -179,14 +182,40 @@

OS Interface

In order to provide portability between POSIX and Win32 operating systems, SQLite uses an abstraction layer to interface with the operating system. -The os.c file contains about 20 routines used for opening and -closing files, deleting files, creating and deleting locks on files, -flushing the disk cache, and so forth. Each of these functions contains -two implementations separated by #ifdefs: one for POSIX and the other -for Win32. The interface to the OS abstraction layer is defined by -the os.h header file. +The interface to the OS abstraction layer is defined in +os.h. Each supported operating system has its own implementation: +os_unix.c for Unix, os_win.c for Windows, and so forth. +Each of these operating-system-specific implementations typically has its own +header file: os_unix.h, os_win.h, etc. +

+ +

Utilities

+ +

+Memory allocation and caseless string comparison routines are located +in util.c. +Symbol tables used by the parser are maintained by hash tables found +in hash.c. The utf.c source file contains Unicode +conversion subroutines. +SQLite has its own private implementation of printf() (with +some extensions) in printf.c and its own random number generator +in random.c. +

+ +

Test Code

+ +

+If you count regression test scripts, +more than half the total code base of SQLite is devoted to testing. +There are many assert() statements in the main code files. +In addition, the source files test1.c through test5.c +together with md5.c implement extensions used for testing +purposes only. The os_test.c backend interface is used to +simulate power failures to verify the crash-recovery mechanism in +the pager.

+ } footer $rcsid Index: www/arch2.gif ================================================================== --- www/arch2.gif +++ www/arch2.gif cannot compute difference between binary files Index: www/lang.tcl ================================================================== --- www/lang.tcl +++ www/lang.tcl @@ -1,9 +1,9 @@ # # Run this Tcl script to generate the sqlite.html file. # -set rcsid {$Id: lang.tcl,v 1.71 2004/07/18 20:52:32 drh Exp $} +set rcsid {$Id: lang.tcl,v 1.72 2004/09/08 13:06:21 drh Exp $} source common.tcl header {Query Language Understood by SQLite} puts {

SQL As Understood By SQLite

@@ -181,12 +181,11 @@ ROLLBACK [TRANSACTION []] } puts {

Beginning in version 2.0, SQLite supports transactions with -rollback and atomic commit. See ATTACH for -an exception when there are attached databases.

+rollback and atomic commit.

The optional transaction name is ignored. SQLite currently does not allow nested transactions.

@@ -259,10 +258,13 @@ puts {

The COPY command is available in SQLite version 2.8 and earlier. The COPY command has been removed from SQLite version 3.0 due to complications in trying to support it in a mixed UTF-8/16 environment. +In version 3.0, the command-line shell +contains a new command .import that can be used as a substitute +for COPY.

The COPY command is an extension used to load large amounts of data into a table. It is modeled after a similar command found in PostgreSQL. In fact, the SQLite COPY command is specifically @@ -387,14 +389,14 @@ or a string. Tables names that begin with "sqlite_" are reserved for use by the engine.

Each column definition is the name of the column followed by the datatype for that column, then one or more optional column constraints. -SQLite is typeless. The datatype for the column does not restrict what data may be put in that column. -All information is stored as null-terminated strings. +See Datatypes In SQLite Version 3 for +additional information. The UNIQUE constraint causes an index to be created on the specified columns. This index must contain unique keys. The DEFAULT constraint specifies a default value to use when doing an INSERT. The COLLATE clause specifies what text collating function to use @@ -446,12 +448,13 @@ work.

There are no arbitrary limits on the number of columns or on the number of constraints in a table. The total amount of data in a single row is limited to about -1 megabytes. (This limit can be increased to 16MB by changing -a single #define in the source code and recompiling.)

+1 megabyte in version 2.8. In version 3.0 there is no arbitrary +limit on the amount of data in a row.

+

The CREATE TABLE AS form defines the table to be the result set of a query. The names of the table columns are the names of the columns in the result.

@@ -702,11 +705,12 @@ Syntax {sql-command} { DROP INDEX [ .] } puts { -

The DROP INDEX statement removes an index added with the +

The DROP INDEX statement removes an index added +with the CREATE INDEX statement. The index named is completely removed from the disk. The only way to recover the index is to reenter the appropriate CREATE INDEX command. Non-temporary indexes on tables in an attached database cannot be dropped.

@@ -919,11 +923,11 @@ may only be used in a SELECT statement. Aggregate functions compute their result across all rows of the result set.

The functions shown below are available by default. Additional functions may be written in C and added to the database engine using -the sqlite_create_function() +the sqlite3_create_function() API.

@@ -939,12 +943,12 @@ @@ -971,12 +975,12 @@ @@ -1010,10 +1014,21 @@ + + + + + @@ -1051,12 +1066,13 @@ +return values are "null", "integer", "real", "text", and "blob". +SQLite's type handling is +explained in Datatypes in SQLite Version 3.
abs(X)
glob(X,Y) This function is used to implement the -"Y GLOB X" syntax of SQLite. The -sqlite_create_function() +"X GLOB Y" syntax of SQLite. The +sqlite3_create_function() interface can be used to override this function and thereby change the operation of the GLOB operator.
like(X,Y) This function is used to implement the -"Y LIKE X" syntax of SQL. The -sqlite_create_function() +"X LIKE Y" syntax of SQL. The +sqlite3_create_function() interface can be used to override this function and thereby change the operation of the LIKE operator.
nullif(X,Y) Return the first argument if the arguments are different, otherwise return NULL.
quote(X)This routine returns a string which is the value of +its argument suitable for inclusion into another SQL statement. +Strings are surrounded by single-quotes with escapes on interior quotes +as needed. BLOBs are encoded as hexadecimal literals. +The current implementation of VACUUM uses this function. The function +is also useful when writing triggers to implement undo/redo functionality. +
random(*) Return a random integer between -2147483648 and +2147483647.
typeof(X) Return the type of the expression X. The only -return values are "numeric" and "text". SQLite's type handling is -explained in Datatypes in SQLite.
upper(X) Return a copy of input string X converted to all @@ -1067,11 +1083,12 @@

The following aggregate functions are available by default. Additional aggregate functions written in C may be added using the -sqlite_create_aggregate() API.

+sqlite3_create_function() +API.
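The typeof() and quote() entries in the function table above, and the note that extra functions can be registered through sqlite3_create_function(), can be checked with a short sketch. It uses Python's standard `sqlite3` module, whose `Connection.create_function()` wraps the C API. The function name `double_it` is hypothetical and exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# typeof() reports the storage class of each value.
types = conn.execute(
    "SELECT typeof(NULL), typeof(1), typeof(1.5), typeof('a'), typeof(x'00')"
).fetchone()

# quote() renders a value as an SQL literal: strings get single quotes
# with interior quotes doubled, BLOBs become hexadecimal literals.
quoted = conn.execute("SELECT quote('it''s'), quote(x'0aff')").fetchone()

# Register a user-defined scalar function taking one argument.
# "double_it" is a made-up name for illustration.
conn.create_function("double_it", 1, lambda v: v * 2)
doubled = conn.execute("SELECT double_it(21)").fetchone()[0]

print(types)
print(quoted)
print(doubled)
conn.close()
```

The five typeof() results correspond to the return values listed in the updated table entry.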

@@ -1268,12 +1285,14 @@
  • PRAGMA default_cache_size;
    PRAGMA default_cache_size =
    Number-of-pages;

    Query or change the maximum number of database disk pages that SQLite - will hold in memory at once. Each page uses 1K on disk and about 1.5K in memory. - This pragma works like the cache_size + will hold in memory at once. Each page uses 1K on disk and about + 1.5K in memory. + This pragma works like the + cache_size pragma with the additional feature that it changes the cache size persistently. With this pragma, you can set the cache size once and that setting is retained and reused everytime you reopen the database.

  • @@ -1306,11 +1325,13 @@ operations are as much as 50 or more times faster with synchronous OFF.

    This pragma changes the synchronous mode persistently. Once changed, the mode stays as set even if the database is closed and reopened. The synchronous pragma does the same - thing but only applies the setting to the current session.

    + thing but only applies the setting to the current session. + +

  • PRAGMA default_temp_store;
    PRAGMA default_temp_store = DEFAULT;
    (0)
    PRAGMA default_temp_store = MEMORY;
    (2)

  • avg(X) Return the average value of all X within a group.