/ Artifact Content
Login

Artifact 4f6a9afecc099a27bba17b4f8cc9561abc15dc40:


#
# Run this Tcl script to generate the sqlite.html file.
#
set rcsid {$Id: arch.tcl,v 1.2 2000/06/09 14:14:34 drh Exp $}

puts {<html>
<head>
  <title>Architecture of SQLite</title>
</head>
<body bgcolor=white>
<h1 align=center>
The Architecture Of SQLite
</h1>}
puts "<p align=center>
(This page was last modified on [lrange $rcsid 3 4] GMT)
</p>"

puts {
<h2>Introduction</h2>

<table align="right" border="1" cellpadding="15" cellspacing="1">
<tr><th>Block Diagram Of SQLite</th></tr>
<tr><td><img src="arch.png"></td></tr>
</table>
<p>This file describes the architecture of the SQLite library.
A block diagram showing the main components of SQLite
and how they interrelate is shown at the right.  The text that
follows will provide a quick overview of each of these components.
</p>

<h2>Interface</h2>

<p>The public interface to the SQLite library is implemented by
four functions found in the <b>main.c</b> source file.  Additional
information on the C interface to SQLite is
<a href="c_interface.html">available separately</a>.<p>

<p>To avoid name collisions with other software, all external
symbols in the SQLite library begin with the prefix <b>sqlite</b>.
Those symbols that are intended for external use (as oppose to
those which are for internal use only but which have to be exported
do to limitations of the C linker's scoping mechanism) begin
with <b>sqlite_</b>.</p>

<h2>Tokenizer</h2>

<p>When a string containing SQL statements is to be executed, the
interface passes that string to the tokenizer.  The job of the tokenizer
is to break the original string up into tokens and pass those tokens
one by one to the parser.  The tokenizer is hand-coded in C.
(There is no "lex" code here.)  All of the code for the tokenizer
is contained in the <b>tokenize.c</b> source file.</p>

<p>Note that in this design, the tokenizer calls the parser.  People
who are familiar with YACC and BISON may be used to doing things the
other way around -- having the parser call the tokenizer.  This author
as done it both ways, and finds things generally work out nicer for
the tokenizer to call the parser.  YACC has it backwards.</p>

<h2>Parser</h2>

<p>The parser is the piece that assigns meaning to tokens based on
their context.  The parser for SQLite is generated using the
<a href="http://www.hwaci.com/sw/lemon/">Lemon</a> LALR(1) parser
generator.  Lemon does the same job as YACC/BISON, but is uses
a different input syntax which is less error-prone.
Lemon also generates a parser which is reentrant and thread-safe.
And lemon defines the concept of a non-terminal destructor so
that it does not leak memory when syntax errors are encountered.
The source file that drives Lemon is found in <b>parse.y</b>.</p>

<p>Because
lemon is a program not normally found on development machines, the
complete source code to lemon (just one C file) is included in the
SQLite distribution in the "tool" subdirectory.  Documentation on
lemon is found in the "doc" subdirectory of the distribution.
</p>

<h2>Code Generator</h2>

<p>After the parser assembles tokens into complete SQL statements,
it calls the code generator to produce virtual machine code that
will do the work that the SQL statements request.  There are seven
files in the code generator:  <b>build.c</b>, <b>delete.c</b>,
<b>expr.c</b>, <b>insert.c</b> <b>select.c</b>, <b>update.c</b>, 
and <b>where.c</b>.
In these files is where most of the serious magic happens.
<b>expr.c</b> handles code generation for expressions.
<b>where.c</b> handles code generation for WHERE clauses on
SELECT, UPDATE and DELETE statements.  The files
<b>delete.c</b>, <b>insert.c</b>, <b>select.c</b>, and
<b>update.c</b> handle the code generation for SQL statements
with the same names.  (Each of these files calls routines
in <b>expr.c</b> and <b>where.c</b> as necessary.)  All other
SQL statements are coded out of <b>build.c</b>.</p>

<h2>Virtual Machine</h2>

<p>The program generated by the code generator is executed by
the virtual machine.  Additional information about the virtual
machine is <a href="opcode.html">available separately</a>.
To summarize, the virtual machine implements an abstract computing
engine specifically designed to manipulate database files.  The
machine has a stack which is used for intermediate storage.
Each instruction contains an opcode and
up to three additional operands.</p>

<p>The virtual machine is entirely contained in a single
source file <b>vdbe.c</b>.  The virtual machine also has
its own header file <b>vdbe.h</b> that defines an interface
between the virtual machine and the rest of the SQLite library.</p>

<h2>Backend</h2>

<p>The last layer in the design of SQLite is the backend.  The
backend implements an interface between the virtual machine and
the underlying data file library -- GDBM in this case.  The interface
is designed to make it easy to substitute a different database
library, such as the Berkeley DB.  
The backend abstracts many of the low-level details to help
reduce the complexity of the virtual machine.</p>

<p>The backend is contained in the single source file <b>dbbe.c</b>.
The backend also has a header file <b>dbbe.h</b> that defines the
interface between the backend and the rest of the SQLite library.</p>
}

puts {
<br clear="both" />
<p><hr /></p>
<p><a href="index.html"><img src="/goback.jpg" border=0 />
Back to the SQLite Home Page</a>
</p>

</body></html>}