Documentation Source Text: Artifact [a7aa8bb3c5]

Artifact a7aa8bb3c53dced015869f714fdfb12e2fc63a03:

File pages/asyncvfs.in — part of check-in [50fc706041] at 2012-12-03 20:10:10 on branch trunk — Tweaks to the change log in preparation for 3.7.15. (user: drh size: 8190) [more...]
<title>An Asynchronous I/O Module For SQLite</title>
<tcl>hd_keywords {asynchronous VFS} {asynchronous I/O backend}</tcl>
<h1 align="center">An Asynchronous I/O Module For SQLite</h1>

<hr>
<p><font size=+1><b>NOTE:</b> 
[WAL mode] with [PRAGMA synchronous] set to NORMAL avoids calls to
fsync() during transaction commit and only invokes fsync() during
a [checkpoint] operation.  The use of [WAL mode] largely obviates the
need for this asynchronous I/O module.  Hence, this module is no longer
supported.  The source code continues to exist in the SQLite source tree,
but it is not a part of any standard build and is no longer maintained.
This documentation is retained for historical reference.</font></p><hr>

<p>Normally, when SQLite writes to a database file, it waits until the write
operation is finished before returning control to the calling application.
Since writing to the file-system is usually very slow compared with CPU
bound operations, this can be a performance bottleneck. The asynchronous I/O
backend is an extension that causes SQLite to perform all write requests
using a separate thread running in the background. Although this does not
reduce the overall system resources (CPU, disk bandwidth etc.), it does
allow SQLite to return control to the caller quickly even when writing to
the database.

<h2>1.0 FUNCTIONALITY</h2>

<p>With asynchronous I/O, write requests are handled by a separate thread
running in the background.  This means that the thread that initiates
a database write does not have to wait for (sometimes slow) disk I/O
to occur.  The write seems to happen very quickly, though in reality
it is happening at its usual slow pace in the background.

<p>Asynchronous I/O appears to give better responsiveness, but at a price.
You lose the Durable property.  With the default I/O backend of SQLite,
once a write completes, you know that the information you wrote is
safely on disk.  With the asynchronous I/O, this is not the case.  If
your program crashes or if a power loss occurs after the database
write but before the asynchronous write thread has completed, then the
database change might never make it to disk and the next user of the
database might not see your change.

<p>You lose Durability with asynchronous I/O, but you still retain the
other parts of ACID:  Atomic,  Consistent, and Isolated.  Many
applications get along fine without the Durability.

<h3>1.1 How it Works</h3>

<p>Asynchronous I/O works by creating an SQLite [sqlite3_vfs | VFS object]
and registering it with [sqlite3_vfs_register()].
When files opened via 
this VFS are written to (using the vfs xWrite() method), the data is not 
written directly to disk, but is placed in the "write-queue" to be
handled by the background thread.

<p>When files opened with the asynchronous VFS are read from 
(using the vfs xRead() method), the data is read from the file on 
disk and the write-queue, so that from the point of view of
the vfs reader the xWrite() appears to have already completed.

<p>The asynchronous I/O VFS is registered (and unregistered) by calls to the 
API functions sqlite3async_initialize() and sqlite3async_shutdown().
See section "Compilation and Usage" below for details.

<h3>1.2 Limitations</h3>

<p>In order to gain experience with the main ideas surrounding asynchronous 
IO, this implementation is deliberately kept simple. Additional 
capabilities may be added in the future.

<p>For example, as currently implemented, if writes are happening at a 
steady stream that exceeds the I/O capability of the background writer
thread, the queue of pending write operations will grow without bound.
If this goes on for long enough, the host system could run out of memory. 
A more sophisticated module could to keep track of the quantity of 
pending writes and stop accepting new write requests when the queue of 
pending writes grows too large.

<h3>1.3 Locking and Concurrency</h3>

<p>Multiple connections from within a single process that use this
implementation of asynchronous IO may access a single database
file concurrently. From the point of view of the user, if all
connections are from within a single process, there is no difference
between the concurrency offered by "normal" SQLite and SQLite
using the asynchronous backend.

<p>If file-locking is enabled (it is enabled by default), then connections
from multiple processes may also read and write the database file.
However concurrency is reduced as follows:

<ul>
<li><p> When a connection using asynchronous IO begins a database
        transaction, the database is locked immediately. However the
        lock is not released until after all relevant operations
        in the write-queue have been flushed to disk. This means
        (for example) that the database may remain locked for some 
        time after a "[COMMIT]" or "[ROLLBACK]" is issued.

<li><p> If an application using asynchronous IO executes transactions
        in quick succession, other database users may be effectively
        locked out of the database. This is because when a [BEGIN]
        is executed, a database lock is established immediately. But
        when the corresponding COMMIT or ROLLBACK occurs, the lock
        is not released until the relevant part of the write-queue 
        has been flushed through. As a result, if a COMMIT is followed
        by a BEGIN before the write-queue is flushed through, the database 
        is never unlocked,preventing other processes from accessing 
        the database.
</ul>

<p>File-locking may be disabled at runtime using the sqlite3async_control()
API (see below). This may improve performance when an NFS or other 
network file-system, as the synchronous round-trips to the server be 
required to establish file locks are avoided. However, if multiple 
connections attempt to access the same database file when file-locking
is disabled, application crashes and database corruption is a likely
outcome.


<h2>2.0 COMPILATION AND USAGE</h2>

<p>
The asynchronous IO extension consists of a single file of C code
(sqlite3async.c), and a header file (sqlite3async.h), located in the
<a href="http://www.sqlite.org/src/dir?name=ext/async">
<tt>ext/async/</tt> subfolder</a> of the SQLite source tree, that defines the 
C API used by applications to activate and control the modules 
functionality.

<p>
To use the asynchronous IO extension, compile sqlite3async.c as
part of the application that uses SQLite. Then use the APIs defined
in sqlite3async.h to initialize and configure the module.

<p>
The asynchronous IO VFS API is described in detail in comments in 
sqlite3async.h. Using the API usually consists of the following steps:

<ol>
<li><p>Register the asynchronous IO VFS with SQLite by calling the
       sqlite3async_initialize() function.

<li><p>Create a background thread to perform write operations and call
       sqlite3async_run().

<li><p>Use the normal SQLite API to read and write to databases via 
       the asynchronous IO VFS.
</ol>

<p>Refer to comments in the
<a href="http://www.sqlite.org/src/finfo?name=ext/async/sqlite3async.h">
sqlite3async.h header file</a> for details.


<h2>3.0 PORTING</h2>

<p>Currently the asynchronous IO extension is compatible with win32 systems
and systems that support the pthreads interface, including Mac OS X, Linux, 
and other varieties of Unix. 

<p>To port the asynchronous IO extension to another platform, the user must
implement mutex and condition variable primitives for the new platform.
Currently there is no externally available interface to allow this, but
modifying the code within sqlite3async.c to include the new platforms
concurrency primitives is relatively easy. Search within sqlite3async.c
for the comment string "PORTING FUNCTIONS" for details. Then implement
new versions of each of the following:

<blockquote><pre>
static void async_mutex_enter(int eMutex);
static void async_mutex_leave(int eMutex);
static void async_cond_wait(int eCond, int eMutex);
static void async_cond_signal(int eCond);
static void async_sched_yield(void);
</pre></blockquote>

<p>The functionality required of each of the above functions is described
in comments in sqlite3async.c.