<title>Appropriate Uses For SQLite</title>

<h2>Appropriate Uses For SQLite</h2>

<p>
SQLite is not directly comparable to other SQL database engines such
as MySQL, Oracle, PostgreSQL, or SQL Server since SQLite is trying to
solve a very different problem.
</p>

<p>
Other SQL database engines strive to implement a shared repository
of enterprise data.  They emphasize scalability, concurrency, centralization,
and control.
</p>

<p>
SQLite, on the other hand, strives to provide local data storage for
individual applications and devices.  SQLite emphasizes economy,
efficiency, reliability, independence, and simplicity.
</p>

<p>
SQLite is not designed to compete with
[http://www.oracle.com/database/index.html | Oracle].
SQLite is designed to compete with [http://man.he.net/man3/fopen | fopen()].
</p>

<h2>Situations Where SQLite Works Well</h2>

<ul>

<tcl>hd_fragment appfileformat</tcl>
<li><p><b>Application File Format</b></p>

<p>(Further details [application file-format | here].)</p>

<p>
SQLite has been used with great success as the on-disk file format
for desktop applications such as version control systems,
financial analysis tools, media cataloging and editing suites, CAD
packages, record keeping programs, and so forth.  The traditional
File/Open operation calls sqlite3_open() to attach to the database
file.  Updates happen automatically as application content is revised
so the File/Save menu option becomes superfluous.  The File/Save_As
menu option can be implemented using the [backup API].
</p>

<p>
There are many advantages to using SQLite as an application file format,
including:
</p>

<ol type="1">
<li><b>Better performance</b>
<ul>
<li> The application only has to load the data it needs, rather
     than reading the entire file and holding a complete parse
     in memory.
<li> Small edits only overwrite the parts of the file that change,
     reducing write time and wear on SSD drives.
<li> In many cases, loading content from an SQLite database
     is faster than reading individual files from disk.
     See [Internal Versus External BLOBs].
</ul>
<li><b>Reduced application cost and complexity</b>
<ul>
<li> No application file I/O code to write and debug.
<li> Content can be accessed and updated using powerful SQL queries.
<li> Extending the file format in later releases
     is as simple as adding new tables and/or new columns.
<li> Applications can leverage the
     [full-text search] and [RTREE] indexes.
<li> Performance problems can often be resolved, even late in the
     development cycle, using [CREATE INDEX], avoiding costly
     redesign, rewrite, and retest efforts.
</ul>
<li><b>Portability</b>
<ul>
<li> The application file is portable across all operating systems,
     32-bit and 64-bit and big- and little-endian architectures.
<li> A federation of programs, perhaps all written in different programming
     languages, can access the same application file with no
     compatibility concerns.
<li> Multiple processes can attach to the same application
     file and can read and write without interfering with one another.
<li> Diverse content which might otherwise be stored as a "pile-of-files"
     is encapsulated into a single disk file which simplifies transport
     via scp/ftp, USB stick, and/or email attachment.
</ul>
<li><b>Reliability</b>
<ul>
<li> Content can be updated continuously and atomically so 
     that little or no work is lost in the event of a power failure or crash.
<li> Bugs are far less likely in SQLite than in custom-written file I/O code.
<li> SQL queries are many times smaller than the equivalent procedural
     code, and since the number of bugs per line of code is roughly
     constant, this means fewer bugs overall.
</ul>
<li><b>Accessibility</b>
<ul>
<li> SQLite database content can be viewed using a wide variety of
     third-party tools.
<li> Content stored in an SQLite database is more likely to be 
     recoverable decades in the future, long after all traces of
     the original application have been lost. Data lives longer than code.
</ul>
</ol>

<p>
SQLite allows database files to have any desired filename extension, so
an application can choose a custom filename extension for its own use, if
desired.  The [application_id pragma] can be used to set an "Application ID"
integer in the database file so that tools like
[http://www.darwinsys.com/file/ | file(1)] can determine that the file
is associated with your application and is not just a generic
SQL database.</p>
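<p>The File/Open, automatic-save, and File/Save_As pattern described above
can be sketched as follows.  This is a minimal illustration, not a reference
implementation: it uses Python's standard <b>sqlite3</b> module instead of
calling sqlite3_open() from C, and the <b>APP_ID</b> value, the <b>note</b>
table, and all function names are hypothetical.  Connection.backup() is
Python's wrapper around the backup API (Python 3.7 or later).</p>

```python
import os
import sqlite3
import tempfile

# Hypothetical application ID; a real application would choose its own
# 32-bit integer.
APP_ID = 0x4D594150


def open_document(path):
    """File/Open: attach to the document file, creating it if necessary."""
    conn = sqlite3.connect(path)
    conn.execute(f"PRAGMA application_id = {APP_ID}")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS note(id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.commit()
    return conn


def add_note(conn, body):
    """Each edit is committed at once, so File/Save has nothing left to do."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute("INSERT INTO note(body) VALUES (?)", (body,))


def save_as(conn, new_path):
    """File/Save_As: copy the live database into a new file."""
    dest = sqlite3.connect(new_path)
    try:
        conn.backup(dest)
    finally:
        dest.close()


def read_application_id(path):
    """Return the Application ID stored in the database file header."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("PRAGMA application_id").fetchone()[0]
    finally:
        conn.close()
```

<p>Because the backup copies the whole database, including its header, the
copy produced by save_as() carries the same Application ID as the
original.</p>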

<tcl>hd_fragment iot {embedded devices} {internet of things}</tcl>
<li><b>Embedded devices</b>
<li><b>Applications at the edge of the network</b>
<li><b>The Internet of Things</b>

<p>Because an SQLite database requires no administration,
SQLite is a good choice for devices that must work
without human support.  SQLite is a good fit for
use in cellphones, set-top boxes, televisions, game consoles,
cameras, watches, kitchen appliances, thermostats, automobiles, 
airplanes, sensors, controls, drones, robots:  the "internet of things".
</p>

<p>Client/server database engines are designed to live at the core
of the network, inside a lovingly-attended datacenter.
SQLite is designed to live at the edge of the
network - fending for itself on the frontier and serving applications
that may have slow and/or dodgy connectivity.
</li>

<li><p><b>Websites</b></p>

<p>SQLite works great as the database engine for most low to
medium traffic websites (which is to say, 99.9% of all websites).
The amount of web traffic that SQLite can handle depends, of course,
on how heavily the website uses its database.  Generally
speaking, any site that gets fewer than 100K hits/day should work
fine with SQLite.
The 100K hits/day figure is a conservative estimate, not a
hard upper bound.
SQLite has been demonstrated to work with 10 times that amount
of traffic.</p>

<p>The SQLite website ([https://www.sqlite.org/]) uses SQLite itself,
of course, and as of this writing (2015) it handles about 400K to 500K
HTTP requests per day, about 15-20% of which are dynamic pages that
touch the database.  Each dynamic page does roughly 200 SQL statements.
This setup runs on a single VM that shares a physical server with 23 others
and yet still keeps the load average below 0.1 most of the time.
</li>

<li><p><b>Command-line dataset analysis tool</b></p>

<p>
Experienced SQL users can employ
the command-line <b>sqlite3</b> program to analyze miscellaneous
datasets. Raw data can be imported from CSV files, then that
data can be sliced and diced to generate a myriad of summary
reports.  Possible uses include website log analysis, sports
statistics analysis, compilation of programming metrics, and
analysis of experimental results.  Many bioinformatics researchers
use SQLite in this way.
</p>

<p>
You can also do the same thing with an enterprise client/server
database, of course.  The advantage of SQLite is
that it is easier to set up and the resulting database 
is a single file that can be easily written to a USB memory stick
or emailed to a colleague.
</p>
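<p>The import-then-summarize workflow can be sketched as below.  The sketch
uses Python's standard <b>sqlite3</b> and <b>csv</b> modules instead of the
command-line <b>sqlite3</b> program, and the sample data, the <b>hits</b>
table, and the column names are hypothetical stand-ins for a web server log
export.</p>

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV export of a web server log.
SAMPLE_CSV = """path,status,bytes
/index.html,200,5120
/about.html,200,2048
/missing,404,512
/index.html,200,5120
"""


def load_csv(conn, text):
    """Import CSV text into a table named hits."""
    rows = csv.DictReader(io.StringIO(text))
    conn.execute("CREATE TABLE hits(path TEXT, status INTEGER, bytes INTEGER)")
    # The INTEGER column affinity converts the numeric strings to integers.
    conn.executemany(
        "INSERT INTO hits(path, status, bytes) VALUES (:path, :status, :bytes)",
        rows,
    )
    conn.commit()


def hits_per_path(conn):
    """Summary report: request count and bytes served for each path."""
    return conn.execute(
        "SELECT path, COUNT(*) AS n, SUM(bytes) AS total "
        "FROM hits GROUP BY path ORDER BY n DESC, path"
    ).fetchall()
```

<p>Calling load_csv() on an in-memory connection and then hits_per_path()
yields one row per path, with the most requested path first.</p>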
</li>

<li><p><b>File archives</b></p>

<p>
The [https://www.sqlite.org/sqlar|SQLite Archiver] project shows how
SQLite can be used as a substitute for ZIP archives or Tarballs.
An archive of files stored in SQLite is only very slightly larger, and
in some cases actually smaller, than the equivalent ZIP archive.
And an SQLite archive features incremental and atomic updating,
and can be configured to store much richer metadata.
</p>
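<p>A rough sketch of the idea follows.  The table layout is modeled on the
one the SQLite Archiver is understood to use (name, mode, mtime, original
size, and zlib-compressed content); treat that layout as an assumption and
consult the SQLite Archiver documentation for the authoritative definition.
The function names are hypothetical.</p>

```python
import sqlite3
import time
import zlib

# Assumed archive table layout; see the lead-in above.
SCHEMA = """CREATE TABLE IF NOT EXISTS sqlar(
  name TEXT PRIMARY KEY,  -- name of the file
  mode INT,               -- access permissions
  mtime INT,              -- last modification time
  sz INT,                 -- original (uncompressed) size
  data BLOB               -- content, compressed only when that saves space
)"""


def add_file(conn, name, content, mode=0o644):
    """Insert or replace one archive member atomically."""
    packed = zlib.compress(content)
    data = packed if len(packed) < len(content) else content
    with conn:  # one transaction per update
        conn.execute(
            "REPLACE INTO sqlar(name, mode, mtime, sz, data) "
            "VALUES (?, ?, ?, ?, ?)",
            (name, mode, int(time.time()), len(content), data),
        )


def read_file(conn, name):
    """Return the original bytes of one archive member."""
    row = conn.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sz, data = row
    # Stored size equal to the original size means the content was
    # stored uncompressed.
    return data if len(data) == sz else zlib.decompress(data)
```

<p>Each call to add_file() touches a single row inside its own transaction,
which is what makes incremental and atomic updating straightforward.</p>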

<p>SQLite Archives are useful as the distribution format for software
or content updates that are broadcast to many clients.  Variations
on this idea are used, for example, to transmit programming guides
to set-top boxes and to send over-the-air updates to vehicle navigation
systems.</p>
</li>

<li><p><b>Replacement for <i>ad hoc</i> disk files</b></p>

<p>Many programs use 
[http://man.he.net/man3/fopen | fopen()],
[http://man.he.net/man3/fread | fread()], and 
[http://man.he.net/man3/fwrite | fwrite()] to create and
manage files of data in home-grown formats.  SQLite works 
particularly well as a
replacement for these <i>ad hoc</i> data files.</p>
</li>

<li><p><b>Internal or temporary databases</b></p>

<p>
For programs that have a lot of data that must be sifted and sorted
in diverse ways, it is often easier and quicker to load the data into
an in-memory SQLite database and use queries with joins and ORDER BY
clauses to extract the data in the form and order needed rather than
to try to code the same operations manually.
Using an SQL database internally in this way also gives the program
greater flexibility since new columns and indices can be added without
having to recode every query.
</p>
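<p>As an illustration, the sketch below loads two in-program lists into an
in-memory database and lets a join with ORDER BY do the sifting and sorting.
It uses Python's standard <b>sqlite3</b> module; the table names, column
names, and the top_spenders() function are hypothetical.</p>

```python
import sqlite3


def top_spenders(customers, purchases):
    """Return (name, total) pairs, largest total first.

    customers: iterable of (id, name)
    purchases: iterable of (customer_id, amount)
    """
    conn = sqlite3.connect(":memory:")  # temporary database, never on disk
    try:
        conn.execute("CREATE TABLE customer(id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("CREATE TABLE purchase(customer_id INTEGER, amount REAL)")
        conn.executemany("INSERT INTO customer VALUES (?, ?)", customers)
        conn.executemany("INSERT INTO purchase VALUES (?, ?)", purchases)
        return conn.execute(
            "SELECT c.name, SUM(p.amount) AS total "
            "FROM customer AS c JOIN purchase AS p ON p.customer_id = c.id "
            "GROUP BY c.id ORDER BY total DESC, c.name"
        ).fetchall()
    finally:
        conn.close()
```

<p>Changing the report, for example to sort by name or to filter by amount,
only requires editing the query text rather than rewriting loops and
comparison functions.</p>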
</li>

<li><p><b>Stand-in for an enterprise database during demos or testing</b></p>

<p>
If you are writing a client application for an enterprise database engine,
it makes sense to use a generic database backend that allows you to connect
to many different kinds of SQL database engines.  It makes even better
sense to
go ahead and include SQLite in the mix of supported databases and to statically
link the SQLite engine in with the client.  That way the client program
can be used standalone with an SQLite data file for testing or for
demonstrations.
</p>
</li>

<li><p><b>Database Pedagogy</b></p>

<p>
Because it is simple to set up and use (installation is trivial: just
copy the <b>sqlite3</b> or <b>sqlite3.exe</b> executable to the target machine
and run it) SQLite makes a good database engine for use in teaching SQL.
Students can easily create as many databases as they like and can
email databases to the instructor for comments or grading.  For more
advanced students who are interested in studying how an RDBMS is
implemented, the modular and well-commented and documented SQLite code
can serve as a good basis.  This is not to say that SQLite is an accurate
model of how other database engines are implemented, but rather a student who
understands how SQLite works can more quickly comprehend the operational
principles of other systems.
</p>
</li>

<li><p><b>Experimental SQL language extensions</b></p>

<p>The simple, modular design of SQLite makes it a good platform for
prototyping new, experimental database language features or ideas.
</p>
</li>


</ul>

<h2>Situations Where Another RDBMS May Work Better</h2>

<ul>
<li><p><b>Client/Server Applications</b></p>

<p>If you have many client programs accessing a common database
over a network, you should consider using a client/server database
engine instead of SQLite.  SQLite will work over a network filesystem,
but because of the latency associated with most network filesystems,
performance will not be great.  Also, the file locking logic of
many network filesystem implementations contains bugs (on both Unix
and Windows).  If file locking does not work as it should,
it might be possible for two or more client programs to modify the
same part of the same database at the same time, resulting in 
database corruption.  Because this problem results from bugs in
the underlying filesystem implementation, there is nothing SQLite
can do to prevent it.</p>

<p>A good rule of thumb is that you should avoid using SQLite
in situations where the same database will be accessed simultaneously
from many computers over a network filesystem.</p>
</li>

<li><p><b>High-volume Websites</b></p>

<p>SQLite will normally work fine as the database backend to a website.
But if your website is so busy that you are thinking of splitting the
database component off onto a separate machine, then you should 
definitely consider using an enterprise-class client/server database
engine instead of SQLite.</p>
</li>

<li><p><b>Very large datasets</b></p>

<p>An SQLite database is limited in size to 140 terabytes 
(2<sup><small>47</small></sup> bytes, 128 tebibytes).
And even if it could handle larger databases, SQLite stores the entire
database in a single disk file and many filesystems limit the maximum
size of files to something less than this.  So if you are contemplating
databases of this magnitude, you would do well to consider using a
client/server database engine that spreads its content across multiple
disk files, and perhaps across multiple volumes.
</p>
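<p>The relationship between the three figures quoted above can be checked
with a few lines of arithmetic.  The constant and function names are chosen
for this illustration only.</p>

```python
# Size limit quoted in the text: 2**47 bytes.
MAX_DB_BYTES = 2 ** 47


def as_terabytes(n):
    """Decimal terabytes (10**12 bytes)."""
    return n / 10 ** 12


def as_tebibytes(n):
    """Binary tebibytes (2**40 bytes)."""
    return n / 2 ** 40


if __name__ == "__main__":
    print(MAX_DB_BYTES, as_terabytes(MAX_DB_BYTES), as_tebibytes(MAX_DB_BYTES))
```

<p>The result is exactly 128 tebibytes, or a little over 140 decimal
terabytes.</p>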
</li>

<li><p><b>High Concurrency</b></p>

<p>
SQLite supports an unlimited number of simultaneous readers, but it 
will only allow one writer at any instant in time.
For many situations, this is not a problem.  Each application
does its database work quickly and moves on, and no lock lasts for more
than a few dozen milliseconds. But there are some applications that require
more concurrency, and those applications may need to seek a different
solution.
</p>
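<p>The one-writer rule can be observed with two connections to the same
file, as in the sketch below.  It uses Python's standard <b>sqlite3</b>
module with the default rollback-journal mode; the <b>log</b> table and the
function name are hypothetical.  The second connection is opened with
<b>timeout=0</b> so that it fails immediately instead of waiting; with a
non-zero timeout it would simply wait for its turn.</p>

```python
import os
import sqlite3
import tempfile


def demo_single_writer(path):
    """Show that a second writer must wait until the first one commits."""
    # isolation_level=None gives explicit control over BEGIN and COMMIT.
    a = sqlite3.connect(path, isolation_level=None)
    b = sqlite3.connect(path, timeout=0, isolation_level=None)
    a.execute("CREATE TABLE IF NOT EXISTS log(msg TEXT)")
    events = []

    a.execute("BEGIN IMMEDIATE")  # connection a takes the write lock
    a.execute("INSERT INTO log VALUES ('from a')")
    try:
        b.execute("BEGIN IMMEDIATE")  # b cannot become a writer yet
    except sqlite3.OperationalError:
        events.append("b blocked")
    a.execute("COMMIT")  # a releases the write lock

    b.execute("BEGIN IMMEDIATE")  # now it is b's turn
    b.execute("INSERT INTO log VALUES ('from b')")
    b.execute("COMMIT")
    events.append("b wrote")

    rows = [r[0] for r in a.execute("SELECT msg FROM log ORDER BY rowid")]
    a.close()
    b.close()
    return events, rows
```

<p>In an application that can tolerate short waits, leaving the timeout at a
few seconds lets writers queue up and take turns without any extra code.</p>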
</li>

</ul>

<tcl>
hd_fragment dbcklst {decision checklist}
</tcl>
<h2>Checklist For Choosing The Right Database Engine</h2>

<ol>
<li><p><b>Is the data separated from the application by a network?
       &rarr; choose client/server</b></p>

<p>Relational database engines act as a bandwidth-reducing data filter.
So it is best to keep the database engine and the data on
the same physical device so that the high-bandwidth engine-to-disk
link does not have to traverse the network, only the lower-bandwidth
application-to-engine link.

<p>But SQLite is built into the application.  So if the data is on a
separate device from the application, it is required that the higher
bandwidth engine-to-disk link be across the network.  This works, but
it is suboptimal.  Hence, it is usually better to select a client/server
database engine when the data is on a separate device from the
application.
</li>

<li><p><b>Many concurrent writers? &rarr; choose client/server</b></p>

<p>If many threads and/or processes need to write the
database at the same instant (and they cannot queue up and take turns)
then it is best to select a database engine that supports that
capability, which always means a client/server database engine.

<p>SQLite only supports one writer at a time per database file.
But in most cases, a write transaction only takes milliseconds and
so multiple writers can simply take turns.  SQLite will handle
more write concurrency than many people suspect.  Nevertheless,
client/server database systems, because they have a long-running
server process at hand to coordinate access, can usually handle 
far more write concurrency than SQLite ever will.
</li>

<li><p><b>Big data? &rarr; choose client/server</b></p>

<p>If your data will grow to a size that you are uncomfortable keeping,
or unable to fit, in a single disk file, then you should select
a solution other than SQLite.  SQLite supports databases up to
140 terabytes in size, assuming you can find a disk drive and filesystem
that will support 140-terabyte files.  Even so, when the size of the
content looks like it might creep into the terabyte range, it would
be good to consider a centralized client/server database.
</li>

<li><p><b>Otherwise &rarr; choose SQLite!</b></p>

<p>For device-local storage with low writer concurrency and less than a
terabyte of content, SQLite is almost always a better solution.  SQLite
is fast and reliable and it requires no configuration or maintenance.
It keeps things simple.  SQLite "just works".
</li>
</ol>
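<p>The checklist can be read as a short decision procedure, sketched below.
The function and parameter names are hypothetical, and the one-terabyte
threshold is only the rule of thumb given above, not a hard limit.</p>

```python
TERABYTE = 10 ** 12  # rule-of-thumb threshold taken from the checklist


def choose_engine(data_across_network, many_concurrent_writers, expected_bytes):
    """Apply the checklist questions in order and return a recommendation."""
    if data_across_network:
        return "client/server"
    if many_concurrent_writers:
        return "client/server"
    if expected_bytes >= TERABYTE:
        return "client/server"
    return "SQLite"
```

<p>For example, a device-local application with a single writer and a few
gigabytes of content falls through every test and lands on SQLite.</p>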