How To Compile And Use SEE

1.0 Introduction

This file describes the SQLite Encryption Extension (SEE) for SQLite. The SEE allows SQLite to read and write encrypted database files. All database content, including the metadata, is encrypted so that to an outside observer the database appears to be white noise.

A version of SQLite that includes SEE is also able to read and write normal database files created with a public domain version of SQLite. But the public version of SQLite will not be able to read or write an encrypted database file. Indeed, no version of any known software will be able to access an encrypted database file without knowing the encryption key.

The SEE is actually a set of extensions employing various encryption algorithms. The following encryption algorithms are currently supported:

AES-256 in OFB mode (recommended for all new development)
AES-128 in OFB mode
AES-128 in CCM mode
RC4 with security enhancements (legacy only)

2.0 License

The core SQLite library is in the public domain. However, the extensions needed to read and write an encrypted database file are licensed software. You should only be able to see this software if you have a license.

Your license is perpetual. You have paid a one-time fee that allows you to use and modify the software forever. You can ship as many copies of the software to your customers as you want so long as you ensure that only compiled binaries are shipped (you cannot distribute source code) and that your customers cannot make additional copies of the software to use for other purposes.

You can create multiple products that use this software as long as all products are developed and maintained by the same team. For the purposes of this paragraph, a "team" is a work unit where everybody knows each other's names. If you are in a large company where this product is used by multiple teams, then each team should acquire its own separate license, or an enterprise license.

3.0 How To Compile

Your application sees SEE as a single large file of C-code that is a drop-in replacement for the SQLite amalgamation. The SEE source-code file works and compiles just like the public-domain "sqlite3.c" amalgamation. If you already build your application using the public-domain "sqlite3.c" file, then to build using SEE you merely replace the public-domain "sqlite3.c" with an SEE-enabled "sqlite3.c" file and recompile.

There are nine different SEE-enabled "sqlite3.c" files to choose from:

sqlite3-see-aes256-openssl.c
sqlite3-see-aes256-cryptoapi.c
sqlite3-see-aes256-ofb.c
sqlite3-see-cccrypt.c
sqlite3-see-aes128-ofb.c
sqlite3-see-aes128-ccm.c
sqlite3-see.c
sqlite3-see-rc4.c
sqlite3-see-xor.c

The recommended procedure for adding SEE into your application is to copy one of these files into your application source tree, renaming it as "sqlite3.c" and overwriting the public-domain "sqlite3.c" source file, then recompile. After recompiling, your application should continue working exactly as it did before, reading and writing ordinary unencrypted SQLite databases. Once you have recompiled and verified that everything still works, then go back in and add a PRAGMA (described below) that activates encryption to your application code, and you are done.

3.1 Source Code Files In The SEE Distribution

The following are the source-code files used to implement the SQLite Encryption Extension:

sqlite3-see-aes256-openssl.c
This file is a drop-in replacement for the public-domain "sqlite3.c" file, adding support for encryption using AES-256 in OFB mode by linking against the external OpenSSL library.
sqlite3-see-aes256-cryptoapi.c
This file is a drop-in replacement for the public-domain "sqlite3.c" source file, adding encryption capabilities using AES-256 in OFB mode through the native CryptoAPI interface on Windows.
sqlite3-see-aes256-ofb.c
This file is a drop-in replacement for the public-domain "sqlite3.c" file, adding support for encryption using AES-256 in OFB mode with a built-in copy of the Rijndael reference implementation.
sqlite3-see-cccrypt.c
This file is a drop-in replacement for the public-domain "sqlite3.c" file, adding support for the AES-128 and AES-256 encryption algorithms, in OFB mode, using the external CCCrypt encryption. CCCrypt is the default encryption library on MacOS and iOS, and so this implementation of SEE is recommended for those platforms.
The see-cccrypt.c module normally does only AES128 encryption. However, when see-cccrypt.c is compiled with -DCCCRYPT256, it will use AES256 if and only if the key is exactly 32 bytes long.
sqlite3-see-aes128-ofb.c
This file is a drop-in replacement for the public-domain "sqlite3.c" file. This replacement adds support for the AES-128 encryption algorithm in OFB mode using the Rijndael reference implementation.
sqlite3-see-aes128-ccm.c
This file is a drop-in replacement for the public-domain "sqlite3.c" file. This replacement adds support for the AES-128 encryption algorithm in CCM mode. CCM mode includes a message authentication code which provides authentication in addition to confidentiality. This uses the Rijndael reference implementation for AES.
sqlite3-see-rc4.c
This file is a drop-in replacement for the public-domain "sqlite3.c" file, adding support for encryption using the RC4 algorithm. RC4 is no longer considered secure. You should not use this implementation of SEE. It is provided for historical compatibility only.
sqlite3-see.c
This file is a drop-in replacement for the public-domain "sqlite3.c" source file, adding support for encryption using any of the RC4, AES128-OFB, or AES256-OFB algorithms. The algorithm used is based on a prefix to the encryption key. If the key material begins with "rc4:" then RC4 encryption is used. If the key material begins with "aes128:" then AES128-OFB is used. If the key material begins with "aes256:" then AES256-OFB is used. If none of these three valid prefixes appears on the key, then AES128-OFB is the default algorithm. A valid prefix is removed from the key before the key is passed on to the encryption algorithm.
sqlite3-see-xor.c
This file is a drop-in replacement for the public-domain "sqlite3.c" source file, adding pseudo-encryption which does nothing more than XOR the database against a repeated copy of the encryption key. This variant of SEE does not provide true encryption. It is for demonstration use only, or for use in cases where it is desirable to obfuscate a database file without actually encrypting it, perhaps due to legal constraints.
sqlite3.c
A copy of ordinary, unencrypted SQLite that contains additional hooks needed to add encryption. The other encrypted SQLite modules above are all copies of this file with additional code prepended and appended to do the encryption work. This file is provided for reference only and is probably not useful for development.
sqlite3.h
This file contains the interface definitions for SQLite. Other programs that link against SQLite will need this file, and you will need this file in order to compile the CLI, but you do not need this file to compile SQLite itself.
shell.c
This file contains source code for the "CLI", the Command Line Interface program named "sqlite3.exe" that you can use to access and control SQLite database files. This file is different from the "shell.c" file that comes with the public-domain version of SQLite. This shell.c has been enhanced to make use of the encryption extension.

3.2 Building And Compiling The SEE Code

To compile SEE into a static library, select an appropriate "sqlite3-see-*.c" source file (containing the algorithm and implementation you desire), then compile that file just like you would compile an ordinary public-domain "sqlite3.c" source file. On unix systems, the command sequence would be something like this:

    gcc -c sqlite3-see-aes256-ofb.c
    ar rcs sqlite3-see-aes256-ofb.a sqlite3-see-aes256-ofb.o

On windows, the commands are more like this:

    cl -c sqlite3-see-aes256-ofb.c
    lib /out:libsee.lib sqlite3-see-aes256-ofb.obj

3.3 Building A Shared-Library Or DLL

We encourage you to statically link SQLite against your application. However, if you must use SQLite as a separate DLL or shared library, you can compile as follows on Linux:

    gcc -fPIC -shared -o libsee.so sqlite3-see-aes256-ofb.c

Or on Windows:

    cl -DSQLITE_API=__declspec(dllexport) sqlite3-see-aes256-ofb.c /link /dll /out:libsee.dll

3.4 Building The Command-Line Shell Program

To compile the CLI, just hand the shell.c source file to your C compiler together with either the static library prepared above, or the original source code files. A typical command on Linux is:

    gcc -o sqlite3 shell.c sqlite3-see-aes256-ofb.c -lpthread -ldl

On a Mac:

    gcc -o sqlite3 shell.c sqlite3-see-aes256-ofb.c -ldl

On Windows with MSVC:

    cl /Fesqlite3.exe shell.c sqlite3-see-aes256-ofb.c

For an added performance boost when building the CLI, consider adding the -DSQLITE_THREADSAFE=0 option. The CLI is single threaded and SQLite runs faster if it doesn't have to use its mutexes.

SEE can also be built for Windows Phone 8, UWP 10, and Android.

4.0 Command-Line Usage

The CLI is the same CLI used by public-domain SQLite though with enhancements to support encryption. There are new command-line options ("-key", "-hexkey", and "-textkey") for specifying the encryption key. Examples:

    sqlite3 -key secret database.db
    sqlite3 -hexkey 736563726574 database.db
    sqlite3 -textkey secret2 database.db

If the key is omitted or is an empty string, no encryption is performed.

There are three different key formats. The first format (-key) takes the key string and repeats it over and over until it exceeds the number of bytes in the key of the underlying algorithm (16 bytes for AES128, 32 bytes for AES256, or 256 bytes for RC4). It then truncates the result to the algorithm key size. That approach limits the key space since it does not allow 0x00 bytes in the key. The second format (-hexkey) accepts the key as hexadecimal, so any key can be represented. If the provided key is too long it is truncated. If the provided key is too short, it is repeated to fill it out to the algorithm key length. The third format (-textkey) computes a strong hash on the input key material and uses that hash to key the algorithm. The -textkey format is recommended for new applications.
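
The -hexkey format can be pictured with a small decoder. The following is a minimal sketch in C and is not SEE's own code: hex_to_bytes() and hex_digit_value() are hypothetical names, and rejecting odd-length or non-hexadecimal input is an assumption, since this document does not say how SEE treats malformed hexadecimal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c){
  if( c>='0' && c<='9' ) return c - '0';
  if( c>='a' && c<='f' ) return c - 'a' + 10;
  if( c>='A' && c<='F' ) return c - 'A' + 10;
  return -1;
}

/*
** Decode the hexadecimal string zHex into aOut[], storing at most nOut
** bytes (excess input is dropped, mirroring the truncation of long keys).
** Returns the number of bytes stored, or -1 for malformed input.
** ASSUMPTION: the -1 result is a choice made for this sketch only.
*/
int hex_to_bytes(const char *zHex, unsigned char *aOut, size_t nOut){
  size_t nHex, i, n = 0;
  assert( zHex!=NULL && aOut!=NULL );
  nHex = strlen(zHex);
  if( nHex % 2 != 0 ) return -1;
  for(i=0; i<nHex; i+=2){
    int hi = hex_digit_value(zHex[i]);
    int lo = hex_digit_value(zHex[i+1]);
    if( hi<0 || lo<0 ) return -1;
    if( n<nOut ) aOut[n++] = (unsigned char)((hi<<4) | lo);
  }
  return (int)n;
}
```

Decoding "736563726574" with this helper yields the six bytes of the text "secret", which is why the -key and -hexkey examples above name the same key material.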

4.1 Changing the encryption key

The SEE-enabled CLI also includes new dot-commands ".rekey", ".hex-rekey", and ".text-rekey" for changing the encryption key:

   .rekey OLD NEW NEW
   .hex-rekey OLD NEW NEW
   .text-rekey OLD NEW NEW

The first argument is always the old password, in exactly the same format in which it was supplied to the "-key", "-hexkey", or "-textkey" options when the command-line tool was started. If the database was previously unencrypted, use an empty string "" as the key. The 2nd and 3rd arguments are the new encryption key. You must enter the new key twice to check for typos - the rekey will not occur unless both instances of the new key are the same. To encrypt a previously unencrypted database, do this:

   .rekey "" new-key new-key
   VACUUM;

The VACUUM step is not required to enable encryption but it is highly recommended. The VACUUM command ensures that every page of the database file has a secure nonce. The VACUUM is only needed when an existing, non-empty database file is encrypted for the first time.

To decrypt a database do this:

   .rekey old-key "" ""

The .rekey command only works with text keys. To rekey a database that contains a binary key use the ".hex-rekey" command instead. The .hex-rekey command works just like .rekey except the new key is entered as hexadecimal instead of text. The ".text-rekey" command computes a hash of the NEW argument and uses that hash as the encryption key.

5.0 C Interface

If you deploy the SQLite encryption extension as a DLL or shared library then you must first activate the library by invoking:

   sqlite3_activate_see("7bb07b8d471d642e");

The argument is your product activation key. The activation key is available as plain-text in the source code so you can clearly see what it is. The purpose of the activation key is to prevent one of your customers from extracting the SQLite library and using it separately from your application. Without knowledge of the activation key, which only you should know, your users will be unable to access the encryption features.

If you are unable to invoke the C-interface to sqlite3_activate_see() (perhaps because you are accessing SQLite through a wrapper layer) then you can alternatively activate the encryption features using a PRAGMA:

  PRAGMA activate_extensions='see-7bb07b8d471d642e';

Use the sqlite3_open() API to open an encrypted database or any database that you want to rekey. Immediately after opening, specify the key using sqlite3_key_v2():

   int sqlite3_key_v2(
      sqlite3 *db,         /* The connection from sqlite3_open() */
      const char *zDbName, /* Which ATTACHed database to key */
      const void *pKey,    /* The key */
      int nKey             /* Number of bytes in the key */
   );

If the pKey argument is NULL or nKey is 0, then the database is assumed to be unencrypted. The nKey parameter can be arbitrarily large, though only the first 256 bytes (RC4) or 16 bytes (AES128) or 32 bytes (AES256) will be used. In SEE versions 3.15.0 and later, if nKey is negative, then pKey is assumed to be a zero-terminated passphrase string. In that case the passphrase is hashed and the hash is used as the key to the AES algorithm. The passphrase itself is used as the key for RC4.

CAUTION: The feature of using a passphrase hash when nKey<0 was added in version 3.15.0. If you use nKey<0 in any SEE version prior to 3.15.0, encryption will be silently disabled, just as if you had set nKey=0.
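
The rules for pKey and nKey can be summarized as a small decision function. This is only a sketch of the documented behavior and not code taken from SEE: classify_key(), max_key_bytes(), and both enum types are hypothetical names. It also assumes that the library version is expressed in the integer form returned by sqlite3_libversion_number() (3.15.0 becomes 3015000) and that SEE release numbers follow that scheme.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels used only by this sketch; not part of the SEE API. */
typedef enum { ALG_RC4, ALG_AES128, ALG_AES256 } SeeAlgorithm;
typedef enum { KEY_NONE, KEY_RAW, KEY_PASSPHRASE } KeyMode;

/* Largest number of key bytes each algorithm uses, per the text above. */
size_t max_key_bytes(SeeAlgorithm alg){
  switch( alg ){
    case ALG_RC4:    return 256;
    case ALG_AES128: return 16;
    default:         return 32;   /* ALG_AES256 */
  }
}

/*
** Describe how a (pKey, nKey) pair is interpreted.
** libVersion is ASSUMED to use the sqlite3_libversion_number() encoding.
*/
KeyMode classify_key(const void *pKey, int nKey, int libVersion){
  if( pKey==NULL || nKey==0 ) return KEY_NONE;      /* unencrypted */
  if( nKey<0 ){
    /* Before 3.15.0 a negative nKey silently disables encryption. */
    return libVersion>=3015000 ? KEY_PASSPHRASE : KEY_NONE;
  }
  return KEY_RAW;     /* only the first max_key_bytes() bytes matter */
}
```

An application that relies on nKey<0 could run an equivalent check at startup and refuse to continue when the result is KEY_NONE, rather than write an unencrypted database by accident.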

The see-cccrypt.c module uses AES128 encryption by default. However, if see-cccrypt.c is compiled with -DCCCRYPT256 and if the sqlite3_key_v2() interface is called with nKey==32, then AES256 encryption is used instead.

If you specify an incorrect key, you will not get an error message right away. But the first time you try to access the database you will get an SQLITE_NOTADB error with a message of "file is encrypted or is not a database".

The zDbName parameter specifies which ATTACH-ed database should get the key. Usually this is "main". You can pass in a NULL pointer as an alias for "main". Unless you have a good reason to do otherwise, it is best to pass in a NULL pointer for the zDbName parameter.

You can change the key on a database using the sqlite3_rekey_v2() routine:

   int sqlite3_rekey_v2(
      sqlite3 *db,                   /* Database to be rekeyed */
      const char *zDbName,           /* Which ATTACHed database to rekey */
      const void *pKey, int nKey     /* The new key */
   );

A NULL key decrypts the database.

Rekeying requires that every page of the database file be read, decrypted, reencrypted with the new key, then written out again. Consequently, rekeying can take a long time on a larger database.

Most SEE variants allow you to encrypt an existing database that was created using the public domain version of SQLite. This is not possible when using the authenticating version of the encryption extension in see-aes128-ccm.c. If you do encrypt a database that was created with the public domain version of SQLite, no nonce will be used and the file will be vulnerable to a chosen-plaintext attack. If you call sqlite3_key_v2() immediately after sqlite3_open() when you are first creating the database, space will be reserved in the database for a nonce and the encryption will be much stronger. If you do not want to encrypt right away, call sqlite3_key_v2() anyway, with a NULL key, and the space for the nonce will be reserved in the database even though no encryption is done initially.

A public domain version of the SQLite library can read and write an encrypted database with a NULL key. You only need the encryption extension if the key is non-NULL.

6.0 Using the "key" PRAGMA

As an alternative to calling sqlite3_key_v2() to set the decryption key for a database, you can invoke a pragma:

    PRAGMA key='your-secret-key';

You must invoke this pragma before trying to do any other interaction with the database. The key pragma only works with string keys. If you use a binary key, use the hexkey pragma instead:

    PRAGMA hexkey='796f75722d7365637265742d6b6579';
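
When the key is held as raw bytes, the hexkey statement has to be assembled by the application. The helper below is a minimal sketch of one way to do that in C; format_hexkey_pragma() is a hypothetical name and not part of SQLite or SEE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
** Write "PRAGMA hexkey='<hex digits>';" for the nKey bytes at aKey into
** zOut. Returns 0 on success, or -1 if the nOut-byte buffer is too small.
*/
int format_hexkey_pragma(const unsigned char *aKey, size_t nKey,
                         char *zOut, size_t nOut){
  static const char zPrefix[] = "PRAGMA hexkey='";
  static const char zSuffix[] = "';";
  static const char zDigits[] = "0123456789abcdef";
  size_t nNeed = (sizeof(zPrefix)-1) + 2*nKey + sizeof(zSuffix);
  size_t i;
  char *z = zOut;
  assert( aKey!=NULL && zOut!=NULL );
  if( nOut<nNeed ) return -1;
  memcpy(z, zPrefix, sizeof(zPrefix)-1);
  z += sizeof(zPrefix)-1;
  for(i=0; i<nKey; i++){
    *z++ = zDigits[aKey[i]>>4];      /* high nibble */
    *z++ = zDigits[aKey[i]&0x0f];    /* low nibble */
  }
  memcpy(z, zSuffix, sizeof(zSuffix));   /* also copies the final NUL */
  return 0;
}
```

Passing the 15 bytes of "your-secret-key" produces exactly the hexkey statement shown above. The resulting string contains the key, so it should be cleared from memory after it has been executed.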

For the equivalent of the -textkey option, in which the text passphrase is hashed to compute the actual encryption key, use:

    PRAGMA textkey='your-secret-key';

Use the rekey, hexrekey, or textrekey pragmas to change the key. For example, either of the first two statements below changes the key to 'demo2', while the third changes it to the hash of a passphrase:

    PRAGMA rekey='demo2';
    PRAGMA hexrekey='64656d6f32';
    PRAGMA textrekey='long-passphrase';

Through the use of these pragmas, it is never necessary to directly invoke the sqlite3_key_v2() or sqlite3_rekey_v2() interfaces. This means that SEE can be used with language wrappers that do not know about those interfaces.

The "key", "hexkey", and "textkey" PRAGMA statements expect the same key strings as the "-key", "-hexkey", and "-textkey" arguments to the command-line shell, respectively.

The key PRAGMAs will return a string "ok" if they successfully load an encryption key into SEE. If you invoke one of these pragmas on a system that does not support encryption, or if the key loading operation fails for any reason, then nothing is returned. Note that the "ok" string is returned when any key is loaded, not necessarily the correct key. The only way to determine if the key is correct is to try to read from the database file. An incorrect key will result in a read error.

7.0 Using The ATTACH Command

The key for an attached database is specified using the KEY clause at the end of the ATTACH statement, like this:

    ATTACH DATABASE 'file2.db' AS two KEY 'xyzzy';

If the KEY clause is omitted, the same key is used that is currently in use by the main database. If the attached database is not encrypted, specify an empty string as the key. The argument to the KEY keyword can be a BLOB constant. For example:

    ATTACH DATABASE 'file2.db' AS two KEY X'78797a7a79';

A text KEY on an ATTACH statement corresponds to what one would provide to the "-key" option of the command-line shell. A BLOB-type KEY corresponds to the shell's "-hexkey" option. There is no mechanism for specifying a passphrase to be hashed on an ATTACH statement. If you are using a hashed key, you must compute the hash yourself and supply it as a BLOB.

8.0 Key Material

The amount of key material actually used by the encryption extension depends on which variant of SEE you are using. With see-rc4.c, the first 256 bytes of key are used. With the see-aes128-ofb and see-aes128-ccm variants, the first 16 bytes of the key are used. With see-aes256-ofb, the first 32 bytes of key are used.

If you specify a key that is shorter than the maximum key length, then the key material is repeated as many times as necessary to complete the key. If you specify a key that is larger than the maximum key length, then the excess key material is silently ignored.
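
The repeat-and-truncate rule can be sketched in a few lines of C. This illustrates the rule as described here and is not the code SEE uses internally; expand_key() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
** Fill aOut[0..nOut-1] from the nIn bytes at aIn. A short key is repeated
** end to end until the output is full; a long key is truncated.
** Returns 0 on success, or -1 for an empty key (which means "no
** encryption", so there is nothing to expand).
*/
int expand_key(const unsigned char *aIn, size_t nIn,
               unsigned char *aOut, size_t nOut){
  size_t i;
  assert( aOut!=NULL );
  if( aIn==NULL || nIn==0 ) return -1;
  for(i=0; i<nOut; i++){
    aOut[i] = aIn[i % nIn];   /* wrap around to the start of the key */
  }
  return 0;
}
```

With a 16-byte AES128 key, the 6-byte key "secret" expands to "secretsecretsecr". Short keys therefore add no strength beyond their own length, which is one reason to prefer long keys or the -textkey format.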

For the "-textkey" option, up to 256 bytes of the passphrase are hashed using RC4 and the hash value becomes the encryption key. Note that in this context the RC4 algorithm is being used as a hash function, not as a cryptographic function, so the fact that RC4 is a cryptographically weak algorithm is irrelevant.

8.1 Encryption algorithm selection using a key prefix

For the "sqlite3-see.c" SEE variant, the key may begin with a prefix to specify which algorithm to use. The prefix must be exactly one of "rc4:", "aes128:", or "aes256:". The prefix is not used as part of the key sent into the encryption algorithm. So the real key should begin on the first byte after the prefix. Take note of the following important details:

The prefix is case sensitive. "aes256:" is a valid prefix but "AES256:" is not.
If the key prefix is omitted or misspelled, then the encryption algorithm defaults to "aes128" and the misspelled prefix becomes part of the key.
The encryption algorithm can be changed using the sqlite3_rekey_v2() interface or the ".rekey" dot-command of the CLI. For example, to convert a legacy RC4-encrypted database to use AES-256, enter:
```
.rekey rc4:mykey aes256:mykey aes256:mykey
```
The algorithm prefix strings work on the "sqlite3-see.c" variant of SEE only. For any other SEE implementation, any prefix on the key is interpreted as part of the key.
The nKey parameter on sqlite3_key() and sqlite3_key_v2() must include the size of the prefix in addition to the size of the key.
When using PRAGMA hexkey or PRAGMA hexrekey, the key prefix must be hex encoded just like the rest of the key.
```
PRAGMA hexkey='aes128:6d796b6579';         -- Wrong!!
PRAGMA hexkey='6165733132383a6d796b6579';  -- correct
```
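
The prefix rules listed above can be modeled with a short C function. This is a sketch of the documented behavior only, not the logic inside sqlite3-see.c; select_algorithm() and the SeeAlgorithm enum are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels used only by this sketch. */
typedef enum { ALG_RC4, ALG_AES128, ALG_AES256 } SeeAlgorithm;

/*
** Look for a valid algorithm prefix at the start of the nKey bytes at zKey.
** *pnSkip receives the number of prefix bytes to drop before the key is
** handed to the cipher (0 when no valid prefix is present).
*/
SeeAlgorithm select_algorithm(const char *zKey, size_t nKey, size_t *pnSkip){
  static const struct { const char *zPrefix; SeeAlgorithm alg; } aPrefix[] = {
    { "rc4:",    ALG_RC4    },
    { "aes128:", ALG_AES128 },
    { "aes256:", ALG_AES256 },
  };
  size_t i;
  assert( zKey!=NULL && pnSkip!=NULL );
  for(i=0; i<sizeof(aPrefix)/sizeof(aPrefix[0]); i++){
    size_t n = strlen(aPrefix[i].zPrefix);
    /* memcmp() makes the comparison case sensitive. */
    if( nKey>=n && memcmp(zKey, aPrefix[i].zPrefix, n)==0 ){
      *pnSkip = n;
      return aPrefix[i].alg;
    }
  }
  *pnSkip = 0;          /* a missing or misspelled prefix stays in the key */
  return ALG_AES128;    /* documented default */
}
```

Note that nKey counts the prefix: for "aes256:mykey" the caller passes 12, of which 7 bytes are the prefix and 5 are key material.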

9.0 The Importance of a Nonce

The encryption is much more secure if it has a random nonce value on each page of the database. Without a nonce, the encryption can be broken using a chosen-plaintext attack. Purists will argue (rightly) that the encryption is weak without a nonce.

The number of bytes of nonce on each page of the database is determined by byte 20 of the database file. This value is set to zero by default in databases created by the public-domain version of SQLite. You can change this byte to a positive value by running the VACUUM command using an SEE-enabled version of SQLite.

You can check the size of the nonce for a database by using the ".dbinfo" command in an ordinary sqlite3.exe command-line shell program. The output of the ".dbinfo" command will look something like this:

database page size:  4096
write format:        1
read format:         1
reserved bytes:      12    ← Nonce size
file change counter: 3504448735
database page count: 14190
freelist page count: 0
schema cookie:       107
schema format:       4
default cache size:  0
autovacuum top root: 0
incremental vacuum:  0
text encoding:       1 (utf8)
user version:        0
application id:      0
software version:    3008008
number of tables:    53
number of indexes:   53
number of triggers:  0
number of views:     0
schema size:         14257

Bytes 16 through 23 of the database are unencrypted. Thus, you can always check to see how much nonce is being used, even on an encrypted database file, just by looking at byte 20. It is recommended that any product that uses encryption check this byte to make sure it is being set to 4 or 12 or 32 and not 0.
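
Because byte 20 is stored unencrypted, the check can be done with ordinary file I/O and no SQLite calls at all. The following is a minimal sketch in C; read_reserved_bytes() is a hypothetical helper name, and "byte 20" is taken to mean the zero-based file offset 20.

```c
#include <assert.h>
#include <stdio.h>

/*
** Return the value of byte 20 of the file named zPath (the number of
** reserved bytes per page, which SEE uses for the nonce), or -1 if the
** file cannot be opened or is shorter than 21 bytes.
*/
int read_reserved_bytes(const char *zPath){
  FILE *f;
  int c = -1;
  assert( zPath!=NULL );
  f = fopen(zPath, "rb");
  if( f==NULL ) return -1;
  if( fseek(f, 20L, SEEK_SET)==0 ){
    c = fgetc(f);
    if( c==EOF ) c = -1;    /* file too short or read error */
  }
  fclose(f);
  return c;
}
```

A product could call such a helper on each encrypted database it manages and raise an alarm when the result is 0.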

The nonce size may be increased, but not decreased, with one of the following approaches:

From the command-line shell:

    .filectrl reserve_bytes 32
    VACUUM;

Using the C API:

    int nNonce = 32;
    sqlite3_file_control(db, "main", SQLITE_FCNTL_RESERVE_BYTES, &nNonce);
    sqlite3_exec(db, "VACUUM;", 0, 0, 0);

10.0 Security Checklist

When using SEE in an application, it is recommended that you double-check that everything is implemented correctly, and that you are getting strong encryption, by performing the following tests, at a minimum:

Use the SEE-enabled CLI to run the "sqlite3 $DATABASE .dbinfo" command (adding an appropriate -key, -hexkey, or -textkey argument) and verify that your encrypted database files contain a nonce. The nonce should be at least 12 bytes.

Use the SEE-enabled CLI to read an encrypted database, but alter only the last character of the supplied key. Verify that even this minor change to the end of the key renders the database unreadable. The error message should be "file is not a database". Repeat this test with multiple variations of the key. Confirm that the database is only accessible if the key is exactly correct.

Try to compress an encrypted database file and verify that the file is incompressible. In other words, run a program like "zip" or "gzip" against the encrypted database and verify that compression does not shrink the file by more than a few bytes.

Limitations

TEMP tables are not encrypted.

In-memory (":memory:") databases are not encrypted.

Bytes 16 through 23 of the database file contain header information which is not encrypted.

11.0 How SEE Works

Each page is encrypted separately. The key to encryption is a combination of the page number, the random nonce (if any) and the database key. The data is encrypted in both the main database and in the rollback journal or WAL file but is unencrypted when held in memory. This means that if an adversary is able to view the memory used by your program, she will be able to see unencrypted data.

The nonce value is changed by a rollback.

The see-aes128-ccm.c variant uses AES in CCM mode with a 16-byte randomly chosen nonce on each page and a 16-byte message authentication code (MAC). Thus with see-aes128-ccm.c, 32 bytes of every database page are taken up by encryption and authentication overhead. Consequently, database files created using see-aes128-ccm.c may be a little larger. Also, because the MAC is computed whenever a page is modified, and verified when a page is read, see-aes128-ccm.c will often be a little slower. Such is the cost of authentication.