Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Add the "35% Faster Than The Filesystem" document. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
fd906f77ce6e45e84f12d9e0ca7f6522 |
User & Date: | drh 2017-05-04 19:44:53.501 |
Context
2017-05-04
| ||
20:14 | Fix typos and make improvements to the 35% faster document. (check-in: 4764f16dc0 user: drh tags: trunk) | |
19:44 | Add the "35% Faster Than The Filesystem" document. (check-in: fd906f77ce user: drh tags: trunk) | |
2017-05-02
| ||
18:46 | Update the change log and the speed-and-size graph. (check-in: 03fde5ae27 user: drh tags: trunk) | |
Changes
Changes to pages/docsdata.tcl.
︙ | ︙ | |||
277 278 279 280 281 282 283 284 285 286 287 288 289 290 | doc {SQLite As An Application File Format} {appfileformat.html} { This article advocates using SQLite as an application file format in place of XML or JSON or a "pile-of-file". } doc {Well Known Users} {famous.html} { This page lists a small subset of the many thousands of devices and application programs that make use of SQLite. } ############################################################################### heading {Technical and Design Documentation} technical { These documents are oriented toward describing the internal implementation details and operation of SQLite. | > > > > > | 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 | doc {SQLite As An Application File Format} {appfileformat.html} { This article advocates using SQLite as an application file format in place of XML or JSON or a "pile-of-file". } doc {Well Known Users} {famous.html} { This page lists a small subset of the many thousands of devices and application programs that make use of SQLite. } doc {35% Faster Than The Filesystem} {fasterthanfs.html} { This article points out that reading blobs out of an SQLite database is often faster than reading the same blobs from individual files in the filesystem. } ############################################################################### heading {Technical and Design Documentation} technical { These documents are oriented toward describing the internal implementation details and operation of SQLite. |
︙ | ︙ |
Added pages/fasterthanfs.in.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 | <title>35% Faster Than The Filesystem</title> <tcl>hd_keywords {faster than the filesystem}</tcl> <table_of_contents> <h1>Summary</h1> <p>10 kilobyte blobs (for example, thumbnail images) can be read out of an SQLite database about 35% faster than they can be read from individual files on disk. <p>Furthermore, a single SQLite database holding many thousands of thumbnail-sized blobs uses about 20% less disk space than storing the blobs in individual files. <p>The performance difference arises (we believe) because when reading from an SQLite database, the open() and close() system calls only need to be invoked once, whereas open() and close() must be invoked once for each blob when reading the blobs from individual files. In other words, the overhead of calling open() and close() is greater than the overhead of using the database. The size reduction arises from the fact that individual files are padded out to the next multiple of the filesystem block size, whereas the blobs are packed more tightly into an SQLite database. <h1>How Performance Was Measuremented</h1> <p>The performance comparison is accomplished using the [https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] program found in the SQLite source tree. To compile the test program, first gather the kvtest.c source file into a directory with the [amalgamation|SQLite amalgamation] source files "sqlite3.c" and "sqlite3.h". Then on unix, run a command like the following: <codeblock> gcc -Os -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c \ -o kvtest -ldl -lpthread </codeblock> <p> Generate a test database with 100,000 random blobs, each 10,000 bytes in size using a command like this: <codeblock> ./kvtest init test1.db --count 100k --size 10k </codeblock> <p> Next, make copies of all the blobs into individual files in a directory using commands like this: <codeblock> mkdir test1.dir ./kvtest export test1.db test1.dir </codeblock> <p> Measure the performance for reading blobs from the database and from individual files using these commands: <codeblock> ./kvtest run test1.db --count 100k --blob-api ./kvtest run test1.dir --count 100k </codeblock> <p> Depending on your platform, you should see that reads from the test1.db database file are about 35% faster than reads from individual files in the test1.dir folder. <h2>Variations</h2> <p>The [-DSQLITE_DIRECT_OVERFLOW_READ] compile-time option causes SQLite to bypass its page cache when reading content from overflow pages. This helps database reads of 10K blobs run a little faster, but not all that much faster. SQLite still holds a speed advantage over direct filesystem reads without the SQLITE_DIRECT_OVERFLOW_READ compile-time option. <p>Other compile-time options such as using -O3 instead of -Os or using [-DSQLITE_THREADSAFE=0] and/or some of the other [recommended compile-time options] might help SQLite to run even faster relative to direct filesystem reads. <p>When constructing the test data, trying varying the size of the blob. The performance advantage will shift toward direct filesystem reads as the size of blobs increase, since the cost of invoking open() and close() will be amortized over more bytes transferred using read(). The break-even point, the point where it becomes faster to read directly from the filesystem, will vary from one system to another. <p>The --blob-api option causes database reads to occur using the [sqlite3_blob_open()], [sqlite3_blob_reopen()], and [sqlite3_blob_read()] interfaces instead of using SQL statements. Without the --blob-api option, a separate SQL statement is run for reading each blob and the performance of reading from the database is approximately the same as the performance from reading directly from files. This is still a significant finding, however, since few people would expect an SQL database to run as fast as a direct read from a file (much less faster) and yet it does. <p>If the --random option is added to the "run" command, that causes the blobs to be read in a random order. This causes a noticable decrease in the performance of database reads, since it forces more movement of B-Tree cursors. When --random is used and --blob-api is omitted, reading directly from files on disk is generally a little faster, but reads from the database are still competitive. <h1>Other Considerations</h1> <p>Other SQL database engines advise applications to store blobs in separate files and then store the filename in the database. In that case, where the database must first be consulted to find the filename before opening and reading the file, simply storing the entire blob in the database is gives much faster read performance with SQLite. See the [Internal Versus External BLOBs] article for more information. <p>This report only looks at the performance of reads, not writes. Because SQLite implements [atomic commit|power-safe ACID transactions] we expect that write performance into SQLite will be slower than writing directly to individual files. However, if ACID transactions are disabled via [PRAGMA journal_mode|PRAGMA journal_mode=OFF] (thus putting SQLite on equal footing with the filesystem) and the [sqlite3_blob_write()] interface is used, SQLite might well be competitive or even faster than writes to separate files on disk. That is an experiment we have not yet run. |