Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Refinements to the faster-than-filesystem article based on peer review. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
e5a09f2ea27eae5e572c0bbe752ff77a |
User & Date: | drh 2017-06-08 19:31:53.189 |
Context
2017-06-09
| ||
12:23 | Clarify the rules for referencing tables in triggers. (check-in: e0f55595d4 user: drh tags: trunk) | |
2017-06-08
| ||
19:31 | Refinements to the faster-than-filesystem article based on peer review. (check-in: e5a09f2ea2 user: drh tags: trunk) | |
14:34 | Version 3.19.3 (check-in: 9a562cf207 user: drh tags: trunk, release, version-3.19.3) | |
Changes
Changes to pages/fasterthanfs.in.
1 2 3 4 5 6 7 8 | <title>35% Faster Than The Filesystem</title> <tcl>hd_keywords {faster than the filesystem} \ {35% Faster Than The Filesystem}</tcl> <table_of_contents> <h1>Summary</h1> | < < < < < | < | | > > > > > > > > | > | > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 | <title>35% Faster Than The Filesystem</title> <tcl>hd_keywords {faster than the filesystem} \ {35% Faster Than The Filesystem}</tcl> <table_of_contents> <h1>Summary</h1> <p>SQLite reads small blobs (for example, thumbnail images) <a href="#approx">35% faster¹</a> than the same blobs can be read from or written to individual files on disk using fread() or fwrite(). <p>Furthermore, a single SQLite database holding 10-kilobyte blobs uses about 20% less disk space than storing the blobs in individual files. <p>The performance difference arises (we believe) because when working from an SQLite database, the open() and close() system calls are invoked only once, whereas open() and close() are invoked once for each blob when using blobs stored in individual files. It appears that the overhead of calling open() and close() is greater than the overhead of using the database. The size reduction arises from the fact that individual files are padded out to the next multiple of the filesystem block size, whereas the blobs are packed more tightly into an SQLite database. <p> The measurements in this article were made during the week of 2017-06-05 using a version of SQLite in between 3.19.2 and 3.20.0. You may expect future versions of SQLite to perform even better. <h2>Caveats</h2> <a name="approx"></a> <p> ¹The 35% figure above is approximate. Actual timings vary depending on hardware, operating system, and the details of the experiment, and due to random performance fluctuations on real-world hardware. See the text below for more detail. Try the experiments yourself. Report significant deviations to the [mailing lists]. </p> <p> The 35% figure is based on running tests on every machine that the author has easily at hand. Some reviewers of this article report that SQLite has higher latency than direct I/O on their systems. We do not yet understand the difference. We also see indications that SQLite does not perform as well as direct I/O when experiments are run using a cold filesystem cache. <p> So let your take-away be this: read/write latency for SQLite is competitive with read/write latency of individual files on disk. Often SQLite is faster. Sometimes SQLite is almost as fast. Either way, this article disproves the common assumption that a relational database must be slower than direct filesystem I/O. <h1>How These Measurements Are Made</h1> <p>I/O performance is measured using the [https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] program from the SQLite source tree. To compile this test program, first gather the kvtest.c source file |
︙ | ︙ | |||
58 59 60 61 62 63 64 65 66 | <p>Or on Windows with MSVC: <codeblock> cl -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c </codeblock> <p> Use the resulting "kvtest" program to | > > > | > | 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 | <p>Or on Windows with MSVC: <codeblock> cl -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c </codeblock> <p>Instructions for compiling for Android are <a href="#compile-android">shown below</a>. <p> Use the resulting "kvtest" program to generate a test database with 100,000 random uncompressible blobs, each with a random size between 8,000 and 12,000 bytes using a command like this: <codeblock> ./kvtest init test1.db --count 100k --size 10k --variance 2k </codeblock> |
︙ | ︙ | |||
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 | [power-safe transactions] whereas the direct-to-disk writing is not. To put the tests on a more equal footing, add either the --nosync option to the SQLite writes to disable calling fsync() or FlushFileBuffers() to force content to disk, or using the --fsync option for the direct-to-disk tests to force them to invoke fsync() or FlushFileBuffers() when updating disk files. <p> There are many other testing options, which can be seen by running the command: <codeblock> ./kvtest help </codeblock> <h2>Read Performance Measurements</h2> <p>The chart below shows data collected using [https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] on five different | > > > > > > > > > > | > > > > > > > > > > > | | | | > > > > | | | > | | > | | | > | | 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 | [power-safe transactions] whereas the direct-to-disk writing is not. To put the tests on a more equal footing, add either the --nosync option to the SQLite writes to disable calling fsync() or FlushFileBuffers() to force content to disk, or using the --fsync option for the direct-to-disk tests to force them to invoke fsync() or FlushFileBuffers() when updating disk files. <p> By default, kvtest runs the database I/O measurements all within a single transaction. Use the --multitrans option to run each blob read or write in a separate transaction. The --multitrans option makes SQLite much slower, and uncompetitive with direct disk I/O. This option proves, yet again, that to get the most performance out of SQLite, you should group as much database interaction as possible within a single transaction. <p> There are many other testing options, which can be seen by running the command: <codeblock> ./kvtest help </codeblock> <h2>Read Performance Measurements</h2> <p>The chart below shows data collected using [https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] on five different systems: <ul> <li><b>Win7</b>: An circa-2009 Dell Inspiron laptop, Pentium dual-core at 2.30GHz, 4GiB RAM, Windows7. <li><b>Win10</b>: A 2016 Lenovo YOGA 910, Intel i7-7500 at 2.70GHz, 16GiB RAM, Windows10. <li><b>Mac</b>: A 2015 MacBook Pro, 3.1GHz intel Core i7, 16GiB RAM, MacOS 10.12.5 <li><b>Ubuntu</b>: Desktop built from Intel i7-4770K at 3.50GHz, 32GiB RAM, Ubuntu 16.04.2 LTS <li><b>Android</b>: Galaxy S3, ARMv7, 2GiB RAM </ul> <p>All machines use SSD except Win7 which has a hard-drive. The test database is 100K blobs with sizes uniformly distributed between 8K and 12K, for a total of about 1 gigabyte of content. The database page size is 4KiB. The -DSQLITE_DIRECT_OVERFLOW_READ compile-time option was used for all of these tests. Tests were run multiple times. The first run was used to warm up the cache and its timings were discarded. <p> The chart below shows average time to read a blob from the SQLite database, relative to the time needed to read the same blob from a file in the filesystem. Smaller is better. The actual timings vary considerably from one system to another (the Ubuntu desktop is much faster than the Galaxy S3 phone, for example). This chart shows the ratio of the times needed to read blobs from SQLite divided by the times needed to read the same blob directly from disk. The left-most column in the chart is the time needed to read from disk divided by itself, and thus always has a value of 1.00. That column is included for visual reference only. <p> In this chart, an SQL statement ("SELECT v FROM kv WHERE k=?1") is prepared once. Then for each blob, the blob key value is bound to the ?1 parameter and the statement is evaluated to extract the blob content. <center> <div class='imgcontainer'> <img src="images/faster-read-sql.jpg"> </div> <br> Chart 1: SQLite read latency relative to direct filesystem reads.<br> 100K blobs, avg 10KB each, random order using SQL </center> <p> The performance can be improved slightly by bypassing the SQL layer and reading the blob content directly using the [sqlite3_blob_read()] interface, as shown in the next chart: <center> <div class='imgcontainer'> <img src="images/faster-read-blobapi.jpg"> </div> <br> Chart 2: SQLite read latency relative to direct filesystem reads.<br> 100K blobs, avg size 10KB, random order<br> using sqlite3_blob_read(). </center> <p> Further performance improves can be made by using the [memory-mapped I/O] feature of SQLite. In the next chart, the entire 1GB database file is memory mapped and blobs are read (in random order) using the [sqlite3_blob_read()] interface: <center> <div class='imgcontainer'> <img src="images/faster-read-mmap.jpg"> </div> <br> Chart 3: SQLite read latency relative to direct filesystem reads.<br> 100K blobs, avg size 10KB, random order<br> using sqlite3_blob_read() from a memory-mapped database. </center> <p> The third chart shows that reading blob content out of SQLite can be twice as fast as reading from individual files on disk for Mac and Android, and an amazing ten times faster for Windows. |
︙ | ︙ | |||
285 286 287 288 289 290 291 | setting is NORMAL instead of FULL, the transactions are not durable. <center> <div class='imgcontainer'> <img src="images/faster-write-safe.jpg"> </div> <br> | > | | | | 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 | setting is NORMAL instead of FULL, the transactions are not durable. <center> <div class='imgcontainer'> <img src="images/faster-write-safe.jpg"> </div> <br> Chart 4: SQLite write latency relative to direct filesystem writes.<br> 10K blobs, avg size 10KB, random order,<br> WAL mode with synchronous NORMAL,<br> exclusive of checkpoint time </center> <p> The android performance numbers for the write experiments are omitted because the performance tests on the Galaxy S3 are so random. Two consecutive runs of the exact same experiment would give wildly different times. And, to be fair, the performance of SQLite on android is slightly |
︙ | ︙ | |||
309 310 311 312 313 314 315 | data prone to corruption due to system crashes and power failures. <center> <div class='imgcontainer'> <img src="images/faster-write-unsafe.jpg"> </div> <br> | > | | | 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 | data prone to corruption due to system crashes and power failures. <center> <div class='imgcontainer'> <img src="images/faster-write-unsafe.jpg"> </div> <br> Chart 5: SQLite write latency relative to direct filesystem writes.<br> 10K blobs, avg size 10KB, random order,<br> journaling disabled, synchronous OFF. </center> <p> In all of the write tests, it is important to disable anti-virus software prior to running the direct-to-disk performance tests. We found that anti-virus software slows down direct-to-disk by an order of magnitude whereas it impacts SQLite writes very little. This is probably due to the |
︙ | ︙ | |||
350 351 352 353 354 355 356 | <ol type="A"> <li> <p>SQLite is competitive with, and usually faster than, blobs stored in separate files on disk, for both reading and writing. <li> | > > > > > > | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 | <ol type="A"> <li> <p>SQLite is competitive with, and usually faster than, blobs stored in separate files on disk, for both reading and writing. <li> <p>SQLite is much faster than direct writes to disk on Windows when anti-virus protection is turned on. Since anti-virus software is and should be on by default in Windows, that means that SQLite is generally much faster than direct disk writes on Windows. <li> <p>Reading is about an order of magnitude faster than writing, for all systems and for both SQLite and direct-to-disk I/O. <li> <p>I/O performance varies widely depending on operating system and hardware. Make your own measurements before drawing conclusions. <li> <p>Some other SQL database engines advise developers to store blobs in separate files and then store the filename in the database. In that case, where the database must first be consulted to find the filename before opening and reading the file, simply storing the entire blob in the database gives much faster read and write performance with SQLite. See the [Internal Versus External BLOBs] article for more information. </ol> <h1>Additional Notes</h1> <a name="compile-android"></a> <h2>Compiling And Testing on Android</h2> <p> The kvtest program is compiled and run on Android as follows. First install the Android SDK and NDK. Then prepare a script named "android-gcc" that looks approximately like this: <codeblock> #!/bin/sh # NDK=/home/drh/Android/Sdk/ndk-bundle SYSROOT=$NDK/platforms/android-16/arch-arm ABIN=$NDK/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin GCC=$ABIN/arm-linux-androideabi-gcc $GCC --sysroot=$SYSROOT -fPIC -pie $* </codeblock> <p>Make that script executable and put it on your $PATH. Then compile the kvtest program as follows: <codeblock> android-gcc -Os -I. kvtest.c sqlite3.c -o kvtest-android </codeblock> <p>Next, move the resulting kvtest-android executable to the Android device: <codeblock> adb push kvtest-android /data/local/tmp </codeblock> <p>Finally use "adb shell" to get a shell prompt on the Android device, cd into the /data/local/tmp directory, and begin running the tests as with any other unix host. |