Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Add a reference to the Jim Gray paper to the faster-than-filesystem article. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
b37cb6bc6053501d116fcdf1022b3551 |
User & Date: | drh 2017-06-17 13:51:31.423 |
Context
2017-06-26
| ||
14:43 | Correction to the schema for the DBSTAT virtual table. (check-in: a30aad4d5c user: drh tags: trunk) | |
2017-06-17
| ||
13:51 | Add a reference to the Jim Gray paper to the faster-than-filesystem article. (check-in: b37cb6bc60 user: drh tags: trunk) | |
10:14 | Add change log, news, and chronology entries for the 3.18.2 backpatch release. (check-in: d7d183be5b user: drh tags: trunk) | |
Changes
Changes to pages/fasterthanfs.in.
︙ | |||
55 56 57 58 59 60 61 62 63 64 65 66 67 68 | 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 | + + + + + + + + + + + + + + + + + + + + + + + + + + | <p> So let your take-away be this: read/write latency for SQLite is competitive with read/write latency of individual files on disk. Often SQLite is faster. Sometimes SQLite is almost as fast. Either way, this article disproves the common assumption that a relational database must be slower than direct filesystem I/O. <h2>Related Studies</h2> <p> [https://www.microsoft.com/en-us/research/people/gray/|Jim Gray] and others studied the read performance of BLOBs versus file I/O for Microsoft SQL Server and found that reading BLOBs out of the database was faster for BLOB sizes less than between 250KiB and 1MiB. ([https://www.microsoft.com/en-us/research/publication/to-blob-or-not-to-blob-large-object-storage-in-a-database-or-a-filesystem/|Paper]). In that study, the database still stores the filename of the content even if the content is held in a separate file. So the database is consulted for every BLOB, even if it is only to extract the filename. In this article, the key for the BLOB is the filename, so no preliminary database access is required. Because the database is never used at all when reading content from individual files in this article, the threshold at which direct file I/O becomes faster is smaller than it is in Gray's paper. <p> The [Internal Versus External BLOBs] article on this website is an earlier investigation (circa 2011) that uses the same approach as the Jim Gray paper — storing the blob filenames as entries in the database &mdash but for SQLite instead of SQL Server. <h1>How These Measurements Are Made</h1> <p>I/O performance is measured using the [https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] program from the SQLite source tree. To compile this test program, first gather the kvtest.c source file |
︙ |