Documentation Source Text

Check-in [b37cb6bc60]

Overview
Comment: Add a reference to the Jim Gray paper to the faster-than-filesystem article.
SHA3-256: b37cb6bc6053501d116fcdf1022b35510c48b6b61625d784857a3ee0fbad3c57
User & Date: drh 2017-06-17 13:51:31
Context
2017-06-26
14:43  Correction to the schema for the DBSTAT virtual table. check-in: a30aad4d5c user: drh tags: trunk
2017-06-17
13:51  Add a reference to the Jim Gray paper to the faster-than-filesystem article. check-in: b37cb6bc60 user: drh tags: trunk
10:14  Add change log, news, and chronology entries for the 3.18.2 backpatch release. check-in: d7d183be5b user: drh tags: trunk
Changes

Changes to pages/fasterthanfs.in.

    55     55   <p>
    56     56   So let your take-away be this: read/write latency for
    57     57   SQLite is competitive with read/write latency of individual files on
    58     58   disk.  Often SQLite is faster.  Sometimes SQLite is almost
    59     59   as fast.  Either way, this article disproves the common
    60     60   assumption that a relational database must be slower than direct
    61     61   filesystem I/O.
           62  +
           63  +<h2>Related Studies</h2>
           64  +
           65  +<p>
           66  +[https://www.microsoft.com/en-us/research/people/gray/|Jim Gray]
           67  +and others studied the read performance of BLOBs
           68  +versus file I/O for Microsoft SQL Server and found that reading BLOBs
           69  +out of the
           70  +database was faster for BLOB sizes below a threshold somewhere between 250KiB and 1MiB.
           71  +([https://www.microsoft.com/en-us/research/publication/to-blob-or-not-to-blob-large-object-storage-in-a-database-or-a-filesystem/|Paper]).
           72  +In that study, the database still stores the filename of the content even
           73  +if the content is held in a separate file.  So the database is consulted
           74  +for every BLOB, even if it is only to extract the filename.  In this
           75  +article, the key for the BLOB is the filename, so no preliminary database
           76  +access is required.  Because the database is never used at all when
           77  +reading content from individual files in this article, the threshold
           78  +at which direct file I/O becomes faster is smaller than it is in Gray's
           79  +paper.
           80  +
           81  +<p>
           82  +The [Internal Versus External BLOBs] article on this website is an
           83  +earlier investigation (circa 2011) that uses the same approach as the
           84  +Jim Gray paper &mdash; storing the blob filenames as entries in the
           85  +database &mdash; but for SQLite instead of SQL Server.
           86  +
           87  +
    62     88   
    63     89   <h1>How These Measurements Are Made</h1>
    64     90   
    65     91   <p>I/O performance is measured using the
    66     92   [https://www.sqlite.org/src/file/test/kvtest.c|kvtest.c] program
    67     93   from the SQLite source tree.
    68     94   To compile this test program, first gather the kvtest.c source file