Overview
Comment: | Improvements to the application file format document. |
---|---|
SHA1: |
9e12f0ceddbc3af91e56f4213084db6b |
User & Date: | drh 2014-03-13 15:38:15.208 |
Context
2014-03-14
| ||
16:35 | Further tuning of the application file format document. (check-in: 1b422ce8de user: drh tags: trunk) | |
2014-03-13
| ||
15:38 | Improvements to the application file format document. (check-in: 9e12f0cedd user: drh tags: trunk) | |
00:43 | First complete draft of the new application file format document. Integrate with the rest of the documentation via hyperlinks. (check-in: 6d257b8d92 user: drh tags: trunk) | |
Changes
Changes to pages/about.in.
︙ | ︙ | |||
49 50 51 52 53 54 55 | files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file. The database file format is cross-platform - you can freely copy a database between 32-bit and 64-bit systems or between [http://en.wikipedia.org/wiki/Endianness | big-endian] and [http://en.wikipedia.org/wiki/Endianness | little-endian] architectures. These features make SQLite a popular choice as | | | 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 | files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file. The database file format is cross-platform - you can freely copy a database between 32-bit and 64-bit systems or between [http://en.wikipedia.org/wiki/Endianness | big-endian] and [http://en.wikipedia.org/wiki/Endianness | little-endian] architectures. These features make SQLite a popular choice as an [Application File Format]. Think of SQLite not as a replacement for [http://www.oracle.com/database/index.html|Oracle] but as a replacement for [http://man.he.net/man3/fopen|fopen()]</p> <p>SQLite is a compact library. With all features enabled, the [library size] can be less than 500KiB, depending on the target platform and compiler optimization settings. |
︙ | ︙ |
Changes to pages/appfileformat.in.
|
| | > | | | < | | | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | <tcl>hd_keywords *appformat {application file-format} \ {Application File Format}</tcl> <title>SQLite As An Application File Format</title> <h1 align="center"> SQLite As An Application File Format </h1> <h2>Executive Summary</h2> <p>An SQLite database file with a defined schema often makes an excellent application file format. Here are a dozen reasons why this is so: <ol> <li> Simplified Application Development <li> Single-File Documents <li> High-Level Query Language <li> Accessible Content <li> Cross-Platform <li> Atomic Transactions <li> Incremental And Continuous Updates <li> Easily Extensible <li> Performance <li> Concurrent Use By Multiple Processes <li> Multiple Programming Languages <li> Better Applications </ol> <p>Each of these points will be described in more detail below, after first considering more closely what this article means by "application file format". <h2>What Is An Application File Format?</h2> <p> An "application file format" is the file format used to persist application state to disk or to exchange information between programs. |
︙ | ︙ | |||
48 49 50 51 52 53 54 | <li>GIT - Git source code repository <li>EPUB - The Electronic Publication format used by non-Kindle eBooks <li>ODT - The Open Document format used by OpenOffice and others <li>PPT - Microsoft PowerPoint presentations <li>ODP - The Open Document presentation format used by OpenOffice and others </ul> | > > > > > > > > > > > > > > > | | | | < | | > | | | | | | > | > > > > > > > > > | > > | > > > > | > > > > | | 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 | <li>GIT - Git source code repository <li>EPUB - The Electronic Publication format used by non-Kindle eBooks <li>ODT - The Open Document format used by OpenOffice and others <li>PPT - Microsoft PowerPoint presentations <li>ODP - The Open Document presentation format used by OpenOffice and others </ul> <p>We make a distinction between a "file format" and an "application format". A file format is used to store a single object. So, for example, a GIF or JPEG file stores a single image, and an XHTML file stores text, so those are "file formats" and not "application formats". An EPUB file, in contrast, stores both text and images (as contained XHTML and GIF/JPEG files) and so it is considered an "application format". This article is about "application formats". <p>The boundary between a file format and an application format is fuzzy. This article calls JPEG a file format, but for an image editor, JPEG might be considered the application format. Much depends on context. 
For this article, let us say that a file format stores a single object and an application format stores many different objects and their relationships to one another. <p>Most application formats fit into one of these three categories: <ol> <li><p><b>Fully Custom Formats.</b> Custom formats are specifically designed for a single application. DOC, DWG, PDF, XLS, and PPT are examples of custom formats. Custom formats are usually contained within a single file, for ease of transport. They are also usually binary, though the DWG format is a notable exception. Custom file formats require specialized application code to read and write and are not normally accessible from commonly available tools such as unix command-line programs and text editors. In other words, custom formats are usually "opaque blobs". To access the content of a custom application file format, one needs a tool specifically engineered to read and/or write that format. <li><p><b>Pile-of-Files Formats.</b> Sometimes the application state is stored as a hierarchy of files. Git is a prime example of this, though the phenomenon occurs frequently in one-off and bespoke applications. A pile-of-files format essentially uses the filesystem as a key/value database, storing small chunks of information into separate files. This gives the advantage of making the content more accessible to common utility programs such as text editors or "awk" or "grep". But even if many of the files in a pile-of-files format are easily readable, there are usually some files that have their own custom format (example: Git "Packfiles") and are hence "opaque blobs" that are not readable or writable without specialized tools. It is also much less convenient to move a pile-of-files from one place or machine to another, than it is to move a single file. And it is hard to make a pile-of-files document into an email attachment, for example. 
Finally, a pile-of-files format breaks the "document metaphor": there is no one file that a user can point to that is "the document". <li><p><b>Wrapped Pile-of-Files Formats.</b> Some applications use a Pile-of-Files that is then encapsulated into some kind of single-file container, usually a ZIP archive. EPUB, ODT, and ODP are examples of this approach. An EPUB book is really just a ZIP archive that contains various XHTML files for the text of book chapters, GIF and JPEG images for the artwork, and a specialized catalog file that tells the eBook reader how all the XML and image files fit together. OpenOffice documents (ODT and ODP) are also ZIP archives containing XML and images that represent their content as well as "catalog" files that show the interrelationships between the component parts. <p>A wrapped pile-of-files format is a compromise between a fully custom file format and a pure pile-of-files format. A wrapped pile-of-files format is not an opaque blob in the same sense as a custom file format, since the component parts can still be accessed using any common ZIP archiver, but the format is not quite as accessible as a pure pile-of-files format because one does still need the ZIP archiver, and one cannot normally use command-line tools like "find" on the file hierarchy without first un-zipping it. On the other hand, a wrapped pile-of-files format does preserve the document metaphor by putting all content into a single disk file. And because it is compressed, the wrapped pile-of-files format tends to be more compact. <p>As with custom file formats, and unlike pure pile-of-files formats, a wrapped pile-of-files format is not as easy to edit, since one must normally rewrite the entire file to change any component part. </ol> <p>The purpose of this document is to argue in favor of a fourth category of application file format: An SQLite database file. 
<h2>SQLite As The Application File Format</h2> <p> An SQLite database file makes an excellent alternative to a custom or pile-of-files application format. In its simplest form, an SQLite database with a single key/value table like <blockquote><pre> CREATE TABLE files(filename TEXT PRIMARY KEY, content BLOB); </pre></blockquote> could serve as a direct replacement for a wrapped pile-of-files format. If the content is compressed, then such an SQLite database is only slightly larger than an equivalent ZIP archive, and it has the advantage of being able to write individual "files" without having to rewrite the entire document. <p> But an SQLite database is not limited to a simple key/value structure like a pile-of-files database. An SQLite database can have dozens or hundreds or thousands of different tables, with dozens or hundreds or thousands of fields per table, each with different datatypes and particular meanings, all cross-referencing each other, and all stored efficiently and compactly in a single disk file. <p> Compared to other approaches, the use of an SQLite database as an application file format has compelling advantages: </p> <ol> <li><p><b>Simplified Application Development.</b> No code is needed for reading or writing the application file. One has merely to link against the SQLite library, or include the [amalgamation | single "sqlite3.c" source file] with the rest of the application C code, and SQLite will take care of all of the application file I/O. This can reduce application code size by many thousands of lines, with corresponding savings in development and maintenance costs. |
︙ | ︙ | |||
227 228 229 230 231 232 233 | to verify that no repository history has been lost prior to each change to the repository. <li><p><b>Incremental And Continuous Updates.</b> When writing to an SQLite database file, only those parts of the file that actually change are written out to disk. This makes the writing happen faster and saves wear on SSDs. This is an enormous advantage over custom | | | 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 | to verify that no repository history has been lost prior to each change to the repository. <li><p><b>Incremental And Continuous Updates.</b> When writing to an SQLite database file, only those parts of the file that actually change are written out to disk. This makes the writing happen faster and saves wear on SSDs. This is an enormous advantage over custom and wrapped pile-of-files formats, both of which must completely rewrite the entire document in order to change a single byte. Pure pile-of-files formats can also do incremental updates to some extent, though the granularity of writes is usually larger with pile-of-files formats (a single file) than with SQLite (a single page). <p>A desktop application built on SQLite can also do continuous updates. |
︙ | ︙ | |||
368 369 370 371 372 373 374 | </ol> <h2>Conclusion</h2> <p> SQLite is not the perfect application file format for every situation. But in many cases, SQLite is a far better choice than either a custom | | | 403 404 405 406 407 408 409 410 411 412 413 414 | </ol> <h2>Conclusion</h2> <p> SQLite is not the perfect application file format for every situation. But in many cases, SQLite is a far better choice than either a custom file format, a pile-of-files, or a wrapped pile-of-files. SQLite is a high-level, stable, reliable, cross-platform, widely-deployed, extensible, performant, accessible, concurrent file format. It deserves your consideration as the standard file format on your next application design. |
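The single key/value table quoted in the appfileformat.in changes (CREATE TABLE files(filename TEXT PRIMARY KEY, content BLOB)) can be exercised with a short sketch. This is only an illustration, assuming Python's standard sqlite3 and zlib modules. The helper names (open_document, write_file, read_file) and the sample filenames are hypothetical and do not come from the document; only the table definition does. The sketch shows one compressed "file" being replaced without touching the other entries.

```python
import sqlite3
import zlib

# The table definition is taken from the document; IF NOT EXISTS is added
# here (an assumption) so the helper can reopen an existing document.
SCHEMA = "CREATE TABLE IF NOT EXISTS files(filename TEXT PRIMARY KEY, content BLOB)"


def open_document(path):
    """Open (or create) a document and make sure the files table exists."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db


def write_file(db, filename, data):
    """Insert or replace a single compressed entry inside one transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT OR REPLACE INTO files(filename, content) VALUES (?, ?)",
            (filename, zlib.compress(data)),
        )


def read_file(db, filename):
    """Return the decompressed content, or None if the entry is absent."""
    row = db.execute(
        "SELECT content FROM files WHERE filename = ?", (filename,)
    ).fetchone()
    return None if row is None else zlib.decompress(row[0])


# Hypothetical usage: an in-memory database stands in for a document file.
db = open_document(":memory:")
write_file(db, "chapter1.xhtml", b"<p>Hello</p>")
write_file(db, "cover.jpg", b"\xff\xd8 placeholder image bytes")
write_file(db, "chapter1.xhtml", b"<p>Hello, revised</p>")  # rewrite one entry only
names = [r[0] for r in db.execute("SELECT filename FROM files ORDER BY filename")]
```

Passing a real path instead of ":memory:" would produce a single disk file that holds every entry, which is the wrapped pile-of-files replacement the text describes. Whether compression belongs in the application layer, as assumed here, is a design choice the document leaves open.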