Documentation Source Text

Check-in [89222751c6]
Overview
Comment:Further incremental improvements to fileio.in.
SHA1: 89222751c6e4f210a210fda2cdac6884545063a9
User & Date: dan 2008-07-31 16:11:50.000
Context
2008-07-31
17:13
Begin adding the document on memory allocation. Update the index and changes documents for the release of version 3.6.1. (check-in: 8f269144c3 user: drh tags: trunk)
16:11
Further incremental improvements to fileio.in. (check-in: 89222751c6 user: dan tags: trunk)
10:57
Add a "document structure" section to fileio.in. (check-in: 92809645e2 user: dan tags: trunk)
Changes
Changes to pages/fileio.in.

<tcl>
proc hd_assumption {id text} {
  hd_requirement $id $text
}

proc process {text} {
  set zOut ""
  set zSpecial ""

  foreach zLine [split $text "\n"] {

    switch -regexp $zLine {
      {^ *REQ *[^ ][^ ]* *$} {
        regexp { *REQ *([^ ]+) *} $zLine -> zRecid
        append zOut "<p class=req id=$zRecid>"
        set zSpecial hd_requirement
        set zRecText ""
      }

<tcl>
proc hd_assumption {id text} {
  hd_requirement $id $text
}

proc process {text} {
  set zOut ""
  set zSpecial ""

  foreach zLine [split $text "\n"] {

    switch -regexp $zLine {
      {^ *REQ *[^ ][^ ]* *$} {
        regexp { *REQ *([^ ]+) *} $zLine -> zRecid
        append zOut "<p class=req id=$zRecid>"
        set zSpecial hd_requirement
        set zRecText ""
      }
        if {$zSpecial ne ""} {
          if {[regexp {^ *\. *$} $zLine]} {set zLine ""}
          append zRecText "$zLine\n"
        }
        append zOut "$zLine\n"
      }
    }

  }
  set zOut
}

hd_resolve [process {
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

        if {$zSpecial ne ""} {
          if {[regexp {^ *\. *$} $zLine]} {set zLine ""}
          append zRecText "$zLine\n"
        }
        append zOut "$zLine\n"
      }
    }

  }
  set zOut
}

hd_resolve [process {
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

  <h2>Performance Related Assumptions</h2>

    ASSUMPTION A21010
      It is assumed that writing a series of sequential blocks of data to 
      a file in order is faster than writing the same blocks in an arbitrary
      order.


  <h2 id=fs_characteristics>System Failure Related Assumptions</h2>
    <p>
      In the event of an operating system or power failure, the various 
      combinations of file-system and storage hardware available provide
      varying levels of guarantee as to the integrity of the data written
      to the file system just before or during the failure. The exact
      combination of IO operations that SQLite is required to perform

  <h2>Performance Related Assumptions</h2>

    ASSUMPTION A21010
      It is assumed that writing a series of sequential blocks of data to 
      a file in order is faster than writing the same blocks in an arbitrary
      order.


  <h2 id=fs_characteristics>System Failure Related Assumptions</h2>
    <p>
      In the event of an operating system or power failure, the various 
      combinations of file-system and storage hardware available provide
      varying levels of guarantee as to the integrity of the data written
      to the file system just before or during the failure. The exact
      combination of IO operations that SQLite is required to perform
      <li><p>The <b>atomic-write</b> property. A system that supports this
          property also specifies the size or sizes of the blocks that it
          is capable of writing. Valid sizes are powers of two greater than
          512. If a write operation modifies a block of <i>n</i> bytes,
          where <i>n</i> is one of the block sizes for which <i>atomic-write</i>
          is supported, then it is impossible for an aligned write of <i>n</i>
          bytes to cause data corruption. If a failure occurs after such 
	  a write operation and before the applicable file handle is
          <i>synced</i>, then following recovery it will appear as if the
          write operation succeeded or did not take place at all. It is not
          possible that only part of the data specified by the write operation
          was written to persistent media, nor is it possible for any content
          of the sectors spanned by the write operation to be replaced with
          garbage data, as it is normally assumed to be.

      <li><p>The <b>atomic-write</b> property. A system that supports this
          property also specifies the size or sizes of the blocks that it
          is capable of writing. Valid sizes are powers of two greater than
          512. If a write operation modifies a block of <i>n</i> bytes,
          where <i>n</i> is one of the block sizes for which <i>atomic-write</i>
          is supported, then it is impossible for an aligned write of <i>n</i>
          bytes to cause data corruption. If a failure occurs after such 
          a write operation and before the applicable file handle is
          <i>synced</i>, then following recovery it will appear as if the
          write operation succeeded or did not take place at all. It is not
          possible that only part of the data specified by the write operation
          was written to persistent media, nor is it possible for any content
          of the sectors spanned by the write operation to be replaced with
          garbage data, as it is normally assumed to be.

          assume that if the write operations of unknown status are arranged
          in the order that they occurred:
          <ol> 
            <li> the first <i>n</i> operations will have been executed 
                 successfully,
            <li> the next operation puts all device sectors that it modifies
                 into the transient state, so that following recovery each
		 sector may be partially written, completely written, not
                 written at all or populated with garbage data,
            <li> the remaining operations will not have had any effect on
                 the contents of the file-system.
          </ol> 
    </ul>

    <h3>Failure Related Assumption Details</h3>

          assume that if the write operations of unknown status are arranged
          in the order that they occurred:
          <ol> 
            <li> the first <i>n</i> operations will have been executed 
                 successfully,
            <li> the next operation puts all device sectors that it modifies
                 into the transient state, so that following recovery each
                 sector may be partially written, completely written, not
                 written at all or populated with garbage data,
            <li> the remaining operations will not have had any effect on
                 the contents of the file-system.
          </ol> 
    </ul>
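
The three-part failure model in the list above can be sketched as a small enumeration. This is only an illustrative sketch: the function name and the tuple layout are hypothetical, and it assumes that the case in which every operation completed (so that no operation is left in the transient state) is also a possible outcome.

```python
def crash_outcomes(ops):
    """Yield every (completed, transient, no_effect) split of `ops`.

    `ops` lists the write operations of unknown status in the order in
    which they occurred. For each split:
      - `completed` holds the first n operations (executed successfully),
      - `transient` is the next operation, whose sectors may be partially
        written, completely written, not written, or filled with garbage
        (None if every operation completed; an assumption of this sketch),
      - `no_effect` holds the remaining operations.
    """
    for n in range(len(ops) + 1):
        transient = ops[n] if n < len(ops) else None
        yield ops[:n], transient, ops[n + 1:]
```

For two pending writes this yields three candidate states, which is the set a recovery procedure would have to tolerate under these assumptions.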

    <h3>Failure Related Assumption Details</h3>
    It may at first seem odd to mention the <i>page cache</i>, primarily
    an implementation detail, in this document. However, it is necessary to 
    acknowledge and describe the <i>page cache</i> in order to provide a
    more complete explanation of the nature and quantity of IO performed
    by SQLite. Further description of the <i>page cache</i> is provided in 
    section <cite>page_cache_descripton</cite>.
    
    

<!--
  <p>
    A database connection is always in one of the following states:

  <ol>
    <li><i>Unlocked state</i> (no transaction).

    It may at first seem odd to mention the <i>page cache</i>, primarily
    an implementation detail, in this document. However, it is necessary to 
    acknowledge and describe the <i>page cache</i> in order to provide a
    more complete explanation of the nature and quantity of IO performed
    by SQLite. Further description of the <i>page cache</i> is provided in 
    section <cite>page_cache_descripton</cite>.
    


<!--
  <p>
    A database connection is always in one of the following states:

  <ol>
    <li><i>Unlocked state</i> (no transaction).
      database file. In some cases, various actions apart from simply obtaining
      the file-system lock must take place when a <i>database connection</i>
      transitions from one state to another.
 
  <p class=todo>
    Maybe a state diagram will be possible...
 -->

  <h2 id=open_new_connection>Opening a New Connection</h2>

    <p>
      Opening a new database connection is a two-step process:

    <ol>
      <li> A file-handle is opened on the database file.
      <li> If step 1 was successful, an attempt is made to read the 
	   <i>database file header</i> from the database file using the 
           new file-handle.
    </ol>

    <p>
      In step 2 of the procedure above, the database file is not locked
      before it is read from. This is the only exception to the locking 
      rules described in section <cite>reading_data</cite>.

    <p>
      One reason for attempting to read the <i>database file header</i>
      is to determine the <i>page-size</i> used by the database file. 
      Because it is not possible to be certain as to the <i>page-size</i> 
      without holding at least a <i>shared lock</i> on the database file
      (because some other <i>database connection</i> might have changed it
      since the <i>database file header</i> was read), the value read from the
      <i>database file header</i> is known as the <i>expected page size</i>.

      database file. In some cases, various actions apart from simply obtaining
      the file-system lock must take place when a <i>database connection</i>
      transitions from one state to another.
 
  <p class=todo>
    Maybe a state diagram will be possible...
 -->

  <h2 id=page_cache_descripton>The Page Cache</h2>
    <p>
      The contents of an SQLite database file are formatted as a set of 
      fixed-size pages (see <cite>ff_sqlitert_requirements</cite> for a
      complete description of the format used). The <i>page size</i> used
      for a particular database is stored as part of the database file
      header at a well-known offset within the first 100 bytes of the 
      file.

    <p>
      As one might imagine, the <i>page cache</i> caches data read from the
      database file on a page basis. Whenever data is read from the database
      file to satisfy user queries, it is loaded in units of a page at a
      time (see section <cite>reading_data</cite> for further details). 
      After being read, page content is stored by the <i>page cache</i> in
      main memory. The next time the page data is required, it may be read
      from the <i>page cache</i> instead of from the database file.

    <p>
      Data is also cached within the <i>page cache</i> before it is written
      to the database file. Usually, when a user issues a command that modifies
      the content of the database file, only the cached version of the 
      page within the connection's <i>page cache</i> is modified. When the
      containing <i>write transaction</i> is committed, the content of all
      modified pages within the <i>page cache</i> is copied into the
      database file.
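
The read-through and write-back behaviour described above can be sketched with a toy cache. This is a simplified illustration only, not SQLite's implementation: the class and method names are hypothetical, pages are assumed to be numbered from 1, and journalling, locking, syncing and cache eviction are all left out.

```python
class PageCache:
    """Toy page cache: reads go through the cache, writes stay in memory
    until commit() copies the modified pages into the file."""

    def __init__(self, f, page_size):
        self.f = f                  # binary file object open for reading and writing
        self.page_size = page_size
        self.pages = {}             # page number -> cached page content
        self.dirty = set()          # pages modified since the last commit

    def _offset(self, pgno):
        # Assumption of this sketch: page 1 starts at byte offset 0.
        return (pgno - 1) * self.page_size

    def read(self, pgno):
        """Return one page, loading it from the file only on a cache miss."""
        if pgno not in self.pages:
            self.f.seek(self._offset(pgno))
            data = self.f.read(self.page_size)
            self.pages[pgno] = data.ljust(self.page_size, b"\x00")
        return self.pages[pgno]

    def write(self, pgno, data):
        """Modify only the cached copy of the page."""
        if len(data) != self.page_size:
            raise ValueError("data must be exactly one page long")
        self.pages[pgno] = bytes(data)
        self.dirty.add(pgno)

    def commit(self):
        """Copy every modified page into the file, in page order."""
        for pgno in sorted(self.dirty):
            self.f.seek(self._offset(pgno))
            self.f.write(self.pages[pgno])
        self.f.flush()
        self.dirty.clear()
```

Until `commit()` is called the underlying file is untouched, which mirrors the statement that only the cached version of a page is modified while the write transaction is open.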

    <p class=todo>
      Some kind of reference to the 'page cache algorithms' section.
 

  <h2 id=open_new_connection>Opening a New Connection</h2>

    <p>
      This section describes the VFS operations that take place when a
      new database connection is created. 

    <p>
      Opening a new database connection is a two-step process:

    <ol>
      <li> A file-handle is opened on the database file.
      <li> If step 1 was successful, an attempt is made to read the 
	   <i>database file header</i> from the database file using the 
           new file-handle.
    </ol>

    <p>
      In step 2 of the procedure above, the database file is not locked
      before it is read from. This is the only exception to the locking 
      rules described in section <cite>reading_data</cite>.

    <p>
      The reason for attempting to read the <i>database file header</i>
      is to determine the <i>page-size</i> used by the database file. 
      Because it is not possible to be certain as to the <i>page-size</i> 
      without holding at least a <i>shared lock</i> on the database file
      (because some other <i>database connection</i> might have changed it
      since the <i>database file header</i> was read), the value read from the
      <i>database file header</i> is known as the <i>expected page size</i>. 

      If the <i>database file header</i> cannot be read from a newly opened 
      database file (because the file is less than 100 bytes in size), the 
      connection's <i>expected page-size</i> shall be set to the compile-time
      value of the SQLITE_DEFAULT_PAGESIZE option.

  <h2>Closing a Connection</h2>

    <p>
      Closing a database connection is a simple matter. The open VFS 
      file-handle is closed and in-memory <i>page cache</i> related resources
      are released. 

    REQ H21040
      When a <i>database connection</i> is closed, SQLite shall close the 
      associated file handle at the VFS level.

  <h2 id=page_cache_descripton>The Page Cache</h2>

    <p class=todo>
      Description of the page cache is.
 
<h1 id=reading_data>Reading Data</h1>
  <p>
    In order to return data from the database to the user, for example as
    the results of a SELECT query, SQLite must at some point read data
    from the database file. Usually, data is read from the database file in 
    aligned blocks of <i>page-size</i> bytes. The exception is when the
    database file header fields are being inspected, before the







>
>
>
>









<
<
<
<
<







712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731





732
733
734
735
736
737
738
      If the <i>database file header</i> cannot be read from a newly opened 
      database file (because the file is less than 100 bytes in size), the 
      connection's <i>expected page-size</i> shall be set to the compile-time
      value of the SQLITE_DEFAULT_PAGESIZE option.
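
The fallback rule above can be sketched as follows. This is a rough illustration rather than SQLite's own code: the function name is hypothetical, the field position (a 2-byte big-endian integer at byte offset 16) is taken from the SQLite file format documentation rather than from this text, and the default of 1024 is only a stand-in for whatever SQLITE_DEFAULT_PAGESIZE was set to at compile time.

```python
import struct

HEADER_SIZE = 100         # the database file header occupies the first 100 bytes
PAGE_SIZE_OFFSET = 16     # assumption: position of the 2-byte big-endian page-size field
DEFAULT_PAGE_SIZE = 1024  # assumption: stand-in for SQLITE_DEFAULT_PAGESIZE


def expected_page_size(path, default=DEFAULT_PAGE_SIZE):
    """Return the page size recorded in the file header, or `default`
    when the file is too short to hold a complete header."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        # Header cannot be read: fall back to the compile-time default.
        return default
    (page_size,) = struct.unpack_from(">H", header, PAGE_SIZE_OFFSET)
    return page_size
```

The result is only an expected value, because the file is read here without holding any lock, as in step 2 of the procedure for opening a connection.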

  <h2>Closing a Connection</h2>

    <p>
      This section describes the VFS operations that take place when an
      existing database connection is closed (destroyed). 

    <p>
      Closing a database connection is a simple matter. The open VFS 
      file-handle is closed and in-memory <i>page cache</i> related resources
      are released. 

    REQ H21040
      When a <i>database connection</i> is closed, SQLite shall close the 
      associated file handle at the VFS level.

<h1 id=reading_data>Reading Data</h1>
  <p>
    In order to return data from the database to the user, for example as
    the results of a SELECT query, SQLite must at some point read data
    from the database file. Usually, data is read from the database file in 
    aligned blocks of <i>page-size</i> bytes. The exception is when the
    database file header fields are being inspected, before the