Documentation Source Text

Check-in [a4e2a17a94]

Overview
Comment: Add LLR to do with the advisory b-tree locks used in shared-cache mode.
SHA1: a4e2a17a9432f6e9c9f3cf4e17c0ca443abe77d7
User & Date: dan 2009-07-02 00:26:31
Context
2009-07-03 15:38: Enhance the CREATE TRIGGER documentation to describe restrictions on INSERT, UPDATE, and DELETE statements that occur within triggers. CVS Ticket #3947. check-in: b7dfcf7883 user: drh tags: trunk
2009-07-02 00:26: Add LLR to do with the advisory b-tree locks used in shared-cache mode. check-in: a4e2a17a94 user: dan tags: trunk
2009-06-27 14:07: Preparing for the 3.6.16 release. check-in: 5eeec98501 user: drh tags: trunk
Changes

Changes to pages/btreemodule.in.

Old version:

      with the open database.

      [fancyformat_import_requirement H50131]
      [fancyformat_import_requirement H50129]
      [fancyformat_import_requirement H50130]

    [h3 "Multi-User Database Requirements"]



  <ul>
    <li> Lock on schema memory object.
    <li> Locks on b-tree tables.
    <li> "Unlock notify" feature.
    <li> Mutexes/thread-safety features.
  </ul>

................................................................................
    [h3 "Caching and Memory Management Requirements" hlr_memory]
  <ul>
    <li> Memory allocation related features (pcache, scratch memory, other...).
    <li> Default pcache implementation (sqlite3_release_memory()).
    <li> Schema memory object allocation (destructor registration).
  </ul>

    [h3 "Fault Tolerance Requirements"]





  <ul>
    <li> Don't corrupt the database. Various modes and the expectations of them.



  </ul>

    [h3 "Well-Formedness Requirements"]
  <ul>
    <li> Identify the subset of file-format well-formedness requirements that
         this module is responsible for implementing.
    <li> Define how the module should respond to corrupt database files: don't
................................................................................
      <p>
	The sqlite3BtreeGetMeta interface may be used to retrieve the current
        value of certain fields from the database image header.

      [btree_api_defn sqlite3BtreeGetMeta]

      [fancyformat_import_requirement H51015]








      [btree_api_defn BTREE_FREE_PAGE_COUNT BTREE_SCHEMA_VERSION BTREE_FILE_FORMAT \
                      BTREE_DEFAULT_CACHE_SIZE BTREE_LARGEST_ROOT_PAGE BTREE_TEXT_ENCODING \
                      BTREE_USER_VERSION BTREE_INCR_VACUUM]

      [fancyformat_import_requirement H51016]




    [h2 "Modifying the Database Image"]

      [h3 sqlite3BtreeCreateTable sqlite3BtreeCreateTable]
................................................................................
        Malloc and IO error handling. Maybe these should be grouped together
        for a whole bunch of APIs. And hook into the above via a definition of
        "successful call".

      [h3 sqlite3BtreeIncrVacuum sqlite3BtreeIncrVacuum]
      [btree_api_defn sqlite3BtreeIncrVacuum]















































































    [h2 "What do these do?"]

    <p class=todo>
      The following is used only from within VdbeExec() to check whether or not
      a cursor was opened on a table or index b-tree. Corruption tests can move into
      the b-tree layer.








New version:

      with the open database.

      [fancyformat_import_requirement H50131]
      [fancyformat_import_requirement H50129]
      [fancyformat_import_requirement H50130]

    [h3 "Multi-User Database Requirements"]

      [fancyformat_import_requirement H50156]

  <ul>
    <li> Lock on schema memory object.
    <li> Locks on b-tree tables.
    <li> "Unlock notify" feature.
    <li> Mutexes/thread-safety features.
  </ul>

................................................................................
    [h3 "Caching and Memory Management Requirements" hlr_memory]
  <ul>
    <li> Memory allocation related features (pcache, scratch memory, other...).
    <li> Default pcache implementation (sqlite3_release_memory()).
    <li> Schema memory object allocation (destructor registration).
  </ul>

    [h3 "Exception Handling Requirements"]

  <p>
    System failure. Do not corrupt the database image.
  <p>
    Three kinds of exception:
  <ul>

    <li> IO Error.
    <li> Malloc request failure.
    <li> Database image corruption.
  </ul>

    [h3 "Well-Formedness Requirements"]
  <ul>
    <li> Identify the subset of file-format well-formedness requirements that
         this module is responsible for implementing.
    <li> Define how the module should respond to corrupt database files: don't
................................................................................
      <p>
	The sqlite3BtreeGetMeta interface may be used to retrieve the current
        value of certain fields from the database image header.

      [btree_api_defn sqlite3BtreeGetMeta]

      [fancyformat_import_requirement H51015]
      [fancyformat_import_requirement H51016]

      <p>
        The two requirements above imply that if sqlite3BtreeGetMeta is called with
        anything other than a b-tree database connection handle with an open read-only
        or read-write transaction as the first argument, or with anything other than
        an integer between 0 and 7 (inclusive) as the second, the results are undefined.

      [btree_api_defn BTREE_FREE_PAGE_COUNT BTREE_SCHEMA_VERSION BTREE_FILE_FORMAT \
                      BTREE_DEFAULT_CACHE_SIZE BTREE_LARGEST_ROOT_PAGE BTREE_TEXT_ENCODING \
                      BTREE_USER_VERSION BTREE_INCR_VACUUM]

      [fancyformat_import_requirement H51017]




    [h2 "Modifying the Database Image"]

      [h3 sqlite3BtreeCreateTable sqlite3BtreeCreateTable]
................................................................................
        Malloc and IO error handling. Maybe these should be grouped together
        for a whole bunch of APIs. And hook into the above via a definition of
        "successful call".

      [h3 sqlite3BtreeIncrVacuum sqlite3BtreeIncrVacuum]
      [btree_api_defn sqlite3BtreeIncrVacuum]

    [h2 "Advisory B-Tree Locks"] 

      <p>
	This section describes the b-tree module interfaces used for acquiring
	and querying the advisory locks that can be placed on database image
	pages. The locking mechanisms described in this section are only used
	to arbitrate between multiple clients of the same in-memory page-cache.
	The locking mechanism used to control access to a file-system
	representation of the database when multiple in-memory page caches
	(possibly located in different OS processes) are open on it is
        described in <span class=todo>this</span>.

      <p>
        As well as obtaining advisory locks explicitly using the 
        sqlite3BtreeLockTable API (see below), a read-lock on page 1 of the
        database image is automatically obtained whenever a b-tree database 
	connection opens a read-only or read-write transaction (see 
        <span class=todo>requirement number</span>). Note that this means
        that a write-lock on page 1 is effectively an exclusive lock on
	the entire page-cache, as it prevents any other connection from opening
        a transaction of any kind.

      [h3 sqlite3BtreeLockTable]

      [btree_api_defn sqlite3BtreeLockTable]

      <p>
        The sqlite3BtreeLockTable API allows database clients to place 
        advisory read or write locks on a specified page of the database 
        image. The specified page need not exist within the database image.
        By convention, SQLite acquires read and write locks on the root
        pages of table b-trees only, but this is not required to be enforced
        by the b-tree module. Locks may only be obtained when a database
        client has an open transaction. All locks are automatically released
        when the open transaction is concluded.

        [fancyformat_import_requirement L50016]
        [fancyformat_import_requirement L50017]

      <p>
        The two requirements above imply that the results are undefined if
        sqlite3BtreeLockTable is called on a b-tree database connection handle
        that does not currently have an open transaction, or if a write-lock is
        requested using a b-tree database connection handle that has only a
        read-only transaction open.


        [fancyformat_import_requirement L50019]
        [fancyformat_import_requirement L50020]

      <p>
        Requirement L50020 is overly conservative. Because a write-lock may 
        only be requested if the b-tree database connection has an open read-write 
	transaction (L50017), and at most a single b-tree database connection
        may have such an open transaction at one time, it is not possible for
        a request for a write-lock to fail because another connection is holding
        a write-lock on the same b-tree database image page. It may, however,
        fail because another connection is holding a read-lock.

      <p>
        All locks are held until the current transaction is concluded.

        [fancyformat_import_requirement L50018]

      <p class=todo> Malloc failure?

      <p class=todo> Read uncommitted flag. Maybe this should be handled
        outside of the b-tree module. Is there anything to stop connections
        with this flag set simply not obtaining read locks? There are assert()
        statements in the b-tree module that need to take this flag into account,
        but not actual functionality.

      [h3 sqlite3BtreeSchemaLocked]
      [btree_api_defn sqlite3BtreeSchemaLocked]

        [fancyformat_import_requirement L50014]
        [fancyformat_import_requirement L50015]

    [h2 "What do these do?"]

    <p class=todo>
      The following is used only from within VdbeExec() to check whether or not
      a cursor was opened on a table or index b-tree. Corruption tests can move into
      the b-tree layer.

Changes to req/hlr50000.txt.

Old version:

the safety-level of a page-cache to one of "off", "normal" or "full",
given an open b-tree database connection to that page-cache.

HLR H50155
The default value assigned to the safety-level configuration parameter of a
page-cache shall be "full".








HLR H51001      H50010
If successful, a call to the sqlite3BtreeOpen function shall return SQLITE_OK
and set the value of *ppBtree to contain a new B-Tree database connection
handle.

................................................................................
HLR H51014
A call to the sqlite3BtreeGetJournalname function with a valid B-Tree database
connection handle opened on a temporary database as the first argument shall
return a pointer to a buffer containing a nul-terminated string zero bytes in length
(i.e. the first byte of the buffer shall be 0x00).




HLR H51015       H50109










If successful, a call to the sqlite3BtreeGetMeta function shall set the
value of *pValue to the current value of the specified 32-bit unsigned 
integer in the database image database header and return SQLITE_OK.

HLR H51016       H50109
The database header field read from the database image by a call to

sqlite3BtreeGetMeta shall be the 32-bit unsigned integer header field stored at
byte offset (36 + 4 * idx) of the database header, where idx is the value of
the second parameter passed to sqlite3BtreeGetMeta.









New version:

the safety-level of a page-cache to one of "off", "normal" or "full",
given an open b-tree database connection to that page-cache.

HLR H50155
The default value assigned to the safety-level configuration parameter of a
page-cache shall be "full".

HLR H50156
The b-tree module shall provide an interface allowing database clients to
acquire advisory read (shared) or write (exclusive) locks on a specific b-tree
structure within the database.



HLR H51001      H50010
If successful, a call to the sqlite3BtreeOpen function shall return SQLITE_OK
and set the value of *ppBtree to contain a new B-Tree database connection
handle.

................................................................................
HLR H51014
A call to the sqlite3BtreeGetJournalname function with a valid B-Tree database
connection handle opened on a temporary database as the first argument shall
return a pointer to a buffer containing a nul-terminated string zero bytes in length
(i.e. the first byte of the buffer shall be 0x00).




HLR H51015       H50109
If the first parameter is a b-tree database connection handle with an open
read-only or read-write transaction, and the second parameter is an integer
between 0 and 7 inclusive, and the database image consists of zero pages,
a call to the sqlite3BtreeGetMeta function shall set the value of *pValue to 
zero.

HLR H51016       H50109
If the first parameter is a b-tree database connection handle with an open
read-only or read-write transaction, and the second parameter is an integer
between 0 and 7 inclusive, and the database image consists of one or more
pages, a call to the sqlite3BtreeGetMeta function shall set the value of
*pValue to the current value of the specified 32-bit unsigned integer in the
database image database header.

HLR H51017       H50109
The database header field read from the database image by a call to
sqlite3BtreeGetMeta in the situation specified by H51016 shall be the 32-bit 
unsigned integer header field stored at byte offset (36 + 4 * idx) of the
database header, where idx is the value of the second parameter passed to
sqlite3BtreeGetMeta.
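
The offset arithmetic in H51015 through H51017 can be illustrated with a small C sketch. The helper names meta_field_offset and read_meta_field are hypothetical and are not part of the b-tree module. The sketch also assumes that header integers are stored big-endian, which is how the SQLite file format stores them but which is not stated in the requirements above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a b-tree module API): byte offset of the
** header field selected by idx, following the formula in H51017. */
static size_t meta_field_offset(int idx){
  assert( idx>=0 && idx<=7 );  /* Outside 0..7 the results are undefined */
  return (size_t)(36 + 4*idx);
}

/* Hypothetical helper: decode the 32-bit unsigned field selected by idx
** from a buffer holding the database header. nPage is the number of
** pages in the database image. Assumes big-endian storage. */
static uint32_t read_meta_field(const unsigned char *aHdr, size_t nPage, int idx){
  const unsigned char *p;
  if( nPage==0 ) return 0;     /* H51015: empty image yields zero */
  p = &aHdr[meta_field_offset(idx)];
  return ((uint32_t)p[0]<<24) | ((uint32_t)p[1]<<16)
       | ((uint32_t)p[2]<<8)  |  (uint32_t)p[3];
}
```

With idx set to 0 the field is read from byte offset 36, and with idx set to 7 from byte offset 64, so the eight fields occupy bytes 36 through 67 of the header.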


Changes to req/llr50000.txt.

73
74
75
76
77
78
79
80
81
82
83
84












































85
86
87
88
89
90
91
92
93
94
pointing to an entry with a smaller key than that requested, or the cursor
is left pointing to no entry at all because the b-tree structure is completely
empty, *pRes (the value of the "int" variable pointed to by the pointer passed
as the fifth parameter to sqlite3BtreeMovetoUnpacked) shall be set to -1.
Otherwise, if the b-tree cursor is left pointing to an entry with a larger key
than that requested, *pRes shall be set to 1.


HLR L50013  H50127
A successful call to the sqlite3BtreeDelete function made with a read/write
b-tree cursor passed as the first argument shall remove the entry pointed to by
the b-tree cursor from the b-tree structure. 















































HLR L51001
The balance-siblings algorithm shall redistribute the b-tree cells currently 
stored on an overfull or underfull page and up to two sibling pages, adding
or removing siblings as required, such that no sibling page is overfull and
the minimum possible number of sibling pages is used to store the 
redistributed b-tree cells.








New version:

pointing to an entry with a smaller key than that requested, or the cursor
is left pointing to no entry at all because the b-tree structure is completely
empty, *pRes (the value of the "int" variable pointed to by the pointer passed
as the fifth parameter to sqlite3BtreeMovetoUnpacked) shall be set to -1.
Otherwise, if the b-tree cursor is left pointing to an entry with a larger key
than that requested, *pRes shall be set to 1.


HLR L50013  H50127
A successful call to the sqlite3BtreeDelete function made with a read/write
b-tree cursor passed as the first argument shall remove the entry pointed to by
the b-tree cursor from the b-tree structure. 



HLR L50014
A call to the sqlite3BtreeSchemaLocked function with a valid b-tree 
database connection as the only argument shall return SQLITE_LOCKED_SHAREDCACHE
if there exists another b-tree database connection connected to the
same page-cache that currently holds a write-lock on database image
page 1.

HLR L50015
A call to the sqlite3BtreeSchemaLocked function with a valid b-tree 
database connection as the only argument shall return SQLITE_OK if
L50014 does not apply.

HLR L50016
A call to sqlite3BtreeLockTable, specifying a b-tree database connection handle 
with an open read-only or read-write transaction as the first parameter, and 
zero as the third parameter, shall attempt to obtain a read-lock on the database
page specified by the second parameter.

HLR L50017
A call to sqlite3BtreeLockTable, specifying a b-tree database connection handle 
with an open read-write transaction as the first parameter, and a non-zero value as 
the third parameter, shall attempt to obtain a write-lock on the database
page specified by the second parameter.

HLR L50018
When a read-only or read-write transaction is concluded, all advisory b-tree locks
held by the b-tree database connection shall be relinquished.

HLR L50019
If, when attempting to obtain a read-lock as described in L50016, there exists
another b-tree database connection connected to the same page-cache that is
holding a write-lock on the same database image page, the read-lock shall not
be granted and the call to sqlite3BtreeLockTable shall return SQLITE_LOCKED_SHAREDCACHE.

HLR L50020
If, when attempting to obtain a write-lock as described in L50017, there exists
another b-tree database connection connected to the same page-cache that is
holding a read or write-lock on the same database image page, the write-lock 
shall not be granted and the call to sqlite3BtreeLockTable shall return 
SQLITE_LOCKED_SHAREDCACHE.
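
The lock compatibility rules in L50014 through L50020 can be modelled with a short C sketch. This is a simplified illustration under stated assumptions, not the b-tree module's implementation: the PageCache and Lock types, the lock_table, end_transaction and schema_locked functions, and the MODEL_OK and MODEL_LOCKED codes are all hypothetical stand-ins (MODEL_LOCKED plays the role of SQLITE_LOCKED_SHAREDCACHE). Connections are identified by plain integers, and repeated requests by the same connection are simply recorded again.

```c
#include <assert.h>

enum { MODEL_OK = 0, MODEL_LOCKED = 1 };     /* Stand-ins for result codes */
enum { TXN_NONE, TXN_READ, TXN_WRITE };      /* Transaction state of a connection */
#define MAX_LOCKS 16

typedef struct { int conn; unsigned page; int isWrite; } Lock;
typedef struct { Lock aLock[MAX_LOCKS]; int nLock; } PageCache;

/* Model of L50016, L50017, L50019 and L50020: request a read-lock
** (isWrite==0) or write-lock (isWrite!=0) on a page. */
static int lock_table(PageCache *pc, int conn, int txn, unsigned page, int isWrite){
  int i;
  assert( txn!=TXN_NONE );                   /* Undefined without a transaction */
  assert( !isWrite || txn==TXN_WRITE );      /* Write-locks need read-write txn */
  for(i=0; i<pc->nLock; i++){
    const Lock *p = &pc->aLock[i];
    /* A read request conflicts with another connection's write-lock; a
    ** write request conflicts with any lock held by another connection. */
    if( p->conn!=conn && p->page==page && (isWrite || p->isWrite) ){
      return MODEL_LOCKED;
    }
  }
  assert( pc->nLock<MAX_LOCKS );
  pc->aLock[pc->nLock++] = (Lock){ conn, page, isWrite };
  return MODEL_OK;
}

/* Model of L50018: concluding a transaction releases all of the
** connection's locks. */
static void end_transaction(PageCache *pc, int conn){
  int i, nKeep = 0;
  for(i=0; i<pc->nLock; i++){
    if( pc->aLock[i].conn!=conn ) pc->aLock[nKeep++] = pc->aLock[i];
  }
  pc->nLock = nKeep;
}

/* Model of L50014 and L50015: report whether another connection holds
** a write-lock on page 1. */
static int schema_locked(const PageCache *pc, int conn){
  int i;
  for(i=0; i<pc->nLock; i++){
    const Lock *p = &pc->aLock[i];
    if( p->conn!=conn && p->page==1 && p->isWrite ) return MODEL_LOCKED;
  }
  return MODEL_OK;
}
```

In this model, a write-lock held by one connection on page 1 causes schema_locked to report a conflict for every other connection and blocks their read-lock requests on page 1 until end_transaction is called for the holder.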




HLR L51001
The balance-siblings algorithm shall redistribute the b-tree cells currently 
stored on an overfull or underfull page and up to two sibling pages, adding
or removing siblings as required, such that no sibling page is overfull and
the minimum possible number of sibling pages is used to store the 
redistributed b-tree cells.