Documentation Source Text: Artifact [b6d231f913]

Artifact b6d231f91347ad6b0d0c858347b6285f1a0309cb0f1f75f50aa7d9a21331f38f:

File pages/backup.in — part of check-in [130193a5b9] at 2018-12-09 22:23:53 on branch trunk — Preliminary documentation for VACUUM INTO (user: drh size: 16955)
<title>SQLite Backup API</title>
<tcl>hd_keywords {Using the SQLite Online Backup API} {backup API}</tcl>

<style>
  /* Formatting for the blocks containing the example code */
  pre a:visited, pre a:link { text-decoration: none ; color: #40534b }
  pre { 
    background: #F3F3F3; 
    float: right; 
    padding: 1ex 2ex; 
    margin-left: 1em;
    border: solid black 1px; 
  }
  h1,h2 { clear: both }
</style>

<div class=fancy>
<h1>Using the SQLite Online Backup API</h1>

<p>
  Historically, backups (copies) of SQLite databases have been created
  using the following method:

<ol>
  <li> Establish a shared lock on the database file using the SQLite API (i.e.
       the shell tool).
  <li> Copy the database file using an external tool (for example the unix 'cp'
       utility or the DOS 'copy' command).
  <li> Relinquish the shared lock on the database file obtained in step 1.
</ol>

<p>
  This procedure works well in many scenarios and is usually very
  fast. However, this technique has the following shortcomings:

<ul>
  <li> Any database clients wishing to write to the database file while a
       backup is being created must wait until the shared lock is
       relinquished.

  <li> It cannot be used to copy data to or from in-memory databases.

  <li> If a power failure or operating system failure occurs while copying
       the database file the backup database may be corrupted following
       system recovery.
</ul>

<p>
  The [sqlite3_backup_init | Online Backup API] was created to 
  address these concerns. The online backup API allows the contents of
  one database to be copied into another database, overwriting the 
  original contents of the target database. The copy operation may be 
  done incrementally, in which case the source database does not need
  to be locked for the duration of the copy, only for the brief periods
  of time when it is actually being read from. This allows other database
  users to continue uninterrupted while a backup of an online database is
  made.

<p>
  The online backup API is [sqlite3_backup_init | documented here].
  The remainder of this page contains two C language examples illustrating 
  common uses of the API and discussions thereof. Reading these examples
  is no substitute for reading the API documentation!

<p>
  Update: The [VACUUM INTO] command introduced in 
  SQLite version 3.27.0 ([dateof:3.27.0]) can serve as an
  alternative to the backup API.

<h2>Example 1: Loading and Saving In-Memory Databases</h2>

<tcl>

proc C_Code {text} {
  hd_puts "<pre>\n"
  set iLine 0
  foreach zLine [split [string trim $text "\n"] "\n"] {
    regsub {^  } $zLine {} zLine 
    set zSubspec {<span style="color:blue;font-style:italic">&</span>}
    regsub {(/\*|\*\*|\*/).*} $zLine $zSubspec zLine 
    if {[regexp {(.*)(sqlite3[a-z0-9_]*)(.*)} $zLine -> one two three]} {
      hd_puts $one
      hd_resolve "\[$two\]"
      hd_puts $three
    } else {
      hd_puts $zLine
    }
    hd_puts "\n"
  }
  hd_puts "</pre>\n"
}

C_Code {
  /*
  ** This function is used to load the contents of a database file on disk 
  ** into the "main" database of open database connection pInMemory, or
  ** to save the current contents of the database opened by pInMemory into
  ** a database file on disk. pInMemory is probably an in-memory database, 
  ** but this function will also work fine if it is not.
  **
  ** Parameter zFilename points to a nul-terminated string containing the
  ** name of the database file on disk to load from or save to. If parameter
  ** isSave is non-zero, then the contents of the file zFilename are 
  ** overwritten with the contents of the database opened by pInMemory. If
  ** parameter isSave is zero, then the contents of the database opened by
  ** pInMemory are replaced by data loaded from the file zFilename.
  **
  ** If the operation is successful, SQLITE_OK is returned. Otherwise, if
  ** an error occurs, an SQLite error code is returned.
  */
  int loadOrSaveDb(sqlite3 *pInMemory, const char *zFilename, int isSave){
    int rc;                   /* Function return code */
    sqlite3 *pFile;           /* Database connection opened on zFilename */
    sqlite3_backup *pBackup;  /* Backup object used to copy data */
    sqlite3 *pTo;             /* Database to copy to (pFile or pInMemory) */
    sqlite3 *pFrom;           /* Database to copy from (pFile or pInMemory) */

    /* Open the database file identified by zFilename. Exit early if this fails
    ** for any reason. */
    rc = sqlite3_open(zFilename, &pFile);
    if( rc==SQLITE_OK ){
  
      /* If this is a 'load' operation (isSave==0), then data is copied
      ** from the database file just opened to database pInMemory. 
      ** Otherwise, if this is a 'save' operation (isSave==1), then data
      ** is copied from pInMemory to pFile.  Set the variables pFrom and
      ** pTo accordingly. */
      pFrom = (isSave ? pInMemory : pFile);
      pTo   = (isSave ? pFile     : pInMemory);
  
      /* Set up the backup procedure to copy from the "main" database of 
      ** connection pFile to the main database of connection pInMemory.
      ** If something goes wrong, pBackup will be set to NULL and an error
      ** code and message left in connection pTo.
      **
      ** If the backup object is successfully created, call backup_step()
      ** to copy data from pFile to pInMemory. Then call backup_finish()
      ** to release resources associated with the pBackup object.  If an
      ** error occurred, then an error code and message will be left in
      ** connection pTo. If no error occurred, then the error code belonging
      ** to pTo is set to SQLITE_OK.
      */
      pBackup = sqlite3_backup_init(pTo, "main", pFrom, "main");
      if( pBackup ){
        (void)sqlite3_backup_step(pBackup, -1);
        (void)sqlite3_backup_finish(pBackup);
      }
      rc = sqlite3_errcode(pTo);
    }

    /* Close the database connection opened on database file zFilename
    ** and return the result of this function. */
    (void)sqlite3_close(pFile);
    return rc;
  }
}
</tcl>

<p>
  The C function to the right demonstrates one of the simplest,
  and most common, uses of the backup API: loading and saving the contents
  of an in-memory database to a file on disk. The backup API is used as
  follows in this example:

  <ol>
    <li>Function [sqlite3_backup_init()] is called to create an [sqlite3_backup]
        object to copy data between the two databases (either from a file and
        into the in-memory database, or vice-versa).
    <li>Function [sqlite3_backup_step()] is called with a parameter of 
        <tt>-1</tt> to copy the entire source database to the destination.
    <li>Function [sqlite3_backup_finish()] is called to clean up resources
        allocated by [sqlite3_backup_init()].
  </ol>

<p><b>Error handling</b>

<p>
  If an error occurs in any of the three main backup API routines
  then the [error code] and [sqlite3_errmsg | message] are attached to
  the destination [database connection].
  Additionally, if
  [sqlite3_backup_step()] encounters an error, then the [error code] is returned
  by both the [sqlite3_backup_step()] call itself, and by the subsequent call
  to [sqlite3_backup_finish()]. So a call to [sqlite3_backup_finish()]
  does not overwrite an [error code] stored in the destination
  [database connection] by [sqlite3_backup_step()].  This feature
  is used in the example code to reduce amount of error handling required.
  The return values of the [sqlite3_backup_step()] and [sqlite3_backup_finish()]
  calls are ignored and the error code indicating the success or failure of
  the copy operation collected from the destination [database connection]
  afterward.

<p><b>Possible Enhancements</b>

<p>
  The implementation of this function could be enhanced in at least two ways:

  <ol>
    <li> Failing to obtain the lock on database file zFilename (an [SQLITE_BUSY]
         error) could be handled, and
    <li> Cases where the page-sizes of database pInMemory and zFilename are
         different could be handled better.
  </ol>

<p>
  Since database zFilename is a file on disk, then it may be accessed 
  externally by another process. This means that when the call to
  sqlite3_backup_step() attempts to read from or write data to it, it may
  fail to obtain the required file lock. If this happens, this implementation
  will fail, returning SQLITE_BUSY immediately. The solution would be to
  register a busy-handler callback or 
  timeout with [database connection] pFile 
  using [sqlite3_busy_handler()] or [sqlite3_busy_timeout()]
  as soon as it is opened. If it fails to obtain a required lock immediately,
  [sqlite3_backup_step()] uses any registered busy-handler callback or timeout
  in the same way as [sqlite3_step()] or [sqlite3_exec()] does.

<p>
  Usually, it does not matter if the page-sizes of the source database and the
  destination database are different before the contents of the destination
  are overwritten. The page-size of the destination database is simply changed
  as part of the backup operation. The exception is if the destination database
  happens to be an in-memory database. In this case, if the page sizes
  are not the same at the start of the backup operation, then the operation
  fails with an SQLITE_READONLY error. Unfortunately, this could occur when
  loading a database image from a file into an in-memory database using
  function loadOrSaveDb().

<p>
  However, if in-memory database pInMemory has just been opened (and is 
  therefore completely empty) before being passed to function loadOrSaveDb(), 
  then it is still possible to change its page size using an SQLite "PRAGMA
  page_size" command. Function loadOrSaveDb() could detect this case, and
  attempt to set the page-size of the in-memory database to the page-size
  of database zFilename before invoking the online backup API functions.
  
<h2>Example 2: Online Backup of a Running Database</h2>

<tcl>
C_Code {
  /*
  ** Perform an online backup of database pDb to the database file named
  ** by zFilename. This function copies 5 database pages from pDb to
  ** zFilename, then unlocks pDb and sleeps for 250 ms, then repeats the
  ** process until the entire database is backed up.
  ** 
  ** The third argument passed to this function must be a pointer to a progress
  ** function. After each set of 5 pages is backed up, the progress function
  ** is invoked with two integer parameters: the number of pages left to
  ** copy, and the total number of pages in the source file. This information
  ** may be used, for example, to update a GUI progress bar.
  **
  ** While this function is running, another thread may use the database pDb, or
  ** another process may access the underlying database file via a separate 
  ** connection.
  **
  ** If the backup process is successfully completed, SQLITE_OK is returned.
  ** Otherwise, if an error occurs, an SQLite error code is returned.
  */
  int backupDb(
    sqlite3 *pDb,               /* Database to back up */
    const char *zFilename,      /* Name of file to back up to */
    void(*xProgress)(int, int)  /* Progress function to invoke */     
  ){
    int rc;                     /* Function return code */
    sqlite3 *pFile;             /* Database connection opened on zFilename */
    sqlite3_backup *pBackup;    /* Backup handle used to copy data */

    /* Open the database file identified by zFilename. */
    rc = sqlite3_open(zFilename, &pFile);
    if( rc==SQLITE_OK ){

      /* Open the sqlite3_backup object used to accomplish the transfer */
      pBackup = sqlite3_backup_init(pFile, "main", pDb, "main");
      if( pBackup ){
  
        /* Each iteration of this loop copies 5 database pages from database
        ** pDb to the backup database. If the return value of backup_step()
        ** indicates that there are still further pages to copy, sleep for
        ** 250 ms before repeating. */
        do {
          rc = sqlite3_backup_step(pBackup, 5);
          xProgress(
              sqlite3_backup_remaining(pBackup),
              sqlite3_backup_pagecount(pBackup)
          );
          if( rc==SQLITE_OK || rc==SQLITE_BUSY || rc==SQLITE_LOCKED ){
            sqlite3_sleep(250);
          }
        } while( rc==SQLITE_OK || rc==SQLITE_BUSY || rc==SQLITE_LOCKED );
  
        /* Release resources allocated by backup_init(). */
        (void)sqlite3_backup_finish(pBackup);
      }
      rc = sqlite3_errcode(pFile);
    }
    
    /* Close the database connection opened on database file zFilename
    ** and return the result of this function. */
    (void)sqlite3_close(pFile);
    return rc;
  }
}
</tcl>

<p>
  The function presented in the previous example copies the entire source
  database in one call to [sqlite3_backup_step()]. This requires holding a 
  read-lock on the source database file for the duration of the operation,
  preventing any other database user from writing to the database. It also
  holds the mutex associated with database pInMemory throughout the copy,
  preventing any other thread from using it. The C function in this section,
  designed to be called by a background thread or process for creating a 
  backup of an online database, avoids these problems using the following 
  approach:

  <ol>
    <li>Function [sqlite3_backup_init()] is called to create an [sqlite3_backup]
        object to copy data from database pDb to the backup database file 
        identified by zFilename.
    <li>Function [sqlite3_backup_step()] is called with a parameter of 5 to
        copy 5 pages of database pDb to the backup database (file zFilename).
    <li>If there are still more pages to copy from database pDb, then the
        function sleeps for 250 milliseconds (using the [sqlite3_sleep()]
        utility) and then returns to step 2.
    <li>Function [sqlite3_backup_finish()] is called to clean up resources
        allocated by [sqlite3_backup_init()].
  </ol>

<p><b>File and Database Connection Locking</b>

<p>
  During the 250 ms sleep in step 3 above, no read-lock is held on the database
  file and the mutex associated with pDb is not held. This allows other threads
  to use [database connection] pDb and other connections to write to the
  underlying database file. 

<p>
  If another thread or process writes to the source database while this 
  function is sleeping, then SQLite detects this and usually restarts the 
  backup process when sqlite3_backup_step() is next called. There is one 
  exception to this rule: If the source database is not an in-memory database,
  and the write is performed from within the same process as the backup
  operation and uses the same database handle (pDb), then the destination
  database (the one opened using connection pFile) is automatically updated
  along with the source. The backup process may then be continued after the 
  sqlite3_sleep() call returns as if nothing had happened. 

<p>
  Whether or not the backup process is restarted as a result of writes to
  the source database mid-backup, the user can be sure that when the backup
  operation is completed the backup database contains a consistent and 
  up-to-date snapshot of the original. However:

  <ul>
    <li> Writes to an in-memory source database, or writes to a file-based 
         source database by an external process or thread using a 
         database connection other than pDb are significantly more expensive 
         than writes made to a file-based source database using pDb (as the
         entire backup operation must be restarted in the former two cases).

    <li> If the backup process is restarted frequently enough it may never
         run to completion and the backupDb() function may never return.
  </ul>
 
<p><b>backup_remaining() and backup_pagecount()</b>

<p>
  The backupDb() function uses the sqlite3_backup_remaining() and
  sqlite3_backup_pagecount() functions to report its progress via the
  user-supplied xProgress() callback. Function sqlite3_backup_remaining()
  returns the number of pages left to copy and sqlite3_backup_pagecount()
  returns the total number of pages in the source database (in this case 
  the database opened by pDb). So the percentage completion of the process
  may be calculated as:

<p style="font-family:fixed;margin-left:5em">
  Completion = 100% * (pagecount() - remaining()) / pagecount()

<p>
  The sqlite3_backup_remaining() and sqlite3_backup_pagecount() APIs report
  values stored by the previous call to sqlite3_backup_step(), they do not
  actually inspect the source database file. This means that if the source
  database is written to by another thread or process after the call to
  sqlite3_backup_step() returns but before the values returned by
  sqlite3_backup_remaining() and sqlite3_backup_pagecount() are used, the 
  values may be technically incorrect. This is not usually a problem.


<div style="clear:both"></div>