Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Update the documentation to include information about aggregated mode for the DBSTAT table. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA3-256: |
8abf78baf969e3fc213acd59f070929e |
User & Date: | drh 2019-11-19 19:15:14.877 |
Context
2019-11-23
| ||
00:22 | Documentation updates for new features. (check-in: 604ac8689d user: drh tags: trunk) | |
2019-11-19
| ||
19:15 | Update the documentation to include information about aggregated mode for the DBSTAT table. (check-in: 8abf78baf9 user: drh tags: trunk) | |
2019-11-18
| ||
17:43 | Mention the new SQLITE_OPEN_NOFOLLOW option in the change log. (check-in: d3ba10d894 user: drh tags: trunk) | |
Changes
Changes to pages/changes.in.
︙ | ︙ | |||
21 22 23 24 25 26 27 28 29 30 31 32 33 34 | incr nChng } chng {2019-12-31 (3.31.0)} { <li>Add support for [generated columns]. <li>Add the [sqlite3_hard_heap_limit64()] interface and the corresponding [PRAGMA hard_heap_limit] command. <li>Add the [SQLITE_OPEN_NOFOLLOW] option to [sqlite3_open_v2()] that prevents SQLite from opening symbolic links. <li>Faster response to [sqlite3_interrupt()]. <li>Added the [https://sqlite.org/src/file/ext/misc/uuid.c|uuid.c] extension module implementing functions for processing RFC-4122 UUIDs. <li>The [legacy_file_format pragma] is deactivated. It is now a no-op. In its place, the [SQLITE_DBCONFIG_LEGACY_FILE_FORMAT] option to [sqlite3_db_config()] is | > > | 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | incr nChng } chng {2019-12-31 (3.31.0)} { <li>Add support for [generated columns]. <li>Add the [sqlite3_hard_heap_limit64()] interface and the corresponding [PRAGMA hard_heap_limit] command. <li>Add the [DBSTAT aggregated mode|aggregated mode] feature to the [DBSTAT virtual table]. <li>Add the [SQLITE_OPEN_NOFOLLOW] option to [sqlite3_open_v2()] that prevents SQLite from opening symbolic links. <li>Faster response to [sqlite3_interrupt()]. <li>Added the [https://sqlite.org/src/file/ext/misc/uuid.c|uuid.c] extension module implementing functions for processing RFC-4122 UUIDs. <li>The [legacy_file_format pragma] is deactivated. It is now a no-op. In its place, the [SQLITE_DBCONFIG_LEGACY_FILE_FORMAT] option to [sqlite3_db_config()] is |
︙ | ︙ |
Changes to pages/dbstat.in.
1 | <title>The DBSTAT Virtual Table</title> | | | | | | < < | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | <title>The DBSTAT Virtual Table</title> <tcl>hd_keywords dbstat {dbstat virtual table} {DBSTAT virtual table}</tcl> <fancy_format> <h1>Overview</h1> <p> The DBSTAT virtual table is a read-only [eponymous virtual table] that returns information about the amount of disk space used to store page or btrees of an SQLite database. The DBSTAT virtual table is used to implement [sqlite3_analyzer.exe] utility program, and to help compute the [https://www.sqlite.org/src/repo-tabsize|table size pie-chart] in the [https://www.fossil-scm.org/|Fossil-implemented] version control system for SQLite, for example. </p> <p> ^The DBSTAT virtual table is available on all [database connections] when SQLite is built using the [SQLITE_ENABLE_DBSTAT_VTAB] compile-time option. <p> ^The DBSTAT virtual table is an [eponymous virtual table], meaning that is not necessary to run [CREATE VIRTUAL TABLE] to create an instance of the dbstat virtual table before using it. The "dbstat" module name can be used as if it were a table name to query the dbstat virtual table directly. For example: <codeblock> SELECT * FROM dbstat; |
︙ | ︙ | |||
55 56 57 58 59 60 61 | so the use of "main" in the example above is redundant. For any particular query, the schema can be changed by specifying the alternative schema as a function argument to the virtual table name in the FROM clause of the query. (See further discussion of [table-valued functions in the FROM clause] for more details.) <p> | | | > < | | > | > > > > > > > > > > | 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 | so the use of "main" in the example above is redundant. For any particular query, the schema can be changed by specifying the alternative schema as a function argument to the virtual table name in the FROM clause of the query. (See further discussion of [table-valued functions in the FROM clause] for more details.) <p> The schema for the DBSTAT virtual table looks like this: <codeblock> CREATE TABLE dbstat( name TEXT, -- Name of table or index path TEXT, -- Path to page from root pageno INTEGER, -- Page number pagetype TEXT, -- 'internal', 'leaf' or 'overflow' ncell INTEGER, -- Cells on page (0 for overflow) payload INTEGER, -- Bytes of payload on this page unused INTEGER, -- Bytes of unused space on this page mx_payload INTEGER, -- Largest payload size of all cells on this page pgoffset INTEGER, -- Offset of page in file pgsize INTEGER, -- Size of the page schema TEXT HIDDEN, -- Database schema being analyzed aggregate BOOL HIDDEN -- True to enable aggregate mode ); </codeblock> <p> The DBSTAT table analyzes btrees within the database file. Freelist pages, pointer-map pages, and the lock page are all ignored by DBSTAT. <p> By default, there is a single row in the DBSTAT table for each btree page the database file. The row provides information about the space utilization of that one page of the database. However, if the hidden column "aggregate" is TRUE, then results are aggregated and there is a single row in the DBSTAT table for each btree in the database, providing information about space utilization across the entire btree. <a name="dbstatpath"></a> <h1>The "path" column of the dbstat virtual table</h1> <p> The "path" column describes the path taken from the root node of the btree structure to each page. The "path" of the root node itself is '/'. The "path" is NULL when "aggregate" is TRUE. The "path" for the left-most child page of the root of a btree page is '/000/'. (Btrees store content ordered from left to right so the pages to the left have smaller keys than the pages to the right.) The next to left-most child of the root page is '/001', and so on, each sibling page identified by a 3-digit hex value. The children of the 451st left-most sibling have paths such |
︙ | ︙ | |||
113 114 115 116 117 118 119 120 121 122 123 124 | If the paths are sorted using the BINARY collation sequence, then the overflow pages associated with a cell will appear earlier in the sort-order than its child page: <codeblock> '/1c2/000/' // Left-most child of 451st child of root </codeblock> <h1>Example uses of the dbstat virtual table</h1> <p> To find the total number of pages used to store table "xyz" in schema "aux1", | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | > | > > | > > | 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 | If the paths are sorted using the BINARY collation sequence, then the overflow pages associated with a cell will appear earlier in the sort-order than its child page: <codeblock> '/1c2/000/' // Left-most child of 451st child of root </codeblock> <tcl>hd_fragment {dbstatagg} {DBSTAT aggregated mode}</tcl> <h1>Aggregated Data</h1> <p> Beginning with SQLite version 3.31.0 ([dateof:3.31.0]), the DBSTAT table has a new [hidden column] named "aggregate", which if constrainted to be TRUE will cause DBSTAT to generate one row per btree in the database, rather than one row per page. When running in aggregated mode, the "path", "pagetype", and "pgoffset" columns are always NULL and the "pageno" column holds the number of pages in the entire btree, rather than the number of the page that corresponds to the row. <p> The following table shows the meanings of the (non-hidden) columns of DBSTAT in both normal and aggregated mode: <center><blockquote> <table border="1" cellpadding="5" cellspacing="0"> <tr> <th>Column <th>Normal meaning <th>Aggregate-mode meaning </tr> <tr> <th>name <td colspan="2"> The name of the table or index that is implemented by the btree of the current row <tr> <th>path <td>See <a href="#dbstatpath">description above</a> <td>Always NULL <tr> <th>pageno <td>The page number of the database page for the current row <td>The total number of pages in the btree for the current row <tr> <th>pagetype <td>'leaf' or 'interior' <td>Always NULL <tr> <th>ncell <td colspan="2">Number of cells on the current page or btree <tr> <th>payload <td colspan="2">Bytes of useful payload on the current page or btree <tr> <th>unused <td colspan="2">Unused bytes of on the current page or btree <tr> <th>mx_payload <td colspan="2">The largest payload found anywhere in the current page or btree. <tr> <th>pgoffset <td>Byte offset to the start of the page <td>Always NULL <tr> <th>pgsize <td colspan="2">Total storage space used by the current page or btree. </table> </blockquote></center> <h1>Example uses of the dbstat virtual table</h1> <p> To find the total number of pages used to store table "xyz" in schema "aux1", use either of the following two queries (the first is the traditional way, and the second shows the use of the aggregated feature): <codeblock> SELECT count(*) FROM dbstat('aux1') WHERE name='xyz'; SELECT pageno FROM dbstat('aux1',1) WHERE name='xyz'; </codeblock> <p> To see how efficiently the content of a table is stored on disk, compute the amount of space used to hold actual content divided by the total amount of disk space used. The closer this number is to 100%, the more efficient the packing. (In this example, the 'xyz' table is assumed to be in the 'main' schema. Again, there are two different versions that show the use of DBSTAT both without and with the new aggregated feature, respectively.) <codeblock> SELECT sum(pgsize-unused)*100.0/sum(pgsize) FROM dbstat WHERE name='xyz'; SELECT (pgsize-unused)*100.0/pgsize FROM dbstat WHERE name='xyz' AND aggregate=TRUE; </codeblock> <p> To find the average fan-out for a table, run: <codeblock> SELECT avg(ncell) FROM dbstat WHERE name='xyz' AND pagetype='internal'; |
︙ | ︙ |