/ Check-in [3e5c7771]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Updates to the sqlite3_analyzer utility: Change the names of some labels, especially change "Fragmentation" to "Non-sequential pages". Revise the computation of non-sequential pages so that it ignores itercalated non-leaf pages (overflow and index pages).
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 3e5c7771fa91d8ae1e495432329b87af87b1ebc6
User & Date: drh 2013-09-28 12:40:55
Context
2013-09-28
13:28
In the nextchar.c extension, allow the second argument to the next_char() function to be a subquery. check-in: 59b9fa22 user: drh tags: trunk
12:40
Updates to the sqlite3_analyzer utility: Change the names of some labels, especially change "Fragmentation" to "Non-sequential pages". Revise the computation of non-sequential pages so that it ignores itercalated non-leaf pages (overflow and index pages). check-in: 3e5c7771 user: drh tags: trunk
2013-09-26
15:21
Obtain the required shared-cache write-lock when executing "DELETE FROM tbl" statements. Fix for [1e1321ee98]. check-in: 1f8f4fdf user: dan tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to tool/spaceanal.tcl.

195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210


211
212
213
214
215
216
217
...
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
...
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
...
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
...
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
...
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
...
682
683
684
685
686
687
688
689
690
691
692
693
694
695




696
697
698
699
700
701
702
  # Column 'gap_cnt' is set to the number of non-contiguous entries in the
  # list of pages visited if the b-tree structure is traversed in a top-down
  # fashion (each node visited before its child-tree is passed). Any overflow
  # chains present are traversed from start to finish before any child-tree
  # is.
  #
  set gap_cnt 0
  set pglist [db eval {
    SELECT pageno FROM temp.dbstat WHERE name = $name ORDER BY rowid
  }]
  set prev [lindex $pglist 0]
  foreach pgno [lrange $pglist 1 end] {
    if {$pgno != $prev+1} {incr gap_cnt}
    set prev $pgno
  }



  mem eval {
    INSERT INTO space_used VALUES(
      $name,
      $tblname,
      $is_index,
      $nentry,
      $leaf_entries,
................................................................................
  if {$denom==0} {return 0.0}
  return [format %.2f [expr double($num)/double($denom)]]
}

# Generate a subreport that covers some subset of the database.
# the $where clause determines which subset to analyze.
#
proc subreport {title where} {
  global pageSize file_pgcnt compressOverhead

  # Query the in-memory database for the sum of various statistics 
  # for the subset of tables/indices identified by the WHERE clause in
  # $where. Note that even if the WHERE clause matches no rows, the
  # following query returns exactly one row (because it is an aggregate).
  #
................................................................................
  }
  statline {Bytes of payload} $payload $payload_percent
  statline {Average payload per entry} $avg_payload
  statline {Average unused bytes per entry} $avg_unused
  if {[info exists avg_fanout]} {
    statline {Average fanout} $avg_fanout
  }
  if {$total_pages>1} {
    set fragmentation [percent $gap_cnt [expr {$total_pages-1}] {fragmentation}]
    statline {Fragmentation} $fragmentation
  }
  statline {Maximum payload per entry} $mx_payload
  statline {Entries that use overflow} $ovfl_cnt $ovfl_cnt_percent
  if {$int_pages>0} {
    statline {Index pages used} $int_pages
  }
  statline {Primary pages used} $leaf_pages
................................................................................
statline {Pages in the whole file (calculated)} $file_pgcnt2
statline {Pages that store data} $inuse_pgcnt $inuse_percent
statline {Pages on the freelist (per header)} $free_pgcnt2 $free_percent2
statline {Pages on the freelist (calculated)} $free_pgcnt $free_percent
statline {Pages of auto-vacuum overhead} $av_pgcnt $av_percent
statline {Number of tables in the database} $ntable
statline {Number of indices} $nindex
statline {Number of named indices} $nmanindex
statline {Automatically generated indices} $nautoindex
if {$isCompressed} {
  statline {Size of uncompressed content in bytes} $file_bytes
  set efficiency [percent $true_file_size $file_bytes]
  statline {Size of compressed file on disk} $true_file_size $efficiency
} else {
  statline {Size of the file in bytes} $file_bytes
}
................................................................................
    statline {Header and free space} $overhead [percent $overhead $true_file_size]
  }
}

# Output subreports
#
if {$nindex>0} {
  subreport {All tables and indices} 1
}
subreport {All tables} {NOT is_index}
if {$nindex>0} {
  subreport {All indices} {is_index}
}
foreach tbl [mem eval {SELECT name FROM space_used WHERE NOT is_index
                       ORDER BY name}] {
  set qn [quote $tbl]
  set name [string toupper $tbl]
  set n [mem eval {SELECT count(*) FROM space_used WHERE tblname=$tbl}]
  if {$n>1} {
    set idxlist [mem eval "SELECT name FROM space_used
                            WHERE tblname='$qn' AND is_index
                            ORDER BY 1"]
    subreport "Table $name and all its indices" "tblname='$qn'"
    subreport "Table $name w/o any indices" "name='$qn'"
    if {[llength $idxlist]>1} {
      subreport "Indices of table $name" "tblname='$qn' AND is_index"
    }
    foreach idx $idxlist {
      set qidx [quote $idx]
      subreport "Index [string toupper $idx] of table $name" "name='$qidx'"
    }
  } else {
    subreport "Table $name" "name='$qn'"
  }
}

# Output instructions on what the numbers above mean.
#
puts ""
titleline Definitions
................................................................................
    The number of tables in the database, including the SQLITE_MASTER table
    used to store schema information.

Number of indices

    The total number of indices in the database.

Number of named indices

    The number of indices created using an explicit CREATE INDEX statement.

Automatically generated indices

    The number of indices used to implement PRIMARY KEY or UNIQUE constraints
    on tables.

Size of the file in bytes

    The total amount of disk space used by the entire database files.
................................................................................

Average unused bytes per entry

    The average amount of free space remaining on all pages under this
    category on a per-entry basis.  This is the number of unused bytes on
    all pages divided by the number of entries.

Fragmentation

    The percentage of pages in the table or index that are not
    consecutive in the disk file.  Many filesystems are optimized
    for sequential file access so smaller fragmentation numbers 
    sometimes result in faster queries, especially for larger
    database files that do not fit in the disk cache.





Maximum payload per entry

    The largest payload size of any entry.

Entries that use overflow








|
|
|
|
|
|
|
|
|
>
>







 







|







 







|
|
|







 







|
|







 







|

|

|










|
|

|



|


|







 







|



|







 







|

|
|
|
<
|
>
>
>
>







195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
...
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
...
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
...
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
...
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
...
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
...
684
685
686
687
688
689
690
691
692
693
694
695

696
697
698
699
700
701
702
703
704
705
706
707
  # Column 'gap_cnt' is set to the number of non-contiguous entries in the
  # list of pages visited if the b-tree structure is traversed in a top-down
  # fashion (each node visited before its child-tree is passed). Any overflow
  # chains present are traversed from start to finish before any child-tree
  # is.
  #
  set gap_cnt 0
  set prev 0
  db eval {
    SELECT pageno, pagetype FROM temp.dbstat
     WHERE name=$name
     ORDER BY pageno
  } {
    if {$prev>0 && $pagetype=="leaf" && $pageno!=$prev+1} {
      incr gap_cnt
    }
    set prev $pageno
  }
  mem eval {
    INSERT INTO space_used VALUES(
      $name,
      $tblname,
      $is_index,
      $nentry,
      $leaf_entries,
................................................................................
  if {$denom==0} {return 0.0}
  return [format %.2f [expr double($num)/double($denom)]]
}

# Generate a subreport that covers some subset of the database.
# the $where clause determines which subset to analyze.
#
proc subreport {title where showFrag} {
  global pageSize file_pgcnt compressOverhead

  # Query the in-memory database for the sum of various statistics 
  # for the subset of tables/indices identified by the WHERE clause in
  # $where. Note that even if the WHERE clause matches no rows, the
  # following query returns exactly one row (because it is an aggregate).
  #
................................................................................
  }
  statline {Bytes of payload} $payload $payload_percent
  statline {Average payload per entry} $avg_payload
  statline {Average unused bytes per entry} $avg_unused
  if {[info exists avg_fanout]} {
    statline {Average fanout} $avg_fanout
  }
  if {$showFrag && $total_pages>1} {
    set fragmentation [percent $gap_cnt [expr {$total_pages-1}]]
    statline {Non-sequential pages} $gap_cnt $fragmentation
  }
  statline {Maximum payload per entry} $mx_payload
  statline {Entries that use overflow} $ovfl_cnt $ovfl_cnt_percent
  if {$int_pages>0} {
    statline {Index pages used} $int_pages
  }
  statline {Primary pages used} $leaf_pages
................................................................................
statline {Pages in the whole file (calculated)} $file_pgcnt2
statline {Pages that store data} $inuse_pgcnt $inuse_percent
statline {Pages on the freelist (per header)} $free_pgcnt2 $free_percent2
statline {Pages on the freelist (calculated)} $free_pgcnt $free_percent
statline {Pages of auto-vacuum overhead} $av_pgcnt $av_percent
statline {Number of tables in the database} $ntable
statline {Number of indices} $nindex
statline {Number of defined indices} $nmanindex
statline {Number of implied indices} $nautoindex
if {$isCompressed} {
  statline {Size of uncompressed content in bytes} $file_bytes
  set efficiency [percent $true_file_size $file_bytes]
  statline {Size of compressed file on disk} $true_file_size $efficiency
} else {
  statline {Size of the file in bytes} $file_bytes
}
................................................................................
    statline {Header and free space} $overhead [percent $overhead $true_file_size]
  }
}

# Output subreports
#
if {$nindex>0} {
  subreport {All tables and indices} 1 0
}
subreport {All tables} {NOT is_index} 0
if {$nindex>0} {
  subreport {All indices} {is_index} 0
}
foreach tbl [mem eval {SELECT name FROM space_used WHERE NOT is_index
                       ORDER BY name}] {
  set qn [quote $tbl]
  set name [string toupper $tbl]
  set n [mem eval {SELECT count(*) FROM space_used WHERE tblname=$tbl}]
  if {$n>1} {
    set idxlist [mem eval "SELECT name FROM space_used
                            WHERE tblname='$qn' AND is_index
                            ORDER BY 1"]
    subreport "Table $name and all its indices" "tblname='$qn'" 0
    subreport "Table $name w/o any indices" "name='$qn'" 1
    if {[llength $idxlist]>1} {
      subreport "Indices of table $name" "tblname='$qn' AND is_index" 0
    }
    foreach idx $idxlist {
      set qidx [quote $idx]
      subreport "Index [string toupper $idx] of table $name" "name='$qidx'" 1
    }
  } else {
    subreport "Table $name" "name='$qn'" 1
  }
}

# Output instructions on what the numbers above mean.
#
puts ""
titleline Definitions
................................................................................
    The number of tables in the database, including the SQLITE_MASTER table
    used to store schema information.

Number of indices

    The total number of indices in the database.

Number of defined indices

    The number of indices created using an explicit CREATE INDEX statement.

Number of implied indices

    The number of indices used to implement PRIMARY KEY or UNIQUE constraints
    on tables.

Size of the file in bytes

    The total amount of disk space used by the entire database files.
................................................................................

Average unused bytes per entry

    The average amount of free space remaining on all pages under this
    category on a per-entry basis.  This is the number of unused bytes on
    all pages divided by the number of entries.

Non-sequential pages

    The number of pages in the table or index that are out of sequence.
    Many filesystems are optimized for sequential file access so a small
    number of non-sequential pages might result in faster queries,

    especially for larger database files that do not fit in the disk cache.
    Note that after running VACUUM, the root page of each table or index is
    at the beginning of the database file and all other pages are in a
    separate part of the database file, resulting in a single non-
    sequential page.

Maximum payload per entry

    The largest payload size of any entry.

Entries that use overflow