Overview
Comment: | Rework the site search to use FTS5. |
SHA1: | 18e4a7b15730362e5d69988169bf4413 |
User & Date: | dan 2016-08-18 20:07:47.654 |
Context
2016-08-19
17:02 | Fix more minor search problems. (check-in: c345859928 user: dan tags: experimental) |
2016-08-18
20:07 | Rework the site search to use FTS5. (check-in: 18e4a7b157 user: dan tags: experimental) |
2016-08-16
10:46 | Omit the underscore from hyperlinks in a few places where doing so makes the text less cluttered. (check-in: 70a94f20fa user: dan tags: experimental) |
Changes
Added document_header.tcl.
proc document_header {title path {search {}}} {
  set ret [subst -nocommands {
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
  <html><head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  <title>$title</title>
  <style type="text/css">
    body {
      margin: auto;
      font-family: Verdana, sans-serif;
      padding: 8px 1%;
    }

    .nounderline a { text-decoration: none }

    a { color: #044a64 }
    a:visited { color: #734559 }

    .logo { position:absolute; margin:3px; }
    .tagline {
      float:right;
      text-align:right;
      font-style:italic;
      width:300px;
      margin:12px;
      margin-top:58px;
    }

    .menubar {
      clear: both;
      border-radius: 8px;
      background: #044a64;
      padding: 0px;
      margin: 0px;
      cell-spacing: 0px;
    }
    .toolbar {
      text-align: center;
      line-height: 1.6em;
      margin: 0;
      padding: 0px 8px;
    }
    .toolbar a { color: white; text-decoration: none; padding: 6px 12px; }
    .toolbar a:visited { color: white; }
    .toolbar a:hover { color: #044a64; background: white; }

    .content { margin: 5%; }
    .content dt { font-weight:bold; }
    .content dd { margin-bottom: 25px; margin-left:20%; }
    .content ul { padding:0px; padding-left: 15px; margin:0px; }

    /* Things for "fancyformat" documents start here. */
    .fancy img+p {font-style:italic}
    .fancy .codeblock i { color: darkblue; }
    .fancy h1,.fancy h2,.fancy h3,.fancy h4 {font-weight:normal;color:#044a64}
    .fancy h2 { margin-left: 10px }
    .fancy h3 { margin-left: 20px }
    .fancy h4 { margin-left: 30px }
    .fancy th {white-space:xnowrap;text-align:left;border-bottom:solid 1px #444}
    .fancy th, .fancy td {padding: 0.2em 1ex; vertical-align:top}
    .fancy #toc a { color: darkblue ; text-decoration: none }
    .fancy .todo { color: #AA3333 ; font-style : italic }
    .fancy .todo:before { content: 'TODO:' }
    .fancy p.todo { border: solid #AA3333 1px; padding: 1ex }
    .fancy img { display:block; }
    .fancy :link:hover, .fancy :visited:hover { background: wheat }
    .fancy p,.fancy ul,.fancy ol,.fancy dl { margin: 1em 5ex }
    .fancy li p { margin: 1em 0 }
    .fancy blockquote { margin-left : 10ex }
    /* End of "fancyformat" specific rules. */

    .yyterm {
      background: #fff;
      border: 1px solid #000;
      border-radius: 11px;
      padding-left: 4px;
      padding-right: 4px;
    }

    .doccat a { color: #044a64 ; text-decoration: none; }
    .doccat h { font-weight: bold; }
    .doccat h a { font-size: smaller; color: black; }
    .doccat { padding-left: 2ex; padding-right: 2ex; white-space:nowrap; }
    .doccat li { list-style-type: none; font-size: smaller; line-height: 150%; }
    .doccat ul { margin-top: 0.5em; }

    .footer {
      padding-top: 2px;
      padding-bottom: 1px;
      border-top: 2px solid #044a64;
    }
  </style>
  </head>
  }]

  if {[file exists DRAFT]} {
    set tagline {<font size="6" color="red">*** DRAFT ***</font>}
  } else {
    set tagline {Small. Fast. Reliable.<br>Choose any three.}
  }

  append ret [subst -nocommands {<body>
    <div><!-- container div to satisfy validator -->

    <a href="${path}index.html">
    <img class="logo" src="${path}images/sqlite370_banner.gif" alt="SQLite Logo" border="0"></a>
    <div><!-- IE hack to prevent disappearing logo--></div>
    <div class="tagline">${tagline}</div>

    <table width=100% class="menubar"><tr>
      <td width=100%>
      <div class="toolbar">
        <a href="${path}about.html">About</a>
        <a href="${path}docs.html">Documentation</a>
        <a href="${path}download.html">Download</a>
        <a href="${path}copyright.html">License</a>
        <a href="${path}support.html">Support</a>
        <a href="http://www.hwaci.com/sw/sqlite/prosupport.html">Purchase</a>
      </div>
  }]

  if {$search==""} {
    set initval   "Search SQLite Docs..."
    set initstyle {font-style:italic;color:#044a64}
  } else {
    set initval   $search
    set initstyle {font-style:normal;color:black}
  }

  append ret [subst -nocommands {
    <script>
      gMsg = "Search SQLite Docs..."
      function entersearch() {
        var q = document.getElementById("q");
        if( q.value == gMsg ) { q.value = "" }
        q.style.color = "black"
        q.style.fontStyle = "normal"
      }
      function leavesearch() {
        var q = document.getElementById("q");
        if( q.value == "" ) {
          q.value = gMsg
          q.style.color = "#044a64"
          q.style.fontStyle = "italic"
        }
      }
      function hideorshow(btn,obj){
        var x = document.getElementById(obj);
        var b = document.getElementById(btn);
        if( x.style.display!='none' ){
          x.style.display = 'none';
          b.innerHTML='show';
        }else{
          x.style.display = '';
          b.innerHTML='hide';
        }
        return false;
      }
    </script>
    <td>
        <div style="padding:0 1em 0px 0;white-space:nowrap">
        <form name=f method="GET" action="/search">
          <input id=q name=q type=text
           onfocus="entersearch()" onblur="leavesearch()" style="width:24ex;padding:1px 1ex; border:solid white 1px; font-size:0.9em ; $initstyle;" value="$initval">
          <input type=submit value="Go" style="border:solid white 1px;background-color:#044a64;color:white;font-size:0.9em;padding:0 1ex">
        </form>
        </div>
      </table>
  }]

  return $ret
}
Changes to main.mk.
︙

all: base evidence format_evidence matrix doc

private: base evidence private_evidence matrix doc

fast: base doc

tclsh: $(TCLSQLITE3C)
	$(CC) -g -o tclsh -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS5 -DTCLSH=1 -DSQLITE_TCLMD5 $(TCLINC) $(TCLSQLITE3C) $(TCLFLAGS)

tclsqlite3.fts3: $(TCLSQLITE3C) $(DOC)/search/searchc.c
	$(CC) -static -O2 -o tclsqlite3.fts3 -I. -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS5 $(TCLINC) $(DOC)/search/searchc.c $(TCLSQLITE3C) $(TCLFLAGS)

sqlite3.h: tclsh $(SRC)/src/sqlite.h.in $(SRC)/manifest.uuid $(SRC)/VERSION
	./tclsh $(SRC)/tool/mksqlite3h.tcl $(SRC) | \
	sed 's/^SQLITE_API //' >sqlite3.h

# Generate the directory into which generated documentation files will
# be written.
︙
#
parsehtml.so: $(DOC)/search/parsehtml.c
	gcc -g -shared -fPIC $(TCLINC) -I. -I$(SRC)/ext/fts3 $(DOC)/search/parsehtml.c $(TCLSTUBFLAGS) -o parsehtml.so

searchdb: parsehtml.so tclsh
	./tclsh $(DOC)/search/buildsearchdb.tcl
	cp $(DOC)/search/search.tcl doc/search
	cp $(DOC)/document_header.tcl doc/document_header.tcl
	chmod +x doc/search

always:

clean:
	rm -rf tclsh doc sqlite3.h
Changes to pages/capi3ref.in.
︙
    foreach {key title type keywords body code} $c break
    set kw [preferred_keyword [lsort $keywords]]
    if {$kw==""} {error "no keyword for $c"}
    hd_fragment $kw
    hd_open_aux c3ref/[convert_keyword_to_filename $kw]
    hd_header $title
    hd_enable_main 0
    hd_puts {<div class=nosearch>}
    hd_puts {<a href="intro.html"><h2>SQLite C Interface</h2></a>}
    hd_enable_main 1
    eval hd_keywords $keywords
    hd_puts "<h2>$title</h2>"
    hd_puts {</div>}
    hd_puts "<blockquote><pre>"
    hd_puts "$code"
    hd_puts "</pre></blockquote>"
    if {$supported($kw)==1} {
      hd_resolve {<p><b>Important:</b> This interface is [experimental] }
      hd_resolve {and is subject to change without notice.</p>}
    }
︙
Changes to pages/fancyformat.tcl.
︙
      $::TOC
      </div id>
      [FixReferences $body]
  }]
}

proc addtoc_cb {tag details args} {
  upvar #0 ::Addtoc G
  switch -glob -- $tag {

    "" { ;# Text node. Copy the text to the output. And the TOC, if applicable.
      if {$G(inCodeblock)} {
        append G(codeblock) $details
      } else {
︙
Changes to pages/lang.in.
︙
  hd_close_main
  hd_open_main lang_$label.html
  hd_header "SQLite Query Language: $name" $DOC/pages/lang.in
  eval hd_keywords $keywords
  if {[lsearch $keywords $name] == -1 && [lsearch $keywords *$name] == -1} {
    eval hd_keywords { $name }
  }
  hd_puts {<div class=nosearch>}
  hd_puts {<h1 align="center">SQL As Understood By SQLite</h1>}
  hd_puts {<p><a href="lang.html">[Top]</a></p>}
  hd_puts "<h2>$name</h2>"
  hd_puts {</div>}
}

###############################################################################
Section {ALTER TABLE} altertable {{ALTER TABLE} {*ALTER}}

RecursiveBubbleDiagram alter-table-stmt
</tcl>
︙
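The `<div class=nosearch>` wrappers emitted by pages/capi3ref.in and pages/lang.in are consumed by `lang_document_text` and `c3ref_document_text` in search/buildsearchdb.tcl, which skip any text node that has an ancestor with that class. The following is a minimal C sketch of that ancestor walk only. The `Node` struct and `is_indexable` name are hypothetical and exist purely for illustration; the real check is done in Tcl on hdom node commands, and it additionally waits for a `startsearch` div and applies other per-document exclusions that are omitted here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified DOM node: just enough to walk up the tree. */
typedef struct Node Node;
struct Node {
  const Node *pParent;   /* NULL for the root node */
  const char *zTag;      /* "" for text nodes */
  const char *zClass;    /* "" when there is no class attribute */
};

/* Return 1 if the text node should be indexed, or 0 if any ancestor
** carries class="nosearch". */
static int is_indexable(const Node *pText){
  const Node *p;
  assert( pText!=0 );
  for(p=pText->pParent; p; p=p->pParent){
    if( strcmp(p->zClass, "nosearch")==0 ) return 0;
  }
  return 1;
}
```

Because the check looks at every ancestor rather than only the direct parent, a heading nested several levels inside the wrapper div is excluded as well.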
Changes to search/buildsearchdb.tcl.
load ./parsehtml.so
source [file join [file dirname [info script]] hdom.tcl]

# Return a list of relative paths to documents that should be included
# in the index.
proc document_list {type} {
  set lFiles [list]
  switch -- $type {
    lang {
      foreach f [glob lang_*.html] { lappend lFiles $f }
    }

    c3ref {
      foreach f [glob c3ref/*.html] { lappend lFiles $f }
    }

    generic {
      set nosearch(doc_keyword_crossref.html) 1
      set nosearch(doc_backlink_crossref.html) 1
      set nosearch(doc_pagelink_crossref.html) 1
      set nosearch(doc_target_crossref.html) 1
      set nosearch(keyword_index.html) 1
      set nosearch(requirements.html) 1
      set nosearch(sitemap.html) 1
      set nosearch(fileio.html) 1
      set nosearch(btreemodule.html) 1
      set nosearch(capi3ref.html) 1
      set nosearch(changes.html) 1

      foreach f [glob *.html] {
        if {[string match lang_* $f]==0 && [info exists nosearch($f)]==0} {
          lappend lFiles $f
        }
      }

      foreach f [glob releaselog/*.html] { lappend lFiles $f }
    }

    default {
      error "document_list: unknown file type $type"
    }
  }
  return $lFiles
}

proc readfile {zFile} {
  set fd [open $zFile]
  set ret [read $fd]
  close $fd
  return $ret
}

proc insert_entry {url apis title content} {
  set content [string trim $content]
  db eval {
    INSERT INTO page VALUES($apis, $title, $content, $url);
  }
}

# Extract a document title from DOM object $dom passed as the first
# argument. If no <title> node can be found in the DOM, use $fallback
# as the title.
#
proc extract_title {dom fallback} {
  set title_node [lindex [[$dom root] search title] 0]
  if {$title_node==""} {
    set title $fallback
  } else {
    set title [$title_node text]
  }

  set title
}

proc lang_document_text {dom} {
  set text ""
  set bStartsearch 0
  [$dom root] foreach_descendent N {
    if {$bStartsearch==0} {
      if {[$N tag]=="div" && [$N attr -default "" class]=="startsearch"} {
        set bStartsearch 1
      }
    } elseif {[$N tag]==""} {
      set bAppend 1
      for {set P [$N parent]} {$P!=""} {set P [$P parent]} {
        if {[$P attr -default "" class]=="nosearch"} { set bAppend 0 }
        if {[$P tag]=="a"
         && [string match syntax/* [$P attr -default "" href]]
        } { set bAppend 0 }
        if {[$P tag]=="button" } { set bAppend 0 }
      }
      if {$bAppend} {append text [$N text]}
    }
  }
  return $text
}

proc lang_document_import {doc} {
  set dom [::hdom::parse [readfile $doc]]

  # Find the <title> tag and extract the title.
  set title [extract_title $dom $doc]

  # Extract the entire document text.
  set text [lang_document_text $dom]

  # Insert into the database.
  insert_entry $doc {} $title $text

  $dom destroy
}

proc c3ref_document_apis {dom} {
  set blacklist(sqlite3_int64) 1

  set res [list]
  foreach N [[$dom root] search blockquote] {
    set text [$N text]
    while {[regexp {(sqlite3[0-9a-z_]*) *\((.*)} $text -> api text]} {
      if {[info exists blacklist($api)]==0} {
        lappend res "${api}()"
      }
    }

    set pattern {typedef +struct +(sqlite3[0-9a-z_]*)(.*)}
    while {[regexp $pattern $text -> api text]} {
      if {[info exists blacklist($api)]==0} {
        lappend res "struct ${api}"
      }
    }

    set pattern {#define +(SQLITE_[0-9A-Z_]*)(.*)}
    while {[regexp $pattern $text -> api text]} {
      if {[info exists blacklist($api)]==0} {
        lappend res "${api}"
      }
    }
  }

  set res [lsort -uniq $res]
  return [join $res ", "]
}

proc c3ref_document_text {dom} {
  set text ""
  set bStartsearch 0
  set bBlockquote 0
  [$dom root] foreach_descendent N {
    if {$bStartsearch==0} {
      if {[$N tag]=="div" && [$N attr -default "" class]=="startsearch"} {
        set bStartsearch 1
      }
    } elseif {[$N tag]==""} {
      set bAppend 1
      for {set P [$N parent]} {$P!=""} {set P [$P parent]} {
        if {[$P attr -default "" class]=="nosearch"} { set bAppend 0 }
        if {[$P tag]=="blockquote" } { set bAppend 0 }
      }
      if {$bAppend} {append text [$N text]}
    }
  }
  return $text
}

proc c3ref_document_import {doc} {
  set dom [::hdom::parse [readfile $doc]]

  # Find the <title> tag and extract the title.
  set title [extract_title $dom $doc]
  set title "C API: $title"

  set text [c3ref_document_text $dom]
  set apis [c3ref_document_apis $dom]

  # Insert into the database.
  insert_entry $doc $apis $title $text
}

proc generic_document_import {doc} {
  set dom [::hdom::parse [readfile $doc]]

  # Find the <title> tag and extract the title.
  set title [extract_title $dom $doc]

  set text [lang_document_text $dom]

  # Insert into the database.
  insert_entry $doc {} $title $text
}

proc rebuild_database {} {

  db transaction {
    # Create the database schema. If the schema already exists, then those
    # tables that contain document data are dropped and recreated by this
    # proc. The 'config' table is left untouched.
    #
    db eval {
      CREATE TABLE IF NOT EXISTS config(item TEXT, value TEXT);

      DROP TABLE IF EXISTS page;
      CREATE VIRTUAL TABLE page USING fts5(
        apis,                                      -- C APIs
        title,                                     -- Title (or first heading)
        content,                                   -- Complete document text
        url UNINDEXED,                             -- Indexed URL
        tokenize='porter unicode61 tokenchars _'   -- Built-in porter tokenizer
      );
    }

    foreach doc [document_list lang] {
      puts "Indexing $doc..."
      lang_document_import $doc
    }

    foreach doc [document_list c3ref] {
      puts "Indexing $doc..."
      c3ref_document_import $doc
    }

    foreach doc [document_list generic] {
      puts "Indexing $doc..."
      generic_document_import $doc
    }

    db eval { INSERT INTO page(page) VALUES('optimize') }
  }

  db eval VACUUM
}

cd doc
sqlite3 db search.db
rebuild_database
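The first loop in `c3ref_document_apis` repeatedly applies the pattern `(sqlite3[0-9a-z_]*) *\((.*)` to pull function names out of the prototype text, continuing each time from just after the matched `(`. As a rough illustration of what that loop extracts, here is a hand-written C scanner for the same pattern. The `next_api` and `is_api_char` names are hypothetical; the blacklist, the `()` suffix, the `typedef struct` and `#define` patterns, and the final `lsort -uniq` from the Tcl code are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters allowed after the "sqlite3" prefix: [0-9a-z_] */
static int is_api_char(char c){
  return (c>='0' && c<='9') || (c>='a' && c<='z') || c=='_';
}

/* Find the next identifier that starts with "sqlite3" and is followed by
** optional spaces and '('. On success, copy the identifier (truncated to
** fit) into zOut, advance *pz to just past the '(' and return 1.
** Return 0, leaving *pz unchanged, when there is no further match. */
static int next_api(const char **pz, char *zOut, size_t nOut){
  const char *z = *pz;
  assert( nOut>0 );
  while( (z = strstr(z, "sqlite3"))!=0 ){
    const char *zEnd = z + 7;              /* 7 == strlen("sqlite3") */
    const char *zParen;
    while( is_api_char(*zEnd) ) zEnd++;
    zParen = zEnd;
    while( *zParen==' ' ) zParen++;
    if( *zParen=='(' ){
      size_t n = (size_t)(zEnd - z);
      if( n>=nOut ) n = nOut-1;
      memcpy(zOut, z, n);
      zOut[n] = 0;
      *pz = zParen+1;
      return 1;
    }
    z++;                                   /* no '(' here: try further on */
  }
  return 0;
}
```

Type names such as `sqlite3 **db` are skipped because they are not followed by an opening parenthesis, which is the same property the regular expression relies on.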
Added search/hdom.tcl.
#-------------------------------------------------------------------------
#
# SUMMARY:
#
#   set doc [hdom parse HTML]
#
# DOCUMENT OBJECT API:
#
#   $doc root
#     Return the root node of the document.
#
#   $doc destroy
#     Destroy DOM object
#
# NODE OBJECT API:
#
#   $node tag
#     Return the nodes tag type. Always lower-case. Empty string for text.
#
#   $node children
#     Return a list of the nodes children.
#
#   $node text
#     For a text node, return the text. For any other node, return the
#     concatenation of the text belonging to all descendent text nodes
#     (in document order).
#
#   $node parent
#     Return the nodes parent node.
#
#   $node offset
#     Return the byte offset of the node within the document (if any).
#
#   $node foreach_descendent VARNAME SCRIPT
#     Iterate through all nodes in the sub-tree headed by $node. $node
#     itself is not visited.
#
#   $node attr ?-default VALUE? ATTR
#
#   $node search PATTERN
#

load ./parsehtml.so

#-------------------------------------------------------------------------
# Throw an exception if the expression passed as the only argument does
# not evaluate to true.
#
proc assert {condition} {
  uplevel [list if "! ($condition)" [list error "assert failed: $condition"]]
}

#--------------------------------------------------------------------------
#
# A parsed HTML document tree is store in a single array object. Each node
# is stored in three array entries:
#
#     O($id,tag)       ("" for text, tag name for other nodes)
#     O($id,detail)    (text data for text, key-value attribute list for others)
#     O($id,children)  (list of child ids)
#     O($id,parent)    (parent node id)
#     O($id,offset)    (byte offset within original document text)
#
# Each node is identified by its key in the array. The root node's key is
# stored in O(root). All nodes have automatically generated keys.
#

namespace eval hdom {
  variable iNextid 0

  # Ignore all tags in the aIgnore[] array.
  variable aIgnore
  set aIgnore(html) 1
  set aIgnore(/html) 1
  set aIgnore(!doctype) 1

  # All inline tags.
  variable aInline
  foreach x {
    tt i b big small u em strong dfn code samp kbd var cite abbr acronym
    a img object br script map q sub sup span bdo input select textarea
    label button
  } { set aInline($x) 1 }

  variable aContentChecker
  set aContentChecker(p)      HtmlInlineContent
  set aContentChecker(th)     HtmlTableCellContent
  set aContentChecker(td)     HtmlTableCellContent
  set aContentChecker(tr)     HtmlTableRowContent
  set aContentChecker(table)  HtmlTableContent
  set aContentChecker(a)      HtmlAnchorContent
  set aContentChecker(ul)     HtmlUlContent
  set aContentChecker(ol)     HtmlUlContent
  set aContentChecker(menu)   HtmlUlContent
  set aContentChecker(dir)    HtmlUlContent
  set aContentChecker(form)   HtmlFormContent
  set aContentChecker(option) HtmlPcdataContent

  # Add content checkers for all self-closing tags.
  foreach x {
    area base br hr iframe img input isindex link meta param script style
    embed nextid wbr bgsound
  } {
    set aContentChecker($x) HtmlEmptyContent
  }

  namespace export parse
  namespace ensemble create
}

proc ::hdom::nextNodeId {} {
  variable iNextid
  set res "::hdom::node_$iNextid"
  incr iNextid
  return $res
}

proc ::hdom::nextDocId {} {
  variable iNextid
  set res "::hdom::doc_$iNextid"
  incr iNextid
  return $res
}

# Return "close" if the content is not Ok. Or "parent" if it is ok, but the
# caller should check the parent. Or "ok" if is unconditionally Ok.
#
proc ::hdom::HtmlInlineContent {tag} {
  variable aInline
  if {$tag == ""} { return "ok" }
  if {[info exists aInline($tag)]} { return "parent" }
  return "close"
}

proc ::hdom::HtmlEmptyContent {tag} {
  return "close"
}

proc ::hdom::HtmlTableCellContent {tag} {
  if {$tag == "th" || $tag == "td" || $tag == "tr"} { return "close" }
  return "parent"
}

proc ::hdom::HtmlTableRowContent {tag} {
  if {$tag == "tr"} { return "close" }
  return "parent"
}

proc ::hdom::HtmlTableContent {tag} {
  if {$tag == "table"} { return "close" }
  return "ok"
}

proc ::hdom::HtmlLiContent {tag} {
  if {$tag == ""} { return "ok" }
  if {$tag == "li" || $tag=="dd" || $tag=="dt"} { return "close" }
  return "parent"
}

proc ::hdom::HtmlAnchorContent {tag} {
  if {$tag == ""} { return "ok" }
  if {$tag == "a"} { return "close" }
  return "parent"
}

proc ::hdom::HtmlUlContent {tag} {
  if {$tag == "" || $tag=="li"} { return "ok" }
  return "parent"
}

proc ::hdom::HtmlDlContent {tag} {
  if {$tag == "dd" || $tag=="dt" || $tag==""} { return "ok" }
  return "parent"
}

proc ::hdom::HtmlFormContent {tag} {
  if {$tag == "tr" || $tag=="td" || $tag=="th"} { return "close" }
  return "parent"
}

proc ::hdom::HtmlPcdataContent {tag} {
  if {$tag == ""} { return "parent" }
  return "close"
}

proc ::hdom::parsehtml_cb {arrayname tag detail offset endoffset} {
  variable aIgnore
  variable aContentChecker
  upvar $arrayname O

  # Fold the tag name to lower-case.
  set tag [string tolower $tag]

  # Ignore <html> and </html> tags.
  if {[info exists aIgnore($tag)]} return

  # An explicit close tag. Search for a tag to close.
  if { [string range $tag 0 0]=="/" } {
    set match [string range $tag 1 end]
    for {set id $O(current)} {$id!=""} {set id $O($id,parent)} {
      if {$O($id,tag)==$match} break
    }

    # The closing tag matches node $id. So the new current node is its parent.
    if {$id!=""} {
      assert {$id!=$O(root)}
      set O(current) $O($id,parent)
    }
    return
  }

  # Check for implicit close tags.
  if {$tag!=""} {
    for {set id $O(current)} {$id!=""} {set id $O($id,parent)} {
      set ptag $O($id,tag)
      if { [info exists aContentChecker($ptag)] } {
        switch -- [$aContentChecker($ptag) $tag] {
          "parent" {
            # no-op
          }

          "close" {
            # Close tag $id
            assert {$id!=$O(root)}
            set O(current) $O($id,parent)
          }

          "ok" {
            # Break out of the for(...) loop
            break
          }

          default {
            error "content checker $aContentChecker($ptag) failed"
          }
        }
      }
    }
  }

  # Add the new node to the database.
  set newid [nextNodeId]
  set O($newid,tag) $tag
  set O($newid,detail) $detail
  set O($newid,children) [list]
  set O($newid,parent) $O(current)
  set O($newid,offset) $offset

  # Link it into its parent's child array.
  lappend O($O(current),children) $newid

  if {$tag != ""} {
    set O(current) $newid
  }
}

# Node method [$node tag]
#
proc ::hdom::nm_tag {arrayname id} {
  upvar $arrayname O
  return $O($id,tag)
}

# Node method [$node parent]
#
proc ::hdom::nm_parent {arrayname id} {
  upvar $arrayname O
  return [create_node_command $arrayname $O($id,parent)]
}

# Node method [$node children]
#
proc ::hdom::nm_children {arrayname id} {
  upvar $arrayname O
  foreach c $O($id,children) {
    create_node_command $arrayname $c
  }
  return $O($id,children)
}

proc ::hdom::foreach_desc {arrayname id varname script level} {
  upvar $arrayname O
  foreach c $O($id,children) {
    create_node_command $arrayname $c
    uplevel $level [list set $varname $c]

    set rc [catch { uplevel $level $script } msg info]
    if {$rc == 0 || $rc == 4} {
      # TCL_OK or TCL_CONTINUE Do nothing
    } elseif {$rc == 3} {
      # TCL_BREAK
      return 1
    } else {
      # TCL_RETURN or TCL_ERROR.
      return -options $info
    }

    if {[foreach_desc $arrayname $c $varname $script [expr $level+1]]} {
      return 1
    }
  }
  return 0
}

# Node method [$node foreach_descendent]
#
proc ::hdom::nm_foreach_descendent {arrayname id varname script} {
  foreach_desc $arrayname $id $varname $script 2
  return ""
}

# Node method [$node text]
#
proc ::hdom::nm_text {arrayname id} {
  upvar $arrayname O
  if { $O($id,tag)=="" } {
    return $O($id,detail)
  }

  set ret ""
  $id foreach_descendent N {
    if {[$N tag] == ""} { append ret [$N text] }
  }

  return $ret
}

# Node method [$node offset]
#
proc ::hdom::nm_offset {arrayname id} {
  upvar $arrayname O
  return $O($id,offset)
}

# Node method: $node attr ?-default VALUE? ?ATTR?
#
proc ::hdom::nm_attr {arrayname id args} {
  upvar $arrayname O

  set dict $O($id,detail)
  if {[llength $args]==0} { return $dict }
  if {[llength $args]==1} {
    set nm [lindex $args 0]
    if {[catch { set res [dict get $dict $nm] }]} {
      error "no such attribute: $nm"
    }
  } else {
    if {[lindex $args 0] != "-default"} {
      error "expected \"-default\" got \"[lindex $args 0]\""
    }
    set nm [lindex $args 2]
    if {[catch { set res [dict get $dict $nm] }]} {
      set res [lindex $args 1]
    }
  }

  return $res
}

proc ::hdom::nodematches {N pattern} {
  set tag [$N tag]
  if {[string compare $pattern $tag]==0} { return 1 }
  return 0
}

# Node method: $node search PATTERN
#
proc ::hdom::nm_search {arrayname id pattern} {
  set ret [list]
  $id foreach_descendent N {
    if {[::hdom::nodematches $N $pattern]} {
      lappend ret $N
    }
  }
  set ret
}

proc ::hdom::dm_root {arrayname} {
  upvar $arrayname O
  return [create_node_command $arrayname $O(root)]
}

# Document method [$doc destroy]
#
proc ::hdom::dm_destroy {arrayname} {
  upvar $arrayname O
  proc $arrayname {method args} {}
  catch { uplevel [list array unset $arrayname ] }
}

proc ::hdom::tohtml {arrayname {node {}}} {
  upvar $arrayname O
  if {$node==""} {set node $O(root)}

  set tag $O($node,tag)
  if {$tag == ""} {
    return $O($node,detail)
  }
}

proc ::hdom::node_method {arrayname id method args} {
  uplevel ::hdom::nm_$method $arrayname $id $args
}

proc ::hdom::document_method {arrayname method args} {
  uplevel ::hdom::dm_$method $arrayname $args
}

# Return the name of the command for node $id, part of document $arrayname.
#
proc ::hdom::create_node_command {arrayname id} {
  if { [llength [info commands $id]]==0} {
    proc $id {method args} [subst -nocommands {
      uplevel ::hdom::node_method $arrayname $id [set method] [set args]
    }]
  }
  return $id
}

# Parse the html document passed as the first argument.
#
proc ::hdom::parse {html} {
  set doc [nextDocId]
  variable $doc
  upvar 0 $doc O

  set root [nextNodeId]

  # Add the root node to the tree.
  #
  set O($root,tag) html
  set O($root,detail) [list]
  set O($root,children) [list]
  set O($root,parent) ""

  # Setup the other state data for the parse.
  #
  set O(current) $root
  set O(root) $root

  parsehtml $html [list parsehtml_cb O]

  # Create the document object command.
  #
  proc $doc {method args} [subst -nocommands {
    uplevel ::hdom::document_method $doc [set method] [set args]
  }]
  return $doc
}
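The implicit-close loop in `::hdom::parsehtml_cb` walks from the current node up through its ancestors and asks each ancestor's content checker whether the incoming tag may live inside it: "ok" stops the walk, "parent" defers to the next ancestor, and "close" makes that ancestor's parent the new current node. The following C sketch mirrors that walk with a plain stack of open tag names. It is an illustration under simplifying assumptions, not a port: `OpenStack`, `open_tag`, and `checker_for` are hypothetical names, only the `p`, `td`/`th`, `tr` and `table` checkers are modelled, the inline-tag list is shortened, and text nodes and the undeletable root node are not represented.

```c
#include <assert.h>
#include <string.h>

typedef enum { CHK_OK, CHK_PARENT, CHK_CLOSE } Verdict;
typedef Verdict (*Checker)(const char *zTag);

/* A few inline tags only; hdom.tcl lists many more. */
static int is_inline(const char *zTag){
  static const char *const aInline[] = { "a", "b", "i", "em", "code", "span", 0 };
  int i;
  for(i=0; aInline[i]; i++){
    if( strcmp(aInline[i], zTag)==0 ) return 1;
  }
  return 0;
}

/* <p> may only contain inline elements. */
static Verdict check_p(const char *zTag){
  return is_inline(zTag) ? CHK_PARENT : CHK_CLOSE;
}
/* A table cell is closed by the next cell or row. */
static Verdict check_cell(const char *zTag){
  if( !strcmp(zTag,"td") || !strcmp(zTag,"th") || !strcmp(zTag,"tr") ){
    return CHK_CLOSE;
  }
  return CHK_PARENT;
}
/* A table row is closed by the next row. */
static Verdict check_row(const char *zTag){
  return strcmp(zTag,"tr")==0 ? CHK_CLOSE : CHK_PARENT;
}
/* A table stops the upward walk unless another <table> opens. */
static Verdict check_table(const char *zTag){
  return strcmp(zTag,"table")==0 ? CHK_CLOSE : CHK_OK;
}

/* Map an open element to its checker, or 0 if it has none. */
static Checker checker_for(const char *zOpen){
  if( strcmp(zOpen,"p")==0 ) return check_p;
  if( strcmp(zOpen,"td")==0 || strcmp(zOpen,"th")==0 ) return check_cell;
  if( strcmp(zOpen,"tr")==0 ) return check_row;
  if( strcmp(zOpen,"table")==0 ) return check_table;
  return 0;
}

#define MAX_DEPTH 32
typedef struct OpenStack {
  const char *aTag[MAX_DEPTH];   /* open elements, outermost first */
  int n;                         /* number of open elements */
} OpenStack;

/* Open zTag after applying implicit closes. Returns the new depth. */
static int open_tag(OpenStack *p, const char *zTag){
  int i;
  for(i=p->n-1; i>=0; i--){
    Checker x = checker_for(p->aTag[i]);
    Verdict v;
    if( x==0 ) continue;              /* no checker: ask the next ancestor */
    v = x(zTag);
    if( v==CHK_OK ) break;            /* unconditionally fine: stop walking */
    if( v==CHK_CLOSE ) p->n = i;      /* close ancestor i and all below it */
  }
  assert( p->n<MAX_DEPTH );
  p->aTag[p->n++] = zTag;
  return p->n;
}
```

With these checkers, `<p><b><p>` ends with the second paragraph as a sibling of the first rather than a descendant, and a second `<td>` closes the previous cell while staying inside the same row.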
Changes to search/parsehtml.c.
/*
** This file contains the [parsehtml] command, a helper command used to extract
** text and markup tags from the HTML documents in the documentation.
*/

#include <tcl.h>
#include <string.h>
#include <strings.h>
#include <assert.h>
#include <ctype.h>
#include <math.h>

#define ISSPACE(c) (((c)&0x80)==0 && isspace(c))

#include "sqlite3.h"

typedef unsigned int u32;
typedef unsigned char u8;
typedef sqlite3_uint64 u64;

static int doTagCallback(
  Tcl_Interp *interp,
  Tcl_Obj **aCall,
  int nElem,
  const char *zTag, int nTag,
  int iOffset, int iEndOffset,
  Tcl_Obj *pParam
){
  int rc;
  Tcl_Obj *pArg = pParam;
  if( pArg==0 ) pArg = Tcl_NewObj();
  Tcl_IncrRefCount( aCall[nElem]   = Tcl_NewStringObj(zTag, nTag) );
  Tcl_IncrRefCount( aCall[nElem+1] = pArg );
  Tcl_IncrRefCount( aCall[nElem+2] = Tcl_NewIntObj(iOffset) );
  Tcl_IncrRefCount( aCall[nElem+3] = Tcl_NewIntObj(iEndOffset) );
  rc = Tcl_EvalObjv(interp, nElem+4, aCall, 0);
  Tcl_DecrRefCount( aCall[nElem] );
  Tcl_DecrRefCount( aCall[nElem+1] );
  Tcl_DecrRefCount( aCall[nElem+2] );
  Tcl_DecrRefCount( aCall[nElem+3] );
  return rc;
}

static int doTextCallback(
  Tcl_Interp *interp,
  Tcl_Obj **aCall,
  int nElem,
  const char *zText, int nText,
  int iOffset, int iEndOffset
){
  Tcl_Obj *pText = Tcl_NewStringObj(zText, nText);
  return doTagCallback(interp, aCall, nElem, "", 0, iOffset, iEndOffset, pText);
}

/*
** Tcl command: parsehtml HTML SCRIPT
*/
static int parsehtmlcmd(
  ClientData clientData,
  Tcl_Interp *interp,
︙
    Tcl_WrongNumArgs(interp, 1, objv, "HTML SCRIPT");
    return TCL_ERROR;
  }
  zHtml = Tcl_GetString(objv[1]);

  rc = Tcl_ListObjGetElements(interp, objv[2], &nElem, &aElem);
  if( rc!=TCL_OK ) return rc;
  aCall = (Tcl_Obj **)ckalloc(sizeof(Tcl_Obj *)*(nElem+4));
  memcpy(aCall, aElem, sizeof(Tcl_Obj *)*nElem);
  memset(&aCall[nElem], 0, 3*sizeof(Tcl_Obj*));

  z = zHtml;
  while( *z ){
    char *zText = z;
    while( *z && *z!='<' ) z++;

    /* Invoke the callback script for the chunk of text just parsed. */
    Tcl_IncrRefCount( aCall[nElem]   = Tcl_NewObj() );
    Tcl_IncrRefCount( aCall[nElem+1] = Tcl_NewStringObj(zText, z-zText) );
    Tcl_IncrRefCount( aCall[nElem+2] = Tcl_NewIntObj(zText - zHtml) );
    Tcl_IncrRefCount( aCall[nElem+3] = Tcl_NewIntObj(z - zHtml) );
    rc = Tcl_EvalObjv(interp, nElem+4, aCall, 0);
    Tcl_DecrRefCount( aCall[nElem] );
    Tcl_DecrRefCount( aCall[nElem+1] );
    Tcl_DecrRefCount( aCall[nElem+2] );
    Tcl_DecrRefCount( aCall[nElem+3] );
    if( rc!=TCL_OK ) return rc;

    /* Unless is at the end of the document, z now points to the start of a
    ** markup tag. Either an opening or a closing tag. Parse it up and
    ** invoke the callback script. */
    if( *z ){
      int nTag;
      char *zTag;
      int iOffset;                /* Offset of open tag (the '<' character) */

      assert( *z=='<' );
      iOffset = z - zHtml;
      z++;

      while( ISSPACE(*z) ) z++;
      zTag = z;

      while( *z && !ISSPACE(*z) && *z!='>' ) z++;
      nTag = z-zTag;
︙
          }
          Tcl_ListObjAppendElement(interp,pParam,Tcl_NewStringObj(zVal,nVal));
        }else if( zAttr ){
          Tcl_ListObjAppendElement(interp, pParam, Tcl_NewIntObj(1));
        }
      }

      rc = doTagCallback(interp,
          aCall, nElem, zTag, nTag, iOffset, 1+z-zHtml, pParam
      );
      if( rc!=TCL_OK ) return rc;

      if( nTag==3 && memcmp(zTag, "tcl", 3)==0 ){
        const char *zText = &z[1];
        while( *z && strncasecmp("</tcl>", z, 6) ) z++;
        rc = doTextCallback(interp, aCall, nElem, zText, z-zText, 0, 0);
        if( rc!=TCL_OK ) return rc;
        rc = doTagCallback(interp, aCall, nElem, "/tcl", 4, 0, 0, 0);
        if( rc!=TCL_OK ) return rc;
        if( *z ) z++;
      }
    }

    while( *z && !ISSPACE(*z) && *z!='>' ) z++;
    if( *z ) z++;
︙
#ifdef USE_TCL_STUBS
  if (Tcl_InitStubs(interp, "8.4", 0) == 0) {
    return TCL_ERROR;
  }
#endif

  Tcl_CreateObjCommand(interp, "parsehtml", parsehtmlcmd, 0, 0);
  return TCL_OK;
}
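The core of `parsehtmlcmd` is an alternating scan: consume text up to the next `<`, report it, then read the tag name up to whitespace or `>`, and report that. The sketch below reproduces only this scanning shape in plain C, with no Tcl dependency, so it can be compiled and experimented with in isolation. The `scan_html`, `token_cb`, `Log` and `log_token` names are hypothetical; unlike the real command, the sketch skips empty text chunks, ignores attributes entirely (so a `>` inside a quoted attribute value ends the tag early), reports no offsets, and has no special handling for `<tcl>` blocks.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same idea as the ISSPACE() macro in parsehtml.c: bytes with the high bit
** set are never treated as whitespace. */
#define ISSPACE(c) (((c)&0x80)==0 && isspace((unsigned char)(c)))

/* Hypothetical callback: nTag==0 means a text chunk, otherwise a tag. */
typedef void (*token_cb)(void *pCtx, const char *zTag, int nTag,
                         const char *zText, int nText);

static void scan_html(const char *zHtml, token_cb xCb, void *pCtx){
  const char *z = zHtml;
  assert( zHtml!=0 && xCb!=0 );
  while( *z ){
    /* 1. Text up to the next '<' (or end of input). */
    const char *zText = z;
    while( *z && *z!='<' ) z++;
    if( z>zText ) xCb(pCtx, "", 0, zText, (int)(z-zText));

    /* 2. A tag: name runs up to whitespace or '>'. */
    if( *z ){
      const char *zTag;
      z++;                                  /* skip '<' */
      while( ISSPACE(*z) ) z++;
      zTag = z;
      while( *z && !ISSPACE(*z) && *z!='>' ) z++;
      xCb(pCtx, zTag, (int)(z-zTag), "", 0);
      while( *z && *z!='>' ) z++;           /* skip attributes, unparsed */
      if( *z ) z++;                         /* skip '>' */
    }
  }
}

/* Example callback: append "<tag>" or "[text]" to a fixed-size log. */
typedef struct Log { char buf[256]; size_t n; } Log;

static void log_token(void *pCtx, const char *zTag, int nTag,
                      const char *zText, int nText){
  Log *p = (Log*)pCtx;
  size_t nAvail = sizeof(p->buf) - p->n;
  int w;
  if( nTag>0 ){
    w = snprintf(p->buf + p->n, nAvail, "<%.*s>", nTag, zTag);
  }else{
    w = snprintf(p->buf + p->n, nAvail, "[%.*s]", nText, zText);
  }
  if( w>0 && (size_t)w<nAvail ) p->n += (size_t)w;
}
```

Closing tags arrive with their leading slash as part of the name (for example `/b`), which is the form `::hdom::parsehtml_cb` tests for when it looks for an explicit close tag.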
Changes to search/search.tcl.
#!/home/dan/bin/tclsqlite3

source [file dirname [info script]]/document_header.tcl

# Decode an HTTP %-encoded string
#
proc percent_decode {str} {
  # rewrite "+" back to space
  # protect \ and [ and ] by quoting with '\'
  set str [string map [list + { } "\\" "\\\\" \[ \\\[ \] \\\]] $str]
︙
}

proc footer {} {
  return {
    <hr>
    <table align=right>
    <td>
      <i>Powered by <a href="http://www.sqlite.org/fts5.html">FTS5</a>.</i>
    </table>
  }
}

#-------------------------------------------------------------------------
# This command is similar to the builtin Tcl [time] command, except that
︙
  proc ttime {script} {
    set t [lindex [time [list uplevel $script]] 0]
    if {$t>1000000} {
      return [format "%.2f s" [expr {$t/1000000.0}]]
    }
    return [format "%.2f ms" [expr {$t/1000.0}]]
  }

  proc rank {matchinfo args} {
>   return 10.0
    binary scan $matchinfo i* I

    set nPhrase [lindex $I 0]
    set nCol [lindex $I 1]

    set G [lrange $I 2 [expr {1+$nCol*$nPhrase}]]
    set L [lrange $I [expr {2+$nCol*$nPhrase}] end]

    foreach a $args { lappend log [expr {log10(100+$a)}] }
    set score 0.0
    set i 0
    foreach l $L g $G {
      if {$l > 0} {
        set div [lindex $log [expr $i%3]]
>       set div 1.0
        set score [expr {$score + (double($l) / double($g)) / $div}]
      }
      incr i
    }
    return $score
  }
︙
    set nRes [db one { SELECT count(*) FROM page WHERE page MATCH $::A(q) }]
  }]
  if {$rc} {
    set ::A(q) "\"$::A(q)\""
    set nRes [db one { SELECT count(*) FROM page WHERE page MATCH $::A(q) }]
  }

  db one { INSERT INTO page(page, rank) VALUES('rank', 'bm25(20.0, 10.0)') }

  # If the user has clicked the "Lucky" button and the query returns one or
  # more results, redirect the browser to the highest ranked result. If the
  # query returns zero results, fall through and display the "No results"
  # page as if the user had clicked "Search".
  #
  if {[info exists ::A(s)] && $::A(s) == "Lucky"} {
︙
    <table border=0>
    <p>Search results [expr $iStart+1]..[expr {($nRes < $iStart+10) ? $nRes : $iStart+10}]
    of $nRes for: <b>[htmlize $::A(q)]</b>
  }]

  db eval {
    SELECT
      COALESCE(NULLIF(title,''), 'No Title.') AS title,
      snippet(page, 0, $open, $close, $ellipsis, 6) AS snippet1,
      snippet(page, 1, $open, $close, '', 40) AS snippet2,
      snippet(page, 2, $open, $close, $ellipsis, 40) AS snippet3,
      url, rank
    FROM page($::A(q)) ORDER BY rank
    LIMIT 10 OFFSET $iStart;
  } {
    #if {$snippet1!=""} { set snippet1 "($snippet1)" }
    append ret [subst -nocommands {<tr>
      <td valign=top style="line-height:150%">
        <div style="white-space:wrap;font-size:larger" class=nounderline>
          <xi><a href="$url">$snippet2</a> </i>
        </div>
        <div style="margin-left: 10ex; font:larger monospace">$snippet1</div>
        <div style="ffont-size:small;margin-left: 2ex">
          <div> $snippet3 </div>
          <div style="margin-left:1em; margin-bottom:1em"><a href="$url">$url</a></div>
        </div>
      </td>
    }]
  }
  append ret { </table> }

  # If the query returned more than 10 results, add up to 10 links to
  # each set of 10 results (first link to results 1-10, second to 11-20,
︙
proc main {} {
  global A
  sqlite3 db search.db
  cgi_parse_args

  db transaction {
    set t [ttime {
      if {[catch searchresults srchout]} {
        set A(q) [string tolower $A(q)]
        set srchout [searchresults]
      }
      set doc "[searchform] $srchout [footer]"
    }]
  }
  append doc "<p>Page generated in $t."
  return $doc

  # return [cgi_env_dump]
}

#=========================================================================

source [file dirname [info script]]/document_header.tcl

if {![info exists env(REQUEST_METHOD)]} {
  set env(REQUEST_METHOD) GET
  set env(QUERY_STRING) rebuild=1
  set ::HEADER ""

  #set env(QUERY_STRING) {q="one+two+three+four"+eleven}
  set env(QUERY_STRING) {q=windows}
  set ::HEADER ""
}

set TITLE "Search SQLite Documentation"

if {0==[catch main res]} {
  if {[info exists ::A(q)]} {
    set initsearch [attrize $::A(q)]
  } else {
    set initsearch {}
  }
  set document [document_header {$TITLE} "" $initsearch]
  append document $res
} else {
  set document "<pre>"
  append document "Error: $res\n\n"
  append document $::errorInfo
  append document "</pre>"
}
︙
Changes to wrap.tcl.
︙
  #
  set DOC [lindex $argv 0]
  set SRC [lindex $argv 1]
  set DEST [lindex $argv 2]
  set HOMEDIR [pwd]            ;# Also remember our home directory.

  source [file dirname [info script]]/pages/fancyformat.tcl
> source [file dirname [info script]]/document_header.tcl

  # Open the SQLite database.
  #
  sqlite3 db docinfo.db
  db eval {
    ATTACH 'history.db' AS history;
    BEGIN;
︙
    if {$srcfile==""} {
      set fd $hd(aux)
      set path $hd(rootpath-aux)
    } else {
      set fd $hd(main)
      set path $hd(rootpath-main)
    }
    puts $fd [document_header $title $path]
    putsin4 $fd {
      <div class=startsearch></div>
    }

    if {$srcfile!=""} {
      if {[file exists DRAFT]} {
        set hd(footer) {
          <p align="center"><font size="6" color="red">*** DRAFT ***</font></p>
︙