Documentation Source Text

Check-in [106ae9b8df]

Overview
Comment: Add a documentation page that overviews Lemon, its history, and its importance to SQLite.
SHA3-256: 106ae9b8dfcb58082e3fa402a7dd12a14495291bdf149cff18ba1cf758e00569
User & Date: drh 2018-01-04 16:33:54.898
Context
2018-01-04
16:37 Fix typo in the new lemon document. (check-in: ca3748636f user: drh tags: trunk)
16:33 Add a documentation page that overviews Lemon, its history, and its importance to SQLite. (check-in: 106ae9b8df user: drh tags: trunk)
03:33 Update the change log for the 3.22.0 release. (check-in: a897222d15 user: drh tags: trunk)
Changes
Changes to pages/amalgamation.in.
Before (lines 31-45):
in the [https://www.sqlite.org/src | SQLite version control system]
and are edited manually in an ordinary text editor.
But some of the C-language files are generated using scripts
or auxiliary programs.  For example, the
[https://www.sqlite.org/src/artifact?ci=trunk&filename=src/parse.y|parse.y]
file contains an LALR(1) grammar of the SQL language which is compiled
down into are parser in files "parse.c" and "parse.h" by the
[https://www.sqlite.org/src/doc/trunk/doc/lemon.html|Lemon] parser generator.
</p>

<p>The makefiles for SQLite have an "sqlite3.c" target for building the
file we call "the amalgamation".
The amalgamation is a single C code file, named "sqlite3.c",
that contains all C code 
for the core SQLite library and the [FTS3], [FTS5], [RTREE],

After (lines 31-45):
in the [https://www.sqlite.org/src | SQLite version control system]
and are edited manually in an ordinary text editor.
But some of the C-language files are generated using scripts
or auxiliary programs.  For example, the
[https://www.sqlite.org/src/artifact?ci=trunk&filename=src/parse.y|parse.y]
file contains an LALR(1) grammar of the SQL language which is compiled
down into are parser in files "parse.c" and "parse.h" by the
[Lemon parser generator].
</p>

<p>The makefiles for SQLite have an "sqlite3.c" target for building the
file we call "the amalgamation".
The amalgamation is a single C code file, named "sqlite3.c",
that contains all C code 
for the core SQLite library and the [FTS3], [FTS5], [RTREE],
Changes to pages/arch.in.
Before (lines 75-90):
the tokenizer call the parser is better, though, because it can be made
threadsafe and it runs faster.</p>

<h1>Parser</h1>

<p>The parser assigns meaning to tokens based on
their context.  The parser for SQLite is generated using the
<a href="https://www.sqlite.org/src/doc/trunk/doc/lemon.html">Lemon</a>
LALR(1) parser generator.
Lemon does the same job as YACC/BISON, but it uses
a different input syntax which is less error-prone.
Lemon also generates a parser which is reentrant and thread-safe.
And Lemon defines the concept of a non-terminal destructor so
that it does not leak memory when syntax errors are encountered.
The grammar file that drives Lemon and that defines the SQL language
that SQLite understands is found in <file>parse.y</file>.

After (lines 75-89):
the tokenizer call the parser is better, though, because it can be made
threadsafe and it runs faster.</p>

<h1>Parser</h1>

<p>The parser assigns meaning to tokens based on
their context.  The parser for SQLite is generated using the

[Lemon parser generator].
Lemon does the same job as YACC/BISON, but it uses
a different input syntax which is less error-prone.
Lemon also generates a parser which is reentrant and thread-safe.
And Lemon defines the concept of a non-terminal destructor so
that it does not leak memory when syntax errors are encountered.
The grammar file that drives Lemon and that defines the SQL language
that SQLite understands is found in <file>parse.y</file>.
Changes to pages/changes.in.
Before (lines 18-32):
  global nChng aChng xrefChng
  set aChng($nChng) [list $date $desc $options]
  set xrefChng($date) $nChng
  incr nChng
}

chng {2018-02-00 (3.22.0)} {
<li> The output of [sqlite3_trace()] now shows each individual SQL statements
     run within a trigger.
<li> Add the ability to read from [WAL mode] databases even if the application 
     lacks write permission on the database and its containing directory, as long as
     the -shm and -wal files exist in that directory.
<li> Added the [rtreecheck()] scalar SQL function to the [R-Tree extension].
<li> Query planner enhancements:
<ol type='a'>

After (lines 18-32):
  global nChng aChng xrefChng
  set aChng($nChng) [list $date $desc $options]
  set xrefChng($date) $nChng
  incr nChng
}

chng {2018-02-00 (3.22.0)} {
<li> The output of [sqlite3_trace_v2()] now shows each individual SQL statements
     run within a trigger.
<li> Add the ability to read from [WAL mode] databases even if the application 
     lacks write permission on the database and its containing directory, as long as
     the -shm and -wal files exist in that directory.
<li> Added the [rtreecheck()] scalar SQL function to the [R-Tree extension].
<li> Query planner enhancements:
<ol type='a'>

Before (lines 43-57):
  <li> Omit unused LEFT JOINs even if they are not the right-most joins
       of a query.
</ol>
<li> Other performance optimizations:
<ol type='a'>
  <li> A smaller and faster implementation of text to floating-point
       conversion subroutine: sqlite3AtoF().
  <li> The LEMON parser generator creates a faster parser.
  <li> Use the strcspn() C-library routine to speed up the LIKE and
       GLOB operators.
str</ol>
<li> Improvements to the [command-line shell]:
<ol type='a'>
  <li> The ".schema" command shows the structure of virtual tables
       inside of a comment.

After (lines 43-57):
  <li> Omit unused LEFT JOINs even if they are not the right-most joins
       of a query.
</ol>
<li> Other performance optimizations:
<ol type='a'>
  <li> A smaller and faster implementation of text to floating-point
       conversion subroutine: sqlite3AtoF().
  <li> The [Lemon parser generator] creates a faster parser.
  <li> Use the strcspn() C-library routine to speed up the LIKE and
       GLOB operators.
str</ol>
<li> Improvements to the [command-line shell]:
<ol type='a'>
  <li> The ".schema" command shows the structure of virtual tables
       inside of a comment.

Before (lines 443-457):
    extension.
<li>In the [command-line shell], enhance the ".mode" command so that it
    restores the default column and row separators for modes "line",
    "list", "column", and "tcl". 
<li>Enhance the [SQLITE_DIRECT_OVERFLOW_READ] option so that it works
    in [WAL mode] as long as the pages being read are not in the WAL file.
<li>Enhance the 
    [https://www.sqlite.org/src/doc/trunk/doc/lemon.html|LEMON parser generator]
    so that it can store the parser object as a stack variable rather than 
    allocating space from the heap and make use of that enhancement in
    the [amalgamation].
<li>Other performance improvements. Uses about [CPU cycles used|6.5% fewer CPU cycles].
<p><b>Bug Fixes:</b>
<li>Throw an error if the ON clause of a LEFT JOIN references tables
    to the right of the ON clause.  This is the same behavior as

After (lines 443-457):
    extension.
<li>In the [command-line shell], enhance the ".mode" command so that it
    restores the default column and row separators for modes "line",
    "list", "column", and "tcl". 
<li>Enhance the [SQLITE_DIRECT_OVERFLOW_READ] option so that it works
    in [WAL mode] as long as the pages being read are not in the WAL file.
<li>Enhance the 
    [Lemon parser generator]
    so that it can store the parser object as a stack variable rather than 
    allocating space from the heap and make use of that enhancement in
    the [amalgamation].
<li>Other performance improvements. Uses about [CPU cycles used|6.5% fewer CPU cycles].
<p><b>Bug Fixes:</b>
<li>Throw an error if the ON clause of a LEFT JOIN references tables
    to the right of the ON clause.  This is the same behavior as

Before (lines 659-673):
<li>Added the [SQLITE_DBSTATUS_CACHE_USED_SHARED] option to [sqlite3_db_status()].
<li>Add the 
    [https://www.sqlite.org/src/artifact?ci=trunk&filename=ext/misc/vfsstat.c|vfsstat.c]
    loadable extension - a VFS shim that measures I/O
    together with an [eponymous virtual table] that provides access to the measurements.
<li>Improved algorithm for running queries with both an ORDER BY and a LIMIT where
    only the inner-most loop naturally generates rows in the correct order.
<li>Enhancements to Lemon parser generator, so that it generates a
    faster parser.
<li>The [PRAGMA compile_options] command now attempts to show the version number
    of the compiler that generated the library.
<li>Enhance [PRAGMA table_info] so that it provides information about
    [eponymous virtual tables].
<li>Added the "win32-none" VFS, analogous to the "unix-none" VFS, that works like
    the default "win32" VFS except that it ignores all file locks.

After (lines 659-673):
<li>Added the [SQLITE_DBSTATUS_CACHE_USED_SHARED] option to [sqlite3_db_status()].
<li>Add the 
    [https://www.sqlite.org/src/artifact?ci=trunk&filename=ext/misc/vfsstat.c|vfsstat.c]
    loadable extension - a VFS shim that measures I/O
    together with an [eponymous virtual table] that provides access to the measurements.
<li>Improved algorithm for running queries with both an ORDER BY and a LIMIT where
    only the inner-most loop naturally generates rows in the correct order.
<li>Enhancements to [Lemon parser generator], so that it generates a
    faster parser.
<li>The [PRAGMA compile_options] command now attempts to show the version number
    of the compiler that generated the library.
<li>Enhance [PRAGMA table_info] so that it provides information about
    [eponymous virtual tables].
<li>Added the "win32-none" VFS, analogous to the "unix-none" VFS, that works like
    the default "win32" VFS except that it ignores all file locks.

Before (lines 782-797):
<p><b>Potentially Disruptive Change:</b>
<li>The [SQLITE_DEFAULT_PAGE_SIZE] is increased from 1024 to 4096.  
    The [SQLITE_DEFAULT_CACHE_SIZE] is changed from 2000 to -2000 so 
    the same amount of cache memory is used by default.
    See the application note on the
    [version 3.12.0 page size change] for further information.
<p><b>Performance enhancements:</b>
<li>Enhancements to the [https://www.sqlite.org/src/doc/trunk/doc/lemon.html|Lemon]
    parser generator so that it creates a smaller and faster SQL parser.
<li>Only create [master journal] files if two or more attached databases are all
    modified, do not have [PRAGMA synchronous] set to OFF, and
    do not have the [journal_mode] set to OFF, MEMORY, or WAL.
<li>Only create [statement journal] files when their size exceeds a threshold.
    Otherwise the journal is held in memory and no I/O occurs.  The threshold
    can be configured at compile-time using [SQLITE_STMTJRNL_SPILL] or at
    start-time using [sqlite3_config]([SQLITE_CONFIG_STMTJRNL_SPILL]).

After (lines 782-797):
<p><b>Potentially Disruptive Change:</b>
<li>The [SQLITE_DEFAULT_PAGE_SIZE] is increased from 1024 to 4096.  
    The [SQLITE_DEFAULT_CACHE_SIZE] is changed from 2000 to -2000 so 
    the same amount of cache memory is used by default.
    See the application note on the
    [version 3.12.0 page size change] for further information.
<p><b>Performance enhancements:</b>
<li>Enhancements to the [Lemon parser generator]
    so that it creates a smaller and faster SQL parser.
<li>Only create [master journal] files if two or more attached databases are all
    modified, do not have [PRAGMA synchronous] set to OFF, and
    do not have the [journal_mode] set to OFF, MEMORY, or WAL.
<li>Only create [statement journal] files when their size exceeds a threshold.
    Otherwise the journal is held in memory and no I/O occurs.  The threshold
    can be configured at compile-time using [SQLITE_STMTJRNL_SPILL] or at
    start-time using [sqlite3_config]([SQLITE_CONFIG_STMTJRNL_SPILL]).

Before (lines 1598-1614):
    the filename argument to [ATTACH].
<li>Allow a [VALUES clause] to be used anywhere a [SELECT] statement is valid.
<li>Reseed the PRNG used by [sqlite3_randomness(N,P)] when invoked with N==0.
    Automatically reseed after a fork() on unix.
<li>Enhance the [spellfix1] virtual table so that it can search efficiently by rowid.
<li>Performance enhancements.
<li>Improvements to the comments in the VDBE byte-code display when running [EXPLAIN].
<li>Add the "%token_class" directive to LEMON parser generator and use it to simplify
    the grammar.
<li>Change the LEMON source code to avoid calling C-library functions that OpenBSD
    considers dangerous.  (Ex: sprintf).
<li>Bug fix: In the [command-line shell] CSV import feature, do not end a field
    when an escaped double-quote occurs at the end of a CRLN line.
<li>SQLITE_SOURCE_ID: 
    "2014-02-03 13:52:03 e816dd924619db5f766de6df74ea2194f3e3b538"
<li>SHA1 for sqlite3.c: 98a07da78f71b0275e8d9c510486877adc31dbee
}

After (lines 1598-1614):
    the filename argument to [ATTACH].
<li>Allow a [VALUES clause] to be used anywhere a [SELECT] statement is valid.
<li>Reseed the PRNG used by [sqlite3_randomness(N,P)] when invoked with N==0.
    Automatically reseed after a fork() on unix.
<li>Enhance the [spellfix1] virtual table so that it can search efficiently by rowid.
<li>Performance enhancements.
<li>Improvements to the comments in the VDBE byte-code display when running [EXPLAIN].
<li>Add the "%token_class" directive to [Lemon parser generator] and use it to simplify
    the grammar.
<li>Change the [Lemon] source code to avoid calling C-library functions that OpenBSD
    considers dangerous.  (Ex: sprintf).
<li>Bug fix: In the [command-line shell] CSV import feature, do not end a field
    when an escaped double-quote occurs at the end of a CRLN line.
<li>SQLITE_SOURCE_ID: 
    "2014-02-03 13:52:03 e816dd924619db5f766de6df74ea2194f3e3b538"
<li>SHA1 for sqlite3.c: 98a07da78f71b0275e8d9c510486877adc31dbee
}

Before (lines 2512-2526):
     [SQLITE_CONFIG_LOG] verb to [sqlite3_config()].  The ".log" command
     is added to the [Command Line Interface].
<li> Improvements to [FTS3].
<li> Improvements and bug-fixes in support for [SQLITE_OMIT_FLOATING_POINT].
<li> The [integrity_check pragma] is enhanced to detect out-of-order rowids.
<li> The ".genfkey" operator has been removed from the
     [Command Line Interface].
<li> Updates to the co-hosted Lemon LALR(1) parser generator. (These
     updates did not affect SQLite.)
<li> Various minor bug fixes and performance enhancements.
}

chng {2010-01-06 (3.6.22)} {
<li>Fix bugs that can (rarely) lead to incorrect query results when
    the CAST or OR operators are used in the WHERE clause of a query.

After (lines 2512-2526):
     [SQLITE_CONFIG_LOG] verb to [sqlite3_config()].  The ".log" command
     is added to the [Command Line Interface].
<li> Improvements to [FTS3].
<li> Improvements and bug-fixes in support for [SQLITE_OMIT_FLOATING_POINT].
<li> The [integrity_check pragma] is enhanced to detect out-of-order rowids.
<li> The ".genfkey" operator has been removed from the
     [Command Line Interface].
<li> Updates to the co-hosted [Lemon LALR(1) parser generator]. (These
     updates did not affect SQLite.)
<li> Various minor bug fixes and performance enhancements.
}

chng {2010-01-06 (3.6.22)} {
<li>Fix bugs that can (rarely) lead to incorrect query results when
    the CAST or OR operators are used in the WHERE clause of a query.

Before (lines 3827-3841):
}

chng {2003-12-04 (2.8.7)} {
<li>Added experimental sqlite_bind() and sqlite_reset() APIs.</li>
<li>If the name of the database is an empty string, open a new database
    in a temporary file that is automatically deleted when the database
    is closed.</li>
<li>Performance enhancements in the lemon-generated parser</li>
<li>Experimental date/time functions revised.</li>
<li>Disallow temporary indices on permanent tables.</li>
<li>Documentation updates and typo fixes</li>
<li>Added experimental sqlite_progress_handler() callback API</li>
<li>Removed support for the Oracle8 outer join syntax.</li>
<li>Allow GLOB and LIKE operators to work as functions.</li>
<li>Other minor documentation and makefile changes and bug fixes.</li>

After (lines 3827-3841):
}

chng {2003-12-04 (2.8.7)} {
<li>Added experimental sqlite_bind() and sqlite_reset() APIs.</li>
<li>If the name of the database is an empty string, open a new database
    in a temporary file that is automatically deleted when the database
    is closed.</li>
<li>Performance enhancements in the [Lemon]-generated parser</li>
<li>Experimental date/time functions revised.</li>
<li>Disallow temporary indices on permanent tables.</li>
<li>Documentation updates and typo fixes</li>
<li>Added experimental sqlite_progress_handler() callback API</li>
<li>Removed support for the Oracle8 outer join syntax.</li>
<li>Allow GLOB and LIKE operators to work as functions.</li>
<li>Other minor documentation and makefile changes and bug fixes.</li>

Before (lines 4180-4194):
<li>Change the name of the sanity_check PRAGMA to <b>integrity_check</b>
    and make it available in all compiles.</li>
<li>SELECT min() or max() of an indexed column with no WHERE or GROUP BY
    clause is handled as a special case which avoids a complete table scan.</li>
<li>Automatically generated ROWIDs are now sequential.</li>
<li>Do not allow dot-commands of the command-line shell to occur in the
    middle of a real SQL command.</li>
<li>Modifications to the "lemon" parser generator so that the parser tables
    are 4 times smaller.</li>
<li>Added support for user-defined functions implemented in C.</li>
<li>Added support for new functions: <b>coalesce()</b>, <b>lower()</b>,
    <b>upper()</b>, and <b>random()</b>
<li>Added support for VIEWs.</li>
<li>Added the subquery flattening optimizer.</li>
<li>Modified the B-Tree and Pager modules so that disk pages that do not

After (lines 4180-4194):
<li>Change the name of the sanity_check PRAGMA to <b>integrity_check</b>
    and make it available in all compiles.</li>
<li>SELECT min() or max() of an indexed column with no WHERE or GROUP BY
    clause is handled as a special case which avoids a complete table scan.</li>
<li>Automatically generated ROWIDs are now sequential.</li>
<li>Do not allow dot-commands of the command-line shell to occur in the
    middle of a real SQL command.</li>
<li>Modifications to the [Lemon parser generator] so that the parser tables
    are 4 times smaller.</li>
<li>Added support for user-defined functions implemented in C.</li>
<li>Added support for new functions: <b>coalesce()</b>, <b>lower()</b>,
    <b>upper()</b>, and <b>random()</b>
<li>Added support for VIEWs.</li>
<li>Added the subquery flattening optimizer.</li>
<li>Modified the B-Tree and Pager modules so that disk pages that do not

Before (lines 4507-4521):
<li>Added limited support for transactions.  At this point, transactions
    will do table locking on the GDBM backend.  There is no support (yet)
    for rollback or atomic commit.</li>
<li>Added special column names ROWID, OID, and _ROWID_ that refer to the
    unique random integer key associated with every row of every table.</li>
<li>Additional tests added to the regression suite to cover the new ROWID
    feature and the TCL interface bugs mentioned below.</li>
<li>Changes to the "lemon" parser generator to help it work better when
    compiled using MSVC.</li>
<li>Bug fixes in the TCL interface identified by Oleg Oleinick.</li>
}

chng {2001-03-20 (1.0.27)} {
<li>When doing DELETE and UPDATE, the library used to write the record
    numbers of records to be deleted or updated into a temporary file.

After (lines 4507-4521):
<li>Added limited support for transactions.  At this point, transactions
    will do table locking on the GDBM backend.  There is no support (yet)
    for rollback or atomic commit.</li>
<li>Added special column names ROWID, OID, and _ROWID_ that refer to the
    unique random integer key associated with every row of every table.</li>
<li>Additional tests added to the regression suite to cover the new ROWID
    feature and the TCL interface bugs mentioned below.</li>
<li>Changes to the [Lemon parser generator] to help it work better when
    compiled using MSVC.</li>
<li>Bug fixes in the TCL interface identified by Oleg Oleinick.</li>
}

chng {2001-03-20 (1.0.27)} {
<li>When doing DELETE and UPDATE, the library used to write the record
    numbers of records to be deleted or updated into a temporary file.
Changes to pages/compile.in.
Before (lines 1083-1097):
}

COMPILE_OPTION {SQLITE_ENABLE_UPDATE_DELETE_LIMIT} {
  This option enables an optional ORDER BY and LIMIT clause on 
  [UPDATE] and [DELETE] statements.

  <p>If this option is defined, then it must also be 
  defined when using the 'lemon' tool to generate a parse.c
  file. Because of this, this option may only be used when the library is built
  from source, not from the [amalgamation] or from the collection of
  pre-packaged C files provided for non-Unix like platforms on the website.
  </p>
}

COMPILE_OPTION {SQLITE_ENABLE_UNKNOWN_SQL_FUNCTION} {

After (lines 1083-1097):
}

COMPILE_OPTION {SQLITE_ENABLE_UPDATE_DELETE_LIMIT} {
  This option enables an optional ORDER BY and LIMIT clause on 
  [UPDATE] and [DELETE] statements.

  <p>If this option is defined, then it must also be 
  defined when using the [Lemon parser generator] tool to generate a parse.c
  file. Because of this, this option may only be used when the library is built
  from source, not from the [amalgamation] or from the collection of
  pre-packaged C files provided for non-Unix like platforms on the website.
  </p>
}

COMPILE_OPTION {SQLITE_ENABLE_UNKNOWN_SQL_FUNCTION} {

Before (lines 1202-1216):
compilation switches all have the same effect:<br>
-DSQLITE_OMIT_ALTERTABLE<br>
-DSQLITE_OMIT_ALTERTABLE=1<br>
-DSQLITE_OMIT_ALTERTABLE=0
</p>

<p>If any of these options are defined, then the same set of SQLITE_OMIT_*
options must also be defined when using the 'lemon' tool to generate the

parse.c file and when compiling the 'mkkeywordhash' tool which generates 
the keywordhash.h file.
Because of this, these options may only be used when the library is built
from canonical source, not from the [amalgamation].
Some SQLITE_OMIT_* options might work, or appear to work, when used with
the [amalgamation].  But this is not guaranteed.  In general, always compile
from canonical sources in order to take advantage of SQLITE_OMIT_* options.

After (lines 1202-1217):
compilation switches all have the same effect:<br>
-DSQLITE_OMIT_ALTERTABLE<br>
-DSQLITE_OMIT_ALTERTABLE=1<br>
-DSQLITE_OMIT_ALTERTABLE=0
</p>

<p>If any of these options are defined, then the same set of SQLITE_OMIT_*
options must also be defined when using the [Lemon parser generator]
tool to generate the
parse.c file and when compiling the 'mkkeywordhash' tool which generates 
the keywordhash.h file.
Because of this, these options may only be used when the library is built
from canonical source, not from the [amalgamation].
Some SQLITE_OMIT_* options might work, or appear to work, when used with
the [amalgamation].  But this is not guaranteed.  In general, always compile
from canonical sources in order to take advantage of SQLITE_OMIT_* options.
Added pages/lemon.in.

<title>The Lemon LALR(1) Parser Generator</title>
<tcl>hd_keywords {Lemon parser generator} {Lemon} \
                 {Lemon LALR(1) parser generator}</tcl>

<table_of_contents>

<h1>Overview</h1>

<p>The SQL language parser for SQLite is generated using a code-generator
program called "Lemon".  The Lemon program reads a grammar of the input
language and emits C-code to implement a parser for that language.


<h2>Lemon Source Files And Documentation</h2>

<p>Lemon does not have its own source repository.  Rather, Lemon consists
of a few files in the SQLite source tree:

<ul>
<li><p>
     [https://sqlite.org/src/doc/trunk/doc/lemon.html|lemon.html] &rarr;
     The original detailed usage documentation and programmer's reference
     for Lemon.
<li><p>
     [https://sqlite.org/src/file/tool/lemon.c|lemon.c] &rarr; The source code
     for the utility program that reads a grammar file and generates 
     corresponding parser C-code.
<li><p>
     [https://sqlite.org/src/file/tool/lempar.c|lempar.c] &rarr; A template
     for the generated parser C-code.  The "lemon" utility program reads this
     template and inserts additional code in order to generate a parser.
</ul>

<h1>Advantages of Lemon</h1>

<p>Lemon generates an LALR(1) parser.  Its operation is similar to the
more familiar tools [https://en.wikipedia.org/wiki/Yacc|Yacc] and
[https://en.wikipedia.org/wiki/GNU_bison|Bison], but Lemon offers important
improvements, including:

<ul>
<li><p>
     The grammar syntax is less error prone, using symbol names for
     semantic values rather than the "$1"-style positional notation
     of Yacc.
<li><p>
     In Lemon, the tokenizer calls the parser.  Yacc operates the other
     way around, with the parser calling the tokenizer.  The Lemon
     approach is reentrant and threadsafe, whereas Yacc uses global 
     variables and is therefore neither.  Reentrancy is especially
     important for SQLite since some SQL statements make recursive calls
     to the parser.  For example, when parsing a CREATE TABLE statement,
     SQLite invokes the parser recursively to generate an INSERT statement
     to make a new entry in the [sqlite_master] table.
<li><p>
     Lemon has the concept of a non-terminal destructor that can be
     used to reclaim memory or other resources following a syntax error
     or other aborted parse.
</ul>
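
A minimal sketch of the "tokenizer calls the parser" arrangement described above. This is a hand-written toy, not code generated by Lemon: the names `ToyParser`, `toyParserInit`, `toyParserPush`, and `toyParse`, and the grammar `sum ::= INT (PLUS INT)*`, are all hypothetical and chosen only for illustration. The point it tries to show is that all parser state lives in an object supplied by the caller, so there are no globals and several parses can be in progress at once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes for a toy grammar:  sum ::= INT (PLUS INT)*  */
enum { TK_EOF = 0, TK_INT = 1, TK_PLUS = 2 };

/* All parser state lives in this object.  No global variables are used. */
typedef struct ToyParser {
  int expectInt;   /* True if the next token must be TK_INT */
  long total;      /* Running value of the sum */
  int nErr;        /* Number of syntax errors seen so far */
} ToyParser;

/* Prepare a parser object for a new parse. */
void toyParserInit(ToyParser *p){
  assert( p!=NULL );
  p->expectInt = 1;
  p->total = 0;
  p->nErr = 0;
}

/* The tokenizer calls this routine once for each token (push style). */
void toyParserPush(ToyParser *p, int tokenCode, long value){
  if( p->nErr ) return;            /* Ignore input after the first error */
  if( p->expectInt ){
    if( tokenCode==TK_INT ){
      p->total += value;
      p->expectInt = 0;
    }else{
      p->nErr++;                   /* Wanted an integer, got something else */
    }
  }else{
    if( tokenCode==TK_PLUS ){
      p->expectInt = 1;
    }else if( tokenCode!=TK_EOF ){
      p->nErr++;                   /* Two integers in a row */
    }
  }
}

/* A tiny tokenizer that drives the parser.  It understands decimal
** digits, '+', and spaces.  Returns 0 on success and stores the sum
** in *pResult, or returns 1 on any error. */
int toyParse(const char *z, long *pResult){
  ToyParser p;                     /* Parser object on the caller's stack */
  toyParserInit(&p);
  while( *z ){
    if( *z==' ' ){
      z++;
    }else if( *z>='0' && *z<='9' ){
      long v = 0;
      while( *z>='0' && *z<='9' ){ v = v*10 + (*z - '0'); z++; }
      toyParserPush(&p, TK_INT, v);
    }else if( *z=='+' ){
      toyParserPush(&p, TK_PLUS, 0);
      z++;
    }else{
      return 1;                    /* Unrecognized character */
    }
  }
  toyParserPush(&p, TK_EOF, 0);    /* Signal end of input */
  if( p.nErr ) return 1;
  *pResult = p.total;
  return 0;
}
```

Because each call to `toyParse()` owns its own `ToyParser` object, one parse could start another (with a second object) without disturbing the first, which is the property the text above says SQLite relies on.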

<h2>Use of Lemon Within SQLite</h2>

<p>Lemon is used in two places in SQLite.

<p>The primary use of Lemon is to create the SQL language parser.
A grammar file ([https://sqlite.org/src/file/src/parse.y|parse.y]) is
compiled by Lemon into parse.c and parse.h.  The parse.c file is
incorporated into the [amalgamation] without further modification.
The parse.h file is post-processed by the
[https://sqlite.org/src/file/tool/addopcodes.tcl|addopcodes.tcl] script
before being incorporated into the [amalgamation].

<p>Lemon is also used to generate the parser for the query pattern
expressions in the [FTS5] extension.  In this case, the input grammar
file is [https://sqlite.org/src/file/ext/fts5/fts5parse.y|fts5parse.y].

<h2>Lemon Customizations Especially For SQLite</h2>

<p>One of the advantages of hosting a code generator tool as part of
the project is that the tool can be optimized to serve specific needs of
the overall project.  Lemon has benefited from this effect. Over the years,
the Lemon parser generator has been extended and enhanced to provide
new capabilities and improved performance to SQLite.  A few of the
enhancements to Lemon that are specifically designed for use
by SQLite include:

<ul>
<li><p>
Lemon has the concept of "fallback" tokens.
The SQL language contains a large number of keywords and these keywords
have the potential to collide with identifier names.
Lemon has the ability to designate some keywords as being able to
"fall back" to an identifier.  If the keyword appears in the input token
stream in a context that would otherwise be a syntax error, the token
is automatically transformed into its fallback before the syntax error
is raised.  This feature allows the parser to be very forgiving of
reserved words used as identifiers, which is a problem that comes up
frequently in the SQL language.

<li><p>
In support of the [MC/DC|100% MC/DC testing] goal for SQLite, 
the parser code generated by Lemon has no unreachable branches,
and contains extra (compile-time selected) instrumentation useful
for measuring test coverage.

<li><p>
Lemon supports conditional compilation of grammar file rules, so that
a different parser can be generated depending on compile-time options.

<li><p>
As a performance optimization, reduce actions in the Lemon input grammar
are allowed to contain comments of the form "/*A-overwrites-Z*/" to indicate
that the semantic value "A" on the right-hand side of the rule is allowed
to directly overwrite the semantic value "Z" on the left-hand side.
This simple optimization reduces the number of stack operations in the
push-down automaton used to parse the input grammar, and thus improves
performance of the parser.  It also makes the generated code a little smaller.
</ul>
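
The fallback idea in the first item above can be sketched roughly as follows. This is an illustrative assumption about the logic, not the code Lemon actually generates: the token codes, the `aFallback` table, and the `resolveToken()` helper are hypothetical, and the set of tokens acceptable in the current parser state is modeled as a simple bitmask.

```c
#include <assert.h>

/* Hypothetical token codes.  TK_ABORT and TK_KEY stand in for keywords
** that are allowed to fall back to an identifier. */
enum { TK_ID = 1, TK_SELECT = 2, TK_ABORT = 3, TK_KEY = 4, TK_MAX = 5 };

/* Hypothetical fallback table, indexed by token code.
** A value of 0 means the token has no fallback. */
static const int aFallback[TK_MAX] = {
  0,      /* (unused) */
  0,      /* TK_ID */
  0,      /* TK_SELECT */
  TK_ID,  /* TK_ABORT */
  TK_ID   /* TK_KEY */
};

/* Decide which token code the parser should act on.  acceptMask has
** bit (1<<code) set for every token code that is legal in the current
** state.  Returns the original code if it is legal, the fallback code
** if the original is not legal but the fallback is, or -1 to indicate
** a syntax error. */
int resolveToken(int tokenCode, unsigned acceptMask){
  int fb;
  assert( tokenCode>0 && tokenCode<TK_MAX );
  if( acceptMask & (1u<<tokenCode) ) return tokenCode;
  fb = aFallback[tokenCode];
  if( fb!=0 && (acceptMask & (1u<<fb)) ) return fb;
  return -1;
}
```

In this sketch the keyword is tried first and the fallback is consulted only when the keyword would otherwise be a syntax error, which mirrors the order of events described above.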

<p>The parsing of SQL statements is a significant consumer of CPU cycles 
in any SQL database engine.  On-going efforts to optimize SQLite have caused
the developers to spend a lot of time tweaking Lemon to generate faster
parsers.  These efforts have benefited all users of the Lemon parser generator,
not just SQLite.  But if Lemon had been a separately maintained tool, it
would have been more difficult to make coordinated changes to both SQLite
and Lemon, and as a result not as much optimization would have been
accomplished.  Hence, the fact that the parser generator tool is included
in the source tree for SQLite has turned out to be a net benefit for both
the tool itself and for SQLite.

<h1>History Of Lemon</h1>

<p>Lemon was originally written by D. Richard Hipp (also the creator of SQLite)
while he was in graduate school at Duke University between 1987 and 1992.
The original creation date of Lemon has been lost, but was probably sometime
around 1990.  Lemon generates an LALR(1) parser.  There was a companion 
LL(1) parser generator tool named "Lime", but the source code for Lime
has been lost.

<p>The Lemon source code was originally written as separate source files,
and only later merged into a single "lemon.c" source file.

<p>The author of Lemon and SQLite (Hipp) reports that his C programming
skills were greatly enhanced by studying John Ousterhout's original
source code to Tcl.  Hipp discovered and studied Tcl in 1993.  Lemon
was written before then, and SQLite afterwards.  There is a clear
difference in the coding styles of these two products, with SQLite seeming
to be cleaner, more readable, and easier to maintain.