Documentation Source Text

Check-in [e7ab215c5b]

Overview
Comment: Updates to the whyc.html and assert.html documentation pages.
SHA3-256: e7ab215c5bf6c5b024069393e610fec32f8ca7d193c4038d8faa340f684671e0
User & Date: drh 2018-03-23 17:42:31.804
Context
2018-03-23
17:50
Tweaks to the new assert documentation. (check-in: 19d3bdb7e3 user: drh tags: trunk)
17:42
Updates to the whyc.html and assert.html documentation pages. (check-in: e7ab215c5b user: drh tags: trunk)
14:53
Clarification of text in the new optoverview.html section on the LEFT JOIN Optimization. (check-in: 9e6750c1fc user: drh tags: trunk)
Changes
Changes to pages/assert.in.
<title>The Use Of assert() In SQLite</title>


<table_of_contents>

<h1>Assert() And Similar Macros In SQLite</h1>

<p>
The assert(X) macro is 

<title>The Use Of assert() In SQLite</title>
<tcl>hd_keywords {The Use Of assert() In SQLite}</tcl>

<table_of_contents>

<h1>Assert() And Similar Macros In SQLite</h1>

<p>
The assert(X) macro is 
...
[https://blog.regehr.org/archives/1576|tacit acknowledgement that they
do not fully believe that X is always true].
We believe that this use of assert(X) is wrong and violates the intent
and purpose of having assert(X) available in C in the first place.
An assert(X) should not be seen as a safety-net or top-rope used to
guard against mistakes.  Nor is assert(X) appropriate for defense-in-depth.
An ALWAYS(X) or NEVER(X) macro, or something similar, should be used in 
those cases because ALWAYS(X) or NEVER(X) will followed by code to
actually deal with the problem in the case where the programmers reasoning
turns out to be wrong.  Since the code that follows ALWAYS(X) or NEVER(X)
is untested, it should be something very simple, like a "return" statement,
that is easily verified by inspection.

<p>The [https://golang.org|Go programming language] omits assert().
The Go developers

[https://blog.regehr.org/archives/1576|tacit acknowledgement that they
do not fully believe that X is always true].
We believe that this use of assert(X) is wrong and violates the intent
and purpose of having assert(X) available in C in the first place.
An assert(X) should not be seen as a safety-net or top-rope used to
guard against mistakes.  Nor is assert(X) appropriate for defense-in-depth.
An ALWAYS(X) or NEVER(X) macro, or something similar, should be used in 
those cases because ALWAYS(X) or NEVER(X) will be followed by code to
actually deal with the problem in the case where the programmer's reasoning
turns out to be wrong.  Since the code that follows ALWAYS(X) or NEVER(X)
is untested, it should be something very simple, like a "return" statement,
that is easily verified by inspection.
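
<p>As a rough illustration of the pattern described above, the sketch
below pairs a NEVER(X) test with a one-line fallback.  The macro
definitions and the safe_length() routine are assumptions made for this
example and are not copied from the SQLite source.  The sketch assumes
that debug builds trip an assert() when the belief turns out to be wrong,
while release builds (NDEBUG defined) still evaluate the test so that the
fallback can run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definitions for illustration only.
** Debug builds: a wrong belief about X triggers assert(0).
** Release builds: X is still evaluated so the fallback code runs. */
#ifndef NDEBUG
# define ALWAYS(X)  ((X) ? 1 : (assert(0), 0))
# define NEVER(X)   ((X) ? (assert(0), 1) : 0)
#else
# define ALWAYS(X)  (X)
# define NEVER(X)   (X)
#endif

/* Return the length of string z.  Callers are believed never to pass
** NULL, but the test is kept so a mistake cannot cause a crash in a
** release build.  (safe_length is an illustrative name.) */
size_t safe_length(const char *z){
  size_t n = 0;
  if( NEVER(z==NULL) ) return 0;   /* Simple fallback, verified by inspection */
  while( z[n] ) n++;
  return n;
}
```

The fallback is a bare "return" so that, even though no test case can
reach it, a reader can confirm by inspection that it is harmless.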

<p>The [https://golang.org|Go programming language] omits assert().
The Go developers
...

<p>An assert() statement is often used to validate pre-conditions on 
internal functions and methods.
Example: [https://sqlite.org/src/artifact/c1e97e4c6f?ln=1048].
This is deemed better than simply stating the pre-condition in a header 
comment, since the assert() is actually executed.  In a highly tested
program like SQLite, the reader knows that the pre-condition is true
for all of the hundreds of millions of test cases run against SQLite.

since by using an assert.  A text pre-condition statement in a header comment
might have been true when the code was written, but who is to say that
it is still true now?

<p>
Sometimes SQLite uses compile-time evaluatable assert() statements.
Consider the code at
[https://sqlite.org/src/artifact/c1e97e4c6f?ln=2130-2138].
Four assert() statements verify the values for compile-time constants
so that the reader can quickly check the validity of the if-statement
that follows, without having to look up the constant values in a separate
header file.

<p>
Sometimes compile-time assert() statements are used to verify that
SQLite has been correctly compiled.  For example, the code at
[https://sqlite.org/src/artifact/c1e97e4c6f?ln=157].
verifies that the SQLITE_PTRSIZE preprocessor macro is set correctly
for the target architecture.

<p>
The CORRUPT_DB macro is used in many assert() statements.
In functional testing builds, CORRUPT_DB references a global variable
that is true if the database file might contain corruption.  This variable
is true by default, since we do not normally know whether or not a database
is corrupt, but during testing while working on databases that are known

to not be corrupt, it can be set to false.  Then the CORRUPT_DB macro
can be used in assert() statements such as seen at
[https://sqlite.org/src/artifact/18a53540aa3?ln=1679-1680].
Those assert()s specify pre-conditions to the routine that are true for
consistent database files, but which might be false if the database file
is corrupt. Knowledge of these kinds of conditions is very helpful to
readers who are trying to understand a block of code in isolation.

<p>
ALWAYS(X) and NEVER(X) functions are used in places where we always
want the test to occur even though the developers believe the value of
X is always true or false.  For example, the sqlite3BtreeCloseCursor()
routine shown must remove the closing cursor from a linked list of all
cursors.  We know that the cursor is on the list, so that the loop
must terminate by the "break" statement, but it is convenient to
us the ALWAYS(X) test at
[https://sqlite.org/src/artifact/18a53540aa3?ln=4371] to prevent
running off the end of the linked list in case there is an error in some
other part of the code that has corrupted the linked list.

<p>
An ALWAYS(X) or NEVER(X) sometimes verifies pre-conditions that are
subject to change if other parts of the code are modified in

<p>An assert() statement is often used to validate pre-conditions on 
internal functions and methods.
Example: [https://sqlite.org/src/artifact/c1e97e4c6f?ln=1048].
This is deemed better than simply stating the pre-condition in a header 
comment, since the assert() is actually executed.  In a highly tested
program like SQLite, the reader knows that the pre-condition is true
for all of the hundreds of millions of test cases run against SQLite,
since it has been verified by the assert().
In contrast, a text pre-condition statement in a header comment
is untested.  It might have been true when the code was written, 
but who is to say that it is still true now?
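
<p>A minimal sketch of executable pre-conditions, using a hypothetical
copy_bytes() helper that does not appear in SQLite.  Each assert()
restates a requirement that might otherwise be recorded only in the
header comment.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes from src to dst.  (Hypothetical helper for illustration.)
**
** Pre-conditions, checked on every call in builds without NDEBUG:
**   - dst and src are not NULL
**   - n is greater than zero
*/
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n){
  size_t i;
  assert( dst!=NULL );
  assert( src!=NULL );
  assert( n>0 );
  for(i=0; i<n; i++) dst[i] = src[i];
}
```

If a later change causes some caller to pass n==0, the first test run
that exercises that caller fails at the assert(), whereas a comment
would silently go stale.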

<p>
Sometimes SQLite uses compile-time evaluatable assert() statements.
Consider the code at
[https://sqlite.org/src/artifact/c1e97e4c6f?ln=2130-2138].
Four assert() statements verify the values for compile-time constants
so that the reader can quickly check the validity of the if-statement
that follows, without having to look up the constant values in a separate
header file.
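
<p>The idea can be sketched as follows.  The KIND_* constants and the
is_numeric() routine are hypothetical names chosen for this example; in
a real project the constants would sit in a separate header.

```c
#include <assert.h>

/* Hypothetical constants that would normally live in a header file. */
#define KIND_INT   1
#define KIND_REAL  2
#define KIND_TEXT  3
#define KIND_BLOB  4

/* Return 1 if kind is one of the two numeric kinds, otherwise 0. */
int is_numeric(int kind){
  /* Each condition is a constant expression, so a compiler can
  ** typically evaluate it at compile time.  The asserts document that
  ** the numeric kinds are contiguous, which the range test relies on. */
  assert( KIND_INT==1 );
  assert( KIND_REAL==2 );
  assert( KIND_TEXT==3 );
  assert( KIND_BLOB==4 );
  if( kind>=KIND_INT && kind<=KIND_REAL ){
    return 1;
  }
  return 0;
}
```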

<p>
Sometimes compile-time assert() statements are used to verify that
SQLite has been correctly compiled.  For example, the code at
[https://sqlite.org/src/artifact/c1e97e4c6f?ln=157]
verifies that the SQLITE_PTRSIZE preprocessor macro is set correctly
for the target architecture.
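
<p>A sketch of this kind of build sanity check, assuming a hypothetical
MY_PTRSIZE setting in place of SQLITE_PTRSIZE.  Here the value is derived
from UINTPTR_MAX purely so that the example is self-contained; the
library_init() routine is likewise invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* MY_PTRSIZE is a hypothetical stand-in for a build-time setting.
** It may be supplied on the compiler command line; otherwise it is
** guessed from the range of uintptr_t. */
#ifndef MY_PTRSIZE
# if UINTPTR_MAX==0xFFFFFFFFu
#  define MY_PTRSIZE 4
# else
#  define MY_PTRSIZE 8
# endif
#endif

/* Hypothetical start-up routine.  Returns 0 on success. */
int library_init(void){
  /* Fails in a debug build if the setting does not match the target */
  assert( MY_PTRSIZE==sizeof(void*) );
  return 0;
}
```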

<p>
The CORRUPT_DB macro is used in many assert() statements.
In functional testing builds, CORRUPT_DB references a global variable
that is true if the database file might contain corruption.  This variable
is true by default, since we do not normally know whether or not a database
is corrupt, but during testing while working on databases that are known
to be well-formed, that global variable can be set to false.
Then the CORRUPT_DB macro
can be used in assert() statements such as seen at
[https://sqlite.org/src/artifact/18a53540aa3?ln=1679-1680].
Those assert()s specify pre-conditions to the routine that are true for
consistent database files, but which might be false if the database file
is corrupt. Knowledge of these kinds of conditions is very helpful to
readers who are trying to understand a block of code in isolation.
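
<p>One way to picture this is the sketch below.  The global variable
db_may_be_corrupt, this particular definition of CORRUPT_DB, and the
page_free_bytes() routine are all assumptions made for the example and
are not taken from the SQLite source.

```c
#include <assert.h>

/* Hypothetical global: nonzero if the database file might be corrupt.
** It defaults to true because corruption usually cannot be ruled out.
** A test harness working on known-good files may set it to 0. */
int db_may_be_corrupt = 1;
#define CORRUPT_DB  (db_may_be_corrupt!=0)

/* Return the number of unused bytes on a page, given the page size and
** the number of bytes in use.  (Hypothetical routine.) */
int page_free_bytes(int page_size, int used){
  /* Holds for every well-formed database file.  It can only be false
  ** when the file is corrupt, in which case CORRUPT_DB keeps the
  ** assert() from firing. */
  assert( used<=page_size || CORRUPT_DB );
  if( used>page_size ) return 0;   /* Corrupt input: report no free space */
  return page_size - used;
}
```

With db_may_be_corrupt set to 0 the assert() becomes a strict check, so a
test run on a well-formed database fails at once if the pre-condition is
ever violated.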

<p>
ALWAYS(X) and NEVER(X) macros are used in places where we always
want the test to occur even though the developers believe the value of
X is always true or false.  For example, the sqlite3BtreeCloseCursor()
routine shown must remove the closing cursor from a linked list of all
cursors.  We know that the cursor is on the list, so that the loop
must terminate by the "break" statement, but it is convenient to
use the ALWAYS(X) test at
[https://sqlite.org/src/artifact/18a53540aa3?ln=4371] to prevent
running off the end of the linked list in case there is an error in some
other part of the code that has corrupted the linked list.
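
<p>The list-removal case can be sketched roughly as below.  This is not
the code of sqlite3BtreeCloseCursor(); the Cursor structure, the
cursor_unlink() routine, and the ALWAYS() definition are simplified
assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definition: assert in debug builds, plain test otherwise. */
#ifndef NDEBUG
# define ALWAYS(X)  ((X) ? 1 : (assert(0), 0))
#else
# define ALWAYS(X)  (X)
#endif

typedef struct Cursor Cursor;
struct Cursor {
  Cursor *pNext;          /* Next cursor on the list, or NULL */
};

/* Unlink pCur from the singly linked list whose head is *ppList.
** pCur is believed to be on the list, so the loop is expected to exit
** through the "return 1".  The ALWAYS() test stops the walk at the end
** of the list if that belief is wrong.  Returns 1 if pCur was removed
** and 0 if it was not found. */
int cursor_unlink(Cursor **ppList, Cursor *pCur){
  Cursor **pp = ppList;
  while( ALWAYS(*pp!=NULL) ){
    if( *pp==pCur ){
      *pp = pCur->pNext;
      pCur->pNext = NULL;
      return 1;
    }
    pp = &(*pp)->pNext;
  }
  return 0;
}
```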

<p>
An ALWAYS(X) or NEVER(X) sometimes verifies pre-conditions that are
subject to change if other parts of the code are modified in
Changes to pages/whyc.in.
<title>Why Is SQLite Coded In C</title>

<table_of_contents>

<h1>C Is Best</h1>

<p>
Since its inception on 2000-05-29, SQLite has been implemented in generic C.
C was and continues to be the best language for implementing a software
library like SQLite.  There are no plans to recode SQLite in any other
programming language anytime soon.

<p>
The reasons why C is the best language to implement SQLite include:


<ul>
<li> Performance

<title>Why Is SQLite Coded In C</title>

<table_of_contents>

<h1>C Is Best</h1>

<p>
Since its inception on 2000-05-29, SQLite has been implemented in generic C.
C was and continues to be the best language for implementing a software
library like SQLite.  There are no plans to recode SQLite in any other
programming language at this time.

<p>
The reasons why C is the best language to implement SQLite include:


<ul>
<li> Performance
...
<p>
The C language is old and boring.
It is a well-known and well-understood language.
This is exactly what one wants when developing a module like SQLite.
Writing a small, fast, and reliable database engine is hard enough as it
is without the implementation language changing out from under you with
each update to the implementation language specification.

<p>
The C language is old and boring.
It is a well-known and well-understood language.
This is exactly what one wants when developing a module like SQLite.
Writing a small, fast, and reliable database engine is hard enough as it
is without the implementation language changing out from under you with
each update to the implementation language specification.

<h1>Why Isn't SQLite Coded In An Object-Oriented Language?</h1>

<p>
Some programmers cannot imagine developing a complex system like
SQLite in a language that is not "object oriented".  So why is
SQLite not coded in C++ or Java?

<ol>
<li><p>
Libraries written in C++ or Java can generally only be used by
applications written in the same language. It is difficult to
get an application written in Haskell or Java to invoke a library
written in C++.  On the other hand, libraries written in C are
callable from any programming language.

<li><p>
Object-Oriented is a design pattern, not a programming language.
You can do object-oriented programming in any language you want,
including assembly language.  Some languages (ex: C++ or Java) make
object-oriented programming easier.  But you can still do object-oriented programming
in languages like C.

<li><p>
Object-oriented is not the only valid design pattern.
Many programmers have been taught to think purely in terms of
objects.  And, to be fair, objects are often a good way to
decompose a problem.  But objects are not the only way, and are
not always the best way to decompose a problem.  Sometimes good old
procedural code is easier to write, easier to maintain and understand,
and faster than object-oriented code.

<li><p>
When SQLite was first being developed, Java was a young and immature
language.  C++ was older, but was undergoing such growing pains that
it was difficult to find any two C++ compilers that worked the same
way.  So C was definitely a better choice back when SQLite was first
being developed.  The situation is less stark now, but there is little
to no benefit in recoding SQLite at this point.
</ol>
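
<p>To make the point about object-oriented programming in C concrete,
here is a minimal sketch that uses a structure holding a function pointer
as a "virtual method".  The Shape, Rect, and Square types and all routine
names are hypothetical and serve only to illustrate the technique.

```c
#include <stddef.h>

/* Base "class": one virtual method stored as a function pointer. */
typedef struct Shape Shape;
struct Shape {
  double (*area)(const Shape*);
};

/* Two "subclasses".  The base object is the first member, so a pointer
** to the subclass can be converted to a pointer to Shape and back. */
typedef struct Rect   { Shape base; double w, h; } Rect;
typedef struct Square { Shape base; double side; } Square;

static double rect_area(const Shape *p){
  const Rect *r = (const Rect*)p;
  return r->w * r->h;
}
static double square_area(const Shape *p){
  const Square *s = (const Square*)p;
  return s->side * s->side;
}

/* "Constructors" that bind each object to its method. */
void rect_init(Rect *r, double w, double h){
  r->base.area = rect_area;
  r->w = w;
  r->h = h;
}
void square_init(Square *s, double side){
  s->base.area = square_area;
  s->side = side;
}

/* Polymorphic use: the caller does not know the concrete types. */
double total_area(const Shape *const *a, size_t n){
  double sum = 0.0;
  size_t i;
  for(i=0; i<n; i++) sum += a[i]->area(a[i]);
  return sum;
}
```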

<h1>Why Isn't SQLite Coded In A "Safe" Language?</h1>

<p>
There has lately been a lot of interest in "safe" programming languages
like Rust or Go in which it is impossible, or at least difficult, to make
common programming errors like memory leaks or array overruns.  So the
question often arises as to why SQLite is not coded in a "safe" language.

<ol>
<li><p>
None of the safe programming languages existed for the first 10 years
of SQLite's existence.  SQLite could be recoded in Go or Rust, but doing
so would probably introduce far more bugs than would be fixed, and it
seems also likely to result in slower code.

<li><p>
Safe programming languages solve the easy problems: memory leaks,
use-after-free errors, array overruns, etc.  Safe languages provide
no help beyond ordinary C code in solving the rather more difficult
problem of computing a correct answer to an SQL statement.

<li><p>
Safe languages are often touted for helping to prevent
security vulnerabilities. True enough, but SQLite is
not a particularly security-sensitive library.  If an application is
running untrusted and unverified SQL, then it already has way bigger
security issues (SQL injection) that no "safe" language will fix.
<p>
It is true that applications sometimes import complete binary SQLite
database files from untrusted sources, and such imports could present a
possible attack vector.  However, those code paths in SQLite are
limited and are extremely well tested.  And pre-validation routines
that can help detect possible attacks prior to use are available to
applications that want to read untrusted databases.

<li><p>
Some "safe" languages (ex: Go) dislike the use of assert().  But
the use of assert() is a vital part of keeping SQLite maintainable.
The lack of assert() in Go is a show-stopper as far as the developers
of SQLite are concerned.  See the [The Use Of assert() In SQLite]
article for additional information.

<li><p>
Safe languages insert additional machine branches to do things like
verify that array accesses are in-bounds.  In correct code, those
branches are never taken.  That means that the machine code cannot
be 100% branch tested, which is an important component of SQLite's
quality strategy.

<li><p>
Safe languages usually want to abort if they encounter an out-of-memory
(OOM) situation.  SQLite is designed to recover gracefully from an OOM.
It is unclear how this could be accomplished in the current crop of
safe languages.

<li><p>
All of the existing safe languages are new.  The developers of SQLite
applaud the efforts of computer language researchers in trying to
develop languages that are easier to program safely.  We encourage these
efforts to continue.  But we ourselves are more interested in old and
boring languages when it comes to implementing SQLite.
</ol>
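
<p>The list item on out-of-memory handling can be illustrated with a
small sketch of the C idiom involved: an allocation failure is reported
to the caller as an error code rather than aborting the process.  The
RC_* codes, the alloc_fn type, and the routines below are hypothetical;
the allocator is passed in as a parameter only so that a failure can be
simulated.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RC_OK     0   /* Hypothetical result codes */
#define RC_NOMEM  1

/* Allocator signature, so that a failing allocator can be injected. */
typedef void *(*alloc_fn)(size_t);

/* An allocator that always fails, used to simulate an OOM condition. */
void *failing_alloc(size_t n){
  (void)n;
  return NULL;
}

/* Store a newly allocated copy of z in *pzOut.  On allocation failure,
** *pzOut is set to NULL and RC_NOMEM is returned, leaving the caller
** free to release resources and continue running. */
int dup_string(const char *z, char **pzOut, alloc_fn xAlloc){
  size_t n = strlen(z) + 1;
  char *zNew = xAlloc(n);
  *pzOut = NULL;
  if( zNew==NULL ) return RC_NOMEM;
  memcpy(zNew, z, n);
  *pzOut = zNew;
  return RC_OK;
}
```

Passing malloc as xAlloc gives the normal behavior, and passing
failing_alloc lets a test confirm that the error path leaves the caller
in a usable state.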

<p>
All that said, it is possible that SQLite might
one day be recoded in Rust.  Recoding SQLite in Go is unlikely
since Go hates assert().  But Rust is a possibility.  Some
preconditions that must occur before SQLite is recoded in Rust
include:

<p>
<ol type="A">
<li> Rust needs to mature a little more, stop changing so fast, and
     move further toward being old and boring.
<li> Rust needs to demonstrate that it can be used to create general-purpose
     libraries that are callable from all other programming languages.
<li> Rust needs to demonstrate that it can produce object code that
     works on obscure embedded devices, including devices that lack
     an operating system.
<li> Rust needs to pick up the necessary tooling that enables one to
     do 100% branch coverage testing of the compiled binaries.
<li> Rust needs a mechanism to recover gracefully from OOM errors.
<li> Rust needs to demonstrate that it can do the kinds of work that
     C does in SQLite without a significant speed penalty.
</ol>

<p>
If you are a "rustacean" and feel that Rust already meets the
preconditions listed above, and that SQLite should be recoded in
Rust, then you are welcome and encouraged
to contact the SQLite developers privately
and argue your case.