Documentation Source Text

Check-in [846fd3a3f7]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update the change log. Typo fixes in the floating-point documentation.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 846fd3a3f79bb48712f0268d57856f1f623fc8eb134a914b2fb3f38a2e6a6209
User & Date: drh 2020-07-15 01:42:10.598
Context
2020-07-15
01:43
Merge fixes from the 3.32 branch. (check-in: bd3dff05f9 user: drh tags: trunk)
01:42
Update the change log. Typo fixes in the floating-point documentation. (check-in: 846fd3a3f7 user: drh tags: trunk)
2020-07-14
00:20
Initial draft of documentation on floating point, IEEE 754 binary64, and the ieee754 and decimal extensions. (check-in: 448d185aac user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to pages/changes.in.
21
22
23
24
25
26
27


28
29
30
31
32
33
34
35

36
37
38
39
40
41
42
43
44
45


46
47
48
49
50
51
52
  if {[regexp {\(([0-9.]+)\)} $date all vers]} {
    set xrefChng($vers) $nChng
  }
  incr nChng
}

chng {2020-08-22 (3.33.0)} {


<li> [CLI] enhancements:
    <ol type="a">
    <li> Added four new [.mode|output modes]: "box", "json", "markdown",
         and "table".
    <li> The "column" output mode automatically expands columns to
         contain the longest output row and automatically turns
         ".header" on if it has not been previously set.
    <li> The "quote" output mode honors ".separator"

    </ol>
<li> Query planner improvments:
     <ol type="a">
     <li> Add the ability to find a
          full-index-scan query plan for queries using [INDEXED BY]
          which previously would fail with "no query solution".
     <li> Do a better job of
          detecting missing, incomplete, and/or dodgy [sqlite_stat1]
          data and generates good query plans in spite of the
          misinformation.


     </ol>
}

chng {2020-06-18 (3.32.3)} {
<li> Various minor bug fixes including fixes for tickets
     [https://www.sqlite.org/src/info/8f157e8010b22af0|8f157e8010b22af0],
     [https://www.sqlite.org/src/info/9fb26d37cefaba40|9fb26d37cefaba40],







>
>








>










>
>







21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
  if {[regexp {\(([0-9.]+)\)} $date all vers]} {
    set xrefChng($vers) $nChng
  }
  incr nChng
}

chng {2020-08-22 (3.33.0)} {
<li> Added the [decimal extension] for doing arbitrary-precision decimal arithmetic.
<li> Enhancments to the [ieee754 extension] for working with IEEE 754 binary64 numbers.
<li> [CLI] enhancements:
    <ol type="a">
    <li> Added four new [.mode|output modes]: "box", "json", "markdown",
         and "table".
    <li> The "column" output mode automatically expands columns to
         contain the longest output row and automatically turns
         ".header" on if it has not been previously set.
    <li> The "quote" output mode honors ".separator"
    <li> The [decimal extension] and the [ieee754 extension] are built-in to the CLI
    </ol>
<li> Query planner improvments:
     <ol type="a">
     <li> Add the ability to find a
          full-index-scan query plan for queries using [INDEXED BY]
          which previously would fail with "no query solution".
     <li> Do a better job of
          detecting missing, incomplete, and/or dodgy [sqlite_stat1]
          data and generates good query plans in spite of the
          misinformation.
     <li> Improved performance of queries like "SELECT min(x) FROM t WHERE y IN (?,?,?)"
          assuming an index on t(x,y).
     </ol>
}

chng {2020-06-18 (3.32.3)} {
<li> Various minor bug fixes including fixes for tickets
     [https://www.sqlite.org/src/info/8f157e8010b22af0|8f157e8010b22af0],
     [https://www.sqlite.org/src/info/9fb26d37cefaba40|9fb26d37cefaba40],
Changes to pages/floatingpoint.in.
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
are 1.845e+19 different possible floating point values.  On the other hand
there are infinitely many real numbers in the range of 
1.7977e+308 and 4.9407e-324.  It follows then that binary64 cannot possibly
represent all possible real numbers within that range.  Approximations are
required.

<p>
An IEEE 754 floating-point is an integer multiplied by a power
of two:

<blockquote>
<big>M &times 2<sup><small>E</small></sup></big>
</blockquote>

<p>The M value is the "mantissa" and E is the "exponent".  Both
M and E are integers.

<p>For Binary64, M is a 53-bit integer and E is an 11-bit integer that
represents a range of values between -1074 and +972, inclusive.

<p><i>(NB:  The usual description of IEEE 754 is more complex, and it is important
to understand the added complexity if you really want to appreciate the details,
merits, and limitations of IEEE 754.  However, the integer description shown
here, while not exactly right, is easier to understand and is sufficient for
the purposes of this article.)</i></p>








|










|







69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
are 1.845e+19 different possible floating point values.  On the other hand
there are infinitely many real numbers in the range of 
1.7977e+308 and 4.9407e-324.  It follows then that binary64 cannot possibly
represent all possible real numbers within that range.  Approximations are
required.

<p>
An IEEE 754 floating-point value is an integer multiplied by a power
of two:

<blockquote>
<big>M &times 2<sup><small>E</small></sup></big>
</blockquote>

<p>The M value is the "mantissa" and E is the "exponent".  Both
M and E are integers.

<p>For Binary64, M is a 53-bit integer and E is an 11-bit integer that
offset so that represents a range of values between -1074 and +972, inclusive.

<p><i>(NB:  The usual description of IEEE 754 is more complex, and it is important
to understand the added complexity if you really want to appreciate the details,
merits, and limitations of IEEE 754.  However, the integer description shown
here, while not exactly right, is easier to understand and is sufficient for
the purposes of this article.)</i></p>

111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
<p>Which works out to be:

<blockquote>
47.49000000000000198951966012828052043914794921875
</blockquote>

<p>That number is very close to 47.49, but it is not exact.  It is a little
too big.  But if we reduce M by one to 6683623321994526 so that we have the
next smaller possible binary64 value, we get:

<blockquote>
47.4899999999999948840923025272786617279052734375
</blockquote>









|







111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
<p>Which works out to be:

<blockquote>
47.49000000000000198951966012828052043914794921875
</blockquote>

<p>That number is very close to 47.49, but it is not exact.  It is a little
too big.  If we reduce M by one to 6683623321994526 so that we have the
next smaller possible binary64 value, we get:

<blockquote>
47.4899999999999948840923025272786617279052734375
</blockquote>


135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156

<p>If you remember nothing else about floating-point values, 
please don't forget this one key idea.

<h3>Is it close enough?</h3>

<p>15 significant digits of precision are sufficient for most computations.
For example, if "47.49" represents the price, and inflation is running
at 2% per year, then the price is going up by about 0.0000000301 dollars per
second.  The error in the recorded value of 47.49 represents about 65 nanoseconds
worth of inflation.  So, if price of 47.49 is exact
when you enter it, then value actually stored
(47.4900000000000019895196601282805204391479492187) will be exact about
65 nanoseconds later, due to inflation, and will be too small long before you
get your finger off of the Enter key.
Surely that level of precision is sufficient for most purposes?

<h1>Extensions For Dealing With Floating Point Numbers</h1>

<tcl>hd_fragment ieee754ext {ieee754 extension}</tcl>
<h2>The ieee754.c Extension</h2>








|


|
|
|
|
<







135
136
137
138
139
140
141
142
143
144
145
146
147
148

149
150
151
152
153
154
155

<p>If you remember nothing else about floating-point values, 
please don't forget this one key idea.

<h3>Is it close enough?</h3>

<p>15 significant digits of precision are sufficient for most computations.
For example, if "47.49" represents a price and inflation is running
at 2% per year, then the price is going up by about 0.0000000301 dollars per
second.  The error in the recorded value of 47.49 represents about 65 nanoseconds
worth of inflation.  So if 47.49 price is exact
when you enter it, then the effects of inflation will cause the value actually stored
(47.4900000000000019895196601282805204391479492187) to be exact about
65 nanoseconds later.

Surely that level of precision is sufficient for most purposes?

<h1>Extensions For Dealing With Floating Point Numbers</h1>

<tcl>hd_fragment ieee754ext {ieee754 extension}</tcl>
<h2>The ieee754.c Extension</h2>

188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
│               x               │
├───────────────────────────────┤
│ ieee754(6683623321994527,-47) │
└───────────────────────────────┘
</codeblock>

<p>
Going in the other direction, the 2-argument version of ieee754() tags
the M and E values and converts them into the corresponding F value:

<codeblock>
sqlite> select ieee754(6683623321994527,-47) as x;
┌───────┐
│   x   │
├───────┤







|







187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
│               x               │
├───────────────────────────────┤
│ ieee754(6683623321994527,-47) │
└───────────────────────────────┘
</codeblock>

<p>
Going in the other direction, the 2-argument version of ieee754() takes
the M and E values and converts them into the corresponding F value:

<codeblock>
sqlite> select ieee754(6683623321994527,-47) as x;
┌───────┐
│   x   │
├───────┤
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
</ul>
</p>

<p>These functions respectively add, substract, and multiply their arguments
and return a new text string that is the decimal representation of the result.
There is no division operator at this time.

<p>Use the decimal_cmp(A,B) to compare to decimal values.  The result will
be negative, zero, or positive if A is less than, equal to, or greater than B,
respectively.

<p>The decimal_sum(X) function is an aggregate, like the built-in
[sum() aggregate function] except that decimal_sum() computes its result
to arbitrary precision and is therefore precise.

<p>Finally, the decimal extension provides the "decimal" collating sequences
that compares decimal text strings in numeric order.

<h1>Techniques</h1>

<p>
The following SQL illustrates how to use the ieee754 and decimal
extensions compute the exact decimal equivalent
for a binary64 floating-point number.

<codeblock>
-- The pow2 table will hold all the necessary powers of two.
CREATE TABLE pow2(x INTEGER PRIMARY KEY, v TEXT);
WITH RECURSIVE c(x,v) AS (
  VALUES(0,'1')
  UNION ALL
  SELECT x+1, decimal_mul(v,'2') FROM c WHERE x+1&lt;=971
) INSERT INTO pow2(x,v) SELECT x, v FROM c;
WITH RECURSIVE c(x,v) AS (
  VALUES(-1,'0.5')
  UNION ALL
  SELECT x-1, decimal_mul(v,'0.5') FROM c WHERE x-1&gt;=-1075
) INSERT INTO pow2(x,v) SELECT x, v FROM c;

-- This query finds the decimal representation of each value in the "c" table.
WITH c(n) AS (VALUES(47.49))
      ---------------^^^^^---- Replace with whatever you want
SELECT decimal_mul(ieee754_mantissa(c.n),pow2.v)
  FROM pow2, c WHERE pow2.x=ieee754_exponent(c.n);
</codeblock>







|














|


















|



282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
</ul>
</p>

<p>These functions respectively add, substract, and multiply their arguments
and return a new text string that is the decimal representation of the result.
There is no division operator at this time.

<p>Use the decimal_cmp(A,B) to compare two decimal values.  The result will
be negative, zero, or positive if A is less than, equal to, or greater than B,
respectively.

<p>The decimal_sum(X) function is an aggregate, like the built-in
[sum() aggregate function] except that decimal_sum() computes its result
to arbitrary precision and is therefore precise.

<p>Finally, the decimal extension provides the "decimal" collating sequences
that compares decimal text strings in numeric order.

<h1>Techniques</h1>

<p>
The following SQL illustrates how to use the ieee754 and decimal
extensions to compute the exact decimal equivalent
for a binary64 floating-point number.

<codeblock>
-- The pow2 table will hold all the necessary powers of two.
CREATE TABLE pow2(x INTEGER PRIMARY KEY, v TEXT);
WITH RECURSIVE c(x,v) AS (
  VALUES(0,'1')
  UNION ALL
  SELECT x+1, decimal_mul(v,'2') FROM c WHERE x+1&lt;=971
) INSERT INTO pow2(x,v) SELECT x, v FROM c;
WITH RECURSIVE c(x,v) AS (
  VALUES(-1,'0.5')
  UNION ALL
  SELECT x-1, decimal_mul(v,'0.5') FROM c WHERE x-1&gt;=-1075
) INSERT INTO pow2(x,v) SELECT x, v FROM c;

-- This query finds the decimal representation of each value in the "c" table.
WITH c(n) AS (VALUES(47.49))
                 ----XXXXX----------- Replace with whatever you want
SELECT decimal_mul(ieee754_mantissa(c.n),pow2.v)
  FROM pow2, c WHERE pow2.x=ieee754_exponent(c.n);
</codeblock>