Documentation Source Text

Update of "checkin/393a3d19ae2e7b56d1909d4225cc098c7825556473a5df9704a54c6925c1e42b"

Overview

Artifact ID: cfbf5c9679105526b3553fcd16b03ed14258277d3f4091fc7dd4e1301d06705e
Page Name: checkin/393a3d19ae2e7b56d1909d4225cc098c7825556473a5df9704a54c6925c1e42b
Date: 2019-03-01 13:37:39
Original User: drh
Content

JSON parsing performance was measured by this script:

.param init
INSERT INTO temp.[$Parameters](key,value) 
VALUES('$json',readfile('/home/drh/tmp/gsoc-2018.json'));
SELECT length(printf('{"a":%d,"b":%s}',50,$json));
.timer on
WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
SELECT DISTINCT json_valid(printf('{"a":%d,"b":%s}',x,$json)) FROM c;

WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
SELECT DISTINCT substr(printf('{"a":%d,"b":%s}',x,$json),1,5) FROM c;
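The same measurement idea can be reproduced outside the sqlite3 shell. The sketch below uses Python's bundled sqlite3 module; since readfile() is a shell-only extension, a synthetic JSON payload stands in for /home/drh/tmp/gsoc-2018.json, so the absolute timings will differ from those reported below. It assumes an SQLite build with the JSON functions available (the default in modern builds).

```python
# Hypothetical re-creation of the benchmark using Python's bundled sqlite3
# module. A synthetic payload replaces the original JSON file, so the
# numbers will not match the measurements quoted in this page.
import sqlite3
import time

con = sqlite3.connect(":memory:")

# Synthetic stand-in for the large JSON file (~130 KB here).
big_json = '{"k":[' + ",".join(str(i) for i in range(20_000)) + "]}"

def timed(sql):
    """Run sql with :json bound to the payload; return (rows, seconds)."""
    t0 = time.perf_counter()
    rows = con.execute(sql, {"json": big_json}).fetchall()
    return rows, time.perf_counter() - t0

# Parsing query: the %d prefix varies with x, so each of the 1000 generated
# strings is distinct and cannot be served from the JSON parse cache.
rows, t_parse = timed("""
    WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
    SELECT DISTINCT json_valid(printf('{"a":%d,"b":%s}', x, :json)) FROM c
""")

# Overhead query: builds the same strings but only takes a substr(),
# so no JSON parsing occurs.
_, t_overhead = timed("""
    WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
    SELECT DISTINCT substr(printf('{"a":%d,"b":%s}', x, :json), 1, 5) FROM c
""")

print(rows)  # every generated string is valid JSON, so a single row of 1
print("parse-only seconds:", t_parse - t_overhead)
```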

It is necessary to feed a slightly different JSON string into the parser on each cycle in order to defeat the JSON parse cache. The first query (after starting the timer) measures the parser speed. The second query measures all of the extraneous non-parsing overhead of the first query. The idea is that the time used by the parser is the time of the first query minus the overhead time of the second query. Running the script above on an optimized build of SQLite on a 4-year-old Ubuntu desktop gives:

3327844
1
Run Time: real 3.218 user 3.176000 sys 0.040000
{"a":
Run Time: real 0.275 user 0.268000 sys 0.008000

So roughly 3 seconds were used to parse 3,327,844,000 bytes of JSON, which gives a parsing speed in excess of 1.1 GB/s. Round it down to an even 1 GB/s to be conservative.
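That throughput figure can be checked with a little arithmetic, using the numbers copied from the timer output above:

```python
# Back-of-envelope check of the quoted throughput.
# All values are taken from the script output above.
string_len = 3_327_844   # bytes in each generated JSON string
iterations = 1000        # cycles of the recursive CTE
t_total    = 3.218       # real time of the parsing query, seconds
t_overhead = 0.275       # real time of the overhead-only query, seconds

t_parse = t_total - t_overhead                # ~2.94 s of pure parsing
throughput = string_len * iterations / t_parse
print(f"{throughput / 1e9:.2f} GB/s")         # prints "1.13 GB/s"
```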