
Notes About Checkin 393a3d19ae

JSON parsing performance was measured by this script:

.param init
INSERT INTO temp.[$Parameters](key,value) 
VALUES('$json',readfile('/home/drh/tmp/gsoc-2018.json'));
SELECT length(printf('{"a":%d,"b":%s}',50,$json));
.timer on
WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
SELECT DISTINCT json_valid(printf('{"a":%d,"b":%s}',x,$json)) FROM c;

WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x<1000)
SELECT DISTINCT substr(printf('{"a":%d,"b":%s}',x,$json),1,5) FROM c;

It is necessary to feed slightly different JSON strings into the parser on each cycle in order to defeat the internal JSON parse cache. The first query (after starting the timer) measures the parser speed. The second query measures all of the extraneous non-parsing overhead of the first query. The idea is that the time used by the parser is the time of the first query minus the overhead time of the second query. Running the script above on an optimized build of SQLite on a 4-year-old Ubuntu desktop gives:

3327844
1
Run Time: real 3.218 user 3.176000 sys 0.040000
{"a":
Run Time: real 0.275 user 0.268000 sys 0.008000

So roughly 3 seconds (3.218 minus 0.275, or about 2.94 seconds) were used to parse 3,327,844,000 bytes of JSON (3,327,844 bytes per string times 1000 iterations), which gives a parsing speed in excess of 1.1 GB/s. Round that down to an even 1 GB/s to be conservative.
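As a cross-check, the throughput figure can be recomputed from the numbers reported above. This is a minimal sketch: the string length and run times are the ones printed by the script, and everything else is plain arithmetic.

```python
# Recompute the JSON parsing throughput from the measurements above.
bytes_per_string = 3327844      # length() reported by the script
iterations = 1000               # rows generated by the recursive CTE
total_bytes = bytes_per_string * iterations

parse_run = 3.218               # real time of the json_valid() query
overhead_run = 0.275            # real time of the substr() query
parse_time = parse_run - overhead_run   # time attributable to the parser

throughput = total_bytes / parse_time / 1e9   # GB/s
print(f"{throughput:.2f} GB/s")               # prints "1.13 GB/s"
```

This reproduces the "in excess of 1.1 GB/s" figure stated in the text.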