SQLite

View Ticket
Login
Ticket Hash: 7bbfb7d4422ff413b065b410bf9f3dd00a4dd2a6
Title: Problem with using stale auto-indexes created on materialized sub-selects within trigger programs
Status: Fixed Type: Code_Defect
Severity: Important Priority: Immediate
Subsystem: Virtual_Machine Resolution: Fixed
Last Modified: 2011-12-10 13:53:09
Version Found In: 3.7.9
Description:
Sometimes when a query features a non-correlated sub-select, SQLite executes the query and stores the results in an ephemeral table. If required, it may also generate an automatic-index based on the contents of the ephemeral table. If it does so, it generates the automatic-index lazily - when it is first required by the VDBE at runtime. Once the automatic index has been generated, a flag is set by the OP_Once opcode to indicate that this has taken place and the index does not need to be generated again.

If all this happens within a trigger program, both the automatic index and the flag persist for the lifetime of the sqlite3_step() call, even if the trigger program is invoked more than once (either recursively or because the calling statement modifies or inserts more than one row). The ephemeral table on the other hand, is regenerated each time the trigger program is invoked.

The bug is that even though the sub-select may be non-correlated from the point of view of the trigger program, it may depend on the values of new.* or old.* registers. And so it may change for each invocation of the trigger program. Since the index will not be updated in this case, the second and subsequent invocations of the trigger program can end up using an automatic-index filled with the wrong data. Obviously this can cause a malfunction.

For example:

  CREATE TABLE t1(a, b);
  INSERT INTO t1 VALUES(1, 'one'); 
  INSERT INTO t1 VALUES(2, 'two');
  INSERT INTO t1 VALUES(3, 'three');

  CREATE TABLE t2(c, d);
  INSERT INTO t2 VALUES('one', 'I');
  INSERT INTO t2 VALUES('two', 'II');
  INSERT INTO t2 VALUES('three', 'III');

  CREATE TABLE t3(t3_a PRIMARY KEY, t3_d);
  CREATE TRIGGER t3t AFTER INSERT ON t3 WHEN new.t3_d IS NULL BEGIN
    UPDATE t3 SET t3_d = (
      SELECT d FROM
        (SELECT * FROM t1 WHERE (new.t3_a%2)=(rowid%2) LIMIT 10),
        (SELECT * FROM t2 WHERE (new.t3_a%2)=(rowid%2) LIMIT 10)
      WHERE a = new.t3_a AND b = c
    ) WHERE t3_a = new.t3_a;
  END;

Then:

sqlite> INSERT INTO t3(t3_a) VALUES(1);
sqlite> INSERT INTO t3(t3_a) VALUES(2);
sqlite> INSERT INTO t3(t3_a) VALUES(3);
sqlite> SELECT * FROM t3;
1|I
2|II
3|III
sqlite> DELETE FROM t3;
sqlite> INSERT INTO t3(t3_a) SELECT 1 UNION SELECT 2 UNION SELECT 3;
sqlite> SELECT * FROM t3;
1|I
2|
3|III

When the three rows above are inserted using discrete INSERT statements, the automatic-index associated with the sub-select in the trigger program is generated each time a row is inserted. And things work. However, when all three rows are inserted using a single INSERT, the automatic-index is generated once when the first row is inserted (the row with t3_a==1) and reused by each invocation of the trigger program thereafter. Since the index generated for (t3_a==1) does not include an entry for the tuple (2, 'two', 'two', 'II'), the second invocation of the trigger program (with t3_a==2) malfunctions and writes a NULL into the table instead of the string 'II'.

Note: For the example dataset, the trigger in the above is equivalent to the following. The inner sub-select is convoluted a bit in order to prevent SQLite from applying the flattening optimization, and so that its results depend on the value of "new.t3_a" when the trigger program is invoked.

  CREATE TRIGGER t3t AFTER INSERT ON t3 WHEN new.t3_d IS NULL BEGIN
    UPDATE t3 SET t3_d = (
      SELECT d FROM t1, t2 WHERE a = new.t3_a AND b = c
    ) WHERE t3_a = new.t3_a;
  END;