Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Clean up the IN operator code generation logic to make it easier to reason about. In the process, improve code generation to omit some unused OP_Null operations. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
7c6fbcfe6ed5739e8e4639b7b123fbf9 |
User & Date: | drh 2014-08-01 15:51:36.528 |
Context
2014-08-01
| ||
18:00 | Remove an unnecessary OP_Null in the IN-operator logic. Attempt to clarify comments explaining the IN-operator code, though it is not clear that the comments are correct even yet - more work to be done. (check-in: c11e55fabb user: drh tags: trunk) | |
15:51 | Clean up the IN operator code generation logic to make it easier to reason about. In the process, improve code generation to omit some unused OP_Null operations. (check-in: 7c6fbcfe6e user: drh tags: trunk) | |
15:34 | The idea of coding IN operator with a short list on the RHS as an OR expression turns out to be helpful. If the list is of length 1 or 2, the OR expression is very slightly faster, but the ephemeral table approach is clearly better for all list lengths greater than 2. Better to keep the code simple. (Closed-Leaf check-in: e13175d357 user: drh tags: IN-operator-improvements) | |
01:40 | Enhance the PRAGMA integrity_check command to detect UNIQUE and NOT NULL constraint violations. (check-in: 9abcf2698c user: drh tags: trunk) | |
Changes
Changes to src/expr.c.
︙ | ︙ | |||
1480 1481 1482 1483 1484 1485 1486 | ** The pX parameter is the expression on the RHS of the IN operator, which ** might be either a list of expressions or a subquery. ** ** The job of this routine is to find or create a b-tree object that can ** be used either to test for membership in the RHS set or to iterate through ** all members of the RHS set, skipping duplicates. ** | | | > > > > > > > | | | | | | | | | | > | 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 | ** The pX parameter is the expression on the RHS of the IN operator, which ** might be either a list of expressions or a subquery. ** ** The job of this routine is to find or create a b-tree object that can ** be used either to test for membership in the RHS set or to iterate through ** all members of the RHS set, skipping duplicates. ** ** A cursor is opened on the b-tree object that is the RHS of the IN operator ** and pX->iTable is set to the index of that cursor. ** ** The returned value of this function indicates the b-tree type, as follows: ** ** IN_INDEX_ROWID - The cursor was opened on a database table. ** IN_INDEX_INDEX_ASC - The cursor was opened on an ascending index. ** IN_INDEX_INDEX_DESC - The cursor was opened on a descending index. ** IN_INDEX_EPH - The cursor was opened on a specially created and ** populated epheremal table. ** ** An existing b-tree might be used if the RHS expression pX is a simple ** subquery such as: ** ** SELECT <column> FROM <table> ** ** If the RHS of the IN operator is a list or a more complex subquery, then ** an ephemeral table might need to be generated from the RHS and then ** pX->iTable made to point to the ephermeral table instead of an ** existing table. ** ** The inFlags parameter must contain exactly one of the bits ** IN_INDEX_MEMBERSHIP or IN_INDEX_LOOP. If inFlags contains ** IN_INDEX_MEMBERSHIP, then the generated table will be used for a ** fast membership test. When the IN_INDEX_LOOP bit is set, the ** IN index will be used to loop over all values of the RHS of the ** IN operator. ** ** When IN_INDEX_LOOP is used (and the b-tree will be used to iterate ** through the set members) then the b-tree must not contain duplicates. ** An epheremal table must be used unless the selected <column> is guaranteed ** to be unique - either because it is an INTEGER PRIMARY KEY or it ** has a UNIQUE constraint or UNIQUE index. ** ** When IN_INDEX_MEMBERSHIP is used (and the b-tree will be used ** for fast set membership tests) then an epheremal table must ** be used unless <column> is an INTEGER PRIMARY KEY or an index can ** be found with <column> as its left-most column. ** ** When the b-tree is being used for membership tests, the calling function ** might need to know whether or not the RHS side of the IN operator ** contains a NULL. If prNotFound is not NULL and ** if there is any chance that the (...) might contain a NULL value at ** runtime, then a register is allocated and the register number written ** to *prNotFound. If there is no chance that the (...) contains a ** NULL value, then *prNotFound is left unchanged. ** ** If a register is allocated and its location stored in *prNotFound, then ** its initial value is NULL. If the (...) does not remain constant ** for the duration of the query (i.e. the SELECT within the (...) ** is a correlated subquery) then the value of the allocated register is ** reset to NULL each time the subquery is rerun. This allows the ** caller to use vdbe code equivalent to the following: ** ** if( register==NULL ){ ** has_null = <test if data structure contains null> ** register = 1 ** } ** ** in order to avoid running the <test if data structure contains null> ** test more often than is necessary. */ #ifndef SQLITE_OMIT_SUBQUERY int sqlite3FindInIndex(Parse *pParse, Expr *pX, u32 inFlags, int *prNotFound){ Select *p; /* SELECT to the right of IN operator */ int eType = 0; /* Type of RHS table. IN_INDEX_* */ int iTab = pParse->nTab++; /* Cursor of the RHS table */ int mustBeUnique; /* True if RHS must be unique */ Vdbe *v = sqlite3GetVdbe(pParse); /* Virtual machine being coded */ assert( pX->op==TK_IN ); mustBeUnique = (inFlags & IN_INDEX_LOOP)!=0; /* Check to see if an existing table or index can be used to ** satisfy the query. This is preferable to generating a new ** ephemeral table. */ p = (ExprHasProperty(pX, EP_xIsSelect) ? pX->x.pSelect : 0); if( ALWAYS(pParse->nErr==0) && isCandidateForInOpt(p) ){ |
︙ | ︙ | |||
1626 1627 1628 1629 1630 1631 1632 | if( eType==0 ){ /* Could not find an existing table or index to use as the RHS b-tree. ** We will have to generate an ephemeral table to do the job. */ u32 savedNQueryLoop = pParse->nQueryLoop; int rMayHaveNull = 0; eType = IN_INDEX_EPH; | | < < < > > > | 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 | if( eType==0 ){ /* Could not find an existing table or index to use as the RHS b-tree. ** We will have to generate an ephemeral table to do the job. */ u32 savedNQueryLoop = pParse->nQueryLoop; int rMayHaveNull = 0; eType = IN_INDEX_EPH; if( inFlags & IN_INDEX_LOOP ){ pParse->nQueryLoop = 0; if( pX->pLeft->iColumn<0 && !ExprHasProperty(pX, EP_xIsSelect) ){ eType = IN_INDEX_ROWID; } }else if( prNotFound ){ *prNotFound = rMayHaveNull = ++pParse->nMem; sqlite3VdbeAddOp2(v, OP_Null, 0, rMayHaveNull); } sqlite3CodeSubselect(pParse, pX, rMayHaveNull, eType==IN_INDEX_ROWID); pParse->nQueryLoop = savedNQueryLoop; }else{ pX->iTable = iTab; } return eType; |
︙ | ︙ | |||
1664 1665 1666 1667 1668 1669 1670 | ** to be of the form "<rowid> IN (?, ?, ?)", where <rowid> is a reference ** to some integer key column of a table B-Tree. In this case, use an ** intkey B-Tree to store the set of IN(...) values instead of the usual ** (slower) variable length keys B-Tree. ** ** If rMayHaveNull is non-zero, that means that the operation is an IN ** (not a SELECT or EXISTS) and that the RHS might contains NULLs. | < < | | | < < < < | 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 | ** to be of the form "<rowid> IN (?, ?, ?)", where <rowid> is a reference ** to some integer key column of a table B-Tree. In this case, use an ** intkey B-Tree to store the set of IN(...) values instead of the usual ** (slower) variable length keys B-Tree. ** ** If rMayHaveNull is non-zero, that means that the operation is an IN ** (not a SELECT or EXISTS) and that the RHS might contains NULLs. ** All this routine does is initialize the register given by rMayHaveNull ** to NULL. Calling routines will take care of changing this register ** value to non-NULL if the RHS is NULL-free. ** ** For a SELECT or EXISTS operator, return the register that holds the ** result. For IN operators or if an error occurs, the return value is 0. */ #ifndef SQLITE_OMIT_SUBQUERY int sqlite3CodeSubselect( Parse *pParse, /* Parsing context */ |
︙ | ︙ | |||
1924 1925 1926 1927 1928 1929 1930 | /* Compute the RHS. After this step, the table with cursor ** pExpr->iTable will contains the values that make up the RHS. */ v = pParse->pVdbe; assert( v!=0 ); /* OOM detected prior to this routine */ VdbeNoopComment((v, "begin IN expr")); | | > | 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 | /* Compute the RHS. After this step, the table with cursor ** pExpr->iTable will contains the values that make up the RHS. */ v = pParse->pVdbe; assert( v!=0 ); /* OOM detected prior to this routine */ VdbeNoopComment((v, "begin IN expr")); eType = sqlite3FindInIndex(pParse, pExpr, IN_INDEX_MEMBERSHIP, destIfFalse==destIfNull ? 0 : &rRhsHasNull); /* Figure out the affinity to use to create a key from the results ** of the expression. affinityStr stores a static string suitable for ** P4 of OP_MakeRecord. */ affinity = comparisonAffinity(pExpr); |
︙ | ︙ | |||
1970 1971 1972 1973 1974 1975 1976 | /* If the set membership test fails, then the result of the ** "x IN (...)" expression must be either 0 or NULL. If the set ** contains no NULL values, then the result is 0. If the set ** contains one or more NULL values, then the result of the ** expression is also NULL. */ | > | | 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 | /* If the set membership test fails, then the result of the ** "x IN (...)" expression must be either 0 or NULL. If the set ** contains no NULL values, then the result is 0. If the set ** contains one or more NULL values, then the result of the ** expression is also NULL. */ assert( destIfFalse!=destIfNull || rRhsHasNull==0 ); if( rRhsHasNull==0 ){ /* This branch runs if it is known at compile time that the RHS ** cannot contain NULL values. This happens as the result ** of a "NOT NULL" constraint in the database schema. ** ** Also run this branch if NULL is equivalent to FALSE ** for this particular IN operator. */ |
︙ | ︙ |
Changes to src/sqliteInt.h.
︙ | ︙ | |||
3587 3588 3589 3590 3591 3592 3593 | void sqlite3BeginBenignMalloc(void); void sqlite3EndBenignMalloc(void); #else #define sqlite3BeginBenignMalloc() #define sqlite3EndBenignMalloc() #endif | > > > | | | | > > > > > | | 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 | void sqlite3BeginBenignMalloc(void); void sqlite3EndBenignMalloc(void); #else #define sqlite3BeginBenignMalloc() #define sqlite3EndBenignMalloc() #endif /* ** Allowed return values from sqlite3FindInIndex() */ #define IN_INDEX_ROWID 1 /* Search the rowid of the table */ #define IN_INDEX_EPH 2 /* Search an ephemeral b-tree */ #define IN_INDEX_INDEX_ASC 3 /* Existing index ASCENDING */ #define IN_INDEX_INDEX_DESC 4 /* Existing index DESCENDING */ /* ** Allowed flags for the 3rd parameter to sqlite3FindInIndex(). */ #define IN_INDEX_MEMBERSHIP 0x0001 /* IN operator used for membership test */ #define IN_INDEX_LOOP 0x0002 /* IN operator used as a loop */ int sqlite3FindInIndex(Parse *, Expr *, u32, int*); #ifdef SQLITE_ENABLE_ATOMIC_WRITE int sqlite3JournalOpen(sqlite3_vfs *, const char *, sqlite3_file *, int, int); int sqlite3JournalSize(sqlite3_vfs *); int sqlite3JournalCreate(sqlite3_file *); int sqlite3JournalExists(sqlite3_file *p); #else |
︙ | ︙ |
Changes to src/where.c.
︙ | ︙ | |||
2518 2519 2520 2521 2522 2523 2524 | ){ testcase( iEq==0 ); testcase( bRev ); bRev = !bRev; } assert( pX->op==TK_IN ); iReg = iTarget; | | | 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 | ){ testcase( iEq==0 ); testcase( bRev ); bRev = !bRev; } assert( pX->op==TK_IN ); iReg = iTarget; eType = sqlite3FindInIndex(pParse, pX, IN_INDEX_LOOP, 0); if( eType==IN_INDEX_INDEX_DESC ){ testcase( bRev ); bRev = !bRev; } iTab = pX->iTable; sqlite3VdbeAddOp2(v, bRev ? OP_Last : OP_Rewind, iTab, 0); VdbeCoverageIf(v, bRev); |
︙ | ︙ |