Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Begin making changes to the IN operator in an attempt to make it run faster and to make the code easier to understand. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | IN-operator-improvements |
Files: | files | file ages | folders |
SHA1: |
ee0fd6aaf94cda1dce3fe752bfe3b0f8 |
User & Date: | drh 2014-08-01 14:46:57.155 |
Context
2014-08-01
| ||
15:34 | The idea of coding IN operator with a short list on the RHS as an OR expression turns out to be helpful. If the list is of length 1 or 2, the OR expression is very slightly faster, but the ephemeral table approach is clearly better for all list lengths greater than 2. Better to keep the code simple. (Closed-Leaf check-in: e13175d357 user: drh tags: IN-operator-improvements) | |
14:46 | Begin making changes to the IN operator in an attempt to make it run faster and to make the code easier to understand. (check-in: ee0fd6aaf9 user: drh tags: IN-operator-improvements) | |
01:40 | Enhance the PRAGMA integrity_check command to detect UNIQUE and NOT NULL constraint violations. (check-in: 9abcf2698c user: drh tags: trunk) | |
Changes
Changes to src/expr.c.
︙ | ︙ | |||
1480 1481 1482 1483 1484 1485 1486 | ** The pX parameter is the expression on the RHS of the IN operator, which ** might be either a list of expressions or a subquery. ** ** The job of this routine is to find or create a b-tree object that can ** be used either to test for membership in the RHS set or to iterate through ** all members of the RHS set, skipping duplicates. ** | | > > | > > > > > > > | | | | | > > > > > > > | < > | | | > | 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 | ** The pX parameter is the expression on the RHS of the IN operator, which ** might be either a list of expressions or a subquery. ** ** The job of this routine is to find or create a b-tree object that can ** be used either to test for membership in the RHS set or to iterate through ** all members of the RHS set, skipping duplicates. ** ** A cursor is opened on the b-tree object that is the RHS of the IN operator ** and pX->iTable is set to the index of that cursor. ** ** The returned value of this function indicates the b-tree type, as follows: ** ** IN_INDEX_ROWID - The cursor was opened on a database table. ** IN_INDEX_INDEX_ASC - The cursor was opened on an ascending index. ** IN_INDEX_INDEX_DESC - The cursor was opened on a descending index. ** IN_INDEX_EPH - The cursor was opened on a specially created and ** populated epheremal table. ** IN_INDEX_NOOP - No cursor was allocated. The in operator must be ** implemented as a sequence of comparisons. ** ** An existing b-tree might be used if the RHS expression pX is a simple ** subquery such as: ** ** SELECT <column> FROM <table> ** ** If the RHS of the IN operator is a list or a more complex subquery, then ** an ephemeral table might need to be generated from the RHS and then ** pX->iTable made to point to the ephermeral table instead of an ** existing table. ** ** The inFlags parameter must contain exactly one of the bits ** IN_INDEX_MEMBERSHIP or IN_INDEX_LOOP. If inFlags contains ** IN_INDEX_MEMBERSHIP, then the generated table will be used for a ** fast membership test. When the IN_INDEX_LOOP bit is set, the ** IN index will be used to loop over all values of the RHS of the ** IN operator. ** ** When IN_INDEX_LOOP is used (and the b-tree will be used to iterate ** through the set members) then the b-tree must not contain duplicates. ** An epheremal table must be used unless the selected <column> is guaranteed ** to be unique - either because it is an INTEGER PRIMARY KEY or it ** has a UNIQUE constraint or UNIQUE index. ** ** When IN_INDEX_MEMBERSHIP is used (and the b-tree will be used ** for fast set membership tests) then an epheremal table must ** be used unless <column> is an INTEGER PRIMARY KEY or an index can ** be found with <column> as its left-most column. ** ** If the IN_INDEX_NOOP_OK and IN_INDEX_MEMBERSHIP are both set and ** if the RHS of the IN operator is a list (not a subquery) then this ** routine might decide that creating an ephemeral b-tree for membership ** testing is too expensive and return IN_INDEX_NOOP. IN that case, the ** calling routine should implement the IN operator using a sequence ** of Eq or Ne comparison operations. ** ** When the b-tree is being used for membership tests, the calling function ** might need to know whether or not the RHS side of the IN operator ** contains a NULL. If prNotFound is not NULL and ** if there is any chance that the (...) might contain a NULL value at ** runtime, then a register is allocated and the register number written ** to *prNotFound. If there is no chance that the (...) contains a ** NULL value, then *prNotFound is left unchanged. ** ** If a register is allocated and its location stored in *prNotFound, then ** its initial value is NULL. If the (...) does not remain constant ** for the duration of the query (i.e. the SELECT within the (...) ** is a correlated subquery) then the value of the allocated register is ** reset to NULL each time the subquery is rerun. This allows the ** caller to use vdbe code equivalent to the following: ** ** if( register==NULL ){ ** has_null = <test if data structure contains null> ** register = 1 ** } ** ** in order to avoid running the <test if data structure contains null> ** test more often than is necessary. */ #ifndef SQLITE_OMIT_SUBQUERY int sqlite3FindInIndex(Parse *pParse, Expr *pX, u32 inFlags, int *prNotFound){ Select *p; /* SELECT to the right of IN operator */ int eType = 0; /* Type of RHS table. IN_INDEX_* */ int iTab = pParse->nTab++; /* Cursor of the RHS table */ int mustBeUnique; /* True if RHS must be unique */ Vdbe *v = sqlite3GetVdbe(pParse); /* Virtual machine being coded */ assert( pX->op==TK_IN ); mustBeUnique = (inFlags & IN_INDEX_LOOP)!=0; /* Check to see if an existing table or index can be used to ** satisfy the query. This is preferable to generating a new ** ephemeral table. */ p = (ExprHasProperty(pX, EP_xIsSelect) ? pX->x.pSelect : 0); if( ALWAYS(pParse->nErr==0) && isCandidateForInOpt(p) ){ |
︙ | ︙ | |||
1626 1627 1628 1629 1630 1631 1632 | if( eType==0 ){ /* Could not find an existing table or index to use as the RHS b-tree. ** We will have to generate an ephemeral table to do the job. */ u32 savedNQueryLoop = pParse->nQueryLoop; int rMayHaveNull = 0; eType = IN_INDEX_EPH; | | < < < > > > | 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 | if( eType==0 ){ /* Could not find an existing table or index to use as the RHS b-tree. ** We will have to generate an ephemeral table to do the job. */ u32 savedNQueryLoop = pParse->nQueryLoop; int rMayHaveNull = 0; eType = IN_INDEX_EPH; if( inFlags & IN_INDEX_LOOP ){ pParse->nQueryLoop = 0; if( pX->pLeft->iColumn<0 && !ExprHasProperty(pX, EP_xIsSelect) ){ eType = IN_INDEX_ROWID; } }else if( prNotFound ){ *prNotFound = rMayHaveNull = ++pParse->nMem; sqlite3VdbeAddOp2(v, OP_Null, 0, rMayHaveNull); } sqlite3CodeSubselect(pParse, pX, rMayHaveNull, eType==IN_INDEX_ROWID); pParse->nQueryLoop = savedNQueryLoop; }else{ pX->iTable = iTab; } return eType; |
︙ | ︙ | |||
1664 1665 1666 1667 1668 1669 1670 | ** to be of the form "<rowid> IN (?, ?, ?)", where <rowid> is a reference ** to some integer key column of a table B-Tree. In this case, use an ** intkey B-Tree to store the set of IN(...) values instead of the usual ** (slower) variable length keys B-Tree. ** ** If rMayHaveNull is non-zero, that means that the operation is an IN ** (not a SELECT or EXISTS) and that the RHS might contains NULLs. | < < | | | < < < < | 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 | ** to be of the form "<rowid> IN (?, ?, ?)", where <rowid> is a reference ** to some integer key column of a table B-Tree. In this case, use an ** intkey B-Tree to store the set of IN(...) values instead of the usual ** (slower) variable length keys B-Tree. ** ** If rMayHaveNull is non-zero, that means that the operation is an IN ** (not a SELECT or EXISTS) and that the RHS might contains NULLs. ** All this routine does is initialize the register given by rMayHaveNull ** to NULL. Calling routines will take care of changing this register ** value to non-NULL if the RHS is NULL-free. ** ** For a SELECT or EXISTS operator, return the register that holds the ** result. For IN operators or if an error occurs, the return value is 0. */ #ifndef SQLITE_OMIT_SUBQUERY int sqlite3CodeSubselect( Parse *pParse, /* Parsing context */ |
︙ | ︙ | |||
1924 1925 1926 1927 1928 1929 1930 | /* Compute the RHS. After this step, the table with cursor ** pExpr->iTable will contains the values that make up the RHS. */ v = pParse->pVdbe; assert( v!=0 ); /* OOM detected prior to this routine */ VdbeNoopComment((v, "begin IN expr")); | | > | 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 | /* Compute the RHS. After this step, the table with cursor ** pExpr->iTable will contains the values that make up the RHS. */ v = pParse->pVdbe; assert( v!=0 ); /* OOM detected prior to this routine */ VdbeNoopComment((v, "begin IN expr")); eType = sqlite3FindInIndex(pParse, pExpr, 0, destIfFalse==destIfNull ? 0 : &rRhsHasNull); /* Figure out the affinity to use to create a key from the results ** of the expression. affinityStr stores a static string suitable for ** P4 of OP_MakeRecord. */ affinity = comparisonAffinity(pExpr); |
︙ | ︙ |
Changes to src/sqliteInt.h.
︙ | ︙ | |||
3587 3588 3589 3590 3591 3592 3593 | void sqlite3BeginBenignMalloc(void); void sqlite3EndBenignMalloc(void); #else #define sqlite3BeginBenignMalloc() #define sqlite3EndBenignMalloc() #endif | > > > | | | | > > > > > > > | | 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 | void sqlite3BeginBenignMalloc(void); void sqlite3EndBenignMalloc(void); #else #define sqlite3BeginBenignMalloc() #define sqlite3EndBenignMalloc() #endif /* ** Allowed return values from sqlite3FindInIndex() */ #define IN_INDEX_ROWID 1 /* Search the rowid of the table */ #define IN_INDEX_EPH 2 /* Search an ephemeral b-tree */ #define IN_INDEX_INDEX_ASC 3 /* Existing index ASCENDING */ #define IN_INDEX_INDEX_DESC 4 /* Existing index DESCENDING */ #define IN_INDEX_NOOP 5 /* No table available. Use comparisons */ /* ** Allowed flags for the 3rd parameter to sqlite3FindInIndex(). */ #define IN_INDEX_NOOP_OK 0x0001 /* OK to return IN_INDEX_NOOP */ #define IN_INDEX_MEMBERSHIP 0x0002 /* IN operator used for membership test */ #define IN_INDEX_LOOP 0x0004 /* IN operator used as a loop */ int sqlite3FindInIndex(Parse *, Expr *, u32, int*); #ifdef SQLITE_ENABLE_ATOMIC_WRITE int sqlite3JournalOpen(sqlite3_vfs *, const char *, sqlite3_file *, int, int); int sqlite3JournalSize(sqlite3_vfs *); int sqlite3JournalCreate(sqlite3_file *); int sqlite3JournalExists(sqlite3_file *p); #else |
︙ | ︙ |
Changes to src/where.c.
︙ | ︙ | |||
2518 2519 2520 2521 2522 2523 2524 | ){ testcase( iEq==0 ); testcase( bRev ); bRev = !bRev; } assert( pX->op==TK_IN ); iReg = iTarget; | | | 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 | ){ testcase( iEq==0 ); testcase( bRev ); bRev = !bRev; } assert( pX->op==TK_IN ); iReg = iTarget; eType = sqlite3FindInIndex(pParse, pX, IN_INDEX_LOOP, 0); if( eType==IN_INDEX_INDEX_DESC ){ testcase( bRev ); bRev = !bRev; } iTab = pX->iTable; sqlite3VdbeAddOp2(v, bRev ? OP_Last : OP_Rewind, iTab, 0); VdbeCoverageIf(v, bRev); |
︙ | ︙ |