Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Refactor the sqlite3ExprCodeIN() routine for improved maintainability. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | rowvalue |
Files: | files | file ages | folders |
SHA1: |
b56705ae6374db9db82613ef89faa1a1 |
User & Date: | drh 2016-08-25 21:14:34.728 |
Context
2016-08-25
| ||
22:31 | Merge recent changes from trunk. (check-in: 5789aab8ef user: drh tags: rowvalue) | |
21:14 | Refactor the sqlite3ExprCodeIN() routine for improved maintainability. (check-in: b56705ae63 user: drh tags: rowvalue) | |
17:47 | Another fix in the IN-operator algorithm description. (check-in: f474aeac4f user: drh tags: rowvalue) | |
Changes
Changes to src/expr.c.
︙ | ︙ | |||
2634 2635 2636 2637 2638 2639 2640 | /* ** Generate code for an IN expression. ** ** x IN (SELECT ...) ** x IN (value, value, ...) ** ** The left-hand side (LHS) is a scalar or vector expression. The | | > > > > | | > | | > | | | | > > > > > > | | 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 | /* ** Generate code for an IN expression. ** ** x IN (SELECT ...) ** x IN (value, value, ...) ** ** The left-hand side (LHS) is a scalar or vector expression. The ** right-hand side (RHS) is an array of zero or more scalar values, or a ** subquery. If the RHS is a subquery, the number of result columns must ** match the number of columns in the vector on the LHS. If the RHS is ** a list of values, the LHS must be a scalar. ** ** The IN operator is true if the LHS value is contained within the RHS. ** The result is false if the LHS is definitely not in the RHS. The ** result is NULL if the presence of the LHS in the RHS cannot be ** determined due to NULLs. ** ** This routine generates code that jumps to destIfFalse if the LHS is not ** contained within the RHS. If due to NULLs we cannot determine if the LHS ** is contained in the RHS then jump to destIfNull. If the LHS is contained ** within the RHS then fall through. ** ** See the separate in-operator.md documentation file in the canonical ** SQLite source tree for additional information. */ static void sqlite3ExprCodeIN( Parse *pParse, /* Parsing and code generating context */ Expr *pExpr, /* The IN expression */ int destIfFalse, /* Jump here if LHS is not contained in the RHS */ int destIfNull /* Jump here if the results are unknown due to NULLs */ ){ int rRhsHasNull = 0; /* Register that is true if RHS contains NULL values */ int eType; /* Type of the RHS */ int rLhs; /* Register(s) holding the LHS values */ int rLhsOrig; /* LHS values prior to reordering by aiMap[] */ Vdbe *v; /* Statement under construction */ int *aiMap = 0; /* Map from vector field to index column */ char *zAff = 0; /* Affinity string for comparisons */ int nVector; /* Size of vectors for this IN operator */ int iDummy; /* Dummy parameter to exprCodeVector() */ Expr *pLeft; /* The LHS of the IN operator */ int i; /* loop counter */ int destStep2; /* Where to jump when NULLs seen in step 2 */ int destStep6 = 0; /* Start of code for Step 6 */ int addrTruthOp; /* Address of opcode that determines the IN is true */ int destNotNull; /* Jump here if a comparison is not true in step 6 */ int addrTop; /* Top of the step-6 loop */ pLeft = pExpr->pLeft; if( sqlite3ExprCheckIN(pParse, pExpr) ) return; zAff = exprINAffinity(pParse, pExpr); nVector = sqlite3ExprVectorSize(pExpr->pLeft); aiMap = (int*)sqlite3DbMallocZero( pParse->db, nVector*(sizeof(int) + sizeof(char)) + 1 ); if( pParse->db->mallocFailed ) goto sqlite3ExprCodeIN_oom_error; /* Attempt to compute the RHS. After this step, if anything other than ** IN_INDEX_NOOP is returned, the table opened ith cursor pExpr->iTable ** contains the values that make up the RHS. If IN_INDEX_NOOP is returned, ** the RHS has not yet been coded. */ v = pParse->pVdbe; assert( v!=0 ); /* OOM detected prior to this routine */ |
︙ | ︙ | |||
2699 2700 2701 2702 2703 2704 2705 2706 2707 | assert( cnt==1 ); } #endif /* Code the LHS, the <expr> from "<expr> IN (...)". If the LHS is a ** vector, then it is stored in an array of nVector registers starting ** at r1. */ sqlite3ExprCachePush(pParse); | > > > > > | | | | | | > > | | | | | | > | < | | < < < < | > > > > | | | | | | | < < < < < < | | > > > | | > > | | > | < < | | > > | | < < < | | | | | | > > | | | | < | < < < < < < | | | < | | < < < < < | < | < < < < < < < < < | < < < < | | | > > > > > > > > > > > > > > > > > > | | < | | < < < < < | < < < < < < | > | | | | > > | | | > > | > | | | 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 | assert( cnt==1 ); } #endif /* Code the LHS, the <expr> from "<expr> IN (...)". If the LHS is a ** vector, then it is stored in an array of nVector registers starting ** at r1. ** ** sqlite3FindInIndex() might have reordered the fields of the LHS vector ** so that the fields are in the same order as an existing index. The ** aiMap[] array contains a mapping from the original LHS field order to ** the field order that matches the RHS index. */ sqlite3ExprCachePush(pParse); rLhsOrig = exprCodeVector(pParse, pLeft, &iDummy); for(i=0; i<nVector && aiMap[i]==i; i++){} /* Are LHS fields reordered? */ if( i==nVector ){ /* LHS fields are not reordered */ rLhs = rLhsOrig; }else{ /* Need to reorder the LHS fields according to aiMap */ rLhs = sqlite3GetTempRange(pParse, nVector); for(i=0; i<nVector; i++){ sqlite3VdbeAddOp3(v, OP_Copy, rLhsOrig+i, rLhs+aiMap[i], 0); } } /* If sqlite3FindInIndex() did not find or create an index that is ** suitable for evaluating the IN operator, then evaluate using a ** sequence of comparisons. ** ** This is step (1) in the in-operator.md optimized algorithm. */ if( eType==IN_INDEX_NOOP ){ ExprList *pList = pExpr->x.pList; CollSeq *pColl = sqlite3ExprCollSeq(pParse, pExpr->pLeft); int labelOk = sqlite3VdbeMakeLabel(v); int r2, regToFree; int regCkNull = 0; int ii; assert( !ExprHasProperty(pExpr, EP_xIsSelect) ); if( destIfNull!=destIfFalse ){ regCkNull = sqlite3GetTempReg(pParse); sqlite3VdbeAddOp3(v, OP_BitAnd, rLhs, rLhs, regCkNull); } for(ii=0; ii<pList->nExpr; ii++){ r2 = sqlite3ExprCodeTemp(pParse, pList->a[ii].pExpr, ®ToFree); if( regCkNull && sqlite3ExprCanBeNull(pList->a[ii].pExpr) ){ sqlite3VdbeAddOp3(v, OP_BitAnd, regCkNull, r2, regCkNull); } if( ii<pList->nExpr-1 || destIfNull!=destIfFalse ){ sqlite3VdbeAddOp4(v, OP_Eq, rLhs, labelOk, r2, (void*)pColl, P4_COLLSEQ); VdbeCoverageIf(v, ii<pList->nExpr-1); VdbeCoverageIf(v, ii==pList->nExpr-1); sqlite3VdbeChangeP5(v, zAff[0]); }else{ assert( destIfNull==destIfFalse ); sqlite3VdbeAddOp4(v, OP_Ne, rLhs, destIfFalse, r2, (void*)pColl, P4_COLLSEQ); VdbeCoverage(v); sqlite3VdbeChangeP5(v, zAff[0] | SQLITE_JUMPIFNULL); } sqlite3ReleaseTempReg(pParse, regToFree); } if( regCkNull ){ sqlite3VdbeAddOp2(v, OP_IsNull, regCkNull, destIfNull); VdbeCoverage(v); sqlite3VdbeGoto(v, destIfFalse); } sqlite3VdbeResolveLabel(v, labelOk); sqlite3ReleaseTempReg(pParse, regCkNull); goto sqlite3ExprCodeIN_finished; } /* Step 2: Check to see if the LHS contains any NULL columns. If the ** LHS does contain NULLs then the result must be either FALSE or NULL. ** We will then skip the binary search of the RHS. */ if( destIfNull==destIfFalse ){ destStep2 = destIfFalse; }else{ destStep2 = destStep6 = sqlite3VdbeMakeLabel(v); } for(i=0; i<nVector; i++){ Expr *p = sqlite3VectorFieldSubexpr(pExpr->pLeft, i); if( sqlite3ExprCanBeNull(p) ){ sqlite3VdbeAddOp2(v, OP_IsNull, rLhs+i, destStep2); VdbeCoverage(v); } } /* Step 3. The LHS is now known to be non-NULL. Do the binary search ** of the RHS using the LHS as a probe. If found, the result is ** true. */ if( eType==IN_INDEX_ROWID ){ /* In this case, the RHS is the ROWID of table b-tree and so we also ** know that the RHS is non-NULL. Hence, we combine steps 3 and 4 ** into a single opcode. */ sqlite3VdbeAddOp3(v, OP_SeekRowid, pExpr->iTable, destIfFalse, rLhs); VdbeCoverage(v); addrTruthOp = sqlite3VdbeAddOp0(v, OP_Goto); /* Return True */ }else{ sqlite3VdbeAddOp4(v, OP_Affinity, rLhs, nVector, 0, zAff, nVector); if( destIfFalse==destIfNull ){ /* Combine Step 3 and Step 5 into a single opcode */ sqlite3VdbeAddOp4Int(v, OP_NotFound, pExpr->iTable, destIfFalse, rLhs, nVector); VdbeCoverage(v); goto sqlite3ExprCodeIN_finished; } /* Ordinary Step 3, for the case where FALSE and NULL are distinct */ addrTruthOp = sqlite3VdbeAddOp4Int(v, OP_Found, pExpr->iTable, 0, rLhs, nVector); VdbeCoverage(v); } /* Step 4. If the RHS is known to be non-NULL and we did not find ** an match on the search above, then the result must be FALSE. */ if( rRhsHasNull && nVector==1 ){ sqlite3VdbeAddOp2(v, OP_NotNull, rRhsHasNull, destIfFalse); VdbeCoverage(v); } /* Step 5. If we do not care about the difference between NULL and ** FALSE, then just return false. */ if( destIfFalse==destIfNull ) sqlite3VdbeGoto(v, destIfFalse); /* Step 6: Loop through rows of the RHS. Compare each row to the LHS. ** If any comparison is NULL, then the result is NULL. If all ** comparisons are FALSE then the final result is FALSE. ** ** For a scalar LHS, it is sufficient to check just the first row ** of the RHS. */ if( destStep6 ) sqlite3VdbeResolveLabel(v, destStep6); addrTop = sqlite3VdbeAddOp2(v, OP_Rewind, pExpr->iTable, destIfFalse); VdbeCoverage(v); if( nVector>1 ){ destNotNull = sqlite3VdbeMakeLabel(v); }else{ /* For nVector==1, combine steps 6 and 7 by immediately returning ** FALSE if the first comparison is not NULL */ destNotNull = destIfFalse; } for(i=0; i<nVector; i++){ Expr *p; CollSeq *pColl; int r3 = sqlite3GetTempReg(pParse); p = sqlite3VectorFieldSubexpr(pLeft, i); pColl = sqlite3ExprCollSeq(pParse, p); sqlite3VdbeAddOp3(v, OP_Column, pExpr->iTable, i, r3); sqlite3VdbeAddOp4(v, OP_Ne, rLhs+i, destNotNull, r3, (void*)pColl, P4_COLLSEQ); VdbeCoverage(v); sqlite3ReleaseTempReg(pParse, r3); } sqlite3VdbeAddOp2(v, OP_Goto, 0, destIfNull); if( nVector>1 ){ sqlite3VdbeResolveLabel(v, destNotNull); sqlite3VdbeAddOp2(v, OP_Next, pExpr->iTable, addrTop+1); VdbeCoverage(v); /* Step 7: If we reach this point, we know that the result must ** be false. */ sqlite3VdbeAddOp2(v, OP_Goto, 0, destIfFalse); } /* Jumps here in order to return true. */ sqlite3VdbeJumpHere(v, addrTruthOp); sqlite3ExprCodeIN_finished: if( rLhs!=rLhsOrig ) sqlite3ReleaseTempReg(pParse, rLhs); sqlite3ExprCachePop(pParse); VdbeComment((v, "end IN expr")); sqlite3ExprCodeIN_oom_error: sqlite3DbFree(pParse->db, aiMap); sqlite3DbFree(pParse->db, zAff); } #endif /* SQLITE_OMIT_SUBQUERY */ #ifndef SQLITE_OMIT_FLOATING_POINT /* |
︙ | ︙ |
Changes to src/in-operator.md.
1 2 3 4 5 6 7 8 | IN-Operator Implementation Notes ================================ ## Definitions: An IN operator has one of the following formats: > | | > > > > > | > > > | > > > | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | IN-Operator Implementation Notes ================================ ## Definitions: An IN operator has one of the following formats: > x IN (y1,y2,y3,...,yN) x IN (subquery) The "x" is referred to as the LHS (left-hand side). The list or subquery on the right is called the RHS (right-hand side). If the RHS is a list it must be a non-empty list. But if the RHS is a subquery, it can be an empty set. The LHS can be a scalar (a single quantity) or a vector (a list of two or or more values) or a subquery that returns one or more columns. We use the term "vector" to mean an actually list of values or a subquery that returns two or more columns. An isolated value or a subquery that returns a single columns is called a scalar. The RHS can be a subquery that returns a single column, a subquery that returns two or more columns, or a list of scalars. It is not currently support for the RHS to be a list of vectors. The number of columns for LHS must match the number of columns for the RHS. If the RHS is a list of values, then the LHS must be a scalar. If the RHS is a subquery returning N columns, then the LHS must be a vector of size N. NULL values can occur in either or both of the LHS and RHS. If the LHS contains only NULL values then we say that it is a "total-NULL". If the LHS contains some NULL values and some non-NULL values, then it is a "partial-NULL". For a scalar, there is no difference between a partial-NULL and a total-NULL. The RHS is a partial-NULL if any row contains a NULL value. The RHS is |
︙ | ︙ | |||
80 81 82 83 84 85 86 | ahead to step 5. 3. Do a binary search of the RHS using the LHS as a probe. If an exact match is found, return TRUE. 4. If the RHS is non-NULL then return FALSE. | | | 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 | ahead to step 5. 3. Do a binary search of the RHS using the LHS as a probe. If an exact match is found, return TRUE. 4. If the RHS is non-NULL then return FALSE. 5. If we do not need to distinguish between FALSE and NULL, then return FALSE. 6. For each row in the RHS, compare that row against the LHS and if the result is NULL, immediately return NULL. In the case of a scalar IN operator, we only need to look at the very first row the RHS because for a scalar RHS, all NULLs will always come first. If the RHS is empty, this step is a no-op. 7. Return FALSE. |