/ Check-in [2f397bd8]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:COUNT(DISTINCT x) is now fully functional, though it could benefit from additional testing. (CVS 2688)
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 2f397bd8142ec5e8f7238127012d14fdb558d918
User & Date: drh 2005-09-11 11:56:28
Context
2005-09-12
23:03
Bug fix in the new COUNT(DISTINCT) logic. Ticket #1420. (CVS 2689) check-in: 37116550 user: drh tags: trunk
2005-09-11
11:56
COUNT(DISTINCT x) is now fully functional, though it could benefit from additional testing. (CVS 2688) check-in: 2f397bd8 user: drh tags: trunk
02:03
The ".dump" command from the shell correctly saves the state of the sqlite_sequence and sqlite_stat1 tables, if they exist. Ticket #1419. (CVS 2687) check-in: 3f191cf4 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to src/parse.y.

10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
...
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
**
*************************************************************************
** This file contains SQLite's grammar for SQL.  Process this file
** using the lemon parser generator to generate C code that runs
** the parser.  Lemon will also generate a header file containing
** numeric codes for all of the tokens.
**
** @(#) $Id: parse.y,v 1.178 2005/09/10 16:46:13 drh Exp $
*/

// All token codes are small integers with #defines that begin with "TK_"
%token_prefix TK_

// The type of the data attached to each token is Token.  This is also the
// default type for non-terminals.
................................................................................
  sqlite3ExprSpan(A,&X,&Y);
}
%endif // SQLITE_OMIT_CAST
expr(A) ::= ID(X) LP distinct(D) exprlist(Y) RP(E). {
  A = sqlite3ExprFunction(Y, &X);
  sqlite3ExprSpan(A,&X,&E);
  if( D ){
    sqlite3ErrorMsg(pParse, "DISTINCT in an aggregate function "
        "is not currently supported");
    A->flags |= EP_Distinct;
  }
}
expr(A) ::= ID(X) LP STAR RP(E). {
  A = sqlite3ExprFunction(0, &X);
  sqlite3ExprSpan(A,&X,&E);
}







|







 







<
<







10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
...
661
662
663
664
665
666
667


668
669
670
671
672
673
674
**
*************************************************************************
** This file contains SQLite's grammar for SQL.  Process this file
** using the lemon parser generator to generate C code that runs
** the parser.  Lemon will also generate a header file containing
** numeric codes for all of the tokens.
**
** @(#) $Id: parse.y,v 1.179 2005/09/11 11:56:28 drh Exp $
*/

// All token codes are small integers with #defines that begin with "TK_"
%token_prefix TK_

// The type of the data attached to each token is Token.  This is also the
// default type for non-terminals.
................................................................................
  sqlite3ExprSpan(A,&X,&Y);
}
%endif // SQLITE_OMIT_CAST
expr(A) ::= ID(X) LP distinct(D) exprlist(Y) RP(E). {
  A = sqlite3ExprFunction(Y, &X);
  sqlite3ExprSpan(A,&X,&E);
  if( D ){


    A->flags |= EP_Distinct;
  }
}
expr(A) ::= ID(X) LP STAR RP(E). {
  A = sqlite3ExprFunction(0, &X);
  sqlite3ExprSpan(A,&X,&E);
}

Changes to src/select.c.

8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
...
359
360
361
362
363
364
365




























366
367
368
369
370
371
372
...
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
....
2403
2404
2405
2406
2407
2408
2409

2410
2411
2412
2413
2414
2415
2416
2417
2418












2419
2420
2421
2422
2423
2424
2425
....
2444
2445
2446
2447
2448
2449
2450

2451
2452
2453
2454
2455
2456
2457





2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470



2471
2472
2473
2474
2475
2476
2477
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** This file contains C code routines that are called by the parser
** to handle SELECT statements in SQLite.
**
** $Id: select.c,v 1.267 2005/09/10 15:28:09 drh Exp $
*/
#include "sqliteInt.h"


/*
** Allocate a new Select structure and return a pointer to that
** structure.
................................................................................
    VdbeComment((v, "# skip OFFSET records"));
  }
  if( p->iLimit>=0 && iBreak!=0 ){
    sqlite3VdbeAddOp(v, OP_MemIncr, p->iLimit, iBreak);
    VdbeComment((v, "# exit when LIMIT reached"));
  }
}





























/*
** This routine generates the code for the inside of the inner loop
** of a SELECT.
**
** If srcTab and nColumn are both zero, then the pEList expressions
** are evaluated in order to get the data for this row.  If nColumn>0
................................................................................

  /* If the DISTINCT keyword was present on the SELECT statement
  ** and this row has been seen before, then do not make this row
  ** part of the result.
  */
  if( hasDistinct ){
    int n = pEList->nExpr;
#if NULL_ALWAYS_DISTINCT
    sqlite3VdbeAddOp(v, OP_IsNull, -pEList->nExpr, sqlite3VdbeCurrentAddr(v)+7);
#endif
    /* Deliberately leave the affinity string off of the following
    ** OP_MakeRecord */
    sqlite3VdbeAddOp(v, OP_MakeRecord, -n, 0);
    sqlite3VdbeAddOp(v, OP_Distinct, distinct, sqlite3VdbeCurrentAddr(v)+3);
    sqlite3VdbeAddOp(v, OP_Pop, n+1, 0);
    sqlite3VdbeAddOp(v, OP_Goto, 0, iContinue);
    VdbeComment((v, "# skip indistinct records"));
    sqlite3VdbeAddOp(v, OP_IdxInsert, distinct, 0);
    if( pOrderBy==0 ){
      codeLimiter(v, p, iContinue, iBreak, nColumn);
    }
  }

  switch( eDest ){
    /* In this mode, write each query result to the key of the temporary
................................................................................
** intermediate results while calculating an aggregate.  This
** routine simply stores NULLs in all of those memory cells.
*/
static void resetAccumulator(Parse *pParse, AggInfo *pAggInfo){
  Vdbe *v = pParse->pVdbe;
  int i;
  int addr;

  if( pAggInfo->nFunc+pAggInfo->nColumn==0 ){
    return;
  }
  sqlite3VdbeAddOp(v, OP_Null, 0, 0);
  for(i=0; i<pAggInfo->nColumn; i++){
    addr = sqlite3VdbeAddOp(v, OP_MemStore, pAggInfo->aCol[i].iMem, 0);
  }
  for(i=0; i<pAggInfo->nFunc; i++){
    addr = sqlite3VdbeAddOp(v, OP_MemStore, pAggInfo->aFunc[i].iMem, 0);












  }
  sqlite3VdbeChangeP2(v, addr, 1);
}

/*
** Invoke the OP_AggFinalize opcode for every aggregate function
** in the AggInfo structure.
................................................................................
  int i;
  struct AggInfo_func *pF;
  struct AggInfo_col *pC;

  pAggInfo->directMode = 1;
  for(i=0, pF=pAggInfo->aFunc; i<pAggInfo->nFunc; i++, pF++){
    int nArg;

    ExprList *pList = pF->pExpr->pList;
    if( pList ){
      nArg = pList->nExpr;
      sqlite3ExprCodeExprList(pParse, pList);
    }else{
      nArg = 0;
    }





    if( pF->pFunc->needCollSeq ){
      CollSeq *pColl = 0;
      struct ExprList_item *pItem;
      int j;
      for(j=0, pItem=pList->a; !pColl && j<pList->nExpr; j++, pItem++){
        pColl = sqlite3ExprCollSeq(pParse, pItem->pExpr);
      }
      if( !pColl ){
        pColl = pParse->db->pDfltColl;
      }
      sqlite3VdbeOp3(v, OP_CollSeq, 0, 0, (char *)pColl, P3_COLLSEQ);
    }
    sqlite3VdbeOp3(v, OP_AggStep, pF->iMem, nArg, (void*)pF->pFunc, P3_FUNCDEF);



  }
  for(i=0, pC=pAggInfo->aCol; i<pAggInfo->nAccumulator; i++, pC++){
    sqlite3ExprCode(pParse, pC->pExpr);
    sqlite3VdbeAddOp(v, OP_MemStore, pC->iMem, 1);
  }
  pAggInfo->directMode = 0;
}







|







 







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







 







|
<
<
<
<
<
<
<
<
<
<







 







>







|
|
>
>
>
>
>
>
>
>
>
>
>
>







 







>







>
>
>
>
>













>
>
>







8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
...
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
...
443
444
445
446
447
448
449
450










451
452
453
454
455
456
457
....
2421
2422
2423
2424
2425
2426
2427
2428
2429
2430
2431
2432
2433
2434
2435
2436
2437
2438
2439
2440
2441
2442
2443
2444
2445
2446
2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
....
2475
2476
2477
2478
2479
2480
2481
2482
2483
2484
2485
2486
2487
2488
2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
2505
2506
2507
2508
2509
2510
2511
2512
2513
2514
2515
2516
2517
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**
*************************************************************************
** This file contains C code routines that are called by the parser
** to handle SELECT statements in SQLite.
**
** $Id: select.c,v 1.268 2005/09/11 11:56:28 drh Exp $
*/
#include "sqliteInt.h"


/*
** Allocate a new Select structure and return a pointer to that
** structure.
................................................................................
    VdbeComment((v, "# skip OFFSET records"));
  }
  if( p->iLimit>=0 && iBreak!=0 ){
    sqlite3VdbeAddOp(v, OP_MemIncr, p->iLimit, iBreak);
    VdbeComment((v, "# exit when LIMIT reached"));
  }
}

/*
** Add code that will check to make sure the top N elements of the
** stack are distinct.  iTab is a sorting index that holds previously
** seen combinations of the N values.  A new entry is made in iTab
** if the current N values are new.
**
** A jump to addrRepeat is made and the K values are popped from the
** stack if the top N elements are not distinct.
*/
static void codeDistinct(
  Vdbe *v,           /* Generate code into this VM */
  int iTab,          /* A sorting index used to test for distinctness */
  int addrRepeat,    /* Jump to here if not distinct */
  int N,             /* The top N elements of the stack must be distinct */
  int K              /* Pop K elements from the stack if indistinct */
){
#if NULL_ALWAYS_DISTINCT
  sqlite3VdbeAddOp(v, OP_IsNull, -N, sqlite3VdbeCurrentAddr(v)+6);
#endif
  sqlite3VdbeAddOp(v, OP_MakeRecord, -N, 0);
  sqlite3VdbeAddOp(v, OP_Distinct, iTab, sqlite3VdbeCurrentAddr(v)+3);
  sqlite3VdbeAddOp(v, OP_Pop, K, 0);
  sqlite3VdbeAddOp(v, OP_Goto, 0, addrRepeat);
  VdbeComment((v, "# skip indistinct records"));
  sqlite3VdbeAddOp(v, OP_IdxInsert, iTab, 0);
}


/*
** This routine generates the code for the inside of the inner loop
** of a SELECT.
**
** If srcTab and nColumn are both zero, then the pEList expressions
** are evaluated in order to get the data for this row.  If nColumn>0
................................................................................

  /* If the DISTINCT keyword was present on the SELECT statement
  ** and this row has been seen before, then do not make this row
  ** part of the result.
  */
  if( hasDistinct ){
    int n = pEList->nExpr;
    codeDistinct(v, distinct, iContinue, n, n+1);










    if( pOrderBy==0 ){
      codeLimiter(v, p, iContinue, iBreak, nColumn);
    }
  }

  switch( eDest ){
    /* In this mode, write each query result to the key of the temporary
................................................................................
** intermediate results while calculating an aggregate.  This
** routine simply stores NULLs in all of those memory cells.
*/
static void resetAccumulator(Parse *pParse, AggInfo *pAggInfo){
  Vdbe *v = pParse->pVdbe;
  int i;
  int addr;
  struct AggInfo_func *pFunc;
  if( pAggInfo->nFunc+pAggInfo->nColumn==0 ){
    return;
  }
  sqlite3VdbeAddOp(v, OP_Null, 0, 0);
  for(i=0; i<pAggInfo->nColumn; i++){
    addr = sqlite3VdbeAddOp(v, OP_MemStore, pAggInfo->aCol[i].iMem, 0);
  }
  for(pFunc=pAggInfo->aFunc, i=0; i<pAggInfo->nFunc; i++, pFunc++){
    addr = sqlite3VdbeAddOp(v, OP_MemStore, pFunc->iMem, 0);
    if( pFunc->iDistinct>=0 ){
      Expr *pE = pFunc->pExpr;
      if( pE->pList==0 || pE->pList->nExpr!=1 ){
        sqlite3ErrorMsg(pParse, "DISTINCT in aggregate must be followed "
           "by an expression");
        pFunc->iDistinct = -1;
      }else{
        KeyInfo *pKeyInfo = keyInfoFromExprList(pParse, pE->pList);
        sqlite3VdbeOp3(v, OP_OpenVirtual, pFunc->iDistinct, 0, 
                          (char*)pKeyInfo, P3_KEYINFO_HANDOFF);
      }
    }
  }
  sqlite3VdbeChangeP2(v, addr, 1);
}

/*
** Invoke the OP_AggFinalize opcode for every aggregate function
** in the AggInfo structure.
................................................................................
  int i;
  struct AggInfo_func *pF;
  struct AggInfo_col *pC;

  pAggInfo->directMode = 1;
  for(i=0, pF=pAggInfo->aFunc; i<pAggInfo->nFunc; i++, pF++){
    int nArg;
    int addrNext = 0;
    ExprList *pList = pF->pExpr->pList;
    if( pList ){
      nArg = pList->nExpr;
      sqlite3ExprCodeExprList(pParse, pList);
    }else{
      nArg = 0;
    }
    if( pF->iDistinct>=0 ){
      addrNext = sqlite3VdbeMakeLabel(v);
      assert( nArg==1 );
      codeDistinct(v, pF->iDistinct, addrNext, 1, 1);
    }
    if( pF->pFunc->needCollSeq ){
      CollSeq *pColl = 0;
      struct ExprList_item *pItem;
      int j;
      for(j=0, pItem=pList->a; !pColl && j<pList->nExpr; j++, pItem++){
        pColl = sqlite3ExprCollSeq(pParse, pItem->pExpr);
      }
      if( !pColl ){
        pColl = pParse->db->pDfltColl;
      }
      sqlite3VdbeOp3(v, OP_CollSeq, 0, 0, (char *)pColl, P3_COLLSEQ);
    }
    sqlite3VdbeOp3(v, OP_AggStep, pF->iMem, nArg, (void*)pF->pFunc, P3_FUNCDEF);
    if( addrNext ){
      sqlite3VdbeResolveLabel(v, addrNext);
    }
  }
  for(i=0, pC=pAggInfo->aCol; i<pAggInfo->nAccumulator; i++, pC++){
    sqlite3ExprCode(pParse, pC->pExpr);
    sqlite3VdbeAddOp(v, OP_MemStore, pC->iMem, 1);
  }
  pAggInfo->directMode = 0;
}

Added test/distinctagg.test.

























































































>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
# 2005 September 11
#
# The author disclaims copyright to this source code.  In place of
# a legal notice, here is a blessing:
#
#    May you do good and not evil.
#    May you find forgiveness for yourself and forgive others.
#    May you share freely, never taking more than you give.
#
#***********************************************************************
# This file implements regression tests for SQLite library.  The
# focus of this script is the DISTINCT modifier on aggregate functions.
#
# $Id: distinctagg.test,v 1.1 2005/09/11 11:56:28 drh Exp $


set testdir [file dirname $argv0]
source $testdir/tester.tcl

do_test distinctagg-1.1 {
  execsql {
    CREATE TABLE t1(a,b,c);
    INSERT INTO t1 VALUES(1,2,3);
    INSERT INTO t1 VALUES(1,3,4);
    INSERT INTO t1 VALUES(1,3,5);
    SELECT count(distinct a),
           count(distinct b),
           count(distinct c),
           count(all a) FROM t1;
  }
} {1 2 3 3}
do_test distinctagg-1.2 {
  execsql {
    SELECT b, count(distinct c) FROM t1 GROUP BY b ORDER BY b
  }
} {2 1 3 2}

do_test distinctagg-2.1 {
  catchsql {
    SELECT count(distinct) FROM t1;
  }
} {1 {DISTINCT in aggregate must be followed by an expression}}

finish_test

Changes to www/lang.tcl.

1
2
3
4
5
6
7
8
9
10
11
....
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359









1360
1361
1362
1363
1364
1365
1366
#
# Run this Tcl script to generate the lang-*.html files.
#
set rcsid {$Id: lang.tcl,v 1.99 2005/09/08 20:37:44 drh Exp $}
source common.tcl

if {[llength $argv]>0} {
  set outputdir [lindex $argv 0]
} else {
  set outputdir ""
}
................................................................................
upper-case letters.  The implementation of this function uses the C library
routine <b>toupper()</b> which means it may not work correctly on 
UTF-8 strings.</td>
</tr>
</table>

<p>
The following aggregate functions are available by default.  Additional
aggregate functions written in C may be added using the 
<a href="capi3ref.html#sqlite3_create_function">sqlite3_create_function()</a>
API.</p>










<table border=0 cellpadding=10>
<tr>
<td valign="top" align="right" width=120>avg(<i>X</i>)</td>
<td valign="top">Return the average value of all non-NULL <i>X</i> within a
group.  Non-numeric values are interpreted as 0.</td>
</tr>




|







 







|




>
>
>
>
>
>
>
>
>







1
2
3
4
5
6
7
8
9
10
11
....
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
#
# Run this Tcl script to generate the lang-*.html files.
#
set rcsid {$Id: lang.tcl,v 1.100 2005/09/11 11:56:28 drh Exp $}
source common.tcl

if {[llength $argv]>0} {
  set outputdir [lindex $argv 0]
} else {
  set outputdir ""
}
................................................................................
upper-case letters.  The implementation of this function uses the C library
routine <b>toupper()</b> which means it may not work correctly on 
UTF-8 strings.</td>
</tr>
</table>

<p>
The aggregate functions shown below are available by default.  Additional
aggregate functions written in C may be added using the 
<a href="capi3ref.html#sqlite3_create_function">sqlite3_create_function()</a>
API.</p>

<p>
In any aggregate function that takes a single argument, that argument
can be preceeded by the keyword DISTINCT.  In such cases, duplicate
elements are filtered before being passed into the aggregate function.
For example, the function "count(distinct X)" will return the number
of distinct values of column X instead of the total number of non-null
values in column X.
</p>

<table border=0 cellpadding=10>
<tr>
<td valign="top" align="right" width=120>avg(<i>X</i>)</td>
<td valign="top">Return the average value of all non-NULL <i>X</i> within a
group.  Non-numeric values are interpreted as 0.</td>
</tr>

Changes to www/omitted.tcl.

1
2
3
4
5
6
7
8
9
10
11
..
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
#
# Run this script to generated a omitted.html output file
#
set rcsid {$Id: omitted.tcl,v 1.8 2005/03/19 01:41:22 drh Exp $}
source common.tcl
header {SQL Features That SQLite Does Not Implement}
puts {
<h2>SQL Features That SQLite Does Not Implement</h2>

<p>
Rather than try to list all the features of SQL92 that SQLite does
................................................................................
  DROP COLUMN, ALTER COLUMN, ADD CONSTRAINT, and so forth are omitted.
}

feature {Nested transactions} {
  The current implementation only allows a single active transaction.
}

feature {The COUNT(DISTINCT X) function} {
  You can accomplish the same thing using a subquery, like this:<br />
  &nbsp;&nbsp;SELECT count(x) FROM (SELECT DISTINCT x FROM tbl);
}

feature {RIGHT and FULL OUTER JOIN} {
  LEFT OUTER JOIN is implemented, but not RIGHT OUTER JOIN or
  FULL OUTER JOIN.
}

feature {Writing to VIEWs} {
  VIEWs in SQLite are read-only.  You may not execute a DELETE, INSERT, or



|







 







<
<
<
<
<







1
2
3
4
5
6
7
8
9
10
11
..
52
53
54
55
56
57
58





59
60
61
62
63
64
65
#
# Run this script to generated a omitted.html output file
#
set rcsid {$Id: omitted.tcl,v 1.9 2005/09/11 11:56:28 drh Exp $}
source common.tcl
header {SQL Features That SQLite Does Not Implement}
puts {
<h2>SQL Features That SQLite Does Not Implement</h2>

<p>
Rather than try to list all the features of SQL92 that SQLite does
................................................................................
  DROP COLUMN, ALTER COLUMN, ADD CONSTRAINT, and so forth are omitted.
}

feature {Nested transactions} {
  The current implementation only allows a single active transaction.
}






feature {RIGHT and FULL OUTER JOIN} {
  LEFT OUTER JOIN is implemented, but not RIGHT OUTER JOIN or
  FULL OUTER JOIN.
}

feature {Writing to VIEWs} {
  VIEWs in SQLite are read-only.  You may not execute a DELETE, INSERT, or