Overview
Comment: | In the testing document, use <codeblock> instead of <pre> and add a section on mutation testing. |
---|---|
SHA1: | 15b43e7cdb48e06ea03a3822b3095430 |
User & Date: | drh 2016-09-05 16:51:45.012 |
Context
2016-09-05
17:18 | More mutation testing documentation. (check-in: 34514986f4 user: drh tags: trunk) | |
16:51 | In the testing document, use <codeblock> instead of <pre> and add a section on mutation testing. (check-in: 15b43e7cdb user: drh tags: trunk) | |
2016-09-03
19:43 | Improved hyperlinks in the history of releases. (check-in: a5156889f3 user: drh tags: trunk) | |
Changes
Changes to pages/testing.in.
︙ | ︙ | |||
484 485 486 487 488 489 490 | coverage measures the number of machine-code branch instructions that are evaluated at least once on both directions.</p> <p>To illustrate the difference between statement coverage and branch coverage, consider the following hypothetical line of C code:</p> | | | | 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 | coverage measures the number of machine-code branch instructions that are evaluated at least once on both directions.</p> <p>To illustrate the difference between statement coverage and branch coverage, consider the following hypothetical line of C code:</p> <codeblock> if( a>b && c!=25 ){ d++; } </codeblock> <p>Such a line of C code might generate a dozen separate machine code instructions. If any one of those instructions is ever evaluated, then we say that the statement has been tested. So, for example, it might be the case that the conditional expression is always false and the "d" variable is never incremented. Even so, statement coverage counts this line of |
︙ | ︙ | |||
532 533 534 535 536 537 538 | macros called ALWAYS() and NEVER(). The ALWAYS() macro surrounds conditions which are expected to always evaluate as true and NEVER() surrounds conditions that are always evaluated to false. These macros serve as comments to indicate that the conditions are defensive code. In release builds, these macros are pass-throughs:</p> | | | | | | | | | | | | | | | | 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 | macros called ALWAYS() and NEVER(). The ALWAYS() macro surrounds conditions which are expected to always evaluate as true and NEVER() surrounds conditions that are always evaluated to false. These macros serve as comments to indicate that the conditions are defensive code. In release builds, these macros are pass-throughs:</p> <codeblock> #define ALWAYS(X) (X) #define NEVER(X) (X) </codeblock> <p>During most testing, however, these macros will throw an assertion fault if their argument does not have the expected truth value. This alerts the developers quickly to incorrect design assumptions.</p> <codeblock> #define ALWAYS(X) ((X)?1:(assert(0),0)) #define NEVER(X) ((X)?(assert(0),1):0) </codeblock> <p>When measuring test coverage, these macros are defined to be constant truth values so that they do not generate assembly language branch instructions, and hence do not come into play when calculating the branch coverage:</p> <codeblock> #define ALWAYS(X) (1) #define NEVER(X) (0) </codeblock> <p>The test suite is designed to be run three times, once for each of the ALWAYS() and NEVER() definitions shown above. All three test runs should yield exactly the same result. 
There is a run-time test using the [sqlite3_test_control]([SQLITE_TESTCTRL_ALWAYS], ...) interface that can be used to verify that the macros are correctly set to the first form (the pass-through form) for deployment.</p> <tcl>hd_fragment {testcase} {testcase macros}</tcl> <h2>Forcing coverage of boundary values and boolean vector tests</h2> <p>Another macro used in conjunction with test coverage measurement is the <tt>testcase()</tt> macro. The argument is a condition for which we want test cases that evaluate to both true and false. In non-coverage builds (that is to say, in release builds) the <tt>testcase()</tt> macro is a no-op:</p> <codeblock> #define testcase(X) </codeblock> <p>But in a coverage measuring build, the <tt>testcase()</tt> macro generates code that evaluates the conditional expression in its argument. Then during analysis, a check is made to ensure tests exist that evaluate the conditional to both true and false. <tt>Testcase()</tt> macros are used, for example, to help verify that boundary values are tested. For example:</p> <codeblock> testcase( a==b ); testcase( a==b+1 ); if( a>b && c!=25 ){ d++; } </codeblock> <p>Testcase macros are also used when two or more cases of a switch statement go to the same block of code, to make sure that the code was reached for all cases:</p> <codeblock> switch( op ){ case OP_Add: case OP_Subtract: { testcase( op==OP_Add ); testcase( op==OP_Subtract ); /* ... */ break; } /* ... */ } </codeblock> <p>For bitmask tests, <tt>testcase()</tt> macros are used to verify that every bit of the bitmask affects the outcome. For example, in the following block of code, the condition is true if the mask contains either of two bits indicating either a MAIN_DB or a TEMP_DB is being opened. 
The <tt>testcase()</tt> macros that precede the if statement verify that both cases are tested:</p> <codeblock> testcase( mask & SQLITE_OPEN_MAIN_DB ); testcase( mask & SQLITE_OPEN_TEMP_DB ); if( (mask & (SQLITE_OPEN_MAIN_DB|SQLITE_OPEN_TEMP_DB))!=0 ){ ... } </codeblock> <p>The SQLite source code contains <tcl>N {$stat(nTestcase)}</tcl> uses of the <tt>testcase()</tt> macro.</p> <tcl>hd_fragment {mcdc} *MC/DC {MC/DC testing}</tcl> <h2>Branch coverage versus MC/DC</h2> |
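The three build flavors described in this chunk can be sketched in one self-contained C file. This is only an illustration, not the SQLite implementation: the switches DEMO_COVERAGE and DEMO_DEBUG, the counter demoHits, and the functions bumpIfGreater() and nameLength() are hypothetical names invented here. In the coverage flavor, testcase() expands to a real branch so that a coverage tool can report whether the condition was seen both true and false, while ALWAYS() collapses to a constant and generates no branch.

```c
#include <assert.h>
#include <string.h>

/* DEMO_COVERAGE and DEMO_DEBUG are hypothetical build switches used only
** in this sketch; they are not taken from the SQLite build system. */
#if defined(DEMO_COVERAGE)
  static int demoHits = 0;            /* stand-in for coverage bookkeeping */
# define testcase(X)  if( X ){ demoHits++; }
# define ALWAYS(X)    (1)
#elif defined(DEMO_DEBUG)
# define testcase(X)
# define ALWAYS(X)    ((X)?1:(assert(0),0))
#else
# define testcase(X)
# define ALWAYS(X)    (X)
#endif

/* Boundary-value testcases placed ahead of the branch they describe. */
static int bumpIfGreater(int a, int b, int c, int d){
  testcase( a==b );
  testcase( a==b+1 );
  if( a>b && c!=25 ){
    d++;
  }
  return d;
}

/* Defensive check: callers are expected never to pass a NULL pointer. */
static size_t nameLength(const char *z){
  if( !ALWAYS(z!=0) ) return 0;
  return strlen(z);
}
```

Compiling the same file with no switch, with -DDEMO_DEBUG, and with -DDEMO_COVERAGE should give identical results for valid inputs, which mirrors the requirement that all three test runs agree.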
︙ | ︙ | |||
688 689 690 691 692 693 694 695 696 697 698 699 700 701 | differences in output indicate either the use of undefined or indeterminate behavior in the SQLite code (and hence a bug), or a bug in the compiler. Note that SQLite has, over the previous decade, encountered bugs in each of GCC, Clang, and MSVC. Compiler bugs, while rare, do happen, which is why it is so important to test the code in an as-delivered configuration. <tcl>hd_fragment thoughts1</tcl> <h2>Experience with full test coverage</h2> <p>The developers of SQLite have found that full coverage testing is an extremely effective method for locating and preventing bugs. Because every single branch | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 | differences in output indicate either the use of undefined or indeterminate behavior in the SQLite code (and hence a bug), or a bug in the compiler. Note that SQLite has, over the previous decade, encountered bugs in each of GCC, Clang, and MSVC. Compiler bugs, while rare, do happen, which is why it is so important to test the code in an as-delivered configuration. <tcl>hd_fragment mutationtests</tcl> <h2>Mutation testing</h2> <p>Using gcov (or similar) to show that every branch instruction is taken at least once in both directions is a good measure of test suite quality. But even better is showing that every branch instruction makes a difference in the output. In other words, we want to show not only that every branch instruction both jumps and falls through but also that every branch is doing useful work and that the test suite is able to detect and verify that work. 
When a branch is found that does not make a difference in the output, that suggests that the code associated with the branch can be removed (reducing the size of the library and perhaps making it run faster) or that the test suite is inadequately testing the feature that the branch implements. <p>SQLite strives to verify that every branch instruction makes a difference using [https://en.wikipedia.org/wiki/Mutation_testing|mutation testing]. A script first compiles the SQLite source code into assembly language (using, for example, the -S option to gcc). Then the script steps through the generated assembly language and, one by one, changes each branch instruction into either an unconditional jump or a no-op, compiles the result, and verifies that the test suite catches the mutation. <p> Unfortunately, SQLite contains many branch instructions that help the code run faster without changing the output. Such branches generate false positives during mutation testing. As an example, consider the following [https://www.sqlite.org/src/artifact/55b5fb474?ln=55-62 | hash function] used to accelerate table-name lookup: <codeblock> 55 static unsigned int strHash(const char *z){ 56 unsigned int h = 0; 57 unsigned char c; 58 while( (c = (unsigned char)*z++)!=0 ){ /*OPTIMIZATION-IF-TRUE*/ 59 h = (h<<3) ^ h ^ sqlite3UpperToLower[c]; 60 } 61 return h; 62 } </codeblock> <p> If the branch instruction that implements the "c!=0" test on line 58 is changed into a no-op, then the while-loop will loop forever and the test suite will fail with a time-out. But if that branch is changed into an unconditional jump, then the hash function will always return 0. The problem is that 0 is a valid hash. A hash function that always returns 0 still works in the sense that SQLite still always gets the correct answer. The table-name hash table degenerates into a linked-list and so the table-name lookups that occur while parsing SQL statements might be a little slower, but the end result will be the same. 
<p> To work around this problem, comments of the form "<code>/*OPTIMIZATION-IF-TRUE*/</code>" and "<code>/*OPTIMIZATION-IF-FALSE*/</code>" are inserted into the SQLite source code to tell the mutation testing script to ignore some branch instructions. <tcl>hd_fragment thoughts1</tcl> <h2>Experience with full test coverage</h2> <p>The developers of SQLite have found that full coverage testing is an extremely effective method for locating and preventing bugs. Because every single branch |
︙ | ︙ |
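The claim in the mutation-testing section, that a hash function which always returns 0 still produces correct lookups, can be checked with a small stand-alone sketch. Everything below is an assumption made for illustration: the chained hash table, the names Entry, Table, hashNormal(), hashMutant(), tableInsert(), tableFind() and mutantIsEquivalent() are invented here, and tolower() stands in for the sqlite3UpperToLower[] table. It is not how SQLite stores table names.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define NBUCKET 8

typedef struct Entry Entry;
struct Entry { const char *zName; int value; Entry *pNext; };
typedef struct Table {
  unsigned int (*xHash)(const char*);   /* hash function under test */
  Entry *aBucket[NBUCKET];              /* heads of the collision chains */
} Table;

/* Hash in the style of the strHash() excerpt; tolower() is an assumed
** substitute for the sqlite3UpperToLower[] lookup table. */
static unsigned int hashNormal(const char *z){
  unsigned int h = 0;
  unsigned char c;
  while( (c = (unsigned char)*z++)!=0 ){
    h = (h<<3) ^ h ^ (unsigned char)tolower(c);
  }
  return h;
}

/* Models the mutant whose loop exits at once: the hash is always 0. */
static unsigned int hashMutant(const char *z){
  (void)z;
  return 0;
}

static void tableInsert(Table *p, Entry *pNew){
  unsigned int i;
  assert( pNew!=0 );
  i = p->xHash(pNew->zName) % NBUCKET;
  pNew->pNext = p->aBucket[i];
  p->aBucket[i] = pNew;
}

static int tableFind(const Table *p, const char *zName, int *pValue){
  Entry *e;
  for(e=p->aBucket[p->xHash(zName) % NBUCKET]; e; e=e->pNext){
    if( strcmp(e->zName, zName)==0 ){ *pValue = e->value; return 1; }
  }
  return 0;
}

/* Returns 1 if every lookup gives the same answer with both hashes. */
static int mutantIsEquivalent(void){
  static const char *azName[] = { "users", "orders", "items", "sqlite_master" };
  Entry aA[4], aB[4];
  Table tA = { hashNormal, {0} };
  Table tB = { hashMutant, {0} };
  int i, v = 0;
  for(i=0; i<4; i++){
    aA[i].zName = aB[i].zName = azName[i];
    aA[i].value = aB[i].value = i*10;
    tableInsert(&tA, &aA[i]);
    tableInsert(&tB, &aB[i]);
  }
  for(i=0; i<4; i++){
    int vA = -1, vB = -1;
    if( !tableFind(&tA, azName[i], &vA) ) return 0;
    if( !tableFind(&tB, azName[i], &vB) ) return 0;
    if( vA!=vB ) return 0;
  }
  /* A name that was never inserted must be absent from both tables. */
  if( tableFind(&tA, "nope", &v) || tableFind(&tB, "nope", &v) ) return 0;
  return 1;
}
```

With hashMutant() every entry lands in bucket 0, so the table behaves like a single linked list. The lookups are slower but the answers are unchanged, which is why a test suite that checks only output cannot distinguish this mutant from the original.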