CPSC 333 --- Lecture 20 --- Friday, February 23, 1996

Principles of Testing

For the *kind* of system (extremely large; often complex; involving numerous developers) discussed in this course...

1) Testing is the process of examining or executing a program *with the intention of finding errors*

... and not in order to somehow deduce that it doesn't contain any errors.

Explanation: It must be assumed that the program *will* include numerous errors when it's first developed. The cost of *correcting* an error increases, drastically, as development proceeds. Therefore, the *sooner* an error is found, the better!

"Corollaries:"
- A *good* test is one that has a high probability of finding an as-yet-undetected error.
- A test *succeeds* if it *does* find an error!

2) It is impossible to completely test any nontrivial module for any system.

Explanation: In general, a single test gives information about the behaviour of the program on *only one input.* If hardware and software bounds (maximum sizes of integer inputs, maximum array lengths, etc.) are ignored, then the number of inputs --- and the number of tests needed for complete or "exhaustive" testing --- is generally infinite. Even if these hardware and software limits are included, the number of tests needed for "exhaustive" testing of a function with only one (simple) input is large --- and the number of tests needed grows *exponentially* with the number of input parameters.

... You don't need your module to have very many parameters before you're forced to conclude that the *time* needed to complete "exhaustive testing" would exceed the lifetime of the universe, even when assuming that the time needed to complete a single test is shorter than is currently possible ...

3) Testing takes Creativity and Hard Work

You can think of this as a "corollary" of #2: Exhaustive testing is impossible, so the best we can hope for is a small set of tests that somehow "cover" most plausible cases. Designing a set of tests that has this property is nontrivial.

4) Test Results should be Recorded...

... for comparison with results obtained during "retesting," after changes have been made ... in order to look for any unexpected or undesirable "side effects" of changes, as well as to see whether the changes helped. (A small sketch of this kind of recorded, repeatable test follows this list.)

5) Testing is Best Done by Several *Independent* Testers

... and not (entirely) by the developers who designed and coded the system.

Explanation: One *very* common source of errors is a misunderstanding of system requirements. If developers misunderstand the requirements, they are also liable to "test the wrong thing" --- designing tests based on their incorrect understanding of what the system is supposed to do. It's also possible that developers can become too "attached to" their own work and will be reluctant to be critical of it.
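As a purely illustrative sketch of principle #4: recorded inputs and their "known good" outputs can be kept in a table, so that exactly the same tests can be re-run --- and the new results compared with the recorded ones --- after the module has been changed. The days_between() module, its stand-in definition, and the recorded values below are all invented for this sketch; they are not part of any particular system discussed in the course.

    /* Illustrative only: a tiny "recorded" test table for a made-up module.
       The real module would normally be compiled separately; a stand-in
       definition is included here so that the sketch is self-contained.   */

    #include <stdio.h>

    /* Stand-in for the (hypothetical) module under test. */
    static int days_between(int start, int end)
    {
        return end - start;
    }

    struct test_case {
        int start, end;    /* recorded inputs                          */
        int expected;      /* recorded output from an earlier test run */
    };

    static const struct test_case recorded[] = {
        { 1,   1,   0 },
        { 1,   2,   1 },
        { 1, 366, 365 },
    };

    int main(void)
    {
        int i, failures = 0;
        int n = (int)(sizeof recorded / sizeof recorded[0]);

        for (i = 0; i < n; i++) {
            int actual = days_between(recorded[i].start, recorded[i].end);
            if (actual != recorded[i].expected) {
                printf("FAIL: days_between(%d,%d) = %d; recorded result was %d\n",
                       recorded[i].start, recorded[i].end,
                       actual, recorded[i].expected);
                failures++;
            }
        }
        printf("%d of %d recorded tests failed\n", failures, n);
        return failures != 0;
    }

If a later change to days_between() has an unintended "side effect," re-running this table reports it immediately, and the recorded results show exactly which cases changed.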
Levels (or "Stages") of Testing

So far, development has proceeded from "general" (the entire system) to "specific" (individual modules). Since *development* of tests can take place as development proceeds, *test design* can proceed in the same way. However, *execution of tests* is performed in reverse order:

1) Unit Testing: Each module in the system is *individually* tested.

2) System Integration and Integration Testing: *After unit testing,* the modules are combined (or "integrated") together in order to form progressively larger and more complicated subsystems --- and each subsystem is tested before it is combined into an even larger system.

When errors are found, it is generally necessary to "roll the process back:" Changes are made to one or more of the modules in the subsystem. The modules that have been changed must be "unit tested" again --- and integration tests for the subsystems containing these modules must be repeated, in order to try to ensure that the detected problems have been eliminated --- and that no new problems have been caused --- by the changes. Eventually the entire system is combined together and tested.

3) Validation Testing: The *people who will be using the delivered system* begin to use the system, partly under "typical working conditions."

4) System Testing: Software is often part of a much larger system that includes hardware, a data base, people, etc. After the software has been tested, it is necessary to test the system as a whole.

We'll (eventually) consider each of these stages in turn.

Types of Tests

Both "static testing" and "dynamic testing" should be performed.

*Static Testing* is testing done directly on the source code of a program, without executing it. Types of static testing that can be performed manually:

- Desk Checking: reading the source code, and scanning for possible errors in either syntax or logic.
- Hand Execution: "playing computer," by reading successive lines of source code and carrying out the appropriate activities by making notations on paper.

Automated static testing can produce lists of errors, highlight questionable coding practices, or flag departures from coding standards. Static analyzers can also provide information about the structure of code, including symbol tables, call graphs (showing which modules are called by which other modules), logic flow graphs, lists of parameters passed to each module, etc.

*Dynamic Testing* tests the behaviour of a module or program during execution. Two major approaches to dynamic testing:

- Black Box Testing: also called "functional testing:" includes tests based on the functional requirements of programs (as given by requirements and module specifications).
- White Box Testing: also called "glass box testing" or "structural testing:" includes tests based on the internal workings and operations of a module.

It isn't possible to perform either of these types of tests "exhaustively": For black box testing, there are generally too many possible inputs to test them all. For white box testing, there are generally too many control paths through a module for all to be checked in any reasonable amount of time.

Black Box Testing is useful for finding
- incorrect or missing functions
- interface errors
- errors in data structures or external data base interfaces
- performance errors
- initialization and termination errors

This type of testing is conducted to some extent during unit testing, and more extensively during later testing stages.
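To make the "functional" view concrete, here is a small invented example of how black box tests might be chosen: the module leap_year() and its specification are assumptions made up for this sketch. The tests are taken from the *specification* alone --- century years, non-century multiples of four, ordinary years, and the stated boundary years --- without looking at how leap_year() is coded.

    /* Illustrative only: black box tests chosen from a (made-up) specification
       for leap_year(), without examining how leap_year() is implemented.      */

    #include <assert.h>
    #include <stdio.h>

    /* Stand-in for the module under test, included so the sketch compiles. */
    static int leap_year(int y)
    {
        return (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
    }

    int main(void)
    {
        /* Cases suggested by the specification, not by the code: */
        assert(leap_year(1996) == 1);   /* ordinary multiple of four            */
        assert(leap_year(1995) == 0);   /* ordinary non-leap year               */
        assert(leap_year(1900) == 0);   /* century year that is not a leap year */
        assert(leap_year(2000) == 1);   /* century year divisible by 400        */
        assert(leap_year(1583) == 0);   /* smallest year the spec allows        */
        assert(leap_year(9999) == 0);   /* largest year the spec allows         */

        printf("all black box tests passed\n");
        return 0;
    }

Notice that none of these cases depends on *how* leap_year() computes its answer --- which is exactly the point of the "functional" view.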
White Box Testing --- typically tries to ensure that

- all "independent" control paths through a module have been checked
- all logical decisions within a module have been exercised on both their "true" and "false" sides (or, all possible "cases" for a *case* statement have been tested)
- all loops have been exercised at their boundaries --- both minimal and maximal numbers of iterations are checked --- as well as for "typical" numbers of iterations
- all internal data structures are checked

This type of testing is conducted at earlier stages --- extensively, during unit testing. (A small sketch of such "structural" tests appears at the end of these notes.)

"Why Perform White Box Testing?"

- Frequency of logical errors and incorrect assumptions appears to be *inversely* proportional to the probability that a program path will "normally" be executed --- and we want to find these errors (in infrequently used parts of the program) *soon*
- We often believe that a logical path is unlikely to be executed when, in fact, it will be executed on a regular basis
- Typographical errors are "random" --- they are as likely to be on "obscure" paths as anywhere else

... and white box testing *does* attempt to check these "obscure" control paths.
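As a small invented illustration of the "structural" view promised above: the search routine find() and the limit MAX_LEN below are made up for this sketch. The tests are chosen by looking at the *code* --- the if statement is driven both ways, and the loop is run zero times, exactly once, a "typical" number of times, and its maximum number of times.

    /* Illustrative only: white box (structural) tests for a made-up module.
       The tests are chosen from the structure of find(), not from its
       specification.                                                       */

    #include <assert.h>
    #include <stdio.h>

    #define MAX_LEN 100   /* assumed implementation limit on array length */

    /* Module under test: return the index of key in a[0..n-1], or -1. */
    static int find(const int a[], int n, int key)
    {
        int i;
        for (i = 0; i < n; i++)     /* loop: exercised at its boundaries              */
            if (a[i] == key)        /* decision: needs both a "true" and "false" test */
                return i;
        return -1;
    }

    int main(void)
    {
        int small[] = { 7, 3, 9, 3 };
        int big[MAX_LEN];
        int i;

        for (i = 0; i < MAX_LEN; i++)
            big[i] = i;

        assert(find(small, 0, 7) == -1);   /* loop executes zero times (n = 0)  */
        assert(find(small, 1, 7) ==  0);   /* loop executes exactly once        */
        assert(find(small, 4, 9) ==  2);   /* decision false, false, then true  */
        assert(find(small, 4, 8) == -1);   /* decision false on every iteration */
        assert(find(big, MAX_LEN, MAX_LEN - 1) == MAX_LEN - 1);
                                           /* loop executes its maximal number of times */

        printf("all white box tests passed\n");
        return 0;
    }

Here assert() is used so that a failing structural test stops the run at the offending case; a real unit test driver would usually also record the results, as discussed under principle #4.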