How to Improve Testing for CSE Software

Software requires regular, extensive testing to ensure that the code functions correctly. This article provides a straightforward process for adding testing to an existing software project that has no testing (or insufficient testing).

Published Aug 06, 2019

Authors Roscoe A. Bartlett, Barry Smith, Ulrike Meier Yang, Glenn Hammond, Xiaoye Li, and James Willenbring

TOPICS

Better Reliability

Testing

Track How To

Overview

Adding tests of sufficient coverage and quality improves confidence in software and makes the software easier to change and extend. Tests should be added to existing uncovered code before that code is changed, and to new code before (or while) it is written. These tests then become the foundation of a regression test suite that helps drive future development while preserving existing behavior, which improves long-term sustainability.

Target Audience

Computational Science and Engineering (CSE) software project leaders and developers who are facing significant refactoring efforts because of hardware architecture changes or increased demands for multi-physics and multi-scale coupling, and who want to increase the quality and speed of development and reduce development and maintenance costs.

Purpose

Show how to add quality testing to a project in order to support efficient modification of existing code or addition of new code. Show how to add tests to support (1) adding a new feature, (2) fixing a bug, (3) improving the design and implementation, or (4) optimizing resource usage.¹

Prerequisites

First read the document What Are Software Testing Practices? and browse through Definition and Categorization of Tests for CSE Software.

Steps

Set up automated builds of the code with high warning levels and eliminate all warnings.
Select test harness frameworks.
1. Select a system-level test harness for system-executable tests that report results appropriately (e.g., CTest/CDash).
2. Select a unit test harness to effectively define and run finer-grained integration and unit tests (e.g., Google Test, pFUnit).
3. Customize or streamline system-level and/or unit test frameworks for use in your particular project.
Add system-level tests to protect major user functionality.
1. Select inputs for several important problem classes and run code to produce outputs.
2. Set up no-change or verification tests with a system-level test harness in order to pin down important behavior.
Add integration and unit tests (as needed for adding/changing code).
1. Incorporate tests for uncovered code to be changed using the Legacy Software Change Algorithm¹
  - Identify change points for target change or new code.
  - Find test points where code behavior can be sensed.
  - Break dependencies in order to get the targeted code into the unit test harness.
  - Cover targeted code to be changed with sufficient (characterization) tests.
2. Add new features or fix bugs¹,²,³
  - Add new tests that define desired behavior (feature or bug).
  - Run new tests and verify they fail.
  - Add the minimal code to get new tests to pass.
  - Refactor the covered code to clean up and remove duplication.
  - Review all changes to existing code, new code and new tests.
Select code coverage (e.g., gcov/lcov) and memory usage error detection (e.g., valgrind, clang memory/address/leak sanitizers) analysis tools.
Define a set of regression test suites.
1. Define a faster-running pre-merge regression test suite (e.g., single build with faster running tests) and run it before every merge to the mainline branch.
2. Define a more comprehensive nightly regression test suite (e.g., builds and all tests on several platforms and compilers, code coverage, and memory usage error detection) and run every night.
Have a policy of 100% passing pre-merge regression tests (run using a CI testing system like GitHub Actions, GitLab CI, or Jenkins) and work hard to maintain that.
Work to fix all failing nightly regression tests on a reasonable schedule.
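
As a rough illustration of the no-change tests described in the steps above, the following Python sketch compares fresh output against stored reference ("gold") values within a tolerance. The `run_simulation` function, the problem names, and the tolerance are hypothetical stand-ins for a real application and its inputs. In practice the gold values would more likely live in version-controlled files, and the check would be registered with the system-level test harness.

```python
import math


def run_simulation(problem):
    """Hypothetical stand-in for running the real application on one input."""
    n = {"coarse": 10, "fine": 100}[problem]
    h = 1.0 / n
    # Midpoint-rule integral of x**2 on [0, 1]; the exact value is 1/3.
    return {"integral": sum(((i + 0.5) * h) ** 2 * h for i in range(n))}


def record_gold(problems):
    """Run each problem once and keep its output as the reference."""
    return {p: run_simulation(p) for p in problems}


def check_no_change(gold, rel_tol=1e-12):
    """Return the (problem, key) pairs whose current values differ from gold."""
    diffs = []
    for problem, expected in gold.items():
        actual = run_simulation(problem)
        for key, value in expected.items():
            if not math.isclose(actual[key], value, rel_tol=rel_tol):
                diffs.append((problem, key))
    return diffs
```

An empty list from `check_no_change` means the pinned behavior is unchanged. A verification test differs in that it compares against a known correct answer (here, 1/3) rather than against previously recorded output.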

FAQs:

Q: Why do you need both a system-level and a unit test harness?
A: A unit test harness aggregates hundreds of unit and integration tests into single executables. A system-level test harness runs these aggregate integration and unit test executables along with the other system-level acceptance and verification tests and alerts developers of any failures.
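
The layering can be sketched in Python, with the standard `unittest` module playing the role of the unit test harness and a small runner standing in for the system-level harness. This is only an illustration under assumptions: the test program, the command names, and the `run_all` function are hypothetical, and a real project would use tools such as those named in the steps above.

```python
import subprocess
import sys

# Hypothetical unit-test "executable": several fine-grained tests in one program.
UNIT_TEST_PROGRAM = """
import unittest

class TestArithmetic(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)

    def test_mul(self):
        self.assertEqual(2 * 3, 6)

unittest.main(argv=["unit_tests"])
"""

# Hypothetical system-level test list: each entry is one executable to run.
COMMANDS = {
    "unit_tests": [sys.executable, "-c", UNIT_TEST_PROGRAM],
    "system_smoke": [sys.executable, "-c", "print('solver ran')"],
}


def run_all(commands):
    """Run each command and map its name to True (exit code 0) or False."""
    results = {}
    for name, cmd in commands.items():
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[name] = proc.returncode == 0
    return results
```

The runner sees the whole unit test program as a single pass/fail entry, while the unit test harness inside that program reports which individual test failed.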

Q: Why not just add all of the tests for an existing code and get it over with?
A: Taking weeks or months (or years) to add sufficient tests for an entire existing code (that lacks sufficient testing) is not usually economical or necessary. Tests need to be added to code only when it is changed (or when adding new code). In that way tests can be added while regular development work is being done.
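
For example, when one routine is about to change, a few characterization tests can pin down its current behavior first. The sketch below uses Python's `unittest`; the `legacy_scale` routine and its odd empty-input behavior are hypothetical, chosen only to show that such tests record what the code does today rather than what it arguably should do.

```python
import unittest


def legacy_scale(values, factor):
    """Hypothetical legacy routine that is about to be modified."""
    if not values:
        return [0.0]  # surprising, but existing callers may rely on it
    return [v * factor for v in values]


class CharacterizeLegacyScale(unittest.TestCase):
    """Pin down current behavior before changing the routine."""

    def test_typical_input(self):
        self.assertEqual(legacy_scale([1.0, 2.0], 3.0), [3.0, 6.0])

    def test_empty_input_returns_single_zero(self):
        self.assertEqual(legacy_scale([], 3.0), [0.0])
```

Only the routine being changed is covered here; the rest of the code base gains tests later, as it is touched.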

Q: Why demand 100% passing pre-merge regression tests?
A: This avoids the expensive debugging and other investigation needed to determine whether your changes affected tests that were already failing (hard). If all tests pass before your changes, then any test that fails afterward can be attributed to your changes (easy).
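
The difference can be made concrete with a small Python sketch. The function names and the `{test_name: passed}` result format are assumptions for illustration, not the output format of any particular test tool.

```python
def failing(results):
    """Names of failing tests in a {test_name: passed} mapping."""
    return {name for name, passed in results.items() if not passed}


def attribute_failures(before, after):
    """Split post-change failures into clearly new ones and ambiguous ones.

    A test that fails only after the change is attributed to the change.
    A test that failed before and still fails needs manual investigation,
    because the change may or may not have added a second cause of failure.
    """
    failed_before = failing(before)
    failed_after = failing(after)
    return {
        "caused_by_change": sorted(failed_after - failed_before),
        "needs_investigation": sorted(failed_after & failed_before),
    }
```

With a fully passing baseline, the `needs_investigation` list is always empty, so every failure after a change points directly at that change.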
