What constitutes reasonable frequency and granularity for CI testing? Teams are free to define what is reasonable. This can vary amoung teams. Even within a team, different categories of work may be handled with different frequency and granularity. For example, for bug-fix work, frequency may be once at the end of each day and granularity may be completed bug fixes whereas for feature enhancements requiring many person-weeks, frequency may be once a week and granularity whatever changes are completed in a week. For high functioning CI, frequency may be many times per day and granularity may be each method/function/subroutine added or changed. To understand the ultimate aims of CI, one can imagine CI testing taken to the extreme is like auto correct in a word processor where programmers would get immediate feedback regarding test status with literally each key-stroke they enter.
CI testing may have a number of implications for both software and test development. First, it can demand a very high level of automation.
• Commits on multiple branches can be automatically merged.
• Tests can be automatically triggered as commits are made.
• Relevant tests can be automatically determined for a given commit.
• The cause of test failures can be automatically attributed to a given developer's commit(s).
Next, CI testing can often require that tests be designed with several important attributes
• Tests know which code blocks they depend on.
• Different tests do not depend on each other.
• Incremental re-test of only failed tests is supported.
• Multiple tests can be executed in parallel.
• Tests are designed and compute resources are such that tests complete with immediacy.
High frequency and/or fine granularity of CI testing can also mean that software changes are required to follow incrimental stages of development. It means that new tests are added and obsoleted tests updated or removed, almost as often as the code that is being tested. Finally, it may require that test coverage be high enough that relatively small, isolated code changes anywhere in the code wind up being tested. In particlar, CI testing typically means that developers are prevented from working on separate branches of development for extended periods without being required to merge other's work with their own on a routine basis. Indeed, this is one of the aims of CI testing; to prevent any one branch of development to fall too far out of sync with any other.
Compute resources necessary to complete tests must be sufficient that CI testing feedbacks to developers are immediate or nearly so. Typically, for scientific software, this means that only certain types of tests are appropriate. For example, a test that checks convergence of a numerical algorithm at large scale is likely to require too much time (or too large of a compute resource) to be appropriate for CI testing.