Training Resources for Software Productivity and Sustainability

Information on effective practices for scientific software can help individuals and groups to improve software quality while decreasing the effort, time, and cost to develop, deploy, maintain, and extend software over its intended lifetime. Improving software practices thus promotes more effective CSE collaboration through software and helps to increase overall scientific productivity.

Prerequisites

What is CSE Software Refactoring?

What is CSE Software Testing?

How to Improve Testing for CSE Software

What is Reproducibility?

What is Continuous Integration Testing?

What is Good Documentation for CSE Software?

How to Write Good Documentation for CSE Software

Published June 20, 2017

A variety of tutorials, webinars, and training sessions provide resources for improving various aspects of software productivity.

What is CSE Software Refactoring?

Software refactoring is the process of restructuring source code to improve non-functional or quality attributes of the software, such as maintainability, readability, and extensibility, or to reduce its complexity. By definition, refactoring does not change any of the software product's external functionality. Refactoring is a way of improving developer productivity.

Ideally, refactoring should not proceed without first having a collection of unit tests that sufficiently covers the code to be refactored. In practice, however, in high-functioning development teams some common forms of refactoring occur routinely and organically as part of the development process, to avoid duplicating closely related functionality and to reuse pieces of source code.

It should be noted that in all but the most trivial situations, the common practice of cut-and-paste-and-adjust programming, although often the most expedient, inevitably creates a refactoring burden rather than solving one. Later, another developer maintaining such code will have to do the software engineering work to collapse all the cut-and-paste instances into a single implementation that can be reused where necessary, as in the sketch below.
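
For illustration, here is a hypothetical Python sketch (the function names and formulas are invented for this example): two near-duplicate accumulation loops left behind by cut-and-paste-and-adjust programming are collapsed into a single helper, while the external behavior, which the existing unit tests should pin down, stays the same.

    # Before refactoring: two cut-and-paste variants of the same loop.
    def total_kinetic_energy(masses, velocities):
        total = 0.0
        for m, v in zip(masses, velocities):
            total += 0.5 * m * v * v
        return total

    def total_momentum(masses, velocities):
        total = 0.0
        for m, v in zip(masses, velocities):
            total += m * v
        return total

    # After refactoring: the duplicated loop lives in one helper, and the
    # public functions keep their names and results (external behavior is
    # unchanged, so the existing unit tests should still pass).
    def _accumulate(masses, velocities, term):
        return sum(term(m, v) for m, v in zip(masses, velocities))

    def total_kinetic_energy(masses, velocities):
        return _accumulate(masses, velocities, lambda m, v: 0.5 * m * v * v)

    def total_momentum(masses, velocities):
        return _accumulate(masses, velocities, lambda m, v: m * v)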

Software development practices in academia: a case study comparison

D. Groen, X. Guo, J.A. Grogan, U.D. Schiller, J.M. Osborne, arXiv:1506.05272

Academic software development practices often differ from those of commercial development settings, yet only limited research has been conducted on assessing software development practices in academia. In this paper the authors present a case study of software development practices in four open-source scientific codes over a period of nine years, characterizing the evolution of their respective development teams, their scientific productivity, and the adoption (or discontinuation) of specific software engineering practices as the team size changes. The authors show that the transient nature of the development team results in the adoption of different development strategies. They relate measures of publication output to accumulated numbers of developers and find that for the projects considered the time-scale for returns on expended development effort is approximately three years.

This case study is highly relevant to better scientific software development, as it shows the long-term impact of implementing particular software engineering policies and practices.

Tutorial: CSE Collaboration through Software: Improving Productivity and Sustainability

Webinar Series: Best Practices for HPC Software Developers

Venue: Joint activity by ALCF, NERSC, OLCF, and the IDEAS Software Productivity Project

Dates: May–June 2016

Overview: This series of webinars presents best practices that help users of HPC systems carry out their software development more productively. This series is designed for HPC software developers who are seeking help in increasing their team’s productivity, as well as facility staff who interact extensively with users.

Resources: slides and video for all presentations: https://www.olcf.ornl.gov/training-event/webinar-series-best-practices-for-hpc-software-developers/

Topics:

  • What All Codes Should Do: Overview of Best Practices in HPC Software Development - Anshu Dubey (ANL)

Abstract: Scientific code developers have increasingly been adopting software processes derived from the mainstream (non-scientific) community. Software practices are typically adopted when continuing without them becomes impractical. However, many software best practices need modification and/or customization, partly because the codes are used for research and exploration, and partly because of the combined funding and sociological challenges. This presentation will describe the lifecycle of scientific software and important ways in which it differs from other software development. We will provide a compilation of software engineering best practices that have generally been found to be useful by science communities, and we will provide guidelines for adoption of practices based on the size and the scope of the project.

  • Developing, Configuring, Building, and Deploying HPC Software - Barry Smith (ANL)

Abstract: The process of developing HPC software requires consideration of issues in software design as well as practices that support the collaborative writing of well-structured code that is easy to maintain, extend, and support. This presentation will provide an overview of development environments and how to configure, build, and deploy HPC software using some of the tools that are frequently used in the community. We will also discuss ways in which these and other tools are best utilized by various categories of scientific software developers, ranging from small teams (for example, a faculty member and graduate students who are writing research code intended primarily for their own use) through moderate/large teams (for example, collaborating developers spread among multiple institutions who are writing publicly distributable code intended for use by others in the community).

  • Distributed Version Control and Continuous Integration Testing - Jeff Johnson (LBNL)

Abstract: Recently, many tools and workflows have emerged in the software industry that have greatly enhanced the productivity of development teams. GitHub, a site that hosts projects in Git repositories, is a popular platform for open-source and closed-source projects. GitHub has encoded several best practices into easily followed procedures such as pull requests, which enrich the software engineering vocabularies of non-professionals and professionals alike. GitHub also integrates with other services (for example, continuous integration services such as Travis CI, which allow code changes to be tested automatically before they are merged into a master development branch). This presentation will discuss how to set up a project on GitHub, illustrate the use of pull requests to incorporate code changes, and show how Travis CI can be used to boost confidence that changes will not break existing code, as sketched below.
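
As a sketch of the kind of configuration involved, a minimal Travis CI setup for a Python project might look like the following .travis.yml (the contents are illustrative; the requirements file and tests directory are assumptions about the project layout):

    language: python
    python:
      - "3.8"
    install:
      - pip install -r requirements.txt   # install project dependencies (assumed file)
      - pip install pytest
    script:
      - pytest tests/                      # run the test suite on every push and pull request

With a file like this in the repository root, Travis CI runs the test suite for each pull request, so reviewers can see whether a proposed change breaks existing tests before merging it.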

  • Testing and Documenting Your Code - Alicia Klinvex (SNL)

Abstract: Software verification and validation are needed for high-quality and reliable scientific codes. For software with moderate to long lifecycles, a strong automated testing regime is indispensable for continued reliability. Similarly, comprehensive and comprehensible documentation is vital for code maintenance and extensibility. This presentation will provide guidelines on testing and documentation that can help to ensure high-quality and long-lived HPC software. We will present methodologies, with examples, for developing tests and adopting regular automated testing. We will also provide guidelines for minimum, adequate, and good documentation practices depending on the available resources of the development team.
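
As a minimal sketch of an automated test in this spirit, the hypothetical pytest example below checks a small numerical routine against a known analytic result (the routine, file name, and tolerance are invented for illustration):

    # test_trapezoid.py
    import math
    import pytest

    def trapezoid(f, a, b, n):
        """Composite trapezoid-rule approximation of the integral of f on [a, b]."""
        h = (b - a) / n
        s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
        return h * s

    def test_trapezoid_sine():
        # The integral of sin(x) on [0, pi] is exactly 2; with 1000 panels
        # the trapezoid rule should agree to within ~1e-5.
        assert trapezoid(math.sin, 0.0, math.pi, 1000) == pytest.approx(2.0, abs=1e-5)

Tests like this, run automatically on every change, are the backbone of the regular automated testing regime the presentation describes.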

  • How the HPC Environment is Different from the Desktop (and Why) - Katherine Riley (ALCF)

Abstract: High-performance computing has transformed how science and engineering research is conducted. Answering in 30 minutes a question that used to take 6 months can quickly change the way one asks questions. Large computing facilities provide access to some of the world’s largest computing, data, and network resources. Indeed, the DOE complex has the highest concentration of supercomputing capability in the world. However, by their very nature, making use of the largest computers in the world is a challenging and unique task. This talk will discuss how supercomputers are unique and explain how that impacts their use.

  • An Introduction to High-Performance Parallel I/O - Feiyi Wang (ORNL)

Abstract: Parallel data management is a complex problem in large-scale HPC environments. The HPC I/O stack can be viewed as a multi-layered cake that presents a high-level abstraction to scientists. While this abstraction shields users from many of the I/O system’s details, it is very hard to obtain good parallel I/O performance or functionality without understanding the end-to-end hierarchical I/O stack of today’s complex HPC environments. This talk will introduce basic parallel I/O concepts and will provide guidelines for obtaining better I/O performance on large-scale parallel platforms.
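
To make these concepts concrete, here is a small hypothetical sketch of a collective MPI-IO write using mpi4py (the file name and block size are arbitrary choices for the example):

    # write_demo.py: each rank writes one contiguous block at a rank-dependent offset.
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    amode = MPI.MODE_WRONLY | MPI.MODE_CREATE
    fh = MPI.File.Open(comm, "output.dat", amode)

    data = np.full(10, comm.rank, dtype="i")   # this rank's block of data
    offset = comm.rank * data.nbytes           # non-overlapping file offsets
    fh.Write_at_all(offset, data)              # collective write: all ranks participate
    fh.Close()

Run with, for example, mpiexec -n 4 python write_demo.py. Collective calls such as Write_at_all let the MPI-IO layer coordinate and aggregate requests across ranks, which is often a first step toward better performance than many independent small writes.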

  • Basic Performance Analysis and Optimization: An Ant Farm Approach - Jack Deslippe (NERSC)

Abstract: How is optimizing HPC applications like an Ant Farm? Attend this presentation to find out. We’ll discuss the basic concepts around optimizing code for the HPC systems of today and tomorrow. These systems require codes to effectively exploit both parallelism between nodes and an ever-growing amount of parallelism on-node. We’ll discuss profiling strategies, tools (for profiling and debugging), and common issues with both internode communication and on-node parallelism. We will give an overview of traditional optimization areas in HPC applications, such as parallel I/O and MPI strong and weak scaling, as well as topics relevant to modern GPU and many-core systems, such as threading, SIMD/AVX, SIMT, and effective use of cache and memory hierarchies. The “Ant Farm” approach places a heavy emphasis on the roofline performance model, encouraging users to understand the compute, bandwidth, and latency sensitivity of their applications and kernels through a series of easy-to-perform experiments and an easy-to-follow flow chart. Finally, we’ll discuss what we expect to change in the optimization process as we move toward exascale computers.
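
To make the roofline idea concrete, here is a back-of-the-envelope sketch in Python (the peak compute and bandwidth figures are invented machine numbers, not measurements): attainable performance is the minimum of the peak compute rate and memory bandwidth times arithmetic intensity.

    # Roofline bound: min(peak FLOP/s, bandwidth * arithmetic intensity).
    PEAK_FLOPS = 2.0e12   # peak double-precision FLOP/s (assumed)
    PEAK_BW = 1.0e11      # peak memory bandwidth in bytes/s (assumed)

    def roofline_bound(arithmetic_intensity):
        """Attainable FLOP/s for a kernel with the given flops-per-byte ratio."""
        return min(PEAK_FLOPS, PEAK_BW * arithmetic_intensity)

    # A daxpy-like kernel (y = a*x + y) does ~2 flops per 24 bytes moved:
    ai_daxpy = 2.0 / 24.0
    print(f"daxpy bound: {roofline_bound(ai_daxpy):.2e} FLOP/s")  # bandwidth-bound

A kernel that falls under the bandwidth slope, like this one, benefits most from reducing data movement (for example, cache blocking or loop fusion), while a compute-bound kernel benefits most from vectorization (SIMD/AVX) and a better instruction mix.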