The SciCodes consortium was formed to help discipline- and institutionally-based software registries and repositories share work methods and develop standards. They welcome involvement from both developers of research software and managers of repositories and registries.
Originally posted on the Software Sustainability Institute blog.
Scientific disciplines that rely on computational methods often have a resource, a code registry or repository, that serves as a library for the discipline and collects the software itself and/or metadata about the software. SciCodes, formed in 2021, is a consortium of academic discipline and institutional software registries and repositories. Among its goals are sharing work methods and creating a virtual registry standard to enable searching across multiple software registries.
Software in science
If you work in geodynamics, astronomy, biostatistics, or any other field that relies on scientific research software, you are likely familiar with at least one of the Astrophysics Source Code Library (ASCL), Computational Infrastructure for Geodynamics (CIG), bio.tools, or Zenodo. Software registries and repositories such as these, along with CaltechDATA, CoMSES, DOECODE, and others, improve research by making codes discoverable, which supports transparency and reproducibility, and by promoting reuse of software, which can make research more efficient. These services also actively promote formal software citation in research articles.
Several years ago, managers and editors of these and other similar resources got together to share and discuss their practices, and to develop a list of best practices for software registries and repositories. We met virtually for about a year, and then held a workshop to refine our ideas. At the conclusion of that project, the group decided to continue to meet and formed the SciCodes consortium.
One of the goals of the consortium is to make it possible to search for code across multiple software registries. Software developed for one discipline may also be useful in another. For example, WND-CHARM, written originally for use in biological imaging, has also proven useful in galaxy morphology research. Wouldn’t it be great if there were a way for you to query multiple research software resources to find code that solves a computational problem you have? We think so! And we are working toward a way to do this!
One of the first efforts of the group is to render our own holdings – the metadata in each software library – in the CodeMeta format (codemeta.json). This translates (or “crosswalks”) the information in each of our resources, which use different schemas, to one standard schema. Having the metadata from these various resources in one standard schema will allow us to build a search tool that can then search all of this metadata, enabling you to find the code that you need, regardless of which discipline it originated in.
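To make the crosswalk idea concrete, here is a minimal sketch in Python. It is an illustration only, not the consortium's actual tooling: the two registries, their field names, the sample records, and the helper functions `to_codemeta` and `search` are all hypothetical. Only the CodeMeta terms (`name`, `description`, `codeRepository`, `keywords`) and the context URL come from the CodeMeta 2.0 schema.

```python
# Hypothetical crosswalk sketch: two registries with different schemas
# are mapped to CodeMeta terms, then searched together.

CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"

# Assumed field mappings (registry field -> CodeMeta term); real
# registries define their own crosswalks.
CROSSWALKS = {
    "registry_a": {
        "title": "name",
        "abstract": "description",
        "site": "codeRepository",
        "tags": "keywords",
    },
    "registry_b": {
        "software_name": "name",
        "summary": "description",
        "repo_url": "codeRepository",
        "subjects": "keywords",
    },
}


def to_codemeta(record, registry):
    """Translate one registry record into a CodeMeta-style dictionary."""
    doc = {"@context": CODEMETA_CONTEXT, "@type": "SoftwareSourceCode"}
    for source_field, codemeta_term in CROSSWALKS[registry].items():
        if source_field in record:
            doc[codemeta_term] = record[source_field]
    return doc


def search(docs, term):
    """Return documents whose name, description, or keywords mention term."""
    term = term.lower()
    hits = []
    for doc in docs:
        text = " ".join(
            [doc.get("name", ""), doc.get("description", ""), *doc.get("keywords", [])]
        )
        if term in text.lower():
            hits.append(doc)
    return hits


# Made-up sample records, one per hypothetical registry.
record_a = {
    "title": "ImageSorter",
    "abstract": "Classifies images by morphology",
    "site": "https://example.org/imagesorter",
    "tags": ["image analysis", "morphology"],
}
record_b = {
    "software_name": "MantleFlow",
    "summary": "Simulates mantle convection",
    "repo_url": "https://example.org/mantleflow",
    "subjects": ["geodynamics"],
}

docs = [to_codemeta(record_a, "registry_a"), to_codemeta(record_b, "registry_b")]
matches = search(docs, "morphology")
```

Because both records end up with the same field names, one `search` function can cover them without knowing which registry each record came from. That is the property a shared schema is meant to provide.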
The SciCodes consortium also works to improve software citation and findability, strengthen our individual resources by adopting and adapting the best practices we identified, and share advances and information through presentations at our monthly meetings. Because the consortium’s members are spread out over many time zones, the group holds two meetings, seven hours apart, on the same day each month. Meetings include discussions on best practices and presentations from group members. The consortium is currently led by Hervé Ménager and Tom Morrell, who were elected in late 2021 to overlapping terms.
Would you like to learn more about our activities? View our outreach materials, including our video describing SciCodes and how it may help one particular field; the video is a good quick introduction to the consortium. We also record the presentations given at our meetings; these are available online and cover topics such as the Zenodo/InvenioRDM Codemeta Integration, Research software review as part of the publication process, and how to Archive and promote open source code with HAL and Software Heritage.
Are you writing software for research? If so, please consider submitting it for inclusion in a suitable registry or repository. You can also make your software easier to cite by listing your preferred citation on your code’s download site, preferably in a standard format such as codemeta.json or CITATION.cff.
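As a rough illustration of what such a file might contain, the Python sketch below builds a small codemeta.json document. Everything project-specific in it is a placeholder: the software name, author, URLs, and DOI are hypothetical, and `render_codemeta` is a helper invented for this example. The property names are CodeMeta 2.0 terms, but check the CodeMeta documentation for the full set your project needs.

```python
import json

# Placeholder metadata for a hypothetical project; replace every value
# with your own before publishing.
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "ExampleSolver",
    "version": "1.0.0",
    "codeRepository": "https://example.org/examplesolver",
    "license": "https://spdx.org/licenses/MIT",
    "author": [
        {"@type": "Person", "givenName": "Ada", "familyName": "Example"}
    ],
    # The article you would like people to cite when they use the code.
    "referencePublication": {
        "@type": "ScholarlyArticle",
        "name": "ExampleSolver: a placeholder paper title",
        "identifier": "https://doi.org/10.0000/example",
    },
}


def render_codemeta(metadata):
    """Serialize the metadata as indented JSON, ready to save as codemeta.json."""
    return json.dumps(metadata, indent=2, ensure_ascii=False) + "\n"


codemeta_text = render_codemeta(codemeta)
```

Saving `codemeta_text` as `codemeta.json` in the top level of your code repository puts the preferred citation next to the code itself.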
If your discipline or institution has a software repository or registry that is not currently represented in SciCodes, please consider sharing the best practices for software registries and repositories with it, and let us know about the resource by emailing email@example.com so we can consider it for membership.
Hervé Ménager is a Research Engineer at the Institut Pasteur, where he initially joined the Software and Databases group. He is currently co-head of the Bioinformatics and Biostatistics Hub of the Institut Pasteur. He is also involved in multiple software and infrastructure projects and organizations, such as ELIXIR Europe, the EDAM ontology, the SciCodes Consortium, the Galaxy framework, and the Common Workflow Language standard.
Tom Morrell is the Research Data Specialist at Caltech Library. He is responsible for managing the CaltechDATA institutional data and software repository and helping researchers effectively store and share their data and software. Tom also contributes to the FORCE11 Software Citation Implementation Working Group, SciCodes Consortium, and InvenioRDM repository development.
Alice Allen is the Editor of the Astrophysics Source Code Library (ASCL), which works to improve the transparency and reproducibility of astronomy research by making the computational methods used in this research more discoverable. She is a member of the FORCE11 Software Citation Implementation Working Group, the SciCodes Consortium, and the Astronomy Picture of the Day Evaluation and Advisory Committee.