There are several environments that support collecting data, software, and publication content in order to support generating a publication, reproducing its results, and exploring its parameter space.
Data collection is an important step in every field of research and, in particular, for the published scientific articles based on that research. It usually precedes the research itself, so that the questions posed can be answered. A critical objective of data collection is to ensure that the information gathered is reliable, so that data-driven decisions can be made from further analysis, study, and research. Together, data collection and analysis allow researchers to stay on top of trends, provide answers to problems, and explore new insights.
Research papers often summarize large amounts of data in a form that can be easily read and understood. When writing and preparing to publish a research paper, it is important that the data be analyzed and presented clearly and in a visually appealing way. The software products described in this article are representative of the tools used to collect, analyze, and publish research outcomes, and sound data collection methods improve the accuracy and validity of publications. Environments that support collecting data, software, and publication content include OCCAM, Code Ocean, Popper, NVivo, and Shablona. As the need grows, more such environments are being developed.
|Websites||OCCAM Source Code, OCCAM Demo Site|
|Focus||Rebuilding software, workflows|
OCCAM (Open Curation of Computation And Metadata) is a project that serves as the catalyst for the tools, education, and community-building needed to bring openness, accountability, comparability, and repeatability to software distribution and scientific digital exploration. It is a toolset for preserving and rebuilding software and a self-hostable, federated web portal for composing repeatable workflows.
The project is built on three pillars: Infrastructure, Education, and Community. Its developers claim that OCCAM "provides a strong infrastructure supported by the community to educate the current and next generations on how to construct and distribute better software, scientific artifacts, and research".
|Resource Name||Code Ocean|
Code Ocean is a centralized platform for the creation, sharing, publication, preservation, and reuse of executable code and data. It is advertised as ‘One place for an integrated computational research experience.’ Some of the advertised features include:
- Out-of-the-box popular computational tools
- Easy access to any computing resource and data storage
- Integrated collaboration and access control
- Centralized repository to keep projects and results organized
Code Ocean is a research platform on which users can share and run code in the cloud. Researchers can analyze, organize, and execute their work and publish it to repositories and journals. With a recently added feature, authors can upload code referenced in their articles to Code Ocean for free, and IEEE Xplore users can run or modify that code.
|Websites||Popper, Popper GitHub Repo, Popper 'Read the Docs' File|
Popper is a more specialized and restrictive tool for defining and executing container-native testing workflows in Docker. With Popper, you define a workflow in a YAML file and then execute it with a single command. It is a container-native task automation engine that runs on a variety of container engines, orchestration frameworks, and CI services.
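The define-then-run pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the step-based YAML layout and the `popper run -f` invocation follow the form shown in Popper's documentation and should be checked against the installed version, while the file name `wf.yml`, the step id `hello`, the `alpine:3.12` image, and the helper functions `write_workflow` and `popper_command` are hypothetical names chosen for this example. The script writes the workflow file and builds the command, but does not execute it.

```python
from pathlib import Path
from typing import List
import shutil
import tempfile

# A minimal workflow in the step-based YAML layout used in Popper's docs.
# The step id, image tag, and echoed message are illustrative placeholders.
WORKFLOW_YAML = """\
steps:
- id: hello
  uses: docker://alpine:3.12
  args: ["echo", "hello from a container"]
"""


def write_workflow(directory: Path, name: str = "wf.yml") -> Path:
    """Write the example workflow into `directory` and return its path."""
    path = directory / name
    path.write_text(WORKFLOW_YAML, encoding="utf-8")
    return path


def popper_command(workflow: Path) -> List[str]:
    """Build (but do not run) the single command that executes the workflow."""
    return ["popper", "run", "-f", str(workflow)]


if __name__ == "__main__":
    wf = write_workflow(Path(tempfile.mkdtemp()))
    print("Workflow written to", wf)
    print("Run it with:", " ".join(popper_command(wf)))
    # Only report availability; running the workflow also needs a container engine.
    if shutil.which("popper") is None:
        print("The popper CLI was not found on PATH, so nothing was executed.")
```

Keeping the workflow in a plain text file alongside the code is what lets the same steps be re-run on another machine or in a CI service without changes.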
Users can find information about the tool on its main website, in its GitHub repository, and in the ‘Popper Read the Docs’ documentation. The Read the Docs guide explains how to install the product and how to create, run, and debug a workflow. The section on ‘Life of a Workflow’ describes what Popper does when it executes a workflow and shows sample workflows to get you started. The guide assumes familiarity with Linux containers and the container-native paradigm of software development.
|Website||NVivo Qualitative Data Analysis Software|
|Focus||Data analysis, publishing|
NVivo is advertised as an excellent tool for academia and other institutions. It is most commonly used by professors looking to publish papers and by researchers applying for grants, and it is popular software for conducting qualitative and mixed-methods analysis, preparing for publication, and highlighting accomplishments. Researchers who have used this product say, “NVivo lets you import and work with research data from virtually any source, including surveys, interviews, articles, video, email, social media and web content, rich or plain text, PDF, audio, digital photos, spreadsheets and notes from integrated third-party applications.”
Shablona caters to a very particular group, but it has been found to be highly useful to researchers working on small scientific Python projects. It is a template developed by the University of Washington’s eScience Institute, and it is touted as a popular, general-purpose starting point for small Python projects.