Open Science
Last updated on 2025-05-22
Overview
Questions
- What is open science?
- How is open science valuable?
- How can version control help me make my work more open?
Objectives
- Define open science and be able to list attributes or processes that make a research project open.
- Explain why open science is valuable.
- Explain how a version control system can be leveraged as an electronic lab notebook for computational work.
In 2023, the U.S. government declared a Year of Open Science and defined open science for federal agencies:
“Open Science is the principle and practice of making research products and processes available to all, while respecting diverse cultures, maintaining security and privacy, and fostering collaborations, reproducibility, and equity.”
But what does this mean in practice? NASA is one agency leading the way in developing a culture of open science with its Transform to Open Science (TOPS) program, which includes the publication of an Open Science 101 curriculum. Here at USGS, we can practice open science by releasing scientific code with Git version control via a USGS software information product.
Check Out How USGS Celebrated The Year Of Open Science!
Check out the USGS Year of Open Science webpage to learn about the Community for Data Integration’s (CDI) ‘Open Data for Open Science’ workshop and other USGS open science stories.
Let us take a step back. How is open science valuable and how does publishing your code make your research more open?
Making Code Citable
All USGS software information products are citable with a unique Digital Object Identifier (DOI). You will learn how to create the citation and DOI in the later episode on Citation.
Unless your methods are restricted to a single mathematical operation, it is very difficult to make your research fully reproducible without the code used to analyze and generate results. Sharing the analysis code can significantly increase the reproducibility of published papers (Ince et al. 2012, Laurinavichyute et al. 2022). Additionally, open science practices can lead to more citations, potential collaborators, and funding opportunities (McKiernan et al. 2016). This open model accelerates discovery: the more open work is, the more widely it is cited and re-used (Piwowar et al. 2007).
Researchers are also exploring how the FAIR (Findable, Accessible, Interoperable, and Reusable) data standards can apply to research software. Check out the FAIR Principles for Research Software to learn more.
Are you worried that your code is too messy to share? Fear not: here is an open letter from a professional software engineer telling you that it is good enough. In fact, “if your code is good enough to do the job, then it is good enough to release”.
Is My Work Reproducible?
When analysis is conducted using scientific code, domain and code reviews can help to determine whether the results are reproducible, which in turn supports their accuracy and validity. You will learn more about these types of reviews in a later episode.
However, researchers who want to work openly may have questions about how to approach publishing their code. This is one of the (many) reasons we teach version control. When used diligently, version control with Git acts as a shareable electronic lab notebook for computational work:
- The conceptual stages of your work are documented, including who did what and when. Every step is stamped with an identifier (the commit ID) that is for most intents and purposes unique.
- You can tie documentation of rationale, ideas, and other intellectual work directly to the changes that spring from them.
- You can refer to what you used in your research to obtain your computational results in a way that is unique and recoverable.
- With a version control system such as Git, the entire history of the repository is easy to archive in perpetuity.
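To see why a commit ID works as a lab-notebook stamp, it can help to look at how Git derives it. The sketch below is a simplified illustration, not Git's own code: it assumes the default SHA-1 object format (newer repositories can opt into SHA-256), and the author name, timestamp, and commit messages are hypothetical. Git hashes a short header plus the object's content, so any change to the content, author, date, or message yields a different ID.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Hash an object the way Git does by default: SHA-1 over
    '<type> <size>' + NUL byte + content."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree_id: str, author: str, timestamp: int, message: str) -> bytes:
    """Build the text of a simple commit with no parent (simplified)."""
    lines = [
        f"tree {tree_id}",
        f"author {author} {timestamp} +0000",
        f"committer {author} {timestamp} +0000",
        "",
        message,
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


# Hypothetical values, used for illustration only.
author = "Ada Example <ada@example.org>"
timestamp = 1700000000
empty_tree_id = git_object_id("tree", b"")  # a commit that tracks no files

first_id = git_object_id(
    "commit", commit_body(empty_tree_id, author, timestamp, "Add analysis script")
)
second_id = git_object_id(
    "commit", commit_body(empty_tree_id, author, timestamp, "Add analysis scripts")
)

print(first_id)
print(second_id)
print("IDs differ:", first_id != second_id)
```

Changing a single character in the message produces an entirely different 40-character ID. This is why citing a commit ID in a paper or data release points to exactly one recoverable state of your code.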
Challenge: Is There An Advantage To Publishing Scientific Code Using Version Control Software?
Publishing your scientific code as a Git repository is more open or valuable than publishing it as part of a data release. TRUE or FALSE?
True. The advantages of publishing your scripts in a Git repository include:
- Publishing the history of changes. This keeps a record of what methods were explored, prior versions and approaches, and what did not work well.
- Keeping track of who authored what. Tracking helps authors receive credit for the work accomplished.
- Providing an easy way to correct errors or make updates as new information becomes available.
- Simplifying how others can access and use your code. Anyone can clone your repository and immediately start using your code.
Key Points
- Open scientific work is more useful and more highly cited than closed work
- Publishing code is a critical part of making science reproducible
- If your code is good enough to produce scientific results, then it is good enough to publish