Automated Version Control
- Version control is like an unlimited ‘undo’.
- Version control also allows many people to work in parallel.
Setting Up Git
- Use git config with the --global option to configure a user name, email address, editor, and other preferences once per machine.
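
As a minimal sketch of that one-time setup, the commands below store an identity and an editor in the per-user configuration. The name, email address, and editor are placeholders chosen for illustration; substitute your own.

```shell
# Placeholder identity and editor; replace with your own values.
git config --global user.name "Example User"
git config --global user.email "user@example.org"
git config --global core.editor "nano -w"

# List the per-user settings to confirm they were saved.
git config --global --list
```

Because the settings are global, they apply to every repository you work in on that machine.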
Creating a Repository
- git init initializes a repository.
- Git stores all of its repository data in the .git directory.
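
The two points above can be tried in a scratch directory. This is only a sketch; the project name demo-project is hypothetical.

```shell
# Work in a scratch directory so nothing existing is touched.
cd "$(mktemp -d)"
mkdir demo-project   # hypothetical project name
cd demo-project

# Turn the directory into a Git repository.
git init --quiet

# The repository data lives in the hidden .git directory.
ls -a
```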
Tracking Changes
- git status shows the status of a repository.
- Files can be stored in a project’s working directory (which users see), the staging area (where the next commit is being built up), and the local repository (where commits are permanently recorded).
- git add puts files in the staging area.
- git commit saves the staged content as a new commit in the local repository.
- Write a commit message that accurately describes your changes.
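
A short sketch of the cycle described above, following one file from the working directory through the staging area into the local repository. The file name, commit message, and identity are assumptions made for the example; the identity is set only inside the scratch repository.

```shell
cd "$(mktemp -d)"
git init --quiet
# Repository-local identity so the commit works without any global settings.
git config user.name "Example User"
git config user.email "user@example.org"

echo "First draft of the notes" > notes.txt   # hypothetical file
git status --short    # working directory only: notes.txt is untracked
git add notes.txt
git status --short    # notes.txt is now in the staging area
git commit --quiet -m "Add first draft of project notes"
git status --short    # nothing to report: the commit is recorded
git log --oneline
```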
Exploring History
- git diff displays differences between commits.
- git checkout recovers old versions of files.
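
One way to see both commands at work is sketched below: two commits are made to a hypothetical notes.txt, compared, and then the older content is restored. The file contents and commit messages are illustrative.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.org"

echo "first version" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
echo "second version, revised" > notes.txt
git commit --quiet -am "Revise notes"

# Compare the previous commit with the current one.
git diff HEAD~1 HEAD -- notes.txt

# Bring back the file as it was one commit ago.
git checkout HEAD~1 -- notes.txt
cat notes.txt
```

The restored file is placed in the working directory and the staging area; history is unchanged until a new commit is made.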
Ignoring Things
- The .gitignore file tells Git what files to ignore.
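
As an illustration, the sketch below ignores temporary files and a results directory. The file names and patterns are assumptions for the example, not a recommended list.

```shell
cd "$(mktemp -d)"
git init --quiet
mkdir results
touch analysis.py scratch.tmp results/run1.csv   # hypothetical files

# Ignore temporary files and everything under results/.
printf '%s\n' '*.tmp' 'results/' > .gitignore

# Only .gitignore and analysis.py are reported as untracked.
git status --short

# Ask Git which rule causes each path to be ignored.
git check-ignore -v scratch.tmp results/run1.csv
```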
Remotes in GitLab
- A local Git repository can be connected to one or more remote repositories.
- Use the HTTPS protocol to connect to remote repositories.
- git push copies changes from a local repository to a remote repository.
- git pull copies changes from a remote repository to a local repository.
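
The sketch below shows the push and pull round trip. To keep it runnable offline, a local bare repository stands in for a GitLab project; with a real project, the remote address would be the HTTPS URL copied from the project page. All names and paths are hypothetical.

```shell
work="$(mktemp -d)"
# Stand-in for a GitLab project (assumption made for this example).
git init --quiet --bare "$work/remote.git"

git init --quiet "$work/local"
cd "$work/local"
git config user.name "Example User"
git config user.email "user@example.org"
echo "hello" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M main

# Connect the local repository to the remote and check the link.
git remote add origin "$work/remote.git"
git remote -v

git push --quiet -u origin main   # local -> remote
git pull --quiet origin main      # remote -> local (nothing new in this case)
```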
Branching and Merging
- A branching workflow enables you to keep your main repository clean and allows for mistakes, fixes, and reviews before content is merged into main.
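
A minimal local sketch of that workflow follows. The branch name extend-report and the file are hypothetical, and on GitLab the final step would usually be a reviewed merge request rather than a local merge.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.org"
echo "stable text" > report.md
git add report.md
git commit --quiet -m "Add report"
git branch -M main

# Do the new work on a separate branch (hypothetical name).
git checkout --quiet -b extend-report
echo "reviewed addition" >> report.md
git commit --quiet -am "Extend report"

# main is untouched until the branch is merged.
git checkout --quiet main
git merge --quiet extend-report
git log --oneline
```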
Collaborating
- git clone copies a remote repository to create a local repository with a remote called origin automatically set up.
- Branches are an important part of collaborating with others in Git repositories.
- Ensure that you establish a collaborative workflow for your project team to use.
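
To see the origin remote being set up by a clone, the sketch below builds a small repository, makes a bare copy that plays the role of the shared remote, and clones it as a collaborator would. The paths and names are assumptions; a real clone would use the project's URL.

```shell
work="$(mktemp -d)"
# Build a small repository and a bare copy that acts as the shared remote.
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "Example User"
git config user.email "user@example.org"
echo "shared project" > README.md
git add README.md
git commit --quiet -m "Add README"
git clone --quiet --bare "$work/seed" "$work/shared.git"

# A collaborator clones the shared repository.
git clone --quiet "$work/shared.git" "$work/collaborator"
cd "$work/collaborator"
git remote -v   # origin already points at the shared repository
```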
Conflicts
- Conflicts occur when two or more people change the same lines of the same file.
- The version control system does not allow people to overwrite each other’s changes blindly, but highlights conflicts so that they can be resolved.
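
The sketch below provokes a conflict on purpose and then resolves it. Two branches change the same line of a hypothetical data.txt; the branch names and file contents are invented for the example.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.org"
echo "original line" > data.txt
git add data.txt
git commit --quiet -m "Add data"
git branch -M main

# Two lines of work change the same line in different ways.
git checkout --quiet -b alice-edit
echo "Alice's version" > data.txt
git commit --quiet -am "Alice edits data"

git checkout --quiet main
echo "Bob's version of the line" > data.txt
git commit --quiet -am "Bob edits data"

# Git does not pick a winner; it stops and marks the conflict in the file.
git merge alice-edit || echo "merge stopped: conflict in data.txt"
cat data.txt   # both versions appear between conflict markers

# Resolve by editing the file, then stage and commit the result.
echo "Agreed version" > data.txt
git add data.txt
git commit --quiet -m "Merge alice-edit, resolving conflict in data.txt"
```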
Open Science
- Open scientific work is more useful and more highly cited than closed work.
- Publishing code is a critical part of making science reproducible.
- If your code is good enough to produce scientific results, then it is good enough to publish.
Policy
Software may be publicly accessible as an open source software project and/or as an official USGS software information product.
While both a project and product may be public, only the official USGS software information product is citable by other publications.
Governing policies cascade from Federal to local levels. Check with your supervisor to ensure compliance with all local policies.
Licensing
- A LICENSE file is often used in a repository to indicate how the contents of the repo may be used by others.
- USGS software products require a LICENSE.md file in the project root of your repository.
- Non-derivative USGS software products can use the CC0 1.0 license.
- If you need a different license, consult the solicitor’s office to determine the appropriate license.
Citation
- Create a DOI for your software information product.
- Add a suggested citation to your repository.
Commonly Included Files
- USGS software products typically contain “boilerplate” files.
- Some of these files, like the DISCLAIMER.md, are mandatory and must be included in all USGS software products. Others are optional.
- Examples of these files may be found in existing projects, or on this page.
Creating Metadata
- A code.json file is a file formatted in JavaScript Object Notation (JSON) and contains project metadata. The code.json file is saved at the top-level of the project.
- USGS compiles all of the code.json files for public products in GitLab into an inventory that is required by Federal policy.
- You can use the code.json file template above to begin creating your project and product metadata with the required fields.
Software Review for Reviewers
- An administrative review is required for all open-source software projects.
- A technical code review and a scientific domain review are required for official USGS software products.
- There are many ways to conduct and document a software review. One way is by creating a GitLab issue with comments documenting the review.
Publishing
- Submit a new issue as the first step in publishing a software product.
- Create a static Git tag and associated release.
- Activate your DOI using the URL of the Git release.
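
The tagging step might look like the sketch below, which uses an annotated tag so the release point carries its own message. The version number v1.0.0 is hypothetical, and the associated release itself is created in GitLab rather than from this script.

```shell
cd "$(mktemp -d)"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.org"
echo "release candidate" > README.md
git add README.md
git commit --quiet -m "Prepare release"

# Annotated tag marking the exact commit being released (hypothetical version).
git tag -a v1.0.0 -m "Version 1.0.0"
git tag --list
git show --no-patch v1.0.0

# With a remote configured, the tag would then be shared with:
#   git push origin v1.0.0
```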
Continuing Your Project
- A good workflow can streamline open-source project development while ensuring compliance with governing policies.
- While specific criteria necessitate releasing subsequent versions, this may also be done at the author’s discretion.
- Subsequent versions are released in a manner very similar to the initial version.
- The code.json file should be updated to include another object within the array that describes the new version.
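
To illustrate only the shape of that update, the sketch below writes a code.json holding an array with one object per version. The field names and values are assumptions made for the example and are not the required USGS fields; take the actual fields from the official code.json template.

```shell
cd "$(mktemp -d)"
# Illustrative structure only: field names and values are assumptions,
# not the official template. One object in the array per released version.
cat > code.json <<'EOF'
[
  {
    "name": "example-tool",
    "version": "1.0.0"
  },
  {
    "name": "example-tool",
    "version": "1.1.0"
  }
]
EOF

# Count the version entries now described in the file.
grep -c '"version"' code.json
```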