USGS Git, GitLab, and Software Release: All in One View

Last updated on 2025-11-24

Overview

Questions

What is version control and why should I use it?

Objectives

Explain the benefits of an automated version control system.
Explain the basics of how automated version control systems work.

We’ll start by exploring how version control can be used to keep track of what one person did and when. Even if you aren’t collaborating with other people, automated version control is much better than trying to figure out which of the following is your most recent version:

GrantReport_Final.docx
GrantReport_Final-SupervisoryReview.docx
GrantReport_ReviewWithChanges.docx
GrantReport_Finalv3.docx
GrantReport_Final_for_Review.docx

We’ve all been in this situation before: it seems unnecessary to have multiple nearly-identical versions of the same document. Some word processors let us deal with this a little better, such as Microsoft Word’s Track Changes, Google Docs’ version history, or LibreOffice’s Recording and Displaying Changes.

Version control systems start with a base version of the document and then record changes you make each step of the way. You can think of it as a recording of your progress: you can rewind to start at the base document and play back each change you made, eventually arriving at your more recent version.

Once you think of changes as separate from the document itself, you can then think about “playing back” different sets of changes on the base document, ultimately resulting in different versions of that document. For example, two users can make independent sets of changes on the same document.

Unless multiple users make changes to the same section of the document - a conflict - you can incorporate two sets of changes into the same base document.
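To make the idea of combining independent changes concrete, here is a minimal sketch that can be run once Git is installed. It uses `git merge-file`, a low-level Git command that is not otherwise covered in this lesson, and the file names and contents (base.txt, alice.txt, bob.txt) are invented for the illustration.

```shell
# Sketch: combine two independent sets of changes made to the same base file.
# All file names and contents are made up for this illustration.
workdir=$(mktemp -d)
cd "$workdir"

# The shared base version of the document.
printf 'Title\n\nIntroduction\n\nMethods\n\nResults\n' > base.txt

# Two people edit different sections of their own copies.
sed 's/^Title$/Grant Report/' base.txt > alice.txt
sed 's/^Results$/Results and Discussion/' base.txt > bob.txt

# Three-way merge: -p prints the result instead of overwriting alice.txt.
git merge-file -p alice.txt base.txt bob.txt > merged.txt

# merged.txt now contains both the new title and the new last heading.
cat merged.txt
```

Because the two edits touch different parts of the file, both are incorporated without a conflict. Had both people rewritten the same line, the merge would have reported a conflict instead.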

A version control system is a tool that keeps track of these changes for us, effectively creating different versions of our files. It allows us to decide which changes will be made to the next version (each record of these changes is called a commit), and keeps useful metadata about them. The complete history of commits for a particular project and their metadata make up a repository. Repositories can be kept in sync across different computers, facilitating collaboration among different people.

Callout

The Long History of Version Control Systems

Automated version control systems are nothing new. Tools like RCS, CVS, or Subversion have been around since the early 1980s and are used by many large companies. However, many of these are now considered legacy systems (i.e., outdated) due to various limitations in their capabilities. More modern systems, such as Git and Mercurial, are distributed, meaning that they do not need a centralized server to host the repository. These modern systems also include powerful merging tools that make it possible for multiple authors to work on the same files concurrently.

Challenge

Paper Writing

Imagine you drafted an excellent paragraph for a paper you are writing, but later ruined it. How would you retrieve the excellent version of your paragraph? Is it even possible?
Imagine you have 5 co-authors. How would you manage the changes and comments they make to your paper? If you use LibreOffice Writer or Microsoft Word, what happens if you accept changes made using the Track Changes option? Do you have a history of those changes?

Solution

Recovering the excellent version is only possible if you created a copy of the old version of the paper.
Collaborative writing with traditional word processors is cumbersome. Either every collaborator has to work on a document sequentially (slowing down the process of writing), or you have to send out a version to all collaborators and manually merge their comments into your document. The ‘track changes’ or ‘record changes’ option can highlight changes for you and simplifies merging, but as soon as you accept changes you will lose their history. You will then no longer know who suggested that change, why it was suggested, or when it was merged into the rest of the document. Even online word processors like Google Docs or Microsoft Office Online do not fully resolve these problems.

Key Points

Version control is like an unlimited ‘undo’.
Version control also allows many people to work in parallel.

Content from Setting Up Git

Last updated on 2025-11-24

Overview

Questions

How do I get set up to use Git?

Objectives

Configure Git the first time it is used on a computer.
Explain the meaning of the --global configuration flag.

When we use Git on a new computer for the first time, we need to configure a few things. Below are some configurations we will set as we get started with Git:

our name and email address,
what our preferred text editor is,
and that we want to use these settings globally (i.e. for every project).

On a command line, Git commands are written as git verb options, where verb is what we actually want to do and options is additional information which may be needed for the verb. So here is how Dracula sets up his new laptop:

BASH

$ git config --global user.name "Vlad Dracula"
$ git config --global user.email "vdracula@usgs.gov"

Please use your own name and email address instead of Dracula’s. This user name and email will be associated with your subsequent Git activity, which means that any changes pushed to GitHub, Bitbucket, GitLab or another Git host server after this lesson will include this information.

For this lesson, we will be interacting with GitLab and so the email address used should be your USGS email.

Callout

Line Endings

As with other keys, when you hit Enter or ↵ (or, on Macs, Return), your computer encodes this input as a character (or two). Different operating systems use different character(s) to represent the end of a line. Windows uses the combination of the carriage return and linefeed characters, while Unix and macOS use only linefeed. These differences can cause otherwise identical files to look different to Git. The solution is to automatically strip the carriage return characters when you move files from Windows to the other systems and add them back when you move files in the other direction. You can read more about this issue in the Pro Git book.
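The difference between the two conventions is easy to see at the byte level. The following is a small sketch using standard shell tools only; the file names are invented for the illustration.

```shell
# Sketch: the same visible text saved with Unix and Windows line endings.
workdir=$(mktemp -d)
cd "$workdir"

printf 'Cold and dry\n'   > unix.txt     # linefeed (LF) only
printf 'Cold and dry\r\n' > windows.txt  # carriage return (CR) followed by LF

# Count bytes: the Windows-style file is one byte longer (the extra CR).
unix_bytes=$(wc -c < unix.txt | tr -d ' ')
windows_bytes=$(wc -c < windows.txt | tr -d ' ')
echo "unix.txt: $unix_bytes bytes, windows.txt: $windows_bytes bytes"

# Stripping the carriage returns makes the two files identical again.
tr -d '\r' < windows.txt > converted.txt
cmp unix.txt converted.txt && echo "identical after stripping CR"
```

This is roughly the kind of conversion that Git performs for you when line-ending handling is configured.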

You can change the way Git recognizes and encodes line endings by setting core.autocrlf with git config. The following settings are recommended:

On macOS and Linux:

BASH

$ git config --global core.autocrlf input

And on Windows:

BASH

$ git config --global core.autocrlf true

Git opens a text editor whenever it needs text from you, most often a commit message. To set your favorite editor, choose one of the following configuration commands:

Editor	Configuration command
Atom	`$ git config --global core.editor "atom --wait"`
nano	`$ git config --global core.editor "nano -w"`
BBEdit (Mac, with command line tools)	`$ git config --global core.editor "bbedit -w"`
Sublime Text (Mac)	`$ git config --global core.editor "/Applications/Sublime\ Text.app/Contents/SharedSupport/bin/subl -n -w"`
Sublime Text (Win, 32-bit install)	`$ git config --global core.editor "'c:/program files (x86)/sublime text 3/sublime_text.exe' -w"`
Sublime Text (Win, 64-bit install)	`$ git config --global core.editor "'c:/program files/sublime text 3/sublime_text.exe' -w"`
Notepad (Win)	`$ git config --global core.editor "c:/Windows/System32/notepad.exe"`
Notepad++ (Win, 32-bit install)	`$ git config --global core.editor "'c:/program files (x86)/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"`
Notepad++ (Win, 64-bit install)	`$ git config --global core.editor "'c:/program files/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"`
Kate (Linux)	`$ git config --global core.editor "kate"`
Gedit (Linux)	`$ git config --global core.editor "gedit --wait --new-window"`
Scratch (Linux)	`$ git config --global core.editor "scratch-text-editor"`
Emacs	`$ git config --global core.editor "emacs"`
Vim	`$ git config --global core.editor "vim"`
VS Code	`$ git config --global core.editor "code --wait"`

It is possible to reconfigure the text editor for Git whenever you want to change it.

Challenge

Configure Default Text Editor

Configure your Git Bash to use nano as the default text editor.

If you already have your text editor configured to a different text editor, you may leave it; however, please note that the instructors will be using nano throughout this course.

Solution

BASH

$ git config --global core.editor "nano -w"

Callout

Exiting Vim

Note that Vim is the default editor for many programs. If you haven’t used Vim before and wish to exit a session without saving your changes, press Esc then type :q! and hit Enter or ↵ or on Macs, Return. If you want to save your changes and quit, press Esc then type :wq and hit Enter or ↵ or on Macs, Return.

Git (2.28+) allows configuration of the name of the branch created when you initialize any new repository. Dracula decides to use that feature to set it to main so it matches the cloud service he will eventually use.

BASH

$ git config --global init.defaultBranch main

Callout

Default Git branch naming

Source file changes are associated with a “branch.” For new learners in this lesson, it’s enough to know that branches exist, and this lesson uses one branch.

By default, Git will create a branch called master when you create a new repository with git init (as explained in the next Episode). The software development community has moved to adopt the term main instead.

In 2020, most Git code hosting services transitioned to using main as the default branch. As an example, any new repository that is opened in GitHub or the USGS GitLab defaults to main. However, Git has not yet made the same change. As a result, local repositories must be manually configured to have the same default branch name as most cloud services.

The five commands we just ran above only need to be run once: the flag --global tells Git to use the settings for every project, in your user account, on this computer.
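The following sketch illustrates what "every project" means in practice: a value set with --global is visible inside any repository unless that repository sets its own value. To avoid touching your real settings, the sketch points Git at a throwaway home directory through a helper function, `scratch_git`, which is a hypothetical name made up for this example, as is the address vlad@example.org.

```shell
# Sketch: how --global settings relate to per-repository settings.
# scratch_git is a made-up helper that runs git with a throwaway HOME,
# so your real ~/.gitconfig is never read or modified.
scratch_home=$(mktemp -d)
scratch_git() {
  env HOME="$scratch_home" XDG_CONFIG_HOME="$scratch_home/.config" git "$@"
}

# Set a "global" email (stored in the throwaway home directory).
scratch_git config --global user.email "vdracula@usgs.gov"

# A brand-new repository sees the global value without any extra setup.
repo=$(mktemp -d)
scratch_git -C "$repo" init --quiet
seen_before=$(scratch_git -C "$repo" config user.email)

# A setting made without --global applies to this repository only
# and takes precedence over the global one.
scratch_git -C "$repo" config user.email "vlad@example.org"
seen_after=$(scratch_git -C "$repo" config user.email)

# The global value itself is unchanged.
global_value=$(scratch_git config --global user.email)

echo "in repo, before local override: $seen_before"
echo "in repo, after local override:  $seen_after"
echo "global value:                   $global_value"
```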

Let’s review those settings and test our core.editor right away:

BASH

$ git config --global --edit

Let’s close the file without making any additional changes. Remember, since typos in the config file will cause issues, it’s safer to view the configuration with:

BASH

$ git config --list

And if necessary, change your configuration using the same commands to choose another editor or update your email address. This can be done as many times as you want.

Callout

Proxy

Typically, your work in USGS will not require the use of a proxy. In the unusual case that your group requires it, you may also need to tell Git about the proxy:

BASH

$ git config --global http.proxy proxy-url
$ git config --global https.proxy proxy-url

To disable the proxy, use

BASH

$ git config --global --unset http.proxy
$ git config --global --unset https.proxy

Callout

Git Help and Manual

Always remember that if you forget the subcommands or options of a git command, you can access the relevant list of options by typing git <command> -h or access the corresponding Git manual by typing git <command> --help, e.g.:

BASH

$ git config -h
$ git config --help

While viewing the manual, remember the : is a prompt waiting for commands and you can press Q to exit the manual.

More generally, you can get the list of available git commands and further resources of the Git manual by typing:

BASH

$ git help

Built-in Git Integrations

There are many development environments that have built-in integrations with Git to streamline the most common Git operations. This lesson does not go into details on using these integrations, but here are some resources that you can explore on your own:

RStudio: https://docs.posit.co/ide/user/ide/guide/tools/version-control.html
Visual Studio Code: https://code.visualstudio.com/docs/sourcecontrol/overview

Key Points

Use git config with the --global option to configure a user name, email address, editor, and other preferences once per machine.

Content from Creating a Repository

Last updated on 2025-11-24

Overview

Questions

Where does Git store information?

Objectives

Create a local Git repository.
Describe the purpose of the .git directory.

Once Git is configured, we can start using it.

We will continue with the story of Wolfman and Dracula who are modeling the co-occurrences of vampires and werewolves on Mars.

We will do our work in the Desktop folder so let us change our directory:

BASH

$ cd
$ cd Desktop

If your Desktop is backed up by OneDrive, change your directory to it with:

BASH

$ cd OneDrive\ -\ DOI/Desktop

Note: You can start typing OneDrive and then hit Tab to autocomplete through “DOI/”. Then, start typing Desktop and hit Tab to autocomplete. If you are struggling to find your Desktop via Git Bash, open your File Explorer, hold SHIFT and right click on the Desktop folder. You should see a menu with the option Open Git Bash here. Click that option and use the Git Bash terminal that is opened.

Now that we are in our Desktop, let us create a new directory for our work and then change the current working directory to the newly created one:

BASH

$ mkdir vampires-and-werewolves
$ cd vampires-and-werewolves

Then we tell Git to make vampires-and-werewolves a repository -- a place where Git can store versions of our files:

BASH

$ git init

If we show hidden files in File Explorer, we can see that a .git folder has been created in the vampires-and-werewolves directory:

File Explorer will show the hidden .git folder that is created when you run `git init`.

It is important to note that git init will create a repository that can include subdirectories and their files—there is no need to create separate repositories nested within the vampires-and-werewolves repository, whether subdirectories are present from the beginning or added later. Also, note that the creation of the vampires-and-werewolves directory and its initialization as a repository are completely separate processes.

If we use ls to show the directory’s contents, it appears that nothing has changed:

BASH

$ ls

But if we add the -a flag to show everything, we can see that Git has created a hidden directory within vampires-and-werewolves called .git:

BASH

$ ls -a

OUTPUT

.	..	.git

Git uses this special subdirectory to store all the information about the project, including the tracked files and sub-directories located within the project’s directory. If we ever delete the .git subdirectory, we will lose the project’s history.
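If you are curious, you can peek inside this subdirectory without changing anything. The sketch below works in a throwaway directory created with mktemp rather than in vampires-and-werewolves; the exact list of entries printed by ls may vary with your version of Git.

```shell
# Sketch: look at what git init creates, using a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet

# Ask Git where it keeps the repository data for the current directory.
git rev-parse --git-dir

# List the contents of that directory. Entries such as HEAD, objects and
# refs hold the current branch pointer, the stored file versions and the
# branch references.
ls .git
```

Looking is harmless, but editing or deleting files in .git by hand can damage the project history.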

Next, we will change the default branch to be called main. This might be the default branch depending on your settings and version of Git. See the setup episode for more information on this change.

BASH

$ git branch -m main

We can check that everything is set up correctly by asking Git to tell us the status of our project:

BASH

$ git status

OUTPUT

On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)

If you are using a different version of Git, the exact wording of the output might be slightly different.

We just used a variety of Bash and Git commands, which can be confusing to keep track of. Luckily, we have compiled a list of some cheat sheets for quick reference.

Challenge

Places to Create Git Repositories

Along with tracking information about the vampires and werewolves modeling project on Mars (the project we have already created), Dracula would also like to track information about vampires and werewolves on various moons. Despite Wolfman’s concerns, Dracula creates a moons project inside his vampires-and-werewolves project with the following sequence of commands:

BASH

$ cd ~/Desktop   # return to Desktop directory
$ cd vampires-and-werewolves     # go into vampires-and-werewolves directory, which is already a Git repository
$ ls -a          # ensure the .git subdirectory is still present in the vampires-and-werewolves directory
$ mkdir moons    # make a subdirectory vampires-and-werewolves/moons
$ cd moons       # go into moons subdirectory
$ git init       # make the moons subdirectory a Git repository
$ ls -a          # ensure the .git subdirectory is present indicating we have created a new Git repository

Is the git init command, run inside the moons subdirectory, required for tracking files stored in the moons subdirectory?

Solution

No. Dracula does not need to make the moons subdirectory a Git repository because the vampires-and-werewolves repository can track any files, sub-directories, and subdirectory files under the vampires-and-werewolves directory. Thus, in order to track all information about moons, Dracula only needed to add the moons subdirectory to the vampires-and-werewolves directory.

Additionally, Git repositories can interfere with each other if they are “nested”: the outer repository will try to version-control the inner repository. Therefore, it is best to create each new Git repository in a separate directory. To be sure that there is no conflicting repository in the directory, check the output of git status. If it looks like the following, you are good to go to create a new repository as shown above:

BASH

$ git status

OUTPUT

fatal: Not a git repository (or any of the parent directories): .git
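A complementary check is to ask Git which repository, if any, already covers the current directory. The sketch below uses `git rev-parse --show-toplevel` in a scratch directory standing in for vampires-and-werewolves; the temporary paths are assumptions for the illustration.

```shell
# Sketch: find out whether a directory is already inside a repository.
outer=$(mktemp -d)        # stands in for vampires-and-werewolves
cd "$outer"
git init --quiet

mkdir moons
cd moons

# Without running git init here, ask Git for the top-level directory of
# the repository that contains the current directory.
toplevel=$(git rev-parse --show-toplevel)
echo "moons is already covered by the repository at: $toplevel"
```

If the command prints a path, the directory is already tracked by that repository and a new git init is not needed there.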

Challenge

Correcting `git init` Mistakes

Wolfman explains to Dracula how a nested repository is redundant and may cause confusion down the road. Dracula would like to remove the nested repository. How can Dracula undo his last git init in the moons subdirectory?

Solution – USE WITH CAUTION!

Background

Removing files from a Git repository needs to be done with caution. We have not yet learned how to tell Git to track a particular file; we will learn this in the next episode. Files that are not tracked by Git can easily be removed like any other “ordinary” files with

BASH

$ rm filename

Similarly a directory can be removed using rm -r dirname or rm -rf dirname. If the files or folder being removed in this fashion are tracked by Git, then their removal becomes another change that we will need to track, as we will see in the next episode.

Solution

Git keeps all of its files in the .git directory. To recover from this little mistake, Dracula can just remove the .git folder in the moons subdirectory by running the following command from inside the vampires-and-werewolves directory:

BASH

$ rm -rf moons/.git

But be careful! Running this command in the wrong directory will remove the entire Git history of a project you might want to keep. Therefore, always check your current directory using the command pwd.
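The whole sequence can be rehearsed safely in a scratch directory before trying it on real work. This is only a sketch: the temporary directory stands in for vampires-and-werewolves, and the checks before rm are one possible habit rather than a required procedure.

```shell
# Sketch: remove an accidental nested repository, checking before deleting.
outer=$(mktemp -d)
cd "$outer"
git init --quiet              # the repository we want to keep
mkdir moons
git -C moons init --quiet     # the accidental nested repository

# Confirm where we are and exactly what is about to be deleted.
pwd
ls -d moons/.git

# Remove only the nested repository data.
rm -rf moons/.git

# The moons directory is still there, now without its own .git,
# and the outer repository is untouched.
ls -a moons
git rev-parse --show-toplevel
```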

Key Points

git init initializes a repository.
Git stores all of its repository data in the .git directory.
Git repositories should not be nested in your file system. Have only one .git folder/directory within a directory.

Content from Tracking Changes

Last updated on 2025-11-24

Overview

Questions

How do I record changes in Git?
How do I check the status of my version control repository?
How do I record notes about what changes I made and why?

Objectives

Go through the modify-add-commit cycle for one or more files.
Explain where information is stored at each stage of that cycle.
Distinguish between descriptive and non-descriptive commit messages.

First let us make sure we are still in the right directory. You should be in the vampires-and-werewolves directory.

BASH

$ cd ~/Desktop/vampires-and-werewolves

Let us create a file called mars.txt that contains some notes about the Red Planet’s suitability for vampires and werewolves. We will use nano to edit the file; you can use whatever editor you like. In particular, this does not have to be the core.editor you set globally earlier. But remember, the bash command to create or edit a new file will depend on the editor you choose (it might not be nano). If you are more comfortable with a graphical interface rather than the command line, you can also use an editor like Notepad, on Windows, to edit the file (e.g. notepad mars.txt). For a refresher on text editors, check out “Which Editor?” in The Unix Shell lesson.

BASH

$ nano mars.txt

Type the text below into the mars.txt file:

OUTPUT

Cold, dry, and everything is red, vampires' favorite color

Let us first verify that the file was properly created by running the list command (ls):

BASH

$ ls

OUTPUT

mars.txt

We can also see this file in our File Explorer.

mars.txt contains a single line, which we can see by running:

BASH

$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color

If we check the status of our project again, Git tells us that it has noticed the new file:

BASH

$ git status

OUTPUT

On branch main

No commits yet

Untracked files:
   (use "git add <file>..." to include in what will be committed)

	mars.txt

nothing added to commit but untracked files present (use "git add" to track)

The “untracked files” message means that there is a file in the directory that Git is not keeping track of. We can tell Git to track a file using git add:

BASH

$ git add mars.txt

and then check that the right thing happened:

BASH

$ git status

OUTPUT

On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   mars.txt

Git now knows that it is supposed to keep track of mars.txt, but it has not recorded these changes as a commit yet. To get it to do that, we need to run one more command:

BASH

$ git commit -m "Start notes on Mars suitability for vampires and werewolves"

OUTPUT

[main (root-commit) f22b25e] Start notes on Mars suitability for vampires and werewolves
 1 file changed, 1 insertion(+)
 create mode 100644 mars.txt

When we run git commit, Git takes everything we have told it to save by using git add and stores a copy permanently inside the special .git directory. This permanent copy is called a commit (or revision) and its short identifier is f22b25e. Your commit may have another identifier.

We use the -m flag (for “message”) to record a short, descriptive, and specific comment that will help us remember later on what we did and why. If we just run git commit without the -m option, Git will launch nano (or whatever other editor we configured as core.editor) so that we can write a longer message.

Good commit messages start with a brief (<50 characters) statement about the changes made in the commit. Generally, the message should complete the sentence “If applied, this commit will ...”. If you want to go into more detail, add a blank line between the summary line and your additional notes. Use this additional space to explain why you made changes and/or what their impact will be.
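One way to write a summary line and additional notes without opening an editor is to pass -m more than once; Git joins the pieces as separate paragraphs. The sketch below runs in a scratch repository, and the wording of the messages is invented for the illustration.

```shell
# Sketch: a commit message with a summary line and a longer explanation.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Vlad Dracula"         # stored in this scratch repository only
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt

# The first -m is the summary line; the second becomes the body,
# separated from the summary by a blank line.
git commit --quiet \
  -m "Start notes on Mars suitability" \
  -m "Record first impressions of the climate so we can compare planets later."

# %s prints the summary (subject) line, %b prints the body.
subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "summary: $subject"
echo "body:    $body"
```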

If we run git status now:

BASH

$ git status

OUTPUT

On branch main
nothing to commit, working tree clean

it tells us everything is up to date. If we want to know what we have done recently, we can ask Git to show us the project’s history using git log:

BASH

$ git log

OUTPUT

commit f22b25e3233b4645dabd0d81e651fe074bd8e73b
Author: Vlad Dracula <vdracula@usgs.gov>
Date:   Thu Aug 22 09:51:46 2013 -0400

    Start notes on Mars suitability for vampires and werewolves

git log lists all commits made to a repository in reverse chronological order. The listing for each commit includes the commit’s full identifier (which starts with the same characters as the short identifier printed by the git commit command earlier), the commit’s author, when it was created, and the log message Git was given when the commit was created.
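The relationship between the short and full identifiers can be checked directly. This sketch uses `git rev-parse`, a command not otherwise covered in this lesson, in a scratch repository; the identifiers it prints will differ from the ones shown above.

```shell
# Sketch: compare the short and full identifiers of the latest commit.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Vlad Dracula"         # stored in this scratch repository only
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars suitability for vampires and werewolves"

full_id=$(git rev-parse HEAD)            # full identifier, as shown by git log
short_id=$(git rev-parse --short HEAD)   # abbreviated identifier
echo "full:  $full_id"
echo "short: $short_id"

# The short identifier is simply the beginning of the full one.
case "$full_id" in
  "$short_id"*) echo "the full identifier starts with the short one" ;;
esac
```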

Callout

Where Are My Changes?

If we run ls at this point, we will still see just one file called mars.txt. That is because Git saves information about files’ history in the special .git directory mentioned earlier so that our filesystem does not become cluttered (and so that we cannot accidentally edit or delete an old version).
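To convince yourself that the committed copy really lives in .git, you can overwrite the working file and still read the saved version back. The sketch below uses `git show HEAD:mars.txt`, which is not otherwise covered in this episode, inside a scratch repository.

```shell
# Sketch: the committed version survives changes to the working file.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Vlad Dracula"         # stored in this scratch repository only
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars suitability for vampires and werewolves"

# Overwrite the working copy; ls still shows a single mars.txt.
echo "Oops, the notes are gone" > mars.txt
ls

# Read the version stored in the most recent commit.
committed=$(git show HEAD:mars.txt)
echo "$committed"
```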

Now suppose Dracula adds more information to the file. (Again, we will edit with nano and then cat the file to show its contents; you may use a different editor, and do not need to cat.)

BASH

$ nano mars.txt
$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves

When we run git status now, it tells us that a file it already knows about has been modified:

BASH

$ git status

OUTPUT

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

The last line is the key phrase: “no changes added to commit”. We have changed this file, but we have not told Git we will want to save those changes (which we do with git add) nor have we saved them (which we do with git commit). So let us do that now. It is good practice to always review our changes before saving them. We do this using git diff. This shows us the differences between the current state of the file and the most recently saved version:

BASH

$ git diff

OUTPUT

diff --git a/mars.txt b/mars.txt
index df0654a..315bf3a 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,2 @@
 Cold, dry, and everything is red, vampires' favorite color
+The two moons may be a problem for werewolves

The output is cryptic because it is actually a series of commands for tools like editors and patch telling them how to reconstruct one file given the other. If we break it down into pieces:

The first line tells us that Git is producing output similar to the Unix diff command comparing the old and new versions of the file.
The second line tells exactly which versions of the file Git is comparing; df0654a and 315bf3a are unique computer-generated labels for those versions.
The third and fourth lines once again show the name of the file being changed.
The remaining lines are the most interesting; they show us the actual differences and the lines on which they occur. In particular, the + marker in the first column shows where we added a line.

After reviewing our change, it is time to commit it:

BASH

$ git commit -m "Add information about suitability of Mars for werewolves"

OUTPUT

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

Whoops: Git will not commit because we did not use git add first. Let us fix that:

BASH

$ git add mars.txt
$ git commit -m "Add information about suitability of Mars for werewolves"

OUTPUT

[main 34961b1] Add information about suitability of Mars for werewolves
 1 file changed, 1 insertion(+)

Git insists that we add files to the set we want to commit before actually committing anything. This allows us to commit our changes in stages and capture changes in logical portions rather than only large batches. For example, suppose we are adding a few citations to relevant research to our thesis. We might want to commit those additions, and the corresponding bibliography entries, but not commit some of our work drafting the conclusion (which we have not finished yet).

To allow for this, Git has a special staging area where it keeps track of things that have been added to the current changeset but not yet committed.
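The thesis scenario can be sketched as follows. The file names citations.txt and conclusion.txt and their contents are invented for the illustration, and `git diff-tree` is used here only as a convenient way to list the files in the latest commit.

```shell
# Sketch: commit one finished change while leaving another uncommitted.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Vlad Dracula"         # stored in this scratch repository only
git config user.email "vdracula@usgs.gov"

echo "Smith 2020" > citations.txt
echo "Draft conclusion" > conclusion.txt
git add citations.txt conclusion.txt
git commit --quiet -m "Start thesis notes"

# Edit both files, but stage only the finished citation work.
echo "Jones 2021" >> citations.txt
echo "Unfinished thought" >> conclusion.txt
git add citations.txt
git commit --quiet -m "Add Jones 2021 citation"

# Files recorded in the latest commit, and files still waiting to be staged.
committed_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
pending_files=$(git diff --name-only)
echo "committed: $committed_files"
echo "pending:   $pending_files"
```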

Callout

Staging Area

If you think of Git as taking snapshots of changes over the life of a project, git add specifies what will go in a snapshot (putting things in the staging area), and git commit then actually takes the snapshot, and makes a permanent record of it (as a commit). If you do not have anything staged when you type git commit, Git will prompt you to use git commit -a or git commit --all, which is kind of like gathering everyone to take a group photo! However, it is almost always better to explicitly add things to the staging area, because you might commit changes you forgot you made. (Going back to the group photo simile, you might get an extra with incomplete makeup walking on the stage for the picture because you used -a!) Try to stage things manually, or you might find yourself searching for “git undo commit” more than you would like!

Let us watch as our changes to a file move from our editor to the staging area and into long-term storage. First, we will add another line to the file:

BASH

$ nano mars.txt
$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity

BASH

$ git diff

OUTPUT

diff --git a/mars.txt b/mars.txt
index 315bf3a..b36abfd 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,3 @@
 Cold, dry, and everything is red, vampires' favorite color
 The two moons may be a problem for werewolves
+Mummies will appreciate the lack of humidity

So far, so good: we have added one line to the end of the file (shown with a + in the first column). Now let us put that change in the staging area and see what git diff reports:

BASH

$ git add mars.txt
$ git diff

There is no output: as far as Git can tell, there is no difference between what it has been asked to save permanently and what is currently in the directory. However, if we do this:

BASH

$ git diff --staged

OUTPUT

diff --git a/mars.txt b/mars.txt
index 315bf3a..b36abfd 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,3 @@
 Cold, dry, and everything is red, vampires' favorite color
 The two moons may be a problem for werewolves
+Mummies will appreciate the lack of humidity

it shows us the difference between the last committed change and what is in the staging area. Let us save our changes:
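The sketch below follows one change through the three places it can be: the working directory, the staging area, and the repository. It uses `--name-only` so that each comparison prints just the file name; the scratch repository and the commit messages are assumptions for the illustration.

```shell
# Sketch: what git diff and git diff --staged report at each stage.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Vlad Dracula"         # stored in this scratch repository only
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"

# 1. Edit the file: the change exists only in the working directory.
echo "Mummies will appreciate the lack of humidity" >> mars.txt
unstaged_after_edit=$(git diff --name-only)
staged_after_edit=$(git diff --staged --name-only)

# 2. Stage the file: the change moves to the staging area.
git add mars.txt
unstaged_after_add=$(git diff --name-only)
staged_after_add=$(git diff --staged --name-only)

# 3. Commit: the staging area and the last commit now match.
git commit --quiet -m "Discuss suitability of Mars' climate for mummies"
staged_after_commit=$(git diff --staged --name-only)

echo "after edit:   diff='$unstaged_after_edit' staged='$staged_after_edit'"
echo "after add:    diff='$unstaged_after_add' staged='$staged_after_add'"
echo "after commit: staged='$staged_after_commit'"
```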

BASH

$ git commit -m "Discuss suitability of Mars' climate for mummies"

OUTPUT

[main 005937f] Discuss suitability of Mars' climate for mummies
 1 file changed, 1 insertion(+)

check our status:

BASH

$ git status

OUTPUT

On branch main
nothing to commit, working tree clean

and look at the history of what we have done so far:

BASH

$ git log

OUTPUT

commit 005937fbe2a98fb83f0ade869025dc2636b4dad5 (HEAD -> main)
Author: Vlad Dracula <vdracula@usgs.gov>
Date:   Thu Aug 22 10:14:07 2013 -0400

    Discuss suitability of Mars' climate for mummies

commit 34961b159c27df3b475cfe4415d94a6d1fcd064d
Author: Vlad Dracula <vdracula@usgs.gov>
Date:   Thu Aug 22 10:07:21 2013 -0400

    Add information about suitability of Mars for werewolves

commit f22b25e3233b4645dabd0d81e651fe074bd8e73b
Author: Vlad Dracula <vdracula@usgs.gov>
Date:   Thu Aug 22 09:51:46 2013 -0400

    Start notes on Mars suitability for vampires and werewolves

Callout

Word-based diffing

Sometimes, e.g. in the case of text documents, a line-wise diff is too coarse. That is where the --color-words option of git diff is very useful, as it highlights the changed words using colors.
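Colors do not show up well in written notes, so this sketch uses the related option `--word-diff=plain`, which marks removed words as [-old-] and added words as {+new+} in plain text. The scratch repository and the edited sentence are assumptions for the illustration.

```shell
# Sketch: a word-level diff of a one-word change.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet
git config user.name "Vlad Dracula"         # stored in this scratch repository only
git config user.email "vdracula@usgs.gov"

echo "The two moons may be a problem for werewolves" > mars.txt
git add mars.txt
git commit --quiet -m "Add note about the moons"

# Change a single word in the line.
echo "The two moons may be a challenge for werewolves" > mars.txt

# Show only the words that changed instead of the whole line.
git --no-pager diff --no-color --word-diff=plain
```

Replacing `--no-color --word-diff=plain` with `--color-words` shows the same change using colors instead of the bracket markers.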

Callout

Paging the Log

When the output of git log is too long to fit on your screen, Git uses a program to split it into pages of the size of your screen. When this “pager” is called, you will notice that the last line on your screen is a :, instead of your usual prompt.

To get out of the pager, press Q.
To move to the next page, press Spacebar.
To search for some_word in all pages, press / and type some_word. Navigate through matches by pressing N.

Callout

Limit Log Size

To avoid having git log cover your entire terminal screen, you can limit the number of commits that Git lists by using -N, where N is the number of commits that you want to view. For example, if you only want information from the last commit you can use:

BASH

$ git log -1

OUTPUT

commit 005937fbe2a98fb83f0ade869025dc2636b4dad5 (HEAD -> main)
Author: Vlad Dracula <vdracula@usgs.gov>
Date:   Thu Aug 22 10:14:07 2013 -0400

    Discuss suitability of Mars' climate for mummies

You can also reduce the quantity of information using the --oneline option:

BASH

$ git log --oneline

OUTPUT

005937f (HEAD -> main) Discuss suitability of Mars' climate for mummies
34961b1 Add information about suitability of Mars for werewolves
f22b25e Start notes on Mars suitability for vampires and werewolves

You can also combine the --oneline option with others. One useful combination adds --graph to display the commit history as a text-based graph and to indicate which commits are associated with the current HEAD, the current branch main, or other Git references:

BASH

$ git log --oneline --graph

OUTPUT

* 005937f (HEAD -> main) Discuss suitability of Mars' climate for mummies
* 34961b1 Add information about suitability of Mars for werewolves
* f22b25e Start notes on Mars suitability for vampires and werewolves
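The options above can also be combined. As a minimal sketch, the following builds a three-commit history in a scratch repository (the temporary directory and the placeholder file contents are assumptions for illustration) and then lists only the two most recent commits in the short format:

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

# Build a three-commit history similar to the one in this lesson
for message in \
    "Start notes on Mars suitability for vampires and werewolves" \
    "Add information about suitability of Mars for werewolves" \
    "Discuss suitability of Mars' climate for mummies"
do
    echo "placeholder line" >> mars.txt
    git add mars.txt
    git commit -q -m "$message"
done

# Only the two most recent commits, one line each
git --no-pager log -2 --oneline
```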

Callout

Directories

Two important facts you should know about directories in Git.

Git does not track directories on their own, only files within them. Try it for yourself:

BASH

$ mkdir spaceships
$ git status
$ git add spaceships
$ git status

Note, our newly created empty directory spaceships does not appear in the list of untracked files even if we explicitly add it (via git add) to our repository. This is the reason why you will sometimes see .gitkeep files in otherwise empty directories. Unlike .gitignore, these files are not special and their sole purpose is to populate a directory so that Git adds it to the repository. In fact, you can name such files anything you like.

If you create a directory in your Git repository and populate it with files, you can add all files in the directory at once by:

BASH

$ git add <directory-with-files>

Try it for yourself:

BASH

$ touch spaceships/apollo-11 spaceships/sputnik-1
$ git status
$ git add spaceships
$ git status

Before moving on, we will commit these changes.

BASH

$ git commit -m "Add some initial thoughts on spaceships"
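The .gitkeep convention described above can be tried in a scratch repository. This is a minimal sketch: the temporary directory is an assumption made for illustration, and .gitkeep is only a conventional name that Git does not treat specially.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q

mkdir spaceships
git status --porcelain     # prints nothing: the empty directory is invisible to Git

# A placeholder file makes the directory trackable
touch spaceships/.gitkeep
git add spaceships
git status --short         # prints: A  spaceships/.gitkeep
```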

To recap, when we want to add changes to our repository, we first need to add the changed files to the staging area (git add) and then commit the staged changes to the repository (git commit).

Challenge

Choosing a Commit Message

Which of the following commit messages would be most appropriate for the last commit made to mars.txt?

1. “Changes”
2. “Added line ‘Mummies will appreciate the lack of humidity’ to mars.txt”
3. “Discuss suitability of Mars’ climate for mummies”

Show me the solution

Answer 1 is not descriptive enough, so the purpose of the commit is unclear. Answer 2 is redundant, because git diff already shows what changed in this commit. Answer 3 is good: short, descriptive, and imperative.

Challenge

Committing Changes to Git

Which command(s) below would save the changes of myfile.txt to my local Git repository?

1.

BASH

   $ git commit -m "my recent changes"

2.

BASH

   $ git init myfile.txt
   $ git commit -m "my recent changes"

3.

BASH

   $ git add myfile.txt
   $ git commit -m "my recent changes"

4.

BASH

   $ git commit -m myfile.txt "my recent changes"

Show me the solution

1. Would only create a commit if files have already been staged.
2. Would try to create a new repository.
3. Is correct: first add the file to the staging area, then commit.
4. Would try to commit a file “my recent changes” with the message myfile.txt.
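To see the difference between the first and third options in practice, the sketch below runs both in a scratch repository. The temporary directory and the file contents are assumptions made for illustration.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

echo "some notes" > myfile.txt

# Committing without staging: nothing is staged, so Git creates no commit
git commit -q -m "my recent changes" || echo "no commit created"

# Staging first, then committing
git add myfile.txt
git commit -q -m "my recent changes"
git --no-pager log --oneline
```

The history ends up with exactly one commit, created by the second git commit.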

Challenge

Committing Multiple Files

The staging area can hold changes from any number of files that you want to commit as a single snapshot.

Add some text to mars.txt noting your decision to consider adding mummies to your model
Create a new file mummies.txt with your initial thoughts about including co-occurrences of mummies in your model
Add changes from both files to the staging area, and commit those changes.

Show me the solution

The output below from cat mars.txt reflects only content added during this exercise. Your output may vary.

First we make our changes to the mars.txt and mummies.txt files:

BASH

$ nano mars.txt
$ cat mars.txt

OUTPUT

Maybe we should also consider including mummies in our model.

BASH

$ nano mummies.txt
$ cat mummies.txt

OUTPUT

Mummies often co-occur with vampires and werewolves in stories. We should definitely include mummies in our co-occurrence model.

Now you can add both files to the staging area. We can do that in one line:

BASH

$ git add mars.txt mummies.txt

Or with multiple commands:

BASH

$ git add mars.txt
$ git add mummies.txt

Now the files are ready to commit. You can check that using git status. If you are ready to commit use:

BASH

$ git commit -m "Write plans to add mummies to model"

OUTPUT

[main cc127c2] Write plans to add mummies to model
 2 files changed, 2 insertions(+)
 create mode 100644 mummies.txt

Challenge

`bio` Repository

Create a new Git repository on your computer called bio.
Write a three-line biography for yourself in a file called me.txt and commit your changes.
Modify one line and add a fourth line.
Display the differences between its updated state and its original state.

Show me the solution

If needed, move out of the vampires-and-werewolves folder:

BASH

$ cd ..

Create a new folder called bio and ‘move’ into it:

BASH

$ mkdir bio
$ cd bio

Initialize git:

BASH

$ git init

Create your biography file me.txt using nano or another text editor. Once in place, add and commit it to the repository:

BASH

$ git add me.txt
$ git commit -m "Add biography file"

Modify the file as described (modify one line, add a fourth line). To display the differences between its updated state and its original state, use git diff:

BASH

$ git diff me.txt

Key Points

git status shows the status of a repository.
Files can be stored in a project’s working directory (which users see), the staging area (where the next commit is being built up) and the local repository (where commits are permanently recorded).
git add puts files in the staging area.
git commit saves the staged content as a new commit in the local repository.
Write a commit message that accurately describes your changes.

Content from Ignoring Things

Last updated on 2025-11-24

Overview

Questions

How can I tell Git to ignore files I don’t want to track?

Objectives

Configure Git to ignore specific files.
Explain why ignoring files can be useful.

What if we have files that we do not want Git to track for us, like backup files created by our editor or intermediate files created during data analysis? Let’s create a few dummy files:

BASH

$ mkdir results
$ touch a.csv b.csv c.csv results/a.out results/b.out

We can see the 5 files that we just created in File Explorer (two of the files, a.out and b.out, are in the results directory):

The five files just created displayed in File Explorer

and see what Git says:

BASH

$ git status

OUTPUT

On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	a.csv
	b.csv
	c.csv
	results/

nothing added to commit but untracked files present (use "git add" to track)

Putting these files under version control would be a waste of disk space. What’s worse, having them all listed could distract us from changes that actually matter, so let’s tell Git to ignore them.

We do this by creating a file in the root directory of our project called .gitignore:

BASH

$ nano .gitignore
$ cat .gitignore

OUTPUT

*.csv
results/

These patterns tell Git to ignore any file whose name ends in .csv and everything in the results directory. (If any of these files were already being tracked, Git would continue to track them.)
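If it is unclear which pattern is responsible for a file being ignored, git check-ignore can report it. The following is a minimal sketch in a scratch repository; the temporary directory and the extra mars.txt file are assumptions for illustration.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q

mkdir results
touch a.csv b.csv c.csv results/a.out results/b.out mars.txt
printf '%s\n' '*.csv' 'results/' > .gitignore

# -v reports the ignore file, line number, and pattern behind each match
git check-ignore -v a.csv results/a.out

# mars.txt matches no pattern, so nothing is printed and the exit status is non-zero
git check-ignore -v mars.txt || echo "mars.txt is not ignored"
```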

Once we have created this file, the output of git status is much cleaner:

BASH

$ git status

OUTPUT

On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore

nothing added to commit but untracked files present (use "git add" to track)

The only thing Git notices now is the newly-created .gitignore file. You might think we wouldn’t want to track it, but everyone we’re sharing our repository with will probably want to ignore the same things that we’re ignoring. Let’s add and commit .gitignore:

BASH

$ git add .gitignore
$ git commit -m "Ignore data files and the results folder"
$ git status

OUTPUT

On branch main
nothing to commit, working tree clean

As a bonus, using .gitignore helps us avoid accidentally adding files to the repository that we don’t want to track:

BASH

$ git add a.csv

OUTPUT

The following paths are ignored by one of your .gitignore files:
a.csv
Use -f if you really want to add them.

If we really want to override our ignore settings, we can use git add -f to force Git to add something. For example, git add -f a.csv. We can also always see the status of ignored files if we want:

BASH

$ git status --ignored

OUTPUT

On branch main
Ignored files:
  (use "git add -f <file>..." to include in what will be committed)

        a.csv
        b.csv
        c.csv
        results/

nothing to commit, working tree clean

Challenge

Ignoring Nested Files

Given a directory structure that looks like:

BASH

results/data
results/plots

How would you ignore only results/plots and not results/data?

Show me the solution

If you only want to ignore the contents of results/plots, add the following line to your .gitignore:

OUTPUT

results/plots/

This line ensures that only the contents of results/plots are ignored, and not the contents of results/data.

As with most programming issues, there are a few alternative ways that one may ensure this ignore rule is followed. The “Ignoring Nested Files: Variation” exercise has a slightly different directory structure that presents an alternative solution. Further, the discussion page has more detail on ignore rules.

Challenge

Including Specific Files

How would you ignore all .csv files in your root directory except for final.csv? Hint: Find out what ! (the exclamation point operator) does

Show me the solution

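You would add the following two rules to your .gitignore. Each comment sits on its own line, because Git only treats a line as a comment when it starts with #:

OUTPUT

# ignore all data files
*.csv
# except final.csv
!final.csv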

The exclamation point operator will include a previously excluded entry.

Note also that if you have previously committed any .csv files, they will not be ignored by this new rule, because Git keeps tracking files that are already in the repository. Only .csv files that are not yet tracked will be ignored.
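The behavior for already-tracked files can be checked in a scratch repository. This sketch is illustrative only: the temporary directory and the data.csv file are assumptions, and git rm --cached is shown as one common way to stop tracking a file while keeping it on disk.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

# Commit a data file before any ignore rule exists
echo "1,2,3" > data.csv
git add data.csv
git commit -q -m "Add data file"

# Adding an ignore rule afterwards does not untrack data.csv
echo "*.csv" > .gitignore
git add .gitignore
git commit -q -m "Ignore data files"
echo "4,5,6" >> data.csv
git status --short         # data.csv still shows up as modified

# Remove it from the index only; the file stays on disk
git rm -q --cached data.csv
git commit -q -m "Stop tracking data.csv"
git status --short         # prints nothing: data.csv is now ignored
```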

Challenge

Ignoring Nested Files: Variation

Given a directory structure similar to the one in the earlier Ignoring Nested Files exercise, but with more subdirectories:

BASH

results/data
results/images
results/plots
results/analysis

How would you ignore all of the contents in the results folder, but not results/data?

Hint: think a bit about how you created an exception with the ! operator before.

Show me the solution

If you want to ignore the contents of results/ but not those of results/data/, you can change your .gitignore to ignore the contents of results folder, but create an exception for the contents of the results/data subfolder. Your .gitignore would look like this:

OUTPUT

# ignore everything in results folder
results/*
# do not ignore results/data/ contents
!results/data/
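One way to check an exception rule like this is to stage everything and list what Git actually picked up. The sketch below does so in a scratch repository; the temporary directory and the file names inside each subfolder are assumptions for illustration. Comments in the .gitignore are placed on their own lines, since Git only treats a line as a comment when it starts with #.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q

mkdir -p results/data results/images results/plots results/analysis
touch results/data/a.txt results/images/i.png results/plots/p.png results/analysis/r.txt

cat > .gitignore <<'EOF'
# ignore everything in results folder
results/*
# do not ignore results/data/ contents
!results/data/
EOF

# Stage everything that is not ignored, then list the staged files
git add .
git ls-files
```

Only .gitignore and results/data/a.txt are staged; the other three subfolders stay ignored.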

Challenge

Ignoring all data Files in a Directory

Assuming you have an empty .gitignore file, and given a directory structure that looks like:

BASH

results/data/position/gps/a.csv
results/data/position/gps/b.csv
results/data/position/gps/c.csv
results/data/position/gps/info.txt
results/plots

What’s the shortest .gitignore rule you could write to ignore all .csv files in results/data/position/gps? Do not ignore the info.txt.

Show me the solution

Appending results/data/position/gps/*.csv will match every file in results/data/position/gps that ends with .csv. The file results/data/position/gps/info.txt will not be ignored.

Challenge

Ignoring all data Files in the repository

Let us assume you have many .csv files in different subdirectories of your repository. For example, you might have:

BASH

results/a.csv
data/experiment_1/b.csv
data/experiment_2/c.csv
data/experiment_2/variation_1/d.csv

How do you ignore all the .csv files, without explicitly listing the names of the corresponding folders?

Show me the solution

In the .gitignore file, write:

OUTPUT

**/*.csv

This will ignore all the .csv files, regardless of their position in the directory tree. You can still include some specific exception with the exclamation point operator.

Challenge

The Order of Rules

Given a .gitignore file with the following contents:

BASH

*.csv
!*.csv

What will be the result?

Show me the solution

The ! modifier negates an entry from a previously defined ignore pattern. Because the !*.csv entry comes after *.csv and matches exactly the same files, it cancels the earlier rule: no .csv file is ignored, and Git treats them like any other file.
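This result is easy to confirm in a scratch repository. The following is a minimal sketch; the temporary directory and the two empty .csv files are assumptions for illustration.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q

touch a.csv b.csv
printf '%s\n' '*.csv' '!*.csv' > .gitignore

# Stage everything that is not ignored, then list the staged files
git add .
git ls-files
```

Both .csv files are staged alongside .gitignore, which shows that the second rule has cancelled the first.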

Challenge

Log Files

You wrote a script that creates many intermediate log-files of the form log_01, log_02, log_03, etc. You want to keep them but you do not want to track them through git.

Write one .gitignore entry that excludes files of the form log_01, log_02, etc.
Test your “ignore pattern” by creating some dummy files of the form log_01, etc.
You find that the file log_01 is very important after all; add it to the tracked files without changing the .gitignore again.
Discuss with your neighbor what other types of files could reside in your directory that you do not want to track and thus would exclude via .gitignore.

Show me the solution

Append either log_* or log* as a new entry in your .gitignore.
Track log_01 using git add -f log_01.

Key Points

The .gitignore file tells Git what files to ignore.

Content from Remotes in GitLab

Last updated on 2025-11-24

Overview

Questions

How do I safely back up my work to a remote site?
How do I share my changes with others on the web?

Objectives

Explain what remote repositories are and why they are useful.
Push to or pull from a remote repository.

Version control really comes into its own when we begin to collaborate with other people. We already have most of the machinery we need to do this; the only thing missing is to copy changes from one repository to another.

Systems like Git allow us to move work between any two repositories. In practice, though, it is easiest to use one copy as a central hub, and to keep it on the web rather than on someone’s laptop. Most programmers use hosting services like GitHub, Bitbucket or GitLab to hold those main copies.

Let us start by sharing the changes we have made to our current project with the world. To this end we are going to create a remote repository that will be linked to our local repository.

1. Create a remote repository

Log in to USGS GitLab, then click on the icon in the top right corner to create a new project called vampires-and-werewolves:

Select Create blank project

Name your project “vampires-and-werewolves”, select your username as the namespace, uncheck “Initialize repository with a README”, and then click Create project.

Note: Since this repository will be connected to a local repository, it needs to be empty. That is why “Initialize repository with a README” needs to be unchecked. See the “GitLab README files” exercise below for a full explanation of why the project needs to be empty.

As soon as the repository is created, GitLab displays a page with a URL and some information on how to configure your local repository:

This effectively does the following on GitLab’s servers:

BASH

$ mkdir vampires-and-werewolves
$ cd vampires-and-werewolves
$ git init

If you remember back to the earlier episode where we added and committed our earlier work on mars.txt, we had a diagram of the local repository which looked like this:

The Local Repository with Git Staging Area

Now that we have two repositories, we need a diagram like this:

Note that our local repository still contains our earlier work on mars.txt, but the remote repository on GitLab appears empty as it does not contain any files yet.

2. HTTPS Setup

Before Dracula can connect to a remote repository, he needs to set up a way for his computer to authenticate with GitLab so it knows it is him trying to connect to his remote repository.

We are going to set up an “Access token” that we can use to authenticate to GitLab.

In GitLab, click on your user icon and then Preferences.

Once you are in your User Settings, click on Access tokens. If you already have a personal access token and you have it saved, you do not need to follow these steps. If you do not have a personal access token, click on Add new token.

Screenshot of Access tokens page and Add new token button

Add a token name that will be meaningful to you. If you are setting this up as your primary access token for GitLab, you will probably want to select all of the scopes. These scopes establish what you are able to do in GitLab with this personal access token. You may also want to delete the expiration date. If removed, GitLab will automatically set the expiration date to the maximum of one year from the day created. Then, click Create personal access token.

Add a personal access token screen in GitLab with Token name entered and all scopes selected

You will be presented with your new personal access token. Make sure you save it some place secure since you will not be able to access it again through GitLab.

Screenshot of GitLab's presentation of your new personal access token

See the GitLab Documentation for more information on personal access tokens.

When you start interacting with the remote from your computer, if you have not already saved your personal access token, you will be prompted to enter a username and password. The prompt may appear in your command prompt or you may get a pop up on your machine. Your username will be your email address and the password will be your personal access token. See the Password Manager spoiler below for more information on saving your personal access token and handling token expiration.

3. Connect local to remote repository

Now we connect the two repositories. We do this by making the GitLab repository a remote for the local repository. The home page of the repository on GitLab includes the URL string we need to identify it:

Click on the clipboard icon under ‘Clone with HTTPS’ to use the HTTPS protocol.

HTTPS Versus SSH

HTTPS allows you to communicate with GitLab using the HTTPS protocol. This approach tends to be a little simpler and allows you to use a Personal Access Token (similar to a password) to authenticate. You can use the same Personal Access Token across multiple machines.

SSH is considered slightly more secure and requires setting up a public and a private key. There is a little more overhead to using SSH over HTTPS, especially if working on more than one machine, which is why we teach the HTTPS method in this lesson. That being said, it is not too hard to configure your account to use SSH; the instructions are available at https://docs.gitlab.com/ee/user/ssh.html.

With the URL copied from the browser, go into the local vampires-and-werewolves repository, and run this command:

BASH

$ git remote add origin https://code.usgs.gov/vdracula/vampires-and-werewolves.git

Make sure to use the URL for your repository rather than Vlad’s: the only difference should be your username instead of vdracula.

origin is a local name used to refer to the remote repository. It could be called anything, but origin is a convention that is often used by default in Git and GitLab, so it is helpful to stick with it unless there is a reason not to.

We can check that the command has worked by running git remote -v:

BASH

$ git remote -v

OUTPUT

origin  https://code.usgs.gov/vdracula/vampires-and-werewolves.git (fetch)
origin  https://code.usgs.gov/vdracula/vampires-and-werewolves.git (push)

We will discuss remotes in more detail in a future episode, while talking about how they might be used for collaboration.
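The remote commands can be practiced without a network connection. In this sketch a bare repository on disk stands in for the empty GitLab project; that stand-in and the temporary directory are assumptions for illustration, not how GitLab is actually reached.

```shell
work=$(mktemp -d)

# Hypothetical stand-in for the empty GitLab project: a bare repository on disk
git init -q --bare "$work/remote.git"

# The local repository
git init -q "$work/vampires-and-werewolves"
cd "$work/vampires-and-werewolves"

# Register the stand-in under the conventional name origin
git remote add origin "$work/remote.git"

# List the configured remotes with their fetch and push URLs
git remote -v

# Print only the URL stored for origin
git remote get-url origin
```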

4. Push local changes to a remote

Now that authentication is set up, we can return to the remote. This command will push the changes from our local repository to the repository on GitLab:

BASH

$ git push origin main

Since Dracula set up a personal access token, Git will prompt him for it. If you have already saved your personal access token, Git may not prompt you for a password.

OUTPUT

Enumerating objects: 16, done.
Counting objects: 100% (16/16), done.
Delta compression using up to 8 threads.
Compressing objects: 100% (11/11), done.
Writing objects: 100% (16/16), 1.45 KiB | 372.00 KiB/s, done.
Total 16 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), done.
To https://code.usgs.gov/vdracula/vampires-and-werewolves.git
 * [new branch]      main -> main

Callout

Proxy

If the network you are connected to uses a proxy, there is a chance that your last command failed with “Could not resolve hostname” as the error message. To solve this issue, you need to tell Git about the proxy:

BASH

$ git config --global http.proxy http://user:password@proxy.url
$ git config --global https.proxy https://user:password@proxy.url

When you connect to another network that does not use a proxy, you will need to tell Git to disable the proxy using:

BASH

$ git config --global --unset http.proxy
$ git config --global --unset https.proxy

Password Managers

If your operating system has a password manager configured, git push will try to use it when it needs your username and password. For example, this is the default behavior for Git Bash on Windows. If you want to type your username and password at the terminal instead of using a password manager, type:

BASH

$ unset SSH_ASKPASS

in the terminal, before you run git push. Despite the name, Git uses SSH_ASKPASS for all credential entry, so you may want to unset SSH_ASKPASS whether you are using Git via SSH or HTTPS.

You may also want to add unset SSH_ASKPASS at the end of your ~/.bashrc to make Git default to using the terminal for usernames and passwords.

If your personal access token was saved in your password manager and it expires, you will need to generate a new personal access token and open the password manager to delete the saved credential. Then, Git will prompt you for the new password on your next git push.

Our local and remote repositories are now in this state:

Callout

The ‘-u’ Flag

You may see a -u option used with git push in some documentation. This option is synonymous with the --set-upstream-to option for the git branch command, and is used to associate the current branch with a remote branch so that the git pull command can be used without any arguments. To do this, simply use git push -u origin main once the remote has been set up.
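The effect of -u can be observed locally. In the sketch below a bare repository on disk stands in for GitLab; that stand-in, the temporary directory, and the single commit are assumptions for illustration.

```shell
work=$(mktemp -d)

# Hypothetical stand-in for GitLab: a bare repository on disk
git init -q --bare "$work/remote.git"

git init -q "$work/local"
cd "$work/local"
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars suitability for vampires and werewolves"
git branch -M main                         # make sure the branch is called main
git remote add origin "$work/remote.git"

# Push and record origin/main as the upstream of the local main branch
git push -q -u origin main

# Show which remote branch main now tracks
git rev-parse --abbrev-ref "main@{upstream}"

# With an upstream set, pull needs no remote or branch argument
git pull -q
```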

We can pull changes from the remote repository to the local one as well:

BASH

$ git pull origin main

OUTPUT

From https://code.usgs.gov/vdracula/vampires-and-werewolves
 * branch            main     -> FETCH_HEAD
Already up-to-date.

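Challenge

GitLab README files

In this episode we created the GitLab project with “Initialize repository with a README” unchecked. If we had left it checked, what would have happened when we tried to link our local and remote repositories?

Show me the solution

In this case, the pull would fail because of unrelated histories. When GitLab creates a README.md file, it performs a commit in the remote repository. When you try to pull the remote repository into your local repository, Git detects that the two histories do not share a common origin and refuses to merge.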

BASH

$ git pull origin main

OUTPUT

warning: no common commits
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://code.usgs.gov/vdracula/vampires-and-werewolves
 * branch            main     -> FETCH_HEAD
 * [new branch]      main     -> origin/main
fatal: refusing to merge unrelated histories

You can force Git to merge the two repositories with the option --allow-unrelated-histories. Be careful when you use this option and carefully examine the contents of local and remote repositories before merging.

BASH

$ git pull --allow-unrelated-histories origin main

OUTPUT

From https://code.usgs.gov/vdracula/vampires-and-werewolves
 * branch            main     -> FETCH_HEAD
Merge made by the 'recursive' strategy.
README.md | 1 +
1 file changed, 1 insertion(+)
create mode 100644 README.md
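The README scenario can be reproduced locally without GitLab. In this sketch a bare repository on disk plays the remote and a throwaway "seed" repository plays GitLab's automatic README commit; both, along with the temporary directory, are assumptions for illustration. The --no-rebase and --no-edit options are added so that the pull merges without asking questions on recent Git versions.

```shell
work=$(mktemp -d)

# Hypothetical stand-in for the GitLab project: a bare repository on disk
git init -q --bare "$work/remote.git"

# Simulate the README commit that GitLab would have created
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
echo "# vampires-and-werewolves" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main
git push -q "$work/remote.git" main

# A local repository with its own, unrelated history
git init -q "$work/local"
cd "$work/local"
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars suitability for vampires and werewolves"
git branch -M main
git remote add origin "$work/remote.git"

# A plain pull is refused because the histories share no commit
git pull --no-rebase origin main || echo "pull refused"

# Explicitly allow the merge of the two histories
git pull -q --no-rebase --no-edit --allow-unrelated-histories origin main
ls
```

After the second pull the working directory contains both README.md and mars.txt, joined by a merge commit.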

Key Points

A local Git repository can be connected to one or more remote repositories.
Use the HTTPS protocol to connect to remote repositories.
git push copies changes from a local repository to a remote repository.
git pull copies changes from a remote repository to a local repository.

Content from Exploring History

Last updated on 2025-11-24

Overview

Questions

How can I look through older commits?
How do I review my changes?
How can I discard staged and unstaged changes?

Objectives

Explain what the HEAD of a repository is.
Know where to find a commit’s SHA.
In GitLab, look at changes made by prior commits.
Restore a file to the version before staged or unstaged changes.

As we saw in a previous episode, we can refer to commits by their identifiers. You can refer to the most recent commit on the current branch by using the identifier HEAD.

We have been adding one line at a time to mars.txt, so it is easy to track our progress by comparing the file against HEAD. Before we start, let us make a change to mars.txt, adding yet another line.

BASH

$ nano mars.txt
$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity
Why are we talking about mummies?

Now, let us see what we get.

BASH

$ git diff HEAD mars.txt

OUTPUT

diff --git a/mars.txt b/mars.txt
index b36abfd..0848c8d 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,3 +1,4 @@
 Cold, dry, and everything is red, vampires' favorite color
 The two moons may be a problem for werewolves
 Mummies will appreciate the lack of humidity
+Why are we talking about mummies?

which is the same as what you would get if you leave out HEAD (try it).
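The two commands agree here because nothing has been staged yet. The sketch below, run in a scratch repository (the temporary directory is an assumption for illustration), shows both the matching case and how the outputs differ once the change is staged.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

printf '%s\n' \
    "Cold, dry, and everything is red, vampires' favorite color" \
    "The two moons may be a problem for werewolves" \
    "Mummies will appreciate the lack of humidity" > mars.txt
git add mars.txt
git commit -q -m "Discuss suitability of Mars' climate for mummies"

echo "Why are we talking about mummies?" >> mars.txt

# Nothing is staged yet, so both commands print the same diff
git --no-pager diff HEAD mars.txt
git --no-pager diff mars.txt

# After staging, plain git diff compares against the staging area and prints nothing,
# while git diff HEAD still compares against the last commit
git add mars.txt
git --no-pager diff mars.txt
git --no-pager diff HEAD mars.txt
```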

Now let us switch to exploring the history of our commits using GitLab. On our repository’s main page, within the box showing your most recent commit, click “History”:

A screenshot showing a red box circling the button 'History' on the main page of the repository

You will see a list of each commit you have made so far. The date, commit message, and an alphanumeric string are shown for each one. The strings are unique IDs for the changes, and “unique” really does mean unique: every change to any set of files on any computer has a unique 40-character identifier, called a SHA after the Secure Hash Algorithm used to compute it.

If we want to see the changes made in prior commits, we click on the commit message, which will take us to a page showing what was changed between this commit and a prior version:

A screenshot showing what was added to mars.txt in a single commit
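The same information is available from the command line. As a rough sketch in a scratch repository (the temporary directory is an assumption, and the 40-character length assumes Git's default SHA-1 object format), git rev-parse prints a commit's identifier and git show displays what that commit changed:

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars suitability for vampires and werewolves"

full=$(git rev-parse HEAD)            # full identifier of the latest commit
short=$(git rev-parse --short HEAD)   # abbreviated form, as in git log --oneline
echo "$full has ${#full} characters; short form: $short"

# Show the commit message and a summary of the files it changed
git --no-pager show --stat "$short"
```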

Discussion

Explore History and Prior Commits

Take some time to explore the GitLab interface.

What are a few scenarios where these features would be useful? How might the ease of exploring your history result in code that is cleaner and easier to read?

All right! So we can save changes to files and see what we have changed. Now, how can we restore older versions of things? Let us suppose we change our mind about the last update to mars.txt (questioning the topic of mummies).

git status now tells us that the file has been changed, but those changes have not been staged:

BASH

$ git status

OUTPUT

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)

    modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

We can put things back the way they were by using git restore:

BASH

$ git restore mars.txt
$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity

This command can be handy when you have been experimenting with some changes, but ultimately decide to discard them before you have staged any of these changes.

Git also has ways of reverting back to earlier versions of a file that you have already committed, but those topics are more advanced and we will not be covering them here.

Challenge

Getting Rid of Staged Changes

git restore can be used to return a file to its last committed version when unstaged changes have been made, but will it also work for changes that have been staged but not committed? Make a change to mars.txt, add that change using git add, then use git restore to see if you can remove your change.

Show me the solution

After adding a change, git restore as-is does not remove the staged changes. To check if it did anything, let us look at the output of git status:

OUTPUT

On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)

        modified:   mars.txt

Note that if you do not have the same output as above you may either have forgotten to change the file, or you have added it and committed it.

Using the command git restore by itself does not give an error, but it does not restore the file either. Git helpfully tells us that we need to use git restore --staged to unstage the file:

BASH

$ git restore --staged mars.txt

Now, git status gives us:

BASH

$ git status

OUTPUT

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)

        modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

This means we can now use git restore to restore the file to the previous commit:

BASH

$ git restore mars.txt
$ git status

OUTPUT

On branch main
nothing to commit, working tree clean
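The two steps above can also be combined by passing both --staged and --worktree to git restore. The following is a minimal sketch in a scratch repository; the temporary directory is an assumption for illustration.

```shell
# Scratch repository in a temporary directory (illustrative only)
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

echo "Cold, dry, and everything is red, vampires' favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars suitability for vampires and werewolves"

# Make a change and stage it
echo "Why are we talking about mummies?" >> mars.txt
git add mars.txt

# Unstage the change and discard it from the working copy in one step
git restore --staged --worktree mars.txt
git status --short     # prints nothing
cat mars.txt
```

Because this discards the change from both places at once, it is worth running git status first to confirm that nothing you want to keep is affected.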

Challenge

Understanding Workflow and History

What is the output of the last command in

BASH

$ cd vampires-and-werewolves
$ echo "Mummies are beautiful and full of love" > mummies.txt
$ git add mummies.txt
$ echo "Mummies are smelly and gross" >> mummies.txt
$ git commit -m "Comment on Mummy hygiene"
$ git restore mummies.txt
$ cat mummies.txt

1.

OUTPUT

  Mummies are smelly and gross

2.

OUTPUT

  Mummies are beautiful and full of love

3.

OUTPUT

  Mummies are beautiful and full of love
  Mummies are smelly and gross

4.

OUTPUT

  Error because you have changed mummies.txt without committing the changes

Show me the solution

The answer is 2.

The command git add mummies.txt places the current version of mummies.txt into the staging area. The changes to the file from the second echo command are only applied to the working copy, not the version in the staging area.

So, when git commit -m "Comment on Mummy hygiene" is executed, the version of mummies.txt committed to the repository is the one from the staging area and has only one line.

At this time, the working copy still has the second line (and git status will show that the file is modified). However, git restore mummies.txt replaces the working copy with the most recently committed version of mummies.txt.

So, cat mummies.txt will output

OUTPUT

Mummies are beautiful and full of love

Key Points

GitLab has a “History” view where you can explore changes made in prior commits
git restore recovers prior versions of files.

Content from Branching and Merging

Last updated on 2025-11-24

Overview

Questions

What are branches in Git and why should I use them?
How do I merge a branch back into my main branch?

Objectives

Explain why you would want to use a branching workflow, even when you are the only person working on your project.
Create a branch within a Git repository.
Create a merge request and merge a branch into a main branch.

Git Branches

A Git branch is a version of the repository where you can make and review changes before updating the clean, trusted content of the repository. A branch is a safe place to test things out without impacting your main branch. You are free to make mistakes and have the flexibility to fix them within a branch.

Create a branch

Open Git Bash (Windows) or Terminal (macOS) and navigate to your local repository. Git Bash shows the current branch (usually main) in the prompt; in any terminal, git status reports it.
Let us create a new branch:
- Execute git switch -c my-test-branch
- The -c flag is what tells Git to create a new branch
- This creates a new branch with the name you specified, starting as an identical copy of the branch you created it from (in this case, main), and switches you over to the new branch
To switch between branches, execute: git switch <branch-name>
- We can switch back to the main branch with git switch main
- Switching branches will automatically load all of the files on that branch into your computer’s project file directory
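
To see how switching branches changes the files in your project directory, you can try the steps above in a throwaway repository. This is a minimal sketch under stated assumptions: the temporary directory, the demo identity, and the file scratch.txt are hypothetical and are not part of the lesson repository.

```shell
set -eu
work=$(mktemp -d)                          # throwaway directory
cd "$work"
git init --quiet .
git symbolic-ref HEAD refs/heads/main      # name the first branch main, whatever the local default is
git config user.name "Demo User"           # hypothetical identity, only for this demo
git config user.email "demo@example.com"
echo "Cold, dry, and everything is red" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"

git switch -c my-test-branch               # create the branch and switch to it
echo "scratch work" > scratch.txt          # hypothetical file that exists only on this branch
git add scratch.txt
git commit --quiet -m "Add a scratch file on the test branch"

git switch main
ls                                         # scratch.txt is absent on main
git switch my-test-branch
ls                                         # scratch.txt is back
git branch                                 # the asterisk marks the current branch
current=$(git branch --show-current)
echo "Now on: $current"
```

Nothing is lost when a file disappears after a switch: it is stored in the commits of the other branch and returns as soon as you switch back.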

Callout

git switch and git restore were introduced in 2019 to separate out the functionality of git checkout, which confused many people by doing too many things.

git switch <branch-name> can be used interchangeably with git checkout <branch-name>, but the command-line options can be slightly different. If you are switching to an existing branch, then the two would look the same:

BASH

git switch desired-branch-name
git checkout desired-branch-name

However, if you want to create a new branch, they differ:

BASH

git switch -c desired-new-branch-name
git checkout -b desired-new-branch-name

Challenge

Create a branch on your own

Create a new branch on your own called 1-my-first-issue
Switch back to main
Switch back to 1-my-first-issue

Show me the solution

git switch -c 1-my-first-issue
git switch main
git switch 1-my-first-issue

GitLab issues are a common way of tracking the work that needs to be done on a project. A common branch naming convention is to use the issue number and a short description of what you are doing as the branch name (e.g., <issue number>-<what-you-are-doing>), similar to what you did in this exercise. Another common naming convention is to use lower-spear-case for your branch names.

Make updates to code

This is when you do your work. Create your scripts, organize files/folders, etc. Do all your work in the repository with the correct branch checked out.

Callout

Important Note

Repositories should not contain any sensitive information, including personally identifiable information, usernames, passwords, or full file paths. While file paths may not be as obviously sensitive as other examples, they are frequently included in scripts. It is worth mentioning that full file paths also decrease portability of scripts to other users!

BASH

git switch 1-my-first-issue

Let us edit the mars.txt file again.

BASH

nano mars.txt

Type the text below into the mars.txt file after the last line:

OUTPUT

Two vampires and three werewolves were spotted on Mars.

Add the file to the staging area:

BASH

git add mars.txt

Commit the changes:

BASH

git commit -m "Add information about vampire and werewolf co-occurrences"

Push the changes to remote:

BASH

git push -u origin 1-my-first-issue

The -u flag is shorthand for --set-upstream, which sets the default remote branch for the current local branch. Prior to this push, the remote repository was not aware of the local branch, and the local branch did not have any connection to the remote. Moving forward, this sets the remote-local association for any future git push or git pull attempts.
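
To confirm which remote branch a local branch is associated with, you can ask Git for its upstream. The sketch below is illustrative only: it assumes a local bare repository (remote.git) as a stand-in for GitLab, plus a hypothetical demo identity, so that it runs without network access.

```shell
set -eu
work=$(mktemp -d)                              # throwaway directory
cd "$work"
git init --quiet --bare remote.git             # assumption: local stand-in for the GitLab repository
git init --quiet local
cd local
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"               # hypothetical identity, only for this demo
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "Cold, dry, and everything is red" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git push --quiet -u origin main

git switch --quiet -c 1-my-first-issue
echo "Two vampires and three werewolves were spotted on Mars." >> mars.txt
git commit --quiet -am "Add information about vampire and werewolf co-occurrences"
git push --quiet -u origin 1-my-first-issue    # -u records the upstream for this branch

# Ask Git which remote branch the current branch now tracks.
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{upstream}')
echo "Upstream: $upstream"
git branch -vv                                 # shows each local branch with its upstream in brackets
```

Once the upstream is recorded, a plain git push or git pull on that branch knows where to go without naming the remote and branch each time.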

Discussion

Check out a diagram-based tutorial on branching

Having trouble visualizing how the branches relate to each other? Navigate to the Learn Git Branching website, select the tutorial on “Branching in Git” from the “Introduction Sequence”, and follow the prompts.

Notice that this website has tutorials for other Git workflows. You may want to revisit this website for more practice after you finish this course.

Git Merge Requests

Merge requests allow for peer code review before merging new code into a branch (usually the main branch).

Creating Merge Requests

There are many ways to create a merge request in GitLab. See GitLab’s Creating merge requests to see them all.

When you push a new branch into GitLab, GitLab will add a banner message about the push and provide a convenient Create merge request button.

Screenshot of the `Create merge request` button

If you use this method to create the merge request, you will not need to specify the source and target branches.

Add a succinct title and description. The description can follow this basic format:

Describe why this merge request exists
Explain what was changed
Explain how the change addresses the issue
Provide information on how the reviewer can test your code

Select an Assignee (the person who owns the merge request but is not responsible for reviewing it) and a Reviewer.

Click Create merge request.

Merging Merge Requests

Once a merge request has been created, you can see an overview, the commits that were made, and all of the line-by-line changes that were made to the content.

After all of the changes have been reviewed, the Reviewer can click Approve and the Assignee can click the Merge button to merge the updates into the main branch.

Challenge

Review your merge request. Can you see the changes that were made? How might you add a comment to a specific line of code?
Merge your changes into main. Are you able to see the updated file in your main branch?

Show me the solution

Click on the “Changes” tab within the Merge Request page. Hover over a line number and click the “dialog box” pop-up to add a comment.
Back on the “Overview” tab of the Merge Request, click “Merge” to merge your changes into main.

Deleting a Branch

After merging the new branch into the main branch, the new branch will be automatically deleted from the remote repository (if Delete source branch is checked during the merge request). But that branch still exists in your local repository. If you want to clean up the clutter of an extra branch, hop back to your local terminal (Git Bash) to delete it:

Check all existing branches to see what you want to delete
Switch to main branch since Git will not delete a branch you are on
Delete the branch that has already been merged into main

BASH

git branch
git switch main
git branch -d 1-my-first-issue
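
The lowercase -d flag is the cautious way to delete: Git refuses if the branch still holds work that has not been merged. The following sketch illustrates that behavior in a throwaway repository. A local git merge stands in for the merge request on GitLab, and the branch 2-unfinished-idea and file draft.txt are hypothetical.

```shell
set -eu
work=$(mktemp -d)                          # throwaway directory
cd "$work"
git init --quiet .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # hypothetical identity, only for this demo
git config user.email "demo@example.com"
echo "Cold, dry, and everything is red" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"

git switch --quiet -c 1-my-first-issue
echo "Two vampires and three werewolves were spotted on Mars." >> mars.txt
git commit --quiet -am "Add information about vampire and werewolf co-occurrences"
git switch --quiet main
git merge --quiet 1-my-first-issue         # assumption: local stand-in for merging on GitLab

git switch --quiet -c 2-unfinished-idea    # hypothetical branch with work that is never merged
echo "draft" > draft.txt
git add draft.txt
git commit --quiet -m "Start a draft"
git switch --quiet main

git branch --merged main                   # branches whose commits are all contained in main
git branch -d 1-my-first-issue             # succeeds: fully merged
git branch -d 2-unfinished-idea || echo "Git refused to delete the unmerged branch"
git branch
```

Running git branch --merged main first is a quick way to see which branches are safe to clean up.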

Keep everything up-to-date

To keep your local repository up-to-date with your remote repository, you need to pull in the new changes:

BASH

git pull origin main

Key Points

A branching workflow enables you to keep your main repository clean and allows for mistakes, fixes, and reviews before content is merged into main.

Content from Collaborating

Overview

Questions

How can I use version control to collaborate with other people?

Objectives

Clone a remote repository.
Collaborate by pushing to a common repository.
Describe the basic collaborative workflow.

For the next step, get into pairs. One person will play the role of “Owner” and the other the role of “Collaborator”. The goal is for the Collaborator to add changes into the Owner’s repository. We will switch back and forth between the roles throughout this episode.

Callout

Practicing By Yourself

If you are working through this lesson on your own, you can carry on by opening a second terminal window. This window will represent your partner, working on another computer. You will not need to give anyone access on GitLab, because both ‘partners’ are you.

Update Repository Permissions

The Owner needs to give the Collaborator access. In your repository page on GitLab, click the Manage menu on the left, select Members, click Invite members. Enter your partner’s username or email address in the search box, select a role (either Developer or Maintainer), and click Invite.

screenshot of repository page with Manage then Members selected, showing how to add Collaborators in a GitLab repository

Clone the Repository

Once the Collaborator has access to the repository, they need to download a copy of the Owner’s repository to their machine. This is called “cloning a repo”.

The Collaborator does not want to overwrite their own version of vampires-and-werewolves.git, and so needs to clone the Owner’s repository to a different location than their own repository with the same name. (This is a weird case…you would not normally have two versions of the same Git repo on your local machine.)

To clone the Owner’s repo into their Desktop folder, the Collaborator can copy the repository URL from the repository homepage by clicking Code and Clone with HTTPS.

screenshot of the repository page with the Code menu opened and showing the copy button under Clone with HTTPS

HTTPS Versus SSH

SSH is considered slightly more secure and requires setting up a public and a private key. SSH has a little more overhead than HTTPS, especially if working on more than one machine, which is why we teach the HTTPS method in this lesson. SSH also requires being on the internal USGS network (including GlobalProtect) and will not work for external collaborators. That being said, it is not too hard to configure your account to use SSH and the instructions are available at https://docs.gitlab.com/ee/user/ssh.html.

Then, open bash and navigate to your Desktop. Remember your Desktop may be under your OneDrive (e.g., ~/OneDrive - DOI/Desktop).

BASH

$ cd ~/OneDrive\ -\ DOI/Desktop

Next, enter the following (replacing https://code.usgs.gov/vdracula/vampires-and-werewolves.git with the URL that was just copied and replacing vdracula with the Owner’s username):

BASH

$ git clone https://code.usgs.gov/vdracula/vampires-and-werewolves.git ./vdracula-vampires-and-werewolves

Create a New Branch and Make Changes

The Collaborator can now make a change in their clone of the Owner’s repository, exactly the same way as we have been doing before:

BASH

$ cd vdracula-vampires-and-werewolves
$ git switch -c pluto-branch
$ nano pluto.txt
$ cat pluto.txt

OUTPUT

It is so a planet!

Callout

The Importance of Branches

Using branches in Git becomes even more important when you begin collaborating with others. Branches can help you avoid conflicts and allow others to review your code before merging it with the main branch where it could potentially introduce bugs and conflicts with the work of others on your team. You can also ‘protect’ the default (e.g., main) branch to prevent developers from pushing changes directly to it. If the default branch is protected, the developers must push to a separate branch and then create a merge request to add their changes to the default branch. This workflow ensures that changes to the default branch get reviewed and approved. Learn more about GitLab protected branches in the GitLab Documentation.

Stage, Commit, and Push Changes

BASH

$ git add pluto.txt
$ git commit -m "Add notes about Pluto"

OUTPUT

 1 file changed, 1 insertion(+)
 create mode 100644 pluto.txt

Then push the change to the Owner’s repository on GitLab:

BASH

$ git push -u origin pluto-branch

OUTPUT

Enumerating objects: 4, done.
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 306 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://code.usgs.gov/vdracula/vampires-and-werewolves.git
 * [new branch]      pluto-branch -> pluto-branch

Note that we did not have to create a remote called origin: Git uses this name by default when we clone a repository. (This is why origin was a sensible choice earlier when we were setting up remotes by hand.)

Take a look at the Owner’s repository on GitLab again, and you should be able to see the new branch and commit made by the Collaborator. You may need to refresh your browser to see the new commit.

Challenge

Create and Comment on a Merge Request

Collaborator: Create a merge request that will merge pluto-branch with main. Set the Owner as the Reviewer.

Owner: Add a comment to the line that was added in pluto.txt. Then, approve and merge the merge request.

Show me the solution

Collaborator: Review Branching and Merging Episode “Creating Merge Requests” for a reminder of how to create a merge request in GitLab.

Owner: With GitLab, it is possible to comment on the diff of a merge request. Go to the Changes tab within the merge request. Hover over the line of code to comment and a blue comment icon appears. Click to open a comment window.

Pull Merged Changes to Local Repositories

Once the new code has been merged to the main branch, both the Collaborator and Owner should pull the changes to their local repositories.

To download the changes from GitLab, enter:

BASH

$ git switch main
$ git pull origin main

Now the three repositories (Owner’s local, Collaborator’s local, and Owner’s on GitLab) are back in sync.

Callout

A Basic Collaborative Workflow

In practice, it is good to be sure that you have an updated version of the repository you are collaborating on, so you should git pull before making your changes. The basic collaborative workflow would be:

update your local repo with git pull origin main,
create a feature branch with git switch -c <branch-name>,
make your changes and stage them with git add,
commit your changes with git commit -m "YOUR COMMIT MESSAGE HERE",
upload the changes to GitLab with git push -u origin <branch-name>,
create a merge request in GitLab,
merge once the feature branch has been reviewed and approved, and
update your local main branch with git switch main and git pull origin main.

It is better to make many commits with smaller changes rather than one commit with massive changes: small commits are easier to read and review.
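
The workflow above can be rehearsed end to end without touching GitLab. The sketch below is a rough stand-in under stated assumptions: a local bare repository (shared.git) plays the role of the GitLab project, the Owner merges by hand where a merge request would normally be reviewed, and the directory names and identities are hypothetical.

```shell
set -eu
work=$(mktemp -d)                                 # throwaway directory
cd "$work"
git init --quiet --bare shared.git                # assumption: local stand-in for GitLab

# The Owner creates the project and publishes main.
git init --quiet owner
cd owner
git symbolic-ref HEAD refs/heads/main
git config user.name "Owner"                      # hypothetical identities, only for this demo
git config user.email "owner@example.com"
git remote add origin "$work/shared.git"
echo "Cold, dry, and everything is red" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git push --quiet origin main

# The Collaborator follows the workflow.
cd "$work"
git clone --quiet --branch main shared.git collaborator
cd collaborator
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
git pull --quiet origin main                      # update local main
git switch --quiet -c pluto-branch                # create a feature branch
echo "It is so a planet!" > pluto.txt             # make a change...
git add pluto.txt                                 # ...and stage it
git commit --quiet -m "Add notes about Pluto"     # commit
git push --quiet -u origin pluto-branch           # upload the branch

# On GitLab this would be a reviewed merge request; here the Owner merges by hand.
cd "$work/owner"
git fetch --quiet origin
git merge --quiet --no-ff -m "Merge pluto-branch into main" origin/pluto-branch
git push --quiet origin main

# The Collaborator updates local main.
cd "$work/collaborator"
git switch --quiet main
git pull --quiet origin main
git log --oneline
```

After the last pull, the Owner's and Collaborator's main branches point at the same commit, which is the "back in sync" state described earlier.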

Challenge

Review Changes

The Owner pushed commits to the repository’s main branch without giving any information to the Collaborator. How can the Collaborator find out what has changed with the command line? And on GitLab?

Show me the solution

On the command line, the Collaborator can use git fetch origin main to get the remote changes into the local repository, but without merging them. Then by running git diff main origin/main the Collaborator will see the changes output in the terminal.

On GitLab, the Collaborator can go to the repository and click on “Code” -> “Commits” to view the most recent commits pushed to the repository.
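
The command-line approach can be tried in a throwaway setup. This is a minimal sketch, assuming a local bare repository (shared.git) in place of GitLab and hypothetical owner and collaborator directories; the extra git log line is an optional addition that lists the incoming commits before showing the diff.

```shell
set -eu
work=$(mktemp -d)                                 # throwaway directory
cd "$work"
git init --quiet --bare shared.git                # assumption: local stand-in for GitLab

git init --quiet owner
cd owner
git symbolic-ref HEAD refs/heads/main
git config user.name "Owner"                      # hypothetical identities, only for this demo
git config user.email "owner@example.com"
git remote add origin "$work/shared.git"
echo "Cold, dry, and everything is red" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git push --quiet origin main

cd "$work"
git clone --quiet --branch main shared.git collaborator

# The Owner pushes a new commit without telling the Collaborator.
cd "$work/owner"
echo "The two moons may be a problem for werewolves" >> mars.txt
git commit --quiet -am "Add concerns about the moons"
git push --quiet origin main

# The Collaborator downloads the change without merging it, then inspects it.
cd "$work/collaborator"
git fetch --quiet origin main
git log --oneline main..origin/main               # commits on the remote that local main lacks
changes=$(git diff main origin/main)
echo "$changes"
```

Because git fetch does not merge, the Collaborator's own mars.txt is unchanged until they decide to run git pull origin main.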

Key Points

git clone copies a remote repository to create a local repository with a remote called origin automatically set up.
Branches are an important part of collaborating with others in Git repositories.
Ensure that you establish a collaborative workflow for your project team to use.

Content from Conflicts

Overview

Questions

What do I do when my changes conflict with someone else’s?

Objectives

Explain what conflicts are and when they can occur.
Resolve conflicts resulting from a merge.

As soon as people can work in parallel, they may end up introducing changes that conflict with one another. This will even happen with a single person: if we are working on a piece of software on both our laptop and a server in the lab, we could make different changes to each copy. Version control helps us manage these conflicts by giving us tools to resolve overlapping changes.

To see how we can resolve conflicts, we must first create one. The file mars.txt currently looks like this in our vampires-and-werewolves repository:

BASH

$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity
Why are we talking about mummies?
Two vampires and three werewolves were spotted on Mars.

Let us add a line to the copy in GitLab and commit the change:

Screenshot adding the line "This line was added to the copy in GitLab." in GitLab

Now let us make a different change locally without updating from GitLab:

BASH

$ nano mars.txt
$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity
Why are we talking about mummies?
Two vampires and three werewolves were spotted on Mars.
This line was added to my local copy.

We can commit the change locally:

BASH

$ git add mars.txt
$ git commit -m "Add a line in my copy"

OUTPUT

[main 07ebc69] Add a line in my copy
 1 file changed, 1 insertion(+)

but Git will not let us push it to GitLab:

BASH

$ git push origin main

OUTPUT

To https://code.usgs.gov/vdracula/vampires-and-werewolves.git
 ! [rejected]        main -> main (fetch first)
error: failed to push some refs to 'https://code.usgs.gov/vdracula/vampires-and-werewolves.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Git rejects the push because it detects that the remote repository has new updates that have not been incorporated into the local branch. What we have to do is pull the changes from GitLab, merge them into the copy we are currently working in, and then push that. Let us start by pulling:

BASH

$ git pull origin main

OUTPUT

remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 3 (delta 2), reused 3 (delta 2), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://code.usgs.gov/vdracula/vampires-and-werewolves
 * branch            main     -> FETCH_HEAD
    29aba7c..dabb4c8  main     -> origin/main
Auto-merging mars.txt
CONFLICT (content): Merge conflict in mars.txt
Automatic merge failed; fix conflicts and then commit the result.

The git pull command updates the local repository to include those changes already included in the remote repository. After the changes from the remote branch have been fetched, Git detects that changes made to the local copy overlap with those made to the remote repository, and therefore refuses to merge the two versions to stop us from trampling on our previous work. The conflict is marked in the affected file:

BASH

$ nano mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity
Why are we talking about mummies?
Two vampires and three werewolves were spotted on Mars.
<<<<<<< HEAD
This line was added to my local copy.
=======
This line was added to the copy in GitLab.
>>>>>>> dabb4c8c450e8475aee9b14b4383acc99f42af1d

The change made locally is preceded by <<<<<<< HEAD. Git has then inserted ======= as a separator between the conflicting changes and marked the end of the content downloaded from GitLab with >>>>>>>. (The string of letters and digits after that marker identifies the commit we have just downloaded.)
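
If you want to look at conflict markers without involving GitLab, a conflict can be produced entirely in a throwaway repository. In this sketch the branch gitlab-copy is a hypothetical stand-in for the copy on GitLab, and git merge replaces git pull, so the closing marker shows the branch name rather than a commit identifier.

```shell
set -eu
work=$(mktemp -d)                          # throwaway directory
cd "$work"
git init --quiet .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # hypothetical identity, only for this demo
git config user.email "demo@example.com"
echo "Two vampires and three werewolves were spotted on Mars." > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"

git switch --quiet -c gitlab-copy          # hypothetical branch standing in for the GitLab copy
echo "This line was added to the copy in GitLab." >> mars.txt
git commit --quiet -am "Add a line in GitLab"

git switch --quiet main
echo "This line was added to my local copy." >> mars.txt
git commit --quiet -am "Add a line in my copy"

# Both branches changed the same place in the file, so the merge stops.
git merge gitlab-copy || echo "Merge stopped: conflicts need resolving"
git diff --name-only --diff-filter=U       # list the files that still have conflicts
cat mars.txt                               # shows the <<<<<<<, =======, and >>>>>>> markers
```

The git diff --name-only --diff-filter=U line is handy in larger projects, where several files may be in conflict at once.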

It is now up to us to edit this file to remove these markers and reconcile the changes. We can do anything we want: keep the change made in the local repository, keep the change made in the remote repository, write something new to replace both, or get rid of the change entirely. Let us replace both so that the file looks like this:

BASH

$ cat mars.txt

OUTPUT

Cold, dry, and everything is red, vampires' favorite color
The two moons may be a problem for werewolves
Mummies will appreciate the lack of humidity
Why are we talking about mummies?
Two vampires and three werewolves were spotted on Mars.
We removed the conflict on this line

To finish merging, we add mars.txt to the changes being made by the merge and then commit:

BASH

$ git add mars.txt
$ git status

OUTPUT

On branch main
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)

Changes to be committed:

	modified:   mars.txt

BASH

$ git commit -m "Merge changes from GitLab"

OUTPUT

[main 2abf2b1] Merge changes from GitLab

Now we can push our changes to GitLab:

BASH

$ git push origin main

OUTPUT

Enumerating objects: 10, done.
Counting objects: 100% (10/10), done.
Delta compression using up to 8 threads
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 645 bytes | 645.00 KiB/s, done.
Total 6 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4), completed with 2 local objects.
To https://code.usgs.gov/vdracula/vampires-and-werewolves.git
   dabb4c8..2abf2b1  main -> main

Git keeps track of what we have merged with what, so we do not have to fix things by hand again. When we return to GitLab and refresh, we will see the merged file:

Screenshot of mars.txt in GitLab with the merged content

Git’s ability to resolve conflicts is very useful, but conflict resolution costs time and effort, and can introduce errors if conflicts are not resolved correctly. If you find yourself resolving a lot of conflicts in a project, consider these technical approaches to reducing them:

Pull from origin more frequently, especially before starting new work
Use topic branches to segregate work, merging to main when complete
Make smaller, more atomic commits
Push your work when it is done and encourage your team to do the same to reduce work in progress and, by extension, the chance of having conflicts
Where logically appropriate, break large files into smaller ones so that it is less likely that two authors will alter the same file simultaneously

Conflicts can also be minimized with project management strategies:

Clarify who is responsible for what areas with your collaborators
Discuss what order tasks should be carried out in with your collaborators so that tasks expected to change the same lines will not be worked on simultaneously
If the conflicts are stylistic churn (e.g., tabs vs. spaces), establish a governing project convention and, if necessary, use code style tools (e.g., black for Python, lintr for R) to enforce it

Challenge

Conflicts on Non-textual files

What does Git do when there is a conflict in an image or some other non-textual file that is stored in version control?

Show me the solution

Let us try it. Suppose Dracula takes a picture of the Martian surface and calls it mars.jpg.

If you do not have an image file of Mars available, you can create a dummy binary file like this:

BASH

$ head -c 1024 /dev/urandom > mars.jpg
$ ls -lh mars.jpg

OUTPUT

-rw-r--r-- 1 vlad 57095 1.0K Mar  8 20:24 mars.jpg

ls shows us that this created a 1-kilobyte file. It is full of random bytes read from the special file, /dev/urandom.

Now, suppose Dracula adds mars.jpg to his repository:

BASH

$ git add mars.jpg
$ git commit -m "Add picture of Martian surface"

OUTPUT

[main 8e4115c] Add picture of Martian surface
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 mars.jpg

Suppose that Wolfman has added a similar picture in the meantime. His is a picture of the Martian sky, but it is also called mars.jpg. When Dracula tries to push, he gets a familiar message:

BASH

$ git push origin main

OUTPUT

To https://code.usgs.gov/vdracula/vampires-and-werewolves.git
 ! [rejected]        main -> main (fetch first)
error: failed to push some refs to 'https://code.usgs.gov/vdracula/vampires-and-werewolves.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

We have learned that we must pull first and resolve any conflicts:

BASH

$ git pull origin main

When there is a conflict on an image or other binary file, Git prints a message like this:

OUTPUT

remote: Counting objects: 3, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From https://code.usgs.gov/vdracula/vampires-and-werewolves.git
 * branch            main     -> FETCH_HEAD
   6a67967..439dc8c  main     -> origin/main
warning: Cannot merge binary files: mars.jpg (HEAD vs. 439dc8c08869c342438f6dc4a2b615b05b93c76e)
Auto-merging mars.jpg
CONFLICT (add/add): Merge conflict in mars.jpg
Automatic merge failed; fix conflicts and then commit the result.

The conflict message here is mostly the same as it was for mars.txt, but there is one key additional line:

OUTPUT

warning: Cannot merge binary files: mars.jpg (HEAD vs. 439dc8c08869c342438f6dc4a2b615b05b93c76e)

Git cannot automatically insert conflict markers into an image as it does for text files. So, instead of editing the image file, we must check out the version we want to keep. Then we can add and commit this version.

On the key line above, Git has conveniently given us commit identifiers for the two versions of mars.jpg. Our version is HEAD, and Wolfman’s version is 439dc8c0.... If we want to use our version, we can use git checkout:

BASH

$ git checkout HEAD mars.jpg
$ git add mars.jpg
$ git commit -m "Use image of surface instead of sky"

OUTPUT

[main 21032c3] Use image of surface instead of sky

If instead we want to use Wolfman’s version, we can use git checkout with Wolfman’s commit identifier, 439dc8c0:

BASH

$ git checkout 439dc8c0 mars.jpg
$ git add mars.jpg
$ git commit -m "Use image of sky instead of surface"

OUTPUT

[main da21b34] Use image of sky instead of surface

We can also keep both images. The catch is that we cannot keep them under the same name. But, we can check out each version in succession and rename it, then add the renamed versions. First, check out each image and rename it:

BASH

$ git checkout HEAD mars.jpg
$ git mv mars.jpg mars-surface.jpg
$ git checkout 439dc8c0 mars.jpg
$ mv mars.jpg mars-sky.jpg

Then, remove the old mars.jpg and add the two new files:

BASH

$ git rm mars.jpg
$ git add mars-surface.jpg
$ git add mars-sky.jpg
$ git commit -m "Use two images: surface and sky"

OUTPUT

[main 94ae08c] Use two images: surface and sky
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 mars-sky.jpg
 rename mars.jpg => mars-surface.jpg (100%)

Now both images of Mars are checked into the repository, and mars.jpg no longer exists.
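
The "keep our version" case can be rehearsed locally. The sketch below is a simplified stand-in, not the exact session above: a hypothetical branch named wolfman replaces Wolfman's pushed commit, git merge replaces git pull, and two tiny files containing a NUL byte play the part of the images.

```shell
set -eu
work=$(mktemp -d)                          # throwaway directory
cd "$work"
git init --quiet .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # hypothetical identity, only for this demo
git config user.email "demo@example.com"
echo "Cold, dry, and everything is red" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"

git switch --quiet -c wolfman              # hypothetical branch standing in for Wolfman's work
printf 'sky\0\001\002' > mars.jpg          # tiny fake binary file (the NUL byte makes Git treat it as binary)
git add mars.jpg
git commit --quiet -m "Add picture of Martian sky"

git switch --quiet main
printf 'surface\0\001\002' > mars.jpg
git add mars.jpg
git commit --quiet -m "Add picture of Martian surface"

# Git cannot merge binary files, so the merge stops with an add/add conflict.
git merge wolfman || echo "Merge stopped: binary conflict in mars.jpg"

git checkout HEAD mars.jpg                 # keep our version (the surface)
git add mars.jpg
git commit --quiet -m "Use image of surface instead of sky"
git status --short                         # prints nothing: the merge is concluded
```

To keep Wolfman's version instead, the same sketch works with git checkout wolfman mars.jpg in place of git checkout HEAD mars.jpg.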

Challenge

A Typical Work Session

You sit down at your computer to work on a shared project that is tracked in a remote Git repository. During your work session, you take the following actions, but not in this order:

Make changes by appending the number 100 to a text file numbers.txt
Update remote repository to match the local repository
Celebrate your success with some fancy beverage(s)
Update local repository to match the remote repository
Stage changes to be committed
Commit changes to the local repository

In what order should you perform these actions to minimize the chances of conflicts? Put the actions above in order in the action column of the table below. When you have the order right, see if you can write the corresponding commands in the command column. A few steps are populated to get you started.

order	action	command
1
2		`echo 100 >> numbers.txt`
3
4
5
6	Celebrate!	`AFK`

Show me the solution

order	action	command
1	Update local	`git pull origin main`
2	Make changes	`echo 100 >> numbers.txt`
3	Stage changes	`git add numbers.txt`
4	Commit changes	`git commit -m "Add 100 to numbers.txt"`
5	Update remote	`git push origin main`
6	Celebrate!	`AFK`

Key Points

Conflicts occur when two or more people change the same lines of the same file.
The version control system does not allow people to overwrite each other’s changes blindly, but highlights conflicts so that they can be resolved.

Content from Open Science

Overview

Questions

What is open science?
How is open science valuable?
How can version control help me make my work more open?

Objectives

Define open science and be able to list attributes or processes that make a research project open.
Explain why open science is valuable.
Explain how a version control system can be leveraged as an electronic lab notebook for computational work.

In 2023, the U.S. government declared a Year of Open Science and defined open science for federal agencies:

“Open Science is the principle and practice of making research products and processes available to all, while respecting diverse cultures, maintaining security and privacy, and fostering collaborations, reproducibility, and equity.”

But what does this mean in practice? NASA is one agency leading the way in developing a culture of open science with their Open Science 101 curriculum. Here at USGS, we can practice open science by releasing scientific code with Git version control via a USGS software information product.

Callout

Check Out How USGS Celebrated The Year Of Open Science!

Check out the USGS Year of Open Science webpage to learn about the Community for Data Integration’s (CDI) ‘Open Data for Open Science’ workshop and other USGS open science stories.

Let us take a step back. How is open science valuable and how does publishing your code make your research more open?

Callout

Making Code Citable

All USGS software information products are citable with a unique Digital Object Identifier (DOI). You will learn how to create the citation and DOI in the later episode on Citation.

Unless your methods are restricted to a single mathematical operation, it is very difficult to make your research fully reproducible without the code used to analyze and generate results. Sharing the analysis code can significantly increase the reproducibility of published papers (Ince et al. 2012, Laurinavichyute et al. 2022). Additionally, open science practices can lead to more citations, potential collaborators, and funding opportunities (McKiernan et al. 2016). This open model accelerates discovery: the more open work is, the more widely it is cited and re-used (Piwowar et al. 2007).

Researchers are also exploring how the FAIR (Findable, Accessible, Interoperable, and Reusable) data standards can apply to research software. Check out the FAIR Principles for Research Software to learn more.

Are you worried that your code is too messy to share? Fear not: here is an open letter from a professional software engineer telling you that it is good enough. In fact, “if your code is good enough to do the job, then it is good enough to release”.

Callout

Is My Work Reproducible?

When analysis is conducted using scientific code, domain and code reviews can help to determine reproducibility (and therefore the accuracy and validity) of the results. You will learn more about these types of reviews in a later episode.

However, people who want to work this way may have some questions about how to approach publishing the code. This is one of the (many) reasons we teach version control. When used diligently, version control with Git acts as a shareable electronic lab notebook for computational work:

The conceptual stages of your work are documented, including who did what and when. Every step is stamped with an identifier (the commit ID) that is for most intents and purposes unique.
You can tie documentation of rationale, ideas, and other intellectual work directly to the changes that spring from them.
You can refer to what you used in your research to obtain your computational results in a way that is unique and recoverable.
With a version control system such as Git, the entire history of the repository is easy to archive for perpetuity.
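
One practical way to act on these points is to record the commit ID next to the results it produced. The following is only a sketch of that idea: the script analysis.sh, the file results.txt, and the demo identity are hypothetical, and a real project would adapt the pattern to its own tools.

```shell
set -eu
work=$(mktemp -d)                              # throwaway directory
cd "$work"
git init --quiet .
git config user.name "Demo User"               # hypothetical identity, only for this demo
git config user.email "demo@example.com"
echo 'echo 42' > analysis.sh                   # hypothetical stand-in for a real analysis script
git add analysis.sh
git commit --quiet -m "Add analysis script"

# Warn if the code being run differs from what is committed.
if [ -n "$(git status --porcelain)" ]; then
  echo "warning: uncommitted changes; results may not match any commit" >&2
fi

commit_id=$(git rev-parse HEAD)                # identifier of the exact code version in use
{
  echo "# produced by commit $commit_id"
  sh analysis.sh
} > results.txt
cat results.txt
git log -1 --format='%h %an %ad %s'            # who did what and when
```

Anyone holding results.txt can later run git checkout with the recorded identifier to recover the exact code that generated it.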

Challenge

Challenge: Is There An Advantage To Publishing Scientific Code Using Version Control Software?

Publishing your scientific code as a Git repository is more open or valuable than publishing it as part of a data release. TRUE or FALSE?

Show me the solution

True. The advantages of publishing your scripts in a Git repository include:

Publishing the history of changes. This keeps a record of what methods were explored, prior versions and approaches, and what did not work well.
Keeping track of who authored what. Tracking helps authors receive credit for the work accomplished.
Providing an easy way to correct errors or make updates as new information becomes available.
Simplifying how others can access and use your code. Anyone can clone your repository and immediately start using your code.

Key Points

Open scientific work is more useful and more highly cited than closed
Publishing code is a critical part of making science reproducible
If your code is good enough to produce scientific results, then it is good enough to publish

Content from Policy

Last updated on 2025-11-24

Overview

Questions

What is an official USGS software information product?
When am I required to release my software as an official USGS software information product?
When may I release my software as an official USGS software information product?

Objectives

Identify the difference between a software project and an official USGS software information product.
Explain requirements for releasing software as an official USGS software information product.
Identify the policy hierarchy relationship among federal, agency, and USGS authorities.

Source Code | Definition & Example

Computer commands written in a computer programming language that are meant to be read by people. As such, source code is a higher-level representation of computer commands and, therefore, must be assembled, interpreted, or compiled before a computer can execute it as a program.

Example

CPP

// file: hello.cpp
#include <stdio.h>

int main() {
  printf("Hello world!\n");
  return 0;
}

The above is an example of a file called “hello.cpp” that contains source code written in the C++ programming language. While the source code is relatively easily understood by a human, a computer is not able to execute this file directly.

BASH

$ ./hello.cpp
./hello.cpp: line 3: syntax error near unexpected token `('
./hello.cpp: line 3: `int main() {'

Instead this file must be compiled to an executable using a command similar to the following:

BASH

$ gcc -o hello hello.cpp

The resulting file, “hello”, contains binary machine code that can be executed by the computer.

BASH

$ ./hello
Hello world!

While C++ is an explicitly compiled language, other languages are more subtle and may leverage just-in-time compilation or interpretation of the source code. In these languages there may be no explicit compilation command: only the source code file exists, and it is quietly compiled or interpreted behind the scenes upon execution.

Examples of these more subtle languages include Python or Shell scripts.
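As a minimal sketch of the contrast, the same greeting can be written in Python. The file name hello.py and the function name greeting are illustrative and not part of the original example. There is no separate compile command here; running `python hello.py` hands the source file straight to the interpreter.

```python
# file: hello.py (hypothetical counterpart to hello.cpp)

def greeting() -> str:
    """Return the same message the C++ example prints."""
    return "Hello world!"


if __name__ == "__main__":
    # No explicit compile step: the interpreter reads and runs the source.
    print(greeting())
```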

Source code developed by, or on behalf of, the U.S. Geological Survey that is, or is intended to be, publicly accessible must be stored within a Git repository on the USGS Git Hosting Platform. This Git repository may have multiple branches, tags, and commits. There may also be issue trackers, build artifacts, milestones, etc. Taken together, the activities and artifacts related to the prior, ongoing, or upcoming development of source code are considered a “software project”.

Official Software Information Product

When a software project reaches some level of maturity (e.g., results are used to support a published manuscript), it must be released as an “official USGS software information product”. While software projects may not be cited by other official USGS information product types (e.g., data releases, journal articles, etc.), official USGS software information products are citable. The desire to cite a software project is one circumstance that requires the author to release the project as an official USGS software information product; however, local policies (e.g., science center or equivalent organizational unit) may define additional criteria that require such a release.

An official USGS software information product reflects a point-in-time snapshot of a software project’s source code and relevant artifacts. This snapshot must be reviewed and receive appropriate approval to be made public. This snapshot is typically created using a Git tag in the repository and an associated GitLab release.

Open Source Software Project Development

A software project may be developed publicly as an open-source software project provided the project complies with all governing policies. Subject to limited exceptions, current policy requires that a software project be made open-source when specific criteria are met, for example:

The project, or results thereof, are deemed sufficient to be used by current or future research project(s)
A project that was contracted through a service contract vehicle is accepted by the federal contracting authority to satisfy contract requirements
The source code in the project is no longer considered truly exploratory or disposable in nature
The library or application produced by the software project is used by USGS or other federal staff on a regular, recurring basis
The library or application produces actionable information at scales and timeframes relevant to decision makers

When developing an open-source project, all contributions to the project must receive, at minimum, an administrative security review before the contribution is integrated into the project repository. Depending on the project, this review provides an opportunity to complete other types of review as well, e.g., technical code review.

Callout

Branching vs Forking Workflows

In this course we describe a branching workflow. This workflow is simpler for individual developers to understand when getting started with Git. However, a forking workflow may be better suited for open source project development.

Current policy requires all contributions to open source projects be reviewed before they are integrated with the public project. Since all branches in the public project are themselves public, there is no way to use a branching workflow and comply with current policy. Under a branching workflow, the author must determine a method for sharing and reviewing their code prior to pushing their changes to GitLab; this may be fragile or lead to unversioned changes.

Conversely, a forking workflow enables reviews to occur by way of merge requests from the internal fork repository location to the public upstream repository location prior to integrating said contributions with the public project repository. In this way open source development may continue collaboratively while adhering to current policy requirements.

Governing policies (see below) set the requirements for both open development practices and the release of official USGS software information products. These policies are defined at the Federal, Agency (Department of the Interior), Bureau (U.S. Geological Survey), and local (e.g., science center or equivalent organizational unit) levels.

Open Source Software Policy

Mission areas, science centers, and other organizational units within the U.S. Geological Survey may also define their own policies. Check with your supervisor for guidance on what might be applicable from local policies.

In general, the policies are structured in a hierarchy such that higher level policy (e.g., federal policy) provides generalized guidance and lower level policy provides increasing specificity and clarity. Lower level policy may not supersede or conflict with higher level policy.

Challenge

Challenge: Identifying relevant policies

Select the source(s) of policies governing the release of official USGS software information products.

A. Federal

B. Departmental

C. Bureau

D. Local

Show me the solution

Policies are known to exist for each of (A), (B), and (C) sources. Science Centers and offices may implement additional policies with which you must comply.

Requirements

The following are required to release a software project as an official USGS software information product.

Proper license, disclaimer, and metadata (code.json file)
Appropriate review(s) and approval as defined by current policy
An approved Information Product Data System (IPDS) record
A Git tag within the project corresponding to the official USGS software information product and a GitLab release corresponding to the Git tag
A digital object identifier (DOI)

Other episodes in this lesson detail procedures to satisfy each of the preceding requirements.
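Some of these requirements leave visible traces in the repository, so a quick local check is possible. The sketch below assumes the file names LICENSE.md, DISCLAIMER.md, and code.json that are used in other episodes of this lesson; the function name missing_release_files is hypothetical. It only looks for files, so it cannot confirm reviews, the IPDS record, the Git tag, or the DOI.

```python
from pathlib import Path

# File names as used in other episodes of this lesson (an assumption here).
REQUIRED_FILES = ("LICENSE.md", "DISCLAIMER.md", "code.json")


def missing_release_files(repo_root: str) -> list[str]:
    """Return the required files that are absent from the repository root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Check the current directory; an empty list means all three files exist.
    print(missing_release_files("."))
```

An empty list is a necessary but not sufficient sign that the project is ready; the remaining requirements still need to be confirmed by hand.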

Key Points

Software may be publicly accessible as an open source software project and/or as an official USGS software information product.
While both a project and product may be public, only the official USGS software information product is citable by other publications.
Governing policies cascade from Federal to local levels. Check with your supervisor to ensure compliance with all local policies.

Content from Licensing

Last updated on 2025-11-24

Overview

Questions

What licensing information should I include with my work?

Objectives

Explain why adding licensing information to a repository is important.
Choose a proper license.

Under U.S. copyright law, copyright protection automatically arises in original creative works that are fixed in any tangible medium of expression (e.g., a written work on paper, an audio/visual recording on tape, a sculptured work out of marble). However, an original work of the U.S. Government is not eligible for copyright protection in the United States (17 USC 105a). This restriction means that, as USGS employees, any original work that we create in the course of our official duties and responsibilities is automatically in the public domain.

Callout

DOI solicitor note 1

Depending on the jurisdiction, the U.S. Government may have foreign copyright protections in U.S. Government work. Further, 17 USC 105a does not prevent the U.S. Government from owning copyright (e.g., if a USGS contractor creates an original creative work under an agreement, copyright arises in the work to the contractor, and the USGS may obtain ownership of the copyright through a contract).

Instead, all software developed by the USGS should include a LICENSE.md markdown file to notify the public of the copyright status of the software. Why add a LICENSE.md at all? If we do not include a license clarifying that we have waived the copyright - thus making this fact explicit - the uncertainty (is there a copyright, is there not?) in the mind of a potential user could inhibit potential usage of said work, thus reducing its impact and value. When someone reuses a creative work without a license, the author of that work could sue for copyright infringement. A license solves this problem by explicitly granting rights to others (the licensees) that they would otherwise not have (or not know that they have).

Challenge

What licenses have I already accepted?

Many of the software tools we use on a daily basis (including in this workshop) are released as open-source software. Pick a project on GitHub from the list below, or one of your own choosing. Find its license (usually in a file called LICENSE or COPYING) and talk about how it restricts your use of the software.

Git$^1$, the source-code management tool
CPython$^1$, the standard implementation of the Python language
Jupyter$^1$, the project behind web-based Python notebooks
R software$^1$, read-only mirror of the R software source code

Example solution

Both R software and Git use the GNU General Public License (GPL), one of the most commonly used series of licenses for free and open-source software. One way in which it differs from the CC0 Public Domain license (more detail on that below) is that it specifies all derivative work must be distributed under the same or equivalent license terms, which is important for keeping open software open. In other words, an open-source license such as those in the GNU GPL series differs greatly from CC0: the former places certain restrictions on the use, copying, and redistribution of the software, while the latter places no restrictions whatsoever.

Callout

What is a markdown (.md) file?

Markdown (.md) files are plain text files that allow you to format text. They are commonly used for documentation, README files, and writing content for the web. Markdown allows users to easily add formatting elements such as headings, lists, links, and images without needing to write HTML code. For example, you can create headings by using the # symbol and create bullet points with - or *.

A cheatsheet for Markdown syntax (along with other cheatsheets) can be found here.

What rights are being granted under which conditions differs, often only slightly, from one license to another. The Creative Commons Public Domain Dedication (CC0) is the most commonly used ‘license’ at USGS (currently CC0 1.0$^1$). It assumes the software is either completely original or uses only other software that is also under the CC0 license. This license places the work as completely as possible in the public domain so that it is free for others to build upon, enhance, or reuse. It should work for most USGS software, assuming that it was developed solely by federal employees and does not include any software developed by others that is not publicly dedicated. The text for this license is included in a callout box below. You can add a LICENSE.md file in the root of your project repository and copy and paste the text below.

Callout

DOI solicitor note 2

If the USGS wishes to release software originally created by a federal contractor, it may either:

require the contractor to release the software under a CC0 public domain dedication, or
require the contractor to assign all intellectual property rights, title, and interest in the software to USGS

and then release the software under a CC0 public domain dedication.

For contractor positions that work closely alongside federal positions, please refer to the Contracting Officer for questions concerning how code sharing is addressed in the contract.

Note that you can also use this Copyright Dedication Agreement to formally place materials in the public domain.

Callout

When your product includes work under copyright:

If your code includes code developed by others, you will need to consider the license of the code developed by others. Any code that is used in USGS software products must be used with permission from the copyright holder or in accordance with the license, and should be marked as such. Do not assume that you can release such code under a CC0 public domain dedication.

If you have any questions or concerns regarding which license to use, please reach out to the DOI solicitor’s office for guidance.

CC0 1.0 license text

MARKDOWN


# License

Unless otherwise noted, this project work is in the public domain in the United
States because it is a work of the United States Geological Survey, an agency
of the United States Department of Interior. For more information, see the
official USGS copyright policy at
https://www.usgs.gov/information-policies-and-instructions/copyrights-and-credits

Additionally, the USGS waives all copyright and related rights in the work
worldwide through the CC0 1.0 Universal public domain dedication.


## CC0 1.0 Universal Summary

This is a human-readable summary of the
[Legal Code (read the full text)][1].


### No Copyright

The person or entity who associated a work with this deed has dedicated the
work to the public domain by waiving all of his or her rights to the work
worldwide under copyright law, including all related and neighboring rights,
to the extent allowed by law.

You can copy, modify, distribute and perform the work, even for commercial
purposes, all without asking permission.


### Other Information

In no way are the patent or trademark rights of any person affected by CC0,
nor are the rights that other persons may have in the work or in how the
work is used, such as publicity or privacy rights.

Unless expressly stated otherwise, the person who associated a work with
this deed makes no warranties about the work, and disclaims liability for
all uses of the work, to the fullest extent permitted by applicable law.
When using or citing the work, you should not imply endorsement by the
author or the affirmer.



[1]: https://creativecommons.org/publicdomain/zero/1.0/legalcode

Challenge

Create a LICENSE.md for your repository

Using either Git Bash or the GitLab browser, add an appropriate license to the main branch of your vampires-and-werewolves repository.

Show me the solution

Here is a solution using Git Bash:

Create a LICENSE.md file and open it in nano:

BASH

nano LICENSE.md

Paste in the CC0 1.0 license text from above (note that you may need to paste by right-clicking with your mouse instead of using the keyboard shortcut CTRL+V).

Save the text and close nano (CTRL+X, Y, Enter).

BASH

git add LICENSE.md
git commit -m "Add license to repository"
git push origin main

If you choose to create the LICENSE.md file in GitLab, make sure you pull the new file locally:

BASH

git pull origin main

Key Points

A LICENSE file is often used in a repository to indicate how the contents of the repo may be used by others.
USGS software products require a LICENSE.md file in the project root of your repository.
Non-derivative USGS software products can use the CC0 1.0 license.
If you need a different license, consult the solicitor’s office to determine the appropriate license.

1: non-Federal link

Content from Citation

Last updated on 2025-11-24

Overview

Questions

How do I create a digital object identifier (DOI)?
How can I make my work easy to cite?

Objectives

Learn how to create a digital object identifier (DOI).
Make your work easy to cite.

All USGS Software Information Products are required to have a digital object identifier (DOI) assigned to them. A DOI is a persistent identifier tied to a unique object that you specify. USGS uses the Asset Identifier Service to reserve and manage DOIs for software products. You can reserve a DOI by providing the Title of your Software Product and the USGS Science Center or Program responsible for the software project (note: some work units have a designated person create all DOIs; please reach out to your supervisor before creating a DOI to find out about the local process). Remember not to activate/publish the DOI to DataCite until you have received official approval to release the software product. You will learn more about activating/publishing the DOI in a later episode.

Never create a fake DOI, even for the purposes of creating an example citation file for this course!

Once you have the DOI, you can write a suggested citation by including the reserved DOI that you receive from the Asset Identifier Service in your citation as a full URL (e.g., https://doi.org/10.5066/xxxxxxxx). For example:

Dracula, V. and Wolfman, L.T., 2024, Vampires and Werewolves, version 1.0.0: U.S. Geological Survey software release, https://doi.org/10.5066/xxxxxxxx.

You can place the suggested citation in the README.md file in your root directory, making it easy to find.
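If you maintain several releases, assembling the citation from its parts can help keep it consistent. The following sketch simply follows the pattern of the example citation above. The function name suggested_citation and its parameters are hypothetical, and the DOI is the same placeholder used in this lesson, not a real identifier.

```python
def suggested_citation(authors: str, year: int, title: str,
                       version: str, doi: str) -> str:
    """Assemble a citation string in the pattern shown in this episode."""
    return (
        f"{authors}, {year}, {title}, version {version}: "
        f"U.S. Geological Survey software release, https://doi.org/{doi}."
    )


if __name__ == "__main__":
    # The DOI below is the lesson's placeholder; substitute your reserved DOI.
    print(suggested_citation(
        authors="Dracula, V. and Wolfman, L.T.",
        year=2024,
        title="Vampires and Werewolves",
        version="1.0.0",
        doi="10.5066/xxxxxxxx",
    ))
```

The author list is passed in as one preformatted string so the sketch does not have to guess how longer author lists should be punctuated.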

Callout

Try adding a CITATION.md file

Although not required for a USGS Software Information Product, you may want to consider adding a CITATION.md file that describes how to reference or cite your project. You can include a plain text version of the citation that’s easy to copy and paste as well as a BibTeX entry.

Here’s an example of what Dracula would write in his CITATION.md file:


To reference the Vampires and Werewolves software product in a publication, you can cite:

Dracula, V. and Wolfman, L.T., 2024, Vampires and Werewolves, version 1.0.0: U.S. Geological Survey software release, https://doi.org/10.5066/xxxxxxxx.

Or, for the BibTeX entry, use:
```
@software{dracula-vampires-werewolves-2024,
  author      = {Dracula, Vlad and Wolfman, L.T.},
  title       = {Vampires and Werewolves},
  version     = {1.0.0},
  year        = {2024},
  doi         = {10.5066/xxxxxxxx}
}
```

The second part of that documentation is a BibTeX entry, which can be ingested by some bibliography software. If there is an associated publication, you can add that to CITATION.md too.

To explore this topic in more detail, check out the Software Sustainability Institute blog or the FORCE11 Software Citation Group’s citation principles.

Key Points

Create a DOI for your software information product.
Add a suggested citation to your repository.

Content from Commonly Included Files

Last updated on 2025-11-24

Overview

Questions

What are some files which are usually included in USGS software projects, and what should their content be?

Objectives

Draw awareness to common “boilerplate” files which are usually included in software products.
Provide usable examples of each of these.

Additional Files

Several additional files are commonly included in code repositories, and here we review four of these: DISCLAIMER.md, README.md, CONTRIBUTING.md, and CODE_OF_CONDUCT.md. Of these four files, only the DISCLAIMER.md is required for USGS software information products. The other three are recommended and often found in USGS repositories, but not required.

We describe below how to create each of these files. Templates from CHS are also available.

Disclaimers

All USGS software information products must contain appropriate disclaimers. This is unique among the files discussed here. While the others are strongly recommended, they are not required by Fundamental Science Practices (FSP).

The location of the disclaimer must be given as part of the code.json metadata which accompanies USGS software information products (see episode Creating Metadata).

The disclaimer used for open-source software projects must be different from the one used for official USGS software information products.

Callout

Provisional disclaimers

The provisional disclaimer must remain in any branch or tag which does not represent an official USGS software information product. The official disclaimer may only be used in tags (or temporarily in release-candidate branches working towards a tag) that represent official USGS software information products.

For more information on this, see the Reviews for Authors lesson on preparing the release branch and the Publishing lesson on managing tags.

Open-source software projects

For an open-source software project, appropriate content for the DISCLAIMER.md may be found in section 11 of the FSP Guidance on Disclaimer Statements Allowed in USGS Science Information Products. This disclaimer is sometimes referred to as the “provisional” disclaimer.

Official USGS Software Information Product

For an official USGS software information product, appropriate content for the DISCLAIMER.md may be found in section 5 of the FSP Guidance on Disclaimer Statements Allowed in USGS Science Information Products.

Challenge

Add a DISCLAIMER.md file

Using either Git Bash or the GitLab browser interface, add the appropriate disclaimer text to the main branch of your vampires-and-werewolves repository.

Show me the solution

Here is a solution using the GitLab browser:

First, create a new file:

Screenshot showing a red circle around where to click to create a new file in a GitLab repository

Then, copy the software text found in section 11 of the FSP Guidance on Disclaimer Statements Allowed in USGS Science Information Products.

Change the name of the file to DISCLAIMER.md, paste the text into your new file, and commit the new file:

Screenshot showing red boxes around where to click to rename the file, where to add the new text, and where to click "Commit"

Last, write a useful commit message and click commit to add the new, provisional DISCLAIMER.md file to your repository:

Screenshot showing the box where you can write a commit message in the Gitlab interface

You should now see the new file in the top level of your repository. To see it locally, you will need to pull in the new changes:

BASH

git pull origin main

Readme

Almost all code repositories contain a README.md which is rendered to text on the project’s GitLab landing page. This file should give a human-readable description of the project, and it usually contains basic information such as a summary description, author names, background/context, and applications. It should also give pointers to relevant information about the project contained in other files. For example, there might be a “Contributing to this project” section which points to a CONTRIBUTING.md file, or an R library’s README.md might point users to that library’s vignettes.

For some examples of effective README files in USGS projects, see

the dataRetrieval R package (also available as a nicely rendered GitLab Pages site)
the ISIS3 software
the EGRET R package (also available as a nicely rendered GitLab Pages site)

For all published USGS software information products, there is an associated Drupal page. You can modify this page with additional info that may be helpful to anyone interested in learning more. You may need to ask a certified content manager to help you make these edits.

Contributing

If you are willing to accept contributions from outside your team, you can include a CONTRIBUTING.md which explains your project’s policies and procedures for doing so. An example is below.

Callout

Customize the example for your project

Before using the example below, replace the placeholder text for [1] and [2] with the URLs for your project's issues page and license. The fork and merge request links point to GitLab documentation; if your project is on GitHub, substitute the equivalent GitHub documentation URLs.

MARKDOWN

Contributing
============

Contributions are welcome from the community. Questions can be asked on the
[issues page][1]. Before creating a new issue, please take a moment to
search and make sure a similar issue does not already exist. If one does
exist, you can comment (most simply even with just a `:+1:`) to show your
support for that issue.

If you have direct contributions you would like considered for incorporation
into the project you can [fork this
repository](https://docs.gitlab.com/ee/user/project/repository/forking_workflow.html#create-a-fork)
and [submit a merge
request](https://docs.gitlab.com/ee/user/project/merge_requests/creating_merge_requests.html#when-you-work-in-a-fork)
for review. Please note that all contributions will be considered public domain
(see [license][2] for details).

[1]: Replace this text with the URL for your project's issues page
[2]: Replace this text with the URL for your project's license

Code of Conduct

This is a file, typically called CODE_OF_CONDUCT.md, that describes expected conduct from users contributing to the project. At a minimum this file must specify that all contributions to the project must abide by the USGS Code of Scientific Conduct. It is also appropriate for it to include further language specifying expectations for contributors’ behavior as part of the project’s community. A suitable example of such a file’s contents follows:

MARKDOWN

# Contributor Code of Conduct

All contributions to, and interactions surrounding, this project will abide
by the [USGS Code of Conduct][1].

[1]: https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/USGS-Code-of-Conduct-2023.pdf

We are committed to making participation in this project a harassment-free
experience for everyone, regardless of level of experience, gender, gender
identity and expression, sexual orientation, disability, personal
appearance, body size, race, ethnicity, age, or religion.

Examples of unacceptable behavior by participants include the use of sexual
language or imagery, derogatory comments or personal attacks, trolling,
public or private harassment, insults, or other unprofessional conduct.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct. Project maintainers who do
not follow the Code of Conduct may be removed from the project team.

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by opening an issue or contacting one or more of the project
maintainers.

This Code of Conduct is adapted from the Contributor Covenant, version 1.0.0

Challenge

Which file must be included in all USGS software repositories?

There are conventions for files included in software repositories that explain the purpose of the repository or how its team works. Many of these are recommended but optional. One, however, is mandatory. Which file is mandatory, and why?

Show me the solution

The DISCLAIMER.md is mandatory in published USGS software repositories, because it is required by FSP.

Key Points

USGS software products typically contain “boilerplate” files.
Some of these files, like the DISCLAIMER.md, are mandatory and must be included in all USGS software products. Others are optional.
Examples of these files may be found in existing projects, or on this page.

Content from Creating Metadata

Last updated on 2025-11-24

Overview

Questions

What is a code.json file?
How do you create a code.json file?
What are the required fields in the code.json file for a USGS software project?

Objectives

Explain what a code.json file is and how it is used.
Create a code.json file with the minimum required fields for a USGS software project and software information products.
Validate a code.json file.
Update a code.json file for a new version of the software.

Introduction

Metadata are descriptive elements in a standardized format that are necessary for identification, discovery, access, and use of information products such as software and data. Metadata answer fundamental questions such as who, what, when, where, why, and how.

Metadata for a software project are stored and maintained in a file called code.json located at the top-level of the project repository in GitLab. This code.json file is in JavaScript Object Notation (JSON) format. The code.json file provides basic information about the project and official software information products and will be aggregated with the information from other Department of the Interior software projects to form the Departmental Enterprise Code Inventory, which is required by the Federal Source Code Policy. The code.json file is required for software information products but may be created for projects without official software information products.

JSON Overview

The JSON data format allows machine-to-machine communication with structured text. JSON is language agnostic.

JSON Syntax:

Use key/value pairs
- keys are strings, indicated by double quotes
- values can be:
  - strings ("Vlad Dracula"),
  - numbers (1.5),
  - objects ({"key": "value", "key2": "value2"}),
  - arrays ([lists]),
  - boolean (true / false), or
  - null
- separate keys from values with a colon
  - Format: "key": "value"
  - Example: "name": "Vlad Dracula"

Separate key/value pairs with commas:

JSON

{
    "name": "Vlad Dracula",
    "organization": "U.S. Geological Survey"
}

Callout

Note: Many other languages (e.g., Python) allow trailing commas; however, trailing commas are considered an error in JSON syntax. For example, the following would give you an error:

JSON

{
  "name": "Vlad Dracula",
  "organization": "U.S. Geological Survey",
}

Generally, a JSON file will contain an object or an array. If it is an object, it will start and end with curly brackets {}. If it is an array, it will start and end with square brackets [].
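To see how these two top-level shapes look once parsed, here is a small sketch using Python's standard json module. The keys released and doi are invented for illustration and are not part of any USGS schema.

```python
import json

# A JSON object (curly brackets) is parsed into a Python dict.
person = json.loads(
    '{"name": "Vlad Dracula", "laborHours": 0, "released": false, "doi": null}'
)

# A JSON array (square brackets) is parsed into a Python list.
people = json.loads('[{"name": "Vlad Dracula"}, {"name": "L.T. Wolfman"}]')

print(type(person).__name__)   # top-level object
print(type(people).__name__)   # top-level array

# JSON true/false/null map to Python True/False/None.
print(person["released"], person["doi"])
```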

Let us create a JSON file in our GitLab project space with the filename hello-world.json.

In the web browser, add the following content to hello-world.json:

JSON

{
    "greeting": "hello-world"
}

Notice that in our example, our JSON represents an object since it starts with curly brackets. Also, notice that the GitLab web editor provides some highlighting and indentation assistance similar to what a desktop editor might provide.

You can use a JSON Validator like JSON Formatter & Validator to format and check your JSON.

Let us try adding a trailing comma in our JSON and validating it:

JSON

{
    "greeting": "hello-world",
}

The JSON Formatter & Validator will tell you what it found wrong and attempt to fix it for you:

OUTPUT

Info: Removed trailing comma.
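
If you would rather check the syntax locally, a similar check can be sketched with Python's standard json module. This is only an illustration: the helper name check_json is made up here, and the exact wording of the error message differs between Python versions, so none is shown.

```python
import json


def check_json(text):
    """Return (True, parsed data) for valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


good = '{\n    "greeting": "hello-world"\n}'
bad = '{\n    "greeting": "hello-world",\n}'  # trailing comma

print(check_json(good))
print(check_json(bad))
```

Unlike the online tool, this check does not try to repair the file; it only reports where parsing stopped.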

Metadata Template

USGS provides a code.json template (see below) to help you get started writing project metadata. Notice that its top-level element is an array, which is designated by the square brackets.

JSON

[
  {
    "name": "REPOSITORY_NAME",
    "organization": "U.S. Geological Survey",
    "description": "REPOSITORY_DESCRIPTION",
    "version": "RELEASE_VERSION",
    "status": "RELEASE_STATUS",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/raw/RELEASE_VERSION/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME",
    "downloadURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/archive/RELEASE_VERSION/REPOSITORY_NAME-RELEASE_VERSION.zip",
    "disclaimerURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/raw/RELEASE_VERSION/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME.git",
    "vcs": "git",

    "laborHours": 0,

    "tags": [
      "TOPIC_TAG_1",
      "TOPIC_TAG_2"
    ],

    "languages": [
      "PROGRAMMING_LANG_1",
      "PROGRAMMING_LANG_2"
    ],

    "contact": {
      "name": "REPOSITORY_ADMINISTRATOR_NAME",
      "email": "REPOSITORY_ADMINISTRATOR_EMAIL"
    },

    "date": {
      "metadataLastUpdated": "YYYY-MM-DD"
    }
  }
]

Create a Metadata File in GitLab

Create a new code.json file at the top level of your GitLab repository:

Paste the template JSON into the file, add a commit message, and click Commit changes:

Screenshot of writing a commit message for adding the code.json file to a GitLab repository

Add Project-Specific Information

Now, edit the code.json file to include project-specific information. While viewing the code.json file in GitLab, click Edit and Edit single file:

Screenshot of clicking Edit on the code.json file in the web browser

Replace the ALL_CAPS placeholders with meaningful values for the project. For the purposes of this exercise, the project includes code for modeling the co-occurrence of Vampires and Werewolves on Mars. The project team is actively developing the code. Eventually, they will release a USGS software information product in the public domain. This particular metadata object will document the entire project as opposed to a single product, so use “main” as the version. The project uses machine learning / artificial intelligence techniques and the code is written in Python.
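
A leftover placeholder is easy to miss in a long file. The following sketch, which is an assumption about how you might check and not part of any USGS tooling, searches text for upper-case tokens that contain an underscore (such as REPOSITORY_NAME) and for the literal YYYY-MM-DD date placeholder. The function name find_placeholders and the sample snippet are hypothetical.

```python
import re

# ALL_CAPS tokens with at least one underscore (e.g., REPOSITORY_NAME),
# plus the literal date placeholder used in the template.
PLACEHOLDER = re.compile(r"\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b|YYYY-MM-DD")


def find_placeholders(text):
    """Return the sorted, de-duplicated placeholders still present in text."""
    return sorted(set(PLACEHOLDER.findall(text)))


# Hypothetical, partially edited fragment of a code.json file.
snippet = """
{
  "name": "vampires-and-werewolves",
  "version": "RELEASE_VERSION",
  "homepageURL": "https://code.usgs.gov/GROUP_HIERARCHY/vampires-and-werewolves",
  "date": {"metadataLastUpdated": "YYYY-MM-DD"}
}
"""

print(find_placeholders(snippet))
```

An empty list suggests that every placeholder matching these patterns has been replaced; it does not confirm that the replacement values are correct.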

GROUP_HIERARCHY is the group name under which your project is nested in GitLab. The GROUP_HIERARCHY may be one level if you are working out of a personal space (e.g., vdracula) or it may be a nested hierarchy (e.g., ecosystems/FRESC).
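
Several URLs in the template are built from the same three pieces: GROUP_HIERARCHY, REPOSITORY_NAME, and RELEASE_VERSION. The sketch below assembles them following the patterns in the template. The function name build_urls is hypothetical, and licenseURL is only a label used in this sketch for the URL that belongs under permissions, licenses in code.json.

```python
BASE = "https://code.usgs.gov"


def build_urls(group_hierarchy, name, version):
    """Assemble the template's URLs for one project and version (or branch)."""
    project = f"{BASE}/{group_hierarchy}/{name}"
    raw = f"{project}/-/raw/{version}"
    return {
        # Sketch-only label; in code.json this value is permissions.licenses[].URL.
        "licenseURL": f"{raw}/LICENSE.md",
        "homepageURL": project,  # no .git extension
        "downloadURL": f"{project}/-/archive/{version}/{name}-{version}.zip",
        "disclaimerURL": f"{raw}/DISCLAIMER.md",
        "repositoryURL": f"{project}.git",  # must end with .git
    }


urls = build_urls("vdracula", "vampires-and-werewolves", "main")
for key, value in urls.items():
    print(f"{key}: {value}")
```

A nested hierarchy such as ecosystems/FRESC can be passed as group_hierarchy in the same way.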

Below are the field definitions for code.json and examples of how the template can be updated:

name: Should be a short, human-readable name for the project. This should match the value provided when creating the project in GitLab. The best practice is to use lowercase words separated by hyphens.

JSON

"name": "vampires-and-werewolves"

organization: Must always be "U.S. Geological Survey"; casing and punctuation are important. No updates are needed to the template.

JSON

"organization": "U.S. Geological Survey"

description: This may be a longer description of the project. It should be no more than 1-2 sentences. Verbose descriptions may exist in the README.md file.

JSON

"description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars."

version: This should be a semantic version number for the product (e.g., 1.0.0) or the DEFAULT_BRANCH name (e.g., main or master), depending on whether the metadata object is referencing an information product or the project. The version number should not include a leading v (e.g., v1.0.0) or other identifier. A Git branch (release candidate branch) must exist with the same name (e.g., 1.0.0) during the review process. Upon publication, the version branch is converted to a tag. (We will discuss release tags in more detail in a future episode.)
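
The version rules can be checked with a short function. This is a minimal sketch with a hypothetical name, classify_version, and it assumes that only plain MAJOR.MINOR.PATCH numbers count as semantic versions; pre-release or build suffixes are treated as invalid here.

```python
import re

# Plain MAJOR.MINOR.PATCH only (an assumption made for this sketch).
SEMVER = re.compile(r"\d+\.\d+\.\d+")


def classify_version(version, default_branches=("main", "master")):
    """Classify a code.json version value."""
    if version in default_branches:
        return "default branch"
    if SEMVER.fullmatch(version):
        return "semantic version"
    return "invalid"


for candidate in ("main", "1.0.0", "v1.0.0"):
    print(candidate, "->", classify_version(candidate))
```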

JSON

"version": "main"

status: Must be one of the enumerated values listed below. There are no official definitions for these terms in code.gov; however, Wikipedia provides some good definitions, which are paraphrased below.
- Ideation: planning phase of a software project.
- Development: work on software project prior to formal testing.
- Alpha: initial testing phase, often done within the project team or organization.
- Beta: feature complete testing phase that follows Alpha testing, often available to users outside project team or organization.
- Release Candidate: a Beta version with the potential to be ready for production. In USGS, a release candidate would be going through formal review and approval.
- Production: the product has passed all stages of testing. In USGS, a production release has been reviewed and approved.
- Archival: a version of the software that is no longer supported.
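
To catch a misspelled or miscased status before committing, the enumerated values can be kept in one place and compared against. The sketch below assumes the spellings listed above are the exact allowed strings; the helper name normalize_status is hypothetical.

```python
STATUSES = (
    "Ideation",
    "Development",
    "Alpha",
    "Beta",
    "Release Candidate",
    "Production",
    "Archival",
)


def normalize_status(status):
    """Return the listed spelling for status, or None if it is not in the list."""
    wanted = status.strip().lower()
    for allowed in STATUSES:
        if wanted == allowed.lower():
            return allowed
    return None


print(normalize_status("development"))
print(normalize_status("Released"))
```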

JSON

"status": "Development"

permissions
- usageType: One of the following enumerated values, which describes the usage permissions for the release:
  1. openSource: Open source
  2. governmentWideReuse: Government-wide reuse
  3. exemptByLaw: The sharing of the source code is restricted by law or regulation, including—but not limited to—patent or intellectual property law, the Export Asset Regulations, the International Traffic in Arms Regulation, and the Federal laws and regulations governing classified information
  4. exemptByNationalSecurity: The sharing of the source code would create an identifiable risk to the detriment of national security, confidentiality of Government information, or individual privacy
  5. exemptByAgencySystem: The sharing of the source code would create an identifiable risk to the stability, security, or integrity of the agency’s systems or personnel
  6. exemptByAgencyMission: The sharing of the source code would create an identifiable risk to agency mission, programs, or operations
  7. exemptByCIO: The CIO believes it is in the national interest to exempt sharing the source code
  8. exemptByPolicyDate: The release was created prior to the M-16-21 policy (August 8, 2016)
- licenses
  - name: The name of the license under which the product is released (e.g., Public Domain, CC0-1.0). In most cases, the appropriate license for USGS products is Public Domain, CC0-1.0, but sometimes (e.g., when some of the code is from outside sources or collaborators) different licenses are required. For more information on selecting an appropriate license see the Licensing episode in this Lesson.
  - URL: A link to the LICENSE.md file stored in this project
    - Must reference the main or master branch (this will differ for an official product, which should point to the immutable tagged version)
    - Must use the raw variant of the file, which provides access to the plain text of the file and not the GitLab-formatted text. To get the raw variant of a file, click into the file, and click the Open raw button next to the Download button:

JSON

"permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
        }
      ]
    }

homepageURL*: A link to the project homepage
- May point to the project on GitLab, but will not include the .git extension
- May point to a project home page elsewhere as long as it is publicly accessible (or soon-to-be publicly accessible, once you have gone through the release process) and in an approved location (e.g., usgs.gov webpage as opposed to a personal website)

JSON

"homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves"

downloadURL: A link to download a ZIP archive of the project source code
- Must point to the main or master branch (this will differ for an official product, which should point to the immutable tagged version)
- In GitLab, you can get the download URL by selecting Code -> right-click zip (under Download source code) -> Copy Link:

JSON

"downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip"

disclaimerURL: A link to the DISCLAIMER.md file stored in this project
- Must use the raw variant of the file, which provides access to the plain text of the file and not the GitLab-formatted text
- Must point to the main or master branch (this will differ for an official product, which should point to the immutable tagged version)

JSON

"disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md"

repositoryURL*: A link to this project on GitLab
- Must include the .git extension

*Note: homepageURL and repositoryURL are different. repositoryURL should end with .git whereas the homepageURL should not.

JSON

"repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git"

vcs: A lowercase string with the name of the version control system that is being used. For USGS, this will be git. No updates are needed to the template.

JSON

"vcs": "git"

laborHours: An estimate of total labor hours spent by your organization across the current version and all previous versions, including labor performed by federal employees and contractors. Labor hours are cumulative across all versions. Your best guess is fine. If not known, the recommendation is to use -1.

JSON

"laborHours": 0

tags: An array of topical/domain tags relevant to the project
- Consider using the USGS Thesaurus or other controlled vocabularies to improve browse functionality in the code inventory.
- These tags can be used to help people narrow down searches for software, so consider terms that will help direct potential users to your project
- If the project supports AI/ML research and development, this array must include the tag usg-artificial-intelligence. This tag is short for U.S. Government Artificial Intelligence (i.e., do not use “usgs-artificial-intelligence”).

JSON

"tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ]
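
The AI/ML tag rule is easy to get wrong by one letter. A minimal sketch of a check follows; the function name check_ai_tag and its uses_ai parameter are hypothetical, and whether a project counts as AI/ML is a judgment the author still has to make.

```python
AI_TAG = "usg-artificial-intelligence"
WRONG_AI_TAG = "usgs-artificial-intelligence"  # common misspelling


def check_ai_tag(tags, uses_ai):
    """Return a list of problems with the AI/ML tag; empty means none found."""
    problems = []
    if WRONG_AI_TAG in tags:
        problems.append(f'replace "{WRONG_AI_TAG}" with "{AI_TAG}"')
    if uses_ai and AI_TAG not in tags:
        problems.append(f'add the required tag "{AI_TAG}"')
    return problems


print(check_ai_tag(["usg-artificial-intelligence", "vampires"], uses_ai=True))
print(check_ai_tag(["vampires", "werewolves"], uses_ai=True))
```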

languages: An array of the programming languages used within this project (e.g., “Python”, “R”, “C++”). There is not a controlled vocabulary, so use your best judgement on how to represent the programming languages in your project.

JSON

"languages": [
      "Python"
    ]

contact: Point of contact information for the software information product.

JSON

"contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    }

date
- metadataLastUpdated: An ISO datestamp (YYYY-MM-DD) of when the metadata item within the code.json file was last modified. Be sure to update this value whenever you modify any of the other key/value pairs for this metadata item. Note that you must use two digits for month and day (e.g., 2024-8-9 is not correct).

JSON

"date": {
      "metadataLastUpdated": "2024-05-29"
    }
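
The two-digit requirement can be verified in code. In the sketch below, which uses the hypothetical name is_valid_metadata_date, a regular expression enforces the YYYY-MM-DD shape because datetime.strptime on its own also accepts single-digit months and days; strptime is then used only to reject impossible calendar dates.

```python
import re
from datetime import datetime


def is_valid_metadata_date(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    # Shape check: four digits, two digits, two digits.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    # Calendar check: rejects values such as 2024-02-30.
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


for candidate in ("2024-05-29", "2024-8-9", "2024-02-30"):
    print(candidate, is_valid_metadata_date(candidate))
```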

Callout

Personal Space in GitLab

In the examples above, the URLs that we are generating reference Vlad Dracula’s or your own personal GitLab space. In reality, you cannot make a repository public that is located under a personal username. Instead, public repositories need to be located under a public group. The current recommendation is to have groups at the USGS Mission Area level (e.g., Ecosystems) and then subgroups at the USGS Science Center level. Project repositories will then be located within the Science Center subgroup. To avoid needing to rename all of your URLs, it is a best practice to start projects within these public groups and maintain more restrictive permissions at the project level.

This is what the full code.json file should look like after making the updates above:

JSON

[
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "main",
    "status": "Development",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",

    "laborHours": 0,

    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],

    "languages": [
      "Python"
    ],

    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },

    "date": {
      "metadataLastUpdated": "2024-05-29"
    }
  }
]

Discussion

Challenge

Use JSON Formatter & Validator to format and check your JSON. What errors were present in your JSON? Note that this tool only validates against the JSON syntax and does not validate against the code.gov metadata schema.
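
Because a syntax validator cannot tell you whether a field is missing, a small structural check can complement it. The sketch below is not the code.gov schema: it assumes the top-level keys of the USGS template shown earlier are the expected set, and the function name missing_keys is hypothetical.

```python
import json

# Top-level keys of each metadata object in the USGS template (an assumption
# for this sketch; the official code.gov schema is the authority).
TEMPLATE_KEYS = [
    "name", "organization", "description", "version", "status",
    "permissions", "homepageURL", "downloadURL", "disclaimerURL",
    "repositoryURL", "vcs", "laborHours", "tags", "languages",
    "contact", "date",
]


def missing_keys(code_json_text):
    """Map each metadata object's index to the template keys it lacks."""
    releases = json.loads(code_json_text)
    report = {}
    for index, release in enumerate(releases):
        missing = [key for key in TEMPLATE_KEYS if key not in release]
        if missing:
            report[index] = missing
    return report


truncated = '[{"name": "vampires-and-werewolves", "version": "main", "status": "Development"}]'
print(missing_keys(truncated))
```

An empty dictionary means every metadata object contains all of the template's top-level keys; nested fields and values are not checked.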

Callout

Additional `code.json` fields

Additional fields are also available. See the official code.gov metadata schema for additional details. Note that you should only add fields from the “releases” array within this schema. The full code.gov metadata schema includes other fields that are necessary for building the Enterprise Code Inventory, but those should not be included in the individual project code.json files. Fields that are not documented in the official code.gov metadata schema cannot be included in the code.json files.

Updating Metadata for Initial Software Information Product

Remember that the top-level element in the code.json file is an array. This means it may contain more than one object for your project. The recommended practice is to order metadata objects with the DEFAULT_BRANCH (e.g., main) appearing first, followed by the most recently released version. For an initial software information product release, it would look something like this (please note the metadata objects are truncated for demonstration purposes):

JSON

[
 {
    "name": "vampires-and-werewolves",
    "version": "main",
    "status": "Development"
 },
 {
    "name": "vampires-and-werewolves",
    "version": "1.0.0",
    "status": "Production"
 }
]

Metadata evolve over time. A common misconception is that the metadata in the main branch should describe only the main branch code and not any other branches. In fact, the code.json file in the DEFAULT_BRANCH (e.g., main) should contain a metadata object for each version of the project (official or otherwise). The code.json file in the tag for a specific version should contain metadata for that version and all preceding versions; in this way, it matches the metadata in the main branch at the time the version was created.
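
The recommended ordering can be sketched as a small sort. The lesson only states that the default branch comes first, followed by the most recent release; placing older releases after that in descending order is an assumption of this sketch, as is the function name order_releases. Versions are assumed to be plain MAJOR.MINOR.PATCH strings.

```python
def order_releases(releases, default_branch="main"):
    """Put the default-branch object first, then releases from newest to oldest."""

    def version_key(release):
        # Compare versions numerically so that 1.10.0 sorts after 1.9.0.
        return tuple(int(part) for part in release["version"].split("."))

    default = [r for r in releases if r["version"] == default_branch]
    versions = [r for r in releases if r["version"] != default_branch]
    return default + sorted(versions, key=version_key, reverse=True)


# Truncated metadata objects, as in the example above.
releases = [
    {"version": "1.0.0", "status": "Archival"},
    {"version": "main", "status": "Development"},
    {"version": "2.0.0", "status": "Production"},
    {"version": "1.10.0", "status": "Archival"},
]

ordered = order_releases(releases)
print([r["version"] for r in ordered])
```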

Visualizing Metadata Through Time

The following sections demonstrate what the code.json file looks like over time in a repository. The diagram was created at https://learngitbranching.js.org, and the commands are provided if you’d like to try it out yourself!

Initial Development

Initial project development occurs in the main branch and code.json metadata is created to describe the project:

BASH

git commit -m "Add code.json to main branch"

Version 1.0.0

The code.json metadata is updated to include version 1.0.0 metadata, the 1.0.0 release candidate branch is created, the DISCLAIMER.md file is updated with the approved disclaimer statement, and version 1.0.0 is released:

BASH

git commit -m "Update code.json with version 1.0.0 metadata"
git switch -c 1.0.0
git commit -m "Update DISCLAIMER in 1.0.0 branch to approved"

Continued Development

After the release of version 1.0.0, project development continues in the main branch:

BASH

git switch main
git commit -m "Continue developing in main branch"

Version 2.0.0

After more development, the team is ready to release version 2.0.0. The code.json metadata is updated to include version 2.0.0 metadata, the 2.0.0 release candidate branch is created, the DISCLAIMER.md file is updated with the approved disclaimer statement, and version 2.0.0 is released:

BASH

git commit -m "Update code.json with version 2.0.0 metadata"
git switch -c 2.0.0
git commit -m "Update DISCLAIMER in 2.0.0 branch to approved"

Challenge

Releasing an Initial Software Information Product

You are ready to release an initial version of your software information product. In the code.json file, copy the text for the main branch’s release object and paste it directly below in the code.json array (you will use the main branch release object as a type of template to make further changes). You will need to add a comma between the two objects after the closing } for the first object. In the second object, update the status field to Production. Additionally, update the URL fields in the second object to use 1.0.0 (or whatever version number you are using; it is not required to use 1.0.0) instead of main in the RELEASE_VERSION section of the URL. You will also need to update the laborHours and the metadataLastUpdated fields.

Show me the solution

JSON

[
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "main",
    "status": "Development",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",

    "laborHours": 200,

    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],

    "languages": [
      "Python"
    ],

    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },

    "date": {
      "metadataLastUpdated": "2024-06-15"
    }
  },
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "1.0.0",
    "status": "Production",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/1.0.0/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/1.0.0/vampires-and-werewolves-1.0.0.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/1.0.0/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",

    "laborHours": 200,

    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],

    "languages": [
      "Python"
    ],

    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },

    "date": {
      "metadataLastUpdated": "2024-06-15"
    }
  }
]

The version of the code.json file that was created in the exercise above will be included in the 1.0.0 branch, once the branch is created, and ultimately the immutable tagged product, as well as in the main branch.
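
The copy-and-edit steps in the exercise can also be expressed in code. The following is a rough sketch under stated assumptions, not a USGS tool: the function name make_release is hypothetical, the main object is trimmed to the fields that change, and the URL rewriting relies on simple string replacement that assumes the URL patterns from the template.

```python
import copy


def make_release(main_release, version, labor_hours, updated, default_branch="main"):
    """Derive a release metadata object from the default-branch object."""
    release = copy.deepcopy(main_release)  # leave the original untouched
    release["version"] = version
    release["status"] = "Production"
    release["laborHours"] = labor_hours
    release["date"]["metadataLastUpdated"] = updated

    def retarget(url):
        # Swap the branch name for the version in raw and archive URLs.
        return (
            url.replace(f"/-/raw/{default_branch}/", f"/-/raw/{version}/")
            .replace(f"/-/archive/{default_branch}/", f"/-/archive/{version}/")
            .replace(f"-{default_branch}.zip", f"-{version}.zip")
        )

    for lic in release["permissions"]["licenses"]:
        lic["URL"] = retarget(lic["URL"])
    release["downloadURL"] = retarget(release["downloadURL"])
    release["disclaimerURL"] = retarget(release["disclaimerURL"])
    return release


root = "https://code.usgs.gov/vdracula/vampires-and-werewolves"
main_release = {
    "version": "main",
    "status": "Development",
    "permissions": {"licenses": [{"name": "Public Domain, CC0-1.0",
                                  "URL": f"{root}/-/raw/main/LICENSE.md"}]},
    "downloadURL": f"{root}/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": f"{root}/-/raw/main/DISCLAIMER.md",
    "laborHours": 0,
    "date": {"metadataLastUpdated": "2024-05-29"},
}

release = make_release(main_release, "1.0.0", 200, "2024-06-15")
print(release["downloadURL"])
```

The returned object would be appended to the code.json array after the main object; homepageURL and repositoryURL do not contain the version and are left as they are.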

Callout

Note about Status Field

There are no set rules for what status needs to be assigned to a given version or branch of a project. The goal is to do the best to communicate to users how thoroughly particular code has been tested, reviewed, and approved, and how you might anticipate them using the project and products. For example, if you have testing, reviews, and approvals built into your development process such that the main branch is always the latest and greatest and should be the go-to code to use, then the main branch might be labeled with a status of ‘Production’. If instead the content in the main branch is not formally approved until a release branch is created, then the main branch might maintain a status of ‘Development’ to encourage users to use the most recent formal version.

Likewise, if a previous version of a product is still relevant and usable, it may continue to have a status label of ‘Production’. If, however, the newer version corrects some bugs and should be used instead of a previous version, then the previous version should have its status updated to ‘Archival’.

Key Points

A code.json file is formatted in JavaScript Object Notation (JSON) and contains project metadata. The code.json file is saved at the top level of the project.
USGS compiles all of the code.json files for public products in GitLab into an inventory that is required by Federal policy.
You can use the code.json file template above to begin creating your project and product metadata with the required fields.

Content from Software Review for Authors

Last updated on 2025-11-24

Overview

Questions

How do I prepare my code for a software review?
What information do I need to provide to my reviewer(s)?
How should I reconcile reviewer comments?
How can I document the review to meet Fundamental Science Practices requirements?

Objectives

Create a release candidate branch in preparation for a software review.
Develop a GitLab merge request to facilitate the software review.
Reconcile reviewer comments by updating main branch and replying to comments in merge request.

Software Review Overview

After you have completed your code development, added the required files to your repository, and are ready to publish your software product, you will need to request a review. As noted in the Policy episode, official USGS software information products must be reviewed and approved before they can be released. The review must include an administrative, code, and domain review. We will go into more depth on how to conduct these reviews in the next episode, Software Review for Reviewers.

Release Candidate Branch

A release candidate branch should be created to help facilitate the review process. The release candidate branch should have the same name as the eventual release tag. For example, if you are preparing to release version 1.0.0 of your software product, the release candidate branch will be 1.0.0.

Open Git Bash (Windows) or Terminal (macOS) and navigate to your local repository. Make sure you are on your main branch and that it is up to date. We will use --ff-only so that Git refuses the pull if your local main branch has diverged from the remote and cannot simply be fast-forwarded.

BASH

git switch main
git pull --ff-only origin main

Create an empty release candidate branch. We want the branch to be empty so that we can create a merge request to facilitate reviewer comments. The following command creates a new branch named 1.0.0 without any commit history. The --orphan flag ensures that the branch starts with no commits and no tracked files, which gives you a clean slate for your release candidate.

BASH

git switch --orphan 1.0.0

If you have directories in your repository, Git may give you the following prompt:

OUTPUT

Deletion of directory 'NAME_OF_DIR' failed. Should I try again? (y/n)

Type n for each directory to allow the continued creation of the orphan branch.

Normally, Git refuses to create a commit that contains no changes. The --allow-empty flag lets you create a commit anyway, which gives the new orphan branch a first commit so that it can be pushed. The -m 'Create Release Candidate Branch' option adds a commit message that identifies the purpose of this empty commit.

BASH

git commit --allow-empty -m 'Create Release Candidate Branch'

Push the release candidate branch to the remote:

BASH

git push -u origin 1.0.0

GitLab Merge Request for Review

There is not a single prescribed way to perform a software review. In this lesson, we offer an example for how you can help a reviewer structure and document their review within a merge request.

We need to create a merge request to merge main into 1.0.0. Unlike in the Branching and Merging episode, do NOT click on the “Create merge request” button in the GitLab banner message. That button will create a merge request from 1.0.0 to main.

Navigate to Merge requests in the GitLab side navigation and select New merge request:
Select main as the source branch and 1.0.0 as the target branch:
Give your merge request a meaningful title (e.g., Review of vampires-and-werewolves)
Add a review template to the Description field. An example review template is available here. Feel free to add more checks or simplify the checks depending on your use case (example of simplified review template). Help your reviewer by only including checks that are relevant to your project. To copy from the template linked above, select Display source and then click the Copy file contents button.

Discussion

Update Your Merge Request

Think about a project that you have worked on in the past. Update your merge request description to help a hypothetical reviewer structure their review for that project. Make sure to include the following information:

a brief description of the project and the repository structure
special instructions for the reviewer
a checklist of elements to review

With a partner or small group, discuss the following questions:

Which parts of the review template did you keep for the project?
What elements or checks would you add?
Are there any parts of the template that you do not understand?

Reconcile Review

Once the reviewer(s) complete the review, you will need to reconcile it. Make updates to your code to address reviewer feedback. Refer to the Branching and Merging episode for a refresher.

Removing Sensitive Information

When reconciling a review, you may be required to remove private and/or sensitive information that was mistakenly added to the repository history. This can be a challenge in a Git repository. First, you must rewrite the repository history; consider using a tool such as git-filter-repo. Once the local repository is cleaned up and the changes have been pushed to the remote location in GitLab (this will typically require a --force push), you should also run housekeeping on the remote repository.

To run housekeeping, navigate to your project on GitLab and go to the Settings -> General page and expand the Advanced section. Here you should Run housekeeping and Prune unreachable objects.

Screenshot of `Housekeeping` in the project advanced general settings.

Merge the changes into your main branch. These changes will automatically get included in your merge request.

Once you push the commit to GitLab, you can reference the commit hash for each change that you made in response to a reviewer’s comments and then resolve the comment thread.

Callout

Note that the referenced commit hash must be pushed to GitLab before you submit the comment, otherwise GitLab will not link the hash and it will look like gibberish text in the comment.

Once all changes are made, expand any collapsed threads on the merge request and print the final merge request documentation to PDF. Upload the PDF as the review and reconciliation documentation for the USGS Information Product Data System.

Depending on your Science Center’s review and approval process, you may merge main into your release candidate branch prior to receiving final approval from your Science Center. Alternatively, your Science Center approver may want to Approve the merge request in GitLab.

Callout

You may want to include a link to the original merge request in the review artifact that is loaded into the USGS Information Product Data System. Please remember that the workflow presented in this and the next episode is just one of many ways to complete a software review and reconciliation. Please check with your Science Center leadership to see if they have different requirements.

Key Points

A release candidate branch is named with the version number for the anticipated software information product release
A GitLab merge request provides structure for reviewers and makes it easier for them to conduct a review
A PDF of the final merge request documentation can serve as the review and reconciliation documentation in the USGS Information Product Data System

Content from Software Review for Reviewers

Last updated on 2025-11-24

Overview

Questions

What are my responsibilities as a reviewer of software?
How do I conduct a software review?

Objectives

Explain the topics that need to be covered during review.
Conduct a software review.

Software Review Overview

All USGS open-source software projects must undergo an administrative review. Official USGS software products must undergo two additional forms of review: technical code review and scientific (domain) review. For an official overview, see Types of Software Review.

Administrative Review

The administrative reviewer’s duty is to make sure that the entire history of the project is free of potential security or privacy violations. There are several types of information which must not be present in code released to the public:

Personally identifiable information (PII)
Absolute file system paths
Internal server host names or IP addresses
Usernames or passwords

This review must be done for every commit in the released software. This can be a very onerous requirement if it is to be done all at once. For that reason, collaborative workflows where changes to the codebase go through merge requests are a very good way of making sure that the administrative review has been adequately done.
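
Some of these checks can be partly automated as a first pass. The sketch below scans text line by line for a few of the patterns listed above. The regular expressions are rough heuristics chosen for this example, the sample lines are invented, and a scan like this supplements a human administrative review rather than replacing it. It also only looks at the text it is given, so covering the full history would mean running it over the content of every commit.

```python
import re

# Rough, illustrative patterns; real reviews need broader coverage.
PATTERNS = {
    "IPv4 address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "absolute path": re.compile(r"[A-Za-z]:\\|/(?:home|Users)/"),
    "password assignment": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]"),
}


def scan(text):
    """Return (line number, label) pairs for each suspicious line in text."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


# Invented sample content for demonstration.
sample = "\n".join([
    'data = read("/home/vdracula/project/data.csv")',
    'host = "10.12.0.7"',
    'password = "garlic"',
    'version = "1.0.0"',
])

for number, label in scan(sample):
    print(f"line {number}: {label}")
```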

It is acceptable for team members to review each other’s contributions, even if they are both listed as authors of the software. A merge request is reviewed by people who are not authors of that specific code; they are only “authors” of the project generally. Therefore, mutual in-team reviews are a convenient way to comply with this requirement.

Technical code review

The technical code review focuses on such concerns as adherence to coding standards and other measures of code quality. This review is required for all official USGS software products, but not for provisional products. Unlike administrative review, this does not need to be done for every commit of the released software.

Typical focuses in technical code review include:

checking for adherence to explicit coding standards, such as conventions for naming variables and functions
ensuring that unit tests pass
inspecting for vulnerabilities or bugs

Some of these areas of concern are amenable to automation. For instance, linter software can test for adherence to coding standards, calculate measures of code complexity, and identify common bug-prone patterns.

Scientific (domain) review

Scientific software requires a domain review as well. Like the technical code review, the domain review only needs to be done on the end product, not on individual commits. What constitutes an appropriate domain review will vary a great deal depending on the domain, but generally involves checking for scientific flaws or errors. It is similar to a peer review for a scientific publication: checking that the methods are applied correctly and are appropriate for the scientific question. You can leverage community resources, such as CDI or your local colleagues, for insight into scientific reviews in your domain.

How To Make The Review Streamlined

The person requesting the review may have already set up a way for you to do the review. For instance, if following the instructions in Review for Authors, they will have created a GitLab merge request where you can conduct your review.

There are many possible ways to do reviews, including methods such as Word documents, which do not involve Git at all. But if the person requesting the review has not specified how to conduct it, one good way is to create your own GitLab issue. Using either the issue title or a label, make clear what kind of review you are doing and what version or tag of the software it pertains to. You can then put your comments, requests for changes, or approval into this issue. This keeps everything tied to the code repository, rather than in email or Teams chats, where it could get lost. Below, we describe the process for adding comments to a merge request created by the authors.

Adding Comments to Your Review

Let us try adding a comment to a merge request as a reviewer and as an author.

Reviewer Role:

Navigate to Code -> Commits
Make sure you are on the main branch
Copy the commit SHA
Start a new comment in the merge request: “Starting review as of commit [paste commit SHA]”
Click Comment
In the example exercise here, you start your review with the metadata file and note that a date needs to be updated. Navigate to the Changes tab within the merge request and find the code.json file.
Right click on the line number next to ‘metadataLastUpdated’, which will insert a comment under that line of code.
Add your comment (e.g., “Make sure to update the metadataLastUpdated date before you submit for publication”)
Click Start a review
Click Finish review, select Comment, Approve, or Request changes, and Submit review. During a real review, you would continue adding comments to your review. For this exercise, we will have only this one comment.
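If you already have a local clone, the commit SHA used in the first steps can also be read on the command line. The sketch below is only an illustration: it builds a throwaway repository (the temporary directory and the file contents are assumptions made so the example runs on its own). In a real review you would run just the `git rev-parse` line inside your clone after pulling main.

```shell
set -eu

# Throwaway repository standing in for your local clone (illustrative only).
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q -b main
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
echo '[]' > code.json
git add code.json
git commit -q -m "Add code.json"

# Full SHA of the newest commit on main: the value to paste into the comment.
review_sha="$(git rev-parse main)"

# Shortened form of the same commit, which is easier to read in discussion.
short_sha="$(git rev-parse --short main)"

echo "Starting review as of commit ${review_sha} (${short_sha})"
```

Recording the full SHA rather than the short one leaves no doubt about which state of the code was reviewed.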

Author Role:

For reconciling a review, you can create a branch to address all comments. In this exercise, we just have one comment to address:

Create a feature branch to address the comment

BASH

git switch -c review-recon-1.0.0

Open editor and make the change

BASH

nano code.json

Stage, commit, and push the changes

BASH

git add code.json
git commit -m "Update metadataLastUpdated date per review comment"
git push -u origin review-recon-1.0.0

Create merge request and merge to main in GitLab
Once it is merged to main you can pull it down locally to keep everything in sync:

BASH

git switch main
git pull --ff-only origin main

Reply to the comment in the merge request with the commit SHA for the commit in which the comment was addressed. If you addressed the comment with a change to the same line of code, GitLab will automatically include the commit SHA and provide a diff so you can see the changes.

Key Points

An administrative review is required for all open-source software projects
A technical code review and a scientific domain review are required for official USGS software products
There are many ways to conduct and document a software review. One way is by creating a GitLab merge request with comments documenting the review

Content from Publishing

Last updated on 2025-11-24

Overview

Questions

How do I get my Git repository published once it is ready and has been approved?
What static objects should I create in Git for the final release?
How do I make the DOI point to the correct Git object?

Objectives

Update the DISCLAIMER.md to the approved language.
Create a ticket to request publication of your software.
Create a tag and a “release” in your Git repo.
Publish a DOI.

Publishing Software Overview

Congratulations! You have added all the required files, completed the software reviews, and are ready to publish the software in your Git repository!

Checklist

As a reminder, at this point you should have an appropriate version of the following files:

LICENSE.md
DISCLAIMER.md
README.md
code.json
optional: CITATION.md

And completed the following tasks:

Created a DOI
Completed the review process
Obtained IPDS approval
Started a release candidate branch

Update Disclaimer for Release Candidate Branch

Once you have merged your final approved content into the release candidate branch, you should update the DISCLAIMER.md from provisional to approved in the release candidate branch only. In the main branch, this file must continue to reflect the provisional wording. Language for approved software can be found in section 5 of the FSP Guidance on Disclaimer Statements Allowed in USGS Science Information Products.

Please note that the release candidate branch should not be merged back into the main branch at any point. During the review process and up until publication, changes can be promoted from the main branch into the release candidate branch.

Visualize the Git Workflow

The following diagram was created with Learn Git Branching using the following commands to demonstrate the process of updating disclaimer statements and rebasing the release candidate branch:

BASH

# Create and switch to the release candidate branch 1.0.0
git switch -c 1.0.0

# Update the release candidate branch's disclaimer statement to approved
git commit -m "Update disclaimer statement to approved in release candidate branch"

# Return to the main branch and make a change that will need to be promoted to the release candidate branch
git switch main
git commit -m "Make a change in the main branch"

# Switch back to the release candidate branch 1.0.0
git switch 1.0.0

# Rebase the release candidate branch onto main. This will add the commit from the main branch to 1.0.0 without changing the disclaimer statement in 1.0.0
git rebase main

# Return to the main branch after publishing 1.0.0
git switch main
git commit -m "Continue working on main branch"

# Create a new release candidate branch for 2.0.0
git switch -c 2.0.0

# Update the release candidate branch's disclaimer statement to approved
git commit -m "Update disclaimer statement to approved in release candidate branch"

# Return to the main branch and make a change that will need to be promoted to the release candidate branch
git switch main
git commit -m "Make a change in the main branch"

# Switch back to the release candidate branch 2.0.0
git switch 2.0.0

# Rebase the release candidate branch onto main. This will add the commit from the main branch to 2.0.0 without changing the disclaimer statement in 2.0.0
git rebase main

Diagram created with https://learngitbranching.js.org demonstrating the git flow for updating a release candidate branch.

Try it out yourself to see the workflow in action: Learn Git Branching!
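The Learn Git Branching commands are schematic, since their commits carry no file changes. As a rough check that rebasing keeps the approved disclaimer while picking up new work from main, the sketch below replays the 1.0.0 part of the workflow in a throwaway repository. The file model.py and the short file contents are placeholders assumed for illustration.

```shell
set -eu

demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q -b main
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"

# main starts with the provisional disclaimer.
echo "provisional" > DISCLAIMER.md
echo "version 1 of the model" > model.py
git add DISCLAIMER.md model.py
git commit -q -m "Initial content"

# Release candidate branch: switch the disclaimer to approved.
git switch -q -c 1.0.0
echo "approved" > DISCLAIMER.md
git commit -q -am "Update disclaimer statement to approved in release candidate branch"

# A later change lands on main and must be promoted.
git switch -q main
echo "version 2 of the model" > model.py
git commit -q -am "Make a change in the main branch"

# Promote it by rebasing the release candidate branch onto main.
git switch -q 1.0.0
git rebase -q main

echo "1.0.0 disclaimer: $(cat DISCLAIMER.md)"
echo "1.0.0 model:      $(cat model.py)"
echo "main disclaimer:  $(git show main:DISCLAIMER.md)"
```

After the rebase, the release candidate branch contains the new model code together with the approved disclaimer, while main still carries the provisional wording.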

Challenge

Update the DISCLAIMER.md in the release candidate branch `1.0.0`

Update the DISCLAIMER.md file in your 1.0.0 branch to use the appropriate approved software disclaimer.

Show me the solution

In GitLab, navigate to your 1.0.0 branch (e.g., Code -> Branches -> Select 1.0.0).
Open DISCLAIMER.md
Select Edit -> Edit single file
Replace the current text with the approved disclaimer text from FSP Guidance on Disclaimer Statements Allowed in USGS Science Information Products.
Commit changes

Initiate a Request to Publish your Code

After double checking the above list and reviewing the software release checklist, navigate to the USGS GitLab Software Management repository issue page. To initiate a request, open an issue on this project and in the fields seen below, add a descriptive title and select the “GitLab Official Release” template under the “Description” field:

Screenshot of Issue Template for Publishing a Git repository

Selecting “GitLab Official Release” will pre-populate the text box with a template that includes sections for you to fill out. If you have followed along so far, you should have all the information requested. Update the template text with information relevant to your request:

For fields requesting textual input, examples are provided between backticks `e.g. example`; replace the content between the backticks with your answer.

Fields with a checkbox (a space between two square brackets) are asking you to acknowledge or agree to associated text; replace the space between the brackets with an x to indicate you agree/acknowledge.

Do not edit the /label lines as they may delay notification/processing of your request.

When done, click “Create Issue” at the bottom.

Callout

The template may ask for the username of the approving official. If they have a GitLab username, you can tag them using the @ symbol: @vdracula. If they do not, you can write their email address instead: vdracula@usgs.gov.

Once an administrator sees the issue, they run an automated final validation tool to provide feedback on errors. That is why you will see this note at the bottom of their message:

Screenshot of the note that explains a comment was automatically generated

Often, the feedback concerns errors from the code.json file not containing the correct URLs. Reviewing the Creating Metadata lesson may help clarify how to fix these errors.

Discussion

As part of this course, you will not be submitting your Vampire and Wolfman project for publication, but you can take a look at current Git projects that have been submitted.

Navigate to the Software Management Issues page using the link provided by your instructor. Do you see any requests for publishing software? Click on a few and see how they filled out the template. What responses did they receive? What did they need to edit before publication? Do the requested edits align with what you’ve learned so far? How would you fix the errors?

Create a Git Tag

Once you have corrected all errors and received approval via the GitLab issue, your next step is to create a static Git tag and delete the release candidate branch. A tag is a human-readable name that points to a specific commit ID and does not change with subsequent updates or commits. Because of this stability, it is used for the official version of the software.

To create a tag, navigate to the left-hand menu and select “Tags” under “Code”, then click “New Tag”.

Then, fill out the tag information with the tag name as the version name, select the release candidate branch (which should have the same name), and write a brief description.
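A tag can also be created from a local clone instead of the GitLab interface. The following is a sketch only, run in a throwaway repository with placeholder content; the tag message is an assumption. Because the tag and the release candidate branch share the name 1.0.0, the sketch spells out the full names refs/heads/1.0.0 and refs/tags/1.0.0 to avoid ambiguity.

```shell
set -eu

demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q -b main
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
git commit -q --allow-empty -m "Initial content"

# Release candidate branch with the approved disclaimer.
git switch -q -c 1.0.0
echo "approved" > DISCLAIMER.md
git add DISCLAIMER.md
git commit -q -m "Update disclaimer statement to approved in release candidate branch"

# Create an annotated tag at the tip of the release candidate branch.
git tag -a 1.0.0 -m "Version 1.0.0" refs/heads/1.0.0
tagged_commit="$(git rev-parse 'refs/tags/1.0.0^{commit}')"

# A later commit moves the branch, but the tag stays where it was created.
git commit -q --allow-empty -m "Later commit on the branch"
branch_tip="$(git rev-parse refs/heads/1.0.0)"

echo "tag 1.0.0 points to:    ${tagged_commit}"
echo "branch 1.0.0 points to: ${branch_tip}"
echo "tag after new commit:   $(git rev-parse 'refs/tags/1.0.0^{commit}')"
```

In a real clone, a tag created this way would still need to be pushed (for example with `git push origin refs/tags/1.0.0`) before it appears in GitLab.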

Create a Release from the Tag

On the next page, create a release from the tag. This release will be used to activate the DOI, i.e., the DOI will point to the release (not the tag, the release candidate branch, or the main branch).

Screenshot of where to click "Create Release"

Add a title for the release, which can be the same as the tag and version number. There is a Description box for any notes you may want to add, which you can edit at any point. For example, if you publish an updated version of the repository, you may want to come back and redirect users to the most up-to-date version.

Screenshot of page used to create a Git release

Once done, click “Create Release”. Then use the URL to activate the DOI. The URL should be in this format: https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/releases/RELEASE-NAME
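As a small illustration of that format, the sketch below assembles a release URL from its three parts. The values are placeholders borrowed from the course's example project (the vdracula namespace and the vampires-and-werewolves repository); substitute your own group hierarchy, repository name, and release name.

```shell
set -eu

# Placeholder values; replace them with your own project details.
group_hierarchy="vdracula"
repository_name="vampires-and-werewolves"
release_name="1.0.0"

# Assemble the release URL in the documented format.
release_url="https://code.usgs.gov/${group_hierarchy}/${repository_name}/-/releases/${release_name}"
echo "${release_url}"
```

If your project lives in nested groups, the group hierarchy part contains several path segments separated by slashes.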

Now that you have the static tag and release, delete the release candidate branch by navigating to “Branches” under “Code” on the left-hand menu, then click the three vertical dots on the release candidate branch, and click “Delete branch”.
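The branch can also be deleted from the command line. The sketch below is a rehearsal under stated assumptions: an empty bare repository stands in for the GitLab project, and the commits are empty placeholders. Since the tag and the branch are both named 1.0.0, the short name may be ambiguous to Git, so the full name refs/heads/1.0.0 is used to make sure only the branch is removed.

```shell
set -eu

work_dir="$(mktemp -d)"
cd "$work_dir"

# An empty bare repository stands in for the GitLab project.
git init -q --bare remote.git

git init -q -b main local-clone
cd local-clone
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
git remote add origin "$work_dir/remote.git"
git commit -q --allow-empty -m "Initial content"
git push -q origin main

# Release candidate branch and a tag of the same name, both on the remote.
git switch -q -c 1.0.0
git commit -q --allow-empty -m "Update disclaimer statement to approved"
git tag -a 1.0.0 -m "Version 1.0.0"
git push -q origin refs/heads/1.0.0 refs/tags/1.0.0

# Delete only the branch: remotely first, then the local copy.
git switch -q main
git push -q origin --delete refs/heads/1.0.0
git branch -q -D 1.0.0

# The tag is still on the remote; the branch is gone.
remaining_branch="$(git ls-remote origin refs/heads/1.0.0)"
remaining_tag="$(git ls-remote origin refs/tags/1.0.0)"
echo "branch on remote: ${remaining_branch:-none}"
echo "tag on remote:    ${remaining_tag}"
```

The tagged commit remains reachable through the tag, so deleting the branch does not remove the released code.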

Callout

Why do we create a Release?

Why is it preferable to point the DOI to a release rather than a tag or branch? You cannot edit a tag or the release branch, but you can add notes or updates to a release. These notes may be useful to the user in the case that there is a more updated version or other information you wish to share.

Discussion

Create a Git tag and associated release

Follow the above instructions to create a tag and release in your Vampires and Werewolves repository.

Activate your DOI

Use the USGS Asset Identifier Service to manage the DOI you created in the citation lesson. Before you can activate the DOI, you will need to include the creators, publication year, URL to the Release page, IPDS number, and related publication (if applicable). If you would like your scientific software information product to be displayed on the USGS website, you will also need to include a brief description of the product in the DOI. Once you have filled in this required information, the “Publish Approved Release to DataCite” button on the left-hand menu will become active and you will be able to click on it.

Note: As part of this course we are not creating and publishing DOIs

Disseminate in IPDS

Your last step, as with any USGS product, is to Disseminate the record in IPDS. Follow your Center’s policy on how to disseminate the product.

Key Points

Once you have approval to release the software, update the DISCLAIMER.md
Submit a new issue as the first step in publishing a software product
Create a static Git tag and associated release
Activate your DOI using the URL of the Git release

Content from Continuing Your Project

Last updated on 2025-11-24

Overview

Questions

How do I follow policy when developing an open-source project?
When should I release updated versions of my project?
How do I prepare my project for subsequent releases?

Objectives

Continue development on your open-source project.
Release subsequent versions of your open-source project.

In many cases, work must continue on a project after it becomes publicly accessible. This may be following an official USGS Software Information Product release, or following a more informal open-source release process. In any case, the USGS supports open-source project development with some conditions.

Continue Open-Source Project Development

When developing an open-source project, all modifications must receive, at minimum, one administrative security review before being incorporated into the open-source project. This review must ensure no sensitive or personally identifiable information is exposed by incorporating these changes.

There are workflows supporting this review process. Previously in this course, we introduced a branching workflow, which must be modified in order to align with policy during open-source development. One modified workflow that aligns with policy requirements is called a “Forking Workflow”.

Forking Workflow

With a forking workflow, each developer on the project creates a private personal copy, or fork, of the shared open-source (public) project. This fork is often referred to as the developer’s “origin” and the shared open-source project is often referred to as the “upstream”.

A forking workflow is also beneficial because it removes barriers to new collaborator contributions. Rather than needing to individually grant access to each potential collaborator, anyone can fork the open-source project and submit a merge request to contribute.

Callout

What is in a name?

The terms “origin” and “upstream” are conventions within the broader software development community for referencing the remote repository locations. These could be called anything, but following the convention improves shared understanding across development teams.

To view all your remote locations and their aliases using the command line, try

BASH

git remote -v

The forking workflow is similar to the branching workflow except the branches are created within the developer’s origin and the merge requests are from the developer’s origin to the shared upstream repositories. Let us see how this works.

Diagram showing an Upstream and Origin as part of USGS GitLab, and a Local Clone as part of the Local Workstation. Arrows go back and forth between Upstream and Origin and between Origin and Local Clone. A dashed line goes between the Local Clone and Upstream.

In the diagram above we see an upstream and origin location within the USGS GitLab platform. Within the developer’s local workstation we see a local clone where the developer will work. A high-level overview of the workflow is as follows:

Developer creates a personal fork called an origin
Developer configures a local clone on their workstation to use their fork
Developer continues project development on branches within the local clone
Developer pushes completed branches from their local clone to their origin
Developer submits a merge request from the branch in their origin to the default branch in the upstream. A maintainer reviews and optionally merges the changes.

1. Create a fork

Creating a developer fork is a one-time process for each developer. The developer will fork the upstream repository to create their origin repository. This is completed within the GitLab interface by navigating to the upstream location and clicking the “Fork” button in the upper right area of the page.

Screenshot of GitLab UI showing location of fork button

It is important to click the “Fork” text and not the number to the right of the “Fork” text as these have different effects. On the next screen the developer must provide some information about their fork and then click the “Fork project” button near the bottom.

Screenshot of GitLab UI for creating a fork

Primarily, the developer must “Select a namespace” where the fork will be created. Typically they would select their personal user namespace. It is uncommon to change the project name, project slug, or project description. Typically all branches should be included in the fork and the visibility can be either “Private” or “Internal”; however, “Public” will be disabled.

Callout

Visibility Matters

Personal forks are not allowed to be made publicly accessible. Only the shared upstream project location may be publicly accessible. However, when the fork has a more restrictive visibility than the upstream, GitLab often makes incorrect default assumptions when the developer subsequently creates merge requests. GitLab will assume the merge request is from the developer fork and to the developer fork, which is incorrect. For this reason, it is important to pay attention when creating the merge request later.

2. Configure local clone

The local clone may be configured in one of two different ways. If the developer had previously cloned the repository from what is now called the upstream, we can rename the existing remote to be called “upstream” and then add a new remote called “origin”. Alternatively, if the developer does not yet have a local clone of the project, they can clone their origin and add an “upstream”. The end result is the same.

BASH

# Option 1: you already have a clone of the shared (upstream) project
cd vampires-and-werewolves
git remote rename origin upstream
git remote add origin <ORIGIN_URL>

BASH

# Option 2: you do not have a clone yet, so clone your fork and add the upstream
git clone <ORIGIN_URL>
cd vampires-and-werewolves
git remote add upstream <UPSTREAM_URL>

The ORIGIN_URL and UPSTREAM_URL values may be copied from the GitLab web interface by navigating to the corresponding project page, selecting the “Code” drop down option and then clicking the copy icon for the “Clone with HTTPS” option.

Screenshot of GitLab UI for obtaining ORIGIN_URL
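Whichever option you use, it is worth confirming that both remotes point where you expect. The sketch below rehearses the rename-and-add option against two empty bare repositories that stand in for the GitLab projects; the local paths are assumptions used only so the example runs without network access.

```shell
set -eu

work_dir="$(mktemp -d)"
cd "$work_dir"

# Two empty bare repositories stand in for the GitLab projects.
git init -q --bare upstream.git
git init -q --bare origin.git

# Existing clone of the shared project; its remote is initially named "origin".
git clone -q "$work_dir/upstream.git" vampires-and-werewolves
cd vampires-and-werewolves

# Rename the existing remote, then add the personal fork as the new origin.
git remote rename origin upstream
git remote add origin "$work_dir/origin.git"

# Each remote is listed with its fetch and push URL.
git remote -v
```

In your own clone, the listed URLs should be the HTTPS addresses of your fork (origin) and the shared project (upstream).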

3. Continue project development

Within your local clone and personal origin, development continues following the branching workflow as described in the previous “Branching and Merging” episode. The developer creates different branches for each logical group of changes and commits them locally.

4. Push completed branches

When local development work is ready for integration, the developer pushes their local branch to their developer origin. If the developer previously pushed with the -u or --set-upstream flag as described in the “Branching and Merging” episode, it is important to reset the tracking information now, since “origin” points to a new location. More simply, you may always explicitly specify what is pushed to where using:

BASH

git push origin 1-my-first-issue

Callout

In the above command, 1-my-first-issue is the name of the branch that is pushed and origin is the remote destination to where that branch is pushed.
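If you prefer to keep tracking information rather than naming the remote each time, pushing once with the -u flag of git push records the fork as the branch's tracking remote. The sketch below shows this in a throwaway setup; the bare repository standing in for the fork and the empty commits are assumptions for illustration.

```shell
set -eu

work_dir="$(mktemp -d)"
cd "$work_dir"

# An empty bare repository stands in for the developer's fork.
git init -q --bare origin.git

git init -q -b main local-clone
cd local-clone
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
git remote add origin "$work_dir/origin.git"
git commit -q --allow-empty -m "Initial commit"

# Development branch with one commit.
git switch -q -c 1-my-first-issue
git commit -q --allow-empty -m "Work on my first issue"

# Push the branch to the fork and record origin as its tracking remote.
git push -q -u origin 1-my-first-issue

# Show which remote branch the local branch now tracks.
tracking="$(git rev-parse --abbrev-ref '1-my-first-issue@{upstream}')"
echo "1-my-first-issue tracks ${tracking}"
```

Running `git branch -vv` in your own clone gives a similar summary for every local branch.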

5. Integrate changes

The developer should open a merge request from the development branch in their origin repository to the upstream default branch (e.g., main). To do this, first navigate a web browser to the developer origin project page on USGS GitLab. Then, select “Code” and “Merge requests” from the navigation menu on the left. Next click the “New merge request” button.

Screenshot of GitLab UI for creating a new merge request

On the next screen, select the correct “Source branch” and “Target branch” information and then click “Compare branches and continue”.

Screenshot of GitLab UI for finalizing new merge request

In the “Source branch”, the developer fork location should be selected in the first drop down box. This should be the default if opening a merge request from the developer fork project page. The second drop down box in this section does not default to anything and the desired development branch should be selected.

In the target branch, it is important that the correct upstream location is selected. In the screenshot, the “mlangseth” location is selected as the upstream. The default branch in the selected target location will be selected by default; this is typically correct but may be different for specific development teams.

Callout

Visibility (still) matters

If the visibility of the origin and upstream match, GitLab will select the correct values for the source and target repository locations. In general, this will not be the case following this open-source continuing development guide. It is for this reason you must carefully select the correct repository locations when on this screen.

On the final screen, you are given the option to provide a custom merge request title, description, labels, assignments, etc. Complete these choices appropriately and click the “Create merge request” button at the bottom to create the final merge request.

This new merge request can now be reviewed, commented on, reconciled, and integrated in the same manner as was described in the previous “Branching and Merging” episode.

Subsequent Releases

Following some amount of development on the open source project, it may become appropriate and/or necessary to release a new version of the software project as a new official USGS software information product. The new version of the project is subject to the same review and approval requirements as if it were the first or only release of the project. A new Information Product Data System (IPDS) record, a new digital object identifier (DOI), and updated metadata (code.json), are all required.

Challenge

Triggering a subsequent release

When may a subsequent version of the software project be released as a new official USGS software information product?

When must a subsequent version of the software project be released as a new official USGS software information product?

Show me the solution

In general, the triggering criteria for a subsequent release of a software project as an official USGS software information product are the same as for the original release of the software project.

A subsequent version of the software development project may be released as a new official USGS software information product at the author’s discretion.

A subsequent version of the software development project must be released as a new official USGS software information product if the new version is to be cited or if its results are intended to support another official USGS information product.

Review & Reconciliation Workflow for your 2nd Software Product

If you are preparing a second official USGS software information product, we recommend the below workflow:

In the public, upstream repository, create a new branch from the tag that contains the first official USGS software information product. This branch will become your new release candidate branch. In this example, we call it 2.0.0.
Start a merge request from main into 2.0.0. It will contain every commit that occurred since the first release, and therefore every change and commit that needs to be reviewed. Ask your reviewer(s) to conduct their reviews using this merge request.
After the peer reviewer(s) have submitted their comments, create a new branch from main on your personal fork and address each comment (i.e., conduct the reconciliation). This is also a good time to update the code.json with an entry for the new 2.0.0 branch (see instructions below). Merge this branch into main back on the upstream repository. These changes will then appear in your first merge request, demonstrating how you reconciled the peer review.
Merge the first merge request from main into 2.0.0. From here, you can update the DISCLAIMER.md with the approved language and follow the standard workflow, as described in the Publishing episode.

Note that whether you are publishing your 2nd product or your 1st product but using an open-source workflow, the release candidate branch is created in the upstream repository and not on your personal fork.
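The first two steps of this workflow can be previewed locally. The sketch below is an illustration in a throwaway repository with empty placeholder commits: it creates the 2.0.0 branch from the 1.0.0 tag and lists the commits that a merge request from main into 2.0.0 would contain. In practice the branch is created in the upstream GitLab project, as noted above.

```shell
set -eu

demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q -b main
git config user.name "Vlad Dracula"
git config user.email "vdracula@usgs.gov"
git commit -q --allow-empty -m "Initial content"

# First release: tag the release candidate branch, then delete the branch.
git switch -q -c 1.0.0
git commit -q --allow-empty -m "Update disclaimer statement to approved"
git tag -a 1.0.0 -m "Version 1.0.0"
git switch -q main
git branch -q -D 1.0.0

# Development continues on main after the release.
git commit -q --allow-empty -m "Add a new feature"
git commit -q --allow-empty -m "Fix a bug"

# Step 1: new release candidate branch created from the 1.0.0 tag.
git branch 2.0.0 refs/tags/1.0.0

# Step 2: commits that a merge request from main into 2.0.0 would contain.
git log --oneline 2.0.0..main
commits_to_review="$(git rev-list --count 2.0.0..main)"
echo "Commits to review: ${commits_to_review}"
```

The two-dot range 2.0.0..main selects commits reachable from main but not from 2.0.0, which is the set of changes made since the first release.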

Preparing Metadata

For releasing subsequent software information products, modify the code.json file in the main branch. Update the status field for the previous version to Archival, if applicable. Multiple versions may be in Production at once.

Copy the text from the previously released object in the code.json and paste it between the main branch object and the previously released object (still within the array []). Add a comma after the closing brace (}) of the new object to separate it from the previous product.

Update the version, status, permissions.licenses.URL, downloadURL, disclaimerURL, and laborHours in this object to document the newest version. Additionally, update the metadataLastUpdated for any metadata objects that have been modified, including the metadata object for this newest version.

Remember from the Creating Metadata episode that the top-level element in a code.json file is an array. If a project has been under development for a long time, there may be multiple released versions. In this case, objects should be ordered with the DEFAULT_BRANCH (e.g., main) appearing first, followed by the most recently released version, and so on in reverse chronological order. For example:

JSON

[
  {
    // ... main (DEFAULT_BRANCH), status Development
  },
  {
    // ... release 3.0.0, status Production
  },
  {
    // ... release 2.0.0, status Archival
  },
  {
    // ... release 1.0.0, status Archival
  }
]

In the hypothetical example code.json file above, the release tag for version 1.0.0 would include metadata only for that product (in addition to the DEFAULT_BRANCH metadata), and it would likely have a status of Production. Once you release version 2.0.0, the array would contain three objects: first the DEFAULT_BRANCH metadata with a status of Development, then 2.0.0 with a status of Production, and finally 1.0.0 with a status of Archival. Because we never go back and edit released tags, the code.json file in the 1.0.0 tagged version would not change and would still list that version as Production. In the main branch, however, the code.json file must be updated to include new software information products. The code.json file may also include metadata objects marking other milestone tagged versions in addition to those associated with official USGS software information products.
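A quick way to eyeball the ordering is to list the version values in the order they appear in the file. The sketch below is a crude text-based check, not a JSON parser; it assumes each "version" key and its value sit on one line, and it uses a trimmed-down stand-in for code.json with only two fields per object.

```shell
set -eu

demo_dir="$(mktemp -d)"
cd "$demo_dir"

# Trimmed-down stand-in for code.json with only two fields per object.
cat > code.json <<'EOF'
[
  { "version": "main",  "status": "Development" },
  { "version": "2.0.0", "status": "Production" },
  { "version": "1.0.0", "status": "Archival" }
]
EOF

# Pull out the version values in the order they appear in the array.
versions="$(grep -o '"version": *"[^"]*"' code.json | sed 's/.*"\([^"]*\)"$/\1/' | tr '\n' ' ')"
echo "Versions in file order: ${versions}"

# The first entry should describe the default branch.
first_version="${versions%% *}"
echo "First entry: ${first_version}"
```

For anything beyond a quick look, a real JSON tool is the safer choice, since this approach does not verify that the file is valid JSON.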

Challenge

Update the code.json File for Subsequent Release

Update the code.json file within the main branch to prepare to release version 2.0.0. What fields did you need to update? How many objects are now in your JSON array? Did you need to change anything in the version 1.0.0 object? What about the main object?

Show me the solution

JSON

[
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "main",
    "status": "Development",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",

    "laborHours": 0,

    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],

    "languages": [
      "Python"
    ],

    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },

    "date": {
      "metadataLastUpdated": "2024-06-15"
    }
  },
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "2.0.0",
    "status": "Production",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/2.0.0/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/2.0.0/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/2.0.0/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",

    "laborHours": 300,

    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],

    "languages": [
      "Python"
    ],

    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },

    "date": {
      "metadataLastUpdated": "2024-07-01"
    }
  },
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "1.0.0",
    "status": "Archival",

    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/1.0.0/LICENSE.md"
        }
      ]
    },

    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/1.0.0/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/1.0.0/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",

    "laborHours": 200,

    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],

    "languages": [
      "Python"
    ],

    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },

    "date": {
      "metadataLastUpdated": "2024-07-01"
    }
  }
]

The 2.0.0 object was added between the main and 1.0.0 release objects. The following fields were updated for the 2.0.0 object: version, status, permissions.licenses.URL, downloadURL, disclaimerURL, metadataLastUpdated, and laborHours. There are now 3 objects in the code.json array. The status and the metadataLastUpdated fields were updated in the 1.0.0 object. Nothing was updated in the main object.

Key Points

A good workflow can streamline open-source project development while ensuring compliance with governing policies
While specific criteria necessitate releasing subsequent versions, this may also be done at the author’s discretion
Subsequent versions are released in a manner very similar to the initial version
The code.json file should be updated to include another object within the array that describes the new version.
