Creating Metadata
Last updated on 2025-05-22
Overview
Questions
- What is a code.json file?
- How do you create a code.json file?
- What are the required fields in the code.json file for a USGS software project?
Objectives
- Explain what a code.json file is and how it is used.
- Create a code.json file with the minimum required fields for a USGS software project and software information products.
- Validate a code.json file.
- Update a code.json file for a new version of the software.
Introduction
Metadata are descriptive elements in a standardized format that are necessary for identification, discovery, access, and use of information products such as software and data. Metadata answer fundamental questions such as who, what, when, where, why, and how.
Metadata for a software project are stored and maintained in a file called code.json located at the top level of the project repository in GitLab. This code.json file is in JavaScript Object Notation (JSON) format. The code.json file provides basic information about the project and official software information products and will be aggregated with the information from other Department of the Interior software projects to form the Departmental Enterprise Code Inventory, which is required by the Federal Source Code Policy. The code.json file is required for software information products but may also be created for projects without official software information products.
JSON Overview
The JSON data format allows machine-to-machine communication with structured text. JSON is language agnostic.
JSON Syntax:
- Use key/value pairs
  - keys are strings, indicated by double quotes
  - values can be:
    - strings ("Vlad Dracula"),
    - numbers (1.5),
    - objects ({"key": "value", "key2": "value2"}),
    - arrays ([lists]),
    - boolean (true/false), or
    - null
  - separate keys from values with a colon
    - Format: "key": "value"
    - Example: "name": "Vlad Dracula"
- Separate key/value pairs with commas.
Generally, a JSON file will contain an object or an array. If it is an object, it will start and end with curly brackets {}. If it is an array, it will start and end with square brackets [].
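As a rough illustration of the value types and top-level structures described above, the sketch below parses a small JSON document with Python's standard json module and reports which Python type each JSON value becomes. The sample text and most of its keys (age, address, aliases, active, nickname) are made up for this example and are not part of any code.json schema.

```python
import json

# Hypothetical sample document; keys and values are illustrative only.
sample_text = """
{
  "name": "Vlad Dracula",
  "age": 1.5,
  "address": {"key": "value", "key2": "value2"},
  "aliases": ["item1", "item2"],
  "active": true,
  "nickname": null
}
"""

parsed = json.loads(sample_text)

# A top-level JSON object becomes a dict; a top-level array becomes a list.
top_level_kind = "object" if isinstance(parsed, dict) else "array"

# Record the Python type name that the json module produced for each value.
value_types = {key: type(value).__name__ for key, value in parsed.items()}

print(top_level_kind)
print(value_types)
```

Swapping the outer curly brackets for square brackets (and dropping the keys) would make the top-level element an array, which json.loads returns as a Python list.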
Let us create a JSON file in our GitLab project space with the filename hello-world.json.

In the web browser, add the following content to hello-world.json:
Notice that in our example, our JSON represents an object since it starts with curly brackets. Also, notice that the GitLab web editor provides some highlighting and indentation assistance similar to what a desktop editor might provide.
You can use a JSON Validator like JSON Formatter & Validator to format and check your JSON.
Let us try adding a trailing comma in our JSON and validating it:
The JSON Formatter & Validator will tell you what it found wrong and attempt to fix it for you:
OUTPUT
Info: Removed trailing comma.
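The same kind of syntax check can be approximated locally. The sketch below is not the JSON Formatter & Validator tool; it is a minimal stand-in that uses Python's standard json module, which rejects a trailing comma instead of repairing it. The helper name check_json_syntax and both sample strings are assumptions made for this illustration.

```python
import json

# Hypothetical sample inputs; the second has a trailing comma after the last pair.
valid_text = '{"name": "Vlad Dracula"}'
trailing_comma_text = '{"name": "Vlad Dracula",}'


def check_json_syntax(text):
    """Return (True, None) if text parses as JSON, else (False, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno locate the spot where the parser gave up.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


print(check_json_syntax(valid_text))
print(check_json_syntax(trailing_comma_text))
```

Unlike the web tool, this sketch only reports the problem; the trailing comma still has to be removed by hand.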
Metadata Template
USGS provides a code.json template (see below) to help you get started writing project metadata. Notice that its top-level element is an array, which is designated by the square brackets.
JSON
[
  {
    "name": "REPOSITORY_NAME",
    "organization": "U.S. Geological Survey",
    "description": "REPOSITORY_DESCRIPTION",
    "version": "RELEASE_VERSION",
    "status": "RELEASE_STATUS",
    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/raw/RELEASE_VERSION/LICENSE.md"
        }
      ]
    },
    "homepageURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME",
    "downloadURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/archive/RELEASE_VERSION/REPOSITORY_NAME-RELEASE_VERSION.zip",
    "disclaimerURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME/-/raw/RELEASE_VERSION/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/GROUP_HIERARCHY/REPOSITORY_NAME.git",
    "vcs": "git",
    "laborHours": 0,
    "tags": [
      "TOPIC_TAG_1",
      "TOPIC_TAG_2"
    ],
    "languages": [
      "PROGRAMMING_LANG_1",
      "PROGRAMMING_LANG_2"
    ],
    "contact": {
      "name": "REPOSITORY_ADMINISTRATOR_NAME",
      "email": "REPOSITORY_ADMINISTRATOR_EMAIL"
    },
    "date": {
      "metadataLastUpdated": "YYYY-MM-DD"
    }
  }
]
Create a Metadata File in GitLab
Create a new code.json file at the top level of your GitLab repository.

Paste the template JSON into the file, add a commit message, and click Commit changes.

Add Project-Specific Information
Now, edit the code.json file to include project-specific information. While viewing the code.json file in GitLab, click Edit and then Edit single file.

Replace the ALL_CAPS placeholders with meaningful values for the project. For the purposes of this exercise, the project includes code for modeling the co-occurrence of Vampires and Werewolves on Mars. The project team is actively developing the code. Eventually, they will release a USGS software information product in the public domain. This particular metadata object will document the entire project as opposed to a single product, so use “main” as the version. The project uses machine learning / artificial intelligence techniques and the code is written in Python.
GROUP_HIERARCHY is the group name under which your project is nested in GitLab. The GROUP_HIERARCHY may be one level if you are working out of a personal space (e.g., vdracula) or it may be a nested hierarchy (e.g., ecosystems/FRESC).
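To make the placeholder substitution concrete, the sketch below assembles the URL fields from GROUP_HIERARCHY, REPOSITORY_NAME, and RELEASE_VERSION, following the URL patterns in the template above. The function name build_urls and the licenseURL key are hypothetical; in code.json the license link is stored under permissions.licenses[].URL rather than as a top-level field.

```python
BASE_URL = "https://code.usgs.gov"


def build_urls(group_hierarchy, repository_name, release_version):
    """Fill the template URL patterns for one project and version.

    group_hierarchy may be a single level ("vdracula") or nested
    ("ecosystems/FRESC"); release_version is a branch or tag name.
    """
    project = f"{BASE_URL}/{group_hierarchy}/{repository_name}"
    archive_name = f"{repository_name}-{release_version}.zip"
    return {
        # Hypothetical key: belongs in permissions.licenses[].URL.
        "licenseURL": f"{project}/-/raw/{release_version}/LICENSE.md",
        "homepageURL": project,
        "downloadURL": f"{project}/-/archive/{release_version}/{archive_name}",
        "disclaimerURL": f"{project}/-/raw/{release_version}/DISCLAIMER.md",
        "repositoryURL": f"{project}.git",
    }


urls = build_urls("vdracula", "vampires-and-werewolves", "main")
for key, value in urls.items():
    print(f"{key}: {value}")
```

Copying the links directly from GitLab, as described in the field definitions, remains the safer route; a helper like this is mainly useful for spotting a placeholder that was missed.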
Below are the field definitions for code.json and examples of how the template can be updated:
- name: Should be a short, human-readable name for the project. This should match the value provided when creating the project in GitLab. The best practice is to use lowercase words with hyphens separating them.
- organization: Must always be "U.S. Geological Survey"; casing and punctuation are important. No updates are needed to the template.
- description: This may be a longer description of the project. It should be no more than 1-2 sentences. Verbose descriptions may exist in the README.md file.
- version: This should be a semantic version number for the product (e.g., 1.0.0) or the DEFAULT_BRANCH name (e.g., main or master), depending on whether the metadata object is referencing the project or an information product. The version number should not include a leading v (e.g., v1.0.0) or other identifier. A Git branch (release candidate branch) must exist with the same name (e.g., 1.0.0) during the review process. Upon publication, the version branch is converted to a tag. (We will discuss more about release tags in a future episode.)
- status: Must be one of the enumerated values listed below. There are no official definitions for these terms in code.gov; however, Wikipedia provides some good definitions, which are paraphrased below.
  - Ideation: planning phase of a software project.
  - Development: work on software project prior to formal testing.
  - Alpha: initial testing phase, often done within the project team or organization.
  - Beta: feature-complete testing phase that follows Alpha testing, often available to users outside the project team or organization.
  - Release Candidate: a Beta version with the potential to be ready for production. In USGS, a release candidate would be going through formal review and approval.
  - Production: the product has passed all stages of testing. In USGS, a production release has been reviewed and approved.
  - Archival: a version of the software that is no longer supported.
- permissions
  - usageType: One of the following enumerated values, which describes the usage permissions for the release:
    - openSource: Open source
    - governmentWideReuse: Government-wide reuse
    - exemptByLaw: The sharing of the source code is restricted by law or regulation, including, but not limited to, patent or intellectual property law, the Export Asset Regulations, the International Traffic in Arms Regulation, and the Federal laws and regulations governing classified information
    - exemptByNationalSecurity: The sharing of the source code would create an identifiable risk to the detriment of national security, confidentiality of Government information, or individual privacy
    - exemptByAgencySystem: The sharing of the source code would create an identifiable risk to the stability, security, or integrity of the agency’s systems or personnel
    - exemptByAgencyMission: The sharing of the source code would create an identifiable risk to agency mission, programs, or operations
    - exemptByCIO: The CIO believes it is in the national interest to exempt sharing the source code
    - exemptByPolicyDate: The release was created prior to the M-16-21 policy (August 8, 2016)
  - licenses
    - name: The name of the license under which the product is released (e.g., Public Domain, CC0-1.0). In most cases, the appropriate license for USGS products is Public Domain, CC0-1.0, but sometimes (e.g., when some of the code is from outside sources or collaborators) different licenses are required. For more information on selecting an appropriate license, see the Licensing episode in this Lesson.
    - URL: A link to the LICENSE.md file stored in this project
      - Must reference the main or master branch (this will differ for an official product, which should point to the immutable tagged version)
      - Must use the raw variant of the file, which provides access to the plain text of the file and not the GitLab-formatted text. To get the raw variant of a file, click into the file, and click the Open raw button next to the Download button.
JSON
"permissions": {
  "usageType": "openSource",
  "licenses": [
    {
      "name": "Public Domain, CC0-1.0",
      "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
    }
  ]
}
- homepageURL*: A link to the project homepage
  - May point to the project on GitLab, but will not include the .git extension
  - May point to a project home page elsewhere as long as it is publicly accessible (or soon-to-be publicly accessible, once you have gone through the release process) and in an approved location (e.g., usgs.gov webpage as opposed to a personal website)
- downloadURL: A link to download a ZIP archive of the project source code
  - Must point to the main or master branch (this will differ for an official product, which should point to the immutable tagged version)
  - In GitLab, you can get the download URL by selecting Code -> right click zip (under Download source code) -> Copy Link.
JSON
"downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip"
- disclaimerURL: A link to the DISCLAIMER.md file stored in this project
  - Must use the raw variant of the file, which provides access to the plain text of the file and not the GitLab-formatted text
  - Must point to the main or master branch (this will differ for an official product, which should point to the immutable tagged version)
JSON
"disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md"
- repositoryURL*: A link to this project on GitLab
  - Must include the .git extension
*Note: homepageURL and repositoryURL are different. repositoryURL should end with .git, whereas homepageURL should not.
- vcs: A lowercase string with the name of the version control system that is being used. For USGS, this will be git. No updates are needed to the template.
- laborHours: An estimate of total labor hours spent by your organization across the current version and all previous versions, including labor performed by federal employees and contractors. Labor hours are cumulative across all versions. Your best guess is fine. If not known, the recommendation is to use -1.
- tags: An array of topical/domain tags relevant to the project
  - Consider using the USGS Thesaurus or other controlled vocabularies to improve browse functionality in the code inventory.
  - These tags can be used to help people narrow down searches for software, so consider terms that will help direct potential users to your project.
  - If the project supports AI/ML research and development, this array must include the tag usg-artificial-intelligence. This tag is short for U.S. Government Artificial Intelligence (i.e., do not use “usgs-artificial-intelligence”).
- languages: An array of the programming languages used within this project (e.g., “Python”, “R”, “C++”). There is no controlled vocabulary, so use your best judgement on how to represent the programming languages in your project.
- contact: Point of contact information for the software information product.
- date
  - metadataLastUpdated: An ISO datestamp (YYYY-MM-DD) of when the metadata item within the code.json file was last modified. Be sure to update this value whenever you modify any of the other key/value pairs for this metadata item. Note that you must use two digits for month and day (e.g., 2024-8-9 is not correct; use 2024-08-09).
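Several of the rules in the field definitions above are mechanical enough to check in code. The sketch below is a hedged illustration, not an official USGS or code.gov validator: the function name check_entry and its messages are assumptions, and it covers only the rules stated in this episode (organization spelling, no leading v, enumerated status and usageType, the .git suffix rules, vcs, and the date format).

```python
import re
from datetime import date

ALLOWED_STATUS = {
    "Ideation", "Development", "Alpha", "Beta",
    "Release Candidate", "Production", "Archival",
}
ALLOWED_USAGE_TYPES = {
    "openSource", "governmentWideReuse", "exemptByLaw",
    "exemptByNationalSecurity", "exemptByAgencySystem",
    "exemptByAgencyMission", "exemptByCIO", "exemptByPolicyDate",
}


def check_entry(entry):
    """Return a list of human-readable problems found in one metadata object."""
    problems = []

    if entry.get("organization") != "U.S. Geological Survey":
        problems.append("organization must be exactly 'U.S. Geological Survey'")

    # A leading "v" followed by a digit (e.g., v1.0.0) is not allowed.
    if re.match(r"v\d", str(entry.get("version", ""))):
        problems.append("version must not start with 'v'")

    if entry.get("status") not in ALLOWED_STATUS:
        problems.append("status is not one of the enumerated values")

    usage_type = entry.get("permissions", {}).get("usageType")
    if usage_type not in ALLOWED_USAGE_TYPES:
        problems.append("usageType is not one of the enumerated values")

    if str(entry.get("homepageURL", "")).endswith(".git"):
        problems.append("homepageURL must not end with .git")
    if not str(entry.get("repositoryURL", "")).endswith(".git"):
        problems.append("repositoryURL must end with .git")

    if entry.get("vcs") != "git":
        problems.append("vcs must be 'git'")

    updated = str(entry.get("date", {}).get("metadataLastUpdated", ""))
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", updated):
        problems.append("metadataLastUpdated must be YYYY-MM-DD")
    else:
        try:
            date.fromisoformat(updated)  # rejects dates such as 2024-13-40
        except ValueError:
            problems.append("metadataLastUpdated is not a real calendar date")

    return problems
```

A typical use would be to load code.json with json.load and call check_entry on each object in the top-level array; an empty list means none of these particular rules were violated, not that the file conforms to the full code.gov schema.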
Personal Space in GitLab
In the examples above, the URLs that we are generating reference Vlad Dracula’s or your own personal GitLab space. In reality, you cannot make a repository public that is located under a personal username. Instead, public repositories need to be located under a public group. The current recommendation is to have groups at the USGS Mission Area level (e.g., Ecosystems) and then subgroups at the USGS Science Center level. Project repositories will then be located within the Science Center subgroup. To avoid needing to rename all of your URLs, it is a best practice to start projects within these public groups and maintain more restrictive permissions at the project level.
This is what the full code.json file should look like after making the updates above:
JSON
[
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "main",
    "status": "Development",
    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
        }
      ]
    },
    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",
    "laborHours": 0,
    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],
    "languages": [
      "Python"
    ],
    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },
    "date": {
      "metadataLastUpdated": "2024-05-29"
    }
  }
]
Challenge
Use JSON Formatter & Validator to format and check your JSON. What errors were present in your JSON? Note that this tool only validates against the JSON syntax and does not validate against the code.gov metadata schema.
Additional code.json fields
Additional fields are also available. See the official code.gov metadata schema for additional details. Note that you should only add fields from the “releases” array within this schema. The full code.gov metadata schema includes other fields that are necessary for building the Enterprise Code Inventory, but those should not be included in the individual project code.json files. Fields that are not documented in the official code.gov metadata schema cannot be included in the code.json files.
Updating Metadata for Initial Software Information Product
Remember that the top-level element in the code.json file is an array. This means it may contain more than one object for your project. The recommended practice is to order metadata objects with the DEFAULT_BRANCH (e.g., main) appearing first, followed by the most recently released version. For an initial software information product release, the array would contain the DEFAULT_BRANCH object followed by a single object for the released version.
Metadata evolve over time. A common misconception is that the metadata in the main branch should describe only the main branch code and not any other branches. In fact, the metadata in the DEFAULT_BRANCH (e.g., main) should contain metadata for each version of the project (official or otherwise). The metadata in the tag associated with a specific version should contain metadata for that version and all preceding versions; in this way, it will match the metadata in the main branch at the time the version is created.
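The recommended ordering can be sketched as a small sorting helper. This is only an illustration: the function name order_entries is hypothetical, and it assumes that every non-branch version is a plain MAJOR.MINOR.PATCH string and that a higher version number means a more recent release.

```python
def order_entries(entries, default_branch="main"):
    """Put the default-branch object first, then releases, newest first."""

    def version_key(entry):
        # "1.10.0" -> (1, 10, 0) so versions compare numerically, not as text.
        return tuple(int(part) for part in entry["version"].split("."))

    branch_entries = [e for e in entries if e["version"] == default_branch]
    released = [e for e in entries if e["version"] != default_branch]
    released.sort(key=version_key, reverse=True)
    return branch_entries + released


# Minimal hypothetical objects; real entries carry all the other fields too.
unordered = [
    {"version": "1.0.0"},
    {"version": "main"},
    {"version": "1.10.0"},
    {"version": "1.2.0"},
]
print([entry["version"] for entry in order_entries(unordered)])
```

Comparing versions as integer tuples avoids the text-sorting pitfall in which 1.10.0 would sort before 1.2.0.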
Releasing an Initial Software Information Product
You are ready to release an initial version of your software information product. In the code.json file, copy the text for the main branch’s release object and paste it directly below in the code.json array (you will use the main branch release object as a type of template to make further changes). You will need to add a comma between the two objects after the closing } for the first object. In the second object, update the version field to 1.0.0 and the status field to Production. Additionally, update the URL fields in the second object to use 1.0.0 (or whatever version number you are using; it is not required to use 1.0.0) instead of main in the RELEASE_VERSION section of the URL. You will also need to update the laborHours and the metadataLastUpdated fields.
JSON
[
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "main",
    "status": "Development",
    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/LICENSE.md"
        }
      ]
    },
    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/main/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",
    "laborHours": 200,
    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],
    "languages": [
      "Python"
    ],
    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },
    "date": {
      "metadataLastUpdated": "2024-06-15"
    }
  },
  {
    "name": "vampires-and-werewolves",
    "organization": "U.S. Geological Survey",
    "description": "Code for modeling the co-occurrence of Vampires and Werewolves on Mars",
    "version": "1.0.0",
    "status": "Production",
    "permissions": {
      "usageType": "openSource",
      "licenses": [
        {
          "name": "Public Domain, CC0-1.0",
          "URL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/1.0.0/LICENSE.md"
        }
      ]
    },
    "homepageURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves",
    "downloadURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/archive/1.0.0/vampires-and-werewolves-1.0.0.zip",
    "disclaimerURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves/-/raw/1.0.0/DISCLAIMER.md",
    "repositoryURL": "https://code.usgs.gov/vdracula/vampires-and-werewolves.git",
    "vcs": "git",
    "laborHours": 200,
    "tags": [
      "usg-artificial-intelligence",
      "vampires",
      "werewolves",
      "mars"
    ],
    "languages": [
      "Python"
    ],
    "contact": {
      "name": "Vlad Dracula",
      "email": "vdracula@usgs.gov"
    },
    "date": {
      "metadataLastUpdated": "2024-06-15"
    }
  }
]
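The copy-and-edit steps described above can also be expressed as a short script. The sketch below is one possible approach, not a USGS-provided tool: make_release_entry is a hypothetical helper that deep-copies the main branch object, sets the version and status, and swaps the branch name for the version in the license, download, and disclaimer URLs.

```python
import copy


def make_release_entry(main_entry, version, labor_hours, updated,
                       default_branch="main"):
    """Derive a release object from the default-branch object."""
    release = copy.deepcopy(main_entry)  # leave the main object untouched
    release["version"] = version
    release["status"] = "Production"
    release["laborHours"] = labor_hours
    release["date"]["metadataLastUpdated"] = updated

    def swap(url):
        # Replace only the RELEASE_VERSION parts of the template URL patterns.
        return (
            url.replace(f"/-/raw/{default_branch}/", f"/-/raw/{version}/")
            .replace(f"/-/archive/{default_branch}/", f"/-/archive/{version}/")
            .replace(f"-{default_branch}.zip", f"-{version}.zip")
        )

    for license_info in release["permissions"]["licenses"]:
        license_info["URL"] = swap(license_info["URL"])
    release["downloadURL"] = swap(release["downloadURL"])
    release["disclaimerURL"] = swap(release["disclaimerURL"])
    return release


project = "https://code.usgs.gov/vdracula/vampires-and-werewolves"
main_entry = {
    "version": "main",
    "status": "Development",
    "permissions": {"licenses": [{"URL": f"{project}/-/raw/main/LICENSE.md"}]},
    "homepageURL": project,
    "downloadURL": f"{project}/-/archive/main/vampires-and-werewolves-main.zip",
    "disclaimerURL": f"{project}/-/raw/main/DISCLAIMER.md",
    "repositoryURL": f"{project}.git",
    "laborHours": 200,
    "date": {"metadataLastUpdated": "2024-06-15"},
}
release_entry = make_release_entry(main_entry, "1.0.0", 200, "2024-06-15")
print(release_entry["downloadURL"])
```

The homepageURL and repositoryURL are left unchanged because they do not contain the RELEASE_VERSION segment. The main_entry here is trimmed to the fields the helper touches; a real object would carry every field from the template.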
The version of the code.json file that was created in the exercise above will be included in the 1.0.0 branch, once the branch is created, and ultimately the immutable tagged product, as well as in the main branch.
Note about Status Field
There are no set rules for what status needs to be assigned to a given version or branch of a project. The goal is to communicate to users, as clearly as you can, how thoroughly particular code has been tested, reviewed, and approved, and how you anticipate them using the project and products. For example, if you have testing, reviews, and approvals built into your development process such that the main branch is always the latest and greatest and should be the go-to code to use, then the main branch might be labeled with a status of ‘Production’. If instead the content in the main branch is not formally approved until a release branch is created, then the main branch might maintain a status of ‘Development’ to encourage users to use the most recent formal version.
Likewise, if a previous version of a product is still relevant and usable, it may continue to have a status label of ‘Production’. If, however, the newer version corrects some bugs and should be used instead of a previous version, then the previous version should have its status updated to ‘Archival’.
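As a small illustration of the last case, the sketch below marks one superseded version as ‘Archival’ and refreshes its metadataLastUpdated value, since that date should change whenever a metadata object is modified. The helper name archive_version and the 1.1.0 version are hypothetical, and deciding which versions to archive remains a judgement call for the project team.

```python
import copy


def archive_version(entries, version, updated):
    """Return a copy of entries with the given version marked Archival."""
    result = copy.deepcopy(entries)  # do not modify the caller's list
    for entry in result:
        if entry["version"] == version:
            entry["status"] = "Archival"
            entry["date"]["metadataLastUpdated"] = updated
    return result


# Minimal hypothetical objects; real entries carry all the other fields too.
entries = [
    {"version": "main", "status": "Development",
     "date": {"metadataLastUpdated": "2024-06-15"}},
    {"version": "1.1.0", "status": "Production",
     "date": {"metadataLastUpdated": "2024-09-01"}},
    {"version": "1.0.0", "status": "Production",
     "date": {"metadataLastUpdated": "2024-06-15"}},
]
updated_entries = archive_version(entries, "1.0.0", "2024-09-01")
print([(e["version"], e["status"]) for e in updated_entries])
```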
Key Points
- A code.json file is a file formatted in JavaScript Object Notation (JSON) and contains project metadata. The code.json file is saved at the top level of the project.
- USGS compiles all of the code.json files for public products in GitLab into an inventory that is required by Federal policy.
- You can use the code.json file template above to begin creating your project and product metadata with the required fields.