Creating and Using Metadata in OSCAL
This tutorial provides a walkthrough for creating the metadata section of an OSCAL document, which appears in all OSCAL documents.
Before reading this tutorial you should:
- Have some familiarity with the XML, JSON, or YAML formats.
- Have a basic understanding of the OSCAL models and their overall structure.
Document metadata in OSCAL appears in two locations:
- The document's universally unique identifier (UUID).
- The document's metadata section.
This tutorial explores both of these topics.
Document UUID
All OSCAL documents use a UUID (RFC 4122) to provide a stable and unique way to refer to a given instance of an OSCAL document. UUIDs are generated when the OSCAL document is created or revised.
While not strictly part of the metadata section of an OSCAL document, this document identifier is part of the OSCAL document's core metadata.
OSCAL supports two types of UUIDs:
- Version 4: A randomly or pseudo-randomly generated UUID.
- Version 5: A name-based UUID based on SHA-1 hashing.
The OSCAL program recommends using a version 4 (random) UUID as the document identifier, which is highly resistant to collisions.
Many tools and programming APIs provide easy ways to generate version 4 and 5 UUIDs.
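As one illustration, Python's standard uuid module can generate both kinds of UUID. This is a minimal sketch rather than an OSCAL-specific tool; the URL passed to uuid5 is a hypothetical name chosen only for the example.

```python
import uuid

# Version 4: random, so every call returns a different value.
document_uuid = uuid.uuid4()

# Version 5: SHA-1 hash of a namespace UUID plus a name, so the same
# inputs always produce the same UUID. The URL is a hypothetical name.
name_based_uuid = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/catalogs/sample")

print(document_uuid, document_uuid.version)
print(name_based_uuid, name_based_uuid.version)
```

Because a version 5 UUID is derived from its inputs, it suits cases where the identifier must be reproducible; a version 4 UUID suits the document identifier, where a fresh value is wanted for each instance.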
For our example we have generated the UUID c3da6d1d-c20c-4c7c-ae73-4010167a186b using a trivial UUIDv4 generator.
<catalog uuid="c3da6d1d-c20c-4c7c-ae73-4010167a186b">
In XML, an OSCAL document's UUID is provided by the @uuid attribute, based on the uuid datatype, on the document's root element. In this example, the root element is catalog.
In JSON, an OSCAL document's UUID is provided by the uuid property, based on the uuid datatype, on the document's top-level object. In this example, the top-level object is identified by the property catalog.
In YAML, an OSCAL document's UUID is provided by the uuid key, based on the uuid datatype, at the document's top level. In this example, the top-level key is named catalog.
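A tool consuming the JSON form might read the identifier as sketched below. The document text here is a hypothetical, heavily trimmed stand-in (a real catalog also carries a metadata section), and the sketch assumes the document has a single top-level key naming the model.

```python
import json
import uuid

# Hypothetical, heavily trimmed JSON document used only for illustration.
document_text = '{"catalog": {"uuid": "c3da6d1d-c20c-4c7c-ae73-4010167a186b"}}'

document = json.loads(document_text)

# Assumption: the first (and only) top-level key names the OSCAL model.
root_name, root_object = next(iter(document.items()))

# uuid.UUID raises ValueError if the string is not a well-formed UUID.
document_uuid = uuid.UUID(root_object["uuid"])

print(root_name, document_uuid, document_uuid.version)
```

Parsing the value with uuid.UUID instead of treating it as an opaque string gives a cheap well-formedness check and exposes the UUID version.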
What is the Metadata Section?
All OSCAL models share some common structure and elements, as discussed in Common High-Level Structure. The foremost of these is the metadata section, which includes important identifying and categorizing information for an OSCAL document. The metadata section contains several mandatory fields that are vital to the processing of OSCAL documents and help ensure interoperability, as well as optional fields that are designed to provide OSCAL content creators flexibility in expressing additional information.
As with all parts of OSCAL, the metadata section can be represented in XML, JSON, and YAML, which each support representing an equivalent set of information. Examples in this tutorial are provided for XML, JSON, and YAML to show the equivalent representations.
Metadata Fields
The OSCAL metaschema reference (XML | JSON/YAML) provides a comprehensive listing of the metadata section's data fields. Below is the high-level structure of the metadata section in XML, JSON, and YAML followed by a listing of each field's purpose.
Element definitions:
- <title> (required) - A human-readable title for the document instance, expressed as markup content.
- <published> (optional) - The date and time that this document instance was originally published, expressed as a datetime-with-timezone.
- <last-modified> (required) - The date and time that this document instance was last modified, expressed as a datetime-with-timezone. If any part of the document is changed, this value should be updated to the current date and time.
- <version> (required) - A string that provides the version of the document instance. If any part of the document is changed, this version value should be incremented according to the versioning scheme used.
- <oscal-version> (required) - The version of OSCAL that this document instance is considered valid against, expressed as a string.
- <revisions> (zero to many) - A list of revisions providing a history of changes to the document.
- <document-id> (zero to many) - A unique ID that identifies a document, and ties all separate instances of the document into a single document series. The @scheme attribute provides a URI to identify the scheme used to generate the ID. See the later section for more information and usage guidance.
- <prop> (zero to many) - Represents some arbitrary "property" of the document. This flexible element provides an extension point for adding additional metadata. See the later section for more information and usage guidance.
- <link> (zero to many) - Provides a resolvable URI (the @href attribute) for some resource. The purpose of the link is given with the @rel attribute, and its media type through the @media-type attribute. See the later section for more information and usage guidance.
- <role> (zero to many) - Defines a function or duty assumed or expected to be assumed by a party (i.e., person or organization) in a specific situation.
- <location> (zero to many) - A location, with associated metadata that can be referenced.
- <party> (zero to many) - A responsible entity which is either a person or an organization that can be referenced.
- <responsible-party> (zero to many) - A reference to a set of organizations or persons that have responsibility for performing a referenced role in the context of the containing object.
- <remarks> (optional) - Markup formatted text consisting of notes for human readers of the content.
Note that in JSON, any objects that may appear multiple times are contained in a JSON array.
Field definitions:
- title (required) - A human-readable title for the document, expressed as Markdown content.
- published (optional) - The date and time that this document was originally published, expressed as a datetime-with-timezone.
- last-modified (required) - The date and time that this document instance was last modified, expressed as a datetime-with-timezone. If any part of the document is changed, this value should be updated to the current date and time.
- version (required) - A string that provides the version of the document instance. If any part of the document is changed, this version value should be incremented according to the versioning scheme used.
- oscal-version (required) - The version of OSCAL that this document instance is considered valid against, expressed as a string.
- revisions (optional) - A list of revisions providing a history of changes to the document.
- document-ids (optional, zero to many) - A set of unique IDs that identify a document, tying all separate instances of the document into a single document series. The scheme property provides a URI that identifies the scheme used to generate the ID. See the later section for more information and usage guidance.
- props (optional, zero to many) - Represents some arbitrary "property" of the document. This flexible element provides an extension point for adding additional metadata. See the later section for more information and usage guidance.
- links (optional, zero to many) - Provides a resolvable URI (the href property) to some resource. The purpose of the link is given with the rel property, and its media type through the media-type property. See the later section for more information and usage guidance.
- roles (optional, zero to many) - Defines a function or duty assumed or expected to be assumed by a party (i.e., person or organization) in a specific situation.
- locations (optional, zero to many) - A location, with associated metadata that can be referenced.
- parties (optional, zero to many) - A responsible entity which is either a person or an organization that can be referenced.
- responsible-parties (optional, zero to many) - A reference to a set of organizations or persons that have responsibility for performing a referenced role in the context of the containing object.
- remarks (optional) - Markup formatted text consisting of notes for human readers of the content.
Note that in YAML, any objects that may appear multiple times are contained in a YAML array.
Field definitions:
- title (required) - A human-readable title for the document, expressed as Markdown content.
- published (optional) - The date and time that this document was originally published, expressed as a datetime-with-timezone.
- last-modified (required) - The date and time that this document instance was last modified, expressed as a datetime-with-timezone. If any part of the document is changed, this value should be updated to the current date and time.
- version (required) - A string that provides the version of the document instance. If any part of the document is changed, this version value should be incremented according to the versioning scheme used.
- oscal-version (required) - The version of OSCAL that this document instance is considered valid against, expressed as a string.
- revisions (zero to many) - A list of revisions providing a history of changes to the document.
- document-ids (zero to many) - A unique ID that identifies a document, and ties all separate instances of the document into one document series. The scheme property provides a URI to identify the scheme used to generate the ID. See the later section for more information and usage guidance.
- props (zero to many) - Represents some arbitrary "property" of the document. This flexible element provides an extension point for adding additional metadata. See the later section for more information and usage guidance.
- links (zero to many) - Provides a resolvable URI (the href property) to some resource. The purpose of the link is given with the rel property, and its media type through the media-type property. See the later section for more information and usage guidance.
- roles (zero to many) - Defines a function or duty assumed or expected to be assumed by a party (i.e., person or organization) in a specific situation.
- locations (zero to many) - A location, with associated metadata that can be referenced.
- parties (zero to many) - A responsible entity which is either a person or an organization that can be referenced.
- responsible-parties (zero to many) - A reference to a set of organizations or persons that have responsibility for performing a referenced role in the context of the containing object.
- remarks (optional) - Markup formatted text consisting of notes for human readers of the content.
Creating a Metadata Section
Let's start with a nominal example of an OSCAL catalog document to demonstrate how to create the metadata section.
The metadata section contains fields which may be categorized as:
- required
- recommended
- extensions
- and other optional fields
The following sections demonstrate how the aforementioned field categories can be used to support specific use cases.
Using Required and Recommended Fields
First, we start with the document's title, which is intended to be brief, human-readable text that will help a reader understand the context of the document.
The following is an example of what a title might look like expressed in OSCAL:
As you can see, the title is expressed in a very similar way across the different formats.
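To make the similarity concrete, the following sketch emits a title in both XML and JSON using Python's standard library. The title text is hypothetical, and the XML output omits the OSCAL XML namespace for brevity, so it illustrates the shape of the data rather than a schema-valid fragment.

```python
import json
import xml.etree.ElementTree as ET

title_text = "Sample Security Catalog"  # hypothetical title

# XML: the title is a child element of <metadata>.
metadata_element = ET.Element("metadata")
ET.SubElement(metadata_element, "title").text = title_text
title_xml = ET.tostring(metadata_element, encoding="unicode")

# JSON: the title is a string property of the metadata object.
title_json = json.dumps({"metadata": {"title": title_text}})

print(title_xml)
print(title_json)
```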
Next, we have the published and last modified date/time fields, which represent when the document was first published and when it was most recently changed, respectively. These field values are expressed using the OSCAL dateTime-with-timezone data type, which requires that the time zone offset is included to provide a localized time. Because the offset is included, the equivalent local time in any other time zone can be calculated from this information.
Let's look at a scenario where an OSCAL document was:
- published on January 1st, 2021 at "midnight" or 12:00AM with a time offset of 5 hours from Coordinated Universal Time (UTC)
- updated on January 5th, 2021 at "midnight" or 12:00AM with a time offset of 5 hours from UTC
This information would be expressed in OSCAL as follows in the metadata section:
In XML, the <published> element gives the date and time of when the document was published for the first time. This element is not required, but including it is strongly recommended to help the OSCAL document consumer understand when the document was originally published.
The <last-modified> element provides the date and time of the most recent change to this document. If a document was published and then never updated, the <last-modified> value should be identical to the one given by <published>.
In JSON, the published property gives the date and time of when the document was published for the first time. This field is not required, but including it is strongly recommended to help the OSCAL document consumer understand when the document was originally published.
The last-modified property provides the date and time of the most recent change to this document. If a document was published and then never updated, the last-modified value should be identical to the one given by published.
In YAML, the published key gives the date and time of when the document was published for the first time. This key is not required, but including it is strongly recommended to help the OSCAL document consumer understand when the document was originally published.
The last-modified key provides the date and time of the most recent change to this document. If a document was published and then never updated, the last-modified value should be identical to the one given by published.
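The two timestamps from the scenario can be produced as sketched below. The scenario does not state the sign of the offset, so this sketch assumes five hours behind UTC (-05:00); swap the sign if the opposite is intended.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "5 hours from UTC" means five hours behind UTC (-05:00).
offset = timezone(timedelta(hours=-5))

published = datetime(2021, 1, 1, 0, 0, 0, tzinfo=offset)
last_modified = datetime(2021, 1, 5, 0, 0, 0, tzinfo=offset)

# isoformat() on a timezone-aware datetime includes the offset.
timestamps = {
    "published": published.isoformat(),
    "last-modified": last_modified.isoformat(),
}
print(timestamps)

# Because the offset is included, the same instant can be expressed in UTC.
published_utc = published.astimezone(timezone.utc).isoformat()
print(published_utc)
```

Building the values from timezone-aware datetime objects, rather than formatting naive ones, guarantees that the offset is never silently dropped.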
Next, we must provide the version of the content. This refers to the version of the OSCAL document instance itself, not the version of any other content.
OSCAL does not place requirements on the version string itself, as versioning is a complicated process that differs from content creator to content creator. Where possible, it is recommended to use well-formatted versions with a clear and well-defined syntax. The NIST OSCAL team uses Semantic Versioning 2.0.0, a generally accepted best practice for versioning, as the versioning scheme for all schema, Metaschema, and OSCAL documents we maintain.
Since this is the first time we've created this document, we will use the semantic version "1.0.0".
This information would be expressed in OSCAL as follows in the metadata section:
<version>1.0.0</version>
This version will be incremented whenever the OSCAL document is updated.
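For a document that follows semantic versioning, incrementing could be automated along these lines. The helper name bump_version is hypothetical, and the sketch handles only plain MAJOR.MINOR.PATCH strings, not pre-release or build suffixes.

```python
def bump_version(version: str, part: str) -> str:
    """Increment one part of a plain MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part}")


print(bump_version("1.0.0", "patch"))
print(bump_version("1.0.1", "minor"))
print(bump_version("1.1.0", "major"))
```

Lower-order parts reset to zero when a higher-order part is incremented, as semantic versioning requires.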
Finally, we must specify the version of OSCAL that was used when creating the document. This value tells tools processing the data which version of OSCAL the data should be considered valid against.
The OSCAL version is expressed in OSCAL as follows in the metadata section:
We have now covered all required and recommended fields of the metadata section, and could publish our document as a valid OSCAL document. Here is what it would look like all together:
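As a rough sketch of the JSON form, the fields covered so far can be assembled and checked as follows. The title, the -05:00 offset, and the oscal-version value are assumptions made for illustration, and the check only tests for the presence of the required fields; it is not a substitute for validating against the OSCAL schemas.

```python
import json

REQUIRED_FIELDS = ("title", "last-modified", "version", "oscal-version")

catalog = {
    "catalog": {
        "uuid": "c3da6d1d-c20c-4c7c-ae73-4010167a186b",
        "metadata": {
            "title": "Sample Security Catalog",  # hypothetical title
            "published": "2021-01-01T00:00:00-05:00",  # assumed -05:00 offset
            "last-modified": "2021-01-05T00:00:00-05:00",
            "version": "1.0.0",
            "oscal-version": "1.0.0",  # assumed OSCAL version
        },
    }
}


def missing_required_fields(metadata: dict) -> list:
    """Return the required metadata fields that are absent."""
    return [name for name in REQUIRED_FIELDS if name not in metadata]


print(json.dumps(catalog, indent=2))
print(missing_required_fields(catalog["catalog"]["metadata"]))
```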
However, the metadata section has several other optional, but important, fields which we will cover in the following sections.
Providing additional document identifiers
OSCAL documents, by their nature, may often require updates of varying impact. It is important to track these updates in a way that can be automated and managed by a wide array of systems and users. To that end, we must cover the concepts of "Document Series" and "Document Series Instances".
A "Document Series" consists of all formats, updates and versions of a given document. In more human terms, if a content author writes "Document 1 version 1", "Document 1 version 2", and "Document 1 version 3.4", we could say that the "Document Series" is simply "Document 1". Furthermore, if a document author produces PDF, OSCAL JSON, and other formats of the same information, it may be useful to identify the entire collection of formats and versions in a single "Document Series".
A "Document Series Instance" is a discrete document instance that is part of a "Document Series". Using the above example, "Document 1 version 2 represented in OSCAL JSON" is a "Document Series Instance".
In OSCAL we use a document identifier to identify that a document is a member of a "Document Series" and a document UUID to identify a "Document Series Instance".
In the following example, we illustrate how to include a document identifier based on the Digital Object Identifier (DOI) System, using the identifier 10.1000/182. This DOI points to the DOI Handbook and is used here as an example only. In a typical OSCAL use case, the referenced DOI will point to the actual OSCAL document instead.
In XML, we add a <document-id> element to track the "Document Series" identified by the DOI 10.1000/182.
The DOI System is indicated using the @scheme attribute value http://www.doi.org/, which is standardized in OSCAL (XML, JSON/YAML). A scheme is required to be a URI.
In JSON, we add an object to the document-ids array to track the "Document Series" identified by the DOI 10.1000/182.
The DOI System is indicated using the scheme property value http://www.doi.org/, which is standardized in OSCAL (XML, JSON/YAML). A scheme is required to be a URI.
In YAML, we add an item to the document-ids list to track the "Document Series" identified by the DOI 10.1000/182.
The DOI System is indicated using the scheme property value http://www.doi.org/, which is standardized in OSCAL (XML, JSON/YAML). A scheme is required to be a URI.
In the future, any updated versions of this document will have the same "Document Series" identifier, but a different document UUID.
If a document does not explicitly provide a "Document Series" identifier as above, it is implicitly given one equal to its document UUID. This means that if we publish a new document without a <document-id> and later need to publish an update to it, we can tie the update back to the original document.
Multiple "Document Series" identifiers can be provided for a document instance to denote that a document is a part of multiple Document Series.
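The relationship between series and instances can be sketched as below, using the JSON layout. The identifier property name, the second and third UUIDs, and the version numbers are assumptions made for illustration; series_ids and same_series are hypothetical helpers.

```python
DOI_SCHEME = "http://www.doi.org/"


def series_ids(document: dict) -> set:
    """Collect the (scheme, identifier) pairs declared in a document's metadata."""
    entries = document["metadata"].get("document-ids", [])
    return {(entry.get("scheme"), entry["identifier"]) for entry in entries}


def same_series(first: dict, second: dict) -> bool:
    """Two instances share a series if they declare a common document-id."""
    return bool(series_ids(first) & series_ids(second))


doi_entry = {"scheme": DOI_SCHEME, "identifier": "10.1000/182"}

# Two instances of one series: different UUIDs, same document-id.
version_1 = {
    "uuid": "c3da6d1d-c20c-4c7c-ae73-4010167a186b",
    "metadata": {"version": "1.0.0", "document-ids": [doi_entry]},
}
version_2 = {
    "uuid": "0f2814a7-6b1c-4d35-9c0e-5a1d2b3c4e5f",  # hypothetical UUID
    "metadata": {"version": "1.1.0", "document-ids": [doi_entry]},
}
unrelated = {
    "uuid": "7d1e9a52-3c44-4b0a-8f6d-2e9b1c0a7f31",  # hypothetical UUID
    "metadata": {"version": "1.0.0"},
}

print(same_series(version_1, version_2))
print(same_series(version_1, unrelated))
```

Comparing sets of (scheme, identifier) pairs lets a document belong to several series at once, and keeps identifiers from different schemes from colliding.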
Using Optional Extensions
Like most OSCAL objects, the metadata section provides for the ability to define properties and link relationships.
Properties
In OSCAL, properties are namespaced key/value pairs that allow additional information to be provided to annotate the containing object. In the metadata section, properties are used to provide document-level annotations. Common use cases for properties include providing keywords, tags, and classifications.
For this example, we will create a property named marking in the default OSCAL namespace http://csrc.nist.gov/ns/oscal to mark an OSCAL document with a Traffic Light Protocol (TLP) classification of red.
In XML, the <prop> element declares the desired property. The @name attribute is the property's key, which is in the default OSCAL namespace http://csrc.nist.gov/ns/oscal, since no @ns is provided. The @value attribute is the property's value.
Together they form a key-value pair to facilitate automated data lookup. Here we use the optional @class attribute to provide an additional layer of information, noting that the marking we are using is a TLP marking, as indicated using the http://www.first.org/tlp scheme.
In JSON, we add an object to the props array to declare the desired property. The name field is the property's key, which is in the default OSCAL namespace http://csrc.nist.gov/ns/oscal, since no ns field is provided. The value field is the property's value.
Together they form a key-value pair to facilitate automated data lookup. Here we use the optional class field to provide an additional layer of information, noting that the marking we are using is a TLP marking, as indicated using the http://www.first.org/tlp scheme.
In YAML, we add an item to the props list to declare the desired property. The name field is the property's key, which is in the default OSCAL namespace http://csrc.nist.gov/ns/oscal, since no ns field is provided. The value field is the property's value.
Together they form a key-value pair to facilitate automated data lookup. Here we use the optional class field to provide an additional layer of information, noting that the marking we are using is a TLP marking, as indicated using the http://www.first.org/tlp scheme.
This completes defining the marking property.
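The automated lookup mentioned above might look like the following sketch, written against the JSON layout. The exact class value is an assumption inferred from the description, and find_prop_values is a hypothetical helper that treats a missing ns field as the default OSCAL namespace.

```python
OSCAL_NS = "http://csrc.nist.gov/ns/oscal"

metadata = {
    "props": [
        {
            "name": "marking",
            "value": "red",
            "class": "http://www.first.org/tlp",  # assumed class value
        }
    ]
}


def find_prop_values(metadata: dict, name: str, ns: str = OSCAL_NS) -> list:
    """Return the values of all props matching a name within a namespace."""
    return [
        prop["value"]
        for prop in metadata.get("props", [])
        # A prop without an explicit ns is in the default OSCAL namespace.
        if prop["name"] == name and prop.get("ns", OSCAL_NS) == ns
    ]


print(find_prop_values(metadata, "marking"))
print(find_prop_values(metadata, "marking", ns="https://example.com/ns"))
```

Matching on both name and namespace matters because two organizations may define properties with the same name but different meanings.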
Links
Alongside properties, links are another way of providing additional information in the metadata section that is not covered by the other fields. Links are a means to establish a relationship between an OSCAL object and another OSCAL object or a web resource.
The following example illustrates how to establish a latest-version link to the latest version of the OSCAL document. This link will reference a static URL whose contents can be updated to reflect the latest revision of the document, which makes it easy for consumers to update their content.
In XML, the @rel attribute establishes latest-version as the type of relationship. The @href attribute provides a URI that can be used to access, also called "resolve", the associated resource.
In JSON, we add a new object to the links array. Its rel field establishes latest-version as the type of relationship. The href field provides a URI that can be used to access, also called "resolve", the associated resource.
In YAML, we add a new item to the links list. Its rel field establishes latest-version as the type of relationship. The href field provides a URI that can be used to access, also called "resolve", the associated resource.
The predecessor-version and successor-version link relationships can be used in tandem with the above to create a navigable web of document versions. By using these and other custom relationship values, any relevant or related resource can be described and linked to in the OSCAL document's metadata section.
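A consumer could select links by relationship type as sketched below, again using the JSON layout. Both URLs are hypothetical placeholders, and hrefs_for is a hypothetical helper.

```python
metadata = {
    "links": [
        {
            "rel": "latest-version",
            "href": "https://example.com/catalogs/sample/latest.json",  # hypothetical URL
        },
        {
            "rel": "predecessor-version",
            "href": "https://example.com/catalogs/sample/0.9.0.json",  # hypothetical URL
        },
    ]
}


def hrefs_for(metadata: dict, rel: str) -> list:
    """Return the href of every link with the given relationship type."""
    return [
        link["href"]
        for link in metadata.get("links", [])
        if link.get("rel") == rel
    ]


print(hrefs_for(metadata, "latest-version"))
print(hrefs_for(metadata, "successor-version"))
```

Returning a list rather than a single value reflects that nothing prevents several links from sharing a relationship type.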
Using Other Optional Fields
The remainder of this tutorial briefly covers the other optional fields inside the metadata section. While not required by the specification, these fields provide invaluable information, particularly about the provenance and authorship of the data contained in an OSCAL document, and define referenceable roles, locations, and parties. This information is declared in the metadata section, and is often passed by reference to other parts of the OSCAL document or to other OSCAL documents importing that document.
In XML:
- <role> - Roles define some function or purpose that is to be assigned to some entity later in the document. Role elements have an id attribute with a token that is used to reference the role elsewhere in the OSCAL document. A number of pre-defined roles exist in OSCAL, but they differ depending on context. In the metadata section they are as follows:
  - creator: Indicates the organization or person that created this content.
  - prepared-by: Indicates the organization or person that prepared this content.
  - prepared-for: Indicates the organization or person for which this content was created.
  - content-approver: Indicates the organization or person responsible for all content represented in the "document".
  - contact: Indicates the organization or person to contact for questions or support related to this content.
  Other roles can be locally defined by the content creator.
- <location> - Geographic data associated with a street, mailing, or email address, or phone number. Locations have a @uuid attribute that allows them to be referenced elsewhere in the OSCAL document. Includes elements that describe the location in question.
- <party> - Defines some entity, either a person or an organization. Has a @uuid attribute that allows for references to this party elsewhere in the OSCAL document. Includes elements that describe the party in question, including a location uuid.
- <responsible-party> - Explicitly declares a party that is responsible for a given role relative to the document. The @role-id attribute references the role that the party is fulfilling, and is either a custom role locally defined or one of the core-defined roles; see the <role> entry above for details. Uses a party's uuid to link the given role to the given party.
- <remarks> - markup-multiline formatted text providing notes and comments regarding the document.
In JSON:
- roles - An array of role objects. Roles define some function or purpose that is to be assigned to some entity later in the document. Role objects have an id field with a token that is used to reference the role elsewhere in the OSCAL document. A number of pre-defined roles exist in OSCAL, but they differ depending on context. In the metadata section they are as follows:
  - creator: Indicates the organization or person that created this content.
  - prepared-by: Indicates the organization or person that prepared this content.
  - prepared-for: Indicates the organization or person for which this content was created.
  - content-approver: Indicates the organization or person responsible for all content represented in the "document".
  - contact: Indicates the organization or person to contact for questions or support related to this content.
  Other roles can be locally defined by the content creator.
- locations - An array of location objects. Geographic data associated with a street, mailing, or email address, or phone number. Locations have a uuid field that allows them to be referenced elsewhere in the OSCAL document. Includes fields that describe the location in question.
- parties - An array of party objects. Defines some entity, either a person or an organization. Has a uuid field that allows for references to this party elsewhere in the OSCAL document. Includes fields that describe the party in question, including a location uuid.
- responsible-parties - An array of responsible-party objects. Explicitly declares a party that is responsible for a given role. The role-id field references the role that the party is fulfilling, and is either a custom role locally defined or one of the core-defined roles; see the roles entry above for details. Uses a party's uuid to link the given role to the given party.
- remarks - markup-multiline formatted text providing notes and comments regarding the document.
In YAML:
- roles - Roles define some function or purpose that is to be assigned to some entity later in the document. Role objects have an id field with a token that is used to reference the role elsewhere in the OSCAL document. A number of pre-defined roles exist in OSCAL, but they differ depending on context. In the metadata section they are as follows:
  - creator: Indicates the organization or person that created this content.
  - prepared-by: Indicates the organization or person that prepared this content.
  - prepared-for: Indicates the organization or person for which this content was created.
  - content-approver: Indicates the organization or person responsible for all content represented in the "document".
  - contact: Indicates the organization or person to contact for questions or support related to this content.
  Other roles can be locally defined by the content creator.
- locations - Geographic data associated with a street, mailing, or email address, or phone number. Locations have a uuid field that allows them to be referenced elsewhere in the OSCAL document. Includes fields that describe the location in question.
- parties - Defines some entity, either a person or an organization. Has a uuid field that allows for references to this party elsewhere in the OSCAL document. Includes fields that describe the party in question, including a location uuid.
- responsible-parties - Explicitly declares a party that is responsible for a given role. The role-id field references the role that the party is fulfilling, and is either a custom role locally defined or one of the core-defined roles; see the roles entry above for details. Uses a party's uuid to link the given role to the given party.
- remarks - markup-multiline formatted text providing notes and comments regarding the document.
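The by-reference linkage between roles, parties, and responsible parties can be followed as in this sketch of the JSON layout. The party UUID, the names, and the title, type, name, and party-uuids property names are assumptions made for illustration; responsible_party_names is a hypothetical helper.

```python
metadata = {
    "roles": [{"id": "creator", "title": "Document Creator"}],
    "parties": [
        {
            "uuid": "4ae7292e-6d8e-4735-86ea-11047c575e87",  # hypothetical UUID
            "type": "organization",
            "name": "Example Organization",  # hypothetical party
        }
    ],
    "responsible-parties": [
        {
            "role-id": "creator",
            "party-uuids": ["4ae7292e-6d8e-4735-86ea-11047c575e87"],
        }
    ],
}


def responsible_party_names(metadata: dict, role_id: str) -> list:
    """Resolve a role id to the names of the parties responsible for it."""
    parties_by_uuid = {party["uuid"]: party for party in metadata.get("parties", [])}
    names = []
    for responsible in metadata.get("responsible-parties", []):
        if responsible["role-id"] == role_id:
            # Each party UUID must refer to a party declared in the metadata.
            names.extend(parties_by_uuid[uuid]["name"] for uuid in responsible["party-uuids"])
    return names


print(responsible_party_names(metadata, "creator"))
print(responsible_party_names(metadata, "contact"))
```

Indexing parties by UUID first keeps each lookup cheap and makes a dangling reference fail loudly with a KeyError instead of being silently skipped.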
Putting It All Together
Finally, we have a basic example of an OSCAL metadata section below:
Summary
This concludes the tutorial. At this point you should be familiar with:
- The basic structure of the metadata section.
- How to provide the basic metadata required to be included in an OSCAL document.
- How to use and understand UUIDs and document-ids to track document instances.
- How to use optional fields to express additional metadata and extend the metadata section.