The Structured Threat Information Expression (STIX) is a collaborative, community-driven effort to define and develop a framework for expressing cyber threat information to enable cyber threat information sharing and cyber threat analysis. The STIX framework comprises a collection of extensible component specifications along with an overarching core specification and supporting specifications. This document serves as an overview of those specifications and defines how they are used within the broader STIX framework.

Status:

This document was last revised or approved by the OASIS Cyber Threat Intelligence (CTI) TC on the above date. The level of approval is also listed above. Check the “Latest version” location noted above for possible later revisions of this document. Any other numbered Versions and other technical work produced by the Technical Committee (TC) are listed at https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cti#technical.

TC members should send comments on this specification to the TC’s email list. Others should send comments to the TC’s public comment list, after subscribing to it by following the instructions at the “Send A Comment” button on the TC’s web page at https://www.oasis-open.org/committees/cti/.

For information on whether any patents have been disclosed that may be essential to implementing this specification, and any offers of patent licensing terms, please refer to the Intellectual Property Rights section of the TC’s web page (https://www.oasis-open.org/committees/cti/ipr.php).

Citation format:

When referencing this specification the following citation format should be used:

[STIX-v1.2.1-Overview]

STIX^[TM] Version 1.2.1. Part 1: Overview. Edited by Sean Barnum, Desiree Beck, Aharon Chernin, and Rich Piazza. 06 November 2015. OASIS Committee Specification Draft 01 / Public Review Draft 01. http://docs.oasis-open.org/cti/stix/v1.2.1/csprd01/part1-overview/stix-v1.2.1-csprd01-part1-overview.html. Latest version: http://docs.oasis-open.org/cti/stix/v1.2.1/stix-v1.2.1-part1-overview.html.

Notices

All capitalized terms in the following text have the meanings assigned to them in the OASIS Intellectual Property Rights Policy (the "OASIS IPR Policy"). The full Policy may be found at the OASIS website.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to OASIS, except as needed for the purpose of developing any document or deliverable produced by an OASIS Technical Committee (in which case the rules applicable to copyrights, as set forth in the OASIS IPR Policy, must be followed) or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

OASIS requests that any OASIS Party or any other party that believes it has patent claims that would necessarily be infringed by implementations of this OASIS Committee Specification or OASIS Standard, to notify OASIS TC Administrator and provide an indication of its willingness to grant patent licenses to such patent claims in a manner consistent with the IPR Mode of the OASIS Technical Committee that produced this specification.

OASIS invites any party to contact the OASIS TC Administrator if it is aware of a claim of ownership of any patent claims that would necessarily be infringed by implementations of this specification by a patent holder that is not willing to provide a license to such patent claims in a manner consistent with the IPR Mode of the OASIS Technical Committee that produced this specification. OASIS may include such claims on its website, but disclaims any obligation to do so.

OASIS takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on OASIS' procedures with respect to rights in any document or deliverable produced by an OASIS Technical Committee can be found on the OASIS website. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this OASIS Committee Specification or OASIS Standard, can be obtained from the OASIS TC Administrator. OASIS makes no representation that any information or list of intellectual property rights will at any time be complete, or that any claims in such list are, in fact, Essential Claims.

The name "OASIS" is a trademark of OASIS, the owner and developer of this specification, and should be used only to refer to the organization and its official outputs. OASIS welcomes reference to, and implementation and use of, specifications, while reserving the right to enforce its marks against misleading uses. Please see https://www.oasis-open.org/policies-guidelines/trademark for above guidance.

STIX[™], TAXII[™], AND CybOX[™] (STANDARD OR STANDARDS) AND THEIR COMPONENT PARTS ARE PROVIDED “AS IS” WITHOUT ANY WARRANTY OF ANY KIND, EITHER EXPRESSED, IMPLIED, OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY THAT THESE STANDARDS OR ANY OF THEIR COMPONENT PARTS WILL CONFORM TO SPECIFICATIONS, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR FREEDOM FROM INFRINGEMENT, ANY WARRANTY THAT THE STANDARDS OR THEIR COMPONENT PARTS WILL BE ERROR FREE, OR ANY WARRANTY THAT THE DOCUMENTATION, IF PROVIDED, WILL CONFORM TO THE STANDARDS OR THEIR COMPONENT PARTS. IN NO EVENT SHALL THE UNITED STATES GOVERNMENT OR ITS CONTRACTORS OR SUBCONTRACTORS BE LIABLE FOR ANY DAMAGES, INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES, ARISING OUT OF, RESULTING FROM, OR IN ANY WAY CONNECTED WITH THESE STANDARDS OR THEIR COMPONENT PARTS OR ANY PROVIDED DOCUMENTATION, WHETHER OR NOT BASED UPON WARRANTY, CONTRACT, TORT, OR OTHERWISE, WHETHER OR NOT INJURY WAS SUSTAINED BY PERSONS OR PROPERTY OR OTHERWISE, AND WHETHER OR NOT LOSS WAS SUSTAINED FROM, OR AROSE OUT OF THE RESULTS OF, OR USE OF, THE STANDARDS, THEIR COMPONENT PARTS, AND ANY PROVIDED DOCUMENTATION. THE UNITED STATES GOVERNMENT DISCLAIMS ALL WARRANTIES AND LIABILITIES REGARDING THE STANDARDS OR THEIR COMPONENT PARTS ATTRIBUTABLE TO ANY THIRD PARTY, IF PRESENT IN THE STANDARDS OR THEIR COMPONENT PARTS AND DISTRIBUTES IT OR THEM “AS IS.”

Table of Contents

1 Introduction. 7

1.1 Document Conventions. 7

1.1.1 Fonts. 8

1.1.2 UML Package References. 8

1.1.3 UML Diagrams. 8

1.2 Terminology. 9

1.3 Normative References. 9

1.4 Non-Normative References. 10

2 Language Modularity. 11

2.1 Core Data Model 11

2.2 Common Data Model 11

2.3 Component Data Models. 11

2.3.1 Observable. 12

2.3.2 Indicator 12

2.3.3 Incident 12

2.3.4 Tactic, Techniques and Procedures (TTP) 14

2.3.5 Campaign. 14

2.3.6 Threat Actor 14

2.3.7 Exploit Target 14

2.3.8 Course of Action (COA) 14

2.3.9 Report 14

2.4 Data Marking Data Model 14

2.5 Default Extensions Data Model 14

2.6 Default Vocabularies. 15

2.7 Basic Data Types. 15

2.7.1 Common Basic Data Types. 15

2.7.2 Specializations of the BasicString Data Type. 16

3 Data Model Conventions. 18

3.1 UML Packages. 18

3.2 Naming Conventions. 20

3.3 Identifiers. 21

4 Relationships to Other Externally-defined Data Models. 22

4.1 Common Attack Pattern Enumeration and Classification (CAPEC) 22

4.2 Common Vulnerability Reporting Framework (CVRF) 22

4.3 Customer Information Quality (CIQ) 22

4.4 Cyber Observable Expression (CybOX^[TM]) 22

4.5 Malware Attribute Enumeration and Characterization (MAEC) 22

4.6 Open Indicators of Compromise (OpenIOC) 23

4.7 Open Vulnerability and Assessment Language (OVAL) 23

5 Conformance. 24

Appendix A. Acknowledgments. 25

Appendix B. Revision History. 27

1 Introduction

[All text is normative unless otherwise labeled]

The objective of the Structured Threat Information Expression (STIX^[TM]) effort is to specify, characterize, and capture cyber threat information. STIX addresses a full range of cyber threat use cases – including threat analysis, capture and specification of indicators, management of response activities, and information sharing – to improve consistency, efficiency, interoperability, and overall situational awareness.

The STIX specification consists of a formal UML model and a set of textual specification documents that explain the UML model. Specification documents have been written for each of the individual data models that compose the full STIX UML model, which in addition to the nine top-level component data models (Observable[1], Indicator, Incident, TTP, ExploitTarget, CourseOfAction, Campaign, ThreatActor, and Report), includes a core data model, a common data model, a default extension data model, a data marking data model, and a set of default controlled vocabularies.

As illustrated in Figure 1‑1, this STIX specification overview document (shown in yellow) serves as a unifying document for the full set of STIX specification documents. As such, it discusses the modularity of STIX (Section 2), outlines general STIX data model conventions that is necessary as background information to fully understand the the set of STIX specification documents (Section 3), and summarizes the relationship of STIX to other languages (Section 4). Conformance information is also provided (Section 5).

Figure 1‑1. STIX^[TM] Language v1.2.1 documents

Regarding Figure 1‑1, altered shading differentiates the overarching Core and Common data models from the supporting data models (vocabularies, data marking, and default extensions), and the color white indicates the component data models. The solid grey color denotes the overall STIX Language UML Model.

A collection of non-normative STIX information, including community information, suggested practices, and content examples, is available at [GitHub-IO].

For completeness in terms of describing the document overview, note that we provide document conventions in Section 1.1, terminology in Section 1.2, and references in Sections 1.3 and 1.4.

1.1 Document Conventions

The following conventions are used in this document.

1.1.1 Fonts

The following font and font style conventions are used in the document:

· Capitalization is used for STIX high level concepts, which are defined in Section 2.3.

Examples: Indicator, Course of Action, Threat Actor

· The Courier New font is used for writing UML objects.

Examples: RelatedIndicatorsType, stixCommon:StatementType

Note that all high level concepts have a corresponding UML object. For example, the Course of Action high level concept is associated with a UML class named, CourseOfActionType.

· The ‘italic’ font (with single quotes) is used for noting actual, explicit values for STIX Language properties. The italic font (without quotes) is used for noting example values.

Example: ‘PackageIntentVocab-1.0,’ high, medium, low

1.1.2 UML Package References

Each STIX data model is captured in a different UML package (e.g., Core package, Campaign package, etc.) where the packages together compose the full STIX UML model. To refer to a particular class of a specific package, we use the format package_prefix:class, where package_prefix corresponds to the appropriate UML package. Table 3‑1 contains a list of the packages used by the STIX data models, along with the associated prefix notations, descriptions, examples.

1.1.3 UML Diagrams

This overview document makes use of UML diagrams to visually depict relationships between STIX Language constructs. Note that the diagrams have been extracted directly from the full UML model for STIX; they have not been constructed purely for inclusion in this or the other specification documents. Typically, diagrams are included where the visualization of its relationships between classes is useful for illustration purposes. This implies that there will be very few diagrams for classes whose only properties are either a data type or a class from the STIX Common data model.

In UML diagrams, classes are often presented with their attributes elided, to avoid clutter. The fully described class can usually be found in a related diagram. A class presented with an empty section at the bottom of the icon indicates that there are no attributes other than those that are visualized using associations.

1.1.3.1 Class Properties

Generally, a class property can be shown in a UML diagram as either an attribute or an association (i.e., the distinction between attributes and associations is somewhat subjective). In order to make the size of UML diagrams in the specifications manageable, we have chosen to capture most properties as attributes and to capture only higher level properties as associations, especially in the main top-level component diagrams. In particular, we will always capture properties of UML data types as attributes. For example, properties of a class that are identifiers, titles, and timestamps will be represented as attributes.

1.1.3.2 Diagram Icons and Arrow Types

Diagram icons are used in a UML diagram to indicate whether a shape is a class, enumeration or data type, and decorative icons are used to indicate whether an element is an attribute of a class or an enumeration literal. In addition, two different arrow styles indicate either a directed association relationship (regular arrowhead) or a generalization relationship (triangle-shaped arrowhead). The icons and arrow styles we use are shown and described in Table 1‑1 on page 9.

Table 1‑1. UML diagram icons

Icon	Description
	This diagram icon indicates a class. If the name is in italics, it is an abstract class.
	This diagram icon indicates an enumeration.
	This diagram icon indicates a data type.
	This decorator icon indicates an attribute of a class. The green circle means its visibility is public. If the circle is red or yellow, it means its visibility is private or protected.
	This decorator icon indicates an enumeration literal.
	This arrow type indicates a directed association relationship.
	This arrow type indicates a generalization relationship.

1.1.3.3 Color Coding

The shapes of the UML diagrams are color coded to indicate the data model associated with a class. The colors used in the collection of specification documents via exemplars are illustrated in Figure 1‑2.

Figure 1‑2. Data model color coding

1.2 Terminology

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119.

1.3 Normative References

[CAPEC] Common Attack Pattern Enumeration and Classification (CAPEC). (2014, Nov. 7). The MITRE Corporation. [Online]. Available: http://capec.mitre.org.

[CEE] Common Event Expression (CEE). (2014, Nov. 28). The MITRE Corporation. [Online]. Available: http://cee.mitre.org.

[CIQ] Customer Information Quality (CIQ) Specifications Version 3.0. Edited by Ram Kumar. 8 April 2008. OASIS Public Review Draft 03. Available: http://docs.oasis-open.org/ciq/v3.0/specs/ciq-specs-v3.html.

[CPE] Common Platform Enumeration (CPE). (2014, Nov. 28). The MITRE Corporation. [Online]. Available: http://cpe.mitre.org.

[CVE] Common Vulnerabilities and Exposures (CVE). (2015, Jul. 28). The MITRE Corporation. [Online]. Available: http://cve.mitre.org.

[CVRF] Common Vulnerabilites Reporting Framework (CVRF). (n.d.). The Industry Consortium for Advancement of Security on the Internet (ICASI). [Online]. Available: http://www.icasi.org/cvrf/. Accessed Aug. 22, 2015.

[CWE] Common Weakness Enumeration (CWE). (2014, Jul. 31). The MITRE Corporation. [Online]. Available: http://cwe.mitre.org.

[ISO8601] Date and time format – ISO 8601 (n.d.). International Organization for Standardization (ISO). [Online]. Available: http://www.iso.org/iso/home/standards/iso8601.htm. Accessed Aug. 23, 2015.

[MAEC] Malware Attribute Enumeration and Characterization (MAEC). (2015, Apr. 14). The MITRE Corporation. [Online]. Available: http://maec.mitre.org.

[OpenIOC] The OpenIOC Framework. (n.d.). Mandiant Corporation. [Online]. Available: http://openioc.org/. Accessed Aug. 23, 2015.

[OVAL] Open Vulnerability and Assessment Language (OVAL). (2015, Jul. 9). The MITRE Corporation. [Onlne]. Available: http://oval.mitre.org.

[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, March 1997. [Online]. Available: http://www.ietf.org/rfc/rfc2119.txt.

[RFC3986] Berners-Lee, T., Fielding, R. and Masinter, L., “Uniform Resource Identifier (URI): Generic Syntax,” STD 66, RFC 3986, January 2005. Available: https://www.ietf.org/rfc/rfc3986.txt.

[RFC5646] Phillips, A. and Davis, M., “Tags for Identifying Languages,” BCP 47, RFC 5646, September 2009. Available: http://www.ietf.org/rfc/rfc5646.txt.

[W3Name] “Namespaces in XML 1.0 (Third Edition),” W3C Recommendation, 8 December 2009. Available: http://www.w3.org/TR/REC-xml-names.

[W3DT] “XML Schema Part 2: Datatypes Second Edition,” W3C Recommendation, 28 October 2004. Available: http://www.w3.org/TR/xmlschema-2.

1.4 Non-Normative References

[GitHub-IO] STIX – Structured Threat Information Expression | STIX Project Documentation. (n.d.). The MITRE Corporation. [Online]. Available: http://stixproject.github.io/. Accessed Aug. 23, 2015.

[STIX-MAEC] “Characterizing Malware with MAEC and STIX,” The MITRE Corporation, Bedford, MA, April 20, 2014. [Online]. Available: http://stixproject.github.io/about/Characterizing_Malware_MAEC_and_STIX_v1.0.pdf.

[STIX-W] Barnum, S., “Standardizing Cyber Threat Intelligence with the Structured Threat Information eXpression (STIX^[TM]),” The MITRE Corporation, Bedford MA, Feb. 20, 2014. [Online]. Available: http://stixproject.github.io/getting-started/whitepaper/.

[UML-2.4.1] Documents associated with Unified Modeling Language (UML), V2.4.1. (Aug. 2011). The Object Management Group (OMG). [Online]. Available: http://www.omg.org/spec/UML/2.4.1/.

2 Language Modularity

The data models of the STIX language were developed in a modular fashion to facilitate flexibility. As shown in Figure 2‑1, the STIX core and common data models (see Sections 2.1 and 2.2) provide the overarching framework and common characteristics to support nine component data models (see Section 2.3), a cross-cutting data marking data model (see Section 2.4), and a set of default controlled vocabularies (see Section 2.6). Furthermore, the extensibility of the STIX design enables the use of external data models as appropriate (see Section 2.5).

Figure 2‑1. The STIX^[TM] architecture

Each data model depicted in Figure 2‑1 is described in a subsection below.

2.1 Core Data Model

The STIX Core data model defines the STIX Package (not to be confused with a UML package), which corresponds to the primary structure for bundling of information characterized in STIX. A STIX Package is used to bundle the various objects of the other STIX data models and has two main parts: a set of instantial content conformant to any of the nine top-level components (STIX Package content) and a STIX header (provides context for the content). Please see STIX Version 1.2.1 Part 3: Core for complete information on the STIX Core data model.

2.2 Common Data Model

The STIX Common data model defines object classes that are shared across the various STIX data models. At a high level, the STIX Common data model provides base classes, relationship-oriented classes, content aggregation classes, and shared classes. Please see STIX Version 1.2.1 Part 2: Common for complete information on the STIX Common data model.

2.3 Component Data Models

Individual component data models define objects specific to each top-level STIX component construct: Observable (defined in the CybOX^[TM] Core data model); Indicator; Incident; Tactics, Techniques, and Procedures (TTPs); Campaign, Threat Actor, Exploit Target; Course of Action; and Report. These data models each provide the capability to fully express information about their targeted conceptual area. In the STIX framework, they are all optional and may be used separately or in concert, as appropriate, using whichever components and architectural relationships that are relevant for a given use case.

The architecture diagram shown in Figure 2‑2 on page 13 illustrates the interrelationships of the component data models that are found in a STIX Package. To overly simplify a relationship depicted between two components, we might say that a directed arrow from component “A” to component “B” indicates that either component “B” is an optional property of component “A” or that the relationship between components “A” and “B” is an optional property of component “A.” For example, the arrow going from Indicator to TTP denotes that an Indicator may specify a set of one or more relevant TTPs indicated by the Indicator (the TTPs are optional properties of the Indicator). To further illustrate, the arrow going from Indicator to Campaign denotes that the Indicator may specify a set of one or more relationships to a Campaign (the relationships are optional properties of the Indicator).

The bracketed asterisk on each of the arrow labels implies that such a property or relationship MAY exist zero to many times. Note that the interrelationships are not bidirectional. A further discussion of the STIX architecture as it relates to components is given in the STIX white paper [STIX-W].

In the subsections below, a brief description is given for each component data model as well as a reference to the data model’s individual specification document.

2.3.1 Observable

A STIX Observable (as defined with the CybOX Language) represents stateful properties or measurable events pertinent to the operation of computers and networks. Implicit in this is a practical need for descriptive capability of two forms of observables: “observable instances” and “observable patterns.” Observable instances represent actual specific observations that took place in the cyber domain. The property details of this observation are specific and unambiguous. Observable patterns represent conditions for a potential observation that may occur in the future or may have already occurred and exists in a body of observable instances. These conditions may be anything from very specific concrete patterns that would match very specific observable instances to more abstract generalized patterns that have the potential to match against a broad range of potential observable instances. Please see Section 4.4 for details on the relationship between CybOX and STIX.

2.3.2 Indicator

A STIX Indicator conveys specific Observable patterns combined with contextual information intended to represent artifacts and/or behaviors of interest within a cyber security context. Please see STIX Version 2.2.1 Part 4: Indicator for complete information on the STIX Indicator data model.

2.3.3 Incident

An STIX Incident corresponds to sets of related security events affecting an organization, along with information discovered or decided during an incident response investigation. Please see STIX Version 1.2.1 Part 6: Incident for complete information on the STIX Incident data model.

2.3.4 Tactic, Techniques and Procedures (TTP)

A STIX Tactics, Techniques, and Procedures (TTP) is used to represent the behavior or modus operandi of cyber adversaries. Please see STIX Version 1.2.1 Part 5: TTP for complete information on the STIX TTP data model.

2.3.5 Campaign

A STIX Campaign represents a set of TTPs, Incidents, or Threat Actors that together express a common intent or desired effect. Please see STIX Version 1.2.1 Part 8: Campaign for complete information on the STIX Campaign data model.

2.3.6 Threat Actor

A STIX Threat Actor is a characterization of a malicious actor (or adversary) representing a cyber attack threat including presumed intent and historically observed behavior. Please see STIX Version 1.2.1 Part 7: Threat Actor for complete information on the STIX Threat Actor data model.

2.3.7 Exploit Target

A STIX Exploit Target conveys information about a vulnerability, weakness, or misconfiguration in software, systems, networks, or configurations that may be targeted for exploitation by an adversary. Please see STIX Version 1.2.1 Part 10: Exploit Target for complete information on the STIX Exploit Target data model.

2.3.8 Course of Action (COA)

A STIX Course of Action (COA) is used to convey information about courses of action that may be taken either in response to an attack or as a preventative measure prior to an attack. Please see STIX Version 1.2.1 Part 9: Course of Action for complete information on the STIX Course of Action data model.

2.3.9 Report

A STIX Report defines a contextual wrapper for a grouping of STIX content, which could include content specified using any of the other eight top-level constructs, or even other related Reports. Please see STIX Version 1.2.1 Part 11: Report for complete information on the STIX Report data model.

2.4 Data Marking Data Model

The STIX data marking data model enables flexible specification of data markings (e.g., handling restrictions) on any STIX content using any number of default or custom-defined data marking models. Please see STIX Version 1.2.1 Part 13: Data Marking for complete information on the STIX Data Marking data model.

2.5 Default Extensions Data Model

A primary design principle of STIX is to avoid duplicating data models that already exist for capturing cyber threat information. Therefore, in addition to the direct import of classes defined in CybOX (see Section 4.4), STIX leverages a number of other structured languages and identifiers through the use of default extensions.

More precisely, the STIX Default Extensions data model provides loose-coupling mechanisms and default extensions for leveraging constituent data models – such as Malware Attribute Enumeration and Characterization [MAEC], Common Vulnerabilities and Exposures [CVE], Common Weakness Enumeration [CWE], and Common Platform Enumeration [CPE] – to capture such information as specific vulnerabilities, weaknesses, and platforms targeted for exploitation by a malware instance.

High level summary information is given in Section 4. Please see STIX Version 1.2.1 Part 12: Default Extensions for complete information on the STIX Default Extensions data model.

2.6 Default Vocabularies

For some properties captured in STIX, a content creator may choose to constrain the set of possible values by referencing an externally-defined vocabulary or by leveraging a default vocabulary class defined within STIX. Alternatively, the content creator may use an arbitrary value without specifying any vocabulary. Please see STIX Version 1.2.1 Part 14: Vocabularies for more information about the default vocabularies defined in STIX.

2.7 Basic Data Types

The Basic Data Types data model defines UML data types used in STIX and CybOX. As stated in the [UML 2.4.1] specification, UML data types are similar to UML classes, but also different:

“A data type is a special kind of classifier, similar to a class. It differs from a class in that instances of a data type are identified only by their value. All copies of an instance of a data type and any instances of that data type with the same value are considered to be equal instances. Instances of a data type that have attributes (i.e., is a structured data type) are considered to be equal if the structure is the same and the values of the corresponding attributes are equal. If a data type has attributes, then instances of that data type will contain attribute values matching the attributes.”

Although four of the requisite primitive data types (Boolean, Integer, String, UnlimitedNatural) are defined in UML, the need for a broader set in STIX drove the decision to define a complete set of basic data types in a separate, stand-alone UML package (the Basic Data Types data model). We explicitly define the data types in the Basic Data Types data model in Sections 2.7.1 and 2.7.2.

2.7.1 Common Basic Data Types

Common data types, such as string and integer, are defined in the Basic DataTypes data model and adhere to the following definitions shown in Table 2‑1. These definitions are based on the specification of the corresponding data types found in [W3DT].

Table 2‑1. Common basic data types

Data Type	Definition
BasicString	The BasicString data type is a sequence of characters. Currently, characters are defined using the UTF-8 character encoding. The number of characters allowed is finite, but unbounded.
Boolean	The Boolean data type is defined with two possible literals: ‘true’ and ‘false’.
Decimal	The Decimal data type is a sequence of decimal digits, with perhaps an interveaning decimal point, “.”. The number of digits on either side of the decimal point is finite, but unbounded. Often used to express currency amounts.
Integer	The Integer data type is a sequence of decimal digits, with perhaps a leading minus sign “-“. The number of decimal digits allowed is finite, but unbounded.
NonNegativeInteger	The NonNegativeInteger data type is a restriction on the Integer data type such that the leading minus sign is not allowed.
PositiveInteger	The PositiveInteger data type is a restriction on the NonNegativeInteger data type that disallows zero (0).

2.7.2 Specializations of the BasicString Data Type

The data types in Table 2‑2 correspond to strings that have sematics associated with them. Because of this, they usually are restricted to a certain pattern, defined via a regular expression, and/or more formally defined in a standardization document.

Table 2‑2. Specializations of the BasicString Data Type

Data Type	Definition
CAPEC_ID	The CAPEC_ID data type is a restriction on the BasicString data type, such that it adheres to the regular expression “CAPEC-\d+”. The CAPEC_ID values should correspond to those defined at [CAPEC].
CCE_ID	The CCE_ID data type is a restriction on the BasicString data type such that it adheres to the regular expression “CCE-\d+\d”. The CCE_ID values should correspond to those defined at [CEE].
CVE_ID	The CVE_ID data type is a restriction on the BasicString data type such that it adheres to the regular expression “CVE-\d\d\d\d+\d+”. The CVE_ID values should correspond to those defined at [CVE].
CWE_ID	The CWE_ID data type is a restriction on the BasicString data type such that it adheres to the regular expression “CWE-\d+”. The CWE_ID values should correspond to those defined at [CWE].
DateTime	The DateTime data type is a restriction on the BasicString data type such that it adheres to the standard defined in [ISO8601].
HexBinary	The HexBinary data type is a restriction on the BasicString data type such that it adheres to the regular expression [0-9A-Fa-f]*. The number of characters allowed is finite but unbounded. The number of digits must be even in length.
LanguageCode	The LanguageCode data type is a restriction on the BasicString data type, such that it adheres to the standard defined in [RFC5646].
QualifiedName	The QualifiedName data type is a restriction on the BasicString data type such that it adheres to the requirements specified in [W3Name].
NoEmbeddedQuoteString	The NoEmbeddedQuoteString data type is a restriction on the BasicString data type such that it does not include any double quote characters. This data type captures properties that were attributes in the XML model.
URI	The URI data type is a restriction on the BasicString data type such that it adheres to the standard defined at [RFC3986].

3 Data Model Conventions

The following general information and conventions are used to define the individual data models in STIX Version 1.2.1 Part 15: UML Model. It should be noted that the STIX data models actually evolved as XML schemas, and as a consequence, our UML model follows some conventions so as to be compatible with the preexisting XML implementation. However, we have abstracted away from the XML implementation as much as possible.

3.1 UML Packages

Each STIX data model is captured in a different UML package (e.g., Core package, Campaign package, etc.). To refer to a particular class of a specific package, we use the format package_prefix:class, where package_prefix corresponds to the appropriate UML package. Table 3‑1 lists the packages used throughout the STIX data model specification documents, along with the prefix notation and an example. Descriptions of the packages are provided in Section 2.

Table 3‑1. Package prefixes used by the STIX^[TM] Language

Package	STIX Core
Prefix	stix
Description	The STIX Core data model defines a STIX Package that encompasses all other objects of STIX.
Example	stix:TTPsType

Package	STIX Common
Prefix	stixCommon
Description	The STIX Common data model defines classes that are shared across the various STIX data models.
Example	stixCommon:ConfidenceType

Package	STIX Data Marking
Prefix	marking
Description	The STIX Data Marking data model enables data markings to be used.
Example	marking:MarkingType

Package	STIX Default Vocabularies
Prefix	stixVocabs
Description	The STIX default vocabularies define the classes for default controlled vocabularies used within STIX.
Example	stixVocabs:MalwareTypeVocab

Package	Packages used in STIX Default Extensions
Prefix	a (ciq address); capec; ciq; stix-ciqidentity; maec; tlpMarking; cvrf; ioc; oval-def; oval-var
Description	Various packages are used by STIX extensions. Details are given in STIX Version 1.2.1 Part 12: Default Extensions.
Example	capec:Attack_PatternType

Package	STIX Basic Data Types
Prefix	basicDataTypes
Description	The STIX Basic Data Types data model defines the types used within STIX.
Example	basicDataTypes:URI

Package	STIX Indicator
Prefix	indicator
Description	The STIX Indicator data model conveys specific Observable patterns combined with contextual information intended to represent artifacts and/or behaviors of interest within a cyber security context.
Example	indicator:TestMechanismType

Package	STIX Incident
Prefix	incident
Description	The STIX Incident data model captures discrete instances of a specific set of observed events or properties affecting an organization.
Example	incident:AffectedAssetType

Package	STIX TTP
Prefix	ttp
Description	The STIX TTP data model captures the behavior or modus operandi of cyber adversaries.
Example	ttp:AttackPatternType

Package	STIX Campaign
Prefix	campaign
Description	The STIX Campaign data model encompasses one or more Threat Actors pursuing an Intended Effect as observed through sets of Incidents and/or TTP, potentially across organizations.
Example	campaign:AttributionType

Package	STIX Threat Actor
Prefix	ta
Description	The STIX Threat Actor data model captures characterizations of malicious actors (or adversaries) representing a cyber attack threat including presumed intent and historically observed behavior.
Example	ta:ObservedTTPsType

Package	STIX Exploit Target
Prefix	et
Description	The STIX Exploit Target data model conveys a vulnerability or weakness in software, systems, networks or configurations that is targeted for exploitation by the TTP of a Threat Actor.
Example	et:ConfigurationType

Package	STIX Course of Action
Prefix	coa
Description	The STIX Course of Action data model conveys specific measures to be taken to address threats whether they are corrective or preventative to address Exploit Targets, or responsive to counter or mitigate the potential impacts of Incidents.
Example	coa:StructuredCOAType

Package	STIX Report
Prefix	report
Description	The STIX Report defines a contextual wrapper for a grouping of STIX content, which could include content specified using any of the other eight top-level constructs, or even other related Reports.
Example	report:RelatedReportsType

	CybOX Core
Prefix	cybox
Description	The CybOX core data model defines the core constructs used in CybOX.
Example	cybox:ObservablesType

3.2 Naming Conventions

The UML classes, enumerations, and properties defined in STIX follow the particular naming conventions outlined in Table 3‑2.

Table 3‑2. Naming formats of different object types

Object Type	Format	Example
Class	CamelCase ending with “Type”	IndicatorBaseType
Property (simple)	Lowercase with underscores between words	capec_id
Property (complex)	Capitalized with underscores between words	Associated_Actor
Enumeration	CamelCase ending with “Enum” or “Type	DateTimePrecisionEnum; IndicatorVersionType
Enumeration value	varies	Flash drive; Public Disclosure; Externally-Located
Data type	CamelCase or if the words are acroynms, all capitalized with underscores between words.	PositiveInteger; CVE_ID

3.3 Identifiers

Optional identifiers (IDs) can be assigned to several STIX constructs so that the constructs can be unambiguously referenced. Technically, the decision to specify an ID on a given construct is optional based on the specifics of the usage context. As a general rule, specifying IDs on particular instances of constructs enables clear referencing, relating, and pivoting.

Assigning IDs supports several very common STIX use cases such as:

· Enabling individual portions of content to be externally referenced unambiguously (e.g., a report talking about a specific Campaign or Threat Actor)

· Enabling the sharing/resharing of portions of STIX content (e.g., PartyB resharing two of a set of 100 Indicators received from PartyA)

· Enabling versioning of content

· Enabling the specification of potentially complex webs of interconnection and correlation between portions of STIX content (e.g., connecting particular TTPs and Indicators to specific Campaigns over time)

· Enabling analysis pivoting on content with multiple contexts (e.g., the same IP Address seen in multiple Incidents and with connections to multiple TTPs and Indicators)

In STIX v1.2.1, each STIX ID is a fully qualified name, which consists of a producer namespace and a unique identifier. The producer namespace is a short-hand prefix, which is separated from the unique identifier by a colon (“:”). For example:

[producer namespace]:[unique identifier]

This format provides high assurance that IDs will be both meaningful and unique. Meaning comes from producer namespace, which denotes who is producing it, and uniqueness comes from the unique identifier.

4 Relationships to Other Externally-defined Data Models

STIX Version 1.2.1 leverages several other externally-defined data models that are relevant to the cyber threat domain. However, the STIX specification documents do not define any classes that are part of a non-STIX data model (e.g., CybOX classes are not defined in STIX specification documents). An alphabetical listing of these other data models is given below.

Please see STIX Version 1.2.1 Part 12: Default Extensions for further information on all of the externally-defined data models STIX leverages by default (with the exception of CybOX, for which a different reference is given in Section 4.4).

4.1 Common Attack Pattern Enumeration and Classification (CAPEC)

Common Attack Pattern Enumeration and Classification (CAPEC) is a publicly available catalog of attack patterns along with a comprehensive schema and classification taxonomy [CAPEC]. By extending the STIX TTP AttackPatternType class, STIX Version 1.2.1 uses CAPEC to enable the structured description of attack patterns.

4.2 Common Vulnerability Reporting Framework (CVRF)

The ICASI Common Vulnerability Reporting Framework (CVRF) is an XML-based language that enables different organizations to share critical security-related information in a single format [CVRF]. In addition to capturing basic information and referencing vulnerability registries, the STIX Exploit Target VulnerabilityType class is intended to be extended as appropriate to enable the structured description of a vulnerability using CVRF 1.1.

4.3 Customer Information Quality (CIQ)

The OASIS Customer Information Quality (CIQ) Version 3.0 is a set of XML specifications for representing characteristic information about individuals and organizations [CIQ]. By extending the STIX Common AddressAbstractType and IdentityType classes, STIX Version 1.2.1 leverages CIQ Version 3.0 to capture geographic address information and identity information associated with Threat Actors, victims, and sources of information.

4.4 Cyber Observable Expression (CybOX^[TM])

STIX Version 1.2.1 uses the Cyber Observable Expression (CybOX) language Version 2.1.1 to describe cyber Observables. The CybOX data models are natively imported and used within STIX to characterize system and network events, characteristics, and behaviors observed within the operational domain. The reader is referred to the CybOX specification documents for the definitions of these classes, and in the cases where a STIX class (the subclass) is a specialization of a CybOX class (the superclass), we will explicitly define the class extensions (i.e., new names and types) that have been made in the STIX subclass.

4.5 Malware Attribute Enumeration and Characterization (MAEC)

Malware Attribute Enumeration and Characterization (MAEC™) is a standardized language for sharing structured information about malware based upon attributes such as behaviors, artifacts, and attack patterns [MAEC]. By extending the STIX TTP MalwareInstanceType class, STIX Version 1.2.1 uses MAEC Version 4.1 to capture a structured description of a malware instance.

The Characterizing Malware with STIX and MAEC white paper [STIX-MAEC] provides more details on the relationship between MAEC and STIX and when each should be used in the context of malware characterization.

4.6 Open Indicators of Compromise (OpenIOC)

Open Indicators of Compromise (OpenIOC) is an extensible XML schema for the description of technical characteristics that identify a known threat, an attacker’s methodology, or other evidence of compromise [OpenIOC]. By extending the STIX Indicator GenericTestMechanismType class, STIX Version 1.2.1 enables 2010 OpenIOC to be leveraged as a test mechanism of an Indicator.

4.7 Open Vulnerability and Assessment Language (OVAL)

The Open Vulnerability and Assessment Language (OVAL) is an information security community effort to standardize how to assess and report upon the machine state of computer systems [OVAL]. By extending the STIX Indicator GenericTestMechanismType class, STIX Version 1.2.1 enables OVAL 5.10 to be leveraged as a test mechanism of an Indicator.

5 Conformance

Implementations have discretion over which parts (components, properties, extensions, controlled vocabularies, etc.) of STIX they implement (e.g., Indicator/Suggested_COAs).

[1] Conformant implementations must conform to all normative structural specifications of the UML model or additional normative statements within this document that apply to the portions of STIX they implement (e.g., Implementers of the entire TTP component must conform to all normative structural specifications of the UML model or additional normative statements within this document regarding the TTP component).

[2] Conformant implementations are free to ignore normative structural specifications of the UML model or additional normative statements within this document that do not apply to the portions of STIX they implement (e.g., Non-implementers of any particular properties of the TTP component are free to ignore all normative structural specifications of the UML model or additional normative statements within this document regarding those properties of the TTP component).

The conformance section of this document is intentionally broad and attempts to reiterate what already exists in this document. The STIX 1.2 Specifications, which this specification is based on, did not have a conformance section. Instead, the STIX 1.2 Specifications relied on normative statements and the non-mandatory implementation of STIX profiles. STIX 1.2.1 represents a minimal change from STIX 1.2, and in that spirit no requirements have been added, modified, or removed by this section.

Appendix A. Acknowledgments

The following individuals have participated in the creation of this specification and are gratefully acknowledged:

Participants:

Dean Thompson, Australia and New Zealand Banking Group (ANZ Bank)

Bret Jordan, Blue Coat Systems, Inc.

Adnan Baykal, Center for Internet Security (CIS)

Jyoti Verma, Cisco Systems

Liron Schiff, Comilion (mobile) Ltd.

Jane Ginn, Cyber Threat Intelligence Network, Inc. (CTIN)

Richard Struse, DHS Office of Cybersecurity and Communications (CS&C)

Marlon Taylor, DHS Office of Cybersecurity and Communications (CS&C)

David Eilken, Financial Services Information Sharing and Analysis Center (FS-ISAC)

Sarah Brown, Fox-IT

Ryusuke Masuoka, Fujitsu Limited

Eric Burger, Georgetown University

Jason Keirstead, IBM

Paul Martini, iboss, Inc.

Jerome Athias, Individual

Terry MacDonald, Individual

Alex Pinto, Individual

Patrick Maroney, Integrated Networking Technologies, Inc.

Wouter Bolsterlee, Intelworks BV

Joep Gommers, Intelworks BV

Sergey Polzunov, Intelworks BV

Rutger Prins, Intelworks BV

Andrei Sîrghi, Intelworks BV

Raymon van der Velde, Intelworks BV

Jonathan Baker, MITRE Corporation

Sean Barnum, MITRE Corporation

Desiree Beck, MITRE Corporation

Mark Davidson, MITRE Corporation

Ivan Kirillov, MITRE Corporation

Jon Salwen, MITRE Corporation

John Wunder, MITRE Corporation

Mike Boyle, National Security Agency

Jessica Fitzgerald-McKay, National Security Agency

Takahiro Kakumaru, NEC Corporation

John-Mark Gurney, New Context Services, Inc.

Christian Hunt, New Context Services, Inc.

Daniel Riedel, New Context Services, Inc.

Andrew Storms, New Context Services, Inc.

John Tolbert, Queralt, Inc.

Igor Baikalov, Securonix

Bernd Grobauer, Siemens AG

Jonathan Bush, Soltra

Aharon Chernin, Soltra

Trey Darley, Soltra

Paul Dion, Soltra

Ali Khan, Soltra

Natalie Suarez, Soltra

Cedric LeRoux, Splunk Inc.

Brian Luger, Splunk Inc.

Crystal Hayes, The Boeing Company

Brad Butts, U.S. Bank

Mona Magathan, U.S. Bank

Adam Cooper, United Kingdom Cabinet Office

Mike McLellan, United Kingdom Cabinet Office

Chris O'Brien, United Kingdom Cabinet Office

Julian White, United Kingdom Cabinet Office

Anthony Rutkowski, Yaana Technologies, LLC

The authors would also like to thank the larger STIX Community for its input and help in reviewing this document.

Appendix B. Revision History

Revision	Date	Editors	Changes Made
wd01	21 August 2015	Sean Barnum Desiree Beck Aharon Chernin Rich Piazza	Initial transfer to OASIS template