Data extensibility

The <data> element represents properties ranging from simple values to complex structures. Processes can harvest the <data> element for automated manipulation or to format data associated with the body flow. The <data> element is primarily intended for use in creating specializations.

You can nest <data> elements to represent structures. You can use the name attribute to indicate the semantics of instances of the <data> element, such as addresses, times, or amounts. In many cases, however, you may prefer to specialize the <data> element for more precise semantics and for constraints on structures and values. For instance, a specialization can specify an enumeration for the value attribute.
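
As a minimal sketch of the unspecialized approach, the following fragment nests <data> elements and relies on the name attribute for semantics. The property names and values are hypothetical and chosen only for illustration; a specialization could replace them with dedicated elements and constrain the allowed values.

<data name="address">
  <!-- Each nested property carries its meaning in the name attribute. -->
  <data name="street"     value="123 Example Street"/>
  <data name="city"       value="Springfield"/>
  <data name="postalCode" value="00000"/>
</data>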

In some cases, it isn't possible or convenient to maintain a property as part of the content of its subject. For instance, you might prefer to maintain extensive data in the <prolog> that applies to a note or example within the body. To handle such exceptions, you can use the <data-about> element to identify the subject of the property.
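
The following sketch shows one way this might look, assuming that the href attribute on <data-about> points to the subject of the property. The topic, the IDs, and the property names are hypothetical.

<topic id="install">
  <title>Installing the product</title>
  <prolog>
    <!-- The properties below describe the note in the body, not the topic as a whole. -->
    <data-about href="#install/powerNote">
      <data name="reviewStatus" value="approved"/>
      <data name="reviewer"     value="Safety team"/>
    </data-about>
  </prolog>
  <body>
    <note id="powerNote">Disconnect the power before you begin.</note>
  </body>
</topic>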

A process can harvest the data values for a machine-processable representation such as RDF. The default formatting ignores the <data> element within the <body> element. Customized or specialized processing, which understands whether and how the properties should be displayed, can extend formatting to include data values in some formatted outputs.
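
To illustrate the harvesting case, suppose a section carries the following property. The id, the property name, and the value are hypothetical.

<section id="summary">
  <data name="reviewStatus" value="approved"/>
  <p>This section summarizes the procedure.</p>
</section>

A harvesting process could then emit RDF/XML along the following lines. This is only a sketch of one possible mapping: the ex namespace, the subject URI, and the rule that turns a name attribute into an RDF property are all assumptions, not something that DITA defines.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.com/terms#">
  <!-- The subject is the section that contains the <data> element. -->
  <rdf:Description rdf:about="http://example.com/topics/install.dita#install/summary">
    <ex:reviewStatus>approved</ex:reviewStatus>
  </rdf:Description>
</rdf:RDF>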

CAUTION:

It is an abuse of the DITA architecture to specialize the <data> element for text that is part of the body flow, particularly to escape the constraints of the base content models. For example, a special kind of paragraph must specialize the base <p> element rather than the <data> element. When exchanging content with others or retiring a specialization, a paragraph specialized from the <data> element will be generalized and thus skipped by the base formatting, mangling the discourse flow and resulting in invalid content.
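
The difference shows up in the class attribute, which records the specialization ancestry. In the sketch below, the <warningPara> element and the safetyTopic module name are hypothetical, and the class attribute is written out explicitly for illustration although it is normally defaulted by the document type.

<!-- Correct: generalizes to <p>, so the text stays in the body flow. -->
<warningPara class="- topic/p safetyTopic/warningPara ">Wear eye protection.</warningPara>

<!-- Incorrect: generalizes to <data>, so the base formatting skips the text. -->
<warningPara class="- topic/data safetyTopic/warningPara ">Wear eye protection.</warningPara>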

Applications

Uses of the <data> element include the following:

Complex metadata properties such as bibliographic records corresponding to citations.
Hybrid documents with data values as part of the content. (Word processor formats using form fields provide an example of such hybrid documents.)
Messages in which the payload includes human-readable content. Such applications can use the <data> element to define the addressing on the message envelope. For instance, a topic could model an email by representing the address with specialized <data> elements in the <prolog> element and the content with the <body> element.
Transactional documents in which the values are processed but also displayed with human-readable content. In particular, a library of building blocks for transaction documents can be implemented through a DITA domain as specialized <data> elements including those from the UN/CEFACT Core Components Technical Specification (http://www.unece.org/cefact/).
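
As a rough sketch of the email case described above, a topic might carry the addressing in the <prolog> element and the message in the <body> element. The <emailFrom> and <emailTo> elements are hypothetical specializations of <data>, and the addresses are placeholders.

<topic id="meetingNotice">
  <title>Meeting moved to Thursday</title>
  <prolog>
    <!-- Envelope addressing, available to a mail-handling process. -->
    <emailFrom value="alice@example.com"/>
    <emailTo   value="bob@example.com"/>
  </prolog>
  <body>
    <p>The planning meeting now starts at 10:00 on Thursday.</p>
  </body>
</topic>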

The following example specifies the delimited source code for a code fragment so that an automated process can refresh the code fragment. The <fragmentSource>, <sourceFile>, <startDelimiter>, and <endDelimiter> elements are specialized from <data>, but the <codeFragment> element is specialized from <codeblock>. These properties would not appear in the formatted output (except perhaps when debugging problems with the refresh):

<example>
    <title>An important coding technique</title>
    <codeFragment>
        <fragmentSource>
            <sourceFile     value="helloWorld.java"/>
            <startDelimiter value="FRAGMENT_START_1"/>
            <endDelimiter   value="FRAGMENT_END_1"/>
        </fragmentSource>
        ...
    </codeFragment>
</example>

The following example identifies a real estate property as part of a house description. The <realEstateProperty> element and everything it contains are specialized from <data>. The <houseDescription> element is specialized from <section>. A specialized process can format the values to identify the lot if appropriate for the brochure.

<houseDescription>
  <title>A great home for sale</title>
  <realEstateProperty>
    <realEstateBlock value="B7"/>
    <realEstateLot   value="4003"/>
    ...
  </realEstateProperty>
  <p>This elegant....</p>
  <object data="B7_4003_tour360Degrees.swf"/>
</houseDescription>