Chunking

Content may be chunked (divided or merged into new output documents) in different ways for the purposes of authoring, for delivering content, and for navigation. For example, something best authored as a set of separate topics may need to be delivered as a single Web page. A map author can use the chunk attribute to split up single documents into component topics or combine multiple topics into a single document as part of output processing.

Examples of use

Here are some examples of potential uses of the chunk attribute:

Reuse of a nested topic
A content provider creates a set of topics as a single document. A reuser wants to incorporate only one of the nested topics from the document. The reuser can reference the nested topic from a DITA map, using the chunk attribute to specify that the topic should be produced in its own document.
Identification of a set of topics as a unit
A curriculum developer wants to compose a lesson for a SCORM LMS (Learning Management System) from a set of topics without constraining reuse of those topics. The LMS can save and restore the learner’s progress through the lesson if the lesson is identified as a referenceable unit. The curriculum developer defines the collection of topics with a DITA map, using the chunk attribute to identify the learning module as a unit before generating the SCORM manifest.
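
As a minimal sketch of the first scenario, the map below references one nested topic and asks for it to be produced on its own. The file name provider.dita and the topic id "install" are hypothetical, chosen only for illustration; the tokens used are described later in this section.

```xml
<map>
   <!-- provider.dita and the id "install" are hypothetical names -->
   <topicref href="provider.dita#install"
      chunk="to-content select-topic"/>
</map>
```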

Usage of the chunk attribute

When a set of topics is transformed for output using a map, the map author may use the chunk attribute to override whatever default chunking behavior applies. The chunk attribute allows the map author to request that multi-topic documents be broken into multiple documents, and that multiple individual topics be combined into a single document.
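
The two requests can be sketched in one map, assuming hypothetical files guide.dita (a multi-topic document), overview.dita, and details.dita. The outcome shown in the comments is what the token definitions in this section suggest; actual results depend on the output type and the implementation.

```xml
<map chunk="by-topic">
   <!-- Split: with the by-topic policy set on the map, each topic
        in the multi-topic guide.dita becomes its own output chunk -->
   <topicref href="guide.dita"/>

   <!-- Combine: to-content requests one output chunk holding
        overview.dita together with the nested details.dita -->
   <topicref href="overview.dita" chunk="to-content">
      <topicref href="details.dita"/>
   </topicref>
</map>
```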

Chunking is necessarily specific to the output transformation: some types of output require chunked output, and others do not support it. Chunking is also implementation specific: an implementation may support some, but not all, of the chunking methods, or may add new implementation-specific chunking methods to the standard methods described in this specification.

The value of the chunk attribute consists of one or more space delimited tokens:
by-topic
When the chunk attribute value includes the “by-topic” token, a chunking policy is established for the current topicref element where a separate output chunk is produced for the target topic and each of its descendants. The policy only applies for a chunk action of the current element (for example, to-content), except when it is set on the map element, in which case the “by-topic” policy is established for the entire map.
by-document
When the chunk attribute value includes the “by-document” token, a chunking policy is established for the current topicref element where a single output chunk is produced for the referenced document. The policy only applies for a chunk action of the current element (for example, to-content), except when it is set on the map element, in which case the “by-document” policy is established for the entire map.
select-topic
When the chunk attribute value includes the “select-topic” token, an individual topic, without any other topics (ancestors, descendants, or peers) from within the same document, is selected.
select-document
When the chunk attribute value includes the “select-document” token, the content for the referenced topic, as well as any other topics (ancestors, descendants, or peers) contained within the same document, is selected.
select-branch
When the chunk attribute value includes the “select-branch” token, an individual topic, as well as any nested topics it contains, is selected.
to-content
When the chunk attribute value includes the “to-content” token, processing generates a new chunk of content.
to-navigation
When the chunk attribute value includes the “to-navigation” token, processing generates a new chunk of navigation (toc, related-links).
Note that the select-xxxxx token values are only useful when addressing a topic in a document that contains multiple topics.
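
The difference between the three select-xxxxx tokens can be sketched against a hypothetical document, multi.dita, in which topic A contains a nested topic A1 and topic B is a peer of A. The three topicref elements are alternatives, placed in one map only for comparison; the comments restate the token definitions above rather than the behavior of any particular processor.

```xml
<map>
   <!-- multi.dita is hypothetical: topic A contains nested topic A1,
        and topic B is a peer of A in the same document -->

   <!-- Alternative 1: selects A only, without A1 or B -->
   <topicref href="multi.dita#A" chunk="to-content select-topic"/>

   <!-- Alternative 2: selects A together with its nested topic A1 -->
   <topicref href="multi.dita#A" chunk="to-content select-branch"/>

   <!-- Alternative 3: selects every topic in multi.dita (A, A1, and B) -->
   <topicref href="multi.dita#A" chunk="to-content select-document"/>
</map>
```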

Some tokens or combinations of tokens may not be appropriate for all output types. When unsupported or conflicting tokens are encountered during output processing, warning or error messages should be produced. Recovery from such conflicts or other errors is implementation dependent.

There is no default value for the chunk attribute and the chunk attribute does not inherit values from container elements. A default for an entire map may be established by setting the chunk attribute on the map element.

When no chunk attribute values are given, chunking behavior is implementation dependent and may vary between implementations. When such variation is not desired, a default for the entire map may be established by including a chunk attribute value on the map element.
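
A map-wide default might look like the following sketch, in which the file names intro.dita and tasks.dita are hypothetical. Neither topicref carries a chunk attribute of its own, so the policy set on the map element is the only one in play.

```xml
<map chunk="by-document">
   <!-- No chunk attribute here: the by-document policy set on
        the map element applies to the entire map -->
   <topicref href="intro.dita"/>
   <topicref href="tasks.dita"/>
</map>
```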

When creating new documents via chunk processing, the storage object name or identifier (if relevant) is taken from the copy-to attribute if it is set. Otherwise, the root name is taken from the id attribute of the topic if the by-topic policy is in effect, and from the name of the referenced document if the by-document policy is in effect.
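
The naming rule can be sketched as follows. The file names and topic ids are hypothetical, the attribute is written copy-to, which is an assumption based on its name in the DITA map DTD, and the root names in the comments are what the rule above implies rather than the output of a specific processor.

```xml
<map>
   <!-- copy-to is set, so the root name is taken from it: "setup" -->
   <topicref href="install.dita#inst"
      chunk="to-content by-topic" copy-to="setup"/>

   <!-- No copy-to, by-topic policy: root name from the topic id, "cfg" -->
   <topicref href="configure.dita#cfg"
      chunk="to-content by-topic"/>

   <!-- No copy-to, by-document policy: root name from the
        referenced document, "reference" -->
   <topicref href="reference.dita"
      chunk="to-content by-document"/>
</map>
```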

Examples

The examples below assume the following documents: several single-topic documents (parent1.dita, parent2.dita, …, child1.dita, child2.dita, …, grandchild1.dita, grandchild2.dita) containing topics with ids P1, P2, …, C1, C2, …, GC1, GC2, …; several nested-topic documents (nested1.dita, nested2.dita, …), each containing two topics, a parent topic with id N1, N2, … and a child topic with id N1a, N2a, … nested within the parent; and ditabase.dita with the following contents:
<dita> 
   <topic id="X"/> 
   <topic id="Y"> 
      <topic id="Y1"> 
         <topic id="Y1a"/> 
      </topic> 
      <topic id="Y2"/> 
   </topic> 
   <topic id="Z"> 
      <topic id="Z1"/> 
   </topic> 
</dita> 
map1.ditamap:
<map chunk="by-document"> 
   <topicref href="parent1.dita" chunk="to-content"> 
      <topicref href="ditabase.dita#Y1" 
         chunk="select-topic"/> 
   </topicref> 
</map> 
Produces a single output document, parent1.xxxx, containing topic P1 with topic Y1, but not topic Y1a, nested within it.
map2.ditamap:
<map chunk="by-document"> 
   <topicref href="parent1.dita" chunk="to-content"> 
      <topicref href="ditabase.dita#Y1" 
         chunk="select-branch"/> 
   </topicref> 
</map> 
Produces a single output document, parent1.xxxx, containing topic P1, topic Y1 nested within topic P1, and topic Y1a nested within Y1.
map3.ditamap:
<map chunk="by-topic"> 
   <topicref href="parent1.dita" chunk="to-content"> 
      <topicref href="ditabase.dita#Y1" 
         chunk="select-document"/> 
   </topicref> 
</map> 
Produces a single output document, P1.xxxx, containing topic P1 and topics X, Y, and Z together with their children nested within topic P1.
map4.ditamap:
<map chunk="by-document"> 
   <topicref href="parent1.dita" copy-to="parentchunk"> 
      <topicref href="nested1.dita" chunk="select-branch"/> 
   </topicref> 
</map> 
Produces a single output document named parentchunk.xxxx containing topic P1 with topic N1 nested within P1 and topic N1a nested within N1.
map5.ditamap:
<map chunk="by-document"> 
   <topicref href="parent1.dita" 
      chunk="to-content" copy-to="parentchunk"> 
      <topicref href="child1.dita" chunk="select-branch"/> 
      <topicref href="child2.dita" 
         chunk="to-content select-branch" 
         copy-to="child2chunk"> 
         <topicref href="grandchild2.dita"/> 
      </topicref> 
      <topicref href="child3.dita"> 
         <topicref href="grandchild3.dita" 
            chunk="select-branch"/> 
      </topicref> 
   </topicref> 
</map> 
Produces two output documents: the P1, C1, C3, and GC3 topics in parentchunk.xxxx, and the C2 and GC2 topics in child2chunk.xxxx.
map6.ditamap:
<map> 
   <topicref href="nested1.dita#N1" copy-to="nestedchunk" 
      chunk="to-content select-topic"/> 
</map> 
Produces a single output document, nestedchunk.xxxx, which contains topic N1 with no topics nested within.
map7.ditamap:
<map> 
   <topichead navtitle="How to do lots of things" 
      chunk="to-navigation"> 
      <topicref href="parent1.dita" 
         navtitle="How to set up a web server"> 
         <topicref href="child1.dita" 
            chunk="select-branch"/> 
         ... 
      </topicref> 
      <topicref href="parent2.dita" 
         navtitle="How to ensure database security"> 
         <topicref href="child2.dita" 
            chunk="select-branch"/> 
         ... 
      </topicref> 
      ... 
   </topichead> 
</map> 
  
Produces two navigation chunks, one for P1, C1, … and a second for P2, C2, ….

The above example identifies a “how to” for setting up a product as a single unit. The “how to” might be provided both as navigable HTML pages and as a printable PDF attached to the root HTML page.

Implementation specific tokens and future considerations

Additional chunk tokens may be added to the DITA Standard in the future, and additional implementation-specific tokens may be defined as well. To avoid name conflicts between implementations or with future additions to the standard, an implementation-specific token should consist of a prefix that gives the name or an abbreviation of the implementation, followed by a colon, followed by the chunking method name. For example, “acme:level2” could be a token for the Acme DITA Toolkit that requests the “level2” chunking method.
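
In a map, such a token would sit alongside the standard tokens in the space-delimited chunk value. The sketch below reuses the illustrative “acme:level2” token from the paragraph above; the file name guide.dita is hypothetical, and what “level2” does would be defined entirely by the implementation.

```xml
<map>
   <!-- acme:level2 is an illustrative implementation-specific token;
        other processors would be expected to warn about it or ignore it -->
   <topicref href="guide.dita" chunk="to-content acme:level2"/>
</map>
```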

OASIS DITA Architectural Specification v1.1 -- Committee Draft 13 February 2007
Copyright © OASIS Open 2005, 2007. All Rights Reserved.