Crane's RELAX-NG schemas for OASIS UBL

G. Ken Holman

Crane Softwrights Ltd.

$Date: 2013/05/14 12:40:11 $(UTC)


Table of Contents

1. Introduction
2. Installation
3. Integration and use
4. Augmentations
5. Extensions
5.1. Restriction on ID/IDREF validation
Bibliography

1. Introduction

This is a package of RELAX-NG compact syntax schemas [RELAX-NG] suitable for validating and directed editing of OASIS Universal Business Language (UBL) [UBLTC] XML instances.

This release supports instances of UBL 2.0 [UBL 2.0] and UBL 2.1 PRD4 [UBL 2.1] simultaneously by detecting an instance's use of the <UBLVersionID> element. Though extensions are included, please review Section 5.1, “Restriction on ID/IDREF validation”.

Support for the nXML major mode [nXML] for Emacs is included.

2. Installation

Unzipping the package creates the base directory with one RELAX-NG compact schema for each UBL document type, plus the extensions.rnc common extension schema fragment used by every main schema.

Also included is ns-UBL-schemas.xml, which associates namespaces with schemas for directed editing tools such as Emacs with nXML. To engage this file in nXML, add the following to the schemas.xml file of locating rules:

  <include rules="ns-UBL-schemas.xml"/>
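
As a minimal sketch, a complete locating rules file containing only this include might look as follows. The locatingRules wrapper and its namespace follow the nXML locating rules format; the relative path is an assumption that ns-UBL-schemas.xml sits in the same directory as schemas.xml, which may not match your layout:

```xml
<?xml version="1.0"?>
<!-- schemas.xml: nXML locating rules (sketch; the path below is an assumption) -->
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- adjust the path if the UBL package is unpacked elsewhere -->
  <include rules="ns-UBL-schemas.xml"/>
</locatingRules>
```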

The versions/ subdirectory has all of the individually-versioned RELAX-NG schemas, each with one modification from those published by the UBL committee: the text value of the optional UBLVersionID element is hardwired to that schema's UBL version.

The aggregate document schema "UBL-AllModels.rnc" allows any UBL instance to be validated against the corresponding internally-referenced RELAX-NG document model. Note, however, that this aggregate schema is very large and may exceed the processing capacity of some tools.

3. Integration and use

To engage validation or directed editing of a UBL instance, point the application to the appropriate RNC file in the base directory. Alternatively, an application that is aware of the namespace association file automatically engages the required RNC file from the base directory.

Each RNC file in the base directory engages one or more modified schemas, one for each available version of the document type. The unmodified versions were created using the Sun MSV RELAX-NG converter [RNGCONV] and James Clark's Trang [Trang]. The modification ensures that the UBLVersionID element, when present, is populated with the version number of one of the available versions. Populating the element is all that is needed to constrain the remainder of the document to that version.

For example, consider this instance fragment, as edited in Emacs/nXML, that properly uses proposed UBL 2.1 elements:

<Invoice xmlns="urn:...>
  <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:X-Crane</cbc:CustomizationID>
  <cbc:ProfileID>urn:X-Crane:SimpleInvoice</cbc:ProfileID>
  <cbc:ID>A123</cbc:ID>
  <cbc:CopyIndicator>false</cbc:CopyIndicator>
  <cbc:IssueDate>2010-03-31</cbc:IssueDate>
  <cbc:InvoiceDueDate>2010-04-30</cbc:InvoiceDueDate>
  <cbc:Note>Example invoice; not bona fide.</cbc:Note>
  <cbc:DocumentCurrencyCode>CAD</cbc:DocumentCurrencyCode>
  ...

Simply editing the version identifier to 2.0, and doing nothing else, highlights the proposed InvoiceDueDate as an unknown element in error:

<Invoice xmlns="urn:...>
  <cbc:UBLVersionID>2.0</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:X-Crane</cbc:CustomizationID>
  <cbc:ProfileID>urn:X-Crane:SimpleInvoice</cbc:ProfileID>
  <cbc:ID>A123</cbc:ID>
  <cbc:CopyIndicator>false</cbc:CopyIndicator>
  <cbc:IssueDate>2010-03-31</cbc:IssueDate>
  <cbc:InvoiceDueDate>2010-04-30</cbc:InvoiceDueDate>
  <cbc:Note>Example invoice; not bona fide.</cbc:Note>
  <cbc:DocumentCurrencyCode>CAD</cbc:DocumentCurrencyCode>
  ...

To hardwire validation against one particular version of a UBL schema, use that version's schema directly from the versions/ subdirectory.

4. Augmentations

Two augmentations to UBL 2.1 are introduced in these RELAX-NG schemas:

  • the W3C Schema implicit xsi:schemaLocation= attribute is added to all document elements as an optional attribute; and

  • the optional UBLVersionID element, when present, is constrained to be the version number of the minor release of UBL:

    • this prevents users who enter an earlier minor revision number in this element from accessing this revision's elements that are not available in the earlier revision

    • this revision's elements are still available to be used in the instance when the UBLVersionID element is absent
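
The second augmentation can be pictured with the following simplified sketch. It is not copied from the package: it is written in the RELAX-NG XML syntax rather than the compact syntax the package ships, the pattern name is hypothetical, and the attributes that the real element allows are omitted for brevity:

```xml
<!-- Sketch only: a fragment meant to be included by a larger grammar -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <!-- hypothetical pattern name, chosen for illustration -->
  <define name="Example.UBLVersionID">
    <!-- the element remains optional, as in the committee schemas -->
    <optional>
      <element name="cbc:UBLVersionID">
        <!-- when present, only this version's number is accepted -->
        <value>2.1</value>
      </element>
    </optional>
  </define>
</grammar>
```

Under this sketch an instance declaring 2.0 would not match the 2.1 pattern, so it would be validated only against a schema whose hardwired value is 2.0.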

5. Extensions

The extensions.rnc file is delivered configured to validate an instance with the extensions found in the UBL distribution. All of the document models engage this single schema fragment in order to define the named pattern ExtensionContentDataType.

Additional extensions are engaged in this schema fragment by following the extensions.rnc documentation of the included standardized signature extension.

5.1. Restriction on ID/IDREF validation

The wildcard nature of validating and allowing unknown UBL extensions imposes a necessary restriction on RELAX-NG validation of ID/IDREF integrity. A wildcard cannot exclude known attributes of type xsd:ID because other unknown extensions may have similarly-named attributes that are not of type ID/IDREF.

As a result, the ID-ness of extension attributes of type xsd:ID cannot be validated. The easiest way to avoid the conflict is to change the attributes of type xsd:ID in all known extensions to be of type xsd:NCName. This has been done for the included UBL-standardized signature extension.
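
A minimal sketch of such a change follows, again in the RELAX-NG XML syntax rather than the compact syntax of the package. The pattern name is hypothetical, and the attribute name Id merely stands in for any ID-typed attribute of an extension:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- hypothetical pattern name, chosen for illustration -->
  <define name="Example.IdAttribute">
    <optional>
      <attribute name="Id">
        <!-- was: <data type="ID"/> -->
        <!-- NCName keeps the lexical constraint but drops uniqueness checking -->
        <data type="NCName"/>
      </attribute>
    </optional>
  </define>
</grammar>
```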

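If changing the extension schemas is not possible, then run validation without ID/IDREF checking engaged. This, of course, produces the same result: the values are accepted, but their uniqueness and references are not checked.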

This constraint of an attribute of type ID not being allowed to be a descendant of a wildcard element is described at: http://www.oasis-open.org/committees/relax-ng/compatibility.html#id where it reads: "the first child of the element ancestor is a name element" (which is to say it cannot be a wildcard).

Note

The core UBL specification does not include any ID/IDREF constraints, so this validation restriction does not affect standard non-extension UBL content. As of this writing, only the standardized signature extension includes ID/IDREF validation, and only in those included schema fragments that are not under the UBL committee's purview. These signature-related fragments are standardized by the W3C.

Bibliography

[nXML] James Clark. nXML mode home page.

[RELAX-NG] James Clark, Makoto Murata. ISO/IEC 19757-2 RELAX-NG (Regular Language for XML).

[UBL 2.1] Jon Bosak, Tim McGrath, G. Ken Holman. Universal Business Language (UBL) Version 2.1 PRD4. OASIS UBL Technical Committee. 2013.

[UBLTC] Jon Bosak, Tim McGrath. OASIS UBL Technical Committee. 2001.