Commentary on Guidelines for Filenames, URIs, Namespaces, and Metadata

Editor: Robin Cover
Date: 2008-10-09
Version: 08
Latest Version: http://docs.oasis-open.org/specGuidelines/namingGuidelines/resourceNamingCommentary.html
This Version: http://docs.oasis-open.org/specGuidelines/namingGuidelines/resourceNamingCommentaryV08.html
Status: The OASIS Board has approved the Guidelines for use as of 2006-08-02. This commentary document augments OASIS Naming Guidelines Part 1: Filenames, URIs, Namespaces. OASIS Staff is authorized to update the guidelines from time to time as needed. This commentary text is under revision (more examples needed).
Review Comments: send to robin@oasis-open.org
Document Supplement: OASIS Naming Guidelines: Metadata and Versioning


Note: For Technical Committee work governed by the OASIS Technical Committee Process effective October 15, 2010, please see the document OASIS Naming Directives, which updates the naming rules to match the revised TC Process. Information in this document may still be useful for background (e.g., rationale for vintage-2008 decisions) and other historic purposes. For most other purposes, the OASIS Naming Directives document supersedes this document (OASIS Naming Guidelines, In Two Parts and commentary).



Introduction

Between February 2003 and February 2006, at least twenty-one (21) numbered drafts relating to "OASIS Naming Guidelines" were produced under an informal process, for review by OASIS Members, Chairs, TAB, Board, and others. These drafts have several different titles, suggesting variable focus and scope.

This document seeks closure on a few key decisions that need to be made in order to proceed with design and development of new document management facilities to support resources in the OASIS Open Library. The document editor recognizes that consensus will never be reached on a much longer list of naming issues about which stakeholders have strong opinions and disagreements.

The document provides a summary of key issues raised by members of the TAB in its recent production of draft documents on "artifact" guidelines, as well as some issues raised by reviewers of AIR/ASIS, including OASIS Staff. On August 02, 2006, the OASIS Board of Directors approved the Naming Guidelines for use by OASIS Staff and Members. Further work is underway on issues in the sections labeled "Issues Requiring Further Discussion."

Not all topics addressed by the TAB during the period 2003-02 through 2006-02 are mentioned in this summary document: additional background, including theoretical matters and formal notations, is available in early drafts and supplemental materials, most recently in Artifact Standard Identification Scheme for Metadata 1.0.

This document uses terms familiar to users of the Web: "resource" and "URI", along with "file" and sometimes "document", "specification", and "directory" — in preference to the abstract term "artifact".

Scope and Applicability

Name Characters

"Name Characters" here refers to characters used in URIs — including filenames, directory names, colon- or slash-delimited components within namespace URIs, delimiters, and possibly other URI subcomponents as may be labeled.

Beginning with one of the earliest drafts (Proposed Rules for OASIS Document File Naming, Working Draft 02, 18-February-2003), contributors to the twenty-some versions of the OASIS naming guidelines have agreed that a restricted character inventory for published names would best serve the needs of the organization. Experience using the Kavi system has confirmed that users, when unconstrained, are likely to publish documents using problematic characters and character patterns in filenames/URIs, creating risks to interoperability and data integrity. Some such characters require hex (escape) representation because they are "Reserved Characters" in URI syntax, while others present risks because they are meaningful to the shell. Potentially problematic characters include the at-sign (@), ampersand (&), left and right parentheses, tilde (~), hash/pound-sign (#), dollar-sign ($), left and right square brackets, plus-sign (+), colon (:), and semicolon (;), among others.

While technical solutions are available to minimize problems arising from potentially problematic characters and character sequences, common best practice guidance urges avoiding them altogether; this conclusion has been supported in all twenty-some AIR/ASIS drafts and in the two OASIS member reviews.
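
As an illustration of the concern described in this section, the following minimal Python sketch flags the characters named above and shows the hex (percent) escaping that a URI would need for them. It is not part of the Guidelines: the function name find_problem_chars, the PROBLEM_CHARS set, and the sample filenames are hypothetical, and the set covers only the characters explicitly listed in the text (the list there is open-ended), not the approved character inventory.

```python
from urllib.parse import quote

# Characters named in this section as potentially problematic.
# The source list is open-ended, so this set is illustrative, not complete.
PROBLEM_CHARS = set("@&()~#$[]+:;")


def find_problem_chars(name):
    """Return the flagged characters found in name, in order of first appearance."""
    found = []
    for ch in name:
        if ch in PROBLEM_CHARS and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    # Hypothetical filenames, used only to exercise the check.
    for candidate in ["ws-example-1.1-spec.pdf", "draft#2 (final)&notes.doc"]:
        flagged = find_problem_chars(candidate)
        # quote() with safe="" percent-encodes every character outside the
        # letters, digits, and "_.-~" that urllib always leaves unescaped.
        print(candidate, flagged, quote(candidate, safe=""))
```

A check of this kind only reports a problem; consistent with the best-practice guidance above, the simpler remedy is to avoid such characters when the name is first chosen rather than to rely on escaping afterward.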

Approved for Use

Issues Requiring Further Discussion

Name Construction

"Name construction" here refers to the lexical and syntactic structure of names, given the restricted character inventory. Motivations for the constraints include concerns for fidelity of interchange across file systems, minimizing the risks of common text-processing errors, usability (visual clarity), and other data QA. In other cases, arbitrary restriction of unbounded variability serves the goal of simplicity through uniformity.

Approved for Use

Issues Requiring Further Discussion

URI Design

Approved for Use

Issues Requiring Further Discussion

XML Namespace Design, Allocation, and Management

Approved for Use

Issues Requiring Further Discussion

Versioning Issues

Document Supplement 2006-09-30: See now OASIS Naming Guidelines: Metadata and Versioning.

Issues Requiring Further Discussion

Metadata

Document Supplement 2006-09-30: See now OASIS Naming Guidelines: Metadata and Versioning.

[Partly obsoleted as of 2006-09-30] ASIS sections on metadata have been removed in this document, as metadata design has been targeted for work as a separate design effort, to be revisited following the conclusion of OASIS Staff design on specification templates, search requirements, and other functional requirements that are part of the document management system design. Results from this design will be incorporated into the "Guidelines for Filenames, URIs, Namespaces, [and Metadata]" document at a later stage.

A significant conclusion emerged from the two public reviews of AIR (July 2005) and ASIS (February 2006): the OASIS membership does not welcome a policy mandating the use of structured filenames which use hyphen-delimited metadata components, IETF-style. In some cases it may be natural and desirable to use some "metadata" information in filenames and URIs, but we heard strong negative reaction against the early proposal to make a componentized schema required. The current plan is to coordinate investigation about metadata requirements around site-wide search functionality, then to align the metadata model(s) with usage in specification templates and (other) markup embedding guidelines. Further support for (optional use of) componentized filenames might be reconsidered later (e.g., when it would make sense to generate suggested filenames from a metadata record).
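
To make the idea of generating a suggested filename from a metadata record concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names (product, part, version, stage), their order, the lowercasing, and the function name suggest_filename are hypothetical and do not reflect any OASIS schema or the componentized proposals reviewed in AIR/ASIS.

```python
import re

# Hypothetical field order for this sketch; not an OASIS rule.
FIELD_ORDER = ["product", "part", "version", "stage"]


def suggest_filename(record, extension="html"):
    """Join the non-empty metadata values with hyphens to form a suggested name."""
    parts = []
    for field in FIELD_ORDER:
        value = str(record.get(field, "")).strip().lower()
        # Drop everything except letters, digits, and periods so that
        # hyphens appear only as delimiters between components.
        value = re.sub(r"[^a-z0-9.]+", "", value)
        if value:
            parts.append(value)
    return "-".join(parts) + "." + extension


if __name__ == "__main__":
    # Hypothetical metadata record.
    record = {"product": "Example Spec", "version": "1.0", "stage": "wd03"}
    print(suggest_filename(record))
```

Because the result is only a suggestion, an author remains free to choose a different name, which is in keeping with the membership's preference that componentized filenames be optional rather than required.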

General Notes

Preliminary Notes

Notes for the Introduction

Notes for the Scope and Applicability Section

Notes for the Name Characters Section

Name Construction

Notes for the URI Design Section

Notes for XML Namespace Design

Notes for the Versioning Section

References