Editor: Robin Cover
Latest Version: http://docs.oasis-open.org/specGuidelines/namingGuidelines/resourceNamingCommentary.html
This Version: http://docs.oasis-open.org/specGuidelines/namingGuidelines/resourceNamingCommentaryV07.html
Status: The OASIS Board has approved the Guidelines for use as of 2006-08-02; OASIS Staff is authorized to update the guidelines from time to time as needed. This commentary text is under revision (more examples needed).
Review Comments: send to firstname.lastname@example.org
Current Activity: produce a versioned condensed format with just the rules; add examples and commentary
Document Supplement: OASIS Naming Guidelines: Metadata and Versioning
Between February 2003 and February 2006, at least twenty-one (21) numbered drafts relating to "OASIS Naming Guidelines" were produced under an informal process, for review by OASIS Members, Chairs, TAB, Board, and others. These drafts have several different titles, suggesting variable focus and scope.
This document seeks closure on a few key decisions that need to be made in order to proceed with design and development of new document management facilities to support resources in the OASIS Open Library. The document editor recognizes that consensus will never be reached on a much longer list of naming issues about which stakeholders have strong opinions and disagreements.
The document provides a summary of key issues raised by members of the TAB in its recent production of draft documents on "artifact" guidelines, as well as some issues raised by reviewers of AIR/ASIS, including OASIS Staff. On August 02, 2006, the OASIS Board of Directors approved the Naming Guidelines for use by OASIS Staff and Members. Further work is underway on issues in the sections labeled "Issues Requiring Further Discussion."
Not all topics addressed by the TAB during the period 2003-02 through 2006-02 are mentioned in this summary document: additional background, including theoretical matters and formal notations, is available in early drafts and supplemental materials, most recently in Artifact Standard Identification Scheme for Metadata 1.0.
This document uses terms familiar to users of the Web: "resource" and "URI", along with "file" and sometimes "document", "specification", and "directory" — in preference to the abstract term "artifact".
"Name Characters" here refers to characters used in URIs — including filenames, directory names, colon- or slash-delimited components within namespace URIs, delimiters, and possibly other URI subcomponents as may be labeled.
Beginning with one of the earliest drafts (Proposed Rules for OASIS Document File Naming, Working Draft 02, 18-February-2003), contributors to the twenty-some versions of the OASIS naming guidelines have agreed that a restricted character inventory for published names would best serve the needs of the organization. Experience using the Kavi system has confirmed that users (unconstrained) are likely to publish documents using problematic characters and character patterns in filenames/URIs, creating risks to interoperability and data integrity. Some such characters require hex (escape) representation because they are "Reserved Characters" in URI syntax, while others present risks because they are meaningful to the shell. Some of these potentially problematic characters include the at-sign (@), ampersand (&), left and right parenthesis, tilde (~), hash/pound-sign (#), dollar-sign ($), left and right square-bracket, plus-sign (+), colon (:), semicolon (;), etc.
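As an illustration of the kind of check this implies, the following minimal Python sketch flags the characters listed above in a single file or directory name. The function name and the sample names are hypothetical, and the character set contains only the characters named in this commentary; it is not the normative inventory from the Guidelines. It is meant for one path segment at a time, not for a whole URI (where the colon legitimately follows the scheme).

```python
# Characters named in this commentary as potentially problematic.
# Assumption: this is an illustrative subset, not the normative list.
PROBLEMATIC_CHARS = set("@&()~#$[]+:;")


def find_problematic_chars(name: str) -> list[str]:
    """Return the problematic characters in `name`, in order of first appearance."""
    found = []
    for ch in name:
        if ch in PROBLEMATIC_CHARS and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the check.
    for candidate in ["xacml-core-spec.pdf", "draft(v2)&notes.doc"]:
        print(candidate, find_problematic_chars(candidate))
```

A check of this kind could run at upload time, so that a contributor learns about a problematic name before it becomes a published URI.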
While technical solutions are available to minimize problems arising from potentially problematic characters and character sequences, common best practice guidance urges avoiding them altogether; this conclusion has been supported in all twenty-some AIR/ASIS drafts and in the two OASIS member reviews.
"Name construction" here refers to the lexical and syntactic structure of names, given the restricted character inventory. Motivations for the constraints include concerns for fidelity of interchange across file systems, minimizing the risks of common text-processing errors, usability (visual clarity), and other data QA. In other cases, arbitrary restriction of unbounded variability serves the goal of simplicity through uniformity.
Document Supplement 2006-09-30: See now OASIS Naming Guidelines: Metadata and Versioning.
[Partly obsoleted as of 2006-09-30] ASIS sections on metadata have been removed in this document, as metadata design has been targeted for work as a separate design effort, to be revisited following the conclusion of OASIS Staff design on specification templates, search requirements, and other functional requirements that are part of the document management system design. Results from this design will be incorporated into the "Guidelines for Filenames, URIs, Namespaces, [and Metadata]" document at a later stage.
A significant conclusion emerged from the two public reviews of AIR (July 2005) and ASIS (February 2006): the OASIS membership does not welcome a policy mandating the use of structured filenames which use hyphen-delimited metadata components, IETF-style. In some cases it may be natural and desirable to use some "metadata" information in filenames and URIs, but we heard strong negative reaction against the early proposal to make a componentized schema required. The current plan is to coordinate investigation about metadata requirements around site-wide search functionality, then to align the metadata model(s) with usage in specification templates and (other) markup embedding guidelines. Further support for (optional use of) componentized filenames might be reconsidered later (e.g., when it would make sense to generate suggested filenames from a metadata record).
<Robin Cover and Mary McRae joined the meeting for this discussion.>
Robin Cover presented the OASIS Naming Guidelines and requested approval for trial implementation.
RESOLUTION 2006-08-02.03: RESOLVED that the OASIS Naming Guidelines (Draft Guidelines for Filenames, URIs, Namespaces [and Metadata]) v06, dated 2006-07-20 and referenced below, are approved for use by OASIS Staff and Members, with the removal of each instance of the text "Issues Resolved, Near Resolution, or with Substantial Agreement" from the guidelines document, [and further RESOLVED that OASIS staff may update these guidelines from time to time as needed.]
- Draft Guidelines for Filenames, URIs, Namespaces [and Metadata] http://docs.oasis-open.org/specGuidelines/namingGuidelines/resourceNaming.html
- Commentary on Draft Guidelines for Filenames, URIs, and Namespaces http://docs.oasis-open.org/specGuidelines/namingGuidelines/resourceNamingCommentary.html
RESOLUTION 2006-08-02.03a (TO AMEND): RESOLVED, that the main motion is amended to add the following text: "and further RESOLVED that OASIS staff may update these guidelines from time to time as needed." Passed unanimously.
MAIN MOTION: Passed unanimously
The relationship between implementation of these guidelines and the current Kavi system was discussed. It was clarified that the guidelines would be implemented on docs.oasis-open.org, and that some guidelines cannot be supported by the current system.
ACTION ITEM 2006-08-02.01: Robin Cover and staff to report back to the Board on naming convention implementation in 6 months.
Design document titles: An incomplete listing for various versions of "AIR", variously titled:
Note on nomenclature: this document does not feature the term "artifact" (nor "requirements" nor "deliverable"), as was used in several earlier draft proposals (Artifact Identification Guidelines, Artifact Identification Requirements, Artifact Standard Identification Scheme for Metadata). Feedback from reviewers of ASIS indicated that artifact probably does not represent a central concept, even if some of its defined characteristics are useful. This document uses the more familiar Web terms "resource" and "URI", along with "file" and sometimes "document", "specification", and "directory" (as a hierarchical element matching a URI path component). A taxonomy of resource types (previously described in ASIS as "artifact types") will be considered separately as part of the metadata design effort. The notion of an abstract identifier may need to be modeled to account for: (a) constituent parts of a compound document; (b) nearly "identical" representations of a document, differing only in format [XHTML, XML, ODF, PDF, Postscript, plain text]; (c) names of package files which store collections of resources making up [part/whole] a complete specification or otherwise aggregating related documents forming a single published release; (d) etc.
See Upgrading OASIS document and file management services, posted by Peter Roden November 18, 2005: "We plan exclusively to use the [Internet] domain 'docs.oasis-open.org' for public access to approved work product of its technical committees. The 'docs' subdomain is in optional use today. By 'approved', we mean all work that has been approved under our TC Process rules as a Committee Draft, Public Review Draft, Committee Specification or OASIS Standard..."
While the collection of naming rules is intended to apply to all resources deposited into the OASIS Open Library, rules may apply variably to different document genres, file formats, and as a function of specification status. Thus, while rules for allowable characters in file and directory names would apply universally, rules governing namespace definition would be applied differently to contributed specifications vs. TC-approved specifications. Any such distinctions will be documented clearly.
Methods for uploading and installing resources in the OASIS Open Library include use of compressed archives or packages like ZIP and tar+gzip. It is expected that filenames and directory names created by a package extract operation will conform to the naming rules just as if the files were uploaded individually. In order to make all resources directly visible to human users (not requiring a download + extract-on-local-machine operation) and accessible to indexing for search purposes, all files in packages will be extracted and installed in the named directories. All package files uploaded to the OASIS Open Library will also be retained in package format at the canonical URI.
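The expectation that extracted names conform "just as if the files were uploaded individually" suggests checking a package before it is installed. The following minimal Python sketch, using the standard zipfile module, tests every path component of every ZIP member against a naming rule. The helper names, the sample archive, and the placeholder rule (reject spaces) are assumptions for illustration, not the OASIS implementation.

```python
import io
import zipfile


def nonconforming_members(package_bytes: bytes, is_conforming) -> list[str]:
    """Return archive member paths that contain a nonconforming path component."""
    bad = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        for member in archive.namelist():
            # ZIP member names use "/" as the separator; skip empty components.
            components = [part for part in member.split("/") if part]
            if not all(is_conforming(part) for part in components):
                bad.append(member)
    return bad


def no_spaces(component: str) -> bool:
    """Placeholder rule for illustration only: reject names containing SPACE."""
    return " " not in component


def build_sample_package() -> bytes:
    """Build a small in-memory ZIP with hypothetical member names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("spec/core-schema.xsd", "<schema/>")
        archive.writestr("spec/read me.txt", "notes")
    return buffer.getvalue()


if __name__ == "__main__":
    print(nonconforming_members(build_sample_package(), no_spaces))
```

Passing the rule in as a function keeps the archive walk separate from whatever character and construction rules are finally adopted.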
See Eve Maler in Proposed Rules for OASIS Document File Naming, February 2003: "Hyphens must be used as separators of the major portions of a file name. Spaces must not be used. Hyphens are recommended between words within the description and extended description portions, though underscores may be used. Hyphens are preferred because they are easier to see in displayed URIs and easier to type. Lowercase spelling is recommended..."
This document specifies characters for use in the URI Path and Fragment components, but does not address characters allowable in the Scheme and Authority components (e.g., http://docs.oasis-open.org/ ). The characters "?" (question-mark) and "=" (equals) may be recommended at some future time for use in the Query component of a URI, should OASIS provide implementations that use such query elements.
TC members involved in naming are encouraged to consider the context in which URIs are likely to be used; in some print media, the UNDERSCORE character is indistinguishable from other "blank" characters, and in the context of common Web practice, may be ambiguous. See the following note.
The most common strategies for creating names from a sequence of words or morphemes (closed compounds) include use of an explicit delimiter character (e.g., HYPHEN) or marking juncture by camel case. Both strategies are intended to enhance readability for the human user. In some programming languages (by no means all), the underscore character may be used to join word components. This usage is probably benign. In the context of the World Wide Web, where use of the [not-hex-escaped] SPACE character within filenames (thus URIs) is exceedingly popular, the use of the underscore character to mark juncture may be deleterious, since typically it will be rendered as an ambiguous BLANK character in certain print media.
The OASIS web server(s) used for resources in the OASIS Open Library will respect the authoritative, canonical, exact (mixed-case) spelling used in official OASIS URIs, viz., in the path and filename components of the URI. Protecting the quality of URIs and the identity of URI-addressable resources depends critically upon respecting case: Unicode, used in XML and almost all modern applications, is case-sensitive, so that 'foo' and 'Foo' as identifiers are different. Most XML processing depends upon respect for case (XML schema, XML DTDs, etc). Therefore, using the exact (normative, canonical, authoritative) correct character tokens, including correct case, with respect to subdirectories and files is critical. The Apache server as currently configured is doing the right thing: rejecting requests for approximate URIs. We may provide assistance for 404s but will not deliver mis-identified documents silently.
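The behavior described here (exact-case match or 404, with optional assistance) can be sketched as follows in Python. This is a toy model, not the Apache configuration: the function name, the tuple return shape, and the sample path are hypothetical.

```python
def resolve(path: str, published: set[str]):
    """Return (200, path) on an exact, case-sensitive match.

    Otherwise return (404, hints), where hints are published paths that
    differ from the request only in letter case. Hints are offered as
    assistance; the near-match resource itself is never served.
    """
    if path in published:
        return 200, path
    hints = sorted(p for p in published if p.lower() == path.lower())
    return 404, hints


if __name__ == "__main__":
    # Hypothetical published path.
    published = {"/xacml/Core-Spec.html"}
    print(resolve("/xacml/Core-Spec.html", published))
    print(resolve("/xacml/core-spec.html", published))
```

The second request is rejected even though only its case differs; the hint list is the "assistance for 404s" and leaves the choice to the user.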
File names reserved for (future) administrative use include any files significant to the Apache server (e.g., .htaccess; *.cgi; *.conf or matching any Apache config files; mime.types) and files used by Staff for uniform browsing/navigation (e.g., index.html, index.htm, etc). A complete list must be provided.
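Until a complete list is provided, a check against the examples given above might look like the following Python sketch. The pattern list contains only the names and wildcards mentioned in this commentary and is therefore incomplete; the function name and test file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Only the examples named in this commentary; the complete list is still to be provided.
RESERVED_PATTERNS = [
    ".htaccess",
    "*.cgi",
    "*.conf",
    "mime.types",
    "index.html",
    "index.htm",
]


def is_reserved(filename: str) -> bool:
    """Return True if `filename` matches a reserved name or pattern (case-sensitive)."""
    return any(fnmatchcase(filename, pattern) for pattern in RESERVED_PATTERNS)


if __name__ == "__main__":
    for name in ["index.html", "search.cgi", "ubl-index.html"]:
        print(name, is_reserved(name))
```

The match is made case-sensitive with fnmatchcase, in keeping with the case-sensitive treatment of URIs on the server.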
The goal of the naming guidelines is to provide a set of loose constraints under which TCs can adopt naming practices suitable to their application. In boundary cases, where some naming construct is judged problematic for technical, political, or social reasons, the TC Administration will attempt to negotiate an acceptable solution that avoids the problem, but in some cases may need to exercise authority; such a decision may be appealed by a TC.
A tcShortName is the abbreviated name assigned to a TC by the TC Administration, designed for use in URIs. Examples:
|TC Full Name||tcShortName|
|OASIS eXtensible Access Control Markup Language (XACML) TC||xacml|
|OASIS Universal Business Language (UBL) TC||ubl|
|OASIS Security Services (SAML) TC||security|
|OASIS Entity Resolution TC||entity|
|OASIS HumanMarkup TC||humanmarkup|
Once assigned to a resource, an identifier (URI) should never be retired and re-assigned to some other resource: the relationship between identifier and resource should be considered fixed and unseverable. This applies to primary resources (e.g., a conceptual whole document) and to secondary resources associated with a fragment identifier component of a URI [post-pound # fragment portion]. Some CMS products [Moin Wiki] will rewrite the value of an (X)HTML ID attribute when a document is saved, breaking URI references that link to internal document components. Corollary, because we do not break hyperlinks by any means: "fixed and unseverable" also means that once a URI is published for a resource, that resource must not be deleted.
The rule prohibiting the semantic overloading of a URI at the point of contention between a regular directory URI and NS URI attempts to support transparent visibility of all resources through directory index views, uniformity of user experience when dereferencing a directory URI, and uniform server behavior at the point of a NS URI. As of 2007-03, some OASIS Technical Committees have installed resources in directories which (when transparently exposed via directory indexes) violate this rule: they will be grandfathered and supported.
Comments received during the ASIS review and afterward addressed the related questions visible in the Commentary, introduced by the following sentence: "Given the possible dual use of an HTTP scheme URI as (a) a namespace URI and (b) an identifier for a directory node, it seems reasonable to clarify expectations about server behaviors with respect to dereferencing HTTP scheme namespace URIs and about possible conflicts arising from contention/overloading..." Suggestions from reviewers have converged on a consensus that resources (typically files) should not be stored in directory foo/ if the NS URI is foo or foo/.
Here is a short explanation; a longer design note is available from TC Administration for parties needing to understand more about this rule selection.
An important part of the OASIS Library Design is the determination to expose (transparently) the contents of all directories using auto-generated HTML index pages that match each directory URI. See, under URI Design: "[directory] contents will be made publicly viewable via standard indexes and other navigation/browse facilities."
We face an obvious conflict if a TC elects to use a Type 1 (slash) namespace URI: HTTP scheme namespace URIs must resolve to some informative resource, ideally meeting the requirements for a (RDDL) namespace document. Since the server will dereference from the slash-type namespace URI to the (RDDL) namespace document, no directory listing matching that foo/ URI can be delivered to the user. If the TC stores physical content (files) in the foo/ directory, they will not be exposed transparently at the expected directory-index URI. Therefore, we forbid installation of content in the foo/ directory.
If a TC elects to create a Type 3 (simple/slashless) NS URI, it becomes theoretically possible to serve the RDDL namespace document in response to a request for foo, while serving up a directory listing in response to a request for foo/. Such server behavior seems undesirable for two reasons. First, users are accustomed to the widespread, common server configuration which delivers the same result for a directory URI, with or without final slash. Servers need to be configured with the "directory slash fixup redirect" to support proper construction of relative reference URIs. It will come as a surprise (and possibly an annoyance) if users encounter an odd situation in which "slash matters" at the point of a directory: they will have to figure out which URI is which, typically without understanding why, and they will not remember to use the particular syntax required in these odd cases. Second, implementing the theoretical server behavior will mean delivering a different result for Type 1 and Type 3 URIs where there is URI contention for interpretation as a directory URI. In view of the desired uniformity in user experience, this represents a poor choice: it's better to avoid collision and semantic overloading by adopting a URI design that avoids any point of possible contention and confusion. We achieve that goal by installing no files in foo/ in the case where the NS URI terminates in foo/ or foo.
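The resulting rule (no content in foo/ when the NS URI ends in foo/ or foo) lends itself to a mechanical check. The Python sketch below treats a Type 1 and a Type 3 namespace URI identically, as the rule requires. The function name and the example URIs are hypothetical, and counting resources nested more deeply beneath foo/ as violations is this sketch's own reading, not something the commentary states.

```python
def violates_namespace_rule(ns_uri: str, resource_uri: str) -> bool:
    """Return True if `resource_uri` is stored beneath the directory named by `ns_uri`.

    A slash-terminated (Type 1) and a slashless (Type 3) namespace URI are
    normalized to the same directory URI before comparison.
    Assumption: anything beneath that directory counts, including nested paths.
    """
    directory = ns_uri.rstrip("/") + "/"
    return resource_uri.startswith(directory) and resource_uri != directory


if __name__ == "__main__":
    # Hypothetical namespace and resource URIs.
    ns = "http://docs.oasis-open.org/example/ns/foo"
    print(violates_namespace_rule(ns, ns + "/schema.xsd"))
    print(violates_namespace_rule(ns, "http://docs.oasis-open.org/example/ns/foobar/schema.xsd"))
```

Appending the slash before the prefix test prevents a sibling directory such as foobar/ from being mistaken for content inside foo/.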
Suggestions have been made by various parties to formally rescind IETF RFC 3121 (June 2001, "A URN Namespace for OASIS") because the principal author no longer supports its proposals or sees value in URNs. Furthermore, OASIS has no current commitment to provide what the RFC describes: "A more interactive, online resolution system will also be deployed in the near future." There is no current plan, as stated, to "distribute catalogs (OASIS TR9401 Catalogs) that map the assigned URNs to resource identifiers (e.g., URLs)." Based upon public feedback provided for the TAB's ANG/AIR/ASIS documents, Staff concluded that resolution support for HTTP scheme NS URIs (the ability to provide information about a namespace URI at the URI) makes the use of (RDDL) namespace documents in connection with HTTP scheme NS URIs a strong desideratum.
See for example DocBook : "Historically, DocBook was in no namespace. Starting with DocBook V5.0, DocBook is in a namespace: http://docbook.org/ns/docbook, the namespace name for DocBook. In time, other modules may also have their own namespace."
W3C Licenses New Namespace URI Pattern for Use in Technical Reports: On September 07, 2006, Ian Jacobs, team contact for the W3C Advisory Board and editor of the W3C Process Document, announced an update to the document "URIs for W3C Namespaces" which authorizes the optional use of a new namespace URI pattern. In the usage category "Namespace URIs in Recommendation Track Documents, Group Notes, and other Working Drafts," the document now licenses a URI with the form http://www.w3.org/ns/ssss for use, where "ssss" is a short string not causing confusion, alarm, or embarrassment. For instance, the short string should not cause confusion when used in both http://www.w3.org/TR/ssss and http://www.w3.org/ns/ssss URIs. The W3C document also specifies that in all Member and Team Submissions: (1) Namespace URIs MUST be dereferenceable, and (2) Namespace Documents MUST describe the relationship between the defining specification and the namespace URI. A Namespace Document describes the namespace, providing directly or by reference information for human and also, ideally, machine consumption. A Namespace Document is available for retrieval using a corresponding namespace URI. When a namespace URI appears in a Recommendation Track document, the responsible group MUST publish a corresponding Namespace Document.
Some TCs want to hard-link to XML schemas from namespace URIs rather than to separate "namespace documents". We can honor that option by saying in the rules that the namespace URI MUST resolve rather than return a harsh 404 status code. The WebArch document's section on Namespace documents notes that there are many methods of accomplishing the effect of a namespace document, including documents based upon XML Schema (XSDs), to follow the "Good practice" on namespace documents: "The owner of an XML namespace name SHOULD make available material intended for people to read and material optimized for software agents in order to meet the needs of those who will use the namespace vocabulary." The W3C document says: "the following are examples of data formats for namespace documents: OWL Web Ontology Language Reference (OWL), Resource Directory Description Language (RDDL), XML Schema Part 1: Structures (XML Schema), and XHTML 1.1 - Module-based XHTML. Each of these formats meets different requirements described above for satisfying the needs of an agent that wants more information about the namespace..." It seems quite reasonable that an HTTP scheme namespace URI (namespace name, URI reference) should resolve to something useful and informative, whether an XML schema or other representation which fulfills the general requirement of "useful information." TCs should be able to indicate the resource to be delivered when the URI is dereferenced; we expect that resource to be located under the TC's web site root. If the TC designates/provides no such resource, OASIS TC Administration would do so.
Several solutions have been offered for a required or recommended practice of identifying "versions" of specifications using words and enumerators. One scheme would apply the term "revision" to any intermediate non-approved documents between major status levels — where '#' is a digit:
SDOs/SSOs commonly publish a "Latest Version: " (version-agnostic) URI for specifications that are developed in stages over a long period of time, allowing collaborators and reviewers to create stable bookmarks and hyperlinks that can be assured to get the most recent/current release of an evolving document. See references below to this practice as implemented by W3C, WS-I, the Unicode Consortium, Dublin Core Metadata Initiative (DCMI), and the RDDL Spec Development Team. Similarly, a "Latest Version: " URI is useful for documents like a TC Issues List or Minutes, where team members need a stable URI reference for the most recent published instance. This feature has been requested by numerous TCs for other application scenarios in which changing interdependent URIs in each release is impractical.
The OASIS Open Library currently supports the "Latest Version: " URI in several contexts, though not uniformly; as of 2006-07-01, no common method for designing these URIs had yet been adopted. For issues lists, see these examples:
|TC||Latest Version URI||(Example) Version-Specific URI||History|
The label "Latest version" URI functioning as the constant/fixed/permanent URI can be confusing — depending upon whether one thinks of a "version URI" as an identifier or as a locator. However, it's clear what "Latest" means when one sees the customary usage in context, for example in any of the following sample documents from W3C, WS-I, Unicode Consortium, DCMI, etc.:
W3C: Web Services Choreography Description Language Version 1.0. This version: http://www.w3.org/TR/2004/WD-ws-cdl-10-20041217/; Latest version: http://www.w3.org/TR/ws-cdl-10/. Previous version: http://www.w3.org/TR/2004/WD-ws-cdl-10-20041012/. The 'Latest version' URI is the permanent/fixed URI alias that always points to the most recent 'This version' instance file as the W3C specification progresses through different maturity levels: Working Draft, Candidate Recommendation, Proposed Recommendation, Recommendation.
W3C: Synchronized Multimedia Integration Language (SMIL 2.1). Slightly more complex case; same idea, same principles, same implementation.
Unicode Consortium: Unicode Named Character Sequences
Dublin Core (DCMI): Dublin Core Metadata Element Set, Version 1.1: Reference Description
RDDL Spec Team: Resource Directory Description Language (RDDL)
OASIS: Draft OASIS Specification Templates. [needs discussion]
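The alias relationship in the W3C WS-CDL example above can be modeled in a few lines of Python. This is a conceptual sketch only: the class and method names are hypothetical, and it says nothing about how W3C or OASIS servers actually implement the alias (redirect, copy, or link). The URIs are the ones quoted in the WS-CDL example.

```python
class LatestVersionAlias:
    """A fixed 'Latest version' URI that always points at the newest 'This version' URI."""

    def __init__(self, latest_uri: str):
        self.latest_uri = latest_uri   # permanent, never changes
        self.history: list[str] = []   # dated 'This version' URIs, oldest first

    def publish(self, this_version_uri: str) -> None:
        """Record a new dated release; the alias now resolves to it."""
        self.history.append(this_version_uri)

    def resolve(self) -> str:
        """Return the 'This version' URI the alias currently points to."""
        if not self.history:
            raise LookupError("nothing has been published under this alias yet")
        return self.history[-1]

    def previous(self):
        """Return the 'Previous version' URI, or None if there is only one release."""
        return self.history[-2] if len(self.history) > 1 else None


if __name__ == "__main__":
    alias = LatestVersionAlias("http://www.w3.org/TR/ws-cdl-10/")
    alias.publish("http://www.w3.org/TR/2004/WD-ws-cdl-10-20041012/")
    alias.publish("http://www.w3.org/TR/2004/WD-ws-cdl-10-20041217/")
    print(alias.latest_uri, "->", alias.resolve())
    print("Previous version:", alias.previous())
```

The dated URIs are only ever appended, never rewritten, which matches the principle that a published URI stays bound to its resource while the alias alone moves.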
[ANG06] OASIS - Artifact Naming Guidelines. Working Draft 06. July 09, 2004. Document identifier: 'oasis-tab-artifact_naming_guidelines-wd-05'. Edited by Tim Moses. See the posting "Naming guidelines for comment." [oasis-tab-artifact_naming_guidelines-wd-06.doc; local source .DOC]
[ANG09] OASIS - Artifact Naming Guidelines. Working Draft 09. The diff from version -08, also unmarked v09. October 25, 2004. Artifact identifier: 'tab-artifact_naming_guidelines-1.0-spec-wd-09'. Edited by Tim Moses and William Cox. See the posting "Back to boring old Naming Guidelines, this time with URN section". Source: tab-artifact_naming_guidelines-1.0-spec-wd-09-diff.doc [local] and tab-artifact_naming_guidelines-1.0-spec-wd-09.doc [local].
[AIR] Artifact Identification Requirements 1.0. Produced by members of the OASIS Technical Advisory Board (TAB). Edited by William Cox and Tim Moses. Artifact Identifier: 'ArtifactIdentificationRequirements-v1.0-requirements-wd-r14'. Working Draft 15. 30-June-2005. See the announcement [also here] and "short preface and list of questions." [local copy]
[ASIS] Artifact Standard Identification Scheme for Metadata 1.0. Approved TAB Document. 30-January-2006. Edited by William Cox and Tim Moses. Artifact Identifier: 'ArtifactStandardIdentificationSchemeForMetadata-1.0.1-req-approved'. Sent out for public review on February 06 2006 ending March 01, 2006. With spreadsheet listing comments and responses from the July 2005 review. [local copy]
[Namespaces10] Namespaces in XML. W3C Recommendation. World Wide Web Consortium. Reference: REC-xml-names-19990114. 14-January-1999. Edited by Tim Bray, Dave Hollander, and Andrew Layman. With errata page.
[Namespaces11] Namespaces in XML 1.1. W3C Recommendation. 04-February-2004. Edited by Andrew Layman, Richard Tobin, Tim Bray, and Dave Hollander. See also Namespaces in XML 1.1 Requirements. With errata page.
[Namespaces11-2] Namespaces in XML 1.0 (Second Edition). W3C Recommendation. 16-August-2006. Edited by Tim Bray (Textuality), Dave Hollander (Contivo, Inc.), Andrew Layman (Microsoft), Richard Tobin (University of Edinburgh and Markup Technology Ltd). See also Namespaces in XML 1.1 Requirements. With errata page and public archives for 'email@example.com' list.
[Versioning] [Editorial Draft] Extending and Versioning Languages Part 1. Edited by David Orchard (BEA Systems, Inc) and Norman Walsh (Sun Microsystems, Inc). Draft TAG Finding. 17-July-2006 or later.