Schema Requirement Implementation Comparison: Structural Requirements
Version 0.3b
Henry S. Thompson

18 March 1999

This is the third draft, prepared for the SSDL editorial team. The Schema requirements it describes come from the draft of 7 January 1999: http://www.w3.org/XML/Group/1999/1/NOTE-schema-req-19990107.htm.

References
DCD: Document Content Description for XML (DCD)
SOX: Schema for Object-oriented XML (SOX)
XMLDR: XML Data Reduced (XDR)
DDML: Document Definition Markup Language (DDML) (formerly XSchema)
1. Introduction

This document attempts to discharge two obligations I took on at the XML Schema WG telcon of 1999/1/14:

  1. To produce a template for horizontal exploration of existing proposals vis a vis the emerging XML Schema requirements;
  2. To coordinate the carrying out of this task for the structural requirements list.

The XML original of this document attempts to satisfy the first requirement: it is accompanied by both a DTD which provides a minimal framework for exposition and comparison of existing proposals, and a (DSSSL) stylesheet which can be used with JADE to produce HTML for distribution.

2 Structural Requirement 1

Mechanisms for constraining document structure (namespaces, elements, attributes) and content (datatypes, entities, notations)

2.1 Sub-requirement

Element type declaration: name, content model

XML 1.0 syntax:
<!ELEMENT DL (DT,DD?)+>
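
To make the constraint being reconstructed concrete, here is a minimal Python sketch, not drawn from any of the proposals, which checks a list of child element names against the content model (DT,DD?)+. The model is translated by hand into a regular expression over space-terminated names; the function name matches_dl_model is hypothetical.

```python
import re

# Hand translation of the XML 1.0 content model (DT,DD?)+ into a regular
# expression over space-terminated element names. This is an illustrative
# assumption, not a general DTD compiler.
DL_MODEL = re.compile(r"(?:DT (?:DD )?)+")


def matches_dl_model(children):
    """Return True if the sequence of child element names satisfies (DT,DD?)+."""
    encoded = "".join(name + " " for name in children)
    return DL_MODEL.fullmatch(encoded) is not None


print(matches_dl_model(["DT", "DD", "DT"]))  # a DT may stand without a DD
print(matches_dl_model(["DD", "DT"]))        # a DD may not come first
```

Each of the four proposed syntaxes below is meant to express exactly this constraint.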
2.1.a DCD Proposal
Proposed syntax:
<ElementDef Type="DL" Model="Elements" Content="Closed">
 <Group Occurs="OneOrMore" RDF:Order="Seq">
   <Element>DT</Element>
   <Group Occurs="Optional"><Element>DD</Element></Group>
 </Group>
</ElementDef>

All element definitions are top level and globally scoped.

2.1.b SOX Proposal
Proposed syntax:
<elementtype name="DL">
 <model>
  <sequence occurs="+">
   <element name="DT"/>
   <element name="DD" occurs="?"/>
  </sequence>
 </model>
</elementtype>

All element definitions are top level and globally scoped.

2.1.c XMLDR Proposal
Proposed syntax:
<ElementType name="DL" content="eltOnly" model="closed">
 <group occurs="1:">
  <element type="DT"/>
  <element type="DD" occurs="optional"/>
 </group>
</ElementType>

Element definitions can be either top level and globally scoped, or within other element definitions, in which case only reference from within that definition is allowed.

2.1.d DDML Proposal
Proposed syntax:
<ElementDecl Name="DL">
 <Model>
  <Seq Frequency="OneOrMore">
   <Ref Element="DT"/>
   <Ref Element="DD" Frequency="Optional"/>
  </Seq>
 </Model>
</ElementDecl>

The structure here is very similar to that of SOX, although the names are in many cases different. All element definitions are top level and globally scoped.

2.1 Comparison

Let's look at some common features side-by-side. Note that in describing DCD, I'll call things attributes or elements as dictated by common sense plus the preponderance of evidence from examples in the document:

Schema Feature: The element type for declaring element types
DCD realisation
ElementDef
SOX realisation
elementtype
XMLDR realisation
ElementType
DDML realisation
ElementDecl
Schema Feature: The attribute of an element type declaration which identifies the element type: Can it be a qualified name (i.e. refer to a definition in another schema via a namespace URL)?
DCD realisation
Type (or id if using a fragment identifier): Yes (No).
SOX realisation
name: Yes
XMLDR realisation
name: Yes
DDML realisation
Name: Yes

DCD and XMLDR assume without comment that qualified names can be used in attribute values with the desired effect, implying that URIs will be derived from prefixes in such cases via the published Namespace rules (i.e. using xmlns and xmlns:xxx declarations in the schema instance itself). DDML provides its own attribute, namely ElementNS, for declaring the prefix:URI association for qualified names which occur in referential attribute values. SOX asserts that the published Namespace rules are inadequate to the task, and provides a separate namespace attribute, which may appear alongside any referential attribute to provide a prefix, whose associated URI is in turn defined by namespace elements at the top level.
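
Whatever the declaration mechanism, the processing step being assumed is the same: a prefixed name found in an attribute value has to be split and its prefix mapped to a URI. A minimal Python sketch of that step follows. The function resolve_qname, the prefix_map argument and the treatment of unprefixed names are all my assumptions for illustration; none of the proposals spells this out.

```python
def resolve_qname(value, prefix_map, default_ns=None):
    """Split a possibly prefixed name and return (namespace URI, local name).

    prefix_map stands in for the prefix:URI associations in scope, however
    the proposal at hand declares them (xmlns:xxx attributes, DDML's
    ElementNS, or SOX's top-level namespace elements).
    """
    prefix, sep, local = value.partition(":")
    if not sep:
        # No prefix: fall back to a default namespace, if one is supplied.
        # Whether this is the right rule for attribute values is an assumption.
        return (default_ns, value)
    if prefix not in prefix_map:
        raise KeyError("undeclared prefix: " + prefix)
    return (prefix_map[prefix], local)


# Hypothetical prefix bindings, for illustration only.
bindings = {"html": "http://www.w3.org/TR/REC-html40"}
print(resolve_qname("html:DL", bindings))
print(resolve_qname("DL", bindings))
```

The proposals differ only in where prefix_map comes from, which is exactly the point on which SOX parts company with the others.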

Schema Feature: The means used to refer to element types in content models
DCD realisation
The text content of an Element element
SOX realisation
The name attribute of an element element
XMLDR realisation
The type attribute of an element element
DDML realisation
The Element attribute of a Ref element
Schema Feature: The grouping element(s)
DCD realisation
Group
SOX realisation
sequence, choice or mixed
XMLDR realisation
group
DDML realisation
Seq, Choice or Mixed
Schema Feature: The way the content model type is declared
DCD realisation
A Model attribute on ElementDef, values from Any, Empty, Data, Mixed, Elements. If this is Elements, the ElementDef must have a single Group daughter.
SOX realisation
A daughter of elementtype, one of any, empty or model. model in turn has a single daughter drawn from string, mixed or one of element, choice, sequence, all of which imply element-only content (see below under model group type)
XMLDR realisation
A content attribute on ElementType, values from empty, textOnly, mixed, eltOnly. The functionality of XML ANY is achieved by the combination of mixed with model='open' (see below under Open Content).
DDML realisation
A Model daughter of ElementDecl, with one daughter in turn drawn from Any, Empty, PCData, Mixed or one of Ref, Choice, Seq, all of which imply element-only content (see below under model group type)

Here we see a substantive difference between DCD and XMLDR on the one hand, which use an attribute and don't distinguish sub-types of element-only content at this level, and SOX and DDML on the other, which use a collection of element types and do make that distinction.

Schema Feature: How model group type is declared
DCD realisation
An RDF:Order attribute on a Group element, with allowed values Alt, Seq.
SOX realisation
As noted above, any of element, choice or sequence are allowed as the daughter of model (or choice or sequence).
XMLDR realisation
An order attribute on an ElementType or group element, with allowed values one, seq, all, many. all is as per SGML's & connective, many means a *ed disjunction.
DDML realisation
As noted above, any of Ref, Choice or Seq are allowed as the daughter of Model.

Note that while DCD and XMLDR need a group element for subordinate model groups, SOX doesn't. DDML allows Model recursively, thereby being slightly different from the nearly-equivalent SOX solution by not allowing Choice directly inside Choice or Seq directly inside Seq.

There is a difference of substance as well as appearance here: SOX and DDML, with their inventory of element types which may appear inside models, combine two dimensions of content model constraint which DCD and XMLDR separate out into distinct attributes: content type (none, text-only, elements-only, mixed) and constraint type (sequence, set, bag or choice). Only XMLDR supports the set constraint, which reconstructs SGML's ampersand connective. The combined approach is more restrictive in practice, in that only some cells in the matrix defined by the two dimensions can be specified. Obviously the constraint types don't make sense for empty and text-only content, but XMLDR allows all four to apply to either elements-only or mixed content. DCD, although specifying things independently, is actually as restrictive as SOX and DDML, as it allows only a sequence/choice specification, and that only in the elements-only case. It follows from all of this that DCD, SOX and DDML all reconstruct XML 1.0 content models more or less exactly, while XMLDR extends them both by allowing the bag constraint for elements-only as well as mixed content, and by permitting restrictions, e.g. a choice constraint, on the elements allowed in mixed content.

Schema Feature: Constraining cardinality of occurrence
DCD realisation
An Occurs attribute on a Group element, values drawn from Required, Optional, OneOrMore, ZeroOrMore.
SOX realisation
An occurs attribute on any of sequence, choice or element elements, values drawn from [not there], ?, +, * or n,m, where m may be *; otherwise 0<=n<m [sic, but surely should be 0<=n<=m].
XMLDR realisation
An occurs attribute on an element or group element, values drawn from required, optional, zeroOrMore, oneOrMore, [not there], n:m or n:, where 0<=n<=m.
DDML realisation
A Frequency attribute on any of Ref, Choice or Seq elements, values drawn from Required, Optional, OneOrMore, ZeroOrMore.

The Microsoft preview implementation of XMLDR split occurs into minOccurs and maxOccurs.

SOX and DDML declare a fixed value of * and ZeroOrMore respectively on their mixed and Mixed element types.
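
Despite the differences in spelling, all four vocabularies reduce to a (minimum, maximum) pair. The following Python sketch shows one way of normalising them. The function parse_occurs is hypothetical, the treatment of an absent attribute as "exactly once" is my assumption, and the range check uses 0<=n<=m as suggested above.

```python
# Keyword spellings shared (modulo case) by DCD, XMLDR and DDML, plus the
# SOX symbols. None as an upper bound means "unbounded".
KEYWORDS = {
    "required": (1, 1),
    "optional": (0, 1),
    "oneormore": (1, None),
    "zeroormore": (0, None),
    "?": (0, 1),
    "+": (1, None),
    "*": (0, None),
}


def parse_occurs(value):
    """Map an occurrence value to a (min, max) pair; max None is unbounded."""
    if value is None or value == "":
        return (1, 1)  # attribute absent: assumed to mean exactly once
    key = value.strip().lower()
    if key in KEYWORDS:
        return KEYWORDS[key]
    # Ranges: SOX writes n,m (m may be *), XMLDR writes n:m or n:
    for sep in (",", ":"):
        if sep in key:
            low, high = key.split(sep, 1)
            minimum = int(low)
            maximum = None if high in ("", "*") else int(high)
            if minimum < 0 or (maximum is not None and minimum > maximum):
                raise ValueError("bad range: " + value)
            return (minimum, maximum)
    raise ValueError("unrecognised occurrence value: " + value)


print(parse_occurs("1:"))         # XMLDR spelling, as in the DL example
print(parse_occurs("OneOrMore"))  # DCD/DDML spelling
print(parse_occurs("+"))          # SOX spelling
```

All three calls describe the same constraint, which is the sense in which the proposals agree here in substance while differing in notation. Only SOX and XMLDR can express pairs other than the four XML 1.0 ones.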

Schema Feature: How instance root elements are identified
DCD realisation
If the Root property of an ElementDef element is True, the defined element type may be a root element, otherwise not, unless no ElementDef in the schema is so described, in which case any element type may be a root element.
SOX realisation
Not specified in the schema
XMLDR realisation
Not specified in the schema
DDML realisation
A Root attribute on an ElementDecl element gives hints as to the likelihood of the declared element type appearing as root: Recommended, Possible or Unlikely.

It is not clear whether this belongs in schemas or not. XML 1.0 allows any declared element to be used as the root, leaving it up to each instance to specify this.
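
The DCD rule is the only one of these with a conditional in it, so a small Python sketch may help. It assumes, beyond what the DCD document guarantees, that Root is written as an attribute with the literal value True and that the schema's document element is called DCD; the function allowed_roots is hypothetical, and "any element type" is approximated as "any declared element type".

```python
import xml.etree.ElementTree as ET

# A made-up DCD fragment. The DCD wrapper element and the attribute spelling
# of the Root property are assumptions for illustration.
SCHEMA = """
<DCD>
  <ElementDef Type="DL" Root="True"/>
  <ElementDef Type="DT"/>
  <ElementDef Type="DD"/>
</DCD>
"""


def allowed_roots(schema_text):
    """Return the set of declared element type names that may appear as root."""
    defs = ET.fromstring(schema_text).findall("ElementDef")
    marked = {d.get("Type") for d in defs if d.get("Root") == "True"}
    # If no ElementDef is marked as a root, any element type may be one.
    return marked if marked else {d.get("Type") for d in defs}


print(sorted(allowed_roots(SCHEMA)))
```

Removing the Root attribute from the DL definition makes all three element types eligible, which recovers the XML 1.0 behaviour.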

It is worth noting that all four proposals are silent on some tricky issues, particularly the semantics of the occurs attribute when inside mixed content, the matter of tags versus types, and the matter of explicit versus implicit collections.

2.2 Sub-requirement

Element declaration: allowed attributes

XML 1.0 syntax:
<!ATTLIST DL clause ID #REQUIRED
             type (bullet|number|naked) 'naked'>
2.2.a DCD Proposal
Proposed syntax:
<AttributeDef Name="clause" Datatype="id" ID-Role="ID" Occurs="required"/>
<ElementDef . . .>
 <Attribute>clause</Attribute>
 <AttributeDef Name="type" Datatype="enumeration" Values="bullet number naked" Default="naked"/>
</ElementDef>

An AttributeDef element may occur either at the top level, in which case it has global scope by default, or within an ElementDef element, in which case by default it has the XML 1.0 (local scope) semantics. Attribute definitions with global scope can be referenced using the Attribute element type, as in the above example. Attribute definitions can be explicitly made local or global using the Global property. If a top-level AttributeDef has Global="false" then it must have an id so that a fragment identifier can be used to point to it using RDF's resource indirection facility. So what's going on here is a conflation of internal/external visibility (although if the definition has an id, an external schema could still point to it!) and scoping.

The spec is somewhat unclear on this point, but it appears that both the Datatype and the ID-Role properties are required to reconstruct XML 1.0's ID attribute type. What one would mean without the other is unclear.

2.2.b XMLDR Proposal
Proposed syntax:
<AttributeType name="clause" dt:type="id"/>
<ElementType . . .>
 <attribute type="clause" required="yes"/>
 <AttributeType name="type" default="naked">
  <datatype dt:type="enumeration" values="bullet number naked"/>
 </AttributeType>
 <attribute type="type"/>
</ElementType>

An AttributeType element may occur either at the top level, in which case it has global scope, or within an ElementType element, in which case it has the XML 1.0 (local scope) semantics. Attribute definitions with global scope can be referenced using the attribute element type, as in the above example.

It is somewhat infelicitous that even a locally scoped AttributeType has to be referred to with an attribute element, as above. This should have been defaulted, as in DCD and SOX.

2.2.c SOX Proposal
Proposed syntax:
<interface name="anchor">
 <attdef name="clause" datatype="ID"><required/></attdef>
</interface>
<elementtype . . .>
 . . .
 <implements name="anchor"/>
 <attdef name="type" datatype="NMTOKEN">
  <enumeration>
   <option>bullet</option>
   <option>number</option>
   <option>naked</option>
  </enumeration>
  <default>naked</default>
 </attdef>
</elementtype>

Sticking closely to XML 1.0, an element is required to indicate implied/required/default/etc.

The use of sub-elements for the components of enumerated types allows a distinction to be made between their components and their type.

2.2.d DDML Proposal
Proposed syntax:
<ElementDecl . . .>
 . . .
 <AttGroup>
  <AttDef Name="clause" Type="ID" Required="Yes"/>
  <AttDef Name="type" Type="Enumerated" AttValue="naked">
   <Enumeration>
    <EnumerationValue Value="bullet"/>
    <EnumerationValue Value="number"/>
    <EnumerationValue Value="naked"/>
   </Enumeration>
  </AttDef>
 </AttGroup>
</ElementDecl>

Although an AttGroup is allowed to occur at the top level with the stated intention of supporting globally scoped attribute definitions, no mechanism is provided in this version of DDML to enable such global definitions to be invoked for particular element types.

Aside from some attribute/content differences, this is similar to SOX.

2.2 Comparison
Schema Feature: The element type for declaring attributes
DCD realisation
AttributeDef
SOX realisation
attdef
XMLDR realisation
AttributeType
DDML realisation
AttDef
Schema Feature: The attribute of an attribute declaration which identifies the attribute: Can it be a qualified name (i.e. refer to a definition in another schema via a namespace URL)?
DCD realisation
Name (or id if using a fragment identifier): Yes (No).
SOX realisation
name: Yes
XMLDR realisation
name: Yes
DDML realisation
Name: Yes

See discussion of namespaces under the parallel feature of references to element type declarations above.

Schema Feature: The means by which an attribute declaration constrains its content
DCD realisation
Datatype attribute
SOX realisation
datatype attribute
XMLDR realisation
dt:type attribute or datatype daughter
DDML realisation
Type attribute

There's an issue hidden in the alternation allowed by XMLDR regarding how much local flexibility there is to specialise named datatypes on the spot. The XMLDR datatype element type provides for range constraints, for example.

Schema Feature: Are globally scoped attribute definitions supported? If so, how are they referenced? Can they be grouped and referenced together?
DCD realisation
Yes: An Attribute element whose text content names the global definition: No.
SOX realisation
Yes: An implements element whose name attribute names an interface element containing the global definition: Yes.
XMLDR realisation
Yes: an attribute element whose type attribute names the global definition: No.
DDML realisation
Partially: global definitions can be declared but cannot yet be referenced: Yes, they can be grouped (in an AttGroup), but the group cannot be referenced.

SOX's novel interface/implements facility is closer to a principled reconstruction of the common use of parameter entities for shared attributes than it is to the idea of globally scoped attribute definitions as supported in DCD and XMLDR. The former is perhaps a bit more like making it easy to have many element types with the same local attributes, whereas the latter is a bit more like constraining the element types on which a global attribute may appear. Neither is quite like what has been proposed for SGML. Grouping is clearly a good thing for this functionality, particularly if it's not available via some form of element type extension facility.

Schema Feature: Can references to global attributes constrain them? If so, how?
DCD realisation
No.
SOX realisation
Yes: specialisation from implied to inherited to defaulted to required to fixed is allowed. Datatype can be constrained.
XMLDR realisation
Yes: default can be supplied/changed (if not already required); requirement can be tightened [HT: note this was inadvertently not made clear in the XMLDR document].
DDML realisation
No.

SOX does not address the possibility of changing the default. XMLDR misses the possibility of constraining the datatype, although it does allow this during element type extension (as does SOX). Whether there's a notion of information set substitutability which would be blown by changing defaults or fixing on a value other than the default has not, to my knowledge, been thought about at all.

Schema Feature: Can enumerated types be of any type? How are the permitted values of enumerated types declared?
DCD realisation
No: in a Values attribute, separated by spaces.
SOX realisation
Yes: as the text content of option daughters of an enumeration daughter.
XMLDR realisation
No: in a values attribute, separated by spaces.
DDML realisation
No: in a Value attribute on EnumerationValue daughters of an Enumeration daughter.

DDML's structure here is surely overkill. SOX's provision for enumeration of any datatype is surely a good idea, but again raises issues of demarcation vis a vis the datatype section.

Schema Feature: Can attributes be declared to take any number of members of an enumerated type?
DCD realisation
No
SOX realisation
Yes, via multiple='yes' on the enumeration element. This introduces potential ambiguity when the enumerated type allows spaces in individual members.
XMLDR realisation
No
DDML realisation
No
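
The ambiguity noted for SOX can be seen in a few lines of Python. The sketch assumes that a multi-valued attribute is tokenised on whitespace, which is my reading rather than something the SOX document states; the enumeration members and the function split_multi_valued are made up for illustration.

```python
def split_multi_valued(attr_value):
    """Tokenise a multi-valued attribute on whitespace (assumed separator)."""
    return attr_value.split()


# Hypothetical enumeration whose members may contain spaces.
members = {"New York", "York", "New", "Boston"}

tokens = split_multi_valued("New York Boston")
print(tokens)
# The same string could mean {"New York", "Boston"} or {"New", "York", "Boston"};
# whitespace splitting can only ever produce the second reading.
print(all(t in members for t in tokens))
```

The value validates, but the member "New York" can never be selected, so an enumerated type with embedded spaces cannot safely be combined with multiple='yes' under this reading.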
Schema Feature: How are occurrence constraint and default value specified?
DCD realisation
Occurs property, allowing only two of the four values allowed when it occurs with ElementDef, and Default and Fixed properties.
SOX realisation
Empty daughter required, implied or inherited or text-content daughter fixed or default: exactly one is required.
XMLDR realisation
required and default attributes.
DDML realisation
Required and AttValue attributes.

Both XMLDR and DDML use the presence of a default value specification on a required attribute to reconstruct XML 1.0's #FIXED semantics.
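
The mapping back to XML 1.0 default declarations implied by this reading can be written out as a small Python sketch. The function xml10_default_decl is hypothetical, and quoting of the default value is simplified (a value containing a double quote is not handled).

```python
def xml10_default_decl(required, default=None):
    """Render the XML 1.0 default declaration implied by required/default.

    Follows the reading given above: a default value on a required
    attribute is taken to mean #FIXED.
    """
    if required and default is not None:
        return '#FIXED "%s"' % default
    if required:
        return "#REQUIRED"
    if default is not None:
        return '"%s"' % default
    return "#IMPLIED"


# The two attributes of the DL example, then the #FIXED case.
print(xml10_default_decl(required=True))                    # clause
print(xml10_default_decl(required=False, default="naked"))  # type
print(xml10_default_decl(required=True, default="naked"))
```

Two two-valued settings thus cover all four XML 1.0 cases, at the cost of making #FIXED an implicit combination rather than something stated outright, as it is in DCD and SOX.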

2.3 Sub-requirement

Element declaration: namespaces

XML 1.0 syntax:
<!-- There is no way to declare namespaces in XML 1.0 DTDs.
     In instances, we have both prefix and empty namespace declarations: -->
     <xsl:stylesheet xmlns:xsl='http://www.w3.org/TR/WD-xsl'
                     xmlns='http://www.w3.org/TR/WD-html40/'>
2.3.a DCD Proposal
Proposed syntax:
<Namespace>http://www.w3.org/TR/REC-html40</Namespace>

This must be present at the top level; it defines the namespace for which all the definitions in the schema are definitions.

2.3.b SOX Proposal
Proposed syntax:
<schema name="memo"
namespace="http://www.veosystems.com/schemas/memo.xml">

I.e. the namespace is defined via the namespace attribute on the schema document element itself.

2.3.c DDML Proposal
Proposed syntax:
<DocumentDef ns="http://www.taxonomy">

The ns attribute is allowed on any definitional element in the document, although it is deprecated anywhere except on DocumentDef.

2.3.d XMLDR Proposal

XMLDR does not provide any means of declaring the namespace for which a schema provides definitions.

2.3 Comparison

Both DCD and DDML appear to assume that pointing out from a schema to the namespace URI for which it provides definitions is a coherent thing to do -- it seems pointless and/or backwards to me, but at least in the case of DDML the purpose appears to be to allow multiple schema documents to combine to define the semantics of a namespace. SOX requires something which might amount to circularity, i.e. that the URI associated with the schema point to the schema.
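
For a processor, the practical upshot is three different places to look, and nowhere to look in the XMLDR case. A Python sketch follows, using completed versions of the fragments shown above. The function target_namespace is hypothetical, and I am assuming that DCD's document element is named DCD and that DDML's ns appears on the DocumentDef document element.

```python
import xml.etree.ElementTree as ET


def target_namespace(schema_text):
    """Return the namespace a schema says it defines, or None."""
    root = ET.fromstring(schema_text)
    if root.tag == "schema":        # SOX: namespace attribute
        return root.get("namespace")
    if root.tag == "DocumentDef":   # DDML: ns attribute
        return root.get("ns")
    if root.tag == "DCD":           # DCD: top-level Namespace element
        ns = root.find("Namespace")
        return ns.text.strip() if ns is not None and ns.text else None
    return None                     # XMLDR: nothing to find


print(target_namespace(
    '<schema name="memo" '
    'namespace="http://www.veosystems.com/schemas/memo.xml"/>'))
print(target_namespace('<DocumentDef ns="http://www.taxonomy"/>'))
print(target_namespace(
    '<DCD><Namespace>http://www.w3.org/TR/REC-html40</Namespace></DCD>'))
```

Note that this only recovers the URI. It says nothing about whether dereferencing that URI should yield the schema, which is the circularity worry raised for SOX.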

Stubs below here!!!!

2.4 Sub-requirement

Element declaration: 'strict' vs. 'loose' interpretation

XML 1.0 syntax:
No XML 1.0 equivalent
2.5 Sub-requirement

Element declaration: content defaulting and datatyping

XML 1.0 syntax:
No XML 1.0 equivalent
2.6 Sub-requirement

Attribute declaration

XML 1.0 syntax:
No XML 1.0 equivalent as such
2.7 Sub-requirement

Entity declaration

XML 1.0 syntax:
<!ENTITY w3c "World Wide Web Consortium">
2.8 Sub-requirement

Notation declaration

XML 1.0 syntax:
<!NOTATION GIF PUBLIC "-//Compuserve Information Service//NOTATION Graphics Interchange Format//EN">
3 Requirement

Mechanisms to enable inheritance for element, attribute, and datatype definitions

3.1 Sub-requirement

Element declaration: sub-classing and parameterisation

XML 1.0 syntax:
<!ENTITY % pattrs '
  name CDATA #IMPLIED
  address CDATA #IMPLIED'>
<!ATTLIST human %pattrs;
                gender (m|f) #REQUIRED>
<!ATTLIST corp %pattrs;
               charitable (y|n) 'n'>
3.2 Sub-requirement

Attribute declaration: sub-classing

3.3 Sub-requirement

Datatype declaration

4 Requirement

Mechanism for URI reference to standard semantic understanding of a construct

5 Requirement

Mechanism for embedded documentation

6 Requirement

Mechanism for application specific constraints and descriptions

7 Requirement

Mechanism for addressing the evolution of schemata

2. Revision History