Clarification/Elaboration of XML Schema requirements
Henry S. Thompson
14 November 1998
1. A4: Define relationship of schemata to XML document instances
The group requested clarification
2. Anew: Identifier renaming
SGML/(XML) architectural forms provide a number of
facilities, including various forms of content model subsetting and element and
attribute renaming. All these are really application-side requirements, and are
perhaps not quite the same as A12, which is in my view author-side.
3. B5: [Enable applications to use schemata to filter input documents]
- Goal Application designers should be able to use schemata to
define what they require of documents. Documents which provide more than that
should still be processable.
- Decomposition Application designers need to be able to specify
how documents can 'go beyond' their requirements (cf. D3: Open Content Models,
non-required attributes). Application users need to be able to identify what
parts of their (richer) schemata satisfy the application schema requirements
(cf. Identifier renaming, 'kind of' assertion).
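The filtering goal can be illustrated with a minimal sketch. It assumes a hypothetical application requirement written as a Python dictionary (APP_REQUIREMENTS and satisfies are names invented for this illustration, not a proposed schema syntax). The application states only what it requires, and a richer document still passes.

```python
import xml.etree.ElementTree as ET

# Hypothetical application requirement: element name -> required attributes
# and required child element names. Anything beyond that is simply ignored.
APP_REQUIREMENTS = {
    "order": {"attributes": {"id"}, "children": {"item"}},
}

def satisfies(element, requirements):
    """Return True if element provides at least what the application requires."""
    rule = requirements.get(element.tag)
    if rule is None:
        return False
    has_attrs = rule["attributes"] <= set(element.attrib)
    has_children = rule["children"] <= {child.tag for child in element}
    return has_attrs and has_children

# A document that goes beyond the requirement, and one that falls short of it.
rich = ET.fromstring('<order id="7" priority="high"><item/><note/></order>')
poor = ET.fromstring('<order priority="high"><note/></order>')
```

Here satisfies(rich, APP_REQUIREMENTS) holds although the document carries an extra attribute and an extra child, while poor is rejected because the required id and item are missing.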
4. B6: [Switch between expression as attribute and expression as sub-element]
In writing several DTDs for schema languages, I have observed that the
ElementType element type and the AttributeType
element type were very similar. Perhaps schemata would be easier to write and
maintain if this similarity were exploited. There are three ways one could
imagine doing this:
- Point this out, and exploit it by defining parameter entities in the
Schema DTD for the common parts;
- Actually abandon the two different element types in favour of, say,
ComponentType, and in the content model make clear which
sub-components were to be expressed in instances as attributes and which as sub-elements;
- As in the second option, but actually change the data model so that instead
of Element and Attribute nodes, in what a schema processor presents to
applications we actually have Component nodes, with sub-components which are
only incidentally differentiated between attribute-expressed and
sub-element-expressed. This has at least three sub-cases:
- You still have to make explicit in a schema for each sub-component
whether it is to be expressed as attribute or sub-element;
- You can leave this specification out if the datatype precludes
attribute expression;
- Unless the datatype precludes this, you can specify that instances
can choose on a case-by-case basis which expression to use (RDF allows this).
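What the third option might look like from the application side can be sketched as follows. This is only an illustration under the assumption of a hypothetical accessor, component, which hides whether a sub-component was expressed as an attribute or as a sub-element. It is not part of any existing schema processor.

```python
import xml.etree.ElementTree as ET

def component(element, name):
    """Return the value of sub-component `name`, whether it was written
    as an attribute or as a sub-element; None if it is absent."""
    if name in element.attrib:
        return element.attrib[name]
    child = element.find(name)
    if child is not None:
        return (child.text or "").strip()
    return None

# The same information, expressed in the two different ways.
as_attribute = ET.fromstring('<book title="Emma"/>')
as_subelement = ET.fromstring('<book><title>Emma</title></book>')
```

Both component(as_attribute, "title") and component(as_subelement, "title") yield the same value, so an application written against Component nodes need not care which expression an instance chose.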
5. B7: Validation of documents across links
Stipulate that XML Link provides a way of expressing one or more
varieties of transclusion (find my content over there; replace me with what's
over there; . . .).
- Goal Allow such links to be transparent to schema-validation,
i.e. schema-validity should be assessed on the basis of the result of
transclusion as well as on its invocation.
- Decomposition For 'find my content over there', cf. Dnew,
CONREF. For 'replace me with what's over there', need something new.
6. C2 (and D2?): Element Subclassing and Inheritance
The group requested subdivision, against a background of the observation
that the balance between requirements language and implementation language was
skewed too far towards the implementation style.
Accordingly, there follow hereafter a number of new candidate requirements.
6.1. Goal C2a:
- Goal Provide explicit support for 'kind-of' relations between element types.
- Reason Support good software engineering in Schema design by
allowing declaration reuse within and across schemata. Replace common use of
parameter entities with a principled mechanism.
- Features If sub is a kind of super, then
- sub is valid in instances wherever super
is valid;
- attributes which are valid on instances of super are
valid on instances of sub.
- Issues Shadowing vs. intersection vs. union wrt associated attribute
declarations, content models, content datatype. Can an element type be declared
as a kind of more than one other type? Does this mechanism preclude the
necessity for separate declaration of attribute sets (cf. SOX)?
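The two features of C2a can be sketched as below. The sketch deliberately settles two of the open issues by assumption: each type has at most one supertype, and attribute declarations combine by union. The tables KIND_OF and DECLARED_ATTRIBUTES and the type names in them are hypothetical.

```python
# Hypothetical declarations: each element type names at most one supertype
# (an assumption; multiple supertypes are an open issue) and lists the
# attributes it declares itself.
KIND_OF = {"poem": "text", "sonnet": "poem"}
DECLARED_ATTRIBUTES = {"text": {"lang"}, "poem": {"metre"}, "sonnet": set()}

def is_kind_of(sub, super_):
    """True if sub is super_ or reaches it by following kind-of links."""
    while sub is not None:
        if sub == super_:
            return True
        sub = KIND_OF.get(sub)
    return False

def valid_attributes(element_type):
    """Union of the type's own attributes and those of all its supertypes
    (union is assumed here; shadowing or intersection would differ)."""
    attrs = set()
    while element_type is not None:
        attrs |= DECLARED_ATTRIBUTES.get(element_type, set())
        element_type = KIND_OF.get(element_type)
    return attrs
```

With these declarations a sonnet may appear wherever a text is valid, and it accepts both lang and metre, which is the behaviour the Features item asks for.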
6.2. Goal C2b:
- Goal Provide explicit support for 'kind-of' relations between attribute types.
- Reason As C2a
- Features If sub is a kind of super,
then sub is valid in instances wherever super
is valid.
6.3. Goal C2c:
I thought there was something else here, but I can't reconstruct it.
Arguably something which attempts to reconstruct what SOX is doing with
parameterised declarations belongs here.
7. D3: Support incomplete constraints on element content models
- Goal Allow instances to be valid despite including more than
what is declared explicitly.
- Reason See B5. Also necessary to support content model
subsumption (see C2a).
- Issues Specify for content model and attributes separately or
together? Allow additional material anywhere, at end, at specified loci?
Allow for level elision (i.e. if parent requires a
child daughter, but makes no mention of any wrapper
daughter, should there be a way of saying that child
within a wrapper daughter is good enough)?
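The level elision question can be made concrete with a small sketch. It assumes a hypothetical checker, has_child, with an allow_elision switch, and it assumes that elision reaches exactly one level down. Whether any such mechanism should exist is the open issue raised above.

```python
import xml.etree.ElementTree as ET

def has_child(parent, required, allow_elision=False):
    """Check that `parent` has a `required` daughter. With allow_elision,
    a `required` element one level down, inside any other daughter,
    also counts."""
    if parent.find(required) is not None:
        return True
    if allow_elision:
        return any(d.find(required) is not None for d in parent)
    return False

# The required daughter directly present, and hidden inside a wrapper.
direct = ET.fromstring("<parent><child/></parent>")
wrapped = ET.fromstring("<parent><wrapper><child/></wrapper></parent>")
```

Under strict checking only direct satisfies the requirement. With allow_elision=True the wrapped form is accepted as well.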
8. D6: Support for alternate encodings of numeric values
- Goal Provide specialised support for constraining the radix in
which numeric datatypes are expressed.
- Reason If we don't support lexical constraints in general,
this is a case which some applications may well require.
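A minimal sketch of a radix constraint, assuming a hypothetical helper parse_in_radix built on Python's int(): the same numeric value has different lexical forms depending on the radix, and a lexical form may be invalid in one radix while valid in another.

```python
def parse_in_radix(lexical, radix):
    """Parse an integer written in the given radix; return None if the
    lexical form is not valid in that radix."""
    try:
        return int(lexical, radix)
    except ValueError:
        return None
```

For instance, "ff" in radix 16 and "377" in radix 8 denote the same value, whereas "9" is not a valid lexical form in radix 8. Python's int() is lenient about signs, surrounding whitespace and prefixes such as 0x, so a real radix constraint would probably need a stricter lexical check.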