Formalising the Proximate Semantics of XML Languages with UML, OWL and GRDDL
Henry S. Thompson
ICCS/HCRC
School of Informatics
University of Edinburgh
Markup Technology Ltd.
17 May 2007
See the full paper from the proceedings for more detail, illustrations and links to all the stylesheets mentioned below.
1. Acknowledgements
Dan Connolly wrote the first Violet-to-OWL stylesheet, which really inspired this entire project.
2. Defining XML languages
Domain analysis, then markup design
Three layer story
Not a new distinction
Examples: W3C XML Schema, Relax NG, SVG, RDF, CML, . . .
Since the 1990s, domain analysis and data modelling have used OO analysis/design
3. Drawing abstract data models
Whiteboards!
[Photo: Dan Connolly leading a group design session at a whiteboard, with circles and arrows]
Next best thing to a whiteboard: drawing program
Use just a bit of UML
Extract clean XML from the messy Java object serialisation
Use pipelines!
First pipeline, two steps, produces a property summary
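A minimal sketch of such a two-step pipeline, written in Python rather than the stylesheets the talk refers to. The input is a hypothetical, heavily simplified stand-in for a Java bean serialisation; the class name ClassNode, the property names, and the output vocabulary (model, class, property) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a java.beans-style serialisation
# of a class diagram. Real drawing-tool output is far noisier.
SERIALISED = """\
<java version="1.5" class="java.beans.XMLDecoder">
  <object class="ClassNode">
    <void property="name"><string>Schema</string></void>
    <void property="attributes"><string>targetNamespace
version</string></void>
  </object>
  <object class="ClassNode">
    <void property="name"><string>Element</string></void>
    <void property="attributes"><string>name</string></void>
  </object>
</java>
"""


def clean(serialised: str) -> ET.Element:
    """Step 1: reduce the serialisation to a small, regular XML vocabulary."""
    model = ET.Element("model")
    for obj in ET.fromstring(serialised).iter("object"):
        if obj.get("class") != "ClassNode":
            continue  # ignore everything that is not a class box
        props = {
            v.get("property"): (v.findtext("string") or "")
            for v in obj.findall("void")
        }
        cls = ET.SubElement(model, "class", name=props.get("name", ""))
        # One whitespace-separated attribute name per property element.
        for attr in props.get("attributes", "").split():
            ET.SubElement(cls, "property", name=attr)
    return model


def summarise(model: ET.Element) -> list:
    """Step 2: one 'Class.property' entry per property in the clean model."""
    return [
        f"{c.get('name')}.{p.get('name')}"
        for c in model.findall("class")
        for p in c.findall("property")
    ]


def run_pipeline(serialised: str) -> list:
    """Compose the two steps; later pipelines can reuse clean() unchanged."""
    return summarise(clean(serialised))


if __name__ == "__main__":
    for line in run_pipeline(SERIALISED):
        print(line)
```

Keeping the clean-up as its own step means any later pipeline can start from the same regular XML instead of the raw serialisation.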
4. Using OWL to represent data models
Not just the obvious choice for W3C-related work
About the right level of formalisation
5. Converting from micro-UML to OWL
Second pipeline, shares first step with previous one, actually produces OWL
cardinality
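As a rough illustration of what the conversion has to emit, the sketch below turns a tiny class model into OWL in RDF/XML, expressing a cardinality constraint as an owl:Restriction inside rdfs:subClassOf. It is written in Python rather than XSLT, and the model dictionary, the class names and the property name are hypothetical; property declarations are omitted for brevity.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
XSD = "http://www.w3.org/2001/XMLSchema#"

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def q(ns: str, local: str) -> str:
    """Build an ElementTree qualified name, '{namespace}local'."""
    return f"{{{ns}}}{local}"


def to_owl(classes: dict) -> ET.Element:
    """Emit one owl:Class per model class, with exact-cardinality restrictions."""
    root = ET.Element(q(RDF, "RDF"))
    for name, spec in classes.items():
        cls = ET.SubElement(root, q(OWL, "Class"), {q(RDF, "ID"): name})
        if spec.get("parent"):
            # UML generalisation becomes a plain rdfs:subClassOf link.
            ET.SubElement(cls, q(RDFS, "subClassOf"),
                          {q(RDF, "resource"): "#" + spec["parent"]})
        for prop, card in spec.get("properties", {}).items():
            # UML multiplicity becomes a restriction the class is a subclass of.
            sub = ET.SubElement(cls, q(RDFS, "subClassOf"))
            restr = ET.SubElement(sub, q(OWL, "Restriction"))
            ET.SubElement(restr, q(OWL, "onProperty"),
                          {q(RDF, "resource"): "#" + prop})
            c = ET.SubElement(restr, q(OWL, "cardinality"),
                              {q(RDF, "datatype"): XSD + "nonNegativeInteger"})
            c.text = str(card)
    return root


# Hypothetical model: every Element has exactly one name.
MODEL = {
    "Component": {},
    "Element": {"parent": "Component", "properties": {"name": 1}},
}

if __name__ == "__main__":
    print(ET.tostring(to_owl(MODEL), encoding="unicode"))
```

Only exact cardinality is handled here; minimum and maximum multiplicities would map to owl:minCardinality and owl:maxCardinality in the same way.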
Browse with e.g. SWOOP
6. Checking data models
Change from two years ago -- tools are usable (just barely)
Check consistency of model (limited)
GUI and/or command-line
7. GRDDL as abstract data binding
Turning to instances
GRDDL (CR) is about connecting documents to representations of their semantics
Pipelines are an obvious mechanism for implementing this.
Successive approximation -- five steps
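To make the connection concrete: in GRDDL, an XML document can name its transformations in a grddl:transformation attribute on the root element, as a whitespace-separated list of IRIs. The sketch below only performs that discovery step; the sample document, its namespace and the stylesheet URL are hypothetical, and fetching and running the transformations is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GRDDL specification.
GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical instance document pointing at a hypothetical stylesheet.
DOC = """\
<schema xmlns="http://example.org/ns/schema"
        xmlns:grddl="http://www.w3.org/2003/g/data-view#"
        grddl:transformation="http://example.org/to-rdf.xsl">
  <element name="title"/>
</schema>
"""


def grddl_transformations(xml_text: str) -> list:
    """Return the transformation IRIs named on the document's root element."""
    root = ET.fromstring(xml_text)
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    return value.split()


if __name__ == "__main__":
    for iri in grddl_transformations(DOC):
        print(iri)
```

A pipeline implementation would feed the document through each discovered transformation and merge the resulting RDF.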
8. XSLT design pattern for abstract data binding
Each level of a class hierarchy is home for different bits
Named templates mirroring the class hierarchy
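The pattern can be sketched outside XSLT as well. In the Python analogue below, each function plays the role of one named template: it first calls the function for its superclass, then adds only the bits that belong to its own level. The three-level hierarchy (Component, NamedComponent, ElementDeclaration) and the property names are hypothetical, chosen for illustration rather than taken from the stylesheets.

```python
import xml.etree.ElementTree as ET


def component(elt: ET.Element, triples: list) -> None:
    """Top of the hypothetical hierarchy: what every component has."""
    triples.append((elt.get("id"), "rdf:type", elt.tag))


def named_component(elt: ET.Element, triples: list) -> None:
    """Middle level: delegate upwards, then add the name."""
    component(elt, triples)
    triples.append((elt.get("id"), "name", elt.get("name")))


def element_declaration(elt: ET.Element, triples: list) -> None:
    """Leaf level: delegate upwards, then add the declared type."""
    named_component(elt, triples)
    triples.append((elt.get("id"), "type", elt.get("type")))


def bind(xml_text: str) -> list:
    """Bind one instance element to (subject, property, value) triples."""
    triples = []
    element_declaration(ET.fromstring(xml_text), triples)
    return triples


if __name__ == "__main__":
    for t in bind('<element id="e1" name="title" type="xs:string"/>'):
        print(t)
```

In XSLT the same shape would be a named template per class, each beginning with an xsl:call-template for its parent class.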
9. Conclusions: Towards executable specifications?
A long-standing tension: who is the audience of a specification, users or implementers?
Even for implementers, formal approaches have at best a mixed track record
Formalisation in related (i.e. XML) technologies may make formalisations more accessible to both audiences
10. Conclusions: Document type design issues
Continuum from more-or-less isomorphic expressions (cf. normal forms) to vernacular/idiomatic expressions
Normal form work talked about schema-annotation to guide data-binding
Not clear if carefully-crafted XML languages (e.g. XML Schema, XProc) will support
Nonetheless, maybe the two approaches can be combined.