W3C, SGML, XML, ODA, HTML, DSSSL and CSS:
A Guide to the Alphabet Soup of the Online Document
Henry S. Thompson
Language Technology Group,
Human Communication Research Centre
University of Edinburgh
Outline draft, 10/12/96
Introduction
The ordinary online document and the web document are on a collision
course, and two working groups of the World Wide Web Consortium
(W3C) are in the thick of it, trying to balance the needs of document
producers and consumers while trying to keep Netscape and Microsoft
at the same table. Two standards, one family of semi-standards,
one standard manqué and two proto-standards are involved,
and sorting out their inter-relations, both technical and political,
is a tricky business. The outcome of this struggle will determine
the nature of online publishing for some time to come. In this,
the first of a series of articles on this complex topic, we introduce
the acronyms, the players and the issues.
The Acronyms
- HTML
- HyperText Markup Language
- SGML
- Standard Generalized Markup Language (ISO 8879)
- XML
- Extensible Markup Language
- CSS
- Cascading Style Sheets
- DSSSL
- Document Style Semantics and Specification Language (ISO 10179)
- ODA
- Open Document Architecture (ISO 8613)
The Players
- W3C
- World Wide Web Consortium
- ISO
- International Organization for Standardization
- W3C SGML Working Group
- The W3C working group responsible for XML.
- W3C HTML Working Group
- The W3C working group responsible for CSS.
- WG8
- ISO Working Group 8 [ISO/IEC JTC1/SC18/WG8]: Document Description
and Processing Languages. The ISO committee directly responsible
for SGML, DSSSL and related standards.
- Microsoft and Netscape
- Locked in combat (potentially mortal, for Netscape at least)
for the Web browser market.
- The Document Industry
- Producers and consumers of prodigious volumes of structured
text, desperately searching for a technology which will protect
their data and their investments. Increasingly looking to intranets
(private corporate networks), if not the Internet, for document
distribution.
The Issues
- Interoperability
- The key issue for the software providers. A vendor with a big
enough ego thinks it can define its own standards and that everyone
else will fall into line. That is less plausible when more than one
key player swings a lot of commercial weight. Recognition of this
fact is the only leverage W3C has to keep everyone paying at least
lip-service to W3C standards.
- Structure
- Are documents flat, or (tree) structured? Are tags state-change
signals to a formatter, or boundary markers in the tree structure?
Is format separable from structure, and if so how?
- Ownership
- Who owns the data? The advantage of international standards
to the users is that they know they're safe from the perils of
single-sourcing (just ask anyone who has had to convert large
amounts of corporate documentation from an obsolete word-processor
format).
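The Structure question above can be made concrete with a small
sketch. It is not taken from any of the standards discussed here: it
assumes a toy tag syntax (`<name>` and `</name>` only, no attributes),
and the function names `tokenize`, `format_flat` and `parse_tree` are
hypothetical. The same markup is read twice, once as state-change
signals to a formatter and once as boundary markers in a tree.

```python
import re

# Toy syntax (an assumption for illustration): <name>, </name>, and text.
TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")


def tokenize(markup):
    """Yield ('open', name), ('close', name) or ('text', string) tokens."""
    for match in TOKEN.finditer(markup):
        slash, name, text = match.groups()
        if name:
            yield ("close" if slash else "open", name.lower())
        elif text:
            yield ("text", text)


def format_flat(markup):
    """Tags as state-change signals: each tag just flips a formatter switch."""
    active = set()
    runs = []
    for kind, value in tokenize(markup):
        if kind == "open":
            active.add(value)
        elif kind == "close":
            active.discard(value)
        else:
            # Record the text together with whatever switches are on now.
            runs.append((value, frozenset(active)))
    return runs


def parse_tree(markup):
    """Tags as boundary markers: elements must nest properly to form a tree."""
    root = ("doc", [])
    stack = [root]
    for kind, value in tokenize(markup):
        if kind == "open":
            node = (value, [])
            stack[-1][1].append(node)
            stack.append(node)
        elif kind == "close":
            if len(stack) == 1 or stack[-1][0] != value:
                raise ValueError(f"</{value}> does not close <{stack[-1][0]}>")
            stack.pop()
        else:
            stack[-1][1].append(value)
    if len(stack) != 1:
        raise ValueError(f"<{stack[-1][0]}> is never closed")
    return root
```

Under these assumptions, overlapping markup such as
`<b>bold <i>both</b> italic</i>` is acceptable to `format_flat`, which
simply reports three runs of text with different switches on, but
`parse_tree` rejects it with a `ValueError` because the elements do not
nest. Properly nested markup such as `<b>bold <i>both</i></b>` passes
both readings.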
Conclusion