Putting XML to Work: Style, Metadata and API
Henry S. Thompson

1 Day Seminar: 24 November 1999

1. Scope and Purpose

XML offers tremendous promise as the packaging and transfer mechanism of choice for the information age. Realising that promise means not just using XML to mark up documents and to expose databases over the WWW, but retrieving, displaying, printing and processing XML documents and data.

This seminar will introduce you to the standards and the technology which are making this possible. In every case a concrete example-based introduction will include hands-on experience.

1.1. Standards and Tools for Style: CSS2 and XSL

Raw XML is not pretty to look at, and unlike HTML tags, XML tags are not standardised and have no standardised appearance on a display or the printed page. The second edition of CSS, the official W3C style sheet language for HTML, is designed to work with XML documents as well, making it possible for fifth-generation browsers to display XML as well as HTML. XSL is the style language designed specifically for XML, and the W3C Recommendation for XSL, which is scheduled for approval, will form the basis of this part of the seminar. XSL gives much more flexibility in how your XML documents and data appear, and has more support for high-quality printing as well.

1.2. Standards and Tools for Schemas and Metadata: XML Schema and RDF

The structure and meaning of XML documents are themselves natural candidates for description using XML. XML Schema is a W3C draft recommendation for describing XML document structure using XML instance syntax. It gives much more power than XML's built-in DTD language, allowing, for example, a class hierarchy of element types. RDF is a W3C recommendation for uniform annotation of information about information. Its goal is to provide uniform access for search, retrieval and organisation of web-based information.

1.3. Standards and Tools for Application Development: DOM

If you have to build your own application, do you have to start from the bottom, parsing a character stream? Fortunately, not only are there already a number of XML parsers available, there's a W3C recommendation on how applications can access a parsed XML document: the Document Object Model.

2. Who Should Attend

If you need to deliver structured data over the web, then present it in one or more output media, using vendor-independent standards, this seminar is for you. It will help you to

Managers and analysts concerned with documents and information within IT, computing, office systems and corporate publishing departments in all sectors will benefit.

3. Outline Contents
3.1. Electronic Style
3.1.1. Background

Three aspects of a document: content, structure, appearance. A brief history in three stages:

3.1.2. Electronic style specification

Structured documents, formal style specifications and medium-dependent appearance: how they fit together. Managing style as opposed to managing documents.

3.1.3. The right tool for the job: three style standards

Strengths and weaknesses of each style language. Which is right for what situation?

3.1.4. CSS tools and examples

An introduction to controlling HTML document appearance with CSS. Microsoft and Netscape CSS support.

3.1.5. XSL tools and examples

An introduction to the XSL recommendation: how does it provide for associating simple styles with tags you add to your HTML documents, or for defining a completely new style for XML documents? Major-vendor and freeware tools for XSL: Internet Explorer 5 and XT.
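The idea of associating a style with each tag can be sketched in a few lines of Python. This is not XSL and makes no claim to follow the XSL draft: it is only a minimal illustration of rule-based styling, in which a table of rules keyed by element name decides what HTML is produced. The sample memo, the element names and the `RULES` table are all invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; none of these tags come from a real vocabulary.
SAMPLE = (
    "<memo><to>All staff</to>"
    "<body>XML seminar on <em>24 November</em>.</body></memo>"
)

# One rule per element name: the HTML tag to emit and an inline CSS style.
# The whole table is an assumption made for illustration.
RULES = {
    "memo": ("div", "font-family: serif"),
    "to": ("h1", "color: navy"),
    "body": ("p", ""),
    "em": ("i", ""),
}


def render(element):
    """Apply the rule for this element, then recurse into its children."""
    tag, style = RULES.get(element.tag, ("span", ""))  # fallback rule
    attr = f' style="{escape(style)}"' if style else ""
    parts = [f"<{tag}{attr}>", escape(element.text or "")]
    for child in element:
        parts.append(render(child))
        # Text that follows a child element belongs to the parent's content.
        parts.append(escape(child.tail or ""))
    parts.append(f"</{tag}>")
    return "".join(parts)


def transform(xml_text):
    """Parse the XML text and render it to an HTML fragment."""
    return render(ET.fromstring(xml_text))


if __name__ == "__main__":
    print(transform(SAMPLE))
```

Changing the appearance of every memo then means editing the rule table only, which is the sense in which style is managed separately from the documents themselves.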

3.2. Constraining document structure

An introduction to XML Schema: how does it allow you to say just as much about the structure of your documents as you wish, and no more? How inheritance allows specialisation of document structure while still allowing general-purpose tools to operate.

3.3. Describing document content

An introduction to RDF: how does it provide a standard way of telling the WWW what your document is about, who wrote it and who should read it? Relation to other standards, including PICS and Digital Signatures.
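RDF statements can be thought of as (resource, property, value) triples. The Python sketch below models that idea only; it does not read or write RDF's XML syntax. The document URL, the `EX` vocabulary and the audience property are hypothetical, and the Dublin Core element names are used purely as familiar examples of property identifiers.

```python
# Property vocabularies. DC is the Dublin Core element set namespace;
# EX is a hypothetical vocabulary invented for this example.
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/terms/"

# Hypothetical resource being described.
DOC = "http://example.org/seminar.html"

# Each statement is a (resource, property, value) triple.
TRIPLES = [
    (DOC, DC + "creator", "Henry S. Thompson"),   # who wrote it
    (DOC, DC + "subject", "XML"),                 # what it is about
    (DOC, DC + "subject", "style sheets"),
    (DOC, EX + "audience", "managers and analysts"),  # who should read it
]


def values(triples, resource, prop):
    """Return every value stated for one property of one resource."""
    return [v for r, p, v in triples if r == resource and p == prop]


def resources_about(triples, topic):
    """Return the resources whose stated subject matches the topic."""
    return sorted(
        {r for r, p, v in triples if p == DC + "subject" and v == topic}
    )


if __name__ == "__main__":
    print(values(TRIPLES, DOC, DC + "subject"))
    print(resources_about(TRIPLES, "XML"))
```

Because every statement has the same shape, a search tool can answer questions such as "which resources are about XML?" without knowing anything else about the documents concerned.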

3.4. Application access to documents

An introduction to the DOM: how does it provide a parser (and programming language) independent API for standardised access to (HTML/XML) document content? What DOM-supporting tools are available already?
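As a rough illustration of this kind of access, the Python sketch below uses the standard library's `xml.dom.minidom`, one DOM implementation among many. The sample document and its element and attribute names are invented for the example; the calls used (`getElementsByTagName`, `getAttribute`, `childNodes`) are DOM operations that have direct counterparts in other language bindings.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; element and attribute names are invented.
SAMPLE = """<seminar date="1999-11-24">
  <title>Putting XML to Work</title>
  <topic>Style</topic>
  <topic>Metadata</topic>
  <topic>API</topic>
</seminar>"""


def text_of(element):
    """Concatenate the text nodes that are direct children of an element."""
    return "".join(
        node.data
        for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


def summarise(xml_text):
    """Parse XML text and read it back through DOM calls only."""
    document = parseString(xml_text)
    root = document.documentElement
    topics = [text_of(t) for t in root.getElementsByTagName("topic")]
    return {
        "root": root.tagName,
        "date": root.getAttribute("date"),
        "title": text_of(root.getElementsByTagName("title")[0]),
        "topics": topics,
    }


if __name__ == "__main__":
    print(summarise(SAMPLE))
```

The application never touches the character stream: the parser builds the tree, and the code works only with nodes, elements and attributes.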

3.5. Summary conclusions

What should your company be doing about deploying XML?

4. The Tutorial Presenter

Henry S. Thompson is Reader in Artificial Intelligence and Cognitive Science at the University of Edinburgh, where he is chiefly engaged in research and research management in the Language Technology Group of the Human Communication Research Centre. He has published several language research corpora on CD-ROM, and has developed software systems for SGML and DSSSL. He was a member of the original W3C SGML Working Group, responsible for the first drafts of the XML standard. He was a co-author of the XML-Data proposal and the original XSL proposal and is now a member of the XSL working group and the XML Schema working group. He is the author of XED, the first freely available XML editor.

Marc Moens is Senior Research Fellow and Manager of the Edinburgh Language Technology Group, and has over 15 years' experience in language processing and document handling work. He has played a key part in projects involving the creation of text routing and document classification tools; the development of authoring tools for controlled languages; and information extraction from documents.