Research Dissemination Workshop:

Markup Technologies for Computational Linguistics

Hosted by the HCRC Language Technology Group
University of Edinburgh
25, 26 February 1999

1. Overview

The Language Technology Group, with support from EPSRC, ESRC and other sources, has invested substantial effort over the last four years in building up an inventory of tools and technologies for the markup of language data. This in turn has led to the articulation of a markup-based architecture for NLP systems, which we have used for applications as diverse as discourse relation annotation, named entity recognition and tokenisation. The goal of this workshop is to introduce our work to a wider audience, with

Anyone interested in the role of markup technologies and standards in computational linguistics is invited to attend.

2. Programme

Thursday 25 February
1330--1730 Hands-on intensive introduction to SGML/XML, Henry S. Thompson, Chris Brew (HCRC LTG)
1930--2030 Invited Keynote: Michael Sperberg-McQueen (University of Illinois, Chicago; co-Editor of the Text Encoding Initiative)
Friday 26 February
0900--1030 Introduction to pipelining XML files through XML-aware tools; Survey of available tools; Survey of XML/SGML-encoded CL resources, members of the LTG, Adam Kilgarriff, Brighton
1100--1230 Architectures in use:
LT TTT/LT MUC:
A pipelined, FSM-based, XML-aware customisable tokeniser, configured to produce a world-class entity recognition system, Claire Grover and Marc Moens, LTG
GATE:
Alternative architecture for NLP application development, MUC system based thereon, Rob Gaizauskas, Sheffield
1330--1500 Approaches to annotation:
HCRC Map Task Corpus:
XML-annotated dialogue corpus, with general-purpose tools for intersecting annotations from different perspectives (Edinburgh, Glasgow), Jean Carletta, LTG
SGML and Lexical Resources:
Approaches to marking up dictionaries and related resources, Adam Kilgarriff, Brighton
SSML:
Marking up text for Text-to-Speech, Paul Taylor, Edinburgh
1530--1700 Annotation Support:
MATE:
XML-based dialogue annotation support, (Edinburgh, Pisa, Aarhus), David McKelvie, LTG
LT Annotator:
An automatic UI generator for annotating resources, XML-based but not requiring XML knowledge, Henry S. Thompson, LTG
The TEI:
Present state, future plans, Michael Sperberg-McQueen, University of Illinois
1700--1730 Plans, Prognostications and Public Discussion

3. Participation

The workshop is open to all interested parties. You are welcome to attend on either or both days, but please register in advance, particularly if you intend to come to the tutorial on Thursday. Places for the tutorial are limited by the facilities available to us, and will be allocated in order of registration.

Please note that although tea and coffee will be provided on both days, and a buffet lunch will be provided on Friday, evening meals and accommodation are not part of the workshop.

For information about travel and accommodation, see the University's local information page.

4. Acknowledgements

This workshop is made possible by support from the UK EPSRC (grant GR/L29125, NSCOPE), and by the UK ESRC's baseline funding of HCRC.