The new empiricism and the Semantic Web: where are we headed?

Henry S. Thompson
University of Edinburgh
Markup Systems
1 March 2011
Creative Commons Attribution-ShareAlike

1. Acknowledgements

2. Overview

First, an introduction to the new empiricism

Then, a consideration of the Semantic Web

3. A short history of computational linguistics

First closely parallel to, latterly increasingly separated from, the history of linguistic theory since 1960.

Situated in relation to the complex interactions between linguistics, psychology and computer science:

[no description, sorry]

Originally all the computational strands except the 'in service to' ones were completely invested in the Chomskian rationalist inheritance.

A corresponding commitment to formal systems, representationalist theories of mind/so-called 'strong AI'

4. The empir[icist] strikes back

Starting in the late 1970s, in the research community centred around the (D)ARPA-funded Speech Understanding Research effort, with its emphasis on evaluation and measurable progress, things began to change.

(D)ARPA funding significantly expanded the amount of digitised and transcribed speech data available to the research community

Instead of systems whose architecture and vocabulary were based on linguistic theory (in this case acoustic phonetics), new approaches based on statistical modelling and Bayesian probability emerged and quickly spread

"Every time I fire a linguist my system's performance improves" (Fred Jelinek, head of speech recognition at IBM, c. 1980)

5. Case study: Automatic Speech Recognition

Speech recognition, that is, at least the transcription, if not the understanding, of ordinary spoken language, is one of the major challenges facing Applied Computational Linguistics.

One of the reasons for this is masked by the fact that our perception of speech is hugely misleading: we hear distinct words, as if there were breaks between each one, but the sound itself contains no such breaks. For example here's a display of the sound corresponding to an eight-word phrase:

[Figure: display of the sound of the eight-word phrase]

Despite this evident difficulty, the fact is that people can easily wreak a nice beach. Sorry, . . . recognise speech.

6. Case study: ASR cont'd

This is not just a matter of getting the word boundaries right or wrong. The next problem facing a speech recogniser, whether human or mechanical, is that the signal underdetermines the percept:

[no description, sorry]

You heard:

[no description, sorry]

But I said:

[no description, sorry]

And there are more possibilities:

[no description, sorry]

What's going on here? How do we do this?

7. Instructive versus selective interaction in complex systems

Biology has gone down this road first and furthest

The immune system was an early and revolutionary example

A simpler example (oversimplified here) is bone growth

How do we get the required array of parallel lines of rectangular cells?

The naive instructional view is that there's somehow some kind of blueprint, which some agent (enter Hume's paradox) appeals to in laying down the cells:

[no description, sorry]

8. Bone growth, cont'd

The truth appears to be selective: cells appear in new bone with all possible orientations, and the ones that are not aligned with the main stress lines die away:

[no description, sorry][no description, sorry][no description, sorry]

9. What kind of selection for ASR?

So how do we select the right path through the word lattice?

Is it on the basis of a small number of powerful things, like grammar rules and mappings from syntax trees to semantics?

[no description, sorry]

Or a large number of very simple things, like word and bigram frequencies?

[no description, sorry]

The probability-based approach performed much better than the rule-based approach
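The "large number of very simple things" option can be sketched in a few lines. This is a minimal illustration, not the method of any particular recogniser: the toy corpus, the function names, and the add-one smoothing are all assumptions made for the example, and the lattice is reduced to a plain list of candidate word sequences. Each candidate is scored by summing the log probabilities of its bigrams, and the highest-scoring one is selected.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real system would count over millions of words.
CORPUS = [
    "people can easily recognise speech",
    "machines can recognise speech",
    "we recognise speech every day",
    "storms can wreck a nice beach",
]


def train_bigrams(sentences):
    """Count bigrams and their left-hand contexts, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(path, contexts, bigrams, vocab_size):
    """Log probability of a word sequence under an add-one smoothed bigram model."""
    tokens = ["<s>"] + list(path) + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
    return total


def best_path(paths, model):
    """Select the candidate path with the highest bigram log probability."""
    return max(paths, key=lambda path: log_prob(path, *model))


MODEL = train_bigrams(CORPUS)

# Two competing readings of the same stretch of sound, as paths through a lattice.
CANDIDATES = [
    ("recognise", "speech"),
    ("wreck", "a", "nice", "beach"),
]

BEST = best_path(CANDIDATES, MODEL)
print(" ".join(BEST))
```

On this toy corpus the "recognise speech" reading wins, partly because its bigrams are more frequent and partly because shorter paths multiply fewer probabilities. A real recogniser would combine such language-model scores with acoustic scores rather than rely on word statistics alone.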

10. Up the speech chain

The publication of 6 years of digital originals of the Wall Street Journal in 1991 provided the basis for moving the Bayesian approach up the speech chain to morphology and syntax

Many other corpora have followed, not just for American English

And the Web itself now provides another huge jump in the scale of resources available

To the point where even semantics is at least to some extent on the probabilistic empiricist agenda

11. The new intellectual landscape

Whereas in the 1970s and 1980s there was real energy and optimism at the interface between computational and theoretical linguistics, the overwhelming success of the empiricist programme in the applied domain has separated them once again.

While still using some of the terminology of linguistic theory, computational linguistics practitioners are increasingly detached from theory itself, which has suffered a, perhaps connected, loss of energy and sense of progress.

Within cognitive psychology, there is significant energy going into erecting a theoretical stance consistent with at least some of the new empiricist perspective.

But the criticism voiced 25 years ago by Herb Clark, who described cognitive psychology as "a methodology in search of a theory", remains pretty accurate.

And within computer science in general, and Artificial Intelligence in particular, the interest in "probably nearly correct" solutions, as opposed to constructively true ones, is dominant.

12. Part 2: The Semantic Web

The knowledge representation landscape has changed

13. The Semantic Web

14. Semantic Web technologies

15. What's not special about the Semantic Web

16. What is special about the Semantic Web

17. The impact of URIs

18. Merging ontologies

19. Practical problems

20. One clear benefit of URIs

21. 'Follow your nose' example

22. The key prediction

23. From Semantic Web to Linked Open Data

Increasing divergence between two communities

Linked Open Data is where the energy is

24. Conclusions, part 1

25. Conclusions, part 2

One thing is clearly new and important

Sends a clear message to the KR community

Any knowledge resource creation effort

should look very carefully at the ideology and the technology of the Semantic Web

26. Conclusions, part 3

27. Envoi: Pulling it together

The Semantic Web project is, intriguingly, stuck in a time warp.

It's busy reconstructing the Knowledge Representation systems and methodologies of the 80s and 90s, at Web scale.

But it has not taken the recent history of AI seriously, at least not yet.

The stated goal, and existing practice, of the SemWeb effort is the rules, facts and inference story, not the Bayesian/machine learning/probably nearly correct story.

Not surprisingly, even if the Statistical Semantic Web doesn't seem to be there