XML Languages, entailment and versioning

Work in Progress, subject to change w/o notice: Please do not quote or redistribute

Henry S. Thompson

3 Oct 2006

1. The nature of XML languages

In the world of the languages we're interested in, there are three levels:

1. The concrete syntax, defined by BNF or DTD/Schema (plus BNF) or . . .;
2. The data model, either concrete, defined by e.g. Java classes or database schemata, or abstract (e.g. W3C XML Schema components, RDF graph);
3. The model, out there in the real world, or at least half-way there, in the case of e.g. SVG or HTML. . .

On this account we have strings, (abstract) data models, and reality.

I think it's fair to say that the most straightforward mapping from the traditional model-theoretical view to the three-level story is to say that the sentential forms and models of model theory correspond to levels (2) and (3) of the three-level view. The relationship between (1) and (2) is typically held to be trivial, or at least uninteresting.

But I think that it's actually levels (1) and (2) that (most of) the TAG have had in their minds, at least some of the time. And I include myself in that generalisation, at least in part.

There are certainly plenty of cases where (1) and (2) are uninterestingly different (consider e.g. LISP -- the mapping from the string "(+ 3 4)" to the corresponding s-expression composed of three cons pairs, one atom, two numbers and nil is just not the locus of what's interesting about LISP).
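To make the LISP case concrete, here is a minimal sketch of such a mapping in Python. The names Cons, read and _read are hypothetical, None stands in for nil, and only atoms, integers and parentheses are handled; it is an illustration, not a LISP reader.

```python
from dataclasses import dataclass

@dataclass
class Cons:
    car: object
    cdr: object   # another Cons, or None standing in for nil

def read(text):
    """Read one s-expression from a string of atoms, integers and parentheses."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    expr, rest = _read(tokens)
    if rest:
        raise ValueError("trailing input")
    return expr

def _read(tokens):
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        items = []
        while rest[0] != ")":
            item, rest = _read(rest)
            items.append(item)
        result = None                      # nil terminates the list
        for item in reversed(items):       # build the chain of cons pairs
            result = Cons(item, result)
        return result, rest[1:]            # drop the closing parenthesis
    # An integer if it looks like one, otherwise an atom kept as a string.
    return (int(head) if head.lstrip("-").isdigit() else head), rest

print(read("(+ 3 4)"))
```

The result for "(+ 3 4)" is a chain of three Cons instances holding the atom '+' and the numbers 3 and 4, ending in None.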

For the XML languages, the situation is further complicated. Call the mapping from surface string to data model Θ, and the relationship between data model and the appropriate model/world/domain of discourse Φ. For an XML language such as SVG we then have

XML:  document --Θ-->  infoset 
SVG:                   infoset --Θ--> data structure --Φ--> bitmaps

or for RDF:

XML:  document --Θ-->  infoset 
RDF:                   infoset --Θ--> relational graph --Φ--> world

or for some XML-based business language

XML:  document --Θ-->  infoset 
POL:                   infoset --Θ--> java class instances --Φ--> ???

The two Θs compose in each case, of course, but they are quite different in character, or rather, where they get you is quite different. It's hard to see what concrete domain arbitrary XML infosets can be interpreted as making claims about, whereas SVG data structures certainly can be interpreted as making claims on bitmaps, etc.

[For a very interesting attempt to talk about the general matter of markup semantics, see Sperberg-McQueen, Huitfeldt and Renear, Meaning and interpretation of markup, _Markup Languages: Theory and Practice_ v2 n3, pp 215--234, MIT Press, 2001 and Sperberg-McQueen, Dubin, Huitfeldt and Renear, Drawing inferences on the basis of markup, in T. Usdin, ed., Proceedings of Extreme Markup Languages, 2002, IDE Alliance. The latter is probably the better introduction.]

So, the net-net of all this? I don't think we can ignore any of the three levels, but I wonder if we can get most of what we need from the two which are fully under our control, as it were, namely (1) and (2). The main reason for this is that in too many cases which we need to cover, we can't ignore (2) because it's where the semantics of the language are normatively stated (see e.g. SVG, RDF, XML Schema), and we can't ignore (1) because it's not even close to being 1-to-1 with (2), and in any case it's what people are used to actually seeing and manipulating.

2. Direction of fit

Consider XML Schema, SVG and POX (purchase-order XML). These have a very different feel to their semantics, if you look carefully. The contrast is the one referred to with the words 'declarative' and 'performative' in the philosophy of language -- contrast "Snow is white" with "I pronounce you man and wife". Declarative sentences have what John Searle calls the "word-to-world" direction of fit -- on hearing such sentences, we ask, of the words, do they fit the world. Performative sentences change the world to fit the words, as long as they are uttered felicitously, that is, with the necessary pre-conditions satisfied. Note that not all performatives are so socially bound -- something as simple as "I promise to pay you five pounds" is the same sort of thing.

In between 'declarative' and 'performative' we have 'imperative', for such sentences as "Open the door" and "Please pass me the salt" and even "Can you slow down a bit". These have a sort of conditional world-to-word direction of fit---either the world, courtesy of the addressee, changes as commanded/requested, or it doesn't.

W3C XML Schema: Declarative/word-to-world, for sure. Given a schema (component) and an infoset, the schema is true (satisfied) or false (not satisfied) of/by the infoset.
SVG: Could go either way. We could think of it like a schema---given an SVG dataset and a bitmap, the dataset is true (satisfied) or false (not satisfied) of/by the bitmap. Or more imperative/conditionally world-to-word---an SVG dataset is a request to change the world by producing a particular kind of bitmap.
POX: In at least some contexts, definitely world-to-word. On receipt of a POX instance, for example, a shipping department should package and ship the goods; an accounting department should update Accounts Receivable, etc.

Not only declarative, but also performative and imperative can be understood in terms of claims on the world: performatives are understood as a pair of sets of claims, pre-conditions and post-conditions. If the pre-conditions are all satisfied, then the post-conditions become satisfied as a result of the utterance. Similarly for imperatives, but the post-conditions are contingent on cooperation from the addressee.

3. The RGY language

Let's try working with a very simple XML language to see if we can illustrate some of this -- the RGY language. Here's its syntax (a DTD):

<!ELEMENT rgy (l*)>
<!ELEMENT l EMPTY>
<!ATTLIST l x NMTOKEN #REQUIRED
            y NMTOKEN #REQUIRED
            c (r|g|y) #REQUIRED>

And here's its data model (a UML diagram):

The mapping Θ[RGY] from syntax to data model is obvious.
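As a minimal sketch of just how obvious it is, the following Python maps an RGY document to data model instances. The class names RGY and Light and the property names light, x, y and colour follow the usage later in this note; the function name theta_rgy, the colour table, and the choice to hold the lights as a set (so that order and duplicates are lost) are assumptions made for illustration. No DTD validation is attempted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Assumed expansion of the DTD's colour codes into data model values.
COLOURS = {"r": "red", "g": "green", "y": "yellow"}

@dataclass(frozen=True)
class Light:
    x: str       # NMTOKEN in the DTD, so kept as a string
    y: str
    colour: str

@dataclass(frozen=True)
class RGY:
    light: frozenset   # the Lights of this instance, as a set (an assumption)

def theta_rgy(text: str) -> RGY:
    """Map an RGY document (a string) to its data model instance."""
    root = ET.fromstring(text)
    if root.tag != "rgy":
        raise ValueError("not an RGY document")
    return RGY(frozenset(
        Light(l.get("x"), l.get("y"), COLOURS[l.get("c")])
        for l in root.findall("l")
    ))

doc = "<rgy><l x='1' y='1' c='r'/><l x='2' y='1' c='g'/></rgy>"
print(theta_rgy(doc))
```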

What's perhaps more interesting is that we can easily give both declarative and performative semantics to this data model.

3.1. RGY: The declarative interpretation

The domain of the model is 8-bit-per-colour RGB bitmaps. We say that

An instance of RGY is satisfied by a bitmap iff all its lights are satisfied by that bitmap.
An instance of Light (call it l) is satisfied by a bitmap iff one of the following is true:
1. The pixel at position l.x, l.y in the bitmap is #ff0000 and l.colour is red.
2. That pixel is #00ff00 and l.colour is green.
3. That pixel is #ffff00 and l.colour is yellow.
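The satisfaction rules above can be sketched directly in Python. The representation of a bitmap as a dict from (x, y) to a '#rrggbb' string, the use of integers for x and y, and the function names are all assumptions for illustration; a position outside the bitmap is treated as unsatisfied.

```python
from dataclasses import dataclass

# Pixel value required for each colour, as in the three cases above.
PIXEL = {"red": "#ff0000", "green": "#00ff00", "yellow": "#ffff00"}

@dataclass(frozen=True)
class Light:
    x: int
    y: int
    colour: str

def light_satisfied(l, bitmap):
    """A Light is satisfied iff the pixel at (l.x, l.y) has the matching value."""
    return bitmap.get((l.x, l.y)) == PIXEL[l.colour]

def rgy_satisfied(lights, bitmap):
    """An RGY instance is satisfied iff all its lights are."""
    return all(light_satisfied(l, bitmap) for l in lights)

bitmap = {(0, 0): "#ff0000", (1, 0): "#00ff00"}
print(rgy_satisfied({Light(0, 0, "red"), Light(1, 0, "green")}, bitmap))  # True
print(rgy_satisfied({Light(0, 0, "yellow")}, bitmap))                     # False
```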

3.2. RGY: The performative interpretation

The domain of interpretation is a set of traffic lights in a city with a rectangular, NS/EW orientated set of streets and avenues, both numbered, each intersection governed by a traffic light, all of which are controlled by a computer which implements a simple web service, which performs actions under the control of messages as follows:

To process a message whose body corresponds to an instance of RGY, determine the subset of its lights whose preconditions are satisfied (see below), then process the members of that subset.
The preconditions of an instance of Light (call it l) are determined as follows:

if l.colour is 'red'
The traffic light at the intersection of l.x Avenue and l.y Street must be green in the NS direction.

if l.colour is 'green'
The traffic light at the intersection of l.x Avenue and l.y Street must be red in the NS direction.

if l.colour is 'yellow'
[No precondition]

To process an instance of Light (call it l):

if l.colour is 'red'
Set the traffic light at the intersection of l.x Avenue and l.y Street to yellow in the NS direction, wait 4 seconds, then set it to red in the NS direction and green in the EW direction.

if l.colour is 'green'
Set the traffic light at the intersection of l.x Avenue and l.y Street to yellow in the EW direction, wait 4 seconds, then set it to red in the EW direction and green in the NS direction.

if l.colour is 'yellow'
Set the traffic light at the intersection of l.x Avenue and l.y Street to blinking yellow in both directions.
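A minimal simulation of these rules, as a sketch only: the state of the city is assumed to be a dict from (avenue, street) to the NS colour of that traffic light, actions are recorded as strings rather than performed (so the 4 second wait is just another recorded step), and the 'green' rule is taken to end with green in the NS direction. All function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Light:
    x: str
    y: str
    colour: str

def precondition_holds(l, ns_state):
    """ns_state maps (avenue, street) to the NS colour of that traffic light."""
    current = ns_state.get((l.x, l.y))
    if l.colour == "red":
        return current == "green"
    if l.colour == "green":
        return current == "red"
    return True  # yellow: no precondition

def actions_for(l):
    """The steps of the processing rule, as (intersection, step) pairs."""
    where = (l.x, l.y)
    if l.colour == "red":
        steps = ["NS yellow", "wait 4s", "NS red, EW green"]
    elif l.colour == "green":
        steps = ["EW yellow", "wait 4s", "EW red, NS green"]
    else:
        steps = ["blink yellow both"]
    return [(where, s) for s in steps]

def process(lights, ns_state):
    """Select the lights whose preconditions hold, then process them."""
    ready = [l for l in lights if precondition_holds(l, ns_state)]
    return [a for l in ready for a in actions_for(l)]

state = {("1", "1"): "green"}
print(process([Light("1", "1", "red"), Light("1", "1", "green")], state))
```

With the light at 1st and 1st green NS, only the 'red' Light has its precondition satisfied, so only its three steps are recorded.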

4. RGY and entailment

We can define entailment for either approach to the semantics of RGY. An instance of RGY (call it A) entails another instance of RGY (call it B) with respect to the declarative semantics of RGY iff all bitmaps which satisfy A also satisfy B, or, to put it another way, the set of bitmaps satisfying A is a subset of the set satisfying B. I think it's obvious that this will be true whenever B.light is a subset of A.light, where equality for Lights is property equality, and, provided at least one bitmap satisfies A, only then.
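The declarative definition can be checked by brute force over a deliberately tiny universe. This is a sketch under stated assumptions: lights are (x, y, colour) triples, "all bitmaps" is cut down to two positions and four pixel values, and the names satisfied, all_bitmaps and entails are hypothetical.

```python
from itertools import product

PIXEL = {"red": "#ff0000", "green": "#00ff00", "yellow": "#ffff00"}
# A tiny universe standing in for "all bitmaps": two positions, four pixel values.
POSITIONS = [(0, 0), (1, 0)]
VALUES = ["#ff0000", "#00ff00", "#ffff00", "#000000"]

def satisfied(lights, bitmap):
    """lights is a set of (x, y, colour) triples; bitmap maps (x, y) to a pixel."""
    return all(bitmap[(x, y)] == PIXEL[c] for x, y, c in lights)

def all_bitmaps():
    """Enumerate every bitmap in the tiny universe."""
    for values in product(VALUES, repeat=len(POSITIONS)):
        yield dict(zip(POSITIONS, values))

def entails(a, b):
    """A entails B iff every bitmap that satisfies A also satisfies B."""
    return all(satisfied(b, bm) for bm in all_bitmaps() if satisfied(a, bm))

a = {(0, 0, "red"), (1, 0, "green")}
b = {(0, 0, "red")}
print(entails(a, b), b <= a)   # True True
print(entails(b, a), a <= b)   # False False
```

For these two instances the enumeration agrees with the subset test on the sets of lights.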

With respect to the performative semantics, things turn out differently. We say that message A entails message B iff for all possible initial states of the traffic lights, the response to A performs at least all the actions involved in the response to B.

Somewhat surprisingly the performative semantics also guarantees that A entails B whenever B.light is a subset of A.light. Suppose B corresponds to

<rgy>
 <l x='1' y='1' c='r'/>
</rgy>

and A corresponds to

<rgy>
 <l x='1' y='1' c='r'/>
 <l x='1' y='1' c='g'/>
</rgy>

Wrt the declarative interpretation, A (defectively) entails B, because no bitmaps satisfy A.

And wrt the performative interpretation, A entails B, because either in the initial state, the light at the corner of 1st and 1st is green NS, in which case both A and B turn it yellow then red, or that light is red NS, in which case A turns it green, and B does nothing, which does satisfy the definition of entailment above.
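The case analysis for A and B can be replayed mechanically. In this sketch (all names hypothetical) a light is an (x, y, colour) triple, an initial state assigns green or red in the NS direction to each intersection mentioned (blinking states are ignored), and "the actions of a response" are identified with the set of lights that get processed, which is a simplifying assumption.

```python
from itertools import product

NEEDS = {"red": "green", "green": "red"}   # required NS colour; yellow needs none

def ready(lights, state):
    """Lights whose precondition holds in state, a dict {(x, y): NS colour}."""
    return {l for l in lights
            if l[2] == "yellow" or state[(l[0], l[1])] == NEEDS[l[2]]}

def entails(a, b):
    """A entails B iff, in every initial state, A processes every light B does."""
    corners = sorted({(x, y) for x, y, _ in a | b})
    for colours in product(["green", "red"], repeat=len(corners)):
        state = dict(zip(corners, colours))
        if not ready(b, state) <= ready(a, state):
            return False
    return True

B = {("1", "1", "red")}
A = {("1", "1", "red"), ("1", "1", "green")}
print(entails(A, B))  # True
print(entails(B, A))  # False
```

The converse fails because, when the light is red NS, A turns it green while B does nothing.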

The consistency here depends on defining entailment in terms of changes, not final states. . .

5. Notes

What changes for our language diagram?

Can we appeal to entailment w/o appealing to the model except insofar as entailment presupposes it?

Is the functional assumption OK for Θ?

Can we enumerate the kinds of relations between markup changes and data model changes, starting within versions and then looking across versions?

The above notion of data model is too simple -- any language with keys in it may be mapped directly into updates at the model level, violating the implicit appeal to some kind of context-free abstract syntax kind of story. For example an Address might be interpreted immediately/directly as an update to a database row keyed by name.

Consider the impact of adding <xs:attribute name="foo" use="prohibited"/> to a type defined by restriction, with the effect of removing something from the data model.

All this gives us the opportunity to state two levels of relationship:

Two texts from a language are equivalent if they correspond to the same data model.
A text A from a language is compatible with another text B if A corresponds to a data model which entails the data model corresponding to B.
A language L1 is compatible with a language L2 with respect to a text A if the data model corresponding to A in L1 entails the data model corresponding to A in L2.
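For RGY, the first two relationships can be sketched as follows. The function names are hypothetical, the data model is reduced to a set of (x, y, c) triples, and compatibility uses the subset test on lights as a stand-in for entailment, which is an assumption rather than a definition.

```python
import xml.etree.ElementTree as ET

def model(text):
    """Data model of an RGY text: a frozenset of (x, y, c) triples."""
    return frozenset((l.get("x"), l.get("y"), l.get("c"))
                     for l in ET.fromstring(text).findall("l"))

def equivalent(a, b):
    """Two texts are equivalent if they correspond to the same data model."""
    return model(a) == model(b)

def compatible(a, b):
    """A is compatible with B if A's data model entails B's (subset test assumed)."""
    return model(b) <= model(a)

t1 = "<rgy><l x='1' y='1' c='r'/><l x='2' y='1' c='g'/></rgy>"
t2 = '<rgy>\n <l c="g" y="1" x="2"/>\n <l c="r" y="1" x="1"/>\n</rgy>'
t3 = "<rgy><l x='1' y='1' c='r'/></rgy>"
print(equivalent(t1, t2))  # True
print(compatible(t1, t3))  # True
print(compatible(t3, t1))  # False
```

Here t1 and t2 differ in quoting, whitespace, attribute order and element order, none of which survives the mapping to this data model.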