Whitespace control: fixed or user-controlled?

Henry S. Thompson
18 September 2000

1.   Background

My proposal for normalizing attribute values and element content includes a whitespace facet with which users can control the degree of normalization. David Beech asked whether this flexibility is necessary, and whether removing it would be problematic. The obvious way to remove it would be to take the defaulting rules from my proposal and make them fixed rules rather than defaults. Here is the relevant text from my proposal:

When the {variety} is atomic or list, in the absence of an explicit or inherited whitespace facet, one is added with a value based on the {variety} as follows (union types have no whitespace facet; the whitespace facets of their members are what matter):

  1. If the {variety} is list, then collapse;

  2. otherwise (the {variety} is atomic), then depending on the {primitive type definition}, as follows:

    1. if it is the simple ur type definition, then preserve;

    2. if it is the built-in primitive string datatype, and there are no explicit or inherited enumeration facets, then replace;

    3. otherwise, collapse.
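
The defaulting rules quoted above can be read as a small decision procedure. The following is only a sketch of that reading, not text from the proposal: the function name default_whitespace, its parameters, and the string labels "ur-type" and "string" are hypothetical names chosen for illustration.

```python
from typing import Optional


def default_whitespace(
    variety: str,
    primitive: Optional[str] = None,
    explicit_or_inherited: Optional[str] = None,
    has_enumeration: bool = False,
) -> Optional[str]:
    """Return the effective whitespace facet value for a simple type.

    variety: "atomic", "list" or "union".
    primitive: hypothetical label for the {primitive type definition},
        e.g. "ur-type" for the simple ur type definition or "string".
    explicit_or_inherited: an explicit or inherited facet value, if any.
    has_enumeration: True if explicit or inherited enumeration facets exist.
    """
    # Union types have no whitespace facet; their members decide.
    if variety == "union":
        return None
    # An explicit or inherited facet always wins over the defaults.
    if explicit_or_inherited is not None:
        return explicit_or_inherited
    # Rule 1: list types collapse.
    if variety == "list":
        return "collapse"
    # Rule 2: atomic types depend on the primitive type definition.
    if primitive == "ur-type":
        return "preserve"
    if primitive == "string" and not has_enumeration:
        return "replace"
    return "collapse"
```

Making the rules fixed, as discussed in the Background, would amount to dropping the explicit_or_inherited parameter so that the outcome depends only on the type itself.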

The strongest argument I see for providing user control is with respect to string: I may well want string-valued key/keyrefs which are collapsed, so that I'm not punished by accidental initial/final EOLs in my document. On the other hand, I may want length-limited but not replaced content, for verbatim fields.
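
To make the two string cases concrete, here is a minimal sketch of the three normalization levels. It assumes the usual reading of the facet values: preserve leaves the value untouched, replace turns each tab, line feed and carriage return into a space, and collapse additionally squeezes runs of spaces and trims both ends. The function name normalize and the sample values are hypothetical.

```python
import re


def normalize(value: str, whitespace: str) -> str:
    """Apply a whitespace facet value to a lexical string."""
    if whitespace == "preserve":
        return value
    # replace: each tab, LF and CR becomes a single space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if whitespace == "replace":
        return replaced
    if whitespace == "collapse":
        # collapse: squeeze runs of spaces, then trim both ends.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whitespace value: {whitespace!r}")


# A key with accidental leading and trailing line breaks.
key = "\n  part-42\n"
print(repr(normalize(key, "collapse")))  # 'part-42'
print(repr(normalize(key, "replace")))   # '   part-42 '

# A verbatim field whose line structure should survive.
verbatim = "line one\n  line two"
print(repr(normalize(verbatim, "preserve")))  # 'line one\n  line two'
```

Under the fixed default for string (replace), the key would keep its stray spaces and fail to match a cleanly written keyref, and the verbatim field would lose its line break. Those are the two cases where user control matters.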