What is XML? Well, I know what it isn’t.
I just read Jim Waldo’s blog entry “What is XML?”.
For the most part, I like what he has to say:
To begin with, I’m never quite sure what people mean when they talk about XML. XML itself is a specification of a syntax for documents, and as an extension of the Standard Generalized Markup Language (SGML) it is pretty unobjectionable and pretty unremarkable. But getting worked up about the syntax is, as Rob Gingell is wont to say, about as sensible as getting worked up about ascii.
But I don’t think most people, when talking about XML, are just talking about the syntax. They talk about XML being self-documenting or human-readable. They talk about XML allowing communication between distributed objects. And none of these properties are syntactic; they all require some kind of semantics. So when people start ascribing semantic properties to a syntax, I start wondering what they are talking about. Clearly, the term XML has become shorthand for something more, something richer, something more, well, meaningful.
So perhaps what people mean when they talk about XML is actually XML and some DTD, or schema, or other interpretation that will give some semantics to go along with syntax. This combination would give some of the properties of XML that people talk about. It would allow inter-operation of programs that exchanged information using XML (and the common interpretation). But then people would need to say what schema, DTD, or interpretation they were pairing with XML, because lord knows there are a hell of a lot of different standards, pseudo-standards, and proposals that use the XML syntax but don’t inter-operate.
I think Jim is right. Most people, when talking about XML, are really talking about XML plus some DTD or schema, because they are always talking about how interoperable it is, and how great and powerful it is… please pass the kool-aid.
Now, in my experience, there is barely such a thing as interoperable XML. Even industry-standard DTDs and schemas are rarely used properly. The only thing I would say truly seems to be used well is RSS, and even that has a couple of variations.
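To make the syntax-versus-semantics point concrete, here is a minimal sketch in Python. The two documents and their tag names are made up for illustration; they are not taken from any real payment standard. Both are well-formed XML, so the parser is happy with either, but a consumer written against the first vocabulary gets nothing useful out of the second.

```python
import xml.etree.ElementTree as ET

# Two hypothetical payment messages: both well-formed XML, different vocabularies.
DOC_A = "<payment><amount currency='USD'>10.00</amount></payment>"
DOC_B = "<pmt><amt ccy='USD'>1000</amt></pmt>"  # this sender happens to use cents


def read_amount(xml_text):
    """Read the amount the way a consumer written for DOC_A's vocabulary would."""
    root = ET.fromstring(xml_text)  # parsing succeeds for any well-formed document
    node = root.find("amount")      # but the lookup depends on an agreed tag name
    return None if node is None else node.text


amount_a = read_amount(DOC_A)
amount_b = read_amount(DOC_B)
print(amount_a, amount_b)
```

The syntax got both documents through the parser. It is the shared interpretation (a DTD, a schema, or just an agreement) that would have to say what `amount` means and in what units.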
I worked with a DTD in the payments industry that required digital signatures, so it “borrowed” a simplified set of tags from the DSig spec for what it needed. It still worked as DSig (as long as it kept up to date with DSig). For political reasons, it also allowed the client side to change the order of the elements, and the server side had to accept that (guess which side I worked on :)). Not cool. You always hear “Loose on what you accept, strict on what you send” (or something similar), which is a good thing, and I agree with it in principle, but there were other safeguards, such as mandatory compliance testing, that should have weeded out any of the bad input. Nope.
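Here is a rough sketch of what the element-order problem looks like from the server side. The element names and the expected order are hypothetical, not the actual payments DTD I worked with. A strict check rejects the reordered message, while a lenient reader that looks elements up by name accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical element sequence a DTD might declare for a payment message.
EXPECTED_ORDER = ["payer", "payee", "amount"]

IN_ORDER = "<payment><payer>A</payer><payee>B</payee><amount>5</amount></payment>"
REORDERED = "<payment><amount>5</amount><payer>A</payer><payee>B</payee></payment>"


def strict_ok(xml_text):
    """Accept only if the children appear in exactly the declared order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == EXPECTED_ORDER


def lenient_read(xml_text):
    """Look each element up by name, ignoring the order it arrived in."""
    root = ET.fromstring(xml_text)
    return {tag: root.findtext(tag) for tag in EXPECTED_ORDER}


print(strict_ok(IN_ORDER), strict_ok(REORDERED))
print(lenient_read(REORDERED))
```

The lenient version is easy enough for a flat message like this one. It gets a lot less fun once the content has to be signed, since both sides then have to agree on exactly which bytes the signature covers.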
Anyway, enough of that rant… now back to XML. When I first read about XML, I thought “Hmmm, another way of defining and passing a data structure… OK”. That was it. It has all the same conceptual issues that come with, say, trying to allow a COBOL program to call a C function, passing it a data structure. That is to say, it is not any more interoperable than anything else.
To make it interoperable, it has to be strictly defined, and that ends up with problems, as I mentioned above. So, one thing it is NOT is a panacea, or was it NOT interoperable? Meh, never mind. It’s not perfect. There.
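The data-structure analogy can be sketched without any XML at all. This is only an illustration using Python’s `struct` module, with a made-up record layout (a 16-bit id followed by a 32-bit amount). It is not COBOL or C, but the issue is the same: the bytes only mean something if both sides agree on the layout.

```python
import struct

# Sender's hypothetical record: 16-bit id, then 32-bit amount, big-endian, no padding.
record = struct.pack(">hi", 7, 1000)

# A receiver that agrees on the layout recovers the original values.
agreed = struct.unpack(">hi", record)

# A receiver that assumes little-endian reads the very same bytes as different numbers.
mismatched = struct.unpack("<hi", record)

print(agreed, mismatched)
```

Swap the byte layout for tag names and nesting, and you have the XML version of the same problem.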