XML is a poor format for exchanging data for a few reasons.
XML is strictly hierarchical. There are a few basic ways to organize data: sequentially, as in a list, array or tuple; by key, as in a dict or hash; hierarchically, as in a tree; or relationally, which means any item in the data can be related to any other item, in a free-form web. XML handles sequential data only awkwardly: attributes are explicitly unordered, and although child elements do have a document order, nothing in the markup says whether that order is meaningful, so each schema and tool decides for itself. In practice, most XML tools preserve a strict ordering that programmers rely upon, and it can be very disturbing when a tool somewhere in the chain breaks it. XML can represent a simple hash, where the node names are used as keys. And XML can, of course, represent strictly hierarchical data. But it cannot represent free-form relational data. The other three basic ways of organizing data can be easily implemented as subsets of free-form relational structures. But a pure relational object, where, say, A points to B and C, and B and C both point to D, or a loop, where A points to B, B points to C, and C points to A, cannot be encoded in XML without creating your own ad-hoc unique identifier system or, worse, copying nodes.
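To make the identifier workaround concrete, here is a minimal Python sketch of the diamond case (A points to B and C, both of which point to D) using the standard library's `xml.etree.ElementTree`. The element names `graph`, `node` and `ref` and the attributes `id` and `to` are made up for this illustration; nothing in XML itself gives them meaning, which is exactly the problem.

```python
import xml.etree.ElementTree as ET

# Diamond-shaped data: A points to B and C; B and C both point to D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def graph_to_xml(graph):
    """Write each node once and express edges through a home-made 'id'/'to' scheme."""
    root = ET.Element("graph")
    for name, targets in graph.items():
        node = ET.SubElement(root, "node", id=name)  # hypothetical 'id' attribute
        for target in targets:
            ET.SubElement(node, "ref", to=target)    # hypothetical 'to' attribute
    return root


def xml_to_graph(root):
    """Resolve the ad-hoc references back into an adjacency mapping."""
    return {
        node.get("id"): [ref.get("to") for ref in node.findall("ref")]
        for node in root.findall("node")
    }


xml_text = ET.tostring(graph_to_xml(graph), encoding="unicode")
restored = xml_to_graph(ET.fromstring(xml_text))
```

The round trip works, but only because both ends agree on what `id` and `to` mean. A generic XML tool sees four sibling nodes and no relationships at all.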
XML is untyped. Every modern programming language uses typed data. A data interchange format should include type information as a primitive, and the implementation of that format in each language should handle converting interchange types into types native to the language, and back again. Sometimes the data type is included as metadata on the tag, and sometimes the name of the tag also defines its type, but this is totally ad-hoc and many uses of XML don’t contain any type information at all.
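A small Python sketch of what the missing type information costs in practice, assuming a made-up record and made-up tag names. The same dict is round-tripped through JSON and through plain XML elements with the standard library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record mixing a string, an int, a float and a bool.
record = {"name": "widget", "count": 3, "price": 9.5, "in_stock": True}

# JSON keeps the distinction between strings, numbers and booleans.
from_json = json.loads(json.dumps(record))

# In XML every value has to be flattened to text on the way out...
item = ET.Element("item")
for key, value in record.items():
    ET.SubElement(item, key).text = str(value)

# ...and everything comes back as a string on the way in.
parsed = ET.fromstring(ET.tostring(item))
from_xml = {child.tag: child.text for child in parsed}
```

After the XML round trip, `count` is the string `"3"` and `in_stock` is the string `"True"`. Turning them back into an int and a bool requires knowledge that lives outside the document.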
XML also has a data/metadata distinction that seems to be ignored, misunderstood or just flat out implemented wrong in many cases. Metadata should be the responsibility of the data-interchange library, not the programmer.
XML is extremely verbose. Node names are written twice for each node with content, once in the opening tag and once in the closing tag. And node and attribute names are repeated for every instance of the same type of node, rather than defining the structure once and then populating several instances of it.
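A rough Python sketch of the difference, using invented sample records. The "define the structure once" layout here (a `fields` list followed by `rows`) is just one possible convention, written as JSON for convenience, and not any particular standard.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data: three records with the same three fields.
records = [
    {"id": 1, "name": "alpha", "size": 10},
    {"id": 2, "name": "beta", "size": 20},
    {"id": 3, "name": "gamma", "size": 30},
]

# XML: every field name appears in an opening and a closing tag, per record.
root = ET.Element("records")
for rec in records:
    node = ET.SubElement(root, "record")
    for key, value in rec.items():
        ET.SubElement(node, key).text = str(value)
xml_text = ET.tostring(root, encoding="unicode")

# Alternative: name the fields once, then list only the values.
fields = list(records[0])
compact_text = json.dumps(
    {"fields": fields, "rows": [[rec[f] for f in fields] for rec in records]}
)

# How often the field name "name" is spelled out in each encoding.
xml_name_count = xml_text.count("name")
compact_name_count = compact_text.count("name")
```

In the XML version each field name is spelled out twice per record, while the compact layout spells it out once for the whole data set, so the gap widens as the number of records grows.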
There are data-interchange formats that solve parts of these problems, like struct.pack, bencode and JSON; even SQL databases and BerkeleyDB are generally better than XML. XML currently has the upper hand because it’s human readable (if only barely) and because it works in so many languages. But it would be nice to see a real competitor.
Update 2008-01-26: YAML ain’t markup language, but it is a real competitor.
