The Danger of Breaking Changes

Xerces-C is without a doubt one of the most popular DOM implementations in C++ (and its Java sibling undoubtedly the most popular implementation for Java). As with any project that lives under the banner of the Apache Foundation the project is managed using a meritocracy-style project management scheme and has been, quite successfully, for the last decade.

Due to its well-deserved popularity which can be attributed to the run-time stability of its code and its adherence to open standards as well as to good documentation and support by an active community, it has been integrated as the reference DOM implementation for free and non-free XML processing programs alike. XML processing tools such as Altova XML Spy, a very good smart XML editor with which I have no affiliation, but which I do recommend, support Xerces - as well as MSXML - to generate XML-parsing code. That means that, in order to compile code generated by such a tool, you need a version of Xerces-C that is API-compatible with the version the code was generated for.

Xerces-C recently underwent a rather intrusive change: it now adhers more closely to the XML-1.0 recommendation than ever before, meaning some of its public API was deprecated and/or removed. The Xerces-C team encodes such changes in its public API in its version number, which is structured a bit differently than the way we structure version numbers at Vlinder Software (but then, at Vlinder Software, we have the benefit of having two independent version numbers for each library). I.e. in X.Y.Z, increasing Z means a bugfix has occured but, the version remains compatible at a binary level with X.Y.; X..* is compatible at an API-level, but not necessarilly binary-compatible. Changing X means the API compatibility is broken.

The dangers of making breaking API changes of such widely-popular (wildly-popular?) software should be obvious: code generated from such tools will no longer compile with the latest-and-greatest version of the software. As binary compatibility is also broken, using more than one version of the software in any “client” software - i.e. any software that uses the software in question - can be very dangerous. I have had to deal with a problem like this before: I maintained a local version of Xerces-C at one of the companies I worked for. My first course of action was to move everyone from Xerces-C1 to Xerces-C2, which was an API-breaking change, By then, Xerces-C2 was at version 2.3 and Xerces-C1 was deprecated for all intents and purposes. Also, the users for Xerces-C1 were running into a few bugs that had already been repaired in Xerces-C2 but were making life more difficult for the software team. The move was a painful one, as some parts of the code had to be re-written from scratch while others needed to be modified. It paid off, however, as we saw our performance increase and at least some of our bugs go away.

As the software had only a few releases to create every year and we knew in advance when those were going to be, keeping up with Xerces-C development was relatively easy: whenever the upstream base was updated and we were close enough to a release to bother but far enough to be bothered, I recompiled the entire base with the new Xerces-C version, ran the extensive tests on it and, if all the tests passed, packaged it up, put it on our internal package manager and sent out an E-mail. From Xerces-C2.3 to Xerces-C 2.4 that went fine. With Xerces-C 2.5, a new memory management scheme was introduced that broke our code. After some fruitless discussions with the Xerces-C team and some internal discussions, we decided to fork, ripping out the new memory management until subsequent versions of “vanilla” Xerces-C survived the tests. I had some issues with the way the memory management code was designed, so I was not willing to put any time in it in order to fix it, and forking was less effort than repairing. I have since left the company I maintained the fork for and have been told the latest version of Xerces-C2 no longer pose the problem.

The change from version 1.x to version 2.x was painful, but worth it: the team had been using 1.x for quite a while, but 2.x was ready and stable and, though there was some code changing involved, the result was better, faster and more stable code. The change from 2.4 to 2.5 was not intended to break anything except binary compatibility and was intended to allow for radical optimization by client code. In our experience, however, the change was painful and barely worth it as it rocked our confidence in the Xerces team and finally meant a fork, which in itself can be a costly, painful process.

Now, version 3 is upon us. The lesson I have learned from going from version 1.x to version 2.x and from version 2.4 to version 2.5 is to take a wait-and-see approach, which is regrettably the wrong approach: the current version of Xerces-C2 is version 2.8.0, which Boris Kolpackov volunteered to maintain for exactly the right reasons:

While there are preparations to release 3.0 soon, many existing applications won't be able to use 3.0 immediately because of the extensive API changes but would greatly benefit from a large number of bug fixes that have been committed to repository. I therefore have volunteered to be a release manager for 2.8.0.

The effect of this is likely to be that many, many users will continue to use Xerces-C2 waiting for Xerces-C3 to mature which, because of the fact that Xerces-C2 implements a deprecated, non-standard DOM API, is a Bad Thing. It is one of the dangers of making breaking changes to the API of a very popular software package, though: deprecated versions live longer like that…