April 03, 2009

Programming by coincidence

A little while ago I was wrestling mightily with SOAP and posted this on Twitter:

every time I do anything nontrivial in SOAP I resort to programming by coincidence, and I really hate it

One of my coworkers asked me to clarify what I meant, and I summed it up as best I could given the space:

by coincidence: A might work, try it! no? B might work, try it! no? C might work...

The Pragmatic Programmer has a good deal to say on this, but the related tip will do for now:

Don't Program by Coincidence
Rely only on reliable things. Beware of accidental complexity, and don't confuse a happy coincidence with a purposeful plan.

When we talk about elegant systems we usually extol the virtues of componentization: the idea of writing applications by putting lego-like pieces together has been around for probably as long as software itself. And a lot of times that works swimmingly (see CPAN).

However, one area where this breaks down is when you're using a fairly complex component that itself wraps up a whole bunch of other components. The most common case is with frameworks.

In this case, it was Apache CXF. Don't get me wrong, I think CXF is a great piece of software. It has decent documentation, excellent integration, a very responsive developer community, and the benefit of being a rethink of earlier projects.

But there is a lot of software assumed when you say "CXF". Its lib/ directory has 65 separate JAR files. They're almost certainly not all necessary for your task -- in fact, there's a very helpful WHICH_JARS file in that directory to tell you which ones you need for which use cases.

My use of CXF is pretty plain vanilla. And because it's not central to our application I don't know it that well -- in fact, I want to know as little as possible to get it working. [1] And this is where we run into some tension.

Understanding SOAP top to bottom involves a huge number of fairly complex technologies: WSDL, XML Schema, databinding, and code generation tools chief among them. You can spend many solid weeks really understanding each of these on its own, and even longer on how the different software packages implement and internalize them. And when you stack them on top of one another, and you're not sure which layer is generating error messages, or whether you're wiring them together properly? Ouch.

I'm not arguing against anything except complexity, which is kind of like arguing against river floods or crushed puppies. But when we build these software stacks, such complexity should be an indication that something is deeply wrong. Because software is supposed to help us manage the complexity of our lives, not add to it. Most domains have enough essential complexity that using such stacks just makes things worse. This is one of the reasons so many people reinvent such frameworks (and even the components used): they're trying to impose a consistent worldview from top to bottom, so people won't have to resort to coincidences to get things done.

[1] This isn't true for all such projects -- with some (like Jersey) I'm actually interested in understanding how the project views the world and solves problems.
