Wednesday, October 10, 2012

The Paradox of Software Complexity

A common principle in user interface design is the conservation of complexity. Every task has some fundamental, irreducible complexity. Designers can either hide this complexity behind a simple interface or expose it to the user through a richer, more complex interface. What if a designer adds useless widgets to the interface? That adds complexity beyond what the task needs, so the interface has been complicated. The same principle applies when designing abstractions in software; hiding complexity behind abstractions is one way our limited minds can build complex software. After all, it is much easier to reason about structures and functions than raw memory addresses and stack frames. Herein lies the paradox: partitioning software into abstractions is itself a complication.

Complexity can be viewed as the "inter-connectedness" of the parts of a system. The more parts there are, and the more connections between them, the more complex the system becomes. An abstraction takes some of these connected parts and hides them behind a new part. We have hidden the details, and the top level is now (hopefully) simpler. The complexity of the details does not just go away; it is only hidden behind the abstraction. On top of this, we now have a new part and its new connections. Taken as a whole, the system is now more complicated! If an abstraction is not worth the cost of this complication, then it is a bad abstraction.
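To make the counting concrete, here is a rough sketch in Python. It assumes complexity can be approximated by simply adding up parts and connections, which is a simplification made for illustration and not a real metric. The part names (client, parser, cache, disk, store) are made up for the example.

```python
def complexity(parts, connections):
    """Crude measure, assumed for illustration: parts plus connections."""
    return len(parts) + len(connections)


def touching(part, connections):
    """The connections that one part has to deal with directly."""
    return {c for c in connections if part in c}


# Before: the client talks to every detail part directly.
parts_before = {"client", "parser", "cache", "disk"}
connections_before = {
    ("client", "parser"), ("client", "cache"), ("client", "disk"),
    ("parser", "cache"), ("cache", "disk"),
}

# After: a hypothetical "store" part hides the details from the client.
# The details and their connections are still there, plus the new part.
parts_after = parts_before | {"store"}
connections_after = {
    ("client", "store"),
    ("store", "parser"), ("store", "cache"), ("store", "disk"),
    ("parser", "cache"), ("cache", "disk"),
}

total_before = complexity(parts_before, connections_before)
total_after = complexity(parts_after, connections_after)
client_view_before = len(touching("client", connections_before))
client_view_after = len(touching("client", connections_after))

print("whole system:", total_before, "->", total_after)
print("client's view:", client_view_before, "->", client_view_after)
```

In this toy system the client's view shrinks from three connections to one, while the total count grows from 9 to 11. That is the trade: the abstraction pays for a simpler top level with a larger system overall.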

For this reason, good abstractions are highly cohesive and minimally coupled. A perfect example is a C compiler. A good compiler can abstract away the details of the machine code and free our minds to build higher-level abstractions. The programmer still has to worry about managing resources, but gets the convenience of functions and loops rather than the moves and jumps that the CPU actually executes. In specific cases, a programmer will still need to know these hidden details. A C compiler is so decoupled that it does not need to be shipped with the binaries it produces. The C standard library, however, is supported by a platform-specific runtime library. This allows a lot of higher-level, platform-dependent parts to be abstracted away too. We can continue to hide complexity with virtual machines, network stacks and GUI toolkits. Every part hidden is another part we can afford to build.

In a perfect world, this article would have ended with the previous sentence. As we covered earlier, there is a complication: all of these abstractions add complexity beyond what the task itself requires. The fundamental complexity does not just go away; somebody has to manage it. To make matters worse, it is sometimes necessary to understand some of the abstracted details anyway. These leaky, complicated abstractions often result in performance, maintenance and security issues. Sometimes the complication pays off: the software components can be reused or divided more efficiently among a team. Other times, the cost of this complication will scale much faster than the software.
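A small, familiar case of a hidden detail leaking through is floating-point arithmetic. The sketch below is in Python and is only one illustration chosen for this article, not the only kind of leak. The language presents a "decimal number" abstraction, but the binary representation underneath shows through as soon as two values are compared.

```python
import math

# The abstraction suggests these are the decimal numbers 0.1, 0.2 and 0.3.
total = 0.1 + 0.2

# The hidden detail: none of these values is stored exactly in binary,
# so the naive comparison fails.
naive_equal = (total == 0.3)

# Knowing the detail, we compare with a tolerance instead.
tolerant_equal = math.isclose(total, 0.3)

print("naive comparison:", naive_equal)
print("tolerant comparison:", tolerant_equal)
```

The naive comparison is false and the tolerant one is true. To write the second version, the programmer has to know what the abstraction was supposed to hide.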