Posted in Architecture
When a new IT solution is needed in an enterprise, maybe because the business is changing or maybe because an existing manual process should be automated, the people who are in charge of implementing the solution usually quickly get to the question: should we build the solution or should we buy a package? For a long time the accepted wisdom has been to always buy when possible and only build when no suitable packaged solution exists in the market.
In this article series, which will eventually be published in this book, I want to explore the reasons behind the existing preference for buying and, more importantly, I want to challenge the accepted wisdom and explain why I believe that the changes that have occurred in software development over the last decade have shifted the answer to the buy-or-build question away from buy and more towards build. I am not going as far as suggesting that build should be the new default but I think building software should be considered more often and it should be considered even when a suitable package exists.
In my experience the case for buying software is often not made on the back of arguments that show why buying is a good idea. The usual route seems to be to analyse why building is a bad idea. This in itself says something. It is as if building is the natural answer and only problems with it make it necessary to take a different approach. So, what are these problems that have led people to preferring to buy a package?
The most common argument against building software is the understanding that building software is risky. There are stories abound of failed software projects and of software projects running late, exceeding their budgets not by a few percentage points but by many times the originally anticipated cost. One way for an organisation to address this risk is to buy the software for a pre-determined price from a vendor. This shifts the risk of budget overruns to the vendor, and if the software has been completed before the purchase is made, as is generally the case with packed software, then there is not even a risk that the software will not be ready on time.
The obvious question is whether there really is something inherently risky in software development that has caused so many projects to be unsuccessful. On the one hand software development is a comparatively young discipline, at least when compared to architecture, in the building houses sense, or to industrialised production of goods. Consequently, there is less experience, practices are not as established, and many ideas and concepts have not been turned into repeatable processes. On the other hand, though, software is flexible and infinitely malleable. With the right techniques it should be possible not only to produce exactly what is needed, but it should also be possible to use parts of the software before the whole is finished, and it should be possible to adapt the software to a changing business environment, all of which are would reduce the risk of a large capital expenditure.
From an organisational perspective the real question is even slightly different: whose risk is actually reduced by buying software? In many organisations the department responsible for building or buying software, the IT department, is separate from the department that uses the software, the business. The IT department is often measured more on delivering on time and on budget than on the overall cost or potential realised. So, for the IT department there is little to no incentive to build because buying shifts all the risk in areas that matter to the IT department onto the external software vendor.
For the business, however, the situation is quite different. Not only is it possible that custom software may have been cheaper, therefore providing an earlier return on the investment, but it is also possible, and not uncommon, that the software package does not do exactly what the business needs, leading to decreased productivity and lost opportunities. Of course, these concerns should be addressed in the decision making process. Unfortunately, the process is normally run by the IT department, which is set up to care more about different risks. As a result it is possible that the selection and decision process minimises the risk for the IT department but does not optimise the solution for the organisation as a whole.
A good example of this conflict is the introduction of a CRM solution at Australia's largest telco. A system was bought from one of the market leaders, clearly reducing the risk for the IT department, but as a newspaper reported "[...] dealers at the coalface of the telco's retail chain are still saying the clumsy customer and billing platform at the heart of the company's multi-billion IT transformation is costing them business." Minimal risk for the IT department clearly did not translate into an optimal solution for the business.
If the business may really be better served by building, is it possible to take the risk out of building for the IT department? To start with, major breakthroughs in development methods over the past decade have resulted in much more predicable software delivery. Processes built around the agile value system deliver working software in small iterations, not only releasing functionality incrementally, but also giving the business a chance to provide feedback and, if necessary, allowing the developers to adapt the software. This greatly reduces the risk that the software does not meet the needs of the business. Software development teams have gained the ability to react to changes and develop software iteratively through techniques and practices as those described by the Extreme Programming and Continuous Delivery methodologies.
More intriguingly, it is possible to reduce the risk by shifting the goal. What if the main goal was no longer to deliver a large number of function points in a pre-determined time and budget? What if the core goal was to deliver the most needed functionality at any given point in time as efficiently and effectively as possible? Again, agile practices would clearly help, but how crazy is this idea? In 1982 Tom DeMarco opened his hugely influential book on Software Engineering with the line "you can't control what you can't measure". Interestingly, in an IEEE paper in 2009 he writes: "For the past 40 years [...] we've tortured ourselves over our inability to finish a software project on time and on budget. But as I hinted earlier, this never should have been the supreme goal. The more important goal is transformation, creating software that changes the world or that transforms a company or how it does business." I could not have said this any better.
Another common argument against building software is that it takes a lot of time and effort to implement. A business function that can be described on a couple of pages requires several hundred lines of code, which take a developer weeks to write and test, and yet more time to be deployed into a production environment. Complex systems that can only be described on several hundred pages require years to be implemented. One obvious answer would be to live with the inefficiency but at least to reduce the time by putting a larger team to the task. Unfortunately, it has become more than obvious that the speed of software development does not scale with the number of people on the team. At some point the communication and coordination overhead cancels out any further brainpower.
In the same way that buying a package shifts risk to the vendor, it shifts the problem of efficient delivery to the vendor, too. The buyer simply has to run a tender and procurement process and oversee installation of the software package. (I will return to installation in part 2.) Moreover, when multiple parties buy the same software the vendor can spread the cost of development across all buyers, thus reducing the impact of the inefficiencies for each buyer.
Is software development really so tedious and inefficient as proponents of buying suggest? And what could be done to make software development more efficient? In his 1987 paper Fred Brooks explains why "there is no single development, in either technology or in management technique, that by itself promises even one order-of-magnitude improvement in productivity". He adds, though, that "many encouraging innovations are under way" and that "a disciplined, consistent effort to develop, propagate, and exploit these innovations should indeed yield an order-of-magnitude improvement." I believe that he was right and in the 25 years since he wrote this paper huge gains in developer productivity have been achieved.
Brooks makes an important distinction between essential complexity and accidental complexity. Essential complexity is the complexity of the business problem the system is implementing. It might seem that there is nothing an IT team can do to reduce essential complexity. They cannot and should not change the business. Interestingly, though, in projects with long cycle times (years) there is a tendency for the business to be somewhat speculative and request all functionality that they can think of because of the worry that if they leave something out of the requirements it will not be delivered for a very long time. With prioritised iterative delivery the business can halt a project when all features that are actually needed have been completed. Of course, this does not really reduce essential complexity but it does reduce the amount of features that are implemented, and based on my experience, quite substantially so.
Returning to Brook's distinction, accidental complexity is the complexity caused by the tools and techniques used to implement the system. This is complexity that would not exist without the implementation. As Brooks predicted no order-of-magnitude reductions in accidental complexity have occurred. For example, there is no widespread use of expert systems or graphical programming environments, and it is not for a lack of trying that such tools do not exist. End-user programming, or automatic programming as Brooks called it, is not in widespread use either, maybe with the exception of Excel spreadsheets. However, domain specific languages are a promising current development that has the potential to achieve some breakthroughs in this area.
One area that Brooks underestimated are integrated development environments (IDEs). The capabilities a modern IDE provides in areas such as code navigation and automated refactoring have resulted in impressive productivity gains. Automated refactoring in combination with test-driven development have made it possible for developers to achieve high internal software quality with less effort, thus not only increasing the efficiency with which code is written initially but also making the pace of development more sustainable as the codebase grows.
Developer workstations have become substantially faster, allowing for more complex IDEs and tools. Improved compilers, for example, create more optimised code without developers having to worry about details such as register allocations. The major productivity boost in this area comes with the fast internet connection, though. Rather than experimenting or discussing problems on mailing lists and newsgroups, developers can now access online communities that provide answers to many common development issues within minutes, either because previous discussions of the problem are easily searchable and accessible or because peers answer in real-time using internet relay chat (IRC).
The improvements in hardware obviously do not only affect developer workstations but they also impact production environments. In many cases it is now possible for developers to trade some wastefulness of resources at runtime against increased productivity at development time. For example, instead of spending effort on writing code to manage memory, developers can reduce the accidental complexity associated with manual memory management by employing an automatic garbage collector. There is a line of argument that on modern hardware automatic memory management even increases efficiency due to higher locality of reference and better resulting cache use, but even if automatic memory management is slower, the trade-off is worthwhile in all but a few specialised areas.
Lastly, Brooks and many others in the early nineties envisioned a thriving commercial market for components, creating a middle ground between buying complete packages and writing the entire solution. By and large such markets have not been overly successful. There are exceptions, but in general these markets have been superseded but something even better: open source frameworks. Many developers and development organisations have found that sharing the source code for basic building blocks did not harm their organisations, by helping the competition, but instead helped the organisation by gaining free development capacity from other collaborators. A further benefit of frameworks that are created directly by the practitioners is that their feature set is determined by actual needs and not driven by feature checklists. Innovation extends to the public sites on which the frameworks are maintained, enabling ever more flexible and effective ways to share code.
Over the last 30-40 years software solutions have reached more and more areas of businesses but it is important to remember that they started as tools providing operational efficiency in certain areas. It is maybe that background that has led many organisations to believing that software development is not part of their core competency. Sometimes people refer back to examples from the early twentieth century when factories had their own electricity generation facilities. The analogy is clear: factories switched to electricity off the grid as soon as that was available because generating electricity was not at the core of their business. Their core competency lay elsewhere, in the production of cars for example. So, why should an enterprise that, say, specialises in managing insurance have competency in software development? In fact, with cloud computing the question that is being asked is whether many companies should have any competence in IT, or whether they should move it all into the cloud, much like the factories moved their energy problem to the grid.
This issue is different from the two issues discussed previously. The assessment of risk and efficiency is changing because of advances in the way software is developed and delivered. The question of competency is mostly affected by how the businesses are changing. Software solutions are no longer just tools that make it possible to run a certain business process more efficiently. Software solutions have become enablers of completely new ways of doing business.
About 25 years ago a newspaper benefitted from electronic desktop publishing, but, still, this only made the process of laying out the pages more efficient. There was little reason to write bespoke publishing software. Today, as much of the business has shifted onto the web, the software behind the newspapers' websites has become strategically important to their business. For example, the audience is no longer just the loyal reader of the paper. It is equally the person who arrives at an article via a search engine. If the website can provide easy access to related content, the visitor will stay longer, see more adverts, and therefore increase revenue for the paper. Providing compelling content is still a task for the editors and writers, but making it available at the right time in the right context is a function of the software.
The newspaper example is just one of many. The trend for software to become a strategic enabler for the business can be found in many domains and verticals. Now, if a piece of software is strategic to a business, the result of choosing from the same set of packages as the competition should be obvious: little or no competitive advantage. As discussed before there may be less risk for the IT department by buying a package, but it certainly will make it much harder for the business to get an edge over the competition who can chose from the same packages. Even worse, though, reacting to change or implementing a brilliant new idea will take much longer if a product and an external vendor are involved. Cycle-time, that the time between having an idea and seeing it running in production, can be reduced by having a well-oiled IT function that uses practices and techniques from Agile methods and Continuous Delivery.
It seems that even if software development was not a core competency for most companies it is rapidly becoming strategically important for many to make software development a core competency. This may require some painful changes and a shift in corporate culture, but ignoring new realities has rarely helped either.
In part 2 I discuss issues with buying, traditional issues and new issues that have emerged over the past 5-10 years.
 Mitchell Bingemann, "Dealers still fuming at 'clumsy' Telstra system". The Australian, 2 June 2009.
 Tom DeMarco, "Software Engineering: An Idea Whose Time Has Come and Gone?" IEEE Software, July/August 2009.