Your software development team likely has a process for dealing with requirements, estimating, allocating tasks and completing coding work. After all, developing software is just about providing application features isn't it?
Each and every software project has an internal work stream, which if neglected can start undermining your ability to produce quality software. I've heard this type of work referred to by a number of different names: core, infrastructure, supporting code or framework. I propose that it can be summed up as:
Work that needs to be done to support getting the primary work done.
I'll refer to this type of work as infrastructure throughout the rest of this post.
Let's look at a typical enterprise software project, what are some examples of infrastructure work?
- Day one experiences
- Workstation setup
- Installed software
- E-mail setup
- Network permissions
- Team introductions
- Code & business walkthroughs
- Development servers
- Software licenses
- Source control
- Logging & monitoring
- Security & Permissions
- Continuous Integration Setup
- Centralised compilation
- Automated tests
- Code inspection and reporting
- Scripted deployment
- Code + configuration
- Dependencies such as IIS, MSMQ
- Automated backups and disaster recovery
- Living documentation - e.g. wiki.
I'd even go so far as classifying refactoring and unit testing as a form of on-going infrastructure work.
Physical infrastructure work usually consists of one-off items which can sometimes be solved with a credit card. If you can solve the problem with money then take advantage! It's the problems which can only be solved with time which grow into ugly monsters.
As a software developer I experience most problems (unsurprisingly) around development infrastructure - in particular around refactoring and continuous integration systems. What happens when we start neglecting such items of work? Firstly, we should recognise why infrastructure tasks often take a back seat:
- Lack of time. Ever heard "we don't have enough time to setup a repeatable build and deployment system", only for the project to spend months in the integration and roll-out phase? Or my personal favourite "we don't have time for backups", only for progress to grind to a halt when the shared development server overheats and cooks the hard drive? I've experienced both; the latter resulting in the calamity of attempting to rebuild the server from copied files on each developer's machine.
- Lack of money. The customer is only interested in paying for software features. Management therefore decides to concentrate resources on developing features exclusively. Everything else is labelled as non-billable and therefore not justifiable as legitimate work items.
- Lack of experience. Sometimes a development team, in particular management, just doesn’t have the experience required to recognise that infrastructure work exists. If you don't know that work exists how can you possibly plan and allocate resources to it?
- Lack of skills. You may have a crack team of developers writing code but development infrastructure tasks require a holistic overview of the project. Infrastructure work often touches all areas of a system, meaning that people skills are just as important as technical ability.
- Lack of discipline. Given the choice most developers would like to spend their time writing code and not spending time on those other "boring tasks".
- Lack of tools. I've participated in teams where the purchasing of software licenses is handled by a far removed finance department. This often results in the development team waiting weeks and weeks for a piece of software that would otherwise dramatically improve their productivity. In one case an order was rejected because the finance department didn't understand what the software was for! This is a slight variation on the cost-cutting behaviours in the Short Pencil project pattern.
So, we start our project and spend our time writing features and neglecting infrastructure work:
Look at all of those features we are getting done! Unfortunately this graph doesn’t show the consequences of such an approach, for that we need to look at the overall team velocity:
It's possible for the team to achieve a pretty impressive velocity during the early stages (or iterations) of the project. The problem is that this approach is unsustainable. At some point the technical debt from neglecting infrastructure tasks will begin to overwhelm the team. Examples of such technical debt are the accumulation of spaghetti code from a lack of refactoring, brittle code from a lack of testing, a painful manual integration or deployment step or attempting to retrospectively apply logging and permissions across the entire application.
The end result is that progress can slow to where it's possible for your project to remain 90% complete for 90% of its lifecycle - killing team morale in the process. Your project has derailed into a Death March. Any attempts to predict a completion date go out of the window and you'll join project Chandler and many other software projects at the bottom of Fred Brooke's tar pit.
By ignoring infrastructural tasks you've created a massive amount of undone work. To make matters worse you are faced with the monumental task of having to complete the undone work at the end of project where it is much more difficult.
A better method is to prioritise both feature and infrastructure work based on the projects lifecycle:
What we see is that during the initial phases of the project a lot more effort is spent on infrastructure tasks than on developing features. The development team will firstly spend their time creating an environment which enables them to ship features quickly, but spend only a small amount of time actually developing such features. Such work items could involve setting up a CruiseControl server, creating virtualised development environments, implementing and agreeing on automated code inspections. As the project progresses we see items such as unit tests, security, logging and code inspection taking a more prominent role.
The idea is to even out the team velocity to a sustainable and predictable pace.
The failure mode for this infrastructure-first approach is when the project manager measures progress by the number of completed features. The problem with infrastructure tasks is that they are much less visible than software features and therefore more difficult to manage. Since less time is being spent on features it can appear (from the outside) as if the project is in trouble from the beginning. The temptation is then to add more resources to the project which is the easiest way to make a bad situation worse.
Technical Debt has a less known cousin, Technical Credit. By prioritising sound software engineering principles and paying attention to infrastructural work it's possible to reap massive benefits in later iterations.