Alberto Savoia on crappy code and the beauty of testing

Alberto Savoia is on a mission: He wants to banish crappy software, or at the very least, remove any excuses for its existence. His company, Agitar Software, is making people pay attention to testing and how it should be done, winning a couple of Jolt awards and a Wall Street Journal Innovation Award.

Savoia will present “Better Code: Recognizing, Avoiding and Sanitizing Crappy Software” at the Business of Software conference. Bob Cramblitt spoke to him about the causes and effects of crappy software, a Google-like approach to software quality, and how testing can be a beautiful thing.

Cramblitt: Your company is named Agitar Software and you are a self-proclaimed agitator. Where are you doing your most serious agitation these days?

Savoia: My mission and passion is to help developers take responsibility for, and succeed at, unit testing their own code. There’s an obvious and beautiful symmetry between code and unit tests: every piece of code that does something non-trivial should have a set of unit tests to make sure that it does the right thing – and continues to do the right thing as the code evolves. Unit tests benefit everybody: developers working on the code today, the developers who inherit the code, QA, and – of course – end users.

Today, unfortunately, the organizations that are serious about unit testing are a minority. I plan to keep stirring things up – agitating if you will – until developer testing becomes the norm rather than the exception. I expect that this will keep me busy for a few more years.

Cramblitt: Why don’t software developers test or test better than they do?

Savoia: I used to say that there are no excuses for developers not providing thorough automated unit tests for their own code, but I have softened my stance a little bit. Developers actually do have some excuses for not testing as much as they should.

Test development is hard; it’s a combinatorial problem and it can consume a lot of time – especially if you don’t use some test automation technology. We have found that, on average, for every line of Java code, you need 3-5 lines of JUnit to achieve adequate code coverage. When you look at the kind of code most developers have to deal with, and the amount of time they are given to do their work, you realize that the schedule they are given doesn’t have much room for testing. And – amazing as it sounds – there are development managers who don’t want their developers to “waste time” writing unit tests: “Just get the features in and let QA decide if they work or not.”

With this short-term perspective and schedule restrictions, it’s not surprising that even some well-meaning developers don’t unit test as well or as much as they should.

Cramblitt: You once said that crappy code is often defined as code you didn’t write yourself. Can you elaborate?

Savoia: A few months ago, I saw a guy wearing a T-shirt that said: “Hell is other people’s code,” an interesting take on Sartre’s famous quote: “Hell is other people.” Like almost all developers, I got into programming because I loved the idea of using code to create something new and useful from basic components – coding is like playing with Lego for the mind. When I first started, I couldn’t believe that people were willing to pay me so well for doing something that was so much fun. Inheriting and having to maintain a steaming pile of legacy code that someone else wrote, on the other hand, is real work – hard work – and, for many people, crappy work.

Cramblitt: Is crappy code in the eye of the beholder or are there objective ways to define it?

Savoia: There is no foolproof, 100% accurate, objective way to determine if a particular piece of code is crappy or not. However, our intuition – backed by research and lots of empirical evidence – is that unnecessarily complex and convoluted code is the most likely to elicit the “crap response” from the poor developer who has to inherit it. Furthermore, since complex code is particularly hard to test, crappy code usually comes without any automated tests.

Since complexity and lack of tests are key factors in making code crappy – and a maintenance challenge – my Agitar Labs colleague Bob Evans and I have been experimenting with a metric based on those two measurements. The Change Risk Analysis and Prediction (C.R.A.P.) index uses cyclomatic complexity and code coverage from automated tests to estimate the effort and risk associated with maintaining legacy code. We have implemented a free, open-source, experimental tool called “crap4j” that calculates the C.R.A.P. index for Java code. We need more experience and time to fine-tune it, but the initial results are extremely encouraging and we have started to use it in-house. Crap4j is implemented as an Eclipse plug-in, and people can download it and experiment along with us by going to SourceForge and searching for “crap4j”.
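
The interview does not spell out how the two measurements are combined. As a rough sketch, the formula published alongside crap4j scores each method as CRAP(m) = comp(m)^2 × (1 − cov(m)/100)^3 + comp(m), where comp is the method's cyclomatic complexity and cov is its percentage coverage from automated tests. The formula is brought in from outside this interview, and the class and method names below are hypothetical; this is not the tool's actual source.

```java
// Hypothetical sketch of a per-method C.R.A.P. score; not crap4j's actual source.
// Assumed formula: comp^2 * (1 - cov/100)^3 + comp
final class CrapIndex {
    private CrapIndex() {}

    /**
     * @param complexity      cyclomatic complexity of the method (at least 1)
     * @param coveragePercent coverage by automated tests, from 0 to 100
     * @return the score; higher means riskier to change
     */
    static double score(int complexity, double coveragePercent) {
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be at least 1");
        }
        if (coveragePercent < 0 || coveragePercent > 100) {
            throw new IllegalArgumentException("coverage must be between 0 and 100");
        }
        double uncovered = 1.0 - coveragePercent / 100.0;
        return (double) complexity * complexity * Math.pow(uncovered, 3) + complexity;
    }

    public static void main(String[] args) {
        System.out.println(score(10, 0));   // complex and untested: prints 110.0
        System.out.println(score(10, 100)); // same method, fully covered: prints 10.0
    }
}
```

Under this formula, full coverage can only bring the score down to the method's own complexity, so a very convoluted method still scores high even when it is thoroughly tested.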

Cramblitt: Good code can become crap when it changes hands. How does a company ensure that code can be maintained and improved after the original writers are no longer involved?

Savoia: Since the actual work will have to be done by individual developers, the company should take time to train each developer on the proper way to deal with legacy code. The book Working Effectively with Legacy Code by Michael Feathers should be mandatory reading for managers and developers alike. Here is a quick overview of the main directions I would give to a developer maintaining someone else’s code:

1) Make sure you have tests for a particular piece of code before you change it. A test suite is a beautiful thing to have and it will give you a safety net as you make those changes. If you are working with legacy code and have no tests, you should become an expert at writing tests that capture the original behavior of the code before making any changes. Michael Feathers calls these tests “characterization tests.”

2) Once you have tests, refactor the code to reduce complexity and make your maintenance work cleaner. Method extraction is my favorite refactoring; it’s fast, safe (especially if you have tests), and easy to revert if needed.

3) Rename cryptic or ambiguous identifiers. This is a pet peeve of mine. In this age of cheap computing resources, there’s no excuse for using a variable name like “ac” instead of “areaCode.”

4) Delete. If you can identify dead, useless, redundant or obsolete code, get rid of it.

In other words, leave each piece of code you touch in better shape than it was before you worked on it. You can use “crap4j” or some other metrics tools that measure complexity and code coverage to make sure you are on the right track.
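
The first three steps can be sketched on a tiny, made-up example. The `LegacyPhone` class, its `fmt` method, the refactored `PhoneFormatter`, and the sample inputs are all hypothetical, and plain Java checks stand in for JUnit so that the sketch runs without extra libraries.

```java
// Hypothetical legacy code: cryptic names, everything in one method.
final class LegacyPhone {
    static String fmt(String s) {
        String d = s.replaceAll("[^0-9]", "");
        if (d.length() != 10) return s;
        String ac = d.substring(0, 3);
        return "(" + ac + ") " + d.substring(3, 6) + "-" + d.substring(6);
    }
}

// After refactoring: same behavior, one extracted method, descriptive names.
final class PhoneFormatter {
    static String format(String rawNumber) {
        String digits = digitsOnly(rawNumber);
        if (digits.length() != 10) return rawNumber;
        String areaCode = digits.substring(0, 3);
        String exchange = digits.substring(3, 6);
        String subscriber = digits.substring(6);
        return "(" + areaCode + ") " + exchange + "-" + subscriber;
    }

    // Extracted from the original one-liner so it can be named and reused.
    private static String digitsOnly(String text) {
        return text.replaceAll("[^0-9]", "");
    }
}

final class PhoneCharacterizationTest {
    // Each input is paired with whatever the legacy code returned when first
    // observed. The goal is to pin down current behavior, not to judge it.
    static final String[][] OBSERVED = {
        {"4155551234", "(415) 555-1234"},
        {"415-555-1234", "(415) 555-1234"},
        {"555-1234", "555-1234"},
        {"", ""},
    };

    static void check(String actual, String expected) {
        if (!actual.equals(expected)) {
            throw new AssertionError("expected <" + expected + "> but got <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        for (String[] observation : OBSERVED) {
            check(LegacyPhone.fmt(observation[0]), observation[1]);       // step 1: capture behavior
            check(PhoneFormatter.format(observation[0]), observation[1]); // steps 2-3: behavior kept
        }
        System.out.println("legacy behavior preserved");
    }
}
```

Running the same observations against both versions is what makes the refactoring safe: if the extracted method or a rename accidentally changed a result, one of the checks would fail.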

Cramblitt: You’ve said that an application with a crappy code base can still be extremely useful or wildly successful. What are some examples of this and why are these apps successful?

Savoia: With some rare exceptions, the most successful and long-lived applications tend to have the crappiest code bases. To understand why this is so, we need to understand the difference between a software application and the code for that application. Software is the finished product, what the user uses and experiences. Code is what makes up the software and what the developers use and experience. Those two experiences can be as different as the experience of a diner enjoying a sausage and that of the meat processing workers making that sausage. The user of the software is shielded from the nastiness of its inner workings.

How successful software usually leads to crappy code is easy to understand: application success means long life, and long life means maintenance and adding features, support for new platforms, etc., while maintaining backward compatibility – something that’s very hard to do cleanly. The more widely used an application is, the worse the problem – think of the compatibility and configuration challenges that Microsoft developers have to face with Windows.

Cramblitt: Even if an app is successful, is it only a matter of time before the crappy underlying code makes the app topple like a house of cards?

Savoia: In most cases successful crappy software is not allowed to topple because it’s too important to the business. But keeping it upright and functioning requires continuous heroic efforts and consumes a tremendous amount of money and resources. The opportunity cost due to crappy code is huge.

Cramblitt: What’s the most important thing developers can do to avoid crappy code?

Savoia: The answer is pretty simple and, by now, it should not be surprising. I can phrase it as follows: If you write code, write tests for it.

Write tests whether you are writing new code, or maintaining existing code. The simple act of writing tests has many beneficial side effects beyond the obvious ones. It forces you to keep the code simple and testable. It forces you to think about your code from the user perspective – which often results in better design and more natural interfaces. It helps you discover corner cases and exceptions before they byte you. And so on. You know how doctors say that if they could put the health benefits of exercise in a pill, it would be the most prescribed pill in the world. I feel the same about testing. Think of tests as exercises for your code – the best thing you can do to keep your code healthy and in shape.

It’s hard to have a thorough set of tests if the code is hard to test, and to make code easy to test it has to be relatively simple to use and understand, it has to have minimal dependencies, etc. So, if the code is accompanied by a thorough set of tests, chances are that it will not be crappy.
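
As one illustration of the dependency point, a method that reads the system clock directly is awkward to test, while one that receives its time source from outside can be checked against a fixed clock. The `Greeter` class and its messages are hypothetical; `java.time.Clock` is the standard library's injectable clock.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;

// Hypothetical example: the only dependency (the current time) is passed in.
final class Greeter {
    private final Clock clock;

    Greeter(Clock clock) {
        this.clock = clock;
    }

    String greeting() {
        int hour = LocalTime.now(clock).getHour();
        return hour < 12 ? "Good morning" : "Good afternoon";
    }

    public static void main(String[] args) {
        // Fixed clocks make the result repeatable, whatever the real time is.
        Clock nineAm = Clock.fixed(Instant.parse("2007-10-29T09:00:00Z"), ZoneOffset.UTC);
        Clock threePm = Clock.fixed(Instant.parse("2007-10-29T15:00:00Z"), ZoneOffset.UTC);
        System.out.println(new Greeter(nineAm).greeting());  // prints Good morning
        System.out.println(new Greeter(threePm).greeting()); // prints Good afternoon

        // Production code would pass the real clock instead.
        System.out.println(new Greeter(Clock.systemDefaultZone()).greeting());
    }
}
```

The test needs no sleeping, no mocking framework and no particular time of day, which is the kind of simplicity that tends to come with code written to be tested.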

Cramblitt: A couple of years ago, InfoWorld quoted you as saying that Agitar wants "to do for software quality what Google has done for search quality." How do you intend to achieve that and how are you doing so far?

Savoia: I was director of software engineering at Google before starting Agitar and saw first-hand how a bunch of very smart people, armed with huge amounts of CPU power and a very sharp focus on solving a very difficult, but clearly specified, problem can achieve amazing results. Agitar is focused exclusively on the very difficult problem of test automation and, in addition to hiring the brightest people we can find to do research and develop solutions, we have also made some significant investments in computing resources. The latest version of our flagship product, AgitarOne, can generate over 250,000 lines of JUnit per hour – that’s roughly equivalent to 1,000 very productive programmers. On top of that, our generated tests now achieve an average of 80 percent code coverage.

Like Google, we are pretty generous with our technology and resources. For example, we offer a free hosted version of our test generator. I’d say that we are well on our way to doing for software quality what Google has done for search quality.

Cramblitt: You contributed a piece called "Beautiful Tests" to the new O’Reilly book called Beautiful Code. What are the principal attributes of beautiful testing?

Savoia: The main purpose of tests is to instill, reinforce or reconfirm our confidence that the code works correctly and efficiently. Therefore, to me, the most beautiful tests are those that help me maximize my confidence that the code does, and will continue to do, what it’s supposed to. Because different types of tests are needed to verify different properties of the code, the basic criteria for beauty vary.

There are tests that are beautiful for their simplicity. With a few lines of test code I can document and verify the target code’s basic behavior. By automatically running those tests with every build, I can ensure that the intended behavior is preserved as the code evolves. I love tests that take minutes to write and keep paying dividends for the life of the project.

Then there are tests that are beautiful because they reveal ways to make code more elegant, maintainable and testable. In other words, tests that help make code more beautiful. The process of writing tests often helps you realize not only logical problems, but also structural and design issues with your implementation. Writing tests forces you to put yourself in the shoes of a user of your code and that’s an invaluable perspective.

Finally, there are tests that are beautiful for their breadth and depth. Very thorough and exhaustive tests boost the developer’s confidence that the code functions as expected not only on some basic, or handpicked, cases, but in all cases.
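
To make the first and last kinds of test concrete, here is a sketch that assumes binary search as the code under test (the interview itself does not name any). A few one-line checks document the basic behavior; a brute-force loop then compares the result against a plain linear scan for every array of one simple shape up to a chosen length. All names are hypothetical, and plain Java checks stand in for JUnit.

```java
// Hypothetical code under test plus two styles of test for it.
final class BinarySearchTests {
    /** Returns an index of target in the sorted array, or -1 if it is absent. */
    static int binarySearch(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids int overflow of low + high
            if (sorted[mid] < target) {
                low = mid + 1;
            } else if (sorted[mid] > target) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // Trivially correct reference used to cross-check the faster search.
    static boolean linearContains(int[] values, int target) {
        for (int value : values) {
            if (value == target) return true;
        }
        return false;
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    /** Simple tests: a few lines that document the basic behavior. */
    static void smokeTests() {
        int[] values = {1, 3, 5, 7};
        check(binarySearch(values, 1) == 0, "finds the first element");
        check(binarySearch(values, 7) == 3, "finds the last element");
        check(binarySearch(values, 4) == -1, "reports a missing element");
        check(binarySearch(new int[0], 4) == -1, "handles an empty array");
    }

    /**
     * Broad tests: arrays {0, 2, 4, ...} of every length up to maxLength,
     * searched for every target from -1 to twice the length.
     * Returns the number of searches that were checked.
     */
    static int exhaustiveTests(int maxLength) {
        int checks = 0;
        for (int length = 0; length <= maxLength; length++) {
            int[] values = new int[length];
            for (int i = 0; i < length; i++) values[i] = 2 * i; // odd targets are absent
            for (int target = -1; target <= 2 * length; target++) {
                int index = binarySearch(values, target);
                if (linearContains(values, target)) {
                    check(index >= 0 && values[index] == target, "missed " + target);
                } else {
                    check(index == -1, "false hit for " + target);
                }
                checks++;
            }
        }
        return checks;
    }

    public static void main(String[] args) {
        smokeTests();
        System.out.println(exhaustiveTests(50) + " searches checked");
    }
}
```

The loop is exhaustive only within its bounds (one family of arrays, limited lengths), so it raises confidence rather than proving correctness; covering "all cases" for real inputs usually takes a mix of such sweeps, random inputs and boundary values.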

One response to “Alberto Savoia on crappy code and the beauty of testing”

  1. Shad Aumann says:

    Alberto Savoia recently posted a really interesting article about how automated software quality metrics could potentially be misused.