Whew! I've been really busy. Seems impossible that it's over a month since I posted, but there it is.
The week before last, I attended SQE's Agile Development Practices conference in Orlando. The highlight of the conference for me was J. B. Rainsberger's tutorial on Test-Driven Enterprise Code. I'll post my thoughts on that later, but for now I'll just say that I usually consider a conference to have been worthwhile if it changes my thinking, even in a small way, and that's exactly what his talk did.
Right now, however, I'd like to rant. I keep hearing, and heard several times at the conference, that TDD "isn't about testing". It's a "design technique". Well, that's just plain wrong.
Test-driven development is about testing. It's about testing your code as early as possible, to minimize the feedback loop. It's about forcing yourself to design your code to be testable, so that you can write good tests for it, and maintain those tests easily. TDD is about making sure you have testable, tested code. There, I said it.
Now, does TDD influence design? Sure. Testable designs also tend to be designs that separate responsibilities, producing cohesive, loosely coupled classes and functions. Furthermore, a good test suite enables refactoring, which is a critical activity in emergent design. But to say that TDD influences and supports design is a different thing from saying that it is a design technique.
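To make the distinction concrete, here is a minimal sketch in Python. The function `is_overdue` and its test are hypothetical, made up purely for illustration. Written test-first, the function ends up taking the current date as a parameter instead of reading the system clock itself. The tested code is the goal; the looser coupling is a side effect of wanting to test it.

```python
from datetime import date


def is_overdue(due: date, today: date) -> bool:
    """Return True when the due date has passed.

    `today` is passed in rather than read from the system clock,
    so a test can pin it to whatever value it needs.
    """
    return today > due


def test_is_overdue():
    due = date(2007, 12, 20)
    # The day after the due date counts as overdue.
    assert is_overdue(due, today=date(2007, 12, 21))
    # The due date itself and any earlier day do not.
    assert not is_overdue(due, today=date(2007, 12, 20))
    assert not is_overdue(due, today=date(2007, 12, 19))


test_is_overdue()
```

Had the function called `date.today()` internally, the test would have been awkward to write, which is exactly the kind of pressure toward testability described above.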
Thursday, December 20, 2007
Wednesday, November 14, 2007
Position soon to be open:
Borland's marketing communications editor.
I just got the following email from Borland:
It’s amazing that poor requirements account for 71% of failed software projects*. And, unfortunately, no matter how efficient of a requirements management system you have, it might not be enough to save your projects from joining that statistic. That’s why defining complete and accurate requirements is a critical first step to ensuring you manage the right requirements throughout a project's lifecycle. In this white paper you’ll discover how to:
- Capture and analyze requirements faster and more accurately—without relying on sticky notes and long-lost emails
- Reduce validation cycles—use storyboarding to enhance communication between teams and speed sign-off on requirements
- Synchronize business flows and test cases—build testing scenarios that accurately map to the needs of end users
Defining requirements correctly up front can be pivotal in preventing the chain reaction of poor communication and missed deadlines that lead to rework and project failure. Don’t let this happen to you. Join the other 29%.
Join the other 29%... of what? Why failed software projects, of course! Sure, 71% of failed software projects fail because of poor requirements, but after reading this paper, you can join the 29% that fail for some other reason! Yes!
Tuesday, November 13, 2007
Classifying information (or "what we know about what we know")
Information about any project can be divided into four categories:
1. Things we know (and know we know)
2. Things we know we don't know
3. Things we think we know, but don't (i.e. things we're wrong about)
4. Things we don't know we don't know
Obviously, if you were to try to actually figure out where everything falls, you would put everything into 1 or 2. Everything that should be in 3, you would put in 1 (you're not going to have known mistakes in your information), and everything that should be in 4 would simply be missing.
However, without dealing with specific items, I do think that it's possible to guess at how much "stuff" goes in each category. You can take into account your history ("I tend to often be mistaken about X"), or a general feeling of ignorance ("I've never used framework Y before") to guess how much goes in each category. For example, consider two projects:
Project 1
1. Uses a language with which the developers are familiar
2. Doesn't have many or unfamiliar external dependencies
3. Is likely to have stable requirements
Project 2
1. Uses a language with which the developers have little experience
2. Has many external dependencies, including a database of undetermined type.
3. Is likely to have changing requirements
In project 2, because of the unfamiliar language, there are likely to be a lot of things I'm mistaken about (category 3), and a lot of things I don't know I don't know (category 4). Because of the unknown database and the known instability of the requirements, there are important things I know I don't know (category 2). Now, it's unlikely I can assign a specific number to each category--I won't know that 25% of what I think I know is wrong, but I can say "I think there's a lot I don't know I don't know" or "I know this pretty well, I'm probably not mistaken about a lot".
By taking time to estimate how much of the information about the project falls into each of these categories, we can actually gain valuable information from our lack of information. The more we estimate we don't know (anything in 2, 3, or 4), the more care we need to take to accelerate feedback. Whereas project 1 might be able to do 1 month iterations, project 2 might need mostly 1 week iterations. Project 2 might require daily standups, while developers on project 1 might be fine meeting weekly or less.
Take some time to estimate what you don't know, as well as what you know. What you don't know is often one of the most powerful factors influencing a project's schedule and outcome. Agility requires you to use all the information you have at your disposal, including information about your ignorance.
Sunday, November 11, 2007
What Code Reviews Are Good For
I don't have any hard evidence to back up this claim, but my guess is that most people think of defect detection when they think of code reviews. And well they should--code reviews are one of the most effective defect-detection methods around, falling just behind high-volume beta testing and prototyping, according to Steve McConnell's Code Complete. If we consider code and design reviews together, reviews may come in at the top of the heap. There's no doubt that code reviews provide value as a defect-detection technique.
However, I think if the value of code reviews is assessed solely on this, a great deal of the real value of code reviews will be overlooked. The other benefits of code reviews could easily equal, and perhaps surpass their value as defect detectors.
Code reviews are frequently the birthplace of standards. If a person sits down to write a "company coding standards" document, unless they have a great deal of experience with the company (much of it probably reading/reviewing code), that document will include many things that no one is doing, and miss real problems that pervade the code. Code reviews bring forward patterns, both good and bad, that actually exist. The good patterns solve things in a way that everyone ought to be solving them, and the bad patterns become part of the list of things not to do.
Code reviews also reduce duplication of code and effort. If two programmers have written a function to do the same thing, and one of them sees the other's function in a code review, they have the opportunity to select the best implementation, or the best parts of each implementation, and use the common function in both places. This effect is contagious, since now, if anyone involved in that review participates in another review where the same problem is being solved again, they can call attention to the duplication and resolve it. Duplication of effort can be reduced as programmers who are planning to implement X, or have started implementing X, may see X implemented in a code review, and can then use that implementation.
Code reviews build solidarity. This idea might seem humorous to someone who has seen code reviews descend into backbiting and defensiveness. However, I would suggest that code reviews do not cause these phenomena, they merely expose them. If your team was really built on mutual, professional respect, you wouldn't see ugly behavior in or "because of" code reviews. Eliminating code reviews because "they don't work due to personal issues" is like killing the messenger. The personal problems will still destroy you, but I guess at least you won't have to hear about them for a while. On a team where respect is the rule, however, code reviews provide a sense of collective ownership, and collective pride in the final quality of the code.
Finally, code reviews provide a forum for conflict. Conflict, where there is mutual respect, is a good thing. Conflict brings to light the advantages and disadvantages of different approaches to a problem, many of which would have been unknown or unconsidered by any particular individual. The final solution chosen will be one that takes the most important factors into account, rather than just the factors known to one person (the author).
This is a short list, but I think it includes some of the most important benefits of code reviews. What would you add? What, in your experience, are other benefits of code reviews?
Monday, November 5, 2007
Honesty fuels Agility
Another way to say that is, the more honest you are, the more agile you will tend to be in your software development. The more personal and interpersonal honesty you practice, the more able you will be to respond to change.
If you are honest with yourself, you will admit what you don't know. You'll admit that you don't know the future, and that you're not that good at predicting it. You'll admit that you don't fully understand every detail of the components your software will have to interact with. You'll admit that you aren't able to conceive of an implementable design, beginning to end, entirely in your head or on paper, without ever validating it in actual code. You will also admit these things to others, which, if they are being honest with themselves and with you, will help everyone to form better expectations.
The more honest you are with yourself, the clearer picture you'll have of your own limitations. This is what agility is really about. Agility is not avoiding all design up front. Agility is maximizing what you know, and not pretending to know what you don't. That means you can do some design up front--you just don't do more than you know is valid. Once you reach that point, you validate. Obviously, this point will be different for different people, and different for different kinds of tasks.
O.K., so you might think I'm saying a bunch of nothing--I've defined "agility" in such a way that no one would think it was a bad thing, and I've said things that boil down to "you should do just enough design up front". Enlightening, right?
Well, two things. Although I think that most of the things I've said would be hard to disagree with, I believe that agreeing with an idea is one thing, and focusing on it and practicing it is another. Focusing on personal honesty will lead to practicing agility.
Secondly, although every agile practice may not be applicable to every environment and task, I do believe that many of the codified agile practices grew out of environments where people were practicing this kind of honesty. I know that the best design ideas often come out of discussions with a co-worker. I also know that others often see problems or potential improvements to my code that I don't. Knowing this, I want to get the benefit of this kind of feedback as quickly as possible. This drives me toward early code reviews and pair programming. I know that, once a system gets to a non-trivial size, I can't even fully understand all the interactions of my own code, much less someone else's. This drives me toward automated tests. And, once again, I want this feedback as early as possible, so I am driven towards automated tests developed concurrently with the code. Finally, a really good way to make sure I don't write more untested code than I can understand is to write the tests before I write the code. Then the quantity of untested code is always zero.
Take some time to focus on honesty, especially personal honesty. It will make you a better, and more agile, developer.
Wednesday, October 31, 2007
When Metric Gathering Goes Wrong
Metrics, properly used, are a valuable tool for monitoring and improving your software development process. There's a lot of information available about what metrics to gather, what to publish, what not to publish, and how to interpret your findings.
Today, however, I want to deal with the problem of intrusive metrics gathering. How much intrusion on a developer's workflow should we allow in the name of metrics-gathering? One obvious answer is that we have gone too far "when the benefit provided by knowing the metric is less than the detriment caused by gathering it". How can we determine when that is the case? By measuring the benefit and... uh, right.
O.K., so that sounds like a good principle, but there's no practical way to implement it directly. As a rule of thumb, it's all right--you might try to feel out whether your developers feel unduly burdened, but that won't give much clear direction. How about this:
Use no method that will prevent a developer from doing the right thing. Period.
Now, I think we all have some pretty clear ideas about what the right thing is. For me, the right thing includes:
- Reducing feedback loops wherever possible (using TDD, pair programming, code reviews...)
- Eliminating duplication wherever I find it
- Cleaning my own code before submitting it, and cleaning any code I touch
That's a short list. Whatever your list looks like, eliminate any method that would prevent anyone from doing the right thing, where prevent means that a conscientious developer would choose, at least some of the time, to skip doing the right thing, in order to avoid the overhead.
An Example
Suppose we decide that we want to gather information about the kind of changes being made to our code, and we decide we want to use our version control system to do it. Suppose we're interested in three kinds of changes: defect fixes, refactorings, and new functionality. Now, there's no automated way I know of to identify the purpose of a change, so we ask the developers to simply indicate what kind of change they are submitting by including a line "changetype:letter", where letter is f, r, or n, as appropriate, in the changelist description. Each changelist is only allowed to belong to a single category--if a developer wishes to make changes to a file that fall under multiple categories, he must create a separate changelist for each. Additionally, each fix, refactoring or new task should have its own changelist. Sounds great, right?
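For concreteness, the reporting side of such a scheme might look something like the following Python sketch. Everything in it is an assumption for illustration: the label names, the `change_type` and `tally` helpers, and the idea that changelist descriptions arrive as plain strings. It is not tied to any particular version control system.

```python
import re
from collections import Counter

# Hypothetical labels for the three letters described above.
CHANGE_TYPES = {"f": "defect fix", "r": "refactoring", "n": "new functionality"}

# Matches a line such as "changetype:f" anywhere in a changelist description.
_TAG = re.compile(r"^\s*changetype:\s*([frn])\s*$", re.IGNORECASE | re.MULTILINE)


def change_type(description):
    """Return the label for a description, or None if it is untagged or ambiguous."""
    letters = {m.lower() for m in _TAG.findall(description)}
    if len(letters) != 1:
        # No tag at all, or conflicting tags in one changelist.
        return None
    return CHANGE_TYPES[letters.pop()]


def tally(descriptions):
    """Count changelists per change type; anything unclassifiable is 'untagged'."""
    return Counter(change_type(d) or "untagged" for d in descriptions)


if __name__ == "__main__":
    descriptions = [
        "Fix null check in parser\nchangetype:f",
        "Extract helper method\nchangetype:r",
        "Add export dialog\nchangetype:n",
        "Tidy up while adding feature",
    ]
    print(tally(descriptions))
```

Gathering the numbers is the easy part. The cost lies in what the one-category-per-changelist rule does to the person writing the code.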
Well, consider that our version control system only allows one local version of a file. That means a developer can only make changes pertaining to a single changelist at a time. That means that if, while implementing a new feature, I notice a coding standards violation in existing code, I have to make a terrible choice. I must either:
A. Fix the violation now, and then, when I want to submit, change the code back, make a note of the defect somewhere, submit the changelist as new functionality, check the file out again and make the fix (and repeat for each fix).
or
B. Stop coding, make a note of the defect somewhere, try to get back in the zone, submit my new functionality, check the file out again, and make the fix (and, again, repeat for each fix).
This is terrible. This is also not hypothetical. This is an actual suggestion that I am currently fighting. I believe that, if implemented, this rule would, at best, never be followed. At worst, it would completely destroy all efforts to clean code. All existing yucky code would become written in stone, and our code would continue to degrade at something close to the rate at which defects get past code review. Oh, and can you imagine how effective code reviews will be if every defect found requires its own changelist to fix? I predict the number of defects found would drop by 90%.
Never, never, never prevent your developers from doing the right thing. No metric is worth it.
Monday, October 29, 2007
On Defect Indebtedness
Kristan Vingrys writes about defect debt on his blog:
"On an Agile project defects are fixed as they are found, so no story should be considered complete if there are outstanding defects with it. But this does not mean that defects do not persist."
He describes those defects which persist beyond the story to which they pertain (or persist beyond the iteration in which they were created) as "defect debt". This seems a reasonable description. He further describes the sort of defects that lead to defect debt:
"they generally exist because of the way code is working, or the lack of code to handle a situation. In some cases defects are not fixed as they are extremely difficult to fix, so the customer would prefer to focus the effort elsewhere. Often these defects would require significant changes to the code."
This makes sense. Defects that are easily fixed ought to be fixed when they are found. So far so good. Finally, he describes an approach to dealing with defect debt:
"Defects that will not be addressed immediately should be captured as separate cards and fixed whenever possible without impacting velocity."
I have a problem with this statement. Two problems, actually, one with the statement itself, and one with the idea behind it. First, I don't understand how it's possible that a defect that's not one we can address immediately, one that is "extremely difficult to fix" or "would require significant changes to the code" can also be fixed without impacting velocity. Isn't that pulling work out of thin air? The reality is, if the defect is going to be fixed, it's going to impact velocity. Even simple defects will impact velocity, although the individual impact may be imperceptible.
Secondly, I believe that, as a rule of thumb, allowing defects to accumulate in an agile project is completely counterproductive. Agile projects rely on continuous care to maintain agility. Because we don't know the future, we have to make the most of what we do know. Allowing new code to grow up around known defects is just making more work down the road, as that means more code that will have to be changed when the defect is finally fixed. If fixing the defect today would require significant changes to the code, waiting isn't going to make the situation any better. Waiting will make it worse.