Beyond Estimation

Note: An edited version of this essay appears in the book, The People’s Scrum, published by Dymaxicon, May 2013. 
———-

Most Scrum teams these days are taught the estimation techniques described in Mike Cohn’s book Agile Estimating and Planning, namely the estimating of user stories (aka requests) in relative size using techniques such as planning poker, and the estimating of tasks in hours. I eschewed task estimates a few years ago (see Unearthing Impediments by Doing Less). More recently, perhaps over the past 2-3 years, I have found less and less need for story estimation. I find the practice is time-consuming, creates unnecessary overhead, and adds very little to a team’s ability to make commitments and meet them. On occasion it has caused irritation and distress in both team members and customers who simply do not see any value in it.

I am not against having a team experience distress, or breakdowns as a method of learning, but in the case of estimation I don’t find it worthwhile. There are better things to focus on, such as writing well-formed user stories, requests that have clear value and conditions of acceptance. The dialog around defining the conditions of acceptance tends to cause requests to become smaller. It is very hard to create and agree on a set of acceptance criteria for large or vague requests.

If velocity is important (and I’m not convinced it is) then it can be done through a count of completed requests. When requests are kept small they tend to be much of the same size. If they are not, it may indicate a lack of clarity both about what we are asking for (the PO) and what we are committing to (the team), so rather than slapping on a big 21, 34 or 55, it is better to spend a little more time, think a little differently and refactor the request down into a set of smaller requests. I always recommend to teams that they take on between 5 and 10 stories per sprint. Somehow this seems to work out no matter the domain or the length of sprint. Having smaller requests also aids the prioritization process as we learn what need not be done.
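
For anyone who does want a number, counting completed requests takes very little arithmetic. The following is a minimal sketch in Python, not a prescribed method: the sprint history, the function names, and the idea of dividing the remaining backlog by the average count are all assumptions made for illustration.

```python
import math
from statistics import mean


def count_velocity(completed_per_sprint):
    """Average number of requests completed per sprint (no points involved)."""
    return mean(completed_per_sprint)


def sprints_remaining(backlog_size, completed_per_sprint):
    """Rough forecast: remaining requests divided by the average count, rounded up."""
    velocity = count_velocity(completed_per_sprint)
    if velocity <= 0:
        raise ValueError("no completed requests yet, so no forecast is possible")
    return math.ceil(backlog_size / velocity)


# Hypothetical history: requests finished in each of the last four sprints.
history = [6, 8, 7, 7]
print(count_velocity(history))         # average requests per sprint
print(sprints_remaining(30, history))  # sprints needed for 30 small requests
```

The forecast is only as meaningful as the assumption behind it, namely that the requests in the backlog are small and of roughly similar size.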

I have come to believe that estimation using any number system, whether it be ideal hours or relative points (which ultimately map to time), is compliance with an outmoded system, a way of appeasing old-school management who are unable, or perhaps more accurately unwilling, to think beyond data points and (empty) promises. There is still a mindset in the Agile world, as much as there ever was in the PMI world, that teams can somehow commit to both time and scope. Estimation, as it is commonly done, perpetuates that myth, or at best does not challenge it.

A team should commit on gut feel more than on data points. We are feeling people as much as we are thinking people; ironically, the former may be more important when working in the knowledge industry. At the end of each planning meeting the team members should ask the question, do we honestly believe we can meet the commitment we just made? If the answer is not a resounding yes, then reduce the commitment until it is. And no, this does not result in teams under-delivering. I have never once experienced that phenomenon.

Moving away from the obsession with numbers allows us to begin to trust our instincts, to draw on our individual real-world experiences and our “team intelligence” to give truthful responses to the “when will it all be done” question.

If you find your team is bound by Estimation Think, I challenge you to spend a few sprints not estimating, but spending the time instead to make the stories in your backlog smaller, each having a clear value statement, and reasonable conditions of acceptance. You may be surprised by what happens.

42 responses to “Beyond Estimation”

  1. “I am not against having a team experience distress, or breakdowns as a method of learning” – I know what you’re saying, but I’m having a hard time accepting.

    However the ‘estimates using a number system’ brought back a favourite memory of one team update, where on determining ‘Task A’ was “70% complete” my manager asked how much work was left; I knew what he was really asking for – so I paused a moment before replying “…30%?”

  2. I like this approach, though can only recall using it once. It certainly seems that more team maturity or discipline would be needed to give up story estimates and velocity in favor of better defined stories of uniform size. I’d be interested in a set of tips or kata or etudes that could help a team move to this approach. Maybe the technique to switch is to continue estimating and do 5-whys or fish-bone or something on any story that’s not a 1. Once you’ve gone 4 iterations with only 1’s, you’ve arrived. ?

    I’m likewise not a fan of estimating tasks and wrote about my struggle with it.

  3. I have also had the feeling in some estimation sessions that we were losing too much valuable time in not so valuable discussions. At the same time, I wonder if not doing estimation at all would go against one of the tenets of Scrum, which is timeboxing the work.

    • I don’t think Tobias was advocating doing away with timeboxing (iterations) with this post. Are you thinking of an estimate as being a timebox on a story? I don’t.

      • In a way. We decide how much work we should be able to complete in a timeboxed iteration based on these estimations and this becomes one of the major reasons that push the team to stay focused on finishing what was estimated in the first place (i.e. what’s the point of a timeboxed iteration if you don’t have a clue of what you are trying to accomplish in that period?). I understand that Tobias says to try to put the focus on developing small user stories which should take more or less the same time and I agree. I just wonder to what point we can get away without estimating and how the focus of the team is impacted.

  4. Hi Tobias,

    I largely disagree, based on experience with my own projects, but perhaps my experience is different. We did find estimating in hours to lack the precision we hoped for and experimented with not estimating at all. The problem we encountered was a lack of transparency among ourselves during the sprint. Next we tried estimating in days, which matched our needs and solved that. The concern I have with your suggestion that detailed acceptance tests can replace the need for estimation is that this approach can create the illusion of certainty and blind the team to opportunities that arise during development.

    Also, I’ve found that the act of estimation drives cross-discipline discussion over the underlying needs and reinforces a sense of commitment from the team. It’s understood that they will change.

    Same for story points, but again YMMV. Story points, t-shirt sizes, etc are not for estimation so much as they are for driving conversation and maintaining a forecast.

    Again, this is my experience on highly creative, exploratory, iterative and cross-discipline products.

    • Hi Clinton, I agree that “the act of estimation drives cross-discipline discussion” and this has been my experience too. It is just that I have also found the same kind of dialog can be engaged in without the end goal of having a number. As to the idea of clear acceptance criteria creating “illusion of certainty” that would depend very much on the team and the coach, right? It need not be so. Having a goal in mind never prevents us reflecting mid stream and changing direction. That is always a choice.

      • Hi Tobias,

        The “illusion of certainty” I’ve seen comes from detailed acceptance criteria that is not truly certain, but given because the customer feels driven to deliver it. Variance is good and is the main thing that can be a problem applying Kanban everywhere (you and I have agreed on this). Scrum teams are best to swarm on a complex problem with all its uncertainty.

        I take the point that estimating sprint backlog hours might be replaced by good facilitated discussion and would like to see it, but I’m not convinced that story points can be dismissed as well. I think they wound up in the pile because there is a “no love for numbers” thing going on here 😉

        Thanks for making us think harder!

        Clint

  5. Usually when smart people discuss doing away with estimates there is a list of conditions and among them I usually find a comment about making stories small and about the same size. In this article it showed up in this quote:

    When requests are kept small they tend to be much of the same size.

    I’d argue that we’re still estimating, in effect, since we’re working to get all of the stories to be of similar size — say, 1 point — even if we don’t explicitly estimate. Such implicit estimation is sufficient in the face of good conversations around story clarity and acceptance criteria.

  6. I am always struggling with the story breakdown problem. It is clear that you can break down some features into small stories, but it is also clear that there are exceptions. For example, we are reworking plugins architecture. We have 10 plugins already and definitely you can rewrite them one by one. However, new architecture is a complex task that already took several months of development. Is it possible to break it into smaller stories? Yes. But these stories have no business value. I feel they are artificial. Should we still break such redesign into small stories? If yes, how? I don’t know.

    • Perhaps don’t think in “business value” but simply in value. For each item, there must be a “Why”, i.e. a clear reason for doing it that is well understood by the team, and aligns with the greater purpose. I think it is misleading and artificial to try to make everything have “business value”, which by definition must be measured in monetary units. But value is something more than cash, right?

  7. Btw, we are not estimating. We dropped estimates a year ago. But now I am thinking we “maybe” will bring them back. I don’t want to spend time on estimation. But as a PO I want better predictability in some cases. If there is another way to have better predictability, I will be happy to know it in depth 🙂

    • Why do you assume that estimates will give greater predictability? I’d recommend that it isn’t predictability you should focus on but prioritization. What can you choose NOT to do? By creating smaller units of work you get to see very clearly when something can be de-prioritized, and thus focus on what is essential… that mythical 20% that will satisfy 80% of users.

      • I fully agree on prioritization. But. Say, you define that new major refactoring is a top priority now. The next thing is to split it into small stories. In this particular case that we have we did not manage to do that, since new architecture is emerging after various researches and spikes, it is constantly changing (which is good, since it is becoming better), but it is very fluid and unpredictable. Initial estimate was 2-3 months. We spent 5 months already and new architecture is close to finished, but 1 more month will be required for sure.

        The team split it into tasks after 3 months of development and research, but I can’t say that these tasks are user stories, since only full delivery of all these tasks makes sense to the end user.

        So the key is stories breakdown. And it is hard.

  8. Hey guys… all good points.

    To me, this is kind of like a Shu-Ha-Ri thing. Most of the teams I coach learn the typical Mike Cohn approach to estimating. After a sprint or two, the task decomposition becomes tedious, but their stories are too big to deliver value early in the sprint and reduce risk.

    If they want to ditch tasks, I challenge them to decompose stories further. If they can get everything between 1 and 3, they can give up the tasks and task estimates. Most teams respond positively and pull it off. If they want to drop estimating altogether, get everything down to 1-2.

    Teams that have overridden my guidance (which I sometimes encourage 😉) and ditched task breakdown without more fine grained decomposition usually decide to go back until they can break their stories down more. IMO, if the team can get everything down to a 1-2, ditch estimating altogether… and maybe even the timebox, but I’d still want an average time to complete metric or something (maybe I’m old-school).

    But to my main point… these are advanced techniques for mature teams… not for teams starting their 1st sprint. The trick is to meet the team where they are today, and help them move toward greater levels of maturity. My only concern with your post is that folks use the approach before they are good enough to pull it off effectively… and don’t have a fallback plan to be successful.

    Thanks!

    • Hi Mike, Good points. And. I have often done the opposite to this, i.e. started teams out by just working on stuff in iterations, and then after 2-3 iterations estimating retroactively (much easier to be accurate once it’s done, right?). Once they have done their retroactive estimating they then apply the technique to the stories in the backlog. This way they feel more comfortable with the technique. But eventually I started realizing that they didn’t need to estimate at all, they needed to get stuff done.

      We estimate with the intention of calculating velocity in order to make predictions about the future. But all this implies that the items on the backlog are broken down enough for the estimations to be meaningful. And this is upfront work that goes against the just-in-time principle of Agile. Therein lies the conflict.

      As I said to Michael Dubakov (above) we put too much emphasis on using data to make predictions. If we were trusted from the get-go to instead use instinct to make predictions we will get very, very good at it. But as the status quo stands, we never tune this skill. I think that’s a great pity, and a sad loss to our profession.

  9. Pingback: Tweets that mention Beyond Estimation | Agile Anarchy -- Topsy.com

  10. In reading the interesting post and the comments, I was reminded of:

    In a past professional life, I tried to get Agile going in a small company with an in-house development team. The company culture was very anti-formal methodologies to say the least. Thus, I was able to implement “agile” principles but not a formal Scrum methodology. When it came to estimating stories, we used a small, medium, large and gigantic label for every story. We used a whiteboard to put the story “cards” (few word summary of the request) up and we drew a line across the middle of the board. Those stories above the line were needed in the next release (1 week sprint as part of a 2 week release cycle); those below the line were “nice to have” and may go above the line for the next release. We, the development team, made the call on how many L’s, M’s and H’s could be done in a sprint for the next release.

    It was extremely effective and it did what you described: it forced the product owners (three in this case, hence an additional challenge) to really slim down their requested features to only the really critical stuff.

    Thus, I can support your claim that agile (principle based) estimation can be successful using a non-numbers, more gut based approach.

    • Nice example John. Thanks. Did anyone ever try to convert the L, M, S sizes into numbers for the sake of prediction? If not, how did you satisfy the request for “accurate prediction”, or did that not come up? Michael Dubakov (and others) may be interested in your experience in this area. Cheers.

  11. “We estimate with the intention of calculating velocity in order to make predictions about the future”

    Yes and no… I tend to estimate to rough in requirements and set constraints. That said, I almost never work out of a flat backlog anymore. I use Epics, Releases, and Stories in a map. We estimate to rough in the size of the Epic and then aggressively work to deliver the simplest feature set to ‘converge’ the story estimates to the higher level estimate.

    If we are fixing team size and time… scope has to float. But for the companies I work with, having no idea of the high level deliverable, or any idea of what we are going to get for our money, is a non-starter. Estimates are just a tool… if they are misused or abused, that is the real problem. I’ve never been a fan of throwing out the tool, just because some folks don’t use them well. I’d rather coach folks to use data for good, not evil.

    I respect that this is very contextually sensitive, so if what you say works for you, that is awesome. I could see your approach being a desired end state for some of the organizations I work with, but not a starting place. (I did like the idea of not estimating at all the first few sprints and then backing in… I have not tried that, but might give it a shot)

    Thanks!

  12. Mike, we do the same, Epics -> Stories -> Tasks in a map to deliver the simplest model we can get to fit. We started 3 years ago and did poker but we found that it was more time than we needed and we work on gut feel of work since we understand the work very well now. Great article Tobias.

  13. Tobias,
    well thought through, and well written.
    You focus on the important things: keep stories small and gain a true team commitment. Both can be made more difficult by the numbers.
    Still, I’ve only seen this working if the team as well as their environment are experienced enough to let go of that kind of apparent control.
    In many situations, it’s a habit to grow into rather than one to start with.
    Thank you!
    Olaf

  14. “I’ve only seen this working if the team as well as their environment are experienced enough to let go of that kind of apparent control”

    Agreed. Companies approach Scrum from many angles… not all of which are to improve the human condition. The goal would be for everyone to be in the top 10%, motivated, problem solvers… people that just know what to do and when to do it and will always do the best job they possibly can as a team. Sometimes I just need early feedback, early risk reduction, and the ability to inspect and adapt.

    To me Tobias, it is kinda like your post about the Prime Directive… not everyone shows up and does their best. Sometimes people suck. Sometimes people are petty. Sometimes measuring things and posting numbers helps people get better… or at least pay attention to where they can improve. Personally, I think numbers and metrics are fine as long as they are owned by the team for the sake of the team getting better. When they are used by management to attempt to control outcomes, it gets dicey.

    Great conversation!

  15. “Personally, I think numbers and metrics are fine as long as they are owned by the team for the sake of the team getting better. When they are used by management to attempt to control outcomes, it gets dicey.”

    Agreed, for numbers within the sprint. I think that story points are abused when used for commitment, but that’s not their intended purpose. On the other hand, “attempting to control outcomes” via backlog manipulation and release duration is the job of stakeholders (through the PO) given story point velocity.

  16. Great discussion! The real issue is managing stories, in particular, story sizes. When stories are small enough, estimates don’t matter.

    This still leaves us with the fundamental problem of defining small stories. Many teams have trouble thinking small. They try to do too much, too soon. Sprints don’t meet expectations and the team gets frustrated.

    This is one of the most difficult aspects of agile development and an area that needs a lot more thought and discussion. Story sizes in the teens, twenties and beyond are way too big. We need to stop telling people that it is okay to use such big numbers. Think small!

  17. Clinton, “I think that story points are abused when used for commitment, but that’s not their intended purpose.” What is their intended purpose?

    • Tobias, “What is their intended purpose?”

      To estimate size, and derive duration. To facilitate objective discussions about the prioritization and forecast of scope over a period of time that is longer than that which the team can reliably commit to.

      • Teams are often taught that once they know their velocity they can use that to make commitments. Mike Cohn talks about it as “filling the sprint bucket”. I take it you do not subscribe to that. Is that right?

  18. I don’t believe that story point velocity is the basis for commitment, only forecast.

    Actually, when Mike talks about filling the sprint bucket (at least he did when we co-taught), he is referring to the typical hour load that past sprints have experienced as a basis to estimate their upcoming sprint commitment (yesterday’s weather). I teach that this is a good starting place as well and encourage teams to go with what works best for them to refine their ability to commit. I completely agree that teams should explore different ways to do this. I even tell them that if they want to explore “sacrificing chickens in the moonlight”, feel free 😉

    • My memory was flawed on this. I found Mike’s article why-i-dont-use-story-points-for-sprint-planning which describes his technique exactly as you do. Others though do practice the “story points bucket” technique, e.g. here and here.

      I don’t subscribe to either technique, but the hours method seems especially wasteful to me. As mentioned in previous blogs, I find any effort to estimate tasks in hours to not only be wasteful but to actually hide organizational dysfunction. Using it as Mike Cohn describes to commit in a sprint when developers actually have an innate ability to make an intuitive guess and probably arrive at the same conclusion (actually, I’d argue a more accurate one) is surely unnecessary overhead.

      I don’t buy the argument that many have expressed here that new teams need these tools to make realistic commitments. I assumed that too once, but when put to the test doesn’t play out. Maybe it is us coaches who need it, not the teams. Just a thought 🙂

      • “I don’t buy the argument that many have expressed here that new teams need these tools to make realistic commitments. I assumed that too once, but when put to the test doesn’t play out. Maybe it is us coaches who need it, not the teams. Just a thought :)”

        Perhaps. I find value in it as a starting place that bridges how they worked before to how they will work in the future. The REAL problem IMO is not using numbers, but emboldening teams to take ownership of sprint planning/execution practices and have the C&C management relinquish it.

        You saying it’s not worthwhile is one view; others who say “use it always” is another. Mine, that “it’s a useful starting place”, is one more, but, as we agree, we ultimately let the team, the ones closest to the work, decide.

  19. Yes and no. Yes: an experienced team will need no estimates and may rely on knowledge. No: an inexperienced team should use any tool available that allows them to get a “feeling” of what it can do and what is requested (usually).

    I do confirm the experience explained in the article: We too ended up splitting all tasks to units of the same size and just counted up the task cards. Other developers argued that this would be inaccurate. But in the end our sprint planning was always just as good as those of teams using more complicated estimation techniques and planning poker – we just had a lot less effort to spend for our planning sessions.

    I do agree: commitment is not about numbers. I have been trying hard to make management understand what this whole planning is all about. It doesn’t mean “we are finishing _AT_LEAST_ this many tasks even if we have to work overtime”. It means: the business-risk we take is acceptable. That’s why we do “planning”, “estimation” and gathering “requirements”: we are trying to reduce business risks to an acceptable size. You would expect managers to be first to understand what that means – but unfortunately most do not.

    I also tried to provide management with regular status updates and early warnings if the commitment was endangered. I tried – hard – to make them understand that this does not mean people are not working long or hard enough. It does mean: the business risk is no longer acceptable. It means: if we continue to burden the team with extra workload on top of the commitment the same way we did before, then the commitment is likely to be breached.

    That’s what the burn-down chart is for: draw a top line and a bottom line around your chart. That’s your forecast corridor. The part of the corridor that is outside the expected time-line is your risk of failing. If this risk is acceptable, you carry on. Otherwise you adapt your planning.
    It’s just a tool to visualize common sense … unless you have none. In which case no tool whatsoever will help you anyway.

    Some managers thus misunderstand “commitment” and try to force teams to over-commit themselves. Plus asking them to work overtime when the commitment (as expected) is breached. Or they try to misuse story points and velocities to “measure” efficiency and try to compare teams against each other (team A is 20% better than team B).

    In the end, the problem is not in Scrum – the problem is in a certain type of people.

    • “In the end, the problem is not in Scrum – the problem is in a certain type of people.”

      Amen to that! There is much work to be done in the corporate world, most of it is beyond the limits of Scrum and Agile, although can be informed by some of the principles and values many of us work to. Thanks for your comments.

  20. @Clinton,
    “we ultimately let the team, the ones closest to the work, decide.”
    Yep. And… for teams to do this effectively they need to know the different practices other teams have found success with, as well as innovating their own. Sometimes we have blind spots and can’t see past the way we currently do things. A coach can help guide us past that.

    Thanks for sharing your perspective on this blog. It has been a good discussion.

  21. Hello All,

    This was a very interesting and thought provoking post. Thanks Tobi,

    I am a coach and a developer myself. I don’t think any technique, story points or days, makes it any easier for teams to estimate.

    For new Agile teams, I find story points give a structure to their thinking. They are able to say quickly small, medium etc. It does encourage conversations. I have seen teams generally start with planning poker; in a couple of sprints, as the trust builds, they are quick to say this together without poker.

    That said, the majority of the managers I have worked with look for some numerical thing. At a large program level with many teams, velocity helps with predictability. For a one or two team level this may not matter much.

    I always ask teams to try out story points and keep them if they like them. In teams I have been part of, after 12-13 sprints of writing fantastic code, things like story points etc. do not have any meaning. We used to just focus on finishing whatever we committed to.

    I think it comes down to what sort of organization you are working in and whether the system trusts the teams or not. In places where there is high trust, story points etc. do not matter. Is the team delivering a quality release often is all that matters.

    But in a classic organization that is so dependent on metrics, story points and velocity seem to fill the unwanted need to know “Are we going to make it”

    Vibhu

  22. Ravi Rajamiyer

    Great article!!
    As a practicing Scrum Master, I have seen myself slowly getting over the obsession of progress tracking with estimated hours, burnt & unburnt hours. The primary reason for this is that, no matter what technique we use for estimating, the estimates never seem to be accurate. At the end of most of our sprints, we had either overestimated or underestimated, often leaving the stakeholders perplexed. Pounding the team to come up with estimates, and hounding them to record at the end of each work day, only seems to bring back the memory of ‘command-and-control’ days, often leaving the team dispirited. In the recent sprints, we completely stopped using hours-based estimates, and switched to task-based estimates. With the help of the ‘gut-feel’ that Tobias is talking about, and some sticky notes on the white board, this is working out fine.

  23. Greg Reynolds

    First point, and one made above already by another poster, is that if you are decomposing stories down to the same size, say 2 days or less, then you are still effectively estimating. I think Alistair Cockburn has advocated this approach as well in the past.
    Second point, if you don’t care about your sprint velocity or release velocity, then I suppose there is not as much reason for estimates. Your teams that don’t estimate tasks may not necessarily have the same sense of urgency and commitment to complete.
    Third point, if you are not estimating then you may also not be recording hours spent on your tasks, which means you won’t have a baseline of how much effort it took to produce features. In the future if you are called upon to provide high level estimates of what it might take to produce a new product, your team won’t be able to help you estimate because they either don’t know how, or only know how to do so if they decompose into 1-2 day sized features.
    In most of the environments I have encountered, knowing the sprint velocity was important for teams to know if they would finish within the timebox, if features would need to be trimmed, or if features could be added. More importantly, story point velocity was needed for release planning so we could let stakeholders know what they could reasonably expect in upcoming releases. Again not all domains require this; if you’re building commercial games taking 3+ years to release your product, maybe no one cares as long as it’s a hit game that makes lots of money?

  24. Greg Reynolds

    I left out one final and important point. If you do not estimate your backlog in story points then you cannot make decisions about feature priority tied to cost. If cost is no issue for you (see my earlier game analogy) that’s great, but if your priority is tied up in costs then you need those estimates to make better decisions. With many of my past customers, priorities rose or fell based on the size of the feature and the basic appreciation for what it was going to cost to develop it. I accept that some customers are not cost centric, and are focused only on value which is great; I just have not been fortunate to work for those kinds of deep pocketed customers.

    • Hi Greg. I feel all the needs for knowledge and control expressed here are rooted in a single problem: inadequate conversation between teams and customers. I have used all the number systems and charts you mention here in one form or another. Some are useful, others less so. All are temporary tools to learn, not permanent fixtures, and none of them replace good conversation between teams and customers. Metrics to prove things are a poor (but sometimes necessary) replacement for trust.

      Just remember, no other creative industry worries about formal hours estimates and story points, etc. They talk with customers, make verbal commitments and deliver work incrementally getting feedback as they go. In the end that’s all we need to do too.

  25. Greg Reynolds

    @Tobias We do not or should not use Story Point estimation as a replacement for trust. Estimates are not needed because we don’t trust development teams, rather, they are used to allow teams to better understand what we can expect to deliver and when. There are many creative industries I am aware of that use estimation practices. They may not always use “Story Point” units or “Ideal Hours”. One game company I’m familiar with estimates using gummy bear units and the team actually brings gummy bears to the planning meeting. This is not a team where trust or communication with stakeholders is lacking and certainly estimation is no surrogate for communication!

    Imagine you have three major architectural options available to your team. How do you choose between the three given each is technically viable? Which delivers the best business value? Which will get your product to market sooner? How do I have a discussion with my engineering team about how long each option might take to realize? Can I afford to just try one and see how long it takes? What about build vs. buy decisions? Should I go with open source, build my own, use a commercial alternative, or even acquire another company’s IP? Is there some way to make a more informed decision rather than just coding it out and see what happens?

    In some cases as you say getting feedback and course correcting along the way is all you need to do. In other cases we need to be a bit more predictive and make decisions based on how much it is going to cost and how long it’s going to take to realize business value. In the latter case estimation is of use.
