Don't Do Software Estimation. Do Incremental Small(ish) Non-Trivial Chunking.
Hofstadter's Law: A task of substantial complexity always takes longer than you expect, even when you take into account Hofstadter's Law.
While my friend, management consulting exec Mike Czupryn, and I enjoyed delicious ramen at JINYA and discussed the characteristics of well-executing software development teams and efforts, our conversation wandered into software estimation and whether bad estimations are a sign of poorly performing, un-Agile teams.
TL;DR: All teams stink at level-of-effort (LOE) estimation, so inaccurate estimates are not necessarily an indication of a poorly performing or inexperienced agile software development team. They are, however, a sign that the team's work "chunks" are too large or irregular. Software development is rife with variability and randomness that are difficult to discern and that unfold unexpectedly to ruin estimation efforts. Incremental small(ish) non-trivial chunking (ISNC) helps reveal and account for hidden complexity, and the small(ish) chunking provides estimations "for free", since tasks are approximately the same size. ISNC-based progress tracking provides stakeholders a better basis for planning and decision making without the frustrating, ineffective, inaccurate estimation activities.
All teams, even experienced, high-performing ones employing Agile/Scrum "best practice" techniques like blameless retrospectives, Planning Poker, the Three-Point Method, and Big/Uncertain/Small sorting, are inaccurate LOE estimators. No Agile/Scrum team doing significant work achieves the ideals of consistent "velocities" and "ideal line" burndown charts.
While writing this post, I surveyed experienced software development colleagues and received extensive feedback, all of which is accurately summarized by Bryan Hunter's pithy response to my inquiry:
"All estimates are lies."
How does estimation turn good teams into Pinocchios?
Teams don't intend to be lying liars. But stakeholders frequently don't understand what estimates are, how they are made, or which assumptions they are based on. Stakeholders interpret these estimates as commitments or promises, and are frustrated when the team doesn't meet them.
And it doesn't take much to ruin even the most experienced, talented teams' estimates. Estimation-wrecking factors include:
Chunking deliverables of “supersubstantial complexity” merely into “tasks of substantial complexity” is insufficient to reveal and derisk the “knowable unknowns”.
Hidden complexity in seemingly “simple” requirements emerges unexpectedly.
Segregating cross-functional work by functional domain -- which occurs even in cross-functional teams -- introduces further complexities related to coordinating and integrating these siloed efforts into a functional whole.
Hard-to-debug failures emerge when functional software subsystems are combined into larger systems.
The subjectivity of software design means even minor philosophical design differences can result in emergent failures in the codebase.
One factor alone is enough to invalidate an estimate... and most teams experience several factors everywhere simultaneously. These elements of randomness and variability render software development as much a creative process of discovery as an engineering one: a stochastic effort masquerading as a purely deterministic one.
So, if all estimates are lies, why do teams do them?
Because we force them to.
Unfortunately, despite being perpetually, unfixably inaccurate, estimations remain a key component of modern "agile" software development. LOE or duration estimations are collected, cooked, and consumed by project leaders, business operators and executives, and the team itself as "proxies for progress" to attempt to project when the requested software systems will be available, and to reactively make adjustments when the estimations inevitably prove wrong. Absent better data, these expected forward projections are also (ab)used to determine budgets and resourcing, and to accelerate, delay or cancel projects.
Estimations also provide a framework — often the primary framework — for the team and stakeholders to discuss knowns (requirements, risks, possible approaches), reveal knowable unknowns, and plan how to proceed.
However, in no other area of business would people make critical decisions using consistently badly-sourced data and expect successful outcomes. And yet, this is an accepted practice in software development? This is nonsensical.
So, don’t waste the teams' time, disappoint stakeholders, and squander everyone's good-will by asking for estimations. Instead, for more accurate development projections, do incremental small(ish) non-trivial chunking.
How is incremental small(ish) non-trivial chunking (ISNC) better than estimation?
Remember Hofstadter's Law from the top of this post?
A task of substantial complexity always takes longer than you expect, even when you take into account Hofstadter's Law.
If "tasks of substantial complexity" lead to underestimations of effort, then reduce each task’s complexity to approximately the same "smallish" (but non-trivial) level of effort. Chunking to similarly sized smallish increments reveals the knowable unknowns hidden in larger work increments, and reduces other causes of randomness and variability. Each revealed complexity can itself be chunked small(ish), and chunks of complexity for which there are no known solutions, and which require the team to conduct additional experimentation and learning, can be appropriately accounted for in planning.
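One practical payoff of similarly sized chunks is that forecasting reduces to counting: multiply the remaining chunk count by the observed average chunk duration. The sketch below simulates this with a hypothetical backlog of 60 small(ish) chunks (the durations and their distribution are invented for illustration) and shows the completion forecast self-correcting as chunks finish:

```python
import random
import statistics

random.seed(3)

# Hypothetical backlog: 60 small(ish) chunks, each nominally ~2 days,
# with some day-to-day variability (illustrative distribution only).
actual = [random.uniform(1.0, 3.5) for _ in range(60)]
true_total = sum(actual)

# As chunks complete, forecast total duration from observed throughput:
# time spent so far + remaining chunks x average observed chunk duration.
errors = []
done = []
for i, duration in enumerate(actual):
    done.append(duration)
    remaining = len(actual) - (i + 1)
    forecast_total = sum(done) + remaining * statistics.mean(done)
    errors.append(abs(forecast_total - true_total))

print(f"forecast error after  5 chunks: {errors[4]:.1f} days")
print(f"forecast error after 30 chunks: {errors[29]:.1f} days")
print(f"forecast error after 55 chunks: {errors[54]:.1f} days")
```

Because every completed chunk feeds the forecast, the projection tightens continuously with no Planning Poker required; the accuracy comes from the chunks being similar in size, not from anyone's estimating skill.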
Using smaller, similarly sized pieces of work has two additional benefits beyond more accurate forecasts and discovery of unexpected complexities. First, per scheduling theory, homogeneous task graphs (representations of tasks and their dependencies in which the tasks have similar characteristics, like execution time) achieve much higher throughput via much simpler schedulers than heterogeneous task graphs.
Still not convinced that outcomes are achieved faster, easier, cheaper via many small tasks versus an equivalent few large tasks? Consider that scheduling idealized task graphs for parallel execution against an idealized fleet of homogeneous processors to minimize completion time is already an NP-complete problem. Then, observe that typical, non-idealizable, real-world software development efforts produce wildly heterogeneous, ill-definable task graphs to be scheduled against fleets of wildly heterogeneous processors (i.e., programmers). The impossibility of consistently estimating correct completion times for individual software development tasks, let alone the entire software development task graph, becomes blindingly apparent. And, yet, we persist in demanding estimations. The only rational approach to improve scheduling and throughput is to homogenize (as much as possible) the task graphs (which is most immediately achievable), or the programmers (more challenging, but perhaps AI can assist in leveling capabilities), or (ideally) both.
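The scheduling effect is easy to see in a toy simulation. The sketch below (all numbers invented for illustration) greedily list-schedules two backlogs carrying the same total work, a few wildly sized tasks versus many similar small(ish) ones, across four identical "workers", and measures how far each backlog's completion time overshoots the perfect-balance lower bound:

```python
import heapq
import random

def makespan(durations, workers):
    """Greedy list scheduling: each task goes to the earliest-free worker."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for d in durations:
        heapq.heappush(free_at, heapq.heappop(free_at) + d)
    return max(free_at)

def avg_overshoot(tasks, workers, trials, rng):
    """Average makespan overshoot vs. the perfect-balance lower bound,
    over random backlog orderings."""
    lower_bound = sum(tasks) / workers
    total = 0.0
    for _ in range(trials):
        order = tasks[:]
        rng.shuffle(order)
        total += makespan(order, workers) - lower_bound
    return total / trials

rng = random.Random(7)
workers, total_work = 4, 400.0  # four devs, 400 "days" of work

# Heterogeneous backlog: a few large, wildly variable tasks.
hetero = [rng.uniform(5, 45) for _ in range(16)]
hetero = [d * total_work / sum(hetero) for d in hetero]  # normalize total work

# Homogeneous backlog: many small(ish) similar tasks, same total work.
homo = [rng.uniform(1.5, 2.5) for _ in range(200)]
homo = [d * total_work / sum(homo) for d in homo]

print(f"avg overshoot, heterogeneous: {avg_overshoot(hetero, workers, 200, rng):.2f} days")
print(f"avg overshoot, homogeneous:   {avg_overshoot(homo, workers, 200, rng):.2f} days")
```

The homogeneous backlog finishes within a couple of days of the ideal regardless of ordering, while the heterogeneous one swings by the size of its largest tasks. That mirrors the classic list-scheduling bound: a greedy schedule's overshoot beyond the perfect-balance lower bound is limited by the duration of the largest single task.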
Second, unexpected variability or complexity in a low-LOE (e.g., one or two day) task will likely require low additional LOE (add one or two days), producing a small delivery delay. The impacts of these small delays are much easier to manage, even in aggregate, than the impacts experienced when variability and complexity emerge in longer, larger-LOE tasks.
Conclusion for Mike's original inquiry.
Inaccurate estimates do not indicate a poorly performing team. They indicate that the team's work breakdown consists of tasks that are too large or too variable in size and complexity, and that is what leads to poor results.
By adopting the “incremental small(ish) non-trivial chunking” approach, teams can more accurately lay out incremental deliverable plans without the lies that are estimations. This contributes to more reliable development cycles, and more collaborative and rational planning, prioritization, budgeting, and resourcing decision making with stakeholders.
#agile #scrum #software #estimation #simplexity #project
Thanks, Mike Czupryn, for sparking this bit of writing, and to Eric Merritt, Tony Michaelson, Bryan Hunter, and Patrick Carver and other colleagues in NashFP for providing their valuable, hard-earned, often humorous perspectives.
AI is hellacool, useful tech, and was used to create the 3-D puzzle image at the top of the page. However, the rest of this post was synthesized, crafted, and refactored by this human, from ideations I operationalized over the years, for my own use and to share with you. Because writing sharpens critical thinking, and thinking critically is a critical human skill. Contributions of your critical thoughts are welcome!