202402171701 Tidy First?
A lovely short book on how, when, and why to refactor code.
Tidy First? describes:
- When to tidy messy code before changing what it computes
- How to tidy messy code safely and efficiently
- Why tidying works
As always, delete only a little code in each tidying diff. That way, if it turns out you were wrong it will be relatively easy to revert the change (see Chapter 28). “A little” is a cognitive measure, not a lines-of-code measure. It could be one clause in a conditional (e.g., you see the condition reduces to true), one routine, one file, one directory.
Pick a way. Convert one of the variants into that way. Tidy one form of unnecessary variation at a time—lazy initialization, for example, first.
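A minimal sketch of normalizing one form of variation, assuming a hypothetical `Report` class whose two cached fields used to be lazily initialized in two different styles. The names are illustrative, not from the book.

```python
class Report:
    """Hypothetical class with two lazily initialized fields."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._total = None
        self._count = None

    # Before tidying, count() might have used a different style, e.g.
    #     return self._count if self._count is not None else self._compute_count()
    # After tidying, both accessors follow the same "check, assign, return" form.

    def total(self):
        if self._total is None:
            self._total = sum(self._rows)
        return self._total

    def count(self):
        if self._count is None:
            self._count = len(self._rows)
        return self._count
```

Only the lazy-initialization form changes here; any other variation would wait for its own tidying.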
So you need to call a routine, and the interface makes it difficult/complicated/confusing/tedious. Implement the interface you wish you could call and call it. Implement the new interface by simply calling the old one (you can inline the implementation later, after migrating all other callers).
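A small sketch of "new interface, old implementation", assuming a hypothetical `send_message` routine with an awkward six-argument signature. Both function names and the default values are made up for illustration.

```python
def send_message(host, port, use_tls, retries, timeout_s, body):
    """Old interface (hypothetical): every caller must pass six positional arguments."""
    scheme = "tls" if use_tls else "tcp"
    return f"{scheme}://{host}:{port} retries={retries} timeout={timeout_s} body={body}"


def notify(body):
    """New interface: the call we wish we could make, implemented by calling the old one."""
    return send_message("localhost", 25, False, 3, 5.0, body)
```

Callers can migrate to `notify` one at a time; the old routine stays in place until the last caller has moved.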
Reorder the code in the file in the order in which a reader (remember, there are many readers for each writer) would prefer to encounter it. You’re a reader. You just read it. So you know. Resist the temptation to apply any other tidyings at the same time. Likely in reading you will have noticed other details that make comprehension and change harder than they should be. There will be time for those details later. Alternatively, tidy those details now and shuffle the reading order in a later tidying. Don’t mix.
You aren’t stuck with Swiss cheese changes. Tidying can increase cohesion enough to make behavior changes easier. Sometimes the increased clarity from slightly better cohesion unlocks whatever is blocking you from decoupling. Sometimes better cohesion helps you live with the coupling.
I want to mention a couple of special cases of extracting a helper. One is when you have to change a couple of lines within a larger routine. Extract those lines as a helper, change just the lines in the helper, then, if it makes sense, inline the helper back into the calling routine. (Usually you’ll find yourself growing fond of the helper and keeping it around.)
The biggest cost of code is the cost of reading and understanding it, not the cost of writing it. Tidy first has a bias toward lots of little pieces, both theoretically, to increase cohesion as a path to reducing coupling, and practically, to reduce the amount of detail that needs to be held in your head at any one time. The goal of this bias toward small pieces is to enable the code to be understood a little at a time. Sometimes, though, this process goes wrong. Because of how the small pieces interact, the code is harder to understand. To regain clarity, the code must first be mooshed together so new, easier-to-understand parts can then be extracted.
And so we split our changes into separate PRs. Sequences of tidyings (or even just one tidying) go in one PR. Behavior changes go in a separate PR. Each time we switch between tidying and changing behavior, we open a new PR.
You and your team are going to need to figure out how exactly to reduce the cost of review. In teams with trust and a strong culture, tidyings don’t require review. The risk of interactions has been reduced so far that unreviewed tidying doesn’t destabilize the software.
Even if at first you tidy a lot, soon you will find yourself wanting to make a behavior change in code that’s already tidy. Continue for a bit, and most changes will happen in already-tidied areas of the code. Eventually, encountering untidy code will be the exception, even though most of the code in the system hasn’t been touched. That’s why I’m confident in saying that tidying is a minutes-to-an-hour kind of activity. Yes, sometimes it goes on longer, but not for long.
So sure, tidy after, if:
- You’re going to change the same area again. Soon.
- It’s cheaper to tidy now.
- The cost of tidying is roughly in proportion to the cost of behavior changes.
I love my job sometimes. So okay, yeah, of course it depends, but what does it depend on? I need to change the behavior of this code. This code is messy. Do I tidy first? Ask yourself these questions:
- How much harder is the messy change? If tidying doesn’t make it any easier, don’t tidy first.
- How immediate is the benefit of tidying? Let’s say you’re not ready to change the behavior yet. You’re just reading code for comprehension. Tidying helps you comprehend faster. Sure, tidy first.
- How will this tidying amortize? If you’ll only ever change this code once, then consider limiting your tidying. If this tidying will pay off weekly for years, then go for it.
- How sure are you of your tidying? Bias away from speculation. “I can see the messiness here, right here. If it’s gone, then this change will be easy.” But also, “Tidying this will make it easier to understand. I know because I’m confused right now.”
Summary
Tidy never when:
- You’re never changing this code again.
- There’s nothing to learn by improving the design.
Tidy later when:
- You have a big batch of tidying to do without immediate payoff.
- There’s eventual payoff for completing the tidying.
- You can tidy in little batches.
Tidy after when:
- Waiting until next time to tidy first will be more expensive.
- You won’t feel a sense of completion if you don’t tidy after.
Tidy first when:
- It will pay off immediately, either in improved comprehension or in cheaper behavior changes.
- You know what to tidy and how.
Software creates value in two ways:
- What it does today
- The possibility of new things we can make it do tomorrow
One reading of the phrase “beneficially relating elements” starts with “the design is….” What is the design? It’s the elements, their relationships, and the benefits derived from those relationships. Another reading starts with “designers are….” What do designers do? They beneficially relate elements. From this perspective, software designers can only:
- Create and delete elements.
- Create and delete relationships.
- Increase the benefit of a relationship.
You know what’s better than a machine that spits out $10 for every $1 you put in? A machine that spits out $100 for every $10 you put in. Or $20 for every $1. How are we going to get to that better machine? In a word, optionality. The mere presence of a system behaving a certain way changes the desire for how the system should behave (Heisenberg’s uncertainty principle). However much you’d pay for the $10/$1 machine, you’d pay more for one that could turn into either a $100/$10 machine or a $20/$1 machine—even if you didn’t know which it would turn into.
Even though we know that we have to invest in structure to maintain and expand optionality, we can’t really tell if we have. The code is easier to change? Really? We can’t really tell if we’ve done enough. If we invested more in the structure, the code would be even easier to change? Really? We can’t really tell if we’ve made the right investments in structure. The structure changes we made were the best way to make the code easier to change? Really? And so people get muddled about structure changes in ways they don’t about behavior changes. This book is not here to answer those questions for you; it’s here to help you answer those questions for yourself. Start by understanding that structure changes and behavior changes are both ways to create value, but that they are fundamentally different. How? In a word, reversibility.
In the scope of this book, the time value of money encourages tidy after over tidy first. If we can implement a behavior change that makes us money now and tidy after, we make money sooner and spend money later. (As noted earlier, sometimes tidying first means the total cost of tidying first + behavior change is less than the cost of the behavior change without tidying. Always tidy first in such a case.)
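A rough sketch of the time-value argument using ordinary discounting. The cash flows and the 10% per-period rate are invented numbers, chosen only to show the direction of the effect.

```python
def present_value(cash_flows, rate):
    """Discount (period, amount) pairs back to period 0 at a per-period rate."""
    return sum(amount / (1 + rate) ** period for period, amount in cash_flows)


# Hypothetical cash flows: the same tidying cost and revenue, in opposite order.
tidy_first = [(0, -10), (1, 100)]   # pay for tidying now, earn next period
tidy_after = [(0, 100), (1, -10)]   # earn now, pay for tidying next period

pv_first = present_value(tidy_first, 0.10)
pv_after = present_value(tidy_after, 0.10)
```

With any positive rate, `pv_after` comes out higher than `pv_first`; with a rate of zero the two are equal, so the preference comes entirely from timing.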
What does this mean for software design? Software design is preparation for change; change of behavior. The behavior changes we could make next are the potatoes from the story. Design we do today is the premium we pay for the “option” of “buying” the behavior change tomorrow. Thinking of software design in terms of optionality turned my thinking upside-down. When I focused on balancing creating options and changing behavior, what used to scare me now excited me:
- The more volatile the value of a potential behavior change, the better.
- The longer I could develop, the better.
- Yes, the cheaper I could develop in future, the better, but that was a small proportion of the value.
- The less design work I could do to create an option, the better.
Now, there are times to tidy first for sure. When:
cost(tidying) + cost(behavior change after tidying) < cost(behavior change without tidying)
then absolutely tidy first. It’s still easy to get carried away and tidy too much, but set and maintain boundaries for how far you’ll go and you’ll be fine.
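The inequality can be written as a one-line predicate. This is only a sketch: the function name is hypothetical, and the costs are assumed to be rough estimates in any consistent unit (minutes, say), not precise measurements.

```python
def should_tidy_first(cost_tidying, cost_change_after_tidying, cost_change_without_tidying):
    """True when tidying plus the easier change is cheaper than the messy change alone."""
    return cost_tidying + cost_change_after_tidying < cost_change_without_tidying
```

For example, 15 minutes of tidying that turns a 60-minute change into a 20-minute one passes the test; 45 minutes of tidying for the same saving does not.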
At the scale of tidying, minutes to hours, we can’t (and shouldn’t try to) precisely calculate the economics of our tidying. We are exercising two important forms of judgment, practicing for bigger things later:
- Getting used to being aware of the incentives affecting the timing and scope of software design (“I want to spend more time designing and I’m getting pushback. What’s going on?”)
- Practicing on ourselves the relationship skills that we will later be using with our direct colleagues, and then our more distant colleagues
What about design changes that aren’t reversible? For example, “extract as a service” tends to be a big deal and hard to undo. Think about it some more, for example by actually implementing a prototype first. And by “implementing,” I mean putting it into production. Does this require a feature flag? Okay. Does it require checking the feature flag in a whole bunch of places? Okay, tidy first so it only requires a few feature flag checks. Do you see what we’re doing? We are making “extract as a service” reversible, at least for a while. If we get halfway into it and realize this is one of those services that really could have been a SQL query (thanks, Josh Wills), then we can change it without too much fuss.
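A sketch of what "only a few feature flag checks" might look like once the code is tidied, assuming a hypothetical pricing routine being extracted as a service. The flag store, function names, and prices are all stand-ins.

```python
# Hypothetical flag store; a real system would likely read this from configuration.
FLAGS = {"use_pricing_service": False}


def _price_in_process(sku):
    """Existing in-process implementation (stand-in logic)."""
    return {"apple": 100, "pear": 150}[sku]


def _price_via_service(sku):
    """Prototype of the extracted service; a local stand-in for a remote call."""
    return {"apple": 100, "pear": 150}[sku]


def price(sku):
    """The only place that checks the flag, so flipping it back undoes the extraction."""
    if FLAGS["use_pricing_service"]:
        return _price_via_service(sku)
    return _price_in_process(sku)
```

Because every caller goes through `price`, reverting the experiment is a single flag change rather than a hunt through the codebase.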
Coupling drives the cost of software. Because coupling is so fundamental, I express and visualize it in as many ways as I can. As a math-ish definition: coupled(E1, E2, Δ) ≡ ΔE1 ⇒ ΔE2
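One more way to look at the definition, as a sketch: fix a single change Δ, record which elements are coupled with respect to it, and follow the implications to see everything that has to change. The dictionary representation and element names are assumptions made for illustration.

```python
from collections import deque


def elements_to_change(coupling, start):
    """Return every element that must change when `start` changes.

    `coupling[a]` lists the elements b with coupled(a, b, delta) for one fixed delta.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        element = queue.popleft()
        for other in coupling.get(element, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


# Hypothetical coupling with respect to one change, e.g. "rename a field".
coupling = {"parse": ["format"], "format": ["render"], "log": []}
```

Here a change to `parse` reaches `format` and then `render`, while a change to `log` stays local.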
Crux of the book
And with that you are prepared to answer the question “tidy first?” Over and over. Each time slightly differently, but each time affected by the same forces:
- Cost: Will tidying make costs smaller, later, or less likely?
- Revenue: Will tidying make revenue larger, sooner, or more likely?
- Coupling: Will tidying make it so I need to change fewer elements?
- Cohesion: Will tidying make it so the elements I need to change are in a smaller, more concentrated scope?
Coupling conducts one tidying to the next to the next. Tidyings are the Pringles of software design. When you’re tidying first, resist the urge to eat the next one. Tidy to enable the next behavior change. Save the tidying binge for later, when you can go nuts without delaying the change someone else is waiting for.