Green Is Home Base

A developer’s typical working procedure is:

  1. Create a feature branch off of master.

  2. Implement the feature, which will destabilize the code…

  3. Then, spend time getting everything back into a working state.

It’s as if you needed to replace a part in your car that is buried deep inside of it — so first, you have to tear the car apart, then replace that part, and then put the car back together. The difficult step isn’t replacing the part. It’s putting all the other parts back together that you took out just to get to the part you’re replacing.

A good deal of regression is introduced when developers “tear apart” the code to make the change they need to make, and then hastily put it all back together before opening a pull request. If the code has good test coverage, the tests will spot the regression before the pull request can be merged; but still, the most challenging and time-consuming part of the process is to get everything working again.

I have experienced this countless times. The more disruptive the change that I make to the code, and the longer I go before stabilizing it, the more trouble I have stabilizing it again at the end.

An example of where this can be really overwhelming is when some design change to a cross-cutting concern is made. Say the whole app is using a singleton, and it’s causing all the usual headaches that singletons can cause. We’ve decided we want to replace it with dependency injection. This requires changing every class that uses this singleton.

It is tempting to try to do this all at once. Just get it over with, right? But I can remember so many times where I actually had to give up on a change I was making, because the change became so big that I was simply unable to put everything back together at the end! Things would break, I wouldn’t be sure why they were broken, and after enough stabbing in the dark, I would say: “This is a bad idea, I need to start over.”

In the most extreme cases, I have spent significant time tearing the code apart without even running the compiler. It can then take hours just to get the code to compile again. Of course, I haven’t made any commits or stashes during that whole time. This means that if I decide I need to take a step back, the only step available is the huge one back to the very beginning. There are no “intermediate” states of my work to return to.

Stay Close to Home Base

If we want to make a significant change to the code, how can we do it without going through this very painful step of cleaning up all the collateral damage we created in the process? When put this way, the “obvious” answer is to make the change more “carefully” and cause little to no collateral damage. But what does this look like, exactly?

At the very least, committing changes frequently allows you to back up to an intermediate state. But what good is that, if this intermediate state is severely broken, and it would take a ton of work just to get it to compile? You can’t do anything with broken code except fix it. You can’t release it, or set the broken parts aside and work on something else. If you decide the damage is too much to handle, you still have to revert all the way back to the last working commit, no matter how many broken commits you made in the meantime.

What we really want is not merely to commit more frequently, but to commit a known working state of the code more frequently. This, in turn, means that we need to get back to a working state more frequently. Instead of the workflow being to get all the way to the end goal of code changes, and then handle all the “cleanup” at the end, we need to split the work into a large number of “make a small change, then clean up and test” cycles.

The analogy I like to use is pirates. Pirates plunder ships, then take the goods back to some home base, like a tropical island. The island is, relatively at least, safe and secure. Being out on the open waters is risky, especially after you’ve plundered a ship. You’re vulnerable, and all the stuff you collected might get lost if someone attacks you, or if the ship simply breaks and everything falls off the side.

The longer you spend out in the open waters, the more likely you’ll get into serious trouble. So instead of plundering one ship after another and accumulating more and more valuable goods on the ship, it’s a better idea to plunder one ship, then take the stuff back to Home Base, and repeat. You want to get back to Home Base often.

This might seem more wasteful. All those extra trips back and forth to Home Base cost time, fuel, and other resources. But if some percentage of the plunder aboard is lost for every stretch of time spent at sea, these extra trips can increase the total amount of plunder that makes it back to Home Base, because the losses will be minimized. So overall, it is less wasteful.

For developers, Home Base is “green”: the code is in a proven working state. “Proven” means some kind of tests were run. They can be automated, manual, end-to-end, unit, acceptance, whatever (as we’ll see, even compiling can be considered a test). When all the tests are passing, we are at Home Base and can relax.

We need to venture out into open waters to do our job. Open waters is “red”: at least one test is failing, the code is not compiling, or untested changes have been made. Instead of piling untested change after untested change, etc. on top of each other, we want to make one small, untested change, then test it and get back to a working state. We want to get back to Home Base often.

Example: A Disruptive Design Change

Let’s take the example of replacing a singleton of a class called HelperClass with dependency injection. Currently, HelperClass can only be used as a singleton, because its constructor is private:

class HelperClass {
  static let sharedInstance = HelperClass()
  private init() {
    // ...
  }
  // ...
}

A maximally disruptive change would be to delete the sharedInstance property and make the constructor public. This breaks every class that uses HelperClass! The code won’t even compile until we go through and change every reference to HelperClass.sharedInstance.

And we can’t just do a simple find-and-replace of HelperClass.sharedInstance with HelperClass(). Presumably, the reason HelperClass is a singleton is that it manages shared state; doing a simple find-and-replace would make a copy of this state for every call to the singleton. Rather, we’d have to fully deal with creating one instance somewhere and passing that one instance to every object that uses it. That’s a huge, disruptive change with tons of opportunity for error — and we can’t even compile the code until we’ve touched dozens (hundreds?) of files.
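
To see why, consider a hypothetical version of HelperClass whose shared state is just a call counter. The public constructor exists in this sketch only so we can demonstrate the bug a find-and-replace would cause:

```swift
class HelperClass {
  static let sharedInstance = HelperClass()
  private(set) var count = 0          // stands in for the shared state
  func doSomething() { count += 1 }
}

HelperClass.sharedInstance.doSomething()
HelperClass.sharedInstance.doSomething()
assert(HelperClass.sharedInstance.count == 2)  // state accumulates in one place

// What a naive find-and-replace would produce at each call site:
HelperClass().doSomething()
assert(HelperClass().count == 0)               // each call sees a fresh, empty instance
```

Every `HelperClass()` is a brand-new object, so the state the singleton was managing silently fragments across call sites.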

So instead of all that, how might we make one small change toward achieving this end goal with all the code still working?

Let’s just look at one class, ClassA, that uses the singleton. Let’s try to change only this class to what we want it to be. What we want is for ClassA to accept an instance of HelperClass at construction, and leave it to other classes that use ClassA to provide the right instance. So let’s modify the constructor for ClassA to take a HelperClass instance, store it as an instance variable, and use that instead of the singleton. We’ll change this:

class ClassA {
  init() {
    // ...
  }
  func someMethod() {
    // ...
    HelperClass.sharedInstance.doSomething()
    // ...
  }
  // ...
}

to this:

class ClassA {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) {
    self.helperClass = helperClass
    // ...
  }
  func someMethod() {
    // ...
    self.helperClass.doSomething()
    // ...
  }
  // ...
}

Once we do this, we’ll get compiler errors at every place where a ClassA is constructed. This is a lot fewer compiler errors than we’d be looking at if we deleted the sharedInstance property of HelperClass. And how do we fix them? Well, in order for the code to still work, the instance of HelperClass that ClassA receives needs to be the same one that all the other classes use. All the other classes use sharedInstance. So let’s pass sharedInstance! This:

let someA = ClassA()

becomes this:

let someA = ClassA(helperClass: HelperClass.sharedInstance)

Once we do that, the code compiles again. Even better, if you think about it, we haven’t really changed anything. This is what is known as a refactor (if you get good at using an IDE’s refactor tools, you’ll learn to make a change like this without ever breaking the code along the way). ClassA is still using the same instance of HelperClass, in the same way it did before. All we’ve done is to move the specification that ClassA will use sharedInstance from inside the class to outside of it. We don’t even really need to run and test this code. Compiling sufficiently tests the change. We can commit this code and treat it as a known working state in case we need to return to it later.

Now we can repeat this with another class that uses HelperClass. Each time we do this, we push the “singleton-ness” of HelperClass outside the class using it, and into wherever those classes are created. Each time, we are in a known working state and can commit. The end result is that all the sharedInstance calls are in constructor calls, not dispersed throughout classes.

The next step is to deal with these constructor calls. Let’s say ClassB creates a ClassA and is now doing so by passing in HelperClass.sharedInstance:

class ClassB {
  func someMethod() {
    // ...
    let someA = ClassA(helperClass: HelperClass.sharedInstance)
    // ...
  }
  // ...
}

We’ll encounter one of two situations. One situation is that ClassB is one of the classes that was already using the HelperClass singleton. This means that at some point, we added a private instance variable of HelperClass and made it a constructor parameter:

class ClassB {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) {
    self.helperClass = helperClass
    // ...
  }
  func someMethod() {
    // ...
    let someA = ClassA(helperClass: HelperClass.sharedInstance)
    // ...
  }
  // ...
}

If that is the case, we can just pass self.helperClass instead of HelperClass.sharedInstance. After all, self.helperClass is HelperClass.sharedInstance, so we aren’t even changing anything by doing this:

class ClassB {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) {
    self.helperClass = helperClass
    // ...
  }
  func someMethod() {
    // ...
    let someA = ClassA(helperClass: self.helperClass)
    // ...
  }
  // ...
}

That is another small change to a working state, and we can commit.

The other situation is that ClassB doesn’t use HelperClass, so we didn’t add the instance variable previously. The solution is to add it now and pass it to the constructor of ClassA. Then we’d just need to pass in HelperClass.sharedInstance to all the places constructing a ClassB. Once we do that, we’re in a working state again, and we can commit.
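
In this second situation, the change might look like the following sketch; the definitions of HelperClass and ClassA are minimal stand-ins so the example is self-contained:

```swift
// Minimal stand-ins for the real classes:
class HelperClass {
  static let sharedInstance = HelperClass()
  private(set) var calls = 0
  func doSomething() { calls += 1 }
}

class ClassA {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) { self.helperClass = helperClass }
  func someMethod() { helperClass.doSomething() }
}

// ClassB didn't use HelperClass before; now it holds an instance
// purely to forward it to the ClassA it constructs.
class ClassB {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) { self.helperClass = helperClass }
  func someMethod() {
    let someA = ClassA(helperClass: helperClass)  // forward the same instance
    someA.someMethod()
  }
}

// Call sites that construct a ClassB now pass the singleton in:
let someB = ClassB(helperClass: HelperClass.sharedInstance)
```

ClassB becomes an intermediary: it never calls the helper itself, it just carries it to its dependencies.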

As we complete these steps, we keep pushing the call to HelperClass.sharedInstance further to the “outside” of the app, meaning toward classes that are created early and create a lot of other objects. If we’re lucky, we’ll eventually end up with just one place where HelperClass.sharedInstance is passed into a constructor. Then, we can finally delete sharedInstance, make the constructor of HelperClass public, and replace this one remaining call to HelperClass.sharedInstance with HelperClass().
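
Sketched in code, and assuming a single hypothetical composition root (for example, your app’s startup code), that lucky end state might look like this; the class bodies are again minimal stand-ins:

```swift
// End state: sharedInstance is gone and the constructor is public.
class HelperClass {
  private(set) var calls = 0
  init() { /* now public */ }
  func doSomething() { calls += 1 }
}

class ClassA {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) { self.helperClass = helperClass }
  func someMethod() { helperClass.doSomething() }
}

class ClassB {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) { self.helperClass = helperClass }
  func someMethod() { ClassA(helperClass: helperClass).someMethod() }
}

// Hypothetical composition root: the only place an instance is created.
let helper = HelperClass()
let someA = ClassA(helperClass: helper)
let someB = ClassB(helperClass: helper)
```

The “singleton-ness” now lives in exactly one line of the composition root, rather than being baked into HelperClass itself.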

Such luck, however, is not very probable. More likely, we’ll still have a few remaining calls to HelperClass.sharedInstance in classes that are not constructed by us. These are framework classes that the SDK creates for us, and to which we can’t add constructor parameters.

Unfortunately, we can’t maintain the same level of assurance that nothing has changed as we handle these cases (incidentally, this is one reason why it’s good to keep as much of your code as possible out of framework classes that you don’t control). We’ll have to pass in the instance of HelperClass some time after construction, which means there is a possible state where it hasn’t been passed in yet. The compiler can no longer prove that the right instance has been supplied (in a language like Swift, we’d express this by making the instance optional, and then decide what, if anything, to do if we want to use the instance while it is nil). Now we have to start testing. But otherwise, the process is the same: deal with one class at a time, get the code back to a working state, and commit. We continue this until we get down to a single sharedInstance call, and then we can delete sharedInstance.
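
A sketch of what this could look like, using property injection with an optional. ViewControllerX is a hypothetical name for a class the framework instantiates, and the HelperClass stub is minimal:

```swift
class HelperClass {
  static let sharedInstance = HelperClass()
  private(set) var calls = 0
  func doSomething() { calls += 1 }
}

// Hypothetical framework-created class: we can't add constructor
// parameters, so the helper is injected as an optional property
// some time after construction.
class ViewControllerX {
  var helperClass: HelperClass?

  func someMethod() {
    guard let helper = helperClass else {
      // Not injected yet; decide what to do here (no-op, assert, log...).
      return
    }
    helper.doSomething()
  }
}

let vc = ViewControllerX()   // pretend the framework created this
vc.someMethod()              // safe before injection: the guard returns early
vc.helperClass = HelperClass.sharedInstance
vc.someMethod()              // now delegates to the shared instance
```

The guard is exactly the new runtime possibility the compiler can’t rule out, which is why these classes need tests rather than mere compilation.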

There are other difficulties we might run into. If there are multiple singletons in the app and one of them references HelperClass (even worse, maybe HelperClass also references this other singleton!), then we’ll get stuck there because singletons can’t take constructor parameters.

But this is a problem that would have come up even if we made this change all at once. If we do make the change all at once, and if we can’t find a way to work around the problem, at best we’ll have a branch with broken code sitting around collecting cobwebs until someone thinks of a way to move forward. By that point, master has changed significantly, and you have to merge in those changes and resolve all the conflicts without breaking things again. Good luck with that!

But on the other hand, if we hit this brick wall while in, or very close to, a working state, we can put the work aside for now and get the changes we’ve made into master. Then, if we want to pick it up again later, we can pick up where we left off.

Small Changes Are Comprehensible

We can see that forcing all these intermediate checkpoints of working code actually helps reveal all the work that needs to be done. The “obvious” part that we would probably do at the beginning if we tried to do this all at once is that any class that currently uses HelperClass will need an instance variable and a constructor parameter. What’s less obvious is that some classes that currently don’t reference HelperClass at all will need to act as intermediaries, passing the instance along to their own dependencies (it’s certainly not obvious which classes will need to do this). Also less obvious is that some classes, including these intermediaries, can’t have their construction modified, so some other strategy is needed, and that is where the compiler is no longer a sufficient guardrail.

This happens because our minds are better at accounting for the effects of small code changes. We can more reliably “see” what a small change is going to do, and will more likely catch things we didn’t think of before starting. Very large changes will go beyond what our minds can handle, and we end up poking around in the dark. What we’re really doing is digesting the change we want to make one bite at a time, instead of choking on it after trying to swallow it whole.

The “cost”, so to speak, of this enhanced clarity is the existence of intermediate states of code that are provably working but, from a design standpoint, arguably worse than what we started with. In the singleton example, we’ll have a mixture of dependency injection and singletons, with the dependency injection really just masking the fact that it is still a singleton. In other cases, the intermediate stage may look and feel significantly uglier, design-wise, than the initial stage did. This could mean longer methods, methods with too many parameters, less cohesion, or other things we would normally consider “design smells”. The tradeoff is that we accept a temporary move “backward” to a worse design, which ultimately allows a smoother transition to the better design we’ll have once all the steps are completed.

Use TDD to Formalize the Process

This process of “staying close to green” also meshes quite well with test-driven development (TDD).

TDD already instructs you to add or modify a test first, and only modify production code when a test is failing. Making small changes then translates directly into the process of making one test fail, then getting that test to pass, and repeat. Don’t make dozens of tests red and then try modifying production code to make them all green again. Make one test fail, then make it pass. After each cycle, you’re clear to commit.

In fact, if your test coverage is high enough, you can even consider executing this work directly on master (with no branching)! Suppose you have a trustworthy suite of rapidly running unit tests. In that case, you can run them every time you compile a change and prove you’re still green, even as you venture beyond what “pure” refactoring (meaning changes the compiler can verify) will allow. Whenever you encounter a situation where runtime checks are necessary, like the classes that can’t have constructor parameters added, you can first add tests to cover those areas (if they aren’t there already) and make sure they stay green as you make small changes.
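
As a sketch of the kind of fast, injection-friendly unit test this workflow leans on, we can hand ClassA a hypothetical test double (SpyHelper below) and assert that it delegates to the injected instance; the class definitions are minimal stand-ins:

```swift
class HelperClass {
  func doSomething() { /* ... */ }
}

class ClassA {
  private let helperClass: HelperClass
  init(helperClass: HelperClass) { self.helperClass = helperClass }
  func someMethod() { helperClass.doSomething() }
}

// Hypothetical test double that records whether it was called.
class SpyHelper: HelperClass {
  var doSomethingCalled = false
  override func doSomething() { doSomethingCalled = true }
}

let spy = SpyHelper()
let sut = ClassA(helperClass: spy)
sut.someMethod()
assert(spy.doSomethingCalled)   // ClassA delegated to the injected helper
```

A test like this runs in microseconds, which is what makes the “change, test, commit” loop cheap enough to repeat dozens of times.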

An essential feature of unit tests, and one that matters here, is that they run quickly. You don’t want to have to run integration tests that take, say, 5-10 minutes to complete after every small change. Ideally, your unit test suite runs in a matter of seconds. Additionally, if you aren’t already test-driven and don’t already have good unit test coverage, this mode of working will create pressure to start moving toward that goal.

I want to emphasize that following this process does not make you slower at achieving the same goal. It may, however, dramatically improve quality, and it may turn out that high-quality code changes do take longer than low-quality code changes. (By “quality”, as usual, I mean both the code quality and the app quality, in terms of stability, bugs, and so on.) The whole point is to decrease, not increase, waste. You may feel like the extra time spent testing is wasteful, but you have to compare it to the time you’d spend getting totally lost in a labyrinth of changes you made without testing more often. What the process will do is uncover just how much it costs you not to have good automation.

As you practice this and get better, you will eventually genuinely surprise yourself when you go through small steps to make a very large and significant change, and you get through the whole thing without causing any regressions at any point! You won’t believe it, and you’ll test over and over, looking for some bug you must have introduced somewhere. After much searching, you will have no choice but to accept the shocking truth: you made significant changes to the code and didn’t break anything.
