A Field Guide to Unit Testing: Maintainability

Having maintainable tests is very important.

THE UNIT TESTING SERIES

  1. A Field Guide to Unit Testing: Overview
  2. A good test is trustworthy
    • Test isolation (and what is dependency)
    • Definition of a unit
    • Testing scenarios
  3. A good test is maintainable
    • Testing one thing at a time
    • Use setup method, but use it wisely
  4. A good test is readable
    • Test description
    • Test pattern

Welcome to part 3 of our unit testing series! Previously, we got comfortable with the idea of unit tests (the basic concepts, why they exist, and so on) and introduced the three pillars of a good unit test: trustworthiness, maintainability, and readability. In the last article, we dived into how a good test can and should be trustworthy. Now, let's talk about how a test can be excitingly maintainable.

Test only one thing at a time

Testing only one thing at a time is an important mindset when we think about test maintainability.

Often, we expect multiple outcomes from the unit that we are testing. It may be several attribute changes caused by one operation, or different states for different inputs, so it feels natural to put several assertions in one test case.

Let's see an example first.

In a restaurant that only serves French toast, Server has this take_order method:

# Inside class Restaurant::Server

def take_order(dish)
  if dish == 'French toast'
    Cook.new.make_french_toast
  else
    "We only make French toast. Take it or leave it."
  end
end

When the dish name is "French toast," the server passes the order to a Cook instance. If it isn't, the server simply replies with a rejection message.

Now, we can write a test like this:

RSpec.describe Restaurant::Server do
  describe '#take_order' do
    it 'works' do
      server = Restaurant::Server.new
      # ----- First scenario begins-----
      # First assertion
      expect_any_instance_of(Restaurant::Cook).to receive(:make_french_toast)

      server.take_order('French toast')

      # ----- Second scenario begins-----
      result = server.take_order('Not French toast')

      # Second assertion
      expect(result).to be_kind_of String

      # Third assertion
      expect_any_instance_of(Restaurant::Server).not_to receive(:serve)

      server.take_order('Not French toast')
    end
  end
end

It is a perfectly valid test: it covers both scenarios, and it tests every expected behavior. But we can make it better. It has several problems:

  1. Ambiguous failures: If this test fails, it might fail at any of the three assertions. How would we know which one it is? And if it fails in the middle, the test stops right there. How do we know whether the assertions after that failure would have succeeded?
  2. Complicated state management: Code gets complicated once you want to test multiple things at once. And managing objects' states is something, trust me, you really don't want to get into.
  3. Decreased readability: The expected outcome is not immediately clear. We need to look through the code to know what the expected behavior is. The test description "it works" doesn't help us much here. "Write a clearer description then," you'd say. But that is naturally hard when we need to squeeze all the expected outcomes into one sentence.

Let's see an improved version:

RSpec.describe Restaurant::Server do
  describe '#take_order' do
    it 'when dish is French toast passes the order to Cook' do
      server = Restaurant::Server.new

      expect_any_instance_of(Restaurant::Cook).to receive(:make_french_toast)

      server.take_order('French toast')
    end

    it 'when dish is others returns an error message' do
      server = Restaurant::Server.new

      result = server.take_order('Not French toast')

      expect(result).to be_kind_of String
    end

    it 'when dish is others does not pass order to Cook' do
      server = Restaurant::Server.new

      expect_any_instance_of(Restaurant::Server).not_to receive(:serve)

      server.take_order('Not French toast')
    end
  end
end

Do you see? When we split the scenarios into three tests, magic happens:

  1. The test description is clear and to the point, so instead of digging through the test code, we can get a good grip of what this SUT (system under test) does at a glance. It can almost serve as documentation, and life is so much easier.
  2. We'd know exactly which part is not functioning if any of the tests fail, because each test only has one assertion against a specific behavior. It makes debugging way more efficient.

Use setup method, but use it wisely

The second important thing I want to point out in this article is the setup method. It is a powerful tool. If you're working with an existing test suite, you may have already seen it. The idea is simple: Everything you put in the setup will be executed at the beginning of each test.

We'll take the tests above as an example. Let's see how we can apply a setup method to them.

Now, scroll up, look at the test code again, and come back. See the

server = Restaurant::Server.new

at the first line of all three tests? Let's DRY it up:

RSpec.describe Restaurant::Server do
  # ----- Attention here -----
  before :each do
    @server = Restaurant::Server.new
  end
  # --------------------------

  describe '#take_order' do
    it 'with French toast passes the order to Cook' do
      expect_any_instance_of(Restaurant::Cook).to receive(:make_french_toast)

      @server.take_order('French toast')
    end

    it 'with other orders returns an error message' do
      result = @server.take_order('Not French toast')

      expect(result).to be_kind_of String
    end

    it 'with other orders does not pass order to Cook' do
      expect_any_instance_of(Restaurant::Server).not_to receive(:serve)

      @server.take_order('Not French toast')
    end
  end
end

What changed is that instead of initializing the local variable server at the beginning of each test, we assign the @server instance variable in the before :each block. It is then readily usable inside each test.

It doesn't seem like a big change, but when we move the same line of code into the before :each setup helper, we get the following benefits:

  • If we need to change how Restaurant::Server is initialized, we now only have to change it in one place (the before :each block) instead of three.
  • By removing the boilerplate part, we can focus on what we really want to test.

Almost every testing framework supports this functionality, including RSpec. Apart from before :each, RSpec puts three other hooks at our disposal:

  • before :all is executed once, before all the tests in the scope.
  • after :each is executed at the end of every test in the scope.
  • after :all is executed once, after all the tests in the scope are finished.

The double-edged sword

Yet, precisely because they are such convenient instruments, setup methods can be easily abused. We may be tempted to throw too much stuff into our setup methods, until we eventually find ourselves looking at a mess that has become a maintenance nightmare.

A real-world example, shall we?

Our restaurant example is really simple, so there isn't much room to mess around in the setup method. Let's walk out of that quaint little restaurant and into the real world for a while.

At PicCollage, we have a good tradition of writing tests. At the time of writing, we have 4000-ish test cases in total that guard our server, which serves millions of requests every day. They cover all the important logic in our codebase, and they are already quite trustworthy. Yet, we do have some issues with our test suite, and one of them is, guess what, the setup method.

We have this global setup method that runs at the beginning of every one of those 4000 tests, and here is a snippet of it:

def setup
  Rails.cache.clear

  Redis.current.flushdb
  Redis.current_2.flushdb

  Sidekiq::Worker.clear_all
  PC::Helpers::ElasticsearchHelpers.elasticsearch_clear!

  OmniAuth.config.test_mode = true

  Draper::ViewContext.clear!

  setup_aws_stubs
  setup_logger_stubs
  setup_schema_patches

  # And it goes on...
end

Each line corresponds to an external component that our production code interacts with. Why is it bad? Because:

  1. Some of the code only applies to a subset of the tests, yet we need to run it regardless, creating an extra burden on the server.
  2. This makes it much slower to run the tests.
  3. When an error happens, it's not readily visible where the error may be, because the setup is far away from where the actual tests are.

So what should be in the setup method?

To avoid misusing setup methods, we can ask ourselves the following questions:

  • Does every test need this setup? If only a certain subset of tests needs it, then it probably shouldn't be in the setup method. Also, if this setup is related to an external component, maybe you shouldn't need it in the first place if you are writing unit tests. Instead, you should stub that component to keep your unit tests isolated.

  • Is it related to the specific logic that you want to test? If yes, then it shouldn't be in the setup method, because if we move it out of the test block, we may have to go back and forth between the setup method and the test block to understand or remember what the test does. I'll let you in on a little secret: it's really easy to forget that each test has a setup method ahead of it.

We're not done yet

That's it! In this article, we talked about how a unit test can be maintainable. We looked at one important mindset and one powerful testing tool. The next article in this unit testing series will be on readability. Bye for now!

