The case for splitting tests into multiple it blocks
Cypress' documentation is very clear about how they suggest you structure your tests.
In their best practises documentation they say:
Tests should always be able to be run independently from one another and still pass.
You only need to do one thing to know whether you've coupled your tests incorrectly, or if one test is relying on the state of a previous one.
Change
it
toit.only
on the test and refresh the browser.If this test can run by itself and pass - congratulations you have written a good test.
... How to solve this:
Combine multiple tests into one larger test.
and
Anti-Pattern: Acting like you're writing unit tests.
Best Practice: Add multiple assertions and don't worry about it
This GitHub issue asks 'Is the execution order of it
blocks within a describe
block deterministic?'.
To which the answer is 'yes, but we don't recommend doing that', and a pointer to the above best practises documentation.
I respect that Cypress are clear about what their opinion is.
And, it's not that I disagree - I agree that a test should stand by it self, but I want to make the case that individual assertions are a helpful thing.
The Scenario - A basic CRUD application
This is a common scenario that I'm frequently testing. In plain english a test plan might look like this:
- I navigate to the widgets page
- I click 'add widget', a modal containing a form opens, I fill the form, I hit submit
- I see the new widget on the list of widgets.
- I click the corresponding 'edit' button
- A modal pops up, containing a form with the fields already populated
- I change widget name, I click submit.
- I see the widget with its name changed
- I click the corresponding delete button
- A confirmation modal pops up, I click confirmation
- The widget no longer exists on the page.
Now Cypress's opinion on this kind of tests is that either you write this all as one big test in side one it block, or this is split multiple tests that can run independently.
The problem with one big test
The problem with one big test, is that in the event of failures it's less clear what has gone wrong.
In our example above, the name of our test would probably be something like "We can create, edit and delete widgets".
Now let's say that the edit function is not working properly - when we see the test failure we don't immediately see what exactly went wrong. Was it that no 'add widget' button existed? When we created a widget, no new widget appeared? No 'edit' button existed? No modal popped up when we pressed the edit button?
In order to find out what's gone wrong, we need to drill in and be reading line numbers and stack traces.
If we were able to break the test into three it blocks like:
- "It can create a widget"
- "It can edit a widget"
- "It can delete a widget"
Then it narrows the code surface that we need to investigate. Granted we still need to be looking at stack traces and line numbers, but this will be easier.
The problem with multiple independent tests
The multiple independent tests scenario would like three tests that could be run at the same time:
- "Given that I am on the widget page, I click the 'add widget' button, fill the form, hit submit, and see the new widget"
- "Given that I am on the widget page, and there is a widget there, I hit the edit button, update the form, hit save, and see the edited widget"
- "Given that I am on the widget page, and there is a widget there, I hit the delete button, hit confirm, and see that the widget no longer exists"
Yes - this sounds good - but how do we get the application into the state of that given?
We could do API mocking - but I don't like this approach:
- It's more work. Now, in order to write some tests, we need to understand what API calls are made, provide mock data, and also possibly be defining some implementation behaviour (for example, "when I call DELETE /widgets/123, the widget with id 123 will no longer be returned in the GET /widgets" call). We're essentially reimplementing our API in order to test.
- We're not actually testing the API.
- One of the big values of using Cypress IMO is that you are implicitly testing your API. The problem with API mocking is how your API actually behaves may differ from what what your define your mock as doing. In that scenario you tests may be passing and you wouldn't know about the error.
Alternatively, we could be running our application against a real backend, that is populated with all the data we need. So for example, widgets with the name "Widget to update" and "Widget to delete" would already exist. Between runs of the cypress test suites, we reset our API to this initial state.
I do like this approach, and it would save a lot of time in tests. However, this would be a lot of work to set up. If you are someone just trying to set up some test coverage on an existing project, this is the kind of friction that might cause you to not write tests at all.
How multiple it blocks could work.
As it is, multiple it blocks do run consecutively, and so can get the advantage of increased granularity already. However, the problem you would have is that an it.only
on one of the later steps will fail.
So, a couple of ideas:
Have a special itSerial
block, that depends on previous it blocks
That is, say we have a test like:
describe ("Widgets page", () => {
itSerial("Can create a widget", () => {
});
itSerial("Can update a widget", () => {
});
itSerial("Can delete a widget", () => {
});
})
And we itSerial.only
that delete widget test, then all preceding itSerials in the describe will also run. Performance-wise this is no different to the 'one big test' approach.
Have itSerial
blocks use page URLs to retain state.
This idea is less fully formed, and possibly not practical, but humour me.
The problem with the previous idea is that in order to itSerial.only
the delete test, you still need to run all previous steps, and that can take a while.
It would be nice Cypress could just pick up from where it left off and run only the delete test.
The idea could be, that at the end of each itSerial
block, Cypress makes note of what the page URL is. When running an itSerial
as a single, it then navigates to that URL directly and runs the test.
Problems with this:
- Requires the application to have well defined URLs configured.
- It's still possible for the application to get into a non-repeatable broken state. For example, what if the delete step deletes the widget, but just doesn't update the screen? Subsequent DELETE calls will not work.
All in all, this approach might not be feasible.
Questions? Comments? Criticisms? Get in the comments! 👇
Spotted an error? Edit this page with Github