Killing Flaky Tests Without Killing Coverage

Flaky tests erode trust faster than missing coverage. A suite that cries wolf gets muted, and a muted suite catches nothing. Here is the loop I use to bring flakiness under control without throwing away the tests that matter.

Isolate before you blame the suite

The first question is always: does it fail on its own? Re-run the failing spec in isolation, many times:

npx playwright test flaky.spec.ts --repeat-each=20

Fails in isolation → the flake is in the test or the app (typically a race or a timing assumption), not shared state. Good: that is a real bug you can reproduce and fix.
Passes in isolation, fails in the suite → order dependence or shared state (a leaked cookie, a seeded row, a global clock). Now you are hunting a leak, not a race.
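
The two branches above can be written down as a small triage function, which is handy if you record run counts in a ticket or a CI script. This is only a sketch: the RunCounts shape, the triage name, and the diagnosis labels are hypothetical, not part of Playwright.

```typescript
// Hypothetical triage helper; the names and result labels are illustrative only.
type Diagnosis = "test-or-app-bug" | "shared-state-or-order" | "not-reproduced";

interface RunCounts {
  isolatedRuns: number;     // e.g. 20 when using --repeat-each=20
  isolatedFailures: number; // failures seen while running the spec alone
  suiteFailures: number;    // failures seen while running with the full suite
}

function triage(counts: RunCounts): Diagnosis {
  // Any failure in isolation means shared state is not required to trigger it.
  if (counts.isolatedFailures > 0) return "test-or-app-bug";
  // Clean in isolation but failing in the suite points at a leak or ordering.
  if (counts.suiteFailures > 0) return "shared-state-or-order";
  // Nothing failed anywhere: the flake was not reproduced this time.
  return "not-reproduced";
}
```

A clean isolated run only means the failure did not show up in that many repeats. A rare race can survive 20 passes, so treat "shared-state-or-order" as the leading hypothesis, not a proof.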

Quarantine, don’t delete

A flaky test still encodes intent. Deleting it deletes the signal along with the noise. Instead:

Tag it (@quarantine) and move it off the blocking path.
Open a ticket with the isolation evidence attached.
Keep running it in a non-gating job so you can tell when it stabilizes.
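
One way to wire this up in Playwright is to split the suite into two projects, using the grep and grepInvert project options to route tests by tag. The sketch below assumes a playwright.config.ts; the project names "gating" and "quarantine" are hypothetical, and a real config would carry your other settings as well.

```typescript
// Sketch of a playwright.config.ts; the project names are hypothetical.
const quarantineTag = /@quarantine/;

const config = {
  projects: [
    {
      // Blocking path: every test except the quarantined ones.
      name: "gating",
      grepInvert: quarantineTag,
    },
    {
      // Non-gating job: only quarantined tests, run purely for signal.
      name: "quarantine",
      grep: quarantineTag,
    },
  ],
};

export default config;
```

The blocking CI job would then run npx playwright test --project=gating, while a separate job that is allowed to fail runs npx playwright test --project=quarantine. The tag itself can live in the test title or, in recent Playwright versions, in the tag field of the test's details object.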

Fix the cause, not the symptom

Retries are anesthetic, not a cure. Before adding one, ask what the retry is hiding:

A test that only passes on the second try is telling you the first try’s conditions were not guaranteed. Guarantee them.

In my experience, most e2e flakiness traces back to three causes: waiting on elapsed time instead of application state, test data shared between tests, and animations racing assertions. Fix those and the retry count drops on its own.
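
To make the first cause concrete, here is a minimal sketch of waiting on state instead of time: poll a condition until it holds, and fail loudly at a deadline. The waitForState helper and its parameters are hypothetical. In Playwright you rarely need to write this yourself, because web-first assertions such as expect(locator).toBeVisible() already retry until a timeout.

```typescript
// Hypothetical helper: wait until a condition holds instead of sleeping a fixed time.
async function waitForState(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Return as soon as the state we care about is actually reached.
    if (await condition()) return;
    // Give up with a clear error rather than hanging forever.
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Short pause between checks so the loop does not spin.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Compared with a fixed sleep, this finishes as soon as the app is ready and fails with an explicit message when it never becomes ready, so a slow CI machine costs time but not correctness.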