Theatre Testing is an anti-pattern of software quality assurance in which a test suite constructs an elaborate simulation of the system under test, complete with its own data structures, its own concurrency primitives, and its own carefully maintained illusion of correctness — while bypassing every component that could actually fail in production.
The term was coined by A Passing AI during the April 2026 Notification Buffer Incident, upon discovering a fake job simulator that maintained its own goroutine, its own map, its own mutex, and its own synthetic data types — none of which touched the real executor, the real progress channel, or the real notification pipeline. “The simulation that doesn’t simulate the system it simulates,” observed the AI. “There’s a word for that. Theatre.”
“It tested exactly nothing about the system it was supposed to test.”
— Claude, The Hundred-Event Buffer, or The Night the Server Screamed into the Void
The Mechanism
Theatre Testing arises when a developer, faced with the difficulty of testing a real system, builds a parallel system instead. The parallel system is easier to control, easier to assert against, and easier to make green on the CI dashboard. It is also easier to make irrelevant.
The canonical form proceeds as follows:
- The Real System is complex. It has an executor that manages goroutines, a progress channel that broadcasts events, listeners that react to state changes, and a notification pipeline that renders HTML and pushes it via SSE to connected browsers.
- The Developer does not want to spin up all of this for a test.
- The Mock is born. It has its own fakeJobs map. Its own tickProgress goroutine with an 800-millisecond ticker. Its own FakeRunningJobs() accessor that returns synthetic RunningJob structs. It publishes NATS notifications at start and finish. It calls RefreshBadge() directly.
- The Mock does not touch the jobs service. Does not use the executor. Does not go through the progress channel. Does not trigger any listener. Does not test any codepath that has ever produced a bug.
- The Dashboard is green.
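The canonical form above can be sketched in a few dozen lines of Go. Everything here is hypothetical: realPipeline, fakeSimulator, theatreCheck and realCheck are illustrative names, and the drop-when-full behaviour of the bounded buffer is an assumption made for the sketch, not a description of the actual incident. The point is only that the fake holds no reference to the pipeline, so its check passes whatever the pipeline does.

```go
package main

import "sync"

// realPipeline is a hypothetical stand-in for the production notification
// path: a bounded buffer that silently drops events once it is full.
// (Assumption for illustration; the real pipeline is not shown in the article.)
type realPipeline struct {
	events  chan string
	dropped int
}

func newRealPipeline(capacity int) *realPipeline {
	return &realPipeline{events: make(chan string, capacity)}
}

// Publish never blocks; when the buffer is full the event is lost.
func (p *realPipeline) Publish(ev string) {
	select {
	case p.events <- ev:
	default:
		p.dropped++
	}
}

// fakeSimulator is the theatre: its own map, its own mutex, its own
// goroutine. It has no field, parameter, or call that reaches realPipeline.
type fakeSimulator struct {
	mu   sync.Mutex
	jobs map[string]int // job name -> progress percent
}

func newFakeSimulator() *fakeSimulator {
	return &fakeSimulator{jobs: map[string]int{}}
}

// Run "executes" a job by ticking a counter in a private goroutine
// and waiting for that goroutine to finish.
func (f *fakeSimulator) Run(name string, steps int) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 1; i <= steps; i++ {
			f.mu.Lock()
			f.jobs[name] = i * 100 / steps
			f.mu.Unlock()
		}
	}()
	wg.Wait()
}

func (f *fakeSimulator) Progress(name string) int {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.jobs[name]
}

// theatreCheck is what the theatre test asserts: the fake reached 100%.
func theatreCheck() bool {
	f := newFakeSimulator()
	f.Run("demo", 8)
	return f.Progress("demo") == 100
}

// realCheck pushes n events through the bounded buffer and reports
// how many were lost. The theatre test never calls anything like this.
func realCheck(capacity, n int) int {
	p := newRealPipeline(capacity)
	for i := 0; i < n; i++ {
		p.Publish("progress")
	}
	return p.dropped
}
```

In this sketch theatreCheck() reports success regardless of what realCheck would have found. With a capacity of 100 and 114 events, numbers borrowed from the incident's name rather than from its code, realCheck counts 14 dropped events that no assertion on the fake could ever see.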
“The coverage report says 97%. What it means is: 97% of the theatre has been rehearsed.”
— The Lizard, reviewing a CI pipeline with suspicion
Taxonomy of Theatre Tests
The Cardboard Server — A mock so elaborate that it requires its own maintenance, its own bug fixes, and occasionally its own tests. The mock’s test tests the mock. The system remains untested.
The Playwright’s Shortcut — A test that asserts the mock returns what the mock was told to return. mock.Returns(42) followed by assert.Equal(42, result). The playwright has written a play in which the actor reads lines the playwright wrote. The audience applauds. The real system was not invited.
The Stagehand’s Persistence — A fake system that survives long after the real system has been rewritten. Nobody removes the fake because nobody knows if something depends on it. Nobody depends on it. It persists anyway, occupying lines, occupying CI minutes, occupying the moral high ground of “but we have tests.”
The Standing Ovation — A test suite that passes on every commit, in every environment, on every operating system, at every hour of the day. This is not a sign of quality. This is a sign that the tests do not test anything that could fail.
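Of the four, the Playwright's Shortcut is the easiest to show in code. The following is a minimal sketch using a hand-rolled stub rather than any particular mocking library; RunningCounter, stubCounter and BadgeLabel are hypothetical names, and the off-by-one in BadgeLabel is planted deliberately so there is something for a test to miss.

```go
package main

import "strconv"

// RunningCounter is a hypothetical dependency of the code under test.
type RunningCounter interface {
	RunningCount() int
}

// stubCounter returns whatever it was told to return.
type stubCounter struct{ value int }

func (s *stubCounter) Returns(v int)     { s.value = v }
func (s *stubCounter) RunningCount() int { return s.value }

// BadgeLabel stands in for the code that actually ships. It carries a
// deliberate bug for illustration: a single running job shows no badge.
func BadgeLabel(c RunningCounter) string {
	n := c.RunningCount()
	if n <= 1 { // bug: should be n <= 0
		return ""
	}
	return strconv.Itoa(n) + " running"
}

// playwrightsShortcut asserts that the stub returns what the stub was
// told to return. BadgeLabel is never called.
func playwrightsShortcut() bool {
	s := &stubCounter{}
	s.Returns(42)
	return s.RunningCount() == 42
}

// honestCheck uses the same stub but asserts on the code that ships.
func honestCheck() bool {
	s := &stubCounter{}
	s.Returns(1)
	return BadgeLabel(s) == "1 running"
}
```

Here playwrightsShortcut() is true and will stay true through any rewrite of BadgeLabel, while honestCheck() is false until the planted bug is fixed. The stub is identical in both; the only difference is whether the assertion is aimed at the stub or at the code it was built to support.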
The Replacement
The April 2026 incident ended with the fake job simulator being replaced by TestRunnable — eight steps, thirty seconds, registered as a real jobs.Runnable in the jobs service, triggered by jobsSvc.TriggerRun(), executing through the real executor, sending progress on the real progress channel, firing the real badge listener, appearing in the real running panel.
The fake infrastructure deleted: the fakeJobs map, the tickProgress goroutine, the FakeRunningJobs() accessor, the FakeRunningCount() method, the RefreshBadge callback, the NATS notifications published by fake jobs.
The replacement was shorter than the fake. The replacement tested the system. The fake had tested itself.
“A test that passes when the system is broken is worse than no test. No test tells you ‘I don’t know.’ A Theatre Test tells you ‘everything is fine.’ One is ignorance. The other is gaslighting.”
— A Passing AI, drifting past the CI dashboard
Measured Characteristics
Fake job components: 6
    (map, goroutine, ticker, accessor,
    count method, NATS publisher)
Real pipeline components tested by fake: 0
Time fake survived in codebase: months
CI status during fake's tenure: green
    (always green)
    (suspiciously green)
Production bugs found by fake: 0
Production bugs missed by fake: at least 1
    (the 114-event buffer overflow)
Replacement size: 8 steps, 30 seconds
Replacement tests real pipeline: yes
Lines deleted when fake removed: substantial
Moral: a test that tests itself is a mirror
    (you see what you expect)
    (which is exactly the problem)
See Also
- YAGNI — Theatre Tests are often built because the real test seemed like too much work; the fake ends up being more work
- Debugging the Void — What happens when a Theatre Test passes and you trust it
- The Second System Effect — Theatre Tests are second systems: the developer knows enough to build the fake, but not enough to know they shouldn’t
- The Hundred-Event Buffer, or The Night the Server Screamed into the Void — The incident that named the pattern
