Research Communication · Interactive Media

The Case for Explorable Explanations

Static figures are a compromise forced by print. We can do better — and we should.

August 20, 2024

The standard academic paper was designed for print. Its conventions — a fixed sequence of sections, static figures, numbered equations — make perfect sense when the medium is paper and the bottleneck is page count. Those conventions have migrated, essentially unchanged, to the digital era. Most ML papers published today are PDFs: static, linear, read-only.

This is a missed opportunity. When you read about attention patterns in a transformer, you should be able to change the input and watch the patterns shift. When you encounter a claim about how SAE features cluster, you should be able to drag a threshold slider and test the claim yourself. When someone argues that PCA loses less structure than t-SNE for a given dataset, you should be able to try a few datasets and see.

The static figure imposes an authoritarian relationship between author and reader. The author chooses what to show you; you can look but not touch. Explorable explanations invert this. They give readers the ability to form their own intuitions, test edge cases the author didn't consider, and arrive at conclusions through direct experience rather than passive consumption.

The phrase comes from Bret Victor

In 2011, Bret Victor wrote a short essay called "Explorable Explanations" describing a vision for active reading: documents that respond to the reader's input in real time. The examples he built — a tax calculator where you could drag numbers and watch downstream effects update live, a simulation of predator-prey dynamics you could perturb — were immediately compelling. The idea felt obviously right.

The ML community got its own version a few years later through Distill.pub, a journal that treated interactive figures as first-class components of research articles rather than supplementary extras. The best Distill articles — Olah et al. on feature visualisation, the Circuits thread — shaped how a generation of researchers thinks about neural networks. Not just because the arguments were rigorous, but because you could hold the ideas in your hands, rotate them, and find where they broke.

The difference between reading about superposition and playing with a toy model of superposition is the difference between knowing a fact and having an intuition.

Distill wound down active publication in 2021, citing the difficulty of reviewing and maintaining interactive content at scale. This is a real problem, and I don't want to minimise it. But I think the difficulty reflects tooling that hasn't caught up, not an inherent limitation of the format.

Why it matters more in interpretability than anywhere else

A field that studies how models represent information has a particular obligation to be thoughtful about how it represents information to readers. Interpretability claims are often subtle. "This feature activates on French text" sounds simple until you start asking: what counts as French? How much activation counts as activation? Does it still activate on French words in English sentences?

These questions can't be answered by a single static figure. They require exploration. Readers who can only consume pre-selected examples have to trust the author's curation choices — which is exactly the kind of passive deference that interpretability research is trying to move away from in models. Why should we accept it in papers about those models?

When I built the explorable on SAE failures, the most important design decision was giving readers direct control over the model's activation threshold. The phenomenon I was describing — that features fragment differently depending on where you set the sparsity constraint — is impossible to communicate with a fixed image. You have to move the slider yourself. Otherwise you're just reading my description of what I saw, which is two steps removed from the thing itself.
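The pattern behind that slider can be sketched in a few lines of JavaScript. To be clear, this is not the explorable's actual code: the `activations` matrix, its values, and the function name are invented for illustration. The design point it demonstrates is real, though: keep the thresholding logic a pure function, separate from any DOM rendering, so the figure's behaviour can be unit-tested like any other research code.

```javascript
// Hypothetical SAE activations: rows are tokens, columns are features.
// (Toy values for illustration only.)
const activations = [
  [0.9, 0.1, 0.0],
  [0.4, 0.6, 0.2],
  [0.0, 0.8, 0.7],
];

// Pure function: which features fire anywhere at a given threshold?
// No DOM access here, so this logic is testable outside the browser.
function activeFeatures(acts, threshold) {
  const active = new Set();
  for (const row of acts) {
    row.forEach((a, j) => {
      if (a >= threshold) active.add(j);
    });
  }
  return [...active].sort((a, b) => a - b);
}

// The slider handler is then a one-liner that re-renders on input:
// slider.addEventListener("input", (e) =>
//   render(activeFeatures(activations, Number(e.target.value))));

console.log(activeFeatures(activations, 0.5)); // features 0, 1 and 2 all fire
console.log(activeFeatures(activations, 0.75)); // only features 0 and 1 survive
```

Separating compute from rendering this way also answers part of the peer-review objection below: the claim-bearing logic can ship with ordinary tests even if the rendering layer cannot.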

The practical objections

It's harder to build. Yes. Building a well-designed interactive figure takes longer than dropping in a matplotlib plot. But the marginal cost has dropped substantially. WebGL is fast and widely supported; React + canvas covers most use cases; libraries like Observable Plot and D3 are mature. The activation energy is lower than it was in 2011.

It's harder to peer review. Also yes. Reviewers can't run your interactive figures on paper printouts. But conferences and journals already handle supplementary code, videos, and demos. Interactive figures are a natural extension of existing supplementary material practices, not a separate category.

It's harder to preserve. This is the most serious objection. JavaScript rots. Libraries break. Browser APIs change. A paper published today with interactive figures might be unreadable in 20 years. This is a genuine archival problem that the community hasn't solved. But the answer is probably not "therefore static figures only." It's "therefore we need better archival infrastructure."

What the bar should be

I'm not arguing that every paper needs a full Distill-style treatment. Long-form interactives are expensive and appropriate for a small fraction of work. But there's a lot of space between "no interactivity" and "fully explorable article" that's currently almost entirely unoccupied.

Specifically: any empirical claim that depends on a threshold, hyperparameter, or choice of example should be accompanied by something the reader can manipulate. Any visualisation method that is known to distort in specific ways (dimensionality reduction, attention maps, saliency methods) should ship with a distortion indicator. Any toy model used to build intuition should be playable, not just viewable.
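To make the distortion-indicator default concrete, here is a minimal sketch of one such indicator for dimensionality reduction: a neighbourhood-preservation score, the mean fraction of each point's k nearest neighbours in the original space that remain among its k nearest neighbours in the embedding. The function names and toy data are mine, not from any particular library; an explorable would surface this number alongside the projection and update it as the reader changes parameters.

```javascript
// Euclidean distance between two equal-length vectors.
const dist = (a, b) =>
  Math.sqrt(a.reduce((s, ai, i) => s + (ai - b[i]) ** 2, 0));

// Indices of the k nearest neighbours of point i (excluding i itself).
function nearest(points, i, k) {
  return points
    .map((p, j) => [j, dist(points[i], p)])
    .filter(([j]) => j !== i)
    .sort((a, b) => a[1] - b[1])
    .slice(0, k)
    .map(([j]) => j);
}

// Mean fraction of each point's k-NN preserved by the embedding.
// 1.0 means local structure is intact; lower means the picture distorts.
function neighbourhoodPreservation(highDim, embedded, k) {
  let total = 0;
  for (let i = 0; i < highDim.length; i++) {
    const orig = new Set(nearest(highDim, i, k));
    const proj = nearest(embedded, i, k);
    total += proj.filter((j) => orig.has(j)).length / k;
  }
  return total / highDim.length;
}

// Toy check: the third coordinate is constant, so dropping it preserves
// every pairwise distance exactly and the score should be perfect.
const highDim = [[0, 0, 5], [1, 0, 5], [0, 1, 5], [10, 10, 5]];
const embedded = highDim.map(([x, y]) => [x, y]);
console.log(neighbourhoodPreservation(highDim, embedded, 2)); // 1
```

A t-SNE or PCA view annotated with a score like this tells the reader, at a glance, how much to trust the neighbourhoods they are looking at.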

These aren't high bars. They're defaults we should have already adopted. The static figure will always have its place — for formal proofs, for concise summaries, for print fallbacks. But as a primary medium for communicating empirical results in a field that runs in browsers, on GPUs, and through APIs, it's an anachronism we've grown too comfortable with.

The reader deserves more than a picture. Give them the controls.

Related work: Bret Victor, Explorable Explanations (2011) · Distill.pub · Why SAEs Fail (explorable)