Reversibility and blast radius: the two questions before you delegate anything to AI

Last month I had to decide what should happen when a customer disputes a charge.

The webhook fires. Stripe says: this customer disputed. Now what?

The product could do one of two things. Suspend the customer's access immediately, since they might be a fraudster who'll dispute every charge from here on out. Or just log it for me to see, and let me decide.

I went with log it. Nothing automatic. The dispute creates a high-severity log entry I can see; the customer's access is unchanged. I make the call.

That's the decision in one paragraph. The reasoning under it is what this post is about, and it's the most reusable thing I can give you about working with AI.

The two questions

Before I let any code path act on its own, two questions.

How big is the blast radius? If this action happens, how far does the impact propagate? It might be one row in a table, or the whole table. One user's account, or every user's. An internal log nobody reads, or a customer-facing PDF that goes out by email.

Is it reversible? If the action turns out to be wrong, what does it cost to undo? A one-command revert? An emailed apology? A regulator inquiry?

The pair gives me a quick map. High reversibility plus small blast radius: safe to delegate. Low reversibility plus large blast radius: absolute no. The interesting territory is the middle, where there's no obvious answer.
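Here is the map as a rough sketch. The three-level scale and the names `Level` and `delegation_call` are mine, invented for illustration; in practice both inputs are judgment calls, not enum values.

```python
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def delegation_call(blast_radius: Level, undo_cost: Level) -> str:
    """Map the two questions to a first-pass answer.

    undo_cost is the inverse of reversibility: HIGH means hard to undo.
    """
    # Small blast radius, cheap to undo: safe to delegate.
    if blast_radius is Level.LOW and undo_cost is Level.LOW:
        return "delegate"
    # Large blast radius, expensive to undo: absolute no.
    if blast_radius is Level.HIGH and undo_cost is Level.HIGH:
        return "never"
    # Everything else is the middle, where a human has to weigh context.
    return "human decides"
```

Most real actions land in the last branch, which is the point: the function can only sort the easy cases.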

Back to the dispute

Apply the questions to the dispute decision.

Auto-flip the subscription to suspended on dispute. Blast radius: a real customer loses access to the product, mid-day, with no warning. Reversibility: I can un-flip them, but only after I notice. Which usually means after they email me, frustrated, asking what happened.

Just log it. Blast radius: a log entry, visible only to me. Reversibility: irrelevant. Nothing happened.

The cost of being wrong with auto-flip is a real customer's bad day, and probably a churn risk. The cost of being wrong with log-only is that I might miss a fraudster for a few hours. At pilot scale, both errors are recoverable, but the first one is more expensive to the relationship.

Log-only wins.

This is the kind of decision the AI cannot make for you. It depends on what you know about your customer base, your support capacity, and how much you can afford to be wrong. The AI doesn't know any of that.
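For concreteness, a minimal sketch of the log-only choice. It assumes the webhook payload has already been verified and parsed into a dict shaped like a Stripe event; the function name, logger name, and return values are hypothetical, not my actual handler.

```python
import logging

logger = logging.getLogger("billing.disputes")  # hypothetical logger name


def handle_stripe_event(event: dict) -> str:
    """Log a dispute at high severity and change nothing else.

    Returns a short label for the action taken, which keeps it testable.
    """
    if event.get("type") != "charge.dispute.created":
        return "ignored"

    dispute = event["data"]["object"]
    # Deliberately no call that touches the customer's access here.
    logger.critical(
        "dispute opened: dispute=%s charge=%s amount=%s",
        dispute.get("id"),
        dispute.get("charge"),
        dispute.get("amount"),
    )
    return "logged"
```

What matters is what's absent: there is no branch that suspends anyone. That call stays with me.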

Scale changes the answer

At a thousand customers, log-only stops working. Dispute volume drowns the signal. Auto-flip becomes safer because the number of customers it can wrongly suspend is bounded by the dispute rate, which is small, and you have a support team that can un-flip in minutes when a real customer surfaces.

The two questions don't give you a static answer. They give you a method for working out the answer in your specific context. The same code can be the right call at one scale and wrong at another. The framework doesn't change. The inputs do.
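One way to keep the inputs visible is to pass them in explicitly. A sketch, assuming a hypothetical `Context` type and a threshold of a thousand customers taken from the example above; the real cutoff depends on your dispute volume and support capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    customers: int
    support_can_unsuspend_quickly: bool


def dispute_policy(ctx: Context, scale_threshold: int = 1000) -> str:
    """Pick a dispute policy from context instead of hard-coding one."""
    # Auto-suspend only pays off when there is enough volume to drown a
    # manual review AND someone who can reverse a mistake within minutes.
    if ctx.customers >= scale_threshold and ctx.support_can_unsuspend_quickly:
        return "auto_suspend"
    return "log_only"
```

Same function, different answers: `dispute_policy(Context(12, False))` gives log-only, and the answer flips once both conditions hold.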

The same method, applied elsewhere

I don't let the AI run database migrations. I let it write the SQL. I read it. I run it.

Why: a migration's blast radius is the entire dataset of the product. Reversibility is contingent on backups, which exist but are slow to restore and lossy. The AI is genuinely good at writing migrations. The act of running one is where the bad outcomes live. So the AI stops at the SQL boundary, and I cross it.
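The boundary can be made explicit in code. This is a sketch, not my actual tooling: `propose_migration` stands in for whatever the AI produces, the table name is made up, and SQLite is used only so the example runs on its own.

```python
import sqlite3
from typing import Optional


class MigrationNotApproved(Exception):
    """Raised when someone tries to run SQL nobody has signed off on."""


def propose_migration() -> str:
    # Stand-in for the AI's output. It returns text and touches nothing.
    return "CREATE TABLE dispute_log (id INTEGER PRIMARY KEY, note TEXT);"


def run_migration(
    conn: sqlite3.Connection, sql: str, approved_by: Optional[str] = None
) -> None:
    """Execute a migration only if a named human has approved it."""
    if not approved_by:
        raise MigrationNotApproved("a human has to read this SQL first")
    conn.executescript(sql)
```

The writing side has no database handle at all. Only `run_migration` does, and it refuses to proceed without a name attached.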

I let the AI execute multi-step work, but in phases. Read, then propose, then act, with my OK between each. Same calculus at the task level: the smaller the unit of action, the smaller the blast radius of an error, and the easier the recovery.
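A minimal sketch of that phase gate. The names are hypothetical: `phases` is a list of (name, step) pairs, and `approve` is whatever asks me for an OK, whether that's a prompt, a chat message, or a review tool.

```python
from typing import Callable


def run_in_phases(
    phases: list[tuple[str, Callable[[], str]]],
    approve: Callable[[str, str, str], bool],
) -> list[str]:
    """Run phases in order, asking for approval between each pair.

    approve(finished_phase, its_result, next_phase) returns True to continue.
    Returns the names of the phases that actually ran.
    """
    completed: list[str] = []
    previous = None
    for name, step in phases:
        # Gate every phase after the first on the previous phase's result.
        if previous is not None and not approve(previous[0], previous[1], name):
            break
        previous = (name, step())
        completed.append(name)
    return completed
```

If I say no after seeing the proposal, the act step never executes. The error, if there is one, stays the size of a proposal.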

The boundaries the AI itself draws

There's a version of this that's nice to watch.

A good AI agent will say things like "before I commit to a fix, let me check exactly how X works." Or "stopping rules I'll apply: if two iterations don't improve, I'll stop and flag." Or "let me add an emergency stop so the task is actually complete before I commit it."

These are the agent applying the two questions to its own work. It's measuring its own blast radius before swinging. The framework isn't something you only impose from above. The best AI work uses it internally too.
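The "two iterations without improvement" rule is simple enough to write down. This sketch assumes each iteration produces a numeric score where higher is better, which is my simplification; real agents judge improvement in fuzzier ways.

```python
from typing import Iterable, Optional


def run_until_stalled(
    scores: Iterable[float], patience: int = 2
) -> tuple[Optional[float], str]:
    """Stop once `patience` consecutive iterations fail to beat the best score."""
    best: Optional[float] = None
    stalled = 0
    for score in scores:
        if best is None or score > best:
            best = score
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                # Stop and hand back to the human instead of grinding on.
                return best, "stopped: flag for human"
    return best, "finished"
```

Given scores of 1, 2, 2, 1, 5, it stops after the fourth and never sees the 5. That's the trade: a bounded amount of wasted effort in exchange for sometimes quitting early.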

What people get wrong

Reversibility is treated like a binary. It isn't. Most actions are reversible at some cost. The real question is what undoing costs you, not whether undoing is possible. That cost varies enormously.

Blast radius gets read as a fact about the code. In practice it's an output of the code AND the scale you're running at. The same logic that's harmless at three customers can be catastrophic at three thousand. Re-evaluate as you scale; don't pick a rule and freeze it.

And both questions are weighted by what you can afford to be wrong about right now. A pre-revenue startup has more slack than a public company on a Friday before a holiday. The framework is the same. Your appetite for irreversibility is not.

What's left for the human

As AI gets faster and better at most code, the human's remaining job is increasingly this: decide which actions are too consequential to delegate, and which aren't.

That decision is not about the model. It's about your specific context, your specific blast radius, your specific reversibility cost. The AI can't see those things, and probably never will.

When someone shows you an agent that "decides autonomously," ask them how they handled this question. If they don't have an answer, the system isn't being driven. It's being run.