2026 is the year enterprises stopped asking whether to use AI agents and started asking how to put them in front of real users without getting hurt. The gap between a working demo and a production deployment is wider than most teams expect. An agent that books a meeting flawlessly in a demo is a different thing from an agent allowed to touch a customer’s account, move money, or send a message in your name. This is the nine-part checklist we run before any agent sees production traffic.
The unifying principle: an agent is software that takes actions you did not explicitly write. Everything below exists to bound what those actions can be, prove what they were, and stop them when something goes wrong.
1. Least-privilege permissions
Give the agent the narrowest possible access to do its job, and nothing more. If it summarises tickets, it does not need write access to the billing system. Scope every tool and credential to the specific task. The test: list everything the agent could do if fully compromised, and make sure that list is survivable.
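One way to make that scoping concrete is an explicit allowlist that sits between the agent and its tools. The sketch below is illustrative only: `ScopedTools`, `PermissionDenied`, and the tool names are hypothetical and not taken from any particular agent framework.

```python
from typing import Callable, Dict, Iterable, List


class PermissionDenied(Exception):
    """Raised when the agent requests a tool outside its granted scope."""


class ScopedTools:
    """Hypothetical wrapper: the agent can only call tools named in `allowed`."""

    def __init__(self, tools: Dict[str, Callable[..., object]], allowed: Iterable[str]):
        self._tools = tools
        self._allowed = frozenset(allowed)
        unknown = self._allowed - tools.keys()
        if unknown:
            raise ValueError(f"allowlist names unknown tools: {sorted(unknown)}")

    def call(self, name: str, **kwargs):
        # Deny by default: anything not explicitly granted is refused.
        if name not in self._allowed:
            raise PermissionDenied(f"tool {name!r} is not in this agent's scope")
        return self._tools[name](**kwargs)

    def blast_radius(self) -> List[str]:
        # Everything a fully compromised agent could still invoke.
        return sorted(self._allowed)
```

Reviewing the output of `blast_radius()` for each agent is a cheap way to run the "is this list survivable?" test during design review.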
2. A human-approval gate for irreversible actions
Draw a hard line between reversible and irreversible actions. Reversible actions (drafting, summarising, retrieving) can run autonomously. Irreversible ones (payments, deletions, external emails, account changes) require a human to approve before execution. The agent proposes; a person commits. This single control targets the most catastrophic class of agent failure: an action that cannot be undone.
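A minimal sketch of the propose-then-commit split, assuming a hypothetical set of irreversible action names and an `approve` callback that stands in for whatever review queue or UI a team actually uses:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Assumed action names for illustration; a real list comes from your own tool inventory.
IRREVERSIBLE = frozenset(
    {"send_payment", "delete_record", "send_external_email", "change_account"}
)


@dataclass
class ProposedAction:
    name: str
    args: Dict[str, Any]


def execute(
    action: ProposedAction,
    run: Callable[[ProposedAction], Any],
    approve: Callable[[ProposedAction], bool],
) -> Dict[str, Any]:
    """Run reversible actions directly; irreversible ones only after approval."""
    if action.name in IRREVERSIBLE and not approve(action):
        # The agent proposed, the human declined: nothing is executed.
        return {"status": "rejected", "action": action.name}
    return {"status": "done", "action": action.name, "result": run(action)}
```

The design choice worth noting is that the gate lives in the executor, not in the prompt, so the model cannot talk its way past it.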
3. A tested evaluation suite over real tasks
Evaluate agents on tasks, not prompts. Build a suite that mirrors real user goals end to end and measure task-completion rate, number of steps taken, tool-call accuracy, and cost per task. Gate releases on a pass bar the same way you gate code on tests. An agent with no task-level eval is an agent you are testing in production on your customers.
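As a rough sketch of what a task-level gate could look like, the code below aggregates the four metrics named above and compares them with a pass bar. The `TaskResult` shape and the threshold values are assumptions chosen for illustration, not recommended numbers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskResult:
    completed: bool
    steps: int
    correct_tool_calls: int
    total_tool_calls: int
    cost_usd: float


def summarise(results: List[TaskResult]) -> Dict[str, float]:
    """Aggregate per-task results into suite-level metrics."""
    if not results:
        raise ValueError("evaluation suite produced no results")
    n = len(results)
    total_calls = sum(r.total_tool_calls for r in results)
    correct_calls = sum(r.correct_tool_calls for r in results)
    return {
        "completion_rate": sum(r.completed for r in results) / n,
        "mean_steps": sum(r.steps for r in results) / n,
        "tool_call_accuracy": correct_calls / total_calls if total_calls else 1.0,
        "cost_per_task": sum(r.cost_usd for r in results) / n,
    }


def passes_gate(
    summary: Dict[str, float],
    min_completion: float = 0.90,      # illustrative threshold
    min_tool_accuracy: float = 0.95,   # illustrative threshold
    max_cost_per_task: float = 0.50,   # illustrative threshold, in USD
) -> bool:
    """Release gate: every metric must clear its bar."""
    return (
        summary["completion_rate"] >= min_completion
        and summary["tool_call_accuracy"] >= min_tool_accuracy
        and summary["cost_per_task"] <= max_cost_per_task
    )
```

Wired into CI, a failing `passes_gate` would block the release the same way a failing unit test does.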
4. Cost and step caps
Agents loop. A reasoning bug, an ambiguous goal, or a misbehaving tool can send an agent into a cycle that burns thousands of model calls before anyone notices. Set hard caps: a maximum number of steps per task, a maximum spend per session, and a maximum wall-clock time. When a cap is hit, the agent stops and escalates rather than continuing.
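The three caps can be sketched as a small budget object checked before every step. This assumes a hypothetical `step` callable that returns `(done, cost_usd)`; the default limits are placeholders, not recommendations.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Tuple


class CapExceeded(Exception):
    """Raised when a step, spend, or wall-clock cap has been reached."""


@dataclass
class Budget:
    max_steps: int = 25          # placeholder limit
    max_spend_usd: float = 2.0   # placeholder limit
    max_seconds: float = 120.0   # placeholder limit
    steps: int = 0
    spend_usd: float = 0.0
    started: float = field(default_factory=time.monotonic)

    def check(self) -> None:
        # Called before each step, so no new work starts once a cap is hit.
        if self.steps >= self.max_steps:
            raise CapExceeded("step cap reached")
        if self.spend_usd >= self.max_spend_usd:
            raise CapExceeded("spend cap reached")
        if time.monotonic() - self.started >= self.max_seconds:
            raise CapExceeded("time cap reached")

    def record(self, cost_usd: float) -> None:
        self.steps += 1
        self.spend_usd += cost_usd


def run_task(step: Callable[[], Tuple[bool, float]], budget: Budget) -> str:
    """Drive the agent loop; stop and escalate instead of continuing past a cap."""
    try:
        while True:
            budget.check()
            done, cost_usd = step()
            budget.record(cost_usd)
            if done:
                return "completed"
    except CapExceeded as exc:
        return f"escalated: {exc}"
```

Because the check happens before the step rather than after it, a looping agent with `max_steps=3` makes exactly three calls, not four.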
5. Full action logging and traceability
Log every decision the agent makes: the goal, each retrieval, each tool call with its arguments, each model response, and the final action. Tie it together with a correlation ID so you can reconstruct exactly what happened in any session after the fact. When an agent does something surprising — and it will — this log is the difference between a five-minute diagnosis and a multi-day investigation.
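A minimal sketch of correlation-ID logging, assuming one JSON record per event appended to a sink. `SessionTrace` and the event kinds are hypothetical names; in practice the sink would be a log pipeline rather than an in-memory list.

```python
import json
import time
import uuid
from typing import Any, Dict, List, Optional


class SessionTrace:
    """Hypothetical per-session logger: every record carries the same correlation ID."""

    def __init__(self, goal: str, sink: Optional[List[str]] = None):
        self.correlation_id = uuid.uuid4().hex
        # `is not None` matters: an empty list passed by the caller must still be used.
        self.sink = sink if sink is not None else []
        self.log("goal", text=goal)

    def log(self, kind: str, **fields: Any) -> Dict[str, Any]:
        record = {
            "correlation_id": self.correlation_id,
            "ts": time.time(),
            "kind": kind,  # e.g. goal, retrieval, tool_call, model_response, final_action
            **fields,
        }
        # default=str keeps logging from failing on non-JSON-serialisable arguments.
        self.sink.append(json.dumps(record, default=str))
        return record
```

Filtering the log store on a single `correlation_id` then yields the full ordered story of one session.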
6. Guardrails against injection and exfiltration
Agents that read external content — emails, documents, web pages — are exposed to prompt injection, where malicious instructions hidden in the content try to hijack the agent. Add input and output guardrails: sanitise and bound untrusted content, detect attempts to override instructions, and scan outputs for data exfiltration and PII leakage. Treat every piece of content the agent ingests as potentially adversarial.
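The sketch below shows the shape of an input and an output guardrail using simple pattern matching. It is deliberately naive: the phrases, the size limit, and the regular expressions are assumptions for illustration, and pattern matching alone is not an adequate defence against prompt injection. Treat it as one layer alongside permissions and approval gates.

```python
import re
from typing import Tuple

MAX_CHARS = 4000  # assumed bound on how much untrusted text reaches the model

# Illustrative phrases only; real attacks are far more varied.
OVERRIDE_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (the )?(system|previous) (prompt|instructions)", re.I),
]

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# 13 to 16 digits, optionally separated by spaces or hyphens (card-like numbers).
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def bound_untrusted(text: str) -> Tuple[str, bool]:
    """Clip and delimit untrusted content; flag likely override attempts."""
    clipped = text[:MAX_CHARS].replace("</untrusted_content>", "")
    flagged = any(p.search(clipped) for p in OVERRIDE_PATTERNS)
    wrapped = f"<untrusted_content>\n{clipped}\n</untrusted_content>"
    return wrapped, flagged


def redact_output(text: str) -> str:
    """Mask email addresses and card-like numbers before output leaves the system."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return CARD_LIKE.sub("[REDACTED_NUMBER]", text)
```

A flagged input would typically be dropped or routed to review rather than silently passed to the model.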
7. A kill switch
You need a single control that halts all agent activity immediately — across every session, not one at a time. When something goes wrong at 2am, the on-call engineer should be able to stop the fleet with one action and ask questions later. Build it, test it, and make sure the people who might need it know it exists and how to use it.
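Within a single process, the idea can be sketched as one shared flag that every session checks before each step. `KillSwitch` and `agent_step` are hypothetical names; across a real fleet the flag would have to live in shared state (a feature-flag service or a database row, for example) that every worker polls.

```python
import threading


class AgentHalted(Exception):
    """Raised in every session once the kill switch has been tripped."""


class KillSwitch:
    """One flag shared by all sessions in this process."""

    def __init__(self) -> None:
        self._halted = threading.Event()
        self.reason = ""

    def trip(self, reason: str) -> None:
        # A single action stops everything; investigation happens afterwards.
        self.reason = reason
        self._halted.set()

    def reset(self) -> None:
        self.reason = ""
        self._halted.clear()

    def check(self) -> None:
        if self._halted.is_set():
            raise AgentHalted(self.reason)


def agent_step(switch: KillSwitch, session_id: str) -> str:
    # Every step in every session consults the same switch first.
    switch.check()
    return f"{session_id}: step ran"
```

Tripping the switch in a staging drill, and confirming that all sessions actually stop, is part of testing it.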
8. Graceful failure and fallback
Decide what the agent does when a tool times out, an API returns an error, or it cannot complete a task. The wrong answer is “keep trying” or “guess.” The right answer is a defined fallback: retry with backoff a bounded number of times, then hand off to a human or return a clear “I could not complete this” rather than fabricating a result. Failing safely is a feature.
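The fallback described above can be sketched as a bounded retry with exponential backoff that ends in an explicit handoff. The attempt count and delays are illustrative defaults, and the `sleep` parameter exists only so the behaviour can be tested without waiting.

```python
import time
from typing import Any, Callable, Dict


def call_with_fallback(
    tool: Callable[[], Any],
    *,
    attempts: int = 3,        # illustrative bound on retries
    base_delay: float = 0.5,  # seconds; doubles after each failure
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Try a tool a bounded number of times, then hand off instead of guessing."""
    last_error = None
    for attempt in range(attempts):
        try:
            return {"status": "ok", "result": tool()}
        except Exception as exc:  # a real system would catch narrower error types
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # No fabricated result: report the failure and route to a human.
    return {
        "status": "handoff",
        "message": "I could not complete this",
        "error": repr(last_error),
    }
```

Returning a structured handoff rather than raising keeps the "could not complete" path as visible and testable as the success path.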
9. Clear ownership after launch
An agent in production is a living system that needs an owner: someone who watches the metrics, reviews the logs, responds when quality drifts, and decides when to retrain or adjust. “The AI team” is not an owner. Name a person. Agents degrade silently as the world changes around them, and silent degradation with no owner is how a useful agent becomes a liability.
Run it as a gate, not a wish list
The point of this checklist is that it is binary. Each item is either done or it is not, and an agent does not go to production until all nine are done. The discipline feels heavy for the first agent and becomes routine by the third — at which point you have a repeatable path from demo to production that the rest of the organisation can trust.
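To show what "binary" could mean in practice, here is a small sketch of the checklist as a release gate. The item keys are hypothetical shorthand for the nine sections above; anything not explicitly marked `True` counts as not done.

```python
from typing import Dict, List, Tuple

# Hypothetical keys, one per checklist item.
CHECKLIST = (
    "least_privilege",
    "human_approval_gate",
    "task_level_evals",
    "cost_and_step_caps",
    "action_logging",
    "injection_guardrails",
    "kill_switch",
    "graceful_fallback",
    "named_owner",
)


def ready_for_production(status: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """All nine items must be done; partial credit does not exist."""
    missing = [item for item in CHECKLIST if status.get(item) is not True]
    return (not missing, missing)
```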
The teams that move fastest with agents are not the ones who skip these steps. They are the ones who built the controls once, as reusable infrastructure, so every new agent inherits permissions, logging, caps, and a kill switch by default. Safety, done right, is what lets you go fast.