The risks of agentic AI are real. Agentic AI tools — Copilot, AutoGPT, and the custom agents businesses are building in-house — no longer just answer questions. They take actions on your behalf, across your systems.
But here's what the sales demos tend to gloss over: the same capability that makes agentic AI powerful is precisely what makes it dangerous when something goes wrong. Every permission you grant an AI agent is a permission that can be exploited. And unlike a human employee, the agent won't pause, feel uneasy, or ask whether it's really supposed to do this.
When an AI agent is compromised — through a prompt injection attack, a misconfigured tool, or simply an instruction it misinterprets — it doesn't make one bad decision. It executes a chain of actions automatically, often before any human has any idea something has gone wrong.
What "agentic overreach" actually means
Most businesses give their AI agents access they don't strictly need because it's convenient. Connecting the agent to email, calendar, your CRM, and the file server all at once means it can be more helpful. But it also means a single compromised or injected instruction can cascade into a sequence of irreversible actions across every one of those systems simultaneously.
This is what security professionals call agentic overreach: the gap between the access an AI agent genuinely requires to do its job and the access it has actually been given. That gap is where the damage happens.
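The overreach gap can be made concrete with a quick audit. Here's a minimal sketch — the scope names and the `AGENT_NEEDS`/`AGENT_GRANTED` sets are illustrative assumptions, not a real API — showing that the gap is simply the set difference between what was connected and what the job requires:

```python
# Hypothetical audit of one agent's "overreach gap".
# Scope names are illustrative, not tied to any real platform.

AGENT_NEEDS = {"calendar:read", "email:read"}           # what the job requires

AGENT_GRANTED = {                                       # what was actually connected
    "calendar:read", "calendar:write",
    "email:read", "email:send",
    "crm:write", "files:delete",
}

# Every scope here is attack surface the agent's job never needed.
overreach = AGENT_GRANTED - AGENT_NEEDS
print(sorted(overreach))
```

Running this audit per agent, per deployment, is the first step towards knowing how large your blast radius actually is.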
Consider the difference between a human employee and an AI agent when both receive an instruction that turns out to be malicious. The human might feel something is off. They might ask a colleague. They might simply fail to follow through because they got distracted. The AI agent executes immediately, completely, and logs it as a completed task.
How a compromised agent attack plays out
The attack doesn't have to be sophisticated. It might be a malicious instruction hidden inside a document the agent is asked to summarise. It might be a compromised API endpoint that returns instructions alongside data. It might even be an ambiguously worded internal prompt that the agent interprets in a way nobody anticipated.
What all these scenarios have in common: by the time a human notices, the agent has already finished. The emails have been sent. The records have been updated. The data has left the building.
The principle of least privilege — and why almost nobody applies it
Least privilege is one of the oldest principles in information security: every system, process, or user should have access to exactly what it needs to do its job, and nothing more. Applied to agentic AI, it's simple in theory. An agent that summarises meeting notes doesn't need email send access. An agent that researches suppliers doesn't need write access to your CRM.
In practice, businesses almost never scope permissions this tightly. The reasons are understandable: it's faster to connect everything at once, the agent is more useful when it can cross-reference multiple systems, and nobody has really thought through the attack surface they're creating.
The discipline required here isn't just technical. It's organisational. Someone has to decide, for each agent deployment, what the minimum viable permission set actually is. That decision needs to be documented, reviewed periodically, and updated whenever the agent's scope changes.
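One way to make that documented, reviewable decision tangible is a per-agent permission manifest with deny-by-default enforcement. This is a sketch under assumptions — the `AgentManifest` structure, scope strings, and review date are all hypothetical illustrations of the discipline, not a prescribed implementation:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical least-privilege manifest: the minimum viable permission
# set for one agent, written down, dated, and checked at call time.

@dataclass(frozen=True)
class AgentManifest:
    name: str
    allowed_scopes: frozenset
    last_reviewed: date

def authorise(manifest: AgentManifest, requested_scope: str) -> bool:
    # Deny by default: anything not explicitly listed is refused.
    return requested_scope in manifest.allowed_scopes

notes_agent = AgentManifest(
    name="meeting-notes-summariser",
    allowed_scopes=frozenset({"calendar:read", "files:read"}),
    last_reviewed=date(2025, 1, 1),
)

print(authorise(notes_agent, "calendar:read"))   # within scope
print(authorise(notes_agent, "email:send"))      # a summariser never sends email
```

The `last_reviewed` field matters as much as the scope list: it gives the periodic review somewhere to live, and a stale date is itself a finding.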
Which actions should never happen without a human in the loop
Not every agentic action carries the same risk. Reading a file is very different from deleting one. Drafting an email is very different from sending it. Retrieving a database record is very different from updating one. The principle that emerges from this is straightforward: irreversible actions should require human confirmation before execution.
In practice, this means building approval gates into your agentic workflows at every point where the agent could do something that can't be undone. Sending communications. Modifying records. Deleting files. Making financial transactions. Calling external APIs that trigger downstream consequences. These are the actions where the cost of getting it wrong is highest — and therefore where the human checkpoint earns its keep.
The more tools you give an AI agent, the larger the blast radius of any successful attack or injected instruction.
This is a genuinely hard design problem. If you put a human checkpoint in front of every agentic action, you've eliminated the efficiency gains that made the agent worth deploying. The answer isn't checkpoints everywhere — it's checkpoints in the right places, informed by a clear analysis of which actions are reversible and which aren't.
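The "checkpoints in the right places" idea can be sketched as a simple gate keyed on reversibility. The action names and the `IRREVERSIBLE` set below are illustrative assumptions; the point is the shape of the control, not the specific list:

```python
# Hypothetical approval gate: irreversible actions are held for human
# sign-off; reversible ones proceed without friction.

IRREVERSIBLE = {
    "email:send", "record:update", "file:delete",
    "payment:execute", "webhook:call",
}

def execute(action: str, params: dict, approved: bool = False):
    if action in IRREVERSIBLE and not approved:
        # Queue for a human instead of executing immediately.
        return ("pending_approval", action, params)
    return ("executed", action, params)

print(execute("file:read", {"path": "notes.txt"}))   # runs straight away
print(execute("email:send", {"to": "client"}))       # held for sign-off
```

Because reads and drafts pass straight through, the agent keeps its speed on the 90% of actions that can be undone, and the human checkpoint spends its attention only where mistakes are permanent.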
Most deployments weren't designed with any of this in mind
The honest truth is that most agentic AI deployments happened fast. A developer connected a few APIs because the agent needed them. Nobody mapped the permission boundaries. Nobody defined which actions required human sign-off. Nobody built the monitoring infrastructure to know what the agent was actually doing day-to-day.
That's not a criticism of the businesses involved — it reflects the speed at which this technology has moved. But it does mean that a significant number of UK businesses now have AI agents operating with permissions they've never fully reviewed, taking actions nobody has fully audited, in workflows that have no human checkpoint anywhere in them.
The gap between where most businesses are and where they need to be is closeable. But it requires someone to actually map it first.
How BBS helps with this
- AI Security Gap Assessment — We review your agent permissions, tool access scope and privilege boundaries, producing a prioritised map of where your blast radius is largest and what needs to change first.
- AI Architecture Review — We design least-privilege permission models for agentic deployments, so your agents have exactly what they need and nothing more — from day one.
- AI Acceptable Use Policy — We define which actions AI agents may take autonomously and which require human confirmation, giving your team a clear, enforceable framework.
- Human Oversight Design — We build approval gates into agentic workflows at the right points, so irreversible actions always have a human checkpoint without killing the efficiency gains you deployed the agent to achieve.