It's the answer everyone gives to "how do you keep AI safe?" But what happens when the safety net becomes the single point of failure?
The Comfortable Answer
Ask anyone how they safeguard against rogue AI, how they ensure critical decisions have a human audit trail, how they meet regulatory requirements for accountability. You'll get the same answer.
Human in the loop.
I say it myself. Probably too much. It's become the default response, the reassuring phrase that makes boards comfortable and regulators nod. The EU AI Act mandates it for high-risk systems. The NIST AI Risk Management Framework lists it as a core principle. Every major consultancy (McKinsey, Deloitte, Gartner) references it as foundational to responsible AI deployment.
The logic is clean and intuitive. A human reviews AI decisions. Catches errors. Maintains accountability. Signs off before anything critical happens. The AI does the heavy lifting; the human provides the judgement.
It's a compelling model. It's also starting to crack.
And I know this because I'm cracking it right now.
The Loop I'm Sitting In
As I write this, I'm in a coffee shop. I have three independent projects running simultaneously: a newsletter build, an asset content production planner, and a blog publishing tool. Each one is being built and iterated by AI agents. My role across all three is the same: human in the loop. I review changes, sign off decisions, approve outputs.
Three projects. One human. A flat white.
And here's the honest truth: I'm good at this. I know these projects intimately. I can context-switch between them and give meaningful feedback. But I can already feel the edges of it. The newsletter change looks right, approve. The content planner structure makes sense, approve. The blog tool just needs a small tweak, approve. Each decision takes seconds. The review feels genuine. But how much am I really scrutinising versus pattern-matching against what I expect to see?
Now multiply this by ten projects. Or fifty. Or an entire organisation where every department runs on agent swarms and every decision flows through a human approval step.
The maths doesn't work. And I say that as someone who is, right now, the human in the loop.
The Speed Problem
Here's the first fracture.
Modern agentic AI systems don't make decisions at human speed. They make thousands, sometimes millions, of decisions per second. As organisations spin up autonomous teams, with entire departments made up of agent swarms working collaboratively, the human reviewer faces a throughput problem that no amount of training or dedication can solve.
Emre Kazim, co-founder of Holistic AI, put it bluntly in January 2026: "Human-in-the-loop has hit the wall. Humans cannot meaningfully track or supervise AI at machine speed and scale."
He's right. Consider what "review" actually means in practice. An agent handling customer service interactions processes hundreds of conversations simultaneously. A compliance agent scans thousands of transactions per hour. A marketing agent generates, tests, and optimises campaigns across multiple channels in real time.
Where exactly does the human reviewer sit in this? What are they reviewing? The first ten decisions? A random sample? The ones the AI flags as uncertain? Each approach has blind spots. And every point of human intervention adds latency and cost, creating a bottleneck in systems designed for speed.
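To make the throughput problem concrete, here is a minimal back-of-the-envelope sketch. The figures in the usage note (5,000 decisions an hour, 30 seconds per genuine review) are illustrative assumptions, not measurements from any of the studies cited here, and the names `ReviewLoad`, `reviewable_fraction`, and `reviewers_needed` are made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class ReviewLoad:
    decisions_per_hour: int    # decisions the agents produce (assumed figure)
    seconds_per_review: float  # time one genuine human review takes (assumed figure)
    reviewers: int = 1


def reviewable_fraction(load: ReviewLoad) -> float:
    """Share of decisions the reviewers can actually look at in one hour."""
    capacity = load.reviewers * 3600 / load.seconds_per_review
    return min(1.0, capacity / load.decisions_per_hour)


def reviewers_needed(load: ReviewLoad) -> int:
    """Reviewers required to look at every single decision, rounded up."""
    per_reviewer = 3600 / load.seconds_per_review
    return math.ceil(load.decisions_per_hour / per_reviewer)
```

Under those assumed numbers, `reviewable_fraction(ReviewLoad(5000, 30))` comes out at 0.024, so a single reviewer sees about 2.4% of decisions, and `reviewers_needed(ReviewLoad(5000, 30))` returns 42. Whatever sampling strategy you pick, it is choosing which 2.4% to look at.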
In many implementations, human-in-the-loop has already become human-in-the-way.
The Complacency Trap
The second fracture runs deeper.
Parasuraman and Manzey demonstrated in their seminal 2010 paper that automation complacency (the tendency for humans to over-trust automated systems) occurs especially "under conditions of multiple-task load, when manual tasks compete with the automated task for the operator's attention."
The critical finding? Complacency "cannot be overcome with simple practice." It's not a training problem. It's a structural feature of how humans interact with automation.
Think about what happens in practice. A human reviewer sees a hundred AI decisions. Ninety-nine are correct. They start to trust the system. By the thousandth decision, they're not reviewing anymore. They're rubber-stamping. The approval becomes performative. The safety net is still there structurally, but functionally it's been hollowed out.
A 2024 study on Security Operations Centres confirmed this: increased automation in SOCs "amplifies the risk of automation bias and complacency whereby security analysts become over-reliant on automation, failing to seek confirmatory or contradictory information."
Research from Oxford on national security decision-making found that "initial reliance on direct AI answers impaired performance on subsequent tasks when the AI erred." In other words, trusting AI output in one context made humans worse at catching AI errors in the next.
A clinical study published in AI and Ethics found that in medical diagnostics, "clinicians of all expertise levels were vulnerable to automation bias, even when AI improved their overall diagnostic accuracy, with nearly half of errors associated with this bias."
The pattern is consistent across domains. The more you rely on AI, the worse you get at checking AI. The safety net degrades through use.
The Deskilling Spiral
The third fracture is the most alarming, and it's now backed by clinical evidence.
In August 2025, The Lancet Gastroenterology & Hepatology published a multicentre study across four endoscopy centres in Poland. Nineteen experienced endoscopists, each with over 2,000 colonoscopies, were measured before and after routine AI assistance was introduced.
The results were stark. Detection rates without AI dropped from 28.4% to 22.4% after routine AI exposure. That's a six-percentage-point drop, or a relative reduction of roughly 21% in diagnostic capability. Experienced clinicians got measurably worse at their core skill after relying on AI to help with it.
This wasn't a theoretical concern anymore. It was real-world evidence with patient-outcome implications.
Microsoft Research and Carnegie Mellon University found the same pattern in knowledge work. In a study of 319 workers and 936 real-world examples, higher confidence in generative AI was associated with less critical thinking. Deep problem-solving "often migrated from human-led to AI-driven." Workers reported that AI made tasks cognitively easier, but researchers found they were ceding the expertise itself to the system.
The ACM now describes this as "The AI Deskilling Paradox": a structural problem, not an individual failing, emerging across law, education, journalism, and software development.
Here's why this matters for human-in-the-loop: if humans lose the ability to perform the tasks AI is doing, they simultaneously lose the ability to meaningfully review AI's work.
The vicious cycle looks like this:
- AI takes over tasks from humans
- Humans lose the skills needed to perform those tasks
- Humans can no longer evaluate whether AI is performing correctly
- Human review becomes performative, approval without understanding
- Errors propagate undetected
- The organisation has no fallback if AI fails
We wrote about this dynamic in our piece on skill atrophy. The research has only strengthened since.
The Knowledge Gap Nobody Is Planning For
There's a dimension to this that rarely makes it into governance frameworks.
Research on institutional knowledge estimates that approximately 90% of total organisational knowledge is held as unwritten know-how: skills, instincts, and contextual understanding that are difficult to articulate or document. This knowledge has traditionally been acquired by doing the work, absorbing context through experience.
When AI automates tasks, that pathway breaks. The ACM warns that when "novice knowledge workers simply retrieve the information they seek from GenAI, the result could be the risk of a loss, over time, of organisational knowledge."
Now project this forward. An agent swarm runs your compliance department. It handles regulatory filings, monitors changes in legislation, flags risks, and produces audit reports. It does this faster and more consistently than the human team it replaced. The humans who remain are reviewers. They approve outputs but no longer produce them.
What happens when the AI encounters a novel situation it can't handle? What happens when the model needs to be retrained or replaced? What happens when a regulator asks a human to explain, in detail, why a specific decision was made?
If the last person who understood how to do the work retired three years ago, and the current team has only ever reviewed AI outputs, who answers those questions?
This isn't hypothetical. Deloitte's State of AI 2026 report found that 85% of companies plan to deploy autonomous agents, but only 21% have mature governance models. That's a 64-percentage-point gap between ambition and readiness. Gartner predicts that over 40% of agentic AI projects will be cancelled by end of 2027 due to escalating costs, unclear business value, or inadequate risk controls.
McKinsey warns explicitly about "agent sprawl", the uncontrolled proliferation of redundant, fragmented, and ungoverned agents across teams and functions. As agent creation becomes accessible to anyone through low-code platforms, organisations face a new kind of shadow IT: agents that multiply across teams, duplicate efforts, or operate without oversight.
Self-Documenting Agents: The Way Forward
I've become convinced that the answer isn't removing humans from the loop. It's fundamentally changing what "the loop" looks like, and making AI agents responsible for their own accountability.
When an AI agent is running critical business tasks, it should be self-documenting by design. Not as an afterthought. Not as a compliance checkbox. As a core operational requirement.
This means:
Maintaining its own SOPs. The agent should hold and continuously update standard operating procedures for every process it manages. Not documentation that a human wrote and hopes the AI follows. Documentation the AI writes and maintains as part of doing the work. AWS has already released an open-source framework for this (Strands Agent SOPs), using standardised markdown formats with RFC 2119 constraints (MUST, SHOULD, MAY) to define agent workflows in natural language.
Reporting against those SOPs silently. The agent should continuously verify its own behaviour against its documented procedures and flag deviations automatically. Not in a dashboard that a human has to remember to check. In a proactive alert system that escalates by exception. ISACA identifies "the growing challenge of auditing agentic AI" precisely because current systems lack this built-in traceability.
Preserving institutional knowledge. Every decision the agent makes should generate an explainable record. Not just what it decided, but why, what alternatives it considered, and what data informed the choice. When a human needs to understand or override a decision, the context should be immediately available, not buried in model weights.
Planning for its own replacement. The documentation an agent maintains should be sufficient for a human, or a different AI system, to take over the process. If an agent were switched off tomorrow, could someone reconstruct what it does and why? If the answer is no, you have a business continuity problem dressed up as efficiency.
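The four requirements above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the Strands Agent SOPs format or any vendor's API: the class names, the fields, and the idea of representing an SOP as a mapping from step name to permitted decisions are all hypothetical simplifications.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    process: str
    decision: str             # what was decided
    rationale: str            # why
    alternatives: list[str]   # what else was considered
    inputs: dict[str, str]    # data that informed the choice
    sop_step: str             # the documented step this decision claims to follow
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


@dataclass
class SelfDocumentingAgent:
    # Hypothetical SOP shape: step name -> decisions permitted at that step.
    sop: dict[str, set[str]]
    log: list[DecisionRecord] = field(default_factory=list)
    deviations: list[DecisionRecord] = field(default_factory=list)

    def record(self, rec: DecisionRecord) -> bool:
        """Log every decision; return False if it deviates from the SOP."""
        self.log.append(rec)
        if rec.decision not in self.sop.get(rec.sop_step, set()):
            self.deviations.append(rec)  # escalate by exception
            return False
        return True

    def handover(self) -> str:
        """Export the SOP and decision history for a human or successor system."""
        return json.dumps(
            {
                "sop": {step: sorted(allowed) for step, allowed in self.sop.items()},
                "decisions": [asdict(r) for r in self.log],
            },
            indent=2,
        )
```

In a real deployment the `deviations` list would feed an alerting channel rather than sit in memory, and the SOP would be richer than a lookup table. The point of the sketch is the shape: every decision carries its own rationale, deviations surface without anyone checking a dashboard, and `handover()` answers the "switched off tomorrow" question.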
Building for Resilience, Not Just Compliance
The organisations that will scale AI safely are the ones moving beyond the binary of "AI does it" versus "human checks it." The emerging model has three tiers:
Human-in-the-loop. Retained for genuinely high-stakes, novel, or ethically sensitive decisions. Not everything, not routine approvals, but the moments where human judgement genuinely adds value.
Human-on-the-loop. Humans monitor dashboards and exception reports, intervening when anomalies are flagged. The system operates autonomously for routine decisions. The human's role shifts from approver to governor.
Agent-in-the-loop. AI "guardian agents" monitor other AI agents, providing machine-speed oversight that humans physically cannot. Humans focus on system design, policy setting, and strategic direction. This concept is expected to shift from experimental to mandatory in 2026.
The practical steps for any business scaling AI agent usage:
- Define what actually needs human review. Not everything. Be specific about which decisions carry consequences that justify human latency.
- Build in active deskilling countermeasures. Rotate staff through manual task performance. Maintain the pathway for acquiring hands-on knowledge. Don't let AI automate away the skills you need to govern it.
- Require self-documentation from every agent. Make SOP maintenance a core function, not an add-on. If an agent can't explain what it does and why, it shouldn't be running unsupervised.
- Design for agent failure. Business continuity planning should treat AI dependency as a first-class operational risk. What happens if your agent infrastructure goes down for a week? Can your team still function?
- Implement bounded autonomy. Agents operate within defined guardrails, with confidence thresholds that trigger escalation. Not a blanket approval process. A dynamic one that adjusts based on risk and certainty.
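As a rough illustration of bounded autonomy, the routing rule below sends each decision to one of the oversight modes based on risk and confidence. The risk labels, the threshold values, and the `route` function are assumptions made for this sketch; real thresholds would need calibrating against your own error data.

```python
from enum import Enum


class Oversight(Enum):
    AUTONOMOUS = "agent acts, decision is logged"
    HUMAN_ON_THE_LOOP = "agent acts, decision is flagged for a human monitor"
    HUMAN_IN_THE_LOOP = "agent waits for human approval"


# Hypothetical thresholds: minimum confidence needed to act unflagged, per risk level.
CONFIDENCE_FLOOR = {"low": 0.70, "medium": 0.90}


def route(risk: str, confidence: float, novel: bool = False) -> Oversight:
    """Pick an oversight mode from risk level, model confidence, and novelty."""
    if risk not in ("low", "medium", "high"):
        raise ValueError(f"unknown risk level: {risk!r}")
    # High-stakes or novel decisions always get a human approver.
    if risk == "high" or novel:
        return Oversight.HUMAN_IN_THE_LOOP
    if confidence >= CONFIDENCE_FLOOR[risk]:
        return Oversight.AUTONOMOUS
    # Below the floor, escalate: medium risk needs approval, low risk is flagged.
    if risk == "medium":
        return Oversight.HUMAN_IN_THE_LOOP
    return Oversight.HUMAN_ON_THE_LOOP
```

With these assumed thresholds, `route("low", 0.95)` runs autonomously, `route("low", 0.5)` is flagged for monitoring, and `route("medium", 0.8)` waits for approval. The sketch leaves out the agent-in-the-loop tier; a guardian agent would sit alongside this rule, watching the autonomous path at machine speed.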
The Harsh Reality
Human-in-the-loop, as currently implemented, is simultaneously the most widely relied-upon AI safety mechanism and one of the most fragile.
It doesn't scale. It induces the complacency it's meant to prevent. It degrades the skills it depends upon. And it creates single points of failure in systems designed for resilience.
The path forward isn't removing humans. It's being honest about what humans are actually good at, and what they're not. Humans are excellent at setting policy, defining values, making judgement calls in novel situations, and asking "should we?" Humans are terrible at reviewing thousands of routine decisions at machine speed without losing focus.
Build systems that play to those strengths. Make AI responsible for its own documentation, its own compliance checking, its own audit trail. Free humans to do what humans do best: think about whether the system itself is pointed in the right direction.
Because the alternative, pretending that a human rubber-stamp at the end of an automated pipeline constitutes meaningful oversight, isn't a safety net.
It's a comfort blanket.
Thinking about how to scale AI agent usage without creating governance blind spots? Let's talk about building systems that are resilient by design.
Sources
Research papers and academic studies
- Complacency and bias in human use of automation, Parasuraman & Manzey, Human Factors (2010)
- Endoscopist deskilling risk with routine AI assistance, The Lancet Gastroenterology & Hepatology (2025)
- Impact of generative AI on critical thinking, Microsoft Research & Carnegie Mellon University (2025)
- Automation bias in Security Operations Centres, MDPI Computers (2024)
- Bending the automation bias curve, International Studies Quarterly, Oxford University Press (2024)
- Automation complacency: risks of abdicating medical decision making, AI and Ethics, Springer (2025)
- The AI Deskilling Paradox, Communications of the ACM
- Knowledge management in a world of generative AI, ACM (2025)
Industry reports and analysis
- State of AI in the Enterprise 2026, Deloitte
- Over 40% of agentic AI projects will be cancelled by 2027, Gartner
- Seizing the agentic AI advantage, McKinsey
- Human-in-the-loop has hit the wall, Emre Kazim, SiliconANGLE (2026)
- Human-in-the-loop in AI risk management: not a cure-all approach, IAPP
Frameworks and governance