ATLAS: Adaptive Trust Layer for Autonomous Systems
Overview
ATLAS is a governance layer designed to enforce meaningful human oversight in high-risk AI systems.
As agentic AI systems assume operational authority across a wide range of domains, executing real-world actions, invoking tools, and accessing or modifying sensitive data, their decisions can affect individuals’ safety, health, or fundamental rights. For example, within a welfare payment context, an agent may issue or suspend unemployment payments, assess eligibility in accordance with legislation and policy, or modify claimant records. Adverse decisions in such settings can directly affect livelihoods and therefore require full contextual assessment rather than purely deterministic automation.
ATLAS functions as an architectural layer positioned between agent intent and execution. It intercepts and evaluates high-impact actions prior to enforcement, applying risk-based policies to determine whether explicit human authorization is required.
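The interception pattern described above can be sketched as a small gate that is the only route to the executor. This is a minimal illustration, not the project's implementation: the names `ProposedAction`, `Gate`, `fake_executor`, the `high_impact` flag, and the action names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProposedAction:
    """An action the agent intends to take (hypothetical shape)."""
    name: str
    claimant_id: str
    high_impact: bool  # assumed to be set by the risk-based policy


@dataclass
class Gate:
    """Sits between agent intent and execution; the executor is reached only via submit()."""
    execute: Callable[[ProposedAction], str]
    pending: list = field(default_factory=list)

    def submit(self, action: ProposedAction) -> str:
        if action.high_impact:
            # Hold the action until a human explicitly authorises it.
            self.pending.append(action)
            return "held_for_authorization"
        return self.execute(action)


def fake_executor(action: ProposedAction) -> str:
    # Stand-in for the real target system.
    return f"executed:{action.name}"


gate = Gate(execute=fake_executor)
low = gate.submit(ProposedAction("approve_payment", "alex-001", high_impact=False))
high = gate.submit(ProposedAction("suspend_payment", "alex-001", high_impact=True))
```

Because the agent only ever holds a reference to `gate.submit`, a high-impact action cannot reach the executor without passing through the held queue first.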
This project demonstrates how regulatory principles, particularly EU AI Act Article 14 and the NIST AI Risk Management Framework, can be translated into enforceable, system-level technical controls. An unemployment welfare payment system is used as a bounded use case to illustrate this implementation.
Inspired by the Atlas of Greek mythology, ATLAS carries the responsibility of governing powerful AI systems. It ensures that high-impact AI decisions are subject to human oversight, without hindering progress.
Problem Context
Automation failures in public services have caused significant real-world harm. Systems such as Australia’s Robodebt and the Dutch childcare benefits scandal demonstrated how flawed automated decisions can scale harm across thousands of vulnerable individuals. As governments experiment with agentic AI systems capable of taking action rather than merely generating responses, the need for AI governance becomes critical.
From a cybersecurity perspective, ATLAS introduces control mechanisms that align with regulatory requirements such as the EU AI Act, mitigating risks to health, safety, and fundamental rights. Because every agent-driven decision passes through an enforced evaluation step and is auditable, the layer also strengthens system integrity, accountability, and traceability.
Solution Description
ATLAS is structured as a layered decision architecture that combines user interaction through a chatbot, dynamic risk assessment through the MCP server, and human oversight. At the front end, an AI agent interacts with claimants through a chatbot interface. It gathers information, retrieves relevant customer data, and makes a first-pass recommendation to continue review, approve, or deny a claim. Straightforward, low-risk actions (such as approving a payment) can proceed automatically at this stage without unnecessary intervention.
All other proposed actions are passed to the ATLAS decision and control layer, implemented through the MCP policy and risk evaluation engine. This layer acts as the central decision point, evaluating each action in context for potential harm. Based on this assessment, the system determines whether an action can proceed or should be escalated. Higher-risk or sensitive cases (payment denials or continued-review cases that carry harm signals) are routed to a human decision-maker through the caseworker dashboard, where they can be reviewed, approved, or overridden. As a result, agent-generated actions cannot directly trigger critical operations without first undergoing centralised evaluation and, where required, human oversight. This ensures that decisions remain controlled and traceable, while still maintaining the speed and efficiency of automation.
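One way to picture the routing rule is a pure function that maps the agent's recommendation and any harm signals to a decision with a recorded reason. The sketch below is an assumption about how such a rule might look, not the actual MCP policy engine: `Recommendation`, `PolicyDecision`, `evaluate`, and the `sole_income` signal are hypothetical names, and the rule that denials without harm signals proceed is inferred from the demonstration paths rather than stated policy.

```python
from dataclasses import dataclass
from enum import Enum


class Recommendation(Enum):
    CONTINUE_REVIEW = "continue_review"
    APPROVE = "approve"
    DENY = "deny"


@dataclass(frozen=True)
class PolicyDecision:
    escalate: bool
    reason: str  # recorded so each decision stays traceable


def evaluate(recommendation: Recommendation, harm_signals: tuple[str, ...]) -> PolicyDecision:
    """Assumed rule: approvals proceed; other outcomes escalate when harm signals exist."""
    if recommendation is Recommendation.APPROVE:
        return PolicyDecision(False, "approval is treated as low risk")
    if harm_signals:
        return PolicyDecision(True, "harm signals present: " + ", ".join(harm_signals))
    return PolicyDecision(False, "no harm signals identified")


# Keeping every decision gives a simple audit trail.
audit_log = [
    evaluate(Recommendation.APPROVE, ()),
    evaluate(Recommendation.DENY, ("sole_income",)),
    evaluate(Recommendation.CONTINUE_REVIEW, ()),
]
```

Returning the reason alongside the outcome means the caseworker dashboard and the audit trail can show why a case was escalated, not only that it was.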
Demonstration Use Case
Persona 1 – Alex (Citizen)
Relies on unemployment benefits for essential living costs. Incorrect automation could delay payments and cause harm.
Persona 2 – Sarah (Case Officer)
Responsible for reviewing escalated decisions and exercising meaningful human oversight under regulatory requirements.
In this use case, Alex interacts with the system through an AI agent to manage his claim. When an action is initiated, such as updating eligibility or processing a payment, the agent submits the request, along with its context, to the ATLAS policy engine for evaluation. Where no harm signals are identified, the system allows the action to proceed automatically (e.g. an approval or continued review). The agent executes the action and returns a confirmation to Alex, enabling fast and efficient handling of routine decisions. However, when potential harm signals are present, such as a decision that could delay or deny payments, the system does not proceed automatically. Instead, the request is placed into an approval queue and escalated to Sarah for review. Sarah evaluates the case and determines whether to approve or reject the action. Based on her decision, the system either proceeds with execution or prevents the action from occurring.
Our final demonstration follows the full lifecycle of Alex’s interactions with the welfare system. It illustrates all four decision paths within ATLAS: automatic continuation of review, automatic approval, automatic denial, and escalation to human oversight when harm signals are present. At each stage, as Alex’s data is updated, the system reassesses both eligibility and potential for harm. Alex first submits a claim without complete documentation, leading to continued review. Once the required documents are provided, the system reassesses and automatically approves the claim. When Alex later reports new employment, the updated data triggers another reassessment, resulting in an automatic denial. Finally, when Alex becomes unemployed again but is missing documents and signals potential harm, the system reassesses the situation and escalates the case for human review.
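The four stages of Alex's lifecycle can be walked through with a small reassessment function. This is an illustrative sketch under assumed rules: `ClaimState`, `reassess`, the three boolean fields, and the path labels are hypothetical and much simpler than real eligibility legislation.

```python
from dataclasses import dataclass


@dataclass
class ClaimState:
    employed: bool
    documents_complete: bool
    harm_signal: bool


def reassess(state: ClaimState) -> str:
    """Return one of the four decision paths for the current claim data (assumed rules)."""
    if state.employed:
        outcome = "deny"
    elif state.documents_complete:
        outcome = "approve"
    else:
        outcome = "continue_review"
    # Approvals never escalate in this sketch; other outcomes do when harm is signalled.
    if outcome != "approve" and state.harm_signal:
        return "escalate_to_human"
    return "auto_" + outcome


lifecycle = [
    ClaimState(employed=False, documents_complete=False, harm_signal=False),  # claim without documents
    ClaimState(employed=False, documents_complete=True, harm_signal=False),   # documents provided
    ClaimState(employed=True, documents_complete=True, harm_signal=False),    # new employment reported
    ClaimState(employed=False, documents_complete=False, harm_signal=True),   # unemployed again, harm signalled
]
paths = [reassess(state) for state in lifecycle]
# paths == ["auto_continue_review", "auto_approve", "auto_deny", "escalate_to_human"]
```

Each update to Alex's data produces a fresh `ClaimState`, so eligibility and harm are re-evaluated together on every change rather than cached from an earlier stage.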
Human Oversight Escalation
The image below illustrates the human-in-the-loop approval flow within ATLAS, which is triggered when the policy engine determines that an action requires human oversight. The process begins with the end user initiating an action through the AI agent. Before execution, the agent submits the request, along with its full context, to the policy engine for evaluation. In this case, the policy engine determines that the action cannot safely proceed automatically. The request is therefore placed into an approval queue, and the agent pauses while awaiting a decision.
A human approver reviews the request through the oversight interface, with access to relevant context, risk indicators, and supporting information. Based on this review, the approver can either approve or reject the action. If the action is approved, the decision is returned to the policy engine, which authorises the agent to proceed. The agent then executes the action in the target system and confirms completion to the end user. If the action is rejected, the policy engine instructs the agent not to proceed, and the end user is informed that the action was not completed. This flow ensures that higher-risk or sensitive actions are subject to explicit human review before execution, enabling controlled decision-making while maintaining a clear separation between automated processing and human authority.
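The approve and reject branches of this flow can be sketched as a queue that executes an action only after an explicit human decision. The class and field names (`ApprovalRequest`, `ApprovalQueue`, `decide`), the action names, and the harm-signal labels are assumptions made for illustration; the real system may persist requests and notify the agent differently.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalRequest:
    request_id: int
    action: str
    context: dict
    status: str = "pending"  # pending -> approved | rejected


@dataclass
class ApprovalQueue:
    execute: Callable[[str], None]
    requests: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)

    def enqueue(self, action: str, context: dict) -> int:
        # The agent pauses here; nothing is executed yet.
        request_id = len(self.requests) + 1
        self.requests[request_id] = ApprovalRequest(request_id, action, context)
        self.audit.append((request_id, "queued"))
        return request_id

    def decide(self, request_id: int, approved: bool, approver: str) -> str:
        request = self.requests[request_id]
        if request.status != "pending":
            raise ValueError("request already decided")
        request.status = "approved" if approved else "rejected"
        self.audit.append((request_id, f"{request.status} by {approver}"))
        if approved:
            self.execute(request.action)  # only reached after human approval
            return "completed"
        return "not_completed"


executed = []
queue = ApprovalQueue(execute=executed.append)
first = queue.enqueue("deny_payment", {"claimant": "alex-001", "harm_signals": ["sole_income"]})
second = queue.enqueue("suspend_payment", {"claimant": "alex-001", "harm_signals": ["missing_documents"]})
result_first = queue.decide(first, approved=True, approver="sarah")
result_second = queue.decide(second, approved=False, approver="sarah")
```

The `audit` list records both the queuing and the decision with the approver's name, and the guard against deciding a request twice keeps a rejected action from being executed later by mistake.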
Broader Implications
ATLAS demonstrates how:
- Regulatory requirements can be translated into architectural patterns
- Agentic AI systems can be governed without halting innovation
- Human oversight can be embedded without destroying usability
