Controlling permissions and access in AI agents: a system design guide.

There is an AI agent running somewhere in your organisation right now that the security team does not know about. A developer built it with LangChain, containerised it with Docker, deployed it to Kubernetes, and pointed it at internal APIs and a database. It was shipped last sprint. The credentials are a service account borrowed from another application. Nobody is sure exactly what it can reach.

This is not a hypothetical. It is the situation most organisations are walking into as agent deployments accelerate faster than governance frameworks can respond. And the instinct, for both developers and security engineers, is to reach for familiar tools. Service accounts. OAuth tokens. RBAC policies. These have governed applications and APIs for years. The question is whether they work for agents, and if so, for which part of the problem.

The answer is: it depends on which layer of the agentic system you are trying to govern.

That framing is the starting point for everything that follows.

How Agents Are Actually Built

Before designing controls, it helps to understand the deployment landscape. Agents today arrive in three broad forms, each with different governance implications.

Cloud platform agents, built on Azure AI Foundry, AWS Bedrock, or Google Vertex AI Agent Builder, come with identity and governance scaffolding built into the platform. Managed identities, agent registries, IAM role bindings: the infrastructure is there to be configured. For security engineers, this is the easier case.

In-house agents built with LangChain, LangGraph, or similar frameworks and deployed on Kubernetes are a different situation entirely. They are just applications. There is no platform managing the identity layer. The governance scaffolding does not exist unless someone built it deliberately. From a security perspective, the developer’s view and the IAM engineer’s view of the same system often look like this:

Developer's view:            IAM engineer's view:
-----------------            --------------------
LangChain app                Who does this authenticate as?
calls Azure OpenAI           Hardcoded API key in a K8s secret
calls finance DB             Shared service account from another app
calls document store         No one knows
deployed to K8s              No registry record
it works                     No owner documented

Multi-agent pipelines add a third dimension: agents orchestrating other agents. An orchestrator agent decomposes a task and delegates to specialised sub-agents (a retrieval agent, a summarisation agent, a write-back agent), each with its own tool access, each passing context down the chain. The delegation logic is driven by the LLM, not by configuration. This is where governance gets genuinely hard.

Most production deployments today are in-house agents or multi-agent pipelines, not managed cloud agents. That is where the real governance problem lives.

The Three-Layer Model

The mistake most teams make is reaching for a control before understanding the structure of what they are governing. They apply IAM because it is familiar, declare the agent governed, and move on. The result is a framework that covers one layer well, partially covers another, and leaves the most dangerous layer entirely unaddressed while creating the impression that the problem is solved.

An agentic system has three structurally distinct layers, each with a different control problem:

+-------------------------------------------------------+
| Layer 1: Identity & Access                            |
|                                                       |
| Who is the agent?                                     |
| What is it permitted to reach?                        |
| Who owns it? What happens when it retires?            |
|                                                       |
| Control: provisioning & lifecycle                     |
+-------------------------------------------------------+

+-------------------------------------------------------+
| Layer 2: Orchestration                                |
|                                                       |
| What tools does the agent call?                       |
| In what order, based on what decisions?               |
| Who delegated to whom?                                |
|                                                       |
| Control: tool governance & audit                      |
+-------------------------------------------------------+

+-------------------------------------------------------+
| Layer 3: Reasoning & Execution                        |
|                                                       |
| What did the LLM actually decide?                     |
| What inputs did it process?                           |
| Was it operating on its intended task?                |
|                                                       |
| Control: runtime behaviour                            |
+-------------------------------------------------------+

Critically, the agent’s architecture determines which layer carries the most risk. A simple batch processing agent with fixed inputs carries most of its risk in Layer 1: govern the identity and the risk is relatively contained. A user-delegated agent processing sensitive documents carries most of its risk in Layer 2: the orchestration decisions matter as much as the access rights. A multi-agent pipeline ingesting external data sources carries most of its risk in Layer 3: prompt injection through those external inputs is the dominant threat.

The architecture drives the threat model. The threat model drives the control choice.

Layer 1: Applying IAM Rigorously

Layer 1 is IAM’s domain. The service account model, the same one used for application service accounts for years, is the correct baseline, applied deliberately to every agent in the estate.

The Service Account Model

Each agent gets its own service account, named to encode purpose and ownership:

svc-agt-{team}-{function}-{env}
e.g. svc-agt-finance-summariser-prod
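
A naming convention only helps if it is checked. The following is a minimal sketch of a validator for the pattern above. The allowed environment names (dev, test, prod) and the rule that segments are lowercase alphanumerics are assumptions for illustration, not part of the convention as stated.

```python
import re

# Assumption: segments are lowercase alphanumerics, the function segment may
# itself contain hyphens, and the environment is one of dev/test/prod.
SERVICE_ACCOUNT_PATTERN = re.compile(
    r"svc-agt-(?P<team>[a-z0-9]+)-"
    r"(?P<function>[a-z0-9]+(?:-[a-z0-9]+)*)-"
    r"(?P<env>dev|test|prod)"
)


def parse_service_account(name: str) -> dict[str, str]:
    """Split an agent service account name into team, function and env.

    Raises ValueError when the name does not follow the convention.
    """
    match = SERVICE_ACCOUNT_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(
            f"{name!r} does not follow svc-agt-{{team}}-{{function}}-{{env}}"
        )
    return match.groupdict()
```

A check like this could run in the provisioning workflow or in CI, so that a non-conforming account never reaches production.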

That account is linked to a named human owner, governed through your IGA tool (SailPoint, Saviynt, or equivalent), and granted access only through approved role group membership. When the agent is decommissioned, the account is revoked and access ends immediately. When the owner leaves the organisation, the orphaned account surfaces for reassignment.

For agents calling REST APIs, the service account maps to an OAuth 2.0 client registration. Autonomous agents, those not acting on behalf of a specific user, authenticate using the Client Credentials grant:

POST /token
grant_type=client_credentials
&client_id=svc-agt-finance-summariser
&client_secret=...
-> access_token scoped to approved permissions only
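
As a rough illustration, the request above could be assembled in Python with the standard library alone. The token URL, the helper name, and the explicit scope parameter are assumptions; the sketch sends the client credentials in the form body, as the exchange above does, although many identity providers prefer HTTP Basic authentication for the client.

```python
import urllib.parse
import urllib.request


def build_client_credentials_request(
    token_url: str, client_id: str, client_secret: str, scope: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 Client Credentials token request."""
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # Request only the scope this agent is approved for.
            "scope": scope,
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the returned object to urllib.request.urlopen would send it. In a real deployment the secret would come from a secret manager rather than application code.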

For agents acting on behalf of a user, Token Exchange (RFC 8693) is the right mechanism. It narrows the user’s existing token to only what the agent needs for the task. The downstream API receives a token that carries both identities:

{
  "sub": "user-123",
  "act": { "sub": "svc-agt-finance-summariser" },
  "scope": "read:finance-summary",
  "exp": 1714336700
}

The act claim makes the delegation explicit and auditable. The resource server can enforce that the agent is only operating within the scope the user delegated, not the full extent of the user’s permissions.
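
A minimal sketch of that resource-server check follows. It assumes the token signature has already been verified and the claims decoded into a dictionary; the function name and the approved_agents allow-list are hypothetical.

```python
def authorize_delegated_call(
    claims: dict, required_scope: str, approved_agents: set[str], now: int
) -> bool:
    """Decide whether a delegated call is allowed.

    Assumes `claims` comes from a token whose signature was already verified.
    """
    # Reject expired tokens.
    if claims.get("exp", 0) <= now:
        return False
    # The acting party must be a known, approved agent.
    actor = claims.get("act", {}).get("sub")
    if actor not in approved_agents:
        return False
    # OAuth scopes are a space-delimited string; require an exact match.
    granted = set(claims.get("scope", "").split())
    return required_scope in granted
```

With the token shown above, a call requiring read:finance-summary would pass, while a call requiring write:finance would be refused even if user-123 holds that permission directly.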

The Agent Registry

Before any tooling decision, the most important governance action is knowing what agents are deployed. Not in the abstract, but as a specific, governed record of every agent in the estate:

Name:            finance-summariser
Owner:           alice@corp.com
Purpose:         summarises finance reports on user request
Service account: svc-agt-finance-summariser-prod
Approved access: gApp_FinanceDB_Reader, gApp_Reports_Writer
Deployment:      k8s / finance namespace / prod
Last certified:  2025-06-01
Known gaps:      no behavioural controls beyond rate limiting

The last field, known gaps, is the honest field. Most teams cannot fill it in because they have not inventoried their agents at all. Building the registry, whether in SailPoint, a CMDB, or a governed spreadsheet, is the first governance act.

It also reveals something important: for in-house agents, the most critical governance control is often not technical. It is a policy: no agent deploys to production without a registry entry, a named owner, and a governed service account. That policy must precede the deployment pipeline, not be retrofitted to agents already running.
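
One way to make that policy enforceable is a pre-deployment check in the pipeline. The sketch below is illustrative only: RegistryEntry mirrors a subset of the fields in the record above, and deployment_blockers is a hypothetical helper, not a feature of any IGA or CMDB product.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    name: str
    owner: str
    purpose: str
    service_account: str
    approved_access: list[str] = field(default_factory=list)
    known_gaps: str = ""


def deployment_blockers(agent_name: str, registry: dict[str, RegistryEntry]) -> list[str]:
    """Return the reasons an agent may not deploy; an empty list means go."""
    entry = registry.get(agent_name)
    if entry is None:
        return ["no registry entry"]
    blockers = []
    if not entry.owner.strip():
        blockers.append("no named owner")
    # Assumption: governed agent accounts all carry the svc-agt- prefix.
    if not entry.service_account.startswith("svc-agt-"):
        blockers.append("no governed service account")
    return blockers
```

A pipeline step could fail the build whenever the returned list is non-empty, which places the policy ahead of deployment rather than after it.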

Layer 2: Partial Control at the Orchestration Layer

The orchestration layer is where the agent decides what tools to call, in what order, based on what the LLM returned. IAM has partial relevance here, and it is worth being precise about where the boundary lies.

What IAM contributes: Token scopes provide genuine enforcement at the tool-call level. If the agent’s orchestration logic attempts to call an API outside its approved scope, the resource server rejects the request, typically with 403 Forbidden and an insufficient_scope error (401 is reserved for a missing or invalid token). The access boundary holds regardless of what the LLM decided. This is a real constraint, not a theoretical one.

Short-lived, task-scoped tokens tighten this further. Rather than issuing a long-lived token at agent startup, issue tokens scoped to a specific task execution, valid only for the duration of that task, covering only the tools that task requires. When the task completes, the token expires or is revoked. The window in which a compromised agent can act is narrowed to the task itself.
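
The idea can be sketched as the claim set a token service might mint for a single task. Everything here is an assumption for illustration: the function name, the five-minute default lifetime, and the use of a jti claim as a handle for revocation. Signing and issuance are left out.

```python
import uuid


def task_token_claims(
    agent_id: str,
    task_scopes: list[str],
    approved_scopes: list[str],
    now: int,
    ttl_seconds: int = 300,
) -> dict:
    """Build claims for a token limited to one task's scopes and lifetime."""
    requested = set(task_scopes)
    not_approved = requested - set(approved_scopes)
    if not_approved:
        # A task may never widen what the agent was approved for.
        raise PermissionError(f"scopes not approved for agent: {sorted(not_approved)}")
    return {
        "sub": agent_id,
        "scope": " ".join(sorted(requested)),
        "iat": now,
        "exp": now + ttl_seconds,
        "jti": str(uuid.uuid4()),  # unique ID so the token can be revoked early
    }
```

The token carries the intersection of what the task needs and what the agent is approved for, so a task that reads one report never holds the write scope.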

In multi-agent pipelines, Token Exchange carries the delegation chain. The act claim passes through each hop. In principle, the full chain is auditable: orchestrator delegated to sub-agent, sub-agent called this API, on behalf of this user. In practice, most deployments do not implement nested act claims beyond one level. Building this from the start is worth the investment.
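
To show what an auditable chain looks like, here is a small sketch that flattens nested act claims into an ordered list. RFC 8693 places the current actor in the outermost act claim and prior actors in the nested ones; the agent names in the usage note are hypothetical.

```python
def delegation_chain(claims: dict) -> list[str]:
    """Return [user, first delegate, ..., current actor] from nested act claims."""
    actors = []
    act = claims.get("act")
    while isinstance(act, dict):
        actors.append(act.get("sub", "<unknown>"))
        act = act.get("act")
    # The outermost act is the current actor, so reverse to get oldest first.
    return [claims["sub"], *reversed(actors)]
```

For a token whose act claim names a retrieval agent and nests an orchestrator inside it, the result reads user, orchestrator, retrieval agent, which is the order an investigator would want in an audit trail.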

An agent manifest, a declared list of tools the agent is approved to use, submitted as part of the deployment process, extends the governance model without changing the access control mechanics. It does not prevent an agent from attempting unapproved calls; the token scopes do that. But it creates an auditable contract: if the agent calls something not in its manifest, that is an anomaly signal.
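
Checking observed calls against the manifest can be very simple. The following sketch assumes tool calls are available as a list of tool names taken from audit logs; the function and tool names are hypothetical.

```python
def manifest_anomalies(manifest_tools: list[str], observed_calls: list[str]) -> list[str]:
    """Return the tools that were called but never declared in the manifest."""
    return sorted(set(observed_calls) - set(manifest_tools))
```

An empty result means the agent stayed within its declared contract. Anything else is a signal to investigate, not proof of compromise.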

What IAM cannot do: The orchestration layer makes decisions. IAM governs what those decisions can access, not whether the decisions are correct, intended, or consistent with the user’s original task. An agent that calls the right APIs in the wrong order, or orchestrates a sequence of individually permitted calls that together produce a harmful outcome, operates entirely within valid token scopes. IAM has nothing to say about decision quality.

The gap stated plainly: IAM controls the ceiling of what the orchestration layer can reach. It does not control the floor of what it actually does within that ceiling.

Layer 3: Where IAM Does Not Apply

The reasoning and execution layer is where the LLM processes inputs, generates decisions, and drives the orchestration layer. This is outside IAM’s scope by design, and understanding why matters as much as understanding the gaps themselves.

IAM controls access to resources. It does not control what an agent does with the inputs it receives, what reasoning the model applies, or whether the agent’s behaviour is consistent with its intended purpose. Four structural gaps live in this layer that no access control model can address:

1. The system prompt is invisible to IAM. Two agents with identical service accounts, identical role group memberships, and identical token scopes can have completely different effective capabilities if their system prompts differ. One agent is instructed to summarise documents conservatively and flag anomalies for human review. Another is instructed to extract financial data and forward summaries to an external endpoint. From the perspective of every access control in the stack, these two agents are identical. The difference lives in a string of text in the application code.

2. Volume and frequency are invisible. The access token says yes to each API call individually. An agent that reads one database record and an agent that reads one million records present identical credentials on each call. Both have valid tokens. The difference between normal operation and a data exfiltration event is invisible to IAM.

3. Prompt injection bypasses the access model entirely. An attacker embeds malicious instructions in data the agent processes: a document it summarises, an email it reads, a record it analyses. The agent treats those instructions as legitimate and acts on them using its validly provisioned access, with correct credentials, within its approved scopes. The token is valid. The behaviour was never intended by the owner, never approved through any IAM process, and cannot be detected or prevented by any IAM control.

4. Cross-agent delegation decisions are not governed by the access model. In a multi-agent pipeline, the orchestrator’s decision about which sub-agent to invoke and what context to pass is made by the LLM. IAM can audit that a token was issued and that an approved API was called. It cannot audit whether the delegation decision itself was consistent with the user’s original intent.

The practical implication: an organisation that invests heavily in Layer 1 governance and treats it as a defence against prompt injection has misunderstood the problem. Naming the gap accurately is itself a governance act.

Partial Controls for Layers 2 and 3

No complete solution for Layer 3 exists today. The field is actively assembling one. What follows are the controls available now, named with what they actually cover:

Rate limiting at the API gateway caps the blast radius of a runaway or compromised agent. Configure per agent client ID, per time window. It does not understand intent or detect anomalies, but it limits what misbehaviour can accomplish per unit of time. Available today in most enterprise gateway products with low configuration overhead.
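
In practice this is gateway configuration, not application code, but the mechanism is easy to sketch. The fixed-window counter below is one simple approach among several (token bucket and sliding window are common alternatives); the class name and limits are illustrative assumptions.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `max_calls` per client ID in each fixed time window."""

    def __init__(self, max_calls: int, window_seconds: int) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        # (client_id, window index) -> calls seen. Old windows are never
        # purged in this sketch; a real limiter would expire them.
        self._counts: dict[tuple[str, int], int] = defaultdict(int)

    def allow(self, client_id: str, now: float) -> bool:
        key = (client_id, int(now // self.window_seconds))
        if self._counts[key] >= self.max_calls:
            return False
        self._counts[key] += 1
        return True
```

Because the key includes the client ID, one runaway agent exhausts only its own budget and does not starve other agents behind the same gateway.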

Scope minimisation is the single highest-leverage Layer 1/2 control. The narrower the approved scopes, the smaller the surface area that bad behaviour in Layers 2 and 3 can reach. Not read:finance but read:finance-summary:own-department. Not write:documents but write:documents:task-output-folder:append-only. Each reduction limits what a compromised agent can accomplish regardless of which layer the compromise originates in.

Human-in-the-loop gates for high-consequence actions (financial transactions, bulk data exports, external communications) introduce a meaningful control at the points that matter most. A gate breaks full autonomy, which is a tradeoff. For high-risk actions in regulated environments, it is the right tradeoff. The approval event is also an audit record.
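
A gate of this kind can be sketched as a wrapper around tool execution. The category names, the run_gated function, and the approve callback (which in practice might open a ticket or prompt a reviewer) are all hypothetical.

```python
from typing import Any, Callable

# Assumption: action categories are assigned by the tool wrapper, not the LLM.
HIGH_CONSEQUENCE_CATEGORIES = {
    "financial_transaction",
    "bulk_data_export",
    "external_communication",
}


def run_gated(
    category: str, action: Callable[[], Any], approve: Callable[[str], bool]
) -> dict:
    """Run `action`, asking a human first when the category is high-consequence."""
    needs_approval = category in HIGH_CONSEQUENCE_CATEGORIES
    if needs_approval and not approve(category):
        # The action is never executed; the refusal itself is worth logging.
        return {"category": category, "status": "blocked", "human_approved": False, "result": None}
    return {
        "category": category,
        "status": "executed",
        "human_approved": needs_approval,
        "result": action(),
    }
```

The returned dictionary doubles as the audit record of the approval event, including the case where a reviewer said no.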

Comprehensive audit logging is the single highest-value investment available for organisations that have no Layer 3 controls today. Agent identity, owner, task context, tool calls, input summaries, and token claims, logged together for every agent execution. This does not prevent anything. It makes incidents attributable, patterns detectable over time, and investigations reconstructable. The data collected now is the baseline from which anomaly detection can eventually be built.
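
As a rough illustration, one audit record per tool call could be emitted as a single JSON line. The field names are assumptions. The sketch logs selected token claims, never the raw token, and an input summary in place of the full input.

```python
import json


def audit_record(
    agent_id: str,
    owner: str,
    task_id: str,
    tool: str,
    input_summary: str,
    token_claims: dict,
    timestamp: str,
) -> str:
    """Serialise one tool call as a JSON line for the audit log."""
    record = {
        "timestamp": timestamp,
        "agent_id": agent_id,
        "owner": owner,
        "task_id": task_id,
        "tool": tool,
        "input_summary": input_summary,
        # Keep only the claims needed for attribution; never log the token itself.
        "token_claims": {
            key: token_claims[key]
            for key in ("sub", "act", "scope", "exp")
            if key in token_claims
        },
    }
    return json.dumps(record, sort_keys=True)
```

Keeping identity, task context, and tool call in one record is the point: an investigator should not need to join three log sources to reconstruct what happened.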

Input and output validation raises the cost of prompt injection attacks without eliminating them. Signature-based detection does not catch novel techniques. It is a partial mitigation: meaningful, worth implementing, and honest about its limits.
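
To make the limitation concrete, here is a deliberately simple signature-based check. The two patterns are illustrative assumptions, not a vetted rule set, and the sketch exists mainly to show how easily rephrased instructions slip past it.

```python
import re

# Illustrative signatures only; a real deployment would maintain and test
# a much larger, regularly updated set.
INJECTION_SIGNATURES = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
]


def flag_suspicious(text: str) -> list[str]:
    """Return the signature patterns that match the given input text."""
    return [sig.pattern for sig in INJECTION_SIGNATURES if sig.search(text)]
```

An input containing "ignore previous instructions" is flagged, but "forget what you were told earlier" passes untouched, which is why this control raises attacker cost without closing the gap.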

A Practical Governance Posture for Today

The goal is not a perfect future state. It is a defensible present state, built on the three-layer model.

For Layer 1: Every deployed agent has a named owner and a registry entry. Service accounts follow a naming convention that encodes team, function, and environment. Access is granted through approved role groups using the same workflow as human access. OAuth Client Credentials or Token Exchange is used depending on whether the agent is autonomous or user-delegated. Agents are included in access certification cycles, not exempted from them. Decommissioning is tied to agent retirement.

For Layer 2: Rate limiting is configured per agent client ID. Scopes are as narrow as the resource server supports. Short-lived task-scoped tokens are implemented where the orchestration infrastructure allows. Human-in-the-loop gates are in place for the highest-consequence action categories. Audit logging captures agent identity, task context, and tool calls together.

For Layer 3: Document the accepted gaps explicitly in the risk register:

Control:          IAM / service account governance
Covers:           Layer 1: identity, access, lifecycle
Partially covers: Layer 2: tool call scope enforcement
Does not cover:   Layer 3: runtime behaviour,
                  prompt injection, intent verification
Residual risk:    agent misbehaviour within granted access
Mitigations:      rate limiting, audit logging, scope minimisation
Accepted gap:     prompt injection, system prompt invisibility
                  (no current enterprise control)
Review date:      [quarterly]

An organisation that can produce this document for every agent in its estate is in a materially better governance position than one that assumes the problem is handled. The three-layer model gives that documentation a precise, defensible structure.

The Right Question to Ask

The governance frameworks for agents are being assembled in real time. No complete solution exists. The organisations best positioned to adopt mature controls when they arrive are those that understood the structure of the problem before the tools were ready.

The question to ask about every agent in your estate is not “do we have IAM controls?” It is: which layer carries the primary risk for this agent, and do we have the right control for that layer?

Layer 1 is governed today using tools that exist today. Apply them. Every agent in your estate should have a named owner, a governed service account, and a registry entry. Most do not. That is the gap to close first.

Layer 2 is partially governed through token design, orchestration discipline, and API gateway controls. Implement what is available. Know what it cannot do.

Layer 3 is not governed by any mature enterprise tooling yet. Build the audit infrastructure now, before the detection and prevention tooling exists to consume it. The organisations doing that work today will adopt the next generation of controls cleanly when they arrive.

Govern what you can govern, at the layer it belongs to. Know what you cannot. Document both.