AI Insights · Jul 01, 2026 · 5 min read

Least Privilege for AI Agents at 144:1 NHI Scale

Cory Waddingham

Article Index

The scale problem is already here
Where RBAC breaks for agents
What we do instead: agent-as-principal, scoped per endpoint
How to set up per-endpoint least privilege for AI agents
The audit trail is the point
What teams get wrong the first time
Lessons and trade-offs

Plain RBAC was designed for humans with stable roles, and it stops being a useful security model the moment you have thousands of agents instead of thousands of employees. Guild enforces least privilege for agents that call internal tools at the control plane: every agent is its own principal, and its permissions are scoped per endpoint rather than per role. At agent scale, the question that matters is which endpoints a given agent is allowed to call, and a role label can't answer it.

The short version: traditional RBAC assigns permissions by role. Agents don't have stable roles. They have tasks, and those tasks require specific endpoints on specific integrations. Per-endpoint scoping makes each agent a first-class principal with its own identity, its own brokered credentials, and its own audit trail. The blast radius of a compromised agent drops from "everything the role can do" to "the two endpoints it was granted." That's the difference between a manageable incident and a breach.

This post is for the platform and security leaders who have to defend that distinction in a compliance review. We'll walk through where RBAC breaks for agents, how per-endpoint scoping works inside Guild's control plane, and what the resulting audit trail gives you when an auditor or a CISO comes asking.

The scale problem is already here

The enterprise identity baseline has changed faster than most access models have. Entro Security's NHI & Secrets Risk Report H1 2025 found the non-human-identity-to-human ratio jumped from 92:1 to 144:1 in a single year, a 56% increase, on a population that itself grew 44% year over year. Inside that population, 1 in 20 AWS machine identities carries full-admin privileges, what Entro calls "Super NHIs." And 43% of exposed secrets now surface outside source code, in CI/CD logs, Slack, Teams, and Jira tickets, where shift-left scanning never sees them.

Now layer LLM-driven agents on top. Gartner projects that 40% of enterprise applications will embed task-specific AI agents by the end of 2026, up from less than 5% in 2025. That's at least an eight-fold jump in a single year.

Okta's agent-sprawl explainer, citing Salesforce's 2026 Connectivity Benchmark Report, puts the average enterprise at 12 AI agents already in use, with half of them operating in isolated silos. CSA's 2026 agentic identity work reports that 51% of organizations have no clear ownership of AI identities at all.

Those datapoints are the stakes. We're wiring a new class of principal into the same RBAC model that was designed for an org chart, and that principal is far more numerous than the humans, far less owned, and frequently inheriting full user-level access by default.

Where RBAC breaks for agents

RBAC works well at human scale because human roles are relatively stable. You're a Backend Engineer, you're a Support Lead, and the operations you need to perform inside any given system are well-described by that role for months at a time. Both Okta and Cequence make this point explicitly in their least-privilege-for-agents guidance: RBAC assigns permissions by stable organizational role, which doesn't map to agents whose access requirements vary by task, resource, and condition within a single session.

In our experience the model breaks in three concrete ways once you put agents inside it.

Granularity

RBAC was built around roles, and roles work well at human scale. The model starts to break when you need permissions scoped to specific operations rather than broad categories. At tens or hundreds of thousands of agents, mapping fine-grained access to role hierarchies produces exactly the kind of sprawl RBAC was supposed to prevent.

Composition

This is the failure mode Veza Chief Security & Trust Officer Mike Towers describes in his NHI Summit 2025 session on agentic AI and non-human identity risks: as AI agents become composite users that interact across multiple systems, the identity model has to evolve beyond per-permission grants to address the combined access pattern. Standard RBAC evaluates permissions individually. It can't see what happens when a multi-tool chain combines read_email and access_calendar to act on the contents of one through the other. Each permission looks harmless. The combination is not.
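To make the composition gap concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any vendor's implementation: the `TOXIC_COMBINATIONS` table, the `INDIVIDUALLY_ALLOWED` set, and both function names are hypothetical, and only `read_email` and `access_calendar` come from the text above. It contrasts a per-permission check with a check over the combined grant set.

```python
# Hypothetical illustration: per-permission checks vs. a combined-access check.

# Combinations judged risky when held together (assumed policy data).
TOXIC_COMBINATIONS = [
    frozenset({"read_email", "access_calendar"}),
]

# Permissions that look harmless when evaluated one at a time (assumed).
INDIVIDUALLY_ALLOWED = {"read_email", "access_calendar", "repos_list"}


def check_individually(granted: set[str]) -> bool:
    """Per-permission evaluation: each grant is judged on its own."""
    return all(p in INDIVIDUALLY_ALLOWED for p in granted)


def find_toxic_combinations(granted: set[str]) -> list[frozenset[str]]:
    """Return every risky combination fully contained in the grant set."""
    return [combo for combo in TOXIC_COMBINATIONS if combo <= granted]


grants = {"read_email", "access_calendar"}
passes_individual_check = check_individually(grants)
flagged = find_toxic_combinations(grants)
```

In this sketch the grant set passes the individual check and is still flagged by the combination check, which is the gap described above.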

Blast radius under prompt injection

OWASP's Top 10 for LLM Applications ranks prompt injection as LLM01 in both the 2023-24 and 2025 editions, and lists "excessive agency" as LLM06 in the 2025 update. Teleport's analysis frames the operational consequence cleanly: prompt injection doesn't bypass infrastructure controls, it steers an authorized agent into taking actions it shouldn't.

The 2025 EchoLeak vulnerability in Microsoft 365 Copilot (CVE-2025-32711), the first reported zero-click prompt-injection exploit against a production AI assistant, is a named, recent example. The damage prompt injection can do is bounded by the breadth of the agent's scope. A role-scoped agent gives an attacker the entire role. An endpoint-scoped agent gives them one endpoint.

NIST and NCCoE have started naming this in their identity work as well. CSA's research note on the emerging federal framework summarizes the NCCoE position: agents that inherit full user-level permissions, or that run under generic service accounts, are anti-patterns enterprises must move away from. NIST SP 800-53 Rev. 5 already codifies least privilege as control AC-6. The direction of travel is set. The question is whether the access model you have today can actually enforce it.

At agent scale, the question that matters is which endpoints a given agent is allowed to call. A role label can't answer it.

What we do instead: agent-as-principal, scoped per endpoint

The shortest way we describe Guild's identity model to a CISO is "IAM for AI." Every agent is a first-class principal. It has its own identity, its own scoped credentials, and its own audit trail. There are no shared service accounts in the model, and there is no path by which an agent ends up running under a human user's token.

From there, three properties matter.

Guild brokers credentials; agents never see raw keys

The control plane sits in the path of every call. Auth is configured once at the organization level, and agents receive brokered access to the underlying services. Agents never handle credentials directly: when an agent invokes a service tool, the platform automatically provides the organization's credentials without the agent touching the raw key. This is the part that addresses, at the architectural level, the problem behind the Entro figure above (43% of exposed secrets surfacing outside source code): if the agent never holds the secret, it can't leak the secret into a CI log, a Slack thread, or a Jira ticket.
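The brokering pattern can be sketched in a few lines of Python. This is an assumption-laden illustration rather than Guild's API: `CredentialBroker`, `ToolResult`, `invoke`, and the placeholder token are all hypothetical, and the upstream call is simulated. The point it shows is structural: the secret lives only inside the broker, and nothing returned to the agent contains it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolResult:
    status: int
    body: str


class CredentialBroker:
    """Holds org-level secrets; agents pass only their identity and a request."""

    def __init__(self, org_secrets: dict[str, str]) -> None:
        self._org_secrets = org_secrets  # never returned to callers

    def invoke(self, agent_id: str, integration: str, endpoint: str) -> ToolResult:
        secret = self._org_secrets.get(integration)
        if secret is None:
            return ToolResult(404, f"no credentials configured for {integration}")
        return self._call_upstream(agent_id, integration, endpoint, secret)

    @staticmethod
    def _call_upstream(agent_id: str, integration: str, endpoint: str,
                       secret: str) -> ToolResult:
        # A real broker would attach `secret` to the outbound request here.
        # This sketch simulates the call and never echoes the secret back.
        return ToolResult(200, f"{integration}.{endpoint} called for {agent_id}")


# Placeholder value, not a real token.
broker = CredentialBroker({"github": "ghp_example_not_a_real_token"})
result = broker.invoke("release-agent", "github", "pulls_create")
missing = broker.invoke("release-agent", "jira", "issues_list")
```

The agent-side code only ever sees `ToolResult`, so there is no raw key for it to write into a log or a ticket.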

Permissions are declared at the endpoint, not the role

For any given integration, whether that's GitHub, Slack, Jira, or an internal REST API registered through Bring Your Own Integration, Guild exposes the underlying operations as a typed surface (repos_list, pulls_create, issues_list, and so on) and lets you allow or deny each one per agent. The unit of permission is (agent, endpoint, conditions), not (role, service).

Because permission is scoped to the individual endpoint, you can allow only certain workflows to call certain internal tools: an agent that runs a release workflow can be granted pulls_create while being denied every endpoint on the billing or customer-data integrations it never needs. This composes cleanly with existing role-based governance rather than replacing it. Human roles still run your org chart; per-endpoint permissions run your agents inside that. Sensitive workflows can be gated to specific teams by binding the agents that run them to the team that owns them, so a high-risk integration is reachable only by agents under that team's control.
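As a rough sketch of what an (agent, endpoint, conditions) check could look like, consider the Python below. It is not Guild's configuration format or API: `AgentPolicy`, `is_allowed`, the `acme/web` repository, and the `billing` integration name are hypothetical, and conditions are modeled as plain predicates over a request context. Only the endpoint names `repos_list` and `pulls_create` come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable

# A condition is a predicate over the request context (assumed shape).
Condition = Callable[[dict], bool]


@dataclass
class AgentPolicy:
    agent_id: str
    # Maps (integration, endpoint) to the conditions that must all hold.
    allowed: dict[tuple[str, str], list[Condition]] = field(default_factory=dict)

    def is_allowed(self, integration: str, endpoint: str, context: dict) -> bool:
        conditions = self.allowed.get((integration, endpoint))
        if conditions is None:
            return False  # anything not listed is denied by default
        return all(cond(context) for cond in conditions)


release_policy = AgentPolicy(
    agent_id="release-agent",
    allowed={
        ("github", "repos_list"): [],
        ("github", "pulls_create"): [lambda ctx: ctx.get("repo") == "acme/web"],
    },
)

can_open_pr = release_policy.is_allowed("github", "pulls_create", {"repo": "acme/web"})
wrong_repo = release_policy.is_allowed("github", "pulls_create", {"repo": "acme/billing"})
billing_call = release_policy.is_allowed("billing", "invoices_list", {})
```

Keying the lookup on the (integration, endpoint) pair rather than on a role is what keeps the deny-by-default branch trivial: an unlisted endpoint simply has no entry.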

Every model call and every tool call is logged

LLM access goes through the control plane, where daily token-budget caps are enforced. Tool calls go through the same path. Guild captures an event log for every agent session that includes LLM calls, tool invocations, sub-task spawns, errors, and lifecycle transitions. The audit log provides a tamper-evident, read-only record of administrative actions across the organization. Nothing the agent does happens outside the supervised boundary.
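One way to picture a daily token-budget cap combined with per-call event logging is the sketch below. The class name `MeteredGateway`, the event fields, and the cap of 1,000 tokens are assumptions made for illustration; the source does not describe how Guild implements either mechanism.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MeteredGateway:
    """Hypothetical gateway: enforces a per-agent daily cap and logs each call."""
    daily_cap: int
    used: dict = field(default_factory=dict)    # (agent_id, day) -> tokens spent
    events: list = field(default_factory=list)  # append-only event records

    def llm_call(self, agent_id: str, tokens: int, today: date) -> bool:
        key = (agent_id, today)
        spent = self.used.get(key, 0)
        allowed = spent + tokens <= self.daily_cap
        if allowed:
            self.used[key] = spent + tokens
        # Every attempt is recorded, whether it was allowed or denied.
        self.events.append({
            "agent": agent_id,
            "type": "llm_call",
            "tokens": tokens,
            "day": today.isoformat(),
            "outcome": "allow" if allowed else "deny",
        })
        return allowed


gateway = MeteredGateway(daily_cap=1000)
day_one = date(2026, 7, 1)
first = gateway.llm_call("release-agent", 600, day_one)
second = gateway.llm_call("release-agent", 600, day_one)  # would exceed the cap
next_day = gateway.llm_call("release-agent", 600, date(2026, 7, 2))
```

Because the denied call is logged alongside the allowed ones, a budget cutoff shows up in the record as an explicit outcome rather than as a silent failure.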

How to set up per-endpoint least privilege for AI agents

The setup follows the same order every time:

1. Configure auth once at the organization level, so credentials are brokered centrally and no agent ever holds a raw key. (Credentials docs)
2. Register each integration the agent needs, including internal tools added through Bring Your Own Integration, so their operations are exposed as a typed endpoint surface. (Integrations docs)
3. Declare the allowed endpoints per agent as (agent, endpoint, conditions), allowing only the specific operations that agent's workflow requires and denying the rest by default.
4. Gate sensitive workflows to a team by binding the agents that run them to that team, so a high-risk integration is reachable only by agents the team owns.
5. Set daily token budgets and the audit-log retention and export rules for the agent. (LLM settings docs)
6. Commit the whole declaration to version control alongside the agent definition, so every permission change is reviewed in a PR.

The practical shape of the configuration is governance-as-code. We keep that configuration in version control alongside the agent definition, which means the permission surface is reviewed in PRs the same way the agent's behavior is. We've found that to be the single highest-leverage habit on a rollout. If a permission change doesn't go through code review, it won't stay scoped for long.

Concretely, a single agent's declaration names the agent identity, the one integration it's bound to, the explicit list of endpoints it may call on that integration, the conditions attached to each call, the token budgets, and the retention rules for its audit log. Every endpoint not listed is denied. That declaration is the reviewable unit: a platform lead reading the PR can see exactly which internal tools the agent can reach and which it cannot, without running it.
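The fields named above can be modeled as a small data structure, which also makes the PR-review step mechanical. The following Python sketch is hypothetical throughout: `AgentDeclaration`, `EndpointGrant`, `permission_diff`, the condition string, and the budget and retention numbers are placeholders, not Guild's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointGrant:
    endpoint: str
    conditions: tuple[str, ...] = ()  # human-readable conditions (assumed form)


@dataclass(frozen=True)
class AgentDeclaration:
    agent_id: str
    integration: str
    grants: tuple[EndpointGrant, ...]
    daily_token_budget: int
    audit_retention_days: int

    def endpoints(self) -> frozenset[str]:
        # Anything not in this set is treated as denied.
        return frozenset(g.endpoint for g in self.grants)


def permission_diff(old: AgentDeclaration, new: AgentDeclaration) -> dict[str, list[str]]:
    """Summarize which endpoints a change adds or removes, for review."""
    return {
        "added": sorted(new.endpoints() - old.endpoints()),
        "removed": sorted(old.endpoints() - new.endpoints()),
    }


before = AgentDeclaration(
    "release-agent", "github", (EndpointGrant("repos_list"),), 50_000, 365
)
after = AgentDeclaration(
    "release-agent", "github",
    (EndpointGrant("repos_list"),
     EndpointGrant("pulls_create", ("repo == 'acme/web'",))),
    50_000, 365,
)
diff = permission_diff(before, after)
```

A reviewer reading `diff` sees the change in reachable endpoints directly, without having to run the agent.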

The audit trail is the point

Per-endpoint scoping earns its keep in a compliance review through the audit record it produces. Kiteworks and Panther both make this point: infrastructure logs, orchestration logs, and inference logs don't satisfy SOC 2, ISO 27001, HIPAA, PCI DSS, or GDPR audit-trail requirements on their own. None of those logs record what regulated data was accessed, by which specific agent, under what authorization, at what time, with what policy outcome.

Guild's audit logs are tamper-evident and read-only: entries cannot be modified or deleted. Each entry attributes a specific action to a specific actor, names the target resource, and records the timestamp. Session event logs provide the execution-level detail: every LLM call, every tool invocation, every error, linked to the agent that produced them.
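The source does not say how Guild achieves tamper evidence. One common technique is a hash chain, where each entry's hash covers the previous entry's hash, so editing or deleting an earlier record invalidates everything after it. The sketch below illustrates that general technique with Python's standard `hashlib` and `json` modules; the function names, field names, and sample values are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # starting value for the chain (assumed convention)


def _digest(prev_hash: str, entry: dict) -> str:
    # Serialize deterministically so the same entry always hashes the same way.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append_entry(log: list[dict], actor: str, action: str,
                 target: str, timestamp: str) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"actor": actor, "action": action,
             "target": target, "timestamp": timestamp}
    log.append({**entry, "hash": _digest(prev_hash, entry)})


def verify(log: list[dict]) -> bool:
    """Recompute the chain; any edited or removed record breaks it."""
    prev_hash = GENESIS
    for record in log:
        entry = {k: v for k, v in record.items() if k != "hash"}
        if record["hash"] != _digest(prev_hash, entry):
            return False
        prev_hash = record["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, "release-agent", "pulls_create",
             "acme/web", "2026-07-01T12:00:00Z")
append_entry(audit_log, "admin@example.com", "policy_update",
             "release-agent", "2026-07-01T12:05:00Z")
intact = verify(audit_log)

tampered = [dict(r) for r in audit_log]
tampered[0]["target"] = "acme/billing"  # simulate an after-the-fact edit
detected = not verify(tampered)
```

Each record carries the actor, action, target, and timestamp described above, and the chain is what lets a reviewer check that none of them changed after the fact.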

When something goes wrong, that's the record the auditor wants. It's also the record your own incident reviewer wants. When nothing goes wrong, the same record is what lets you prove the agent only did what it was allowed to do.

This is the closed loop. Per-endpoint scoping by itself is a policy. Per-endpoint scoping enforced in the runtime, with an attributable audit trail tied to a single agent principal, is a control.

What teams get wrong the first time

We've watched enough first least-privilege rollouts to know the most common mistake. Teams grant permissions iteratively until the agent runs without errors, then ship it. "It works" is a functional test. It tells you nothing about the security boundary. The agent ends up with whatever it needs to stop failing, which is almost never the minimum it should have.

The fix is to invert the order. Start with the smallest plausible set of endpoints the agent needs to complete its task. Run it. Let it fail. Read the denials in the audit log, and add a specific endpoint only if the action the agent was attempting is one you would consciously authorize. Two things follow. The final permission set is defensible, because a human deliberately added every entry on it. And the denial log becomes a useful artifact in its own right: it tells you what your agents were asked to do but weren't allowed to do, which is one of the few signals that catches prompt injection attempts before they become incidents.
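Reading the denials is easier when they are grouped. The sketch below assumes audit events are available as dictionaries with `agent`, `endpoint`, and `outcome` keys; that shape, the function name `summarize_denials`, and the sample events are hypothetical. It produces a per-agent, per-endpoint count for a human to review before any endpoint is added.

```python
from collections import Counter


def summarize_denials(events: list[dict]) -> list[tuple[str, str, int]]:
    """Count denied calls per (agent, endpoint), most frequent first."""
    counts = Counter(
        (e["agent"], e["endpoint"]) for e in events if e["outcome"] == "deny"
    )
    return sorted(
        ((agent, endpoint, n) for (agent, endpoint), n in counts.items()),
        key=lambda row: (-row[2], row[0], row[1]),
    )


sample_events = [
    {"agent": "release-agent", "endpoint": "pulls_create", "outcome": "allow"},
    {"agent": "release-agent", "endpoint": "repos_delete", "outcome": "deny"},
    {"agent": "release-agent", "endpoint": "repos_delete", "outcome": "deny"},
    {"agent": "triage-agent", "endpoint": "issues_update", "outcome": "deny"},
]
review_queue = summarize_denials(sample_events)
```

The output is a review queue, not an approval list: a repeated denial for an endpoint the workflow never needed is a reason to investigate, not a reason to grant it.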

The other early mistake is using one credential across many agents because it's easier to provision. That collapses the identity model back into the shared-service-account anti-pattern NIST and NCCoE name explicitly. The whole point of agent-as-principal is one agent, one identity, one scoped credential, one row in the audit log. If two agents share a key, you no longer know which one made the call.
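A shared key is easy to detect if credential bindings are recorded anywhere. As a minimal sketch, assuming a mapping from agent ID to credential ID (the mapping, the function name, and the sample IDs are hypothetical), the check is a simple inversion:

```python
from collections import defaultdict


def find_shared_credentials(bindings: dict[str, str]) -> dict[str, list[str]]:
    """Given agent_id -> credential_id, return credentials used by 2+ agents."""
    agents_by_credential: dict[str, list[str]] = defaultdict(list)
    for agent_id, credential_id in bindings.items():
        agents_by_credential[credential_id].append(agent_id)
    return {
        credential_id: sorted(agents)
        for credential_id, agents in agents_by_credential.items()
        if len(agents) > 1
    }


bindings = {
    "release-agent": "cred-a",
    "triage-agent": "cred-b",
    "docs-agent": "cred-a",  # shares a credential with release-agent
}
shared = find_shared_credentials(bindings)
```

Any non-empty result means at least one call in the audit log can no longer be attributed to a single agent.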

Lessons and trade-offs

This pattern works when you have more than a handful of agents, when those agents touch services that matter, and when you have or will have a compliance obligation to explain what they did. It's over-engineered for a single developer with one weekend agent calling one API.

There are real trade-offs. Per-endpoint declarations are more verbose than role assignments: that's the cost of granularity. Token-budget caps will occasionally cut off an agent mid-task, and that surfaces as a deny in the audit log. Teams have to be willing to treat that as a feature rather than a bug.

The consistent thing we tell platform and security leaders: the control plane is where enforcement lives. RBAC at the role level still runs your org chart. Per-endpoint scoping, brokered credentials, and an attributable audit trail are what run your agents. The two coexist. They don't replace each other.

Your agents are infrastructure now.

Guild is the control plane that enforces least privilege, brokers credentials, and produces the audit trail your compliance team actually needs. See it in action.


Assign each agent its own identity as a first-class principal, scope its permissions to the specific endpoints it needs on each integration, broker credentials so the agent never holds raw keys, and log every call in a tamper-evident audit trail. Start with the smallest plausible set of endpoints the agent needs, let it fail, and add permissions only when you would consciously authorize the action.

RBAC assigns permissions by stable organizational role, which works for humans who hold the same role for months. Agents don't have stable roles. They have tasks that vary by resource and condition within a single session. At 144:1 NHI-to-human ratios, mapping fine-grained agent access to role hierarchies produces the kind of sprawl RBAC was supposed to prevent, and standard RBAC can't evaluate the combined risk when an agent chains multiple permissions across systems.

Per-endpoint scoping means the unit of permission is (agent, endpoint, conditions) rather than (role, service). Instead of granting an agent the "Backend Engineer" role with access to everything GitHub offers, you grant it repos_list and pulls_create on one repository and deny everything else. Every endpoint not explicitly listed is denied by default.

Entro Security's NHI & Secrets Risk Report H1 2025 found the non-human-identity-to-human ratio jumped from 92:1 to 144:1 in a single year. Within that population, 1 in 20 AWS machine identities carries full-admin privileges ("Super NHIs"), and 43% of exposed secrets now surface outside source code.

OWASP ranks prompt injection as LLM01 and "excessive agency" as LLM06 in the 2025 Top 10 for LLM Applications. Prompt injection steers an authorized agent into taking actions it shouldn't. The damage is bounded by the agent's scope: a role-scoped agent gives an attacker the entire role, while an endpoint-scoped agent limits exposure to the specific endpoints granted.

SOC 2, ISO 27001, HIPAA, PCI DSS, and GDPR all require audit trails that record what data was accessed, by which identity, under what authorization, at what time. Infrastructure and orchestration logs alone don't satisfy this. You need per-agent attribution, tamper-evident records, and the ability to link each action back to the permission configuration that was in force.

Guild treats every agent as its own principal rather than assigning agents to shared roles. Credentials are brokered at the organization level so agents never hold raw keys. Permissions are declared per endpoint, not per role. And every call is logged in a tamper-evident audit trail that attributes actions to specific agents, not shared service accounts.

Per-endpoint scoping can coexist with existing RBAC. Human roles still run your org chart. Per-endpoint scoping runs your agents inside that structure. The two complement each other: RBAC governs who can create and configure agents, while per-endpoint scoping governs what those agents can do at runtime.