AI Agent Identity and Permission Challenges: How Uber and Auth0 Are Rethinking Access Control

Uber recently described an internal architecture for propagating agent identity across multi-agent AI workflows. The design aims to preserve originating user context, agent provenance, and scoped access as agents delegate work and call internal tools. Uber's case study aligns with Auth0's argument that AI agents need permission models based on delegated authority, scoped credentials, and explicit human approval boundaries rather than conventional service accounts or broad OAuth scopes.

The problem is that AI agents do not fit neatly into access-control models built for either human users or backend services. In Auth0's article, Cameron Pavey argues that users are usually bounded by sessions and user interfaces. Backend services, by contrast, are generally deterministic and auditable through static code paths. Agents may perform multi-step tasks, call tools, delegate to other agents, and act on behalf of a user without every action being directly selected by that user. As Pavey puts it, "AI agents fit neither category."

Uber's implementation extends its Zero Trust architecture for agentic systems. Uber engineers describe an architecture that includes an Agent Registry, AI Agent Mesh, Security Token Service, Model Context Protocol (MCP) Gateway, downstream systems, and an AI Gateway/AI Guard. The Agent Registry stores the relationship between an agent and the workload it is allowed to host. The Security Token Service verifies that relationship and issues short-lived JSON Web Tokens (JWTs) for each hop in an agentic workflow. The MCP Gateway then mediates access from the agent mesh to internal systems, performs tool access checks, and redacts sensitive data if needed.

Uber’s agent identity architecture connects agent registration, token exchange, gateway enforcement, and downstream system access (source)
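The registry check described above can be pictured with a small sketch. This is not Uber's implementation: the `AgentRegistry` class, its method names, and the SPIFFE-style workload identifiers are assumptions made for illustration, and a real registry would be a durable service rather than an in-memory map.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AgentRegistry:
    """Hypothetical in-memory store of agent-to-workload bindings."""

    _bindings: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, agent_id: str, workload_id: str) -> None:
        # Record that this agent and this workload identity belong together.
        self._bindings.setdefault(agent_id, set()).add(workload_id)

    def is_bound(self, agent_id: str, workload_id: str) -> bool:
        # Unknown agents and unregistered workloads are both rejected.
        return workload_id in self._bindings.get(agent_id, set())


def may_request_token(registry: AgentRegistry, agent_id: str, workload_id: str) -> bool:
    """Gate a token service could apply before issuing anything (illustrative)."""
    return registry.is_bound(agent_id, workload_id)


registry = AgentRegistry()
# The trust domain and path below are made-up example values.
registry.register("oncall-agent", "spiffe://example.org/workloads/oncall")
```

In this sketch, a token request is refused unless the claimed agent identity matches a workload identity already on record, which is the relationship the article says the Security Token Service verifies.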

A key design choice is that Uber does not rely on a single user credential or a long-lived service account moving through the workflow. Each agent uses local metadata, inbound context, destination audience, and a SPIRE-issued workload identity to request a new token from the Security Token Service. Uber says the approach is conceptually based on OAuth 2.0 Token Exchange, but customised to carry agent identity and provenance and to meet internal auditing and performance requirements. The issued tokens are single-hop and short-lived, with a specific Audience claim and a TTL on the order of minutes.
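To make the single-hop, short-lived property concrete, here is a minimal sketch of issuing and checking such a token using only the Python standard library. It is an assumption-laden illustration, not Uber's token format: the `agent` claim name, the shared HMAC secret, and the 300-second default TTL are all invented for the example, and a production token service would more plausibly sign with asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: stand-in for real signing keys


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_hop_token(user: str, agent: str, audience: str,
                    ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Mint an HS256 JWT that is valid for one audience and a few minutes."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": user, "agent": agent, "aud": audience,
              "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = (_b64url(json.dumps(header).encode()) + "."
                     + _b64url(json.dumps(claims).encode()))
    signature = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_hop_token(token: str, expected_audience: str,
                     now: Optional[float] = None) -> dict:
    """Reject tokens with a bad signature, the wrong audience, or a lapsed TTL."""
    signing_input, _, signature_part = token.rpartition(".")
    expected = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_part)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims["aud"] != expected_audience:
        raise ValueError("wrong audience")
    if (time.time() if now is None else now) >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Because the audience is checked on every hop, a token minted for one destination cannot be replayed against another, and the short expiry limits how long a leaked token remains useful.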

That provenance is captured in what Uber calls an actor chain. In Uber's multi-hop investigation example, an on-call engineer may ask an Oncall Agent to investigate an issue. That agent may then delegate to an Investigation Agent before calling an internal tool through the MCP Gateway. The token presented to the gateway contains the chain of participants, not only the immediate caller. This allows downstream systems to evaluate both the originating human identity and the acting agent identity when making authorisation decisions.

Uber’s multi-hop investigation example propagates actor-chain claims across agent-to-agent and agent-to-tool calls (source)
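One way to represent such a chain is the nested `act` (actor) claim defined by OAuth 2.0 Token Exchange (RFC 8693), where the current actor sits at the top level and earlier actors are nested inside it. The article does not state Uber's exact claim layout, so the sketch below is an assumption that borrows the RFC's shape; the helper names and the `user:` and `agent:` identifier prefixes are hypothetical.

```python
def delegate(claims: dict, new_actor: str) -> dict:
    """Return new claims in which new_actor acts on behalf of the same subject."""
    updated = dict(claims)  # shallow copy; the caller's claims are not modified
    actor = {"sub": new_actor}
    if "act" in claims:
        # Preserve the previous actor by nesting it under the new one.
        actor["act"] = claims["act"]
    updated["act"] = actor
    return updated


def actor_chain(claims: dict) -> list:
    """Flatten the nested claim into [origin, first actor, ..., current actor]."""
    actors = []
    node = claims.get("act")
    while node is not None:
        actors.append(node["sub"])
        node = node.get("act")
    actors.reverse()  # nesting is newest-first, so reverse to get call order
    return [claims["sub"]] + actors


# Mirrors the article's example: engineer -> Oncall Agent -> Investigation Agent.
user_claims = {"sub": "user:oncall-engineer"}
hop1 = delegate(user_claims, "agent:oncall")
hop2 = delegate(hop1, "agent:investigation")
```

Here `actor_chain(hop2)` yields the engineer followed by both agents in call order, which is the information a gateway needs in order to evaluate the originating human and the acting agent together.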

Auth0's framing complements Uber's implementation by arguing for three patterns in production agent architectures: capability-scoped permissions, task-scoped credentials, and layered enforcement. The goal is to limit the blast radius of agent mistakes without removing the autonomy that makes agents useful. In Uber's architecture, similar controls include per-hop token exchange, audience scoping, registry-backed agent verification, gateway policy checks, and redaction of sensitive data when required.

Auth0 describes an agent permission model where short-lived scoped tokens flow from the identity provider to the agent runtime and tool layer (source)
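The idea of capability-scoped permissions combined with layered enforcement can be sketched as a gateway-side check that requires several independent conditions to hold. Everything in this example is hypothetical, including the tool names, role names, and the `ToolPolicy` structure; it is meant to show the shape of the decision, not any vendor's policy engine.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ToolPolicy:
    allowed_agents: FrozenSet[str]       # which agents hold this capability
    required_user_role: str              # what the originating user must hold
    needs_human_approval: bool = False   # explicit approval boundary


# Illustrative policy table; real systems would load this from a policy store.
POLICIES = {
    "logs.read": ToolPolicy(frozenset({"agent:investigation"}), "oncall"),
    "deploy.rollback": ToolPolicy(frozenset({"agent:investigation"}), "oncall",
                                  needs_human_approval=True),
}


def authorize_tool_call(tool: str, acting_agent: str, user_roles: Iterable[str],
                        approved_by: Optional[str] = None) -> Tuple[bool, str]:
    """Deny by default; every layer must pass before the call is allowed."""
    policy = POLICIES.get(tool)
    if policy is None:
        return False, "unknown tool"
    if acting_agent not in policy.allowed_agents:
        return False, "agent lacks capability"
    if policy.required_user_role not in set(user_roles):
        return False, "user not entitled"
    if policy.needs_human_approval and approved_by is None:
        return False, "human approval required"
    return True, "allowed"
```

The agent's capability and the user's entitlement are checked separately, so an agent cannot widen what its user may do, and a user cannot push an agent beyond the tools it was granted. Sensitive actions add a third gate in the form of an explicit approval.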

Uber also highlights the developer-experience side of the design. The company initially considered an external proxy for agent-to-agent calls, but found that preserving execution context end to end required support in the application layer. Uber therefore built a standardised A2A client to automate token exchange and actor-chain propagation. Uber frames this as a secure-by-default developer experience, moving identity propagation into the standard client path rather than leaving each agent team to implement it independently.
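A rough picture of what a standardised client buys agent teams: the token exchange happens inside the client, so calling code never handles it directly. The `A2AClient` class below is a hypothetical sketch, not Uber's client; the `exchange` and `send` callables stand in for a token service and a network transport that the article does not describe in detail.

```python
from typing import Any, Callable, Dict

# (inbound_token, acting_agent, audience) -> new single-hop token
ExchangeFn = Callable[[str, str, str], str]
# (target, headers, payload) -> response
SendFn = Callable[[str, Dict[str, str], Any], Any]


class A2AClient:
    """Hypothetical agent-to-agent client that exchanges tokens on every call."""

    def __init__(self, agent_id: str, exchange: ExchangeFn, send: SendFn) -> None:
        self._agent_id = agent_id
        self._exchange = exchange
        self._send = send

    def call(self, target: str, payload: Any, inbound_token: str) -> Any:
        # Always obtain a fresh token scoped to the target; the inbound
        # token is never forwarded as-is.
        hop_token = self._exchange(inbound_token, self._agent_id, target)
        headers = {"Authorization": f"Bearer {hop_token}"}
        return self._send(target, headers, payload)


# Stand-ins for demonstration only.
def fake_exchange(inbound: str, actor: str, audience: str) -> str:
    return f"{inbound}|{actor}|{audience}"


sent = []


def fake_send(target: str, headers: Dict[str, str], payload: Any) -> str:
    sent.append((target, headers, payload))
    return "ok"


client = A2AClient("agent:oncall", fake_exchange, fake_send)
result = client.call("agent:investigation", {"task": "triage"}, "user-token")
```

Placing the exchange in the shared client path means an agent author gets identity propagation without writing any security code, which matches the secure-by-default framing in the paragraph above.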

According to Uber, thousands of internal agents have adopted the system, and production metrics show that P99 latency for the Security Token Service token exchange API remains below 40 milliseconds. That addresses a potential concern that per-hop token exchange could add too much overhead in workflows involving many tool calls and delegations. Uber also says it is tracking the IETF WIMSE working group and related drafts, including AI Agent Authentication and Authorization, as standards activity around workload and agent identity evolves.

The pattern aligns with concerns covered in InfoQ's article on building a least-privilege AI agent gateway with MCP, OPA, and ephemeral runners, where the gateway boundary serves as the control point for tool access, policy evaluation, and auditability. Uber's case study adds a production example of how that boundary can be combined with agent identity propagation and per-hop delegation. For architects, the broader lesson is that agentic systems require access models that preserve the originating user's context, agent identity, and tool-level authorisation across the full workflow, rather than treating agents as ordinary clients or services.

About the Author

Eran Stiller
