AI agent security requires identity-first controls, fine-grained authorization (FGA), and continuous monitoring across the data lifecycle. Organizations can help mitigate risks like prompt injection, data leakage, and overprivileged access by combining Zero Trust principles, auditability, and governance frameworks, such as NIST AI RMF and the OWASP LLM Top 10.
Why AI agents create a new security and identity attack surface
According to Bain’s 2025 executive AI survey, “AI Moves from Pilots to Production,” concerns around data security and privacy have risen as organizations move from pilots to production. Security and compliance reviews reveal a common gap: agents automating high-value workflows sometimes operate without the governance infrastructure needed to ensure their trustworthiness at scale. They access sensitive data, interact with external systems, and often function without consistent identity controls comparable to those applied to other enterprise resources.
AI agents introduce a new attack surface because they act autonomously across systems, data sources, and APIs. Securing them requires treating each agent as a non-human identity (NHI) with defined ownership, scoped permissions, and lifecycle governance. Key risks include prompt injection, data leakage, insecure tool use, and supply chain vulnerabilities. Effective security often combines identity-first access control, continuous monitoring, and compliance-aligned frameworks such as NIST AI RMF and OWASP LLM Top 10 (2025 categories).
Securing AI agents is not a single-layer problem. Security solutions need to protect the data, the model, the infrastructure, and the user interface simultaneously. A failure at any layer can compromise the integrity of the entire system.
What is agentic AI? Security risks and non-deterministic behavior
Agentic AI systems are different from LLM wrappers or chatbots. A large language model responds to a prompt. An agentic AI system plans, selects tools, executes multi-step tasks, and adapts its behavior based on intermediate results without continuous human input.
That autonomy is the source of their value and their risk. Traditional applications follow deterministic logic. Agentic AI systems are probabilistic and context-dependent. They reason across dynamic context, which means conventional access models cannot fully predict or constrain their behavior.
Agentic systems also introduce risks of Misinformation (OWASP LLM Top 10, 2025: LLM09) and Unbounded Consumption (OWASP LLM Top 10, 2025: LLM10), the latter involving recursive loops that drive “Denial of Wallet” financial liabilities.
The attack surface operates at the application layer. Prompt injection (OWASP LLM Top 10, 2025: LLM01), embedding malicious instructions in data an agent retrieves and processes, can redirect agent behavior mid-task, including unauthorized tool invocation or data exfiltration. A research agent summarizing customer records could be manipulated through a poisoned document to generate outputs that expose sensitive data or trigger downstream actions in connected systems. This aligns with insecure output handling risks (OWASP LLM Top 10, 2025: LLM05 – Improper Output Handling), where model outputs can directly influence downstream systems without proper validation. Because agents process natural language, the threat is semantic rather than syntactic. Unlike code, malicious content cannot be fully mitigated through simple filtering alone, as the payload can resemble legitimate data the agent is designed to process. Traditional perimeter defenses are often insufficient on their own to detect these threats.
Eliminating shadow AI: The business impact of uncontrolled agents
When agents are built and deployed without centralized governance, the result is shadow AI. Shadow AI is as much an operational and governance problem as it is a technical one. Teams across the enterprise independently deploy agents (a pattern known as agent sprawl), often connecting them to production data sources through long-lived API keys or over-privileged service accounts, with no inventory, ownership model, or decommissioning process.
Gartner has projected that 40 percent of enterprise applications will include task-specific AI agents by the end of 2026, up from less than 5 percent in 2025. Without governance infrastructure to absorb that growth, organizations may face fragmented identity solutions, duplicate agent functionality, and an expanding attack surface that no single team fully controls. The cost of urgent remediation may exceed the cost of governance built in from the start.
Shadow AI can stall innovation. Security and legal teams might block production deployments because there is no verifiable way to demonstrate that agents operate within defined boundaries.
Why identity is the foundation of AI agent security
The path from pilot to production requires treating AI agents as first-class identities. Every agent should have a verifiable identity, defined permissions, a human owner, and a lifecycle aligned to organizational governance policies, policy enforcement mechanisms, and risk tolerance.
This identity foundation serves as the core trust layer. It does not eliminate agent autonomy. It makes autonomy accountable by linking every agent action to a verified human authority, enforcing scoped access through policy-based authorization, and generating an auditable record of every authorized action.
Establishing the AI trust layer: Zero Trust principles for machine identities
Identity-based breach prevention and risk assessment
According to the Verizon Data Breach Investigations Report (2025), compromised credentials were the initial access vector in 22% of confirmed breaches, a risk that may be compounded when credentials are used by autonomous agents operating without human oversight. For AI agents, that risk grows with scale. In many enterprise environments, NHIs may outnumber human users.
Effective data breach prevention includes risk-tiering agents at provisioning based on their access, capabilities, and potential impact. Key controls include:
- Model access control: Restrict which users and systems can query or fine-tune models. Log and monitor model interactions to detect anomalous patterns, abuse cases, or extraction attempts, and use those findings to inform ongoing risk assessment. Controls should also address Unbounded Consumption (OWASP LLM Top 10, 2025: LLM10). For agentic workflows, this includes resource exhaustion risks (often referred to as “Denial of Wallet” in industry discussions) where recursive agent loops drive unintended API cost spikes and service degradation. Identity-based rate limiting and budget-capped service principals are crucial technical controls sometimes used to mitigate these financial liabilities.
- Rate limiting and behavioral monitoring: Detect and interrupt automated attempts to retrieve sensitive information through unusually high query volumes or atypical access sequences.
- Third-party risk management: Each external API or model dependency an agent calls can extend the organization’s attack surface and introduce supply chain risk (OWASP LLM Top 10, 2025: LLM03 – Supply Chain). Vet all AI-related vendors for their security practices and enforce contractual data protection obligations before any agent integrates with an external system.
The agent rule of two: Verifying machine and human ownership
Governance of AI agents requires two separate, verifiable identities for every deployment.
- Agent identity: The technical credential of the machine or service principal, distinct from any human user account.
- Human owner identity: The verifiable natural person or authorized employee who authorized the agent’s deployment. This identity provides the legal basis for attribution, ensuring that autonomous actions are mapped to a specific internal authority for fiduciary accountability and supporting accountability expectations and human oversight requirements under Articles 14 and 16 of the EU AI Act. (These requirements of the EU AI Act apply only to AI systems classified as "high-risk" under the Act, not to all AI systems.)
A credential without an owner cannot be traced to a business decision. An owner without a formal agent identity leaves no record of what the agent actually did. A complete audit trail and compliance in regulated environments typically require both.
Secure by design: Giving agents the keys, not the kingdom
Zero Trust implementation for AI agents centers on scoped, time-bound access. Agents should minimize or avoid standing privileges where possible, enforcing least privilege and favoring just-in-time (JIT) access.
Specific requirements:
- Avoid embedding API keys, passwords, or tokens directly in source code, configuration files, or container images. Use dedicated secrets management systems (e.g., secure vaults or workload identity mechanisms) with automated rotation.
- Require token-based or certificate-based authentication for all third-party APIs the agent calls.
- Protect data in transit with TLS 1.3; legacy TLS 1.2 should be permitted only for documented backward compatibility. Encrypt data at rest using AES-256 implemented via FIPS 140-3 validated cryptographic modules (or NIST-approved equivalents compliant with SP 800-175B Rev. 1).
- Replace long-lived static credentials with JIT access tokens scoped to the task and configured to expire automatically after a defined, short-lived window.
Governing the full agent lifecycle
Continuous discovery and centralized visibility of all agents
Shadow AI cannot be governed if it cannot be seen. Effective data lifecycle management for AI agents begins with a real-time inventory of every agent in the environment, its owner, purpose, data sources it can access, and creation date.
Centralized discovery is not a one-time audit. Agents are created continuously, often by teams without a formal security review process. A unified control plane should automatically surface new agents and flag any that operate outside the approved inventory.
Data isolation and compartmentalization for secure execution
Multi-step agentic processes may pose a risk of data contamination. An agent that retrieves sensitive information in step one may carry that context into step three, where it interacts with an external system or downstream agent that should not have access to it.
Technical countermeasures may include sandboxing, which isolates agent execution from production systems; secure enclaves, a form of confidential computing, which may help protect sensitive computation from surrounding infrastructure; and separate memory spaces for different task contexts, which prevent data retrieved in one reasoning cycle from persisting into the next.
Context contamination is a specific risk in multi-agent architectures. When one agent’s output becomes another agent’s input, a compromised context may propagate silently across the system. Data isolation should be engineered at the architecture level.
Best practices for securing data at each lifecycle phase
- Ingestion: Enforce data minimization. Agents should collect only what their function requires. Accumulating unnecessary personal information increases exposure without increasing capability. Training and retrieval data sources should be validated to mitigate data- and model-poisoning risks (OWASP LLM Top 10, 2025: LLM04), which occur when malicious data influences model or agent behavior.
- Processing: Apply input validation, context isolation, structured interaction patterns, and strict tool authorization controls to limit what actions an agent can execute. Agents that process unverified content without semantic validation are vulnerable to injection attacks and unintended data leakage (OWASP LLM Top 10, 2025: LLM01 – Prompt Injection).
- Storage: Encrypt at rest using AES-256 via CAVP-validated algorithms. Implement policies for regular deletion or archiving of data that no longer serves an active agent function to meet data minimization requirements.
- Deletion: Define and enforce retention limits. Agents operating in silos accumulate data indefinitely. That accumulation can be a liability that grows over time.
Identity controls should be enforced through all four phases, not only at the perimeter.
Auditability and compliance mandates
Generating a traceable governance trail
In regulated environments, the question is not only whether an agent produced the right output. The question is whether the organization can prove what data the agent accessed, which human authorized the action, and what decision logic the agent applied.
Identity, when combined with tamper-resistant logging and cryptographic controls, may help provide the link that enables this. Without an identity layer, it is difficult to establish a traceable governance trail. Explainable AI monitoring (which attempts to provide visibility into how an agent reaches its decisions) supports auditability, though full reconstruction of model reasoning may be limited by system design.
Meeting industry standards: HIPAA, GDPR, CCPA, and fiduciary liability
An AI agent without a verifiable identity may create direct exposure to compliance risks. In healthcare, an agent that accesses protected health information without a traceable identity chain may create compliance risk under HIPAA. In financial services, an agent that executes transactions without auditable human authorization can introduce fiduciary and regulatory risk.
Regulation | AI Agent Requirement | Identity Control That Satisfies It |
HIPAA | Traceable access to protected health information | Access controls and audit logging traceable to an authorized individual or system, satisfying Security Rule requirements under 45 CFR §164.312 |
GDPR / CCPA | Lawful basis for data processing; data subject rights | Consent tracking; data minimization enforced at ingestion; documented retention limits and erasure processes aligned to GDPR Art. 17 and CCPA §1798.105 |
EU AI Act (Art. 12 / 14) | Automated logging of high-risk system events (Art. 12) and mandatory human oversight mechanisms (Art. 14) to mitigate automation bias | Tamper-resistant action logs; Human-in-the-Loop authorization controls |
SOC 2 | Continuous monitoring and access control evidence | FGA, behavioral monitoring, and periodic access reviews supporting SOC 2 Trust Services Criteria for logical access |
Fiduciary (Financial Services) | Auditable authorization for consequential transactions | Delegated authority chain; Human-on-the-Loop (HOTL) monitoring for high-velocity workflows |
Meeting current and emerging compliance regulations requires:
- Hybrid oversight: HITL authorization for high-stakes actions and HOTL monitoring for high-velocity workflows, each supported by a traceable governance trail that documents authorized actions and intervention points.
- Consent and transparency: Data subjects must be clearly informed of what data autonomous systems collect and how they use it, which is a baseline requirement under regulations such as the GDPR and the CCPA.
- Third-party vendor compliance: Contractual and audit obligations must extend to every vendor in the agent's data pipeline. A compliant internal agent that passes data to a non-compliant vendor may undermine the organization's overall compliance posture.
Organizational readiness: Training, audits, and incident response
Technical controls require organizational support:
- Security training: Developers and operators of agentic systems need training specific to agentic risks, including prompt injection, privilege creep in non-human identities, and data minimization requirements.
- Regular audits: Security and privacy audits of AI systems and data pipelines should run on a defined schedule. Reactive auditing surfaces problems after they have caused harm.
- Incident response: Maintain a documented response plan for agent misuse or data breaches, including notification procedures and escalation paths. This should include a validated “kill-switch” mechanism for the rapid revocation of agent access and emergency cessation of autonomous behavior across all connected systems when misuse or anomalous logic is detected.
Preparing for the future: Evolving security roadmaps and mandates
The regulatory environment for AI agents is evolving. Identity-first governance is a primary technical mechanism organizations often use to demonstrate that high-risk agentic actions were logged, traceable, and subject to human intervention. The NIST AI Risk Management Framework establishes governance expectations that increasingly inform procurement standards and contractual requirements.
Emerging technical standards include verifiable credentials for AI agents: cryptographically signed assertions that attest to an agent’s identity, permissions, and provenance. Organizations building identity infrastructure for agents now are better positioned to adopt these standards as they mature.
How to protect AI agent data in motion and at rest
Implementing FGA and dynamic scoping
Encryption protects data at rest and in transit, but authorization determines what data an agent can reach in the first place. FGA can help enforce precise access controls for AI agents, operating at a level of granularity that role-based access control alone cannot provide.
FGA can help enable surgical access. An agent can reach only the specific API endpoints, medical records, or document segments required for its current task. An agent’s effective permissions reflect the current task context rather than a static role that accrues privileges over time. All model queries should be logged and monitored within this access control layer.
Securing cross-app access for autonomous workflows
Agents operating across SaaS applications and legacy systems may create distributed credential risk. Each application boundary is a potential point where credentials are exposed, reused, or persisted beyond their authorized scope. Modern standards (e.g., OAuth 2.0 incremental authorization and workload identity federation) can help support secure delegation across application boundaries, ensuring that agents request only the specific scopes required for the current sub-task, rather than accumulating broad access at initial authorization. This approach helps prevent privilege seepage, in which an agent retains access to an application beyond its intended task.
Implementation requirements include strict authentication for all third-party APIs in the agent’s workflow, and secrets management enforced through vault-based access for all API keys and tokens. Credentials should not appear in source code, configuration files, or environment variables that persist across sessions.
Defense against data seepage: Advanced anonymization and privacy-preserving techniques
Anonymization methods may be used as a secondary defense against agents that inadvertently expose PII to unmanaged contexts. The approaches range from foundational to computationally intensive:
- Foundational: Anonymization and pseudonymization strip or replace identifiers (names, email addresses, IP addresses) before data enters the agent’s processing environment.
- Model-level privacy: Differential privacy (e.g., DP-SGD) introduces mathematical noise during training or fine-tuning. When applied during fine-tuning on proprietary data, this can help reduce the risk of that data being extracted through adversarial inference. For agents, PII scrubbing and tokenization before data reaches the model may help mitigate the risk of sensitive context data being retained or inadvertently surfaced in future outputs.
- Advanced computation: Homomorphic encryption enables computation on encrypted data without exposing raw values. Secure multi-party computation (MPC) allows collaborative computation across multiple parties without any party revealing its underlying data.
- Monitoring: Continuous, real-time monitoring, including explainable AI techniques, tracks how an agent reaches decisions and helps identify unintended exposure of sensitive data.
Anonymization does not replace the need for strong access controls for agents. Identity-based guardrails are foundational controls that can help limit what the agent can reach. Anonymization addresses residual exposure after that boundary is defined.
Frequently asked questions
What is AI agent security?
AI agent security is the practice of protecting autonomous AI systems using identity-based controls, data governance, and continuous monitoring. It focuses on risks such as prompt injection, data leakage, and over-privileged access.
Why are AI agents a security risk?
AI agents operate autonomously across systems and data sources, making them vulnerable to prompt injection, insecure tool usage, and data exposure. Their non-deterministic behavior makes it more difficult to enforce traditional security controls.
How do you secure AI agents?
Organizations can help secure AI agents by implementing identity-first access controls, fine-grained authorization, secure secrets management, continuous monitoring, and governance frameworks such as NIST AI RMF and OWASP LLM Top 10.
What is prompt injection in AI agents?
Prompt injection is an attack in which malicious instructions are embedded in data processed by an AI system, causing it to override its intended behavior or expose sensitive information.
What is a non-human identity in AI?
An NHI, including workload identities, is a machine identity (such as an AI agent, service account, or API key) that requires authentication, authorization, and lifecycle management similar to that of human users.
Unified identity is a foundation for accountable AI agent security
To move beyond fragmented security solutions and deploy AI agents at scale, organizations need a unified identity control plane that manages both human and machine identities. A single control plane can support verifiable governance, FGA, and auditable logging to help ensure that autonomous agentic actions are authorized, scoped, and aligned with organizational compliance requirements.
Discover how the Okta Platform helps secure every AI agent with identity-first controls to manage, govern, and protect AI agent identities at scale.