In June 2025, Okta Threat Intelligence predicted that the rapid adoption of AI agents would generate “identity debt” as developers experimented with these new technologies.
This scenario has played out in numerous software supply chain attacks over the last six months, in which attacker payloads search compromised developer systems for plaintext secrets in configuration files and exfiltrate them to attacker-controlled servers.
The latest of these attacks — and one of the most consequential — targeted users of LiteLLM.
What is LiteLLM, and what went wrong?
LiteLLM is most often used by developers as a gateway that lets client applications call any of more than 2,000 large language models from over 100 providers.
This lets developers A/B test different models, build redundancy into agentic apps (swapping to a secondary model if the primary is unavailable or rate limited), and centralize auditing and cost control.
On March 24, 2026, threat actors known as TeamPCP pushed malicious updates to the LiteLLM PyPI package (versions 1.82.7 and 1.82.8).
These updates install a file on the compromised device (see indicators below) that executes a malicious script every time Python starts. The script functions as an infostealer: it seeks out and stages environment variables (API keys and other secrets), SSH keys, Git and CI/CD secrets, cloud credentials (AWS, GCP and Azure keys), database credentials, Kubernetes secrets and SSL/TLS private keys. It also pulls configuration files, shell history and details about the compromised environment (hostname, IP address). The stolen data is staged, compressed into an archive and exfiltrated to an attacker-controlled server.
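The exact matching logic used by the payload has not been published, but defenders can approximate what such an infostealer would harvest by auditing their own environment. A minimal sketch (the name patterns below are common conventions, not the malware's actual list):

```python
import os
import re

# Variable-name patterns that commonly hold secrets. This list is an
# assumption for illustration, not the payload's actual search terms.
SECRET_PATTERNS = re.compile(r"(TOKEN|SECRET|KEY|PASSWORD|CREDENTIAL)", re.IGNORECASE)

def audit_environment(env: dict) -> list:
    """Return names of environment variables that look like secrets."""
    return sorted(name for name in env if SECRET_PATTERNS.search(name))

# Demo with a synthetic environment; run against os.environ to audit your own.
print(audit_environment({"OPENAI_API_KEY": "sk-example", "PATH": "/usr/bin"}))
# → ['OPENAI_API_KEY']
```

Anything this kind of scan surfaces on a developer workstation is exactly what a payload of this class would have exfiltrated.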
The malicious package only impacted LiteLLM users that installed or upgraded LiteLLM via PyPI between 10:39 UTC and 16:00 UTC on March 24, 2026, and had not pinned prior versions.
Given that LiteLLM was downloaded 96 million times last month, a back-of-the-envelope calculation still puts the malicious payload at hundreds of thousands of downloads.
The attackers claim — without providing evidence — to have exfiltrated over 300GB of data during this window.
Agent sprawl at play
The LiteLLM incident was a software supply chain attack, not an identity-based attack.
The extent of the impact on any entity depended on whether they pinned library versions to prevent automatic updates and whether they used tools that scan new code for malicious payloads.
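Pinning is a one-line control in pip-based projects. A sketch of a pinned requirements file (the version number is illustrative; pip's `--require-hashes` mode adds integrity checking on top):

```text
# requirements.txt: pin an exact, known-good version rather than
# floating to the latest release (version number illustrative)
litellm==1.82.6
```

A pinned environment would not have pulled 1.82.7 or 1.82.8 during the attack window, regardless of when `pip install` ran.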
That being said, the blast radius for any impacted organization is as much a story about agent sprawl: unbridled experimentation with AI agents by individuals within an organization.
It’s reasonable to assume that the users most vulnerable to this attack will be individual developers who installed LiteLLM using pip (the package installer for Python) to experiment with the technology.
In an ideal world, developers do not connect AI systems to production resources during experimentation phases.
We don’t live in an ideal world. We don’t tend to know if AI systems are useful until they’re granted system resources or access to real data. And in many organizations, the adoption of AI agents remains decentralized and doesn’t fall under the governance of a formal security program. Developers and other administrators repeatedly connect AI agents directly to production applications and data using static API tokens and service account credentials. That’s where a supply chain attack like this becomes an identity story.
In unmanaged environments, resources are often accessed using bearer tokens stored in configuration files. Possession of a token alone grants access to the target resource. The tokens are neither short-lived nor constrained to a specific IP or client.
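That property is visible in the request itself. In the sketch below (endpoint and token are hypothetical), nothing binds the token to a particular host, IP or client key; whoever holds the string can replay it from anywhere:

```python
import urllib.request

# Hypothetical static token of the kind often left in .env files.
token = "sk-EXAMPLE-STATIC-TOKEN"

# A bearer-token request: authentication is just a header string.
req = urllib.request.Request(
    "https://api.example.com/v1/models",
    headers={"Authorization": f"Bearer {token}"},
)
print(req.get_header("Authorization"))
# → Bearer sk-EXAMPLE-STATIC-TOKEN
```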
So while we can guess at the impact of this event, the actual impact largely hinges on:
- whether the tokens stolen were revoked or rotated before attackers could use them, and
- whether the tokens would be valid in the context of the attacker’s client and IP.
How Auth0 helps developers secure agentic apps
From a developer perspective, one way to reduce the blast radius of supply chain attacks is to avoid hardcoding long-lived API keys in environment files, where malicious payloads are designed to find them.
Developers building agentic applications using Auth0 (see Auth0 for AI Agents) can use the Auth0 Token Vault service to replace static keys with short-lived access tokens.
Through Auth0 support for Demonstrating Proof of Possession (DPoP), those short-lived access tokens can also be bound to a private key held by the client they were issued to. Attacker attempts to reuse a stolen token outside that specific context will fail.
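To make the binding concrete: a DPoP proof (RFC 9449) is a JWT sent alongside each request, carrying the client's public key and the method and URI it is bound to. A minimal sketch of its structure, with a placeholder public key and the ES256 signature over the two parts elided:

```python
import base64
import json
import time
import uuid

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, per JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def dpop_proof_parts(method: str, url: str) -> str:
    """Build the header and claims of a DPoP proof JWT (RFC 9449).

    Signing with the client's private key (e.g. ES256 via a JOSE
    library) is elided; the JWK below is a placeholder for the
    client's real public key.
    """
    header = {
        "typ": "dpop+jwt",
        "alg": "ES256",
        "jwk": {"kty": "EC", "crv": "P-256", "x": "placeholder", "y": "placeholder"},
    }
    claims = {
        "jti": str(uuid.uuid4()),  # unique per request, blocks replay
        "htm": method,             # HTTP method the proof is bound to
        "htu": url,                # target URI the proof is bound to
        "iat": int(time.time()),   # issued-at, limits the proof's lifetime
    }
    return b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())

print(dpop_proof_parts("POST", "https://api.example.com/chat"))
```

Because the resource server verifies the signature against the key in the proof and checks `htm`/`htu`/`jti`, a stolen access token is useless without the client's private key.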
How Okta helps address agent sprawl
Okta for AI Agents helps organizations to centralize all AI agents in the enterprise into a single directory, and to manage connections between agents and the applications and data they need to perform tasks.
Every AI agent brought under management with Okta is assigned a human owner. Agentic access to sensitive resources either uses scoped tokens issued by Okta, or vaulted credentials retrieved from Okta Privileged Access.
Recommendations
Centralize package retrieval and consider tools that build packages directly from source repositories rather than from community artifact repositories. Configure your environment to automatically scan updates for malware and vulnerabilities before they can be pulled by developers.
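For pip-based environments, routing all installs through an internal, scanning proxy can be as simple as a pip configuration entry. A sketch (the proxy URL is hypothetical):

```text
# pip.conf: route all installs through an internal artifact proxy
# that scans packages before serving them (URL is hypothetical)
[global]
index-url = https://pypi-proxy.internal.example.com/simple
```

With this in place, developers never fetch directly from PyPI, and a quarantine window on new releases can absorb incidents like this one before any workstation is exposed.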
Connect agents to sensitive resources using ephemeral, sender-constrained tokens. Use Token Vaulting to issue short-lived access tokens, and apply DPoP (Demonstrating Proof of Possession) to ensure stolen tokens cannot be reused from an attacker’s IP or client.
Develop an enterprise-wide strategy for management and governance of AI agents.
Assign human ownership of every agent in use.
Apply policies for authorizing agentic access to sensitive resources. At minimum, require that the resources accessible by agents support OAuth 2.0 authorization with granular scopes.
Embrace Cross App Access (XAA). This reduces the number of consent requests users encounter when authorizing agents. Distributed auth protocols like XAA are also less vulnerable to attacks on a centralized gateway.
Apply mitigating controls to the use of static API tokens or service account credentials. Configure IP allowlists and vault the credentials using privileged access management tools.
Advanced posture checks
Okta Device Assurance can be used to assess whether a device running the Okta Verify client is running a vulnerable version of LiteLLM:
```sql
SELECT
  CASE
    WHEN COUNT(*) = 0 THEN 0
    ELSE 1
  END AS litellm_vuln
FROM python_packages
WHERE name = 'litellm'
  AND version IN ('1.82.7', '1.82.8');
```
Indicators of Compromise
| Indicator | Type | Context |
| --- | --- | --- |
| models.litellm[.]cloud | Domain | Exfiltration server associated with the LiteLLM attack |
| litellm_init.pth | Filename | File downloaded to systems compromised in the LiteLLM attack |
| tpcp.tar.gz | Filename | Archive the script creates after staging credentials for exfiltration from a compromised system |
| 45.148.10.212 | IP address | Associated with previous attacks by the same threat actor targeting the Trivy project |
| scan.aquasecurtiy[.]org | Domain | Associated with previous attacks by the same threat actor targeting the Trivy project |
Rob Gil and Rafa Bono contributed to this article.