Key takeaways
- On March 24, 2026, attackers published trojanized versions of LiteLLM to PyPI, targeting a library present in 36 percent of cloud environments and harvesting API keys, cloud credentials, and secrets from every infected system.
- The attackers never breached PyPI or the LiteLLM repository itself. They used a previously compromised publishing token to push malicious packages directly to the index, which made the supply chain itself the entry point.
- AI infrastructure libraries like LiteLLM are becoming high-value targets precisely because they sit at the intersection of sensitive credentials and broad deployment, which makes them far more attractive than traditional packages.
- While no single tool prevents supply chain poisoning, organizations that combine scoped agent credentials with runtime policy enforcement and fleet-wide kill switches can reduce the blast radius from catastrophic to contained.
- Ultimately, this attack is a forcing function for enterprises to treat AI agent credential hygiene with the same rigor they already apply to human identity and access management.
What happened
On March 24, 2026, the Wiz Threat Research team disclosed that a threat actor known as TeamPCP had published two malicious versions of LiteLLM, versions 1.82.7 and 1.82.8, to the Python Package Index. For context, LiteLLM is the open-source proxy layer that routes API calls across LLM providers, and according to Wiz it sits in more than a third of enterprise cloud environments.
Although the packages were quarantined within three hours, the damage window was real enough. Both versions contained payloads designed to harvest everything of value from the host system: environment variables with API keys, SSH keys, cloud credentials for AWS, GCP, and Azure, along with Kubernetes configuration files, CI/CD secrets, database credentials, and Docker configurations. Once collected, the exfiltrated data was encrypted with AES-256 and RSA-protected keys before being sent to attacker-controlled domains.
Version 1.82.8 was especially concerning because it abused Python’s .pth file mechanism, which meant the malicious code would run whenever any Python process started on the infected system, even if LiteLLM was never explicitly imported. In other words, the trojan persisted silently across the entire Python environment.
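To make the persistence mechanism concrete, here is a small, harmless demonstration of how .pth processing works. Python's `site` module executes any line in a .pth file that begins with `import`, and `site.addsitedir()` applies the same processing that runs automatically at interpreter startup, so one line is enough to run arbitrary code:

```python
import os
import site
import sys
import tempfile

# A .pth file dropped into site-packages is read at every interpreter
# startup; lines beginning with "import" are exec'd. This is the hook
# the malicious 1.82.8 release abused for persistence.
d = tempfile.mkdtemp()
with open(os.path.join(d, "demo.pth"), "w") as f:
    # This single line runs arbitrary code when the directory is processed.
    f.write("import sys; sys.executed_by_pth = True\n")

# addsitedir() processes .pth files the same way startup does, so we can
# observe the behavior without touching the real site-packages directory.
site.addsitedir(d)
print(getattr(sys, "executed_by_pth", False))  # True
```

Because the hook fires before any application code loads, no import-time audit of LiteLLM itself would have caught it; detection has to happen at the package or runtime layer.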
Why AI infrastructure is now a prime target
This was far from a random attack. TeamPCP chose LiteLLM because it sits at one of the most privileged positions in the modern AI stack, and that privilege made it an ideal collection point.
It holds the keys to everything. A typical LiteLLM deployment manages API keys for OpenAI, Anthropic, Azure, and other providers, while also running with access to cloud infrastructure credentials, database connection strings, and internal service tokens. So by compromising just one library, the attackers gained a single collection point for credentials that would otherwise require breaching dozens of separate systems.
It is everywhere. With presence in 36 percent of cloud environments, a single poisoned release reaches more organizations than most targeted campaigns could hope to touch. The economics are straightforward for attackers: one compromised publishing token, two malicious packages, and within hours you have credentials from thousands of environments.
It is trusted implicitly. Because LiteLLM is treated as infrastructure rather than application code, it runs in production pipelines, CI/CD systems, and development environments with elevated permissions, yet most organizations never scrutinize updates to a proxy library the way they would review changes to their own codebase.
As the AI toolchain matures, this pattern will repeat. The libraries that sit between agents and their infrastructure, like orchestration frameworks, vector databases, and tool registries, will inevitably become the supply chain targets of choice, since they tend to be high-privilege, widely deployed, and rarely audited at the package level.
Why did credential sprawl make this attack so damaging?
At its core, the LiteLLM attack is a case study in what happens when credential management gets treated as an afterthought in AI deployments.
Consider what the attackers actually harvested: environment variables full of API keys, cloud provider credentials, database connection strings, and Kubernetes configs. None of these secrets were stolen through sophisticated exploitation. They were just sitting in plaintext environment variables on systems where LiteLLM had been installed, readily accessible to any process with access to the runtime environment.
And this is the norm rather than the exception. Most enterprise AI agent deployments today follow a strikingly similar pattern:
- Shared API keys get passed as environment variables, which gives every process on the system access to every provider.
- Service accounts end up over-provisioned because scoping permissions properly takes time that nobody budgets for, so agents routinely get broader access than their task requires.
- Credentials stay static and long-lived, often rotated on a quarterly schedule (if they get rotated at all), which gives attackers a generous window to use whatever they steal.
- No per-agent identity exists, so when credentials are compromised there is no way to determine which agent used them, what it accessed, or how to revoke access without disrupting everything else.
The LiteLLM attack did not need to defeat encryption, bypass firewalls, or exploit a zero-day. It simply ran a script that read what was already sitting in memory, because the vulnerability was never in the software itself but in the operational architecture around it.
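A useful defensive exercise is to audit exactly this exposure in your own environments. The sketch below scans variable names for common credential markers; the pattern list is illustrative and should be extended for your stack:

```python
import re

# Substrings that commonly mark credentials in environment variables.
# Illustrative only; extend this list for your own deployments.
SECRET_PATTERN = re.compile(
    r"API_KEY|SECRET|TOKEN|PASSWORD|CREDENTIALS", re.IGNORECASE
)

def exposed_secrets(env: dict[str, str]) -> list[str]:
    """Return the names of env vars any co-resident process could read."""
    return sorted(k for k in env if SECRET_PATTERN.search(k))

# A runtime environment typical of the deployments described above.
env = {
    "OPENAI_API_KEY": "sk-...",
    "AWS_SECRET_ACCESS_KEY": "...",
    "DATABASE_PASSWORD": "...",
    "PATH": "/usr/bin",
}
print(exposed_secrets(env))
# ['AWS_SECRET_ACCESS_KEY', 'DATABASE_PASSWORD', 'OPENAI_API_KEY']
```

Running the same scan against `os.environ` on a production host shows precisely what a trojanized dependency would have harvested there.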
How can organizations defend against AI supply chain attacks?
No single product or practice prevents supply chain attacks outright. In this case the malicious code executed at Python interpreter startup, well before any application-level governance could intervene. So being honest about what each layer can and cannot do is the starting point for building a defense that actually holds up.
Layer 1: Dependency verification
This is where the attack should have been caught earliest. Organizations that pin exact package versions, verify checksums, and use private package registries with upstream scanning would have had a window to detect the malicious versions before deployment.
- Lock files with integrity hashes ensure that pip install pulls exactly the version you reviewed rather than whatever an attacker uploaded ten minutes ago.
- Software bills of materials make it possible to answer “are we running the affected versions?” within minutes of a disclosure, instead of spending hours auditing environments manually.
- Private registries with scanning add a verification step between the public index and your infrastructure, which can catch known-malicious packages before they reach production.
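At its core, hash pinning is a single comparison: the digest of the artifact you actually downloaded must match the digest recorded when the version was reviewed. The sketch below shows that check in isolation; the "package" bytes here are made up, and in practice pip performs this comparison for you in --require-hashes mode:

```python
import hashlib

def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    """Compare a downloaded package's digest against the pinned hash,
    the same check pip performs when a lock file carries hashes."""
    return hashlib.sha256(data).hexdigest() == pinned_sha256

# Illustrative only: this "artifact" and its pinned hash are invented.
artifact = b"litellm-wheel-bytes"
pinned = hashlib.sha256(artifact).hexdigest()

print(verify_artifact(artifact, pinned))           # True
print(verify_artifact(b"tampered-bytes", pinned))  # False
```

A freshly uploaded trojanized wheel necessarily has a different digest, so even a version number reused by an attacker fails this check.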
This layer is table stakes, though it falls outside RenLayer’s domain. It is also not sufficient on its own: the three-hour window between publication and quarantine is long enough for automated deployment pipelines to pull and install the compromised package.
Layer 2: Scoped agent credentials
Once the attack succeeds and malicious code is running on your systems, the question shifts: what can it actually steal, and how much damage can those stolen credentials do?
This is where most AI deployments fall short. When every agent shares a broad API key stored in an environment variable, a single compromise exposes access to every provider, every database, and every internal service that key can reach, all at once.
The alternative is agent identity and access management, where every agent receives its own scoped credentials tied to a specific role, limited to exactly the permissions its task requires, and rotated on a short lifecycle. That way, when an agent’s credentials are compromised the blast radius stays bounded to what that specific agent could access, and revocation happens immediately without disrupting the rest of the fleet.
This approach does not prevent the initial theft, but it does transform a breach from “the attacker has keys to everything” into “the attacker has a narrowly scoped token that expires in hours.”
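A minimal sketch of what scoped, short-lived credentials look like in practice. The AgentToken type, scope strings, and mint_token helper are hypothetical illustrations of the pattern, not a real RenLayer or LiteLLM API:

```python
import time
from dataclasses import dataclass

# Hypothetical per-agent credential: a role-bound scope set plus a short TTL.
@dataclass
class AgentToken:
    agent_id: str
    scopes: frozenset[str]
    expires_at: float

    def allows(self, scope: str) -> bool:
        """Honor the token only for its listed scopes, and only
        until its short lifetime elapses."""
        return scope in self.scopes and time.time() < self.expires_at

def mint_token(agent_id: str, scopes: set[str], ttl_seconds: int = 3600) -> AgentToken:
    """Issue a narrowly scoped token that expires on its own."""
    return AgentToken(agent_id, frozenset(scopes), time.time() + ttl_seconds)

# A summarization agent gets read access to one data store, nothing else.
token = mint_token("summarizer-01", {"read:reports-db"})
print(token.allows("read:reports-db"))   # True
print(token.allows("write:billing-db"))  # False
```

Steal this token and you hold read access to one database for at most an hour, not the provider keys for an entire fleet.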
Layer 3: Runtime policy enforcement
Even with scoped credentials in place, a compromised library can still attempt actions that fall outside normal behavior, and runtime policy enforcement is what catches those anomalies as they happen.
When an agent governed by policy-as-code rules suddenly tries to make outbound network calls to unfamiliar domains, access data stores outside its normal scope, or execute patterns that deviate from its established behavior, the policy engine blocks the action before it completes. That is the core difference between logging a suspicious event for later review and actually preventing it in real time.
In practice, this means defining and enforcing rules like:
- Network egress controls that restrict agents to approved endpoints only, so exfiltration to attacker-controlled domains gets blocked automatically.
- Data access boundaries that limit each agent to the specific data stores its task requires, thereby reducing what a compromised library can reach.
- Behavioral baselines where unusual patterns like bulk credential reads or unexpected file system access trigger automatic escalation.
Our article on policy-as-code for AI agents covers the implementation of these patterns in detail.
Layer 4: Fleet-wide response
When a zero-day drops or a supply chain compromise is disclosed, the response window is measured in minutes, which means organizations need the ability to pause, terminate, or roll back every affected agent across the fleet simultaneously.
In the LiteLLM case, organizations with centralized agent governance could have issued a fleet-wide kill command the moment the advisory was published, pausing every agent that depended on the compromised library while security teams assessed the damage. Without that capability, however, incident response turns into a manual process of identifying affected systems one by one, and every minute spent searching is another minute the compromised code keeps running.
On top of that, automatic circuit breakers add protection that does not depend on human response time at all: if an agent exceeds its cost threshold, triggers too many policy violations, or exhibits anomalous behavior, it gets paused automatically until a human reviews the situation.
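A circuit breaker of this kind is a small amount of state: accumulated cost and violation counts checked against thresholds on every event. The thresholds and field names below are illustrative, assuming the breaker stays tripped until a human intervenes:

```python
# Minimal circuit-breaker sketch; thresholds are illustrative, not a
# recommended configuration for any particular product.
class CircuitBreaker:
    def __init__(self, max_cost: float, max_violations: int):
        self.max_cost = max_cost
        self.max_violations = max_violations
        self.cost = 0.0
        self.violations = 0
        self.paused = False

    def record(self, cost: float = 0.0, violation: bool = False) -> None:
        """Accumulate spend and policy violations; trip automatically,
        without waiting for a human, once either threshold is exceeded."""
        self.cost += cost
        self.violations += int(violation)
        if self.cost > self.max_cost or self.violations >= self.max_violations:
            self.paused = True  # stays paused until a human reviews it

breaker = CircuitBreaker(max_cost=50.0, max_violations=3)
for _ in range(3):
    breaker.record(violation=True)
print(breaker.paused)  # True
```

Because the trip condition is evaluated inline, an agent burning through stolen credentials at machine speed is stopped at machine speed.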
What organizational changes does this require?
While the LiteLLM attack is a technical incident, the deeper lesson is organizational. Most enterprises still treat AI agent deployments as an extension of their existing application infrastructure, governed by the same processes, secured by the same tools, and managed by the same teams, yet that mental model is increasingly breaking down.
Because AI agents hold more credentials, operate with more autonomy, and depend on a supply chain that is newer, less audited, and more attractive to attackers than traditional software dependencies, governing them properly requires capabilities that most organizations simply have not built yet.
Security teams need visibility into the AI supply chain, which means knowing which AI libraries are deployed where, what credentials they can access, and how quickly they can be isolated when a compromise surfaces.
Engineering teams need to adopt agent-specific credential hygiene. Shared API keys in environment variables were always a shortcut, but now that AI libraries are becoming prime supply chain targets, that shortcut has turned into an unacceptable risk.
And leadership needs to fund governance as infrastructure rather than treating it as a compliance checkbox. The organizations that contained the LiteLLM blast radius are the ones that had already invested in agent identity, policy enforcement, and fleet-wide response capabilities before the attack happened, because governance built in reaction to an incident is always more expensive and less effective than governance built in advance.
For more on building this foundation, our guide on why AI agent governance matters for enterprise security lays out the strategic case, while our EU AI Act compliance guide covers the regulatory dimension.
This is not the last attack
TeamPCP’s campaign did not start with LiteLLM. They had previously compromised Trivy using the same techniques and the same infrastructure. The playbook is clear enough: identify a widely deployed open-source tool in the DevOps or AI ecosystem, acquire a publishing token through a prior breach, then push trojanized versions to the public package index.
Meanwhile, the AI toolchain is still early. Orchestration frameworks, vector database clients, tool registries, and model serving libraries are all being adopted faster than they are being audited, and each one tends to run with elevated privileges because it needs access to credentials, data stores, and infrastructure to do its job.
So the question is not whether this will happen again but whether your agent architecture is designed to limit the damage when it does. Defense in depth, from dependency verification through scoped credentials to runtime enforcement and fleet-wide response, remains the only honest answer to supply chain risk in the AI era.
Frequently Asked Questions
What was the LiteLLM supply chain attack?
On March 24, 2026, a threat actor known as TeamPCP published two malicious versions of LiteLLM (1.82.7 and 1.82.8) to PyPI. The trojanized packages harvested API keys, cloud credentials, SSH keys, and other secrets from infected systems, then encrypted the stolen data before exfiltrating it to attacker-controlled domains. Version 1.82.8 was particularly dangerous because it used Python’s .pth file mechanism to ensure the malicious code ran every time any Python process started, not just when LiteLLM was imported.
Why are AI infrastructure libraries attractive supply chain targets?
AI infrastructure libraries like LiteLLM occupy a uniquely privileged position since they manage API keys for multiple LLM providers, run with access to cloud credentials, and are deployed across a large percentage of cloud environments. As a result, compromising a single library gives attackers a collection point for credentials that would otherwise require breaching many separate systems, while their broad deployment means a single malicious release reaches thousands of organizations simultaneously.
Can runtime governance prevent supply chain attacks?
Runtime governance cannot prevent the initial supply chain compromise itself, since that happens at the package registry level before any application code runs. However, it does significantly reduce the blast radius. Scoped per-agent credentials limit what stolen keys can access, runtime policy enforcement blocks anomalous behavior like exfiltration to unauthorized domains, and fleet-wide kill switches enable immediate response when a compromise is disclosed. Together, these layers transform a supply chain attack from a catastrophic organization-wide breach into a contained incident with bounded damage.
What should enterprises do immediately in response to this type of attack?
Four actions matter most. First, audit whether any environment ran the affected package versions and rotate all credentials on those systems. Second, implement dependency pinning with integrity verification so that future malicious releases cannot be pulled automatically. Third, eliminate shared API keys in environment variables by moving to per-agent scoped credentials with short rotation cycles. And fourth, establish fleet-wide response capabilities so that when the next advisory drops, every affected agent can be paused within minutes rather than hours.
How is this different from traditional software supply chain attacks?
While the mechanics are similar, the impact is amplified considerably. Traditional supply chain attacks compromise libraries that process data or perform computations, whereas AI infrastructure libraries like LiteLLM manage credentials for multiple external services, access cloud infrastructure, and run with elevated permissions by design. That means a compromised AI proxy library is not just a foothold into one system but a collection point for keys that unlock dozens of others, making the credential exposure from a single attack significantly broader than what most traditional supply chain compromises would produce.