Devlery

84 Malicious Versions, and the Supply Chain Mine Under AI Development

The TanStack npm attack reached OpenAI employee devices and app signing certificates, exposing the supply chain boundary around AI development environments.

AI Summary
  • What happened: On May 11, 2026, attackers published 84 malicious versions across 42 TanStack Router and Start npm packages.
    • The chain combined pull_request_target, GitHub Actions cache poisoning, and OIDC token extraction.
  • AI impact: OpenAI said two employee devices were affected and that it would rotate its app signing certificates, with a June 12, 2026 update deadline for macOS users.
  • Core lesson: Even without stolen npm tokens, trusted publishing can become an attack path when the CI trust boundary collapses.
    • As coding agents and AI developer tools automate package installs, builds, and tests, the privilege of the install host becomes part of the security model.

On May 11, 2026 UTC, attackers published 84 malicious versions across TanStack Router and Start npm packages. At first glance, that may sound like another npm supply chain incident. The more important story is the attack path. The attackers did not directly steal an npm token. They did not simply edit the normal npm publish workflow and wait for a maintainer to miss it. Instead, they moved through GitHub Actions trust boundaries, poisoned cache state, and extracted an OIDC token from the release environment that later restored that cache.

The incident did not stay inside the front-end library world. In its May 13 response, OpenAI said the TanStack npm attack affected two employee devices and led to limited credential material being exfiltrated from some internal repositories. OpenAI also said it had no evidence of user data, production systems, intellectual property, or published software being modified. Even so, the company chose to re-sign its iOS, macOS, Windows, and Android apps with new certificates, and told macOS users to update by June 12, 2026.

That is why the important framing is not just "TanStack was compromised." AI development now depends on one connected supply chain: packages, CI, code signing, employee developer devices, coding-agent workspaces, and desktop or CLI distribution channels. Tools such as Codex, Claude Code, Cursor, and Gemini CLI routinely run pnpm install, tests, builds, code generation, and package updates inside real projects. In that world, a malicious dependency is not only a dependency problem. It is an entry point into the workspace and the developer authority that an agent or tool is already using.

[Figure: TanStack npm attack chain diagram]

84 Malicious Versions in Six Minutes

TanStack's official postmortem gives an unusually concrete incident timeline. Between 19:20 and 19:26 UTC on May 11, 2026, the attackers published two malicious versions each across 42 @tanstack/* npm packages, for a total of 84 compromised versions. TanStack limited the affected scope to the Router and Start repository. It said other TanStack package families, including Query, Table, Form, Virtual, Store, and AI, were not affected.

The malicious versions did not remain online for long. External researcher ashishkurmi of StepSecurity publicly detected the issue roughly 20 to 26 minutes after publication. TanStack deprecated the first two versions about 59 minutes after the initial publish, deprecated all 84 versions after about 1 hour and 43 minutes, and saw the affected tarballs removed from the npm registry between roughly 2 hours and 53 minutes and 4 hours and 35 minutes after the first malicious publish. That response was fast. But lifecycle-script malware does not need days. If a package runs payload code during installation, a few hours can be enough.
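The offsets in that timeline are easier to reason about as wall-clock times. The sketch below assumes every offset is measured from the first malicious publish at 19:20 UTC, which the postmortem summary above does not state explicitly for each event. The labels and variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: all offsets are measured from the first malicious publish.
FIRST_PUBLISH = datetime(2026, 5, 11, 19, 20, tzinfo=timezone.utc)

# Approximate offsets as reported in the article.
OFFSETS = {
    "last malicious publish": timedelta(minutes=6),
    "public detection (upper bound)": timedelta(minutes=26),
    "first two versions deprecated": timedelta(minutes=59),
    "all 84 versions deprecated": timedelta(hours=1, minutes=43),
    "tarball removal begins": timedelta(hours=2, minutes=53),
    "tarball removal ends": timedelta(hours=4, minutes=35),
}


def timeline(start=FIRST_PUBLISH, offsets=OFFSETS):
    """Return (label, UTC timestamp) pairs in chronological order."""
    events = ((label, start + delta) for label, delta in offsets.items())
    return sorted(events, key=lambda item: item[1])


for label, moment in timeline():
    print(f"{moment:%H:%M} UTC  {label}")
```

Under that assumption, the last affected tarball could still have been fetched until about 23:55 UTC, so installs anywhere in that window deserve review, not only those in the six publishing minutes.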

TanStack's description of the payload looks like classic credential theft. During npm install, pnpm install, or yarn install, a malicious optional dependency could resolve and execute a prepare lifecycle script. That script launched an obfuscated router_init.js file of about 2.3 MB. It searched for AWS metadata and Secrets Manager material, GCP metadata, Kubernetes service-account tokens, Vault tokens, ~/.npmrc, GitHub tokens, and SSH private keys. TanStack said the exfiltration path used the Session/Oxen messenger file-upload network. Its guidance was blunt: if a host installed an affected version on May 11, treat that host as potentially compromised and rotate cloud, GitHub, npm, and SSH credentials.
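A first triage step is checking whether a project's lockfile ever resolved an affected version. The following is a minimal sketch for npm lockfiles that use the "packages" map (lockfileVersion 2 or 3). The AFFECTED set is a hypothetical placeholder and must be filled in from the official advisory; the names and versions in it are not real affected versions.

```python
import json

# Hypothetical placeholder: replace with (name, version) pairs from the
# official advisory. These values are NOT real affected versions.
AFFECTED = {
    ("@tanstack/example-package", "0.0.0-placeholder"),
}


def package_name(lock_key):
    """Turn 'node_modules/a/node_modules/@scope/b' into '@scope/b'."""
    return lock_key.rpartition("node_modules/")[2]


def find_affected(lock_data, affected=AFFECTED):
    """Return sorted (name, version) matches from a parsed npm lockfile."""
    hits = set()
    for key, entry in lock_data.get("packages", {}).items():
        if not key:  # the empty key describes the root project itself
            continue
        pair = (package_name(key), entry.get("version"))
        if pair in affected:
            hits.add(pair)
    return sorted(hits)


def scan_lockfile(path):
    """Load a package-lock.json file and report affected entries."""
    with open(path, encoding="utf-8") as handle:
        return find_affected(json.load(handle))
```

A lockfile only shows what was resolved, not which hosts actually ran the install, so a clean result does not clear a machine that installed without a committed lockfile. pnpm and yarn lockfiles use different formats and would need their own parsers.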

This is where a supply chain attack stops being a package cleanup task. If the install host is a developer laptop, local GitHub CLI credentials and SSH keys matter. If it is a CI runner, cloud deployment credentials and service accounts matter. If a coding agent automatically ran the install inside a project, the interesting question is not whether the agent meant harm. It is what authority the host exposed while that install script ran.

  • 42: Affected packages
  • 84: Malicious versions published
  • ~26 min: Time to public detection

The Attack Was a Chain of Trust Boundaries

The uncomfortable part of this incident is that each component is already familiar on its own. pull_request_target has long been a dangerous GitHub Actions trigger when it runs code from forked pull requests in the base repository context. GitHub Actions cache poisoning is not new either. OIDC trusted publishing was introduced to reduce reliance on long-lived npm tokens, and it remains a better pattern than keeping broad publish tokens in repository secrets. The problem is that these pieces can reinforce one another when the trust boundaries are loose.

According to the TanStack postmortem, the attacker first created a fork of the TanStack router repository, added malicious commits, and opened a pull request. The pull request touched a bundle-size workflow that ran under pull_request_target. That workflow checked out the fork's merge ref and ran build commands. The attacker used that execution to place malicious content into the pnpm store and save it under a cache key that the later release workflow would restore.

The key detail is that a read-only workflow permission model did not fully close the door. TanStack explains that the actions/cache post-job save path used a separate runner-internal token, allowing cache mutation even though the workflow's visible permissions were constrained. Later, when the release workflow ran from the main branch, it restored the poisoned cache. That workflow had id-token: write because it used npm OIDC trusted publishing. The malware read the runner worker process memory, extracted an OIDC token, and posted directly to the npm registry.
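The cache step of that chain can be illustrated with a toy model. The classes below are hypothetical stand-ins for a CI cache and do not reproduce how actions/cache stores or scopes entries. They only contrast a cache where any job can restore any key it can name with one where the backend stamps each entry with the writer's trust domain.

```python
class SharedCache:
    """Toy cache where any job can restore any key it can name."""

    def __init__(self):
        self._entries = {}

    def save(self, trust_domain, key, value):
        self._entries[key] = value  # the writer's identity is ignored

    def restore(self, trust_domain, key):
        return self._entries.get(key)


class ScopedCache(SharedCache):
    """Toy cache where the backend records who wrote each entry."""

    def save(self, trust_domain, key, value):
        self._entries[(trust_domain, key)] = value

    def restore(self, trust_domain, key):
        return self._entries.get((trust_domain, key))


def release_restores(cache, key="pnpm-store-abc123"):
    """A pull-request job saves a poisoned store, then a release job restores it."""
    cache.save("untrusted-pr", key, "poisoned store")
    return cache.restore("release", key)


print(release_restores(SharedCache()))  # poisoned store
print(release_restores(ScopedCache()))  # None
```

The point of the second class is that the scope is enforced by the cache itself. A scope that the writing job chooses, such as a key prefix, offers little protection when the writing job is the attacker.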

TanStack says the publish did not come from the normal Publish Packages step. It happened during the test or cleanup phase when the malware directly used the token. That distinction matters because it breaks the simple conclusion that "we are safe because we do not store npm tokens." Removing long-lived npm tokens is still the right direction. But once a publish-capable identity exists inside a workflow runtime, arbitrary code execution inside that runtime can still turn into a publish path.

GitHub has been pointing at the same pattern in its broader supply chain security roadmap. The GitHub Actions 2026 security roadmap says recent attacks target not only the software being built but the CI/CD automation itself. Untrusted code execution, malicious workflows, compromised dependency propagation, and over-permissioned credential exfiltration are increasingly part of the same playbook. The TanStack incident is that paragraph turned into an incident timeline.

Why OpenAI Rotated Signing Certificates

OpenAI's response made the incident larger for AI developers. The company said two employee devices were affected and that the observed malicious behavior matched the publicly described malware behavior. It also saw unauthorized access and credential-focused exfiltration activity in some internal source-code repositories. OpenAI said the only successfully exfiltrated material was limited credential material, and that other information or code was not affected.

The affected repositories, however, included product signing certificates. OpenAI said signing keys for iOS, macOS, and Windows were affected, and that it would re-sign all applications with new certificates. It gave macOS users a concrete deadline: update by June 12, 2026. After that, older versions signed with previous certificates would no longer receive updates or support, and fresh downloads or first launches of those older builds could be blocked by macOS security protections. OpenAI also said user passwords and API keys were not affected, which narrows the issue toward developer and signing material rather than customer-facing credentials.

The practical lesson is sharp. An attacker does not need to reach an OpenAI production system to shake trust in product distribution. Employee developer devices and partial internal repository access can still affect the signing layer that users rely on when they install desktop apps, mobile apps, browser extensions, IDE plugins, CLIs, and agent harnesses. Users see names such as OpenAI, ChatGPT, and Codex and assume the installer is legitimate. Even the suspicion that a signing certificate was exposed raises the risk of fake installers and phishing.

That is also why OpenAI warned users to be careful with OpenAI, ChatGPT, and Codex installers delivered through email, messages, ads, file-sharing links, or third-party download sites. The second half of a supply chain incident is not always another exploit. Sometimes it is distribution trust: where users get software, whether updates come from the in-app channel, and whether the signing chain still represents the vendor the user thinks it does.

Why AI Development Tools Are More Sensitive

This is AI news not only because OpenAI appears in the incident response. It matters because AI development tools are already expanding the execution surface of supply chain attacks. Coding agents naturally use the developer's shell, package manager, Git credentials, and local environment. If an agent runs pnpm install, that is not magically more dangerous than a human running the same command. The difference is that agents can run these steps more often, faster, and inside longer automated work loops that the user may not inspect step by step.

Imagine a coding agent trying to fix a test failure by updating a dependency. If a malicious package is installed, the payload runs under the host's authority, not under some abstract "agent" authority. On a laptop, that means local keys and CLI credentials. In a cloud development environment, it may mean workspace secrets. In CI, it can mean deployment tokens and registry publish permissions. The agent does not need to make a malicious decision for the workflow to give an attacker time and access.

There is another pressure point: AI SDKs and agent frameworks are deeply embedded in package ecosystems. Mistral SDKs, OpenAI SDKs, LangChain, LlamaIndex, MCP servers, browser automation packages, vector database clients, and observability clients update quickly. Developers upgrade dependencies to reach new models and new capabilities. That speed helps product teams, but it also creates a surface where valuable packages meet credential-rich hosts.

So "AI agent security" cannot mean prompt injection alone. Prompt injection is the problem of getting a model to follow hostile instructions. A TanStack-style supply chain attack does not need to trick the model at all. The package install lifecycle runs. The CI cache restores. A token appears in runner memory. The AI team still has to think about model guardrails, but it also has to think about package manager policy, install-script restrictions, network egress control, credential scope, and CI trust boundaries.

The Weakness the Community Saw

Community reaction did not treat the incident as TanStack's mistake alone. In Reddit's r/netsec thread sharing the TanStack postmortem, commenters pointed to pull_request_target as a familiar GitHub Actions attack surface. In r/javascript and r/node discussions, commenters appreciated the transparent postmortem but noted that the practical burden fell on credential rotation. In a supply chain incident, "we removed the affected versions" and "every affected host is safe" are very different statements.

TanStack's own postmortem also acknowledges how much worse the event could have been. The attacker chose a payload that broke tests, which made the release path noisy and helped detection. A more careful attacker that avoided test failures might have remained quiet longer. The attacker also did not need to invent a new technique. The incident recombined previously described cache poisoning and memory-extraction patterns. For defenders, that is the less comfortable conclusion: known patterns still become incidents when several boundaries are open at once.

OpenAI's response shows a similar operational reality. It said the company was already deploying security controls to reduce the impact of supply chain attacks after the Axios incident. Those controls included package manager settings such as minimumReleaseAge and additional software to validate new package provenance. But the TanStack incident happened during a phased deployment, and the two affected employee devices did not yet have the updated configuration that would have blocked the newly observed malware package download.
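The idea behind a minimum release age is simple date arithmetic. The sketch below is not the package manager's implementation. The 24-hour threshold is a hypothetical value, and the timestamps assume the 19:20 UTC first publish and the latest reported removal offset of 4 hours and 35 minutes.

```python
from datetime import datetime, timedelta, timezone


def old_enough(published_at, now, minimum_age):
    """True when a release has been public for at least minimum_age."""
    return now - published_at >= minimum_age


# Assumed values taken from the timeline described in this article.
published = datetime(2026, 5, 11, 19, 20, tzinfo=timezone.utc)
removed_by = published + timedelta(hours=4, minutes=35)

# Hypothetical policy threshold, chosen only for illustration.
threshold = timedelta(hours=24)

# Even at the moment of removal, the release is too young to install.
print(old_enough(published, removed_by, threshold))  # False
```

A delay like this only helps when a malicious release is detected and removed inside the waiting period. A quieter payload that survives longer than the threshold would pass the check.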

This is often how real incidents happen. It is not that no controls exist. It is that controls are being rolled out, some devices are not yet covered, developer productivity still requires installation rights, and older CI workflows remain in place. AI development organizations have a lot of these interim states because they constantly evaluate new SDKs, CLIs, agent plugins, and MCP servers.

What Development Teams Should Check Now

First, audit every pull_request_target workflow. If forked pull-request code is being checked out, built, tested, or installed in the base repository context, that is a serious risk. Keep trusted operations such as labels, comments, and metadata handling separate from untrusted code execution. Run untrusted builds under pull_request with separate permissions. Also inspect whether caches cross from pull-request workflows into main-branch release workflows. Do not assume that read-only workflow permissions fully prevent cache writes.
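A rough first pass over workflow files can be automated. The following is a text-based heuristic, not a YAML parser, so it can produce false positives (for example, matches inside comments) and false negatives. The finding labels are illustrative.

```python
import re

TRIGGER = re.compile(r"\bpull_request_target\b")
# Expressions that point a checkout at the pull request's own code.
PR_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)|refs/pull/")
CACHE = re.compile(r"uses:\s*actions/cache")


def audit_workflow(text):
    """Return heuristic findings for the text of one workflow file."""
    findings = []
    if not TRIGGER.search(text):
        return findings  # only pull_request_target workflows are in scope
    findings.append("uses pull_request_target")
    if PR_REF.search(text):
        findings.append("references pull request head or merge ref")
    if CACHE.search(text):
        findings.append("uses actions/cache")
    return findings
```

A workflow that returns all three findings combines the ingredients described in the postmortem: privileged trigger, untrusted checkout, and cache writes. It still needs human review, because the heuristic cannot tell whether the checked-out code is actually executed.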

Second, treat package-install lifecycle scripts differently on developer devices and CI. npm, pnpm, and yarn lifecycle scripts are useful, but they are also execution points for supply chain attacks. Where feasible, combine ignore-scripts, allowlists, delayed installation, minimumReleaseAge, and package provenance validation. Not every project can disable scripts all the time. But a CI environment with production credentials and a personal experiment workspace should not share the same risk posture.
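Building an allowlist starts with knowing which dependencies declare install-time scripts at all. The sketch below reads package.json manifests under a directory and reports those scripts. The function names are illustrative, and the hook list is an assumption about which script names matter for installs.

```python
import json
from pathlib import Path

# Script names that can run during an install. "prepare" applies to git
# dependencies and to the local project rather than to registry tarballs.
INSTALL_HOOKS = ("preinstall", "install", "postinstall", "prepare")


def install_hooks(manifest):
    """Return the install-time scripts declared in a parsed package.json."""
    scripts = manifest.get("scripts") or {}
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


def scan_tree(root):
    """Map package name -> install-time scripts for manifests under root."""
    report = {}
    for manifest_path in Path(root).rglob("package.json"):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest
        if not isinstance(manifest, dict):
            continue
        hooks = install_hooks(manifest)
        if hooks:
            report[manifest.get("name", str(manifest_path.parent))] = hooks
    return report
```

Scanning an already installed tree is an inventory step, not a defense, since any scripts have already run by then. It is most useful on a disposable machine where the install was done with scripts disabled.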

Third, view OIDC trusted publishing as the beginning of a design, not the end. Removing long-lived tokens is the right move. But teams still need to design the workflow code path that can mint OIDC tokens, the runner isolation boundary, job permissions, publish-step gating, and provenance-source verification. If any code path inside a workflow can reach a publish-capable token, the intended publish step and the actual publish channel can diverge.
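One way to see the limit of claim-based trust is to look at what an OIDC token actually asserts. The sketch below decodes a JWT payload without verifying the signature, so it is suitable for inspection only. The EXPECTED policy values are hypothetical, and real verification would also check the signature against the issuer's published keys.

```python
import base64
import json


def decode_claims(jwt):
    """Decode a JWT payload WITHOUT verifying the signature."""
    payload = jwt.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical policy values for illustration only.
EXPECTED = {
    "repository": "example-org/example-repo",
    "ref": "refs/heads/main",
    "event_name": "push",
}


def violations(claims, expected=EXPECTED):
    """Return the claim names whose values differ from the policy."""
    return sorted(
        name for name, want in expected.items() if claims.get(name) != want
    )
```

Claims describe the workflow run, not every process inside it. In the chain described above, the release workflow ran from the main branch, so a check like this would plausibly have passed while hostile code from a restored cache used the token. That is why runner isolation and publish-step gating matter alongside claim checks.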

Fourth, reduce the credential scope available to coding agents and AI CLIs. If the shell an agent uses can see production cloud credentials, broad GitHub tokens, npm publish rights, and SSH private keys, a small dependency incident can become a large compromise. Agent-specific users, separate workspaces, limited network egress, temporary credentials, and approval gates are not cosmetic controls. The safety of an AI tool depends on host isolation and secret hygiene as much as answer quality.
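One small piece of that isolation is the environment handed to install commands. The sketch below runs a command with only allowlisted environment variables. The allowlist is a hypothetical starting point, not a recommendation for any particular tool.

```python
import os
import subprocess

# Hypothetical allowlist: variables an install step plausibly needs.
ALLOWED = {"PATH", "HOME", "LANG"}


def scrubbed_env(env, allowed=ALLOWED):
    """Keep only allowlisted variables, dropping tokens and cloud keys."""
    return {name: value for name, value in env.items() if name in allowed}


def run_install(command, cwd, env=None):
    """Run a command with a scrubbed copy of the environment."""
    source = os.environ if env is None else env
    return subprocess.run(
        command, cwd=cwd, env=scrubbed_env(source), check=False
    )
```

A call such as run_install(["pnpm", "install", "--ignore-scripts"], cwd="project") would keep variables like GITHUB_TOKEN out of the child process. It does nothing about files on disk: ~/.npmrc, SSH keys, and cloud credential files remain readable by the same user, which is why separate users, containers, or workspaces are still needed.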

Fifth, extend incident response playbooks beyond package removal. Teams need to know which hosts installed affected packages, when installation happened, which lifecycle scripts ran, what secrets were reachable, what credential rotation order applies, whether code-signing material was exposed, and when deployment workflows should pause. OpenAI's certificate rotation and macOS update deadline show that a supply chain incident can reach all the way to the user distribution layer.

Boundaries Before Signatures

The TanStack incident captures a paradox in supply chain security. Teams adopt OIDC trusted publishing to reduce stored npm tokens. They add package provenance and connect registries to CI. But if untrusted code runs inside CI, cache state crosses trust boundaries, and a publish-capable token exists in runtime memory, even better trust mechanisms can become part of the attack path.

That does not mean teams should abandon trusted publishing. It means the boundaries around trusted publishing matter more, not less. A signature can say where an artifact came from, but it does not by itself prove that the workflow that created the artifact was isolated from untrusted execution. Provenance can show a build path, but it does not automatically tell you whether a poisoned cache was restored or whether hostile code ran inside that path. AI development teams have to look at this layered trust model as one system.

TanStack's detailed postmortem and OpenAI's public response at least make the event learnable. The harder question is what happens when the next attack is quieter, does not break tests, and spreads through more AI SDKs or IDE extensions before anyone notices. AI development moves quickly because frequent updates and automation are useful. The same qualities let supply chain attacks move quickly too.

The practical question for builders is simple. If an AI agent writes code, installs dependencies, runs tests, and prepares deployments, what trust boundary surrounds that execution environment? Before asking only whether the model is safe, teams need to ask what the model's shell can see. The 84 malicious TanStack versions made that question much harder to postpone.