#malware #supply-chain #github-actions #npm

A case of Shai-Hulud

In this article, we take a deep dive into the source code of Shai-Hulud to work out exactly how it operates.

For context: Shai-Hulud is a self-propagating npm worm specifically designed for GitHub Actions and CI/CD environments, built by the threat group TeamPCP. Its name is taken from the giant sandworms in Frank Herbert’s Dune, a fitting reference to its “devouring” behavior as it aggressively hunts down and exfiltrates sensitive developer credentials.

On May 12, 2026, TeamPCP deliberately released the full source code to GitHub under the title “A Gift From TeamPCP”, complete with deployment instructions. The framework is a modular TypeScript/Bun toolkit for credential harvesting, supply chain poisoning, and encrypted data exfiltration, targeting both CI/CD pipelines and developer workstations.


Code Analysis

The scripts/ Folder: Build-Time Anti-Analysis Pipeline

The scripts/ folder forms a build-time obfuscation and anti-analysis pipeline that transforms, encrypts, and obscures the malware’s source code before packaging it into a final deployable bundle.

Build Pipeline (build.ts)

The main orchestrator. It processes all TypeScript source files through three transformation stages before bundling them into a single minified output.

Stage 1: Environment Variable Obfuscation (env-scramble.ts). Rewrites process.env.SOME_KEY accesses into obfuscated bracket notation so that environment variable names are hidden from static analysis.

Stage 2: String Scrambling (scramble-shared.ts). Generates a unique 64-character random passphrase per build and encodes every string marked with scramble("...") in the source code. The passphrase is baked into the bundle, so each build produces different encoded strings.

Stage 3: Log Stripping (strip-logs.ts). Removes every logging and debug call before compilation, leaving no diagnostic output in the final bundle when silent mode is enabled.

Post-Build Obfuscation (obfuscate.js)

Applies control flow flattening and Base64-encoded string arrays to the final bundle, which makes the output hard for automated deobfuscation and analysis tools to untangle.

Asset Encryption (pack-assets.ts)

Encrypts each embedded asset file with a random AES-256-GCM key. Encryption keys are themselves passed through scramble() for double obfuscation.

Operator Decryption Tool (decrypt.ts)

A standalone tool used by the attacker to recover exfiltrated data. It uses hybrid RSA-OAEP + AES-256-GCM decryption with the attacker’s private key, which keeps the payload cleanly separated from the operator's recovery tooling.

build-plugin.ts

Alternative build path implementing the same scramble logic as a Bun plugin, intercepting files at load time.


Important src/ Files

1. providers/filesystem/filesystem.ts

Contains a hardcoded list of over 100 known sensitive file locations across Linux, macOS, and Windows: SSH keys, AWS credentials, .env files, browser data, crypto wallets (Bitcoin, Ethereum, Monero, Dogecoin), Telegram session data, Slack cookies, Git credentials, KeePass vaults, and VPN configs. It reads every one it can find.

2. providers/aws/: Cloud Credential Harvester

Targets AWS credentials from every possible angle: environment variables, ~/.aws/credentials files, EC2 instance metadata (IMDSv2), ECS container metadata, and EKS web identity tokens. Wherever AWS credentials could possibly live in a CI/cloud environment, it looks there.

3. providers/actions/

If the malware finds a valid GitHub token with workflow scope, it iterates through the accessible repositories and pulls all of their GitHub Actions secrets, the values CI pipelines use to store API keys, deploy tokens, and other sensitive data.

4. mutator/: The Persistence & Supply Chain Module

Once it has a valid GitHub token, it can inject malicious commits into branches (branch/) and publish malicious npm packages to the public registry, either directly (npm/) or using OIDC provenance to make them appear legitimate (npmoidc/).

5. sender/

Stolen data is encrypted and sent out through two channels. The primary channel posts to git-tanstack.com, a domain controlled by the attacker. If that is unreachable, it falls back to committing the encrypted stolen data directly into a GitHub repository created using one of the stolen tokens, a clever backup that abuses GitHub’s own infrastructure for data exfiltration.

6. assets/DEADMAN_SWITCH.sh

This embedded shell script installs a background service (a LaunchAgent on macOS, a systemd service on Linux) that monitors whether a stolen GitHub token is still valid by pinging the GitHub API every 60 seconds. The moment the token is revoked, it fires an arbitrary handler. In the source code that handler is literally rm -rf ~/, which wipes the victim’s entire home directory as a cleanup and sabotage action.

7. collector/ & dispatcher/

These two work together to make sure nothing is lost. The Collector acts as a buffer: it gathers results from all the harvesters and batches them up, flushing every 100KB. The Dispatcher then takes each batch, encrypts it, and tries to deliver it through whichever sender is currently reachable, falling back automatically if one fails.

index.ts: The Brain

index.ts is where everything kicks off. It runs a preflight check first: if the system language is Russian, it quietly exits. It also checks if it is running inside a specific GitHub Actions workflow (opensearch-js) and behaves differently if so. Then it spins up all the credential harvesters in sequence and feeds everything into the collection pipeline.


The Critical Weakness: Token Exposed in Plain Sight

This is where it gets interesting. The malware authors put genuine effort into their encryption, and then completely undermined it with a trivial mistake.

Layer 1: The Encryption (Solid)

In sender/base.ts, the stolen secrets are encrypted with a proper hybrid scheme:

  • A fresh AES-256-GCM key and IV are generated per envelope
  • That AES key is then RSA-OAEP-SHA256 encrypted with the attacker’s public key
  • The result is: { envelope: base64(iv + ciphertext + authTag), key: base64(encryptedAESKey) }

This part is cryptographically sound. Without the attacker’s RSA private key, the actual stolen data (AWS credentials, SSH keys, cloud secrets) cannot be recovered by anyone.
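To make the envelope layout concrete, here is a minimal sketch of the same hybrid pattern using Node's built-in crypto module. It is not the code from sender/base.ts: the function names, the 12-byte IV, the 16-byte auth tag, and the 2048-bit demo key pair are all assumptions chosen for illustration.

```typescript
import {
  KeyObject,
  constants,
  createCipheriv,
  createDecipheriv,
  generateKeyPairSync,
  privateDecrypt,
  publicEncrypt,
  randomBytes,
} from "node:crypto";

interface Envelope {
  envelope: string; // base64(iv + ciphertext + authTag)
  key: string; // base64(RSA-OAEP-wrapped AES key)
}

const IV_BYTES = 12; // assumed GCM nonce size
const TAG_BYTES = 16; // Node's default GCM tag size

// Hypothetical helper: encrypt data under a fresh AES key, then wrap that key with RSA-OAEP.
function sealEnvelope(plaintext: Buffer, publicKey: KeyObject): Envelope {
  const aesKey = randomBytes(32);
  const iv = randomBytes(IV_BYTES);
  const cipher = createCipheriv("aes-256-gcm", aesKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const authTag = cipher.getAuthTag();
  const wrappedKey = publicEncrypt(
    { key: publicKey, padding: constants.RSA_PKCS1_OAEP_PADDING, oaepHash: "sha256" },
    aesKey,
  );
  return {
    envelope: Buffer.concat([iv, ciphertext, authTag]).toString("base64"),
    key: wrappedKey.toString("base64"),
  };
}

// Hypothetical helper: only the holder of the RSA private key can unwrap the AES key.
function openEnvelope(sealed: Envelope, privateKey: KeyObject): Buffer {
  const aesKey = privateDecrypt(
    { key: privateKey, padding: constants.RSA_PKCS1_OAEP_PADDING, oaepHash: "sha256" },
    Buffer.from(sealed.key, "base64"),
  );
  const raw = Buffer.from(sealed.envelope, "base64");
  const iv = raw.subarray(0, IV_BYTES);
  const ciphertext = raw.subarray(IV_BYTES, raw.length - TAG_BYTES);
  const authTag = raw.subarray(raw.length - TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", aesKey, iv);
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Demo with a throwaway key pair and a placeholder string.
const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
const sealed = sealEnvelope(Buffer.from("placeholder secret"), publicKey);
const recovered = openEnvelope(sealed, privateKey).toString("utf8");
console.log(recovered === "placeholder secret");
```

The point of the sketch is the asymmetry: sealing needs only the public key, so a captured payload reveals nothing that helps open its own envelopes.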

Layer 2: The Weakness (Token in the Commit Message)

In sender/github/githubSender.ts, when the sender includes a GitHub token, it does the following:

const doubleEncodedToken = Buffer.from(
  Buffer.from(this.token).toString("base64"),
).toString("base64");
return { ...envelope, token: doubleEncodedToken };

That token then ends up embedded directly in the public commit message: IfYouRevokeThisTokenItWillWipeTheComputerOfTheOwner:WjJodlh4eHh4...

The token is protected by nothing more than double Base64 encoding, a trivially reversible operation. Two decodes in Python or PowerShell and the live GitHub token is fully exposed.

This means that anyone can:

  • Search GitHub for repos matching the Shai-Hulud naming pattern
  • Find a public commit message
  • Decode it in two lines
  • Get access to the victim’s GitHub account, with whatever scopes the token carries

No cryptographic keys required.
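To show why encoding is not encryption, the sketch below round-trips a dummy string through the same double Base64 transformation as the quoted snippet. The helper names and the placeholder value are ours, and no real token is involved.

```typescript
// Encode the way the quoted snippet does: Base64 applied twice.
function doubleEncode(value: string): string {
  return Buffer.from(Buffer.from(value).toString("base64")).toString("base64");
}

// Reverse it with two Base64 decodes. No key or secret is involved at any step.
function doubleDecode(encoded: string): string {
  const once = Buffer.from(encoded, "base64").toString("utf8");
  return Buffer.from(once, "base64").toString("utf8");
}

const placeholder = "not-a-real-token"; // dummy value, not a credential
const encoded = doubleEncode(placeholder);
console.log(doubleDecode(encoded) === placeholder);
```

Compare this with the hybrid scheme from Layer 1: there, reversing the operation requires the attacker's private key, whereas here it requires nothing at all.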

Responsible disclosure note: During this research, live tokens were observed in public commit messages. These were not accessed or used. The issue had already been detected by GitGuardian, who identified seven commits containing exposed ghp_ tokens and confirmed that they remained valid and active, and by OX Security, who noted that a double Base64 decode reveals the full compromised account data, including GitHub tokens and AWS and GCP secrets. Both treated it as a detection signal. What neither articulated is the contradiction at the heart of the design: the attacker used RSA-4096 + AES-256-GCM to protect the stolen secrets, then stored the GitHub token controlling the victim’s entire account behind a trivially reversible encoding. The encryption protects the data. The encoding protects nothing.


IOCs: Indicators of Compromise

Network

| Indicator | Type | Notes |
| --- | --- | --- |
| git-tanstack.com | C2 domain | Primary exfil server, impersonates tanstack.com |
| git-tanstack.com:443/router | C2 endpoint | POST destination for stolen credentials |
| zero.masscan.cloud:443/v1/telemetry | C2 domain | Secondary C2 used in later variants |
| filev2.getsession.org | C2 domain | Session P2P messenger exfil channel |
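As a rough illustration of how these indicators could be applied to proxy or DNS logs, the sketch below checks whether a logged URL or host string points at one of the listed hosts. The function names are hypothetical and the input format is assumed. Note that filev2.getsession.org also serves legitimate Session users, so a match there is a lead to investigate rather than proof of compromise.

```typescript
// Hostnames taken from the network IOC list above.
const C2_HOSTS = ["git-tanstack.com", "zero.masscan.cloud", "filev2.getsession.org"];

// Extract a hostname from either a full URL or a bare "host:port/path" string.
function hostnameOf(entry: string): string | null {
  try {
    const withScheme = entry.includes("://") ? entry : `https://${entry}`;
    return new URL(withScheme).hostname.toLowerCase();
  } catch {
    return null; // not parseable as a URL
  }
}

// True when the entry's host is a listed indicator or a subdomain of one.
function matchesC2(entry: string): boolean {
  const host = hostnameOf(entry);
  if (host === null) return false;
  return C2_HOSTS.some((ioc) => host === ioc || host.endsWith(`.${ioc}`));
}

console.log(matchesC2("git-tanstack.com:443/router"));
console.log(matchesC2("https://tanstack.com/router"));
```

Matching on the parsed hostname rather than on a substring avoids flagging the legitimate tanstack.com domain that the C2 domain impersonates.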

File System

| Path | Notes |
| --- | --- |
| setup.mjs | Stage 1 loader: downloads Bun and executes payload |
| router_runtime.js | Main 11.7MB obfuscated credential stealer |
| opensearch_init.js | Injected malware binary in .claude/ directories |
| execution.js | Alternate payload filename used in some variants |
| ai_init.js | Payload variant targeting AI tooling environments |
| ~/.local/bin/gh-token-monitor.sh | Dead man’s switch monitor script |
| ~/Library/LaunchAgents/com.user.gh-token-monitor.plist | macOS persistence (LaunchAgent) |
| ~/.config/systemd/user/gh-token-monitor.service | Linux persistence (systemd) |
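Because the dead man's switch described earlier reacts to token revocation, it seems prudent to look for its persistence artifacts before revoking anything. The sketch below checks a home directory for the three monitor paths listed above. The function name and the injectable exists predicate are our own illustrative choices.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Home-relative paths taken from the file system IOC list above.
const PERSISTENCE_PATHS = [
  ".local/bin/gh-token-monitor.sh",
  "Library/LaunchAgents/com.user.gh-token-monitor.plist",
  ".config/systemd/user/gh-token-monitor.service",
];

// Return the absolute paths of any monitor artifacts present under `home`.
// `exists` is injectable so the check can be exercised without touching the disk.
function findPersistenceArtifacts(
  home: string = homedir(),
  exists: (path: string) => boolean = existsSync,
): string[] {
  return PERSISTENCE_PATHS.map((rel) => join(home, rel)).filter((p) => exists(p));
}

const artifacts = findPersistenceArtifacts();
console.log(
  artifacts.length === 0
    ? "no dead man's switch artifacts found"
    : `found:\n${artifacts.join("\n")}`,
);
```

An empty result only means these exact filenames are absent; a variant using different names would not be caught by this check.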

Repository Injection

| Indicator | Notes |
| --- | --- |
| .claude/settings.json with SessionStart hook | Re-executes payload on every Claude Code session |
| .claude/opensearch_init.js | Malware binary disguised as Claude config file |
| .vscode/tasks.json with folderOpen trigger | Re-executes payload on VS Code folder open |
| .vscode/setup.mjs | Worm loader injected into VS Code directory |
| Branch named dependabot/github_actions/format/setup-formatter | Fake Dependabot branch used to inject workflow |
| Workflow file Formatter dumping ${{ toJSON(secrets) }} | Actions secrets exfil disguised as formatter |
| Commit author claude@users.noreply.github.com | Spoofed identity used for injected commits |
| Commit message chore: update dependencies | Camouflage commit message for all injections |
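The file-based indicators in this table lend themselves to a quick local scan of a checkout. The following is a minimal sketch under our own assumptions: the function names are hypothetical, the matching is plain substring and regex checks, and branch or commit metadata is not inspected. SessionStart hooks and folderOpen tasks also have legitimate uses, so every hit needs manual review.

```typescript
import { existsSync, readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

// Pure check over a snapshot of repo-relative paths (forward slashes) to file contents.
function findInjectionIndicators(files: Map<string, string>): string[] {
  const hits: string[] = [];
  for (const [path, content] of files) {
    if (path === ".claude/settings.json" && content.includes("SessionStart")) {
      hits.push(`${path}: SessionStart hook`);
    } else if (path === ".vscode/tasks.json" && content.includes("folderOpen")) {
      hits.push(`${path}: folderOpen trigger`);
    } else if (path === ".claude/opensearch_init.js" || path === ".vscode/setup.mjs") {
      hits.push(`${path}: file present`);
    } else if (path.startsWith(".github/workflows/") && /toJSON\(\s*secrets\s*\)/.test(content)) {
      hits.push(`${path}: dumps toJSON(secrets)`);
    }
  }
  return hits;
}

// Load only the files of interest from a checkout on disk.
function snapshotRepo(root: string): Map<string, string> {
  const wanted = [
    ".claude/settings.json",
    ".claude/opensearch_init.js",
    ".vscode/tasks.json",
    ".vscode/setup.mjs",
  ];
  const workflowDir = join(root, ".github", "workflows");
  if (existsSync(workflowDir)) {
    for (const entry of readdirSync(workflowDir, { withFileTypes: true })) {
      if (entry.isFile()) wanted.push(`.github/workflows/${entry.name}`);
    }
  }
  const files = new Map<string, string>();
  for (const rel of wanted) {
    try {
      files.set(rel, readFileSync(join(root, ...rel.split("/")), "utf8"));
    } catch {
      // Missing or unreadable: nothing to record for this path.
    }
  }
  return files;
}

console.log(findInjectionIndicators(snapshotRepo(process.cwd())));
```

Keeping the check separate from the disk access means the same function could be pointed at file contents fetched through an API instead of a local clone.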

GitHub Search Queries

Repositories with embedded stolen tokens:

"IfYouRevokeThisTokenItWillWipeTheComputerOfTheOwner" in:commits

Source code repositories:

"A Gift From TeamPCP" in:repositories

npm

| Indicator | Notes |
| --- | --- |
| scripts.preinstall: "node setup.mjs" | Primary worm injection signature in package.json |
| optionalDependencies: { "@opensearch/setup": ... } | Malicious dependency injected by worm |
| npm token description containing IfYouRevokeThisTokenItWillWipeTheComputerOfTheOwner | Token minted by worm with extortion description |
| Publisher account cloudmtabot | Compromised npm account used in Wave 1 |
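The two package.json signatures above can be checked mechanically. The sketch below is one way to do it, with a hypothetical function name and a made-up sample manifest whose dependency version is a placeholder, not a value from the source.

```typescript
interface PackageJson {
  scripts?: Record<string, string>;
  optionalDependencies?: Record<string, string>;
}

// Flag the two package.json signatures listed in the npm IOC table.
function findNpmIndicators(pkg: PackageJson): string[] {
  const hits: string[] = [];
  const preinstall = pkg.scripts?.preinstall ?? "";
  if (/\bnode\s+setup\.mjs\b/.test(preinstall)) {
    hits.push('scripts.preinstall runs "node setup.mjs"');
  }
  if (pkg.optionalDependencies && "@opensearch/setup" in pkg.optionalDependencies) {
    hits.push('optionalDependencies contains "@opensearch/setup"');
  }
  return hits;
}

// Made-up manifest for demonstration; "<placeholder>" stands in for an unknown version.
const sample: PackageJson = JSON.parse(
  '{"name":"demo","scripts":{"preinstall":"node setup.mjs"},' +
    '"optionalDependencies":{"@opensearch/setup":"<placeholder>"}}',
);
console.log(findNpmIndicators(sample));
```

In practice this check would need to run against every package.json in a dependency tree, including those inside node_modules, since the worm injects itself into published packages.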

Resources