2026-05-13 | 4 min | high

A Look Inside Claude's Leaked AI Coding Agent

The source code for Anthropic's Claude Code CLI was accidentally exposed through .map files in a public npm release. This leak reveals the internal architecture, permission models, and safety guardrails of the AI agent, potentially allowing attackers to craft targeted prompt injections or distribute tampered dependencies through unofficial repositories.

Conf: high | Analyzed: 2026-04-04 | reports

Authors: Mark Vaitsman, Eric Saraga

Prompt Injection

Key Takeaways

Anthropic's Claude Code CLI source code was accidentally leaked via debug-only .map files in the public npm package @anthropic-ai/claude-code 2.1.88.
The leak exposes the internal architecture, including the QueryEngine, tool execution flow, permission models, and system prompt assembly.
Threat actors can analyze the exposed logic to bypass safety guardrails or craft targeted malicious repositories using prompt injection via CLAUDE.md files.
Unofficial mirrors and reimplementations (e.g., instructkr/claw-code) have rapidly appeared, posing a significant risk of tampered dependencies for developers.

Affected Systems

Claude Code CLI
npm (@anthropic-ai/claude-code version 2.1.88)

Vulnerabilities (CVEs)

Prompt Injection

Attack Chain

The incident originated from a human error: debug-only .map files were included in a public npm release of Claude Code. Developers and threat actors downloaded these files and extracted the full TypeScript source code. With access to the internal logic, attackers can analyze the permission models and system prompt assembly to craft malicious repositories that carry prompt injections in CLAUDE.md files. Users who install unofficial mirrors or tampered dependencies risk executing malicious code or bypassing AI safety guardrails on their local machines.
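The first step of this chain can be checked before publishing a package. The sketch below is a minimal illustration, assuming the shipped .map files follow the Source Map v3 JSON layout, where the optional `sourcesContent` array holds the original source text in parallel with `sources`. The article does not say which fields the leaked files contained, and the function names `embedded_sources` and `find_leaky_maps` are hypothetical helpers, not part of any tool.

```python
import json
from pathlib import Path


def embedded_sources(map_path):
    """Return the source paths whose full text is embedded in one source map."""
    data = json.loads(Path(map_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return []
    sources = data.get("sources") or []
    contents = data.get("sourcesContent") or []
    # sourcesContent runs parallel to sources; a null entry means "not embedded".
    return [src for src, text in zip(sources, contents) if text]


def find_leaky_maps(package_dir):
    """Map each .map file under package_dir to the sources it embeds."""
    report = {}
    for map_file in sorted(Path(package_dir).rglob("*.map")):
        try:
            leaked = embedded_sources(map_file)
        except (OSError, ValueError):
            continue  # unreadable file or invalid JSON: skip it
        if leaked:
            report[str(map_file)] = leaked
    return report
```

Running `find_leaky_maps` against the directory that is about to be packed (for example, the unpacked output of `npm pack`) gives a release pipeline a simple condition to fail on: any non-empty result means original source text would ship with the package. Indexed source maps that nest their content under `sections` are not handled by this sketch.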

Detection Availability

YARA Rules: No
Sigma Rules: No
Snort/Suricata Rules: No
KQL Queries: No
Splunk SPL Queries: No
EQL Queries: No
Other Detection Logic: No

The article does not provide specific detection rules or queries.

Detection Engineering Assessment

EDR Visibility: Low. EDR tools do not natively inspect LLM prompt injections or the internal logic of AI coding assistants unless they result in malicious child process execution.
Network Visibility: Low. Traffic to Anthropic's API is encrypted and standard; malicious prompt injections would be embedded within legitimate API calls.
Detection Difficulty: Hard. Detecting prompt injection within local CLAUDE.md files requires semantic analysis of the text, which standard signature-based tools cannot easily perform.

Required Log Sources

npm audit logs
Process execution logs
File integrity monitoring (FIM)

Hunting Hypotheses

Hypothesis	Telemetry	ATT&CK Stage	FP Risk
Developers may be downloading and executing unofficial or tampered versions of the Claude Code CLI from third-party repositories.	Process execution logs, network connections to unofficial GitHub repositories, and npm installation logs.	Execution	Low
Malicious actors may attempt to exploit Claude Code by embedding prompt injections in CLAUDE.md files within shared repositories.	File creation/modification events for CLAUDE.md, specifically looking for hidden instructions or guardrail bypass commands.	Defense Evasion	Medium
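
The second hypothesis can be approximated with a coarse text scan. The sketch below is a heuristic triage aid, not a detection rule from the article: the pattern set (hidden HTML comments, zero-width characters, instruction-override phrases, download-and-pipe-to-shell commands) and the names `SUSPICIOUS_PATTERNS`, `scan_text`, and `scan_repo` are illustrative assumptions. As the assessment above notes, reliable detection needs semantic analysis, so expect both misses and false positives.

```python
import re
from pathlib import Path

# Illustrative heuristics only; these patterns are assumptions, not a vetted rule set.
SUSPICIOUS_PATTERNS = {
    "hidden_html_comment": re.compile(r"<!--.*?-->", re.DOTALL),
    "zero_width_chars": re.compile("[\u200b\u200c\u200d\u2060]"),
    "override_phrase": re.compile(
        r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE
    ),
    "pipe_to_shell": re.compile(
        r"(curl|wget)\b[^\n|]*\|\s*(ba|z)?sh\b", re.IGNORECASE
    ),
}


def scan_text(text):
    """Return the sorted names of all heuristics that match the given text."""
    return sorted(
        name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)
    )


def scan_repo(root):
    """Scan every CLAUDE.md under root and return {path: [matched heuristics]}."""
    findings = {}
    for path in Path(root).rglob("CLAUDE.md"):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it
        hits = scan_text(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

A hit is a prompt for manual review of the file, ideally alongside its version-control history, rather than proof of an injection attempt.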

Control Gaps

LLM Prompt Injection detection
Supply chain verification for AI development tools

Key Behavioral Indicators

Execution of Claude Code from non-official npm paths
Unexpected modifications to CLAUDE.md containing suspicious system instructions

False Positive Assessment

Recommendations

Immediate Mitigation

Ensure developers are only using the official @anthropic-ai/claude-code package.
Audit internal projects for malicious CLAUDE.md files that could trigger prompt injections.
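
The first mitigation can be partly automated by inspecting lockfiles. The sketch below assumes projects commit a `package-lock.json` with `lockfileVersion` 2 or 3, where installed packages appear under the `packages` key with a `resolved` URL. It also assumes the public npm registry is the only approved source; `TRUSTED_PREFIX` should be changed if an internal proxy is used. The function names and the look-alike check (any other package whose base name is `claude-code`) are illustrative choices, not guidance from the article.

```python
import json

OFFICIAL_NAME = "@anthropic-ai/claude-code"
# Assumption: the public npm registry is the only approved install source.
TRUSTED_PREFIX = "https://registry.npmjs.org/"


def audit_lockfile(lock):
    """Return (install_path, reason) pairs for suspicious Claude Code entries.

    `lock` is the parsed content of a package-lock.json (lockfileVersion 2 or 3).
    """
    findings = []
    for install_path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/@scope/name", possibly nested.
        name = install_path.rpartition("node_modules/")[2]
        resolved = meta.get("resolved", "")
        if name == OFFICIAL_NAME:
            if not resolved.startswith(TRUSTED_PREFIX):
                findings.append((install_path, "untrusted source: " + resolved))
        elif name.rsplit("/", 1)[-1] == "claude-code":
            # Same base name under a different scope (or no scope).
            findings.append((install_path, "look-alike package name"))
    return findings


def audit_lockfile_path(path):
    """Load a package-lock.json from disk and audit it."""
    with open(path, encoding="utf-8") as fh:
        return audit_lockfile(json.load(fh))
```

An empty result only means the lockfile points at the expected registry under the expected name. It does not verify the package contents, and it does not cover globally installed copies, which leave no project lockfile entry.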

Infrastructure Hardening

Implement strict egress filtering for development environments to prevent unauthorized code exfiltration or C2 communication from compromised AI agents.

User Protection

Educate developers on the risks of downloading unofficial mirrors or reimplementations of leaked software.

Security Awareness

Train engineering teams on AI prompt injection risks and secure usage of autonomous coding agents.

MITRE ATT&CK Mapping

T1195.002 - Supply Chain Compromise: Compromise Software Supply Chain
T1562.001 - Impair Defenses: Disable or Modify Tools
T1588 - Obtain Capabilities

Additional IOCs

File Paths:
- QueryEngine.ts - Core engine file exposed in the leak; handles the LLM interaction lifecycle.
- Tool.ts - Core file exposed in the leak; handles tool execution.
- commands.ts - Core file exposed in the leak; handles the command system.
- CLAUDE.md - Project-level context file used by Claude Code, identified as a potential vector for prompt injection attacks.