4 min | high

The Top 10 Threats Related to Agent Skills

Anthropic's new 'Agent Skills' feature, which uses progressive disclosure to manage AI agent context windows, introduces a novel attack surface. The article outlines the top 10 critical threats to this ecosystem, including prompt injection, supply chain manipulation, and unauthorized code execution. The recent OpenClaw malware incident illustrates the risk.

Confidence: medium | Analyzed: 2026-03-21
Actors: OpenClaw

Source: Akamai

Key Takeaways

  • Anthropic's 'Agent Skills' introduce a new attack surface through progressive disclosure of reusable playbooks (SKILL.md).
  • The 'OpenClaw' incident demonstrated this risk: a popular Skill acted as malware, exfiltrating SSH keys and crypto wallets and establishing a reverse shell.
  • Top threats include indirect prompt injection, supply chain manipulation, unexpected code execution, and memory abuse.
  • Security practitioners must treat Skills as a distinct trust boundary, apply least-privilege access, and restrict Skills to allowlisted registries.

Affected Systems

  • AI Agents
  • Anthropic Agent Skills
  • Model Context Protocol (MCP)

Attack Chain

Attackers can exploit AI Agent Skills through vectors such as supply chain manipulation (typosquatting) or dynamic discovery to inject malicious SKILL.md files. Once the agent activates the Skill via progressive disclosure, embedded malicious instructions or helper scripts execute. This can lead to indirect prompt injection, unauthorized data access (e.g., .env or .ssh folders), exfiltration of sensitive data, or a reverse shell, as demonstrated in the OpenClaw incident.
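
The typosquatting vector in this chain can be screened for before a Skill is installed. The following is a minimal sketch, not a method from the source article: the Skill names in `KNOWN_SKILLS`, the function name, and the 0.85 similarity threshold are all hypothetical and would need tuning against a real registry.

```python
from difflib import SequenceMatcher

# Hypothetical set of Skill names the organisation already trusts.
KNOWN_SKILLS = {"pdf-report-builder", "git-changelog", "csv-cleaner"}


def typosquat_candidates(name, known=KNOWN_SKILLS, threshold=0.85):
    """Return trusted names that `name` closely resembles without matching exactly."""
    name = name.lower()
    if name in known:
        return []  # exact match with a trusted name, nothing to flag
    return sorted(
        k for k in known
        if SequenceMatcher(None, name, k).ratio() >= threshold
    )
```

A non-empty result means the requested name is suspiciously close to a trusted one and is worth a manual review before installation.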

Detection Availability

  • YARA Rules: No
  • Sigma Rules: No
  • Snort/Suricata Rules: No
  • KQL Queries: No
  • Splunk SPL Queries: No
  • EQL Queries: No
  • Other Detection Logic: No

No specific detection rules or queries are provided in the article.

Detection Engineering Assessment

  • EDR Visibility: Medium. EDR can detect the execution of malicious helper scripts (Python, JS, Shell) and reverse shells spawned by the AI agent process.
  • Network Visibility: Medium. Network monitoring can detect reverse shell connections or exfiltration of data to attacker-controlled servers.
  • Detection Difficulty: Hard. Distinguishing legitimate AI agent actions from malicious Skill executions requires deep context into the agent's expected behavior and strict boundary monitoring.

Required Log Sources

  • Process Creation Logs
  • Network Connection Logs
  • File Access Logs

Hunting Hypotheses

| Hypothesis | Telemetry | ATT&CK Stage | FP Risk |
| --- | --- | --- | --- |
| Look for AI agent processes spawning unexpected interactive shells or scripting interpreters (Python, Node.js) that initiate outbound network connections. | Process Creation, Network Connections | Execution | Medium |
| Monitor for AI agent processes accessing sensitive local files such as .env files, SSH keys, or browser cookies. | File Access | Credential Access | Low |
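
The first hypothesis can be expressed as a join between process-creation and network telemetry. The sketch below assumes events have already been normalised into dictionaries; the field names (`pid`, `parent_name`, `process_name`, `direction`) and the agent process name `agent-runtime` are hypothetical and will differ per EDR or log schema.

```python
# Hypothetical name of the AI agent's host process.
AGENT_PROCESSES = {"agent-runtime"}
# Shells and scripting interpreters named in the hypothesis.
INTERPRETERS = {"sh", "bash", "zsh", "python", "python3", "node"}


def suspicious_children(process_events, network_events):
    """Return agent-spawned shells/interpreters that also opened outbound connections."""
    outbound_pids = {
        e["pid"] for e in network_events if e.get("direction") == "outbound"
    }
    return [
        e for e in process_events
        if e["parent_name"] in AGENT_PROCESSES
        and e["process_name"] in INTERPRETERS
        and e["pid"] in outbound_pids
    ]
```

Matching on process ID alone is a simplification; a production query would also constrain the time window and host, since PIDs are reused.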

Control Gaps

  • Lack of strict validation between Skill descriptions and runtime behavior
  • Implicit trust chains between Skills (Skill-to-Skill data flow)
  • Dynamic Skill discovery bypassing load-time detection

Key Behavioral Indicators

  • Agent process spawning reverse shells
  • Unexpected access to .ssh or .env directories by agent processes
  • High-latency API calls or premium-tier model queries causing resource exhaustion
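
The second indicator above (agent access to .ssh or .env locations) reduces to a path classification over file-access events. This is a minimal sketch under the assumption that paths are POSIX-style; the function name and the choice to also match variants such as `.env.production` are illustrative, not taken from the source.

```python
from pathlib import PurePosixPath

# Directory names treated as sensitive wherever they appear in a path.
SENSITIVE_DIRS = {".ssh"}


def is_sensitive(path):
    """Return True if the path is inside a .ssh directory or is a .env file."""
    p = PurePosixPath(path)
    if SENSITIVE_DIRS & set(p.parts):
        return True
    # Match ".env" itself and dotted variants such as ".env.production".
    return p.name == ".env" or p.name.startswith(".env.")
```

Applied to file-access logs filtered by the agent's process, any `True` result is a candidate for the Credential Access hypothesis.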

False Positive Assessment

  • Low

Recommendations

Immediate Mitigation

  • Scan existing Agent Skills for malicious code or unexpected dependencies before use.
  • Review and restrict permissions granted to AI agents and their associated Skills.
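
As a starting point for scanning existing Skills, the following sketch walks a Skill folder and flags text that matches a few suspicious patterns. The article provides no detection rules, so the patterns, labels, and function names here are assumptions chosen to mirror the behaviors it describes (SSH key and .env access, reverse shells, piping downloads into a shell). They are illustrative and far from exhaustive.

```python
import re
from pathlib import Path

# Illustrative, non-exhaustive patterns (assumptions, not rules from the article).
SUSPICIOUS = {
    "ssh-key-access": re.compile(r"\.ssh/"),
    # ".env" not preceded by a word character or dot, so "process.env" is ignored.
    "env-file-access": re.compile(r"(?<![\w.])\.env\b"),
    "bash-reverse-shell": re.compile(r"/dev/tcp/"),
    "pipe-to-shell": re.compile(r"curl[^\n|]*\|\s*(ba)?sh\b"),
}


def scan_text(text):
    """Return the sorted labels of every pattern found in the text."""
    return sorted(label for label, rx in SUSPICIOUS.items() if rx.search(text))


def scan_skill_dir(root):
    """Scan every file under a Skill folder; map relative path -> matched labels."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            hits = scan_text(path.read_text(encoding="utf-8", errors="ignore"))
            if hits:
                findings[path.relative_to(root).as_posix()] = hits
    return findings
```

A pattern scan like this only catches unobfuscated content; it complements, rather than replaces, sandboxed execution and permission review.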

Infrastructure Hardening

  • Isolate Skill memory and execution environments using sandboxing.
  • Restrict Skill sourcing to allowlisted, trusted registries only.
  • Structure and validate Skill-to-Skill data flows.
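
Restricting Skill sourcing to allowlisted registries can be enforced with a check on the source URL before any download. This is a minimal sketch; the registry hostname `skills.internal.example` and the function name are hypothetical, and the HTTPS requirement is an added assumption.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of trusted registry hostnames.
ALLOWED_REGISTRIES = {"skills.internal.example"}


def is_allowed_source(url, allowed=ALLOWED_REGISTRIES):
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed
```

Comparing the parsed hostname exactly, rather than checking for a substring or prefix, avoids lookalike hosts such as `skills.internal.example.evil.test` and userinfo tricks such as `https://skills.internal.example@evil.test/`.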

User Protection

  • Apply least-privilege access to each Skill.
  • Require human-in-the-loop confirmation for high-impact actions initiated by agents.
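
A human-in-the-loop gate can be sketched as a wrapper that refuses to run high-impact actions without explicit approval. The action names in `HIGH_IMPACT`, the function name, and the callback signatures are hypothetical; a real agent framework would supply its own hooks.

```python
# Hypothetical names of actions considered high impact.
HIGH_IMPACT = {"delete_files", "send_payment", "run_shell"}


def execute_action(action, handler, confirm):
    """Run handler() unless the action is high impact and confirm(action) is falsy."""
    if action in HIGH_IMPACT and not confirm(action):
        return {"action": action, "status": "denied"}
    return {"action": action, "status": "done", "result": handler()}
```

In practice `confirm` would prompt an operator, for example through a chat message or ticket, and the handler is never invoked when approval is withheld.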

Security Awareness

  • Educate development teams on the risks of progressive disclosure and prompt injection in AI agent architectures.
  • Incorporate AI Skill threat modeling (e.g., the Top 10 list) into secure SDLC processes.

MITRE ATT&CK Mapping

  • T1195.002 - Supply Chain Compromise: Compromise Software Supply Chain
  • T1059 - Command and Scripting Interpreter
  • T1552.004 - Unsecured Credentials: Private Keys
  • T1048 - Exfiltration Over Alternative Protocol