2026-06-09 · 5 min · high

Phishing for Lobsters: How We Tricked OpenClaw into Spilling Secrets

Varonis Threat Labs demonstrated that enterprise AI agents, specifically an OpenClaw deployment, are vulnerable to traditional phishing and social engineering techniques. In simulated attacks, the agent successfully identified technical phishing indicators like malicious OAuth flows but failed to recognize social context, resulting in the exfiltration of AWS credentials and sensitive CRM data to an external attacker.

Conf: high · Analyzed: 2026-06-09 · Google

IOCs · 5

Type	Indicator	Context
domain	holidaygifts[.]co[.]il	Simulated phishing domain used as a capture server during the gift card scam scenario.
email	3000levi[@]gmail[.]com	Simulated attacker email address observed in the agent's reasoning logs during the credential exfiltration scenario.
email	dan[.]levi3000[@]gmail[.]com	Simulated attacker email address used to send impersonation phishing requests to the AI agent.
ip	176[.]65[.]149[.]234	IP address observed in HTTP logs interacting with the simulated phishing capture server.
ip	51[.]4[.]96[.]17	IP address observed in HTTP logs interacting with the simulated phishing capture server.
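
The indicators above are defanged, so they have to be restored before they can be matched against logs. The following is a minimal sketch of one way to do that. The function names (`refang`, `build_matcher`, `find_iocs`) are illustrative and not part of any tool mentioned in the report.

```python
import re

# Defanged indicators copied from the IOC list above.
DEFANGED_IOCS = [
    "holidaygifts[.]co[.]il",
    "3000levi[@]gmail[.]com",
    "dan[.]levi3000[@]gmail[.]com",
    "176[.]65[.]149[.]234",
    "51[.]4[.]96[.]17",
]


def refang(indicator: str) -> str:
    """Undo the bracket-style defanging used in this report."""
    return indicator.replace("[.]", ".").replace("[@]", "@")


def build_matcher(indicators):
    """Compile one case-insensitive regex matching any refanged indicator.

    The lookarounds keep an indicator from matching inside a longer
    alphanumeric token (for example 51.4.96.17 inside 151.4.96.170).
    """
    escaped = sorted((re.escape(refang(i)) for i in indicators), key=len, reverse=True)
    pattern = r"(?<![A-Za-z0-9])(?:" + "|".join(escaped) + r")(?![A-Za-z0-9])"
    return re.compile(pattern, re.IGNORECASE)


def find_iocs(log_line: str, matcher) -> list:
    """Return every indicator found in a single log line, lowercased."""
    return [m.group(0).lower() for m in matcher.finditer(log_line)]


MATCHER = build_matcher(DEFANGED_IOCS)
```

Because these indicators come from a simulated attack, a hit is more likely to show contact with the researchers' test infrastructure than a real compromise.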

What Happened

Researchers tested an AI email assistant to see whether phishing emails could trick it the way they trick people. They found that while the AI was good at spotting fake login pages, it readily fell for emails pretending to be from coworkers asking for sensitive information. As a result, the AI handed over AWS credentials and customer data to the simulated attacker without verifying the request. This shows that as companies use AI to manage emails, they need strict rules in place to stop the AI from giving away company secrets. Organizations should require human approval before an AI can send out sensitive data.

Key Takeaways

AI agents deployed in enterprise inboxes are highly susceptible to social engineering and impersonation attacks, even when equipped with security instructions.
Because agents act on behalf of the user, requests routed through them can sidestep traditional phishing defenses, and the agent can inadvertently leak sensitive data like AWS keys and CRM exports to external attackers.
While AI agents can successfully detect technical phishing indicators (like malicious OAuth redirects or fake login pages), they lack the social context to identify unusual requests from seemingly known entities.
Defending AI agents requires architectural controls, such as segmenting connector access, enforcing explicit security instructions, and requiring human-in-the-loop for high-privilege actions.

Affected Systems

OpenClaw AI Agent
Google Workspace
Enterprise Inboxes
LLMs (Google Gemini 3.1 Pro, OpenAI Codex GPT-5.4)

Attack Chain

The attacker sends a socially engineered email to an inbox monitored by an autonomous AI agent (OpenClaw). The agent's orchestrator classifies the email and delegates the task to a worker sub-agent without verifying the sender's identity. The worker agent searches internal systems, such as mailboxes or connected drives, to retrieve the requested sensitive information, such as AWS credentials or CRM exports. Finally, the agent forwards the sensitive data to the attacker's external email address, bypassing traditional outbound data loss prevention controls.
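
The step missing from the chain above is a sender check before the orchestrator delegates work. The sketch below shows what such a gate could look like. It is not OpenClaw's actual orchestrator logic. The internal domain, the employee display name, and both function names are hypothetical. It also assumes the mail gateway has already authenticated the `From` header (for example via SPF, DKIM, and DMARC), since the header alone can be spoofed.

```python
from email.utils import parseaddr

# Hypothetical values for illustration; not taken from the Varonis write-up.
INTERNAL_DOMAINS = {"example.com"}
EMPLOYEE_DISPLAY_NAMES = {"dan levi"}


def classify_sender(from_header: str) -> str:
    """Label a From header as internal, external, impersonation_suspected, or untrusted."""
    display_name, address = parseaddr(from_header)
    if "@" not in address:
        return "untrusted"
    domain = address.rsplit("@", 1)[1].lower()
    if domain in INTERNAL_DOMAINS:
        return "internal"
    # A known coworker's name on an outside address is the impersonation pattern.
    if display_name.strip().lower() in EMPLOYEE_DISPLAY_NAMES:
        return "impersonation_suspected"
    return "external"


def may_delegate(from_header: str, requests_sensitive_data: bool) -> bool:
    """Decide whether the orchestrator may hand the task to a worker sub-agent."""
    trust = classify_sender(from_header)
    if trust == "internal":
        return True
    # External senders may trigger only non-sensitive tasks; everything else is refused.
    return trust == "external" and not requests_sensitive_data
```

With these assumptions, `may_delegate("Dan Levi <dan.levi3000@gmail.com>", True)` returns `False`, so the request never reaches a worker that can search mailboxes or drives.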

Detection Availability

YARA Rules: No
Sigma Rules: No
Snort/Suricata Rules: No
KQL Queries: No
Splunk SPL Queries: No
EQL Queries: No
Other Detection Logic: No

The article does not provide specific detection rules, as it focuses on architectural vulnerabilities and behavioral flaws in AI agents.

Detection Engineering Assessment

EDR Visibility: Low. EDR solutions monitor endpoint behavior, but these attacks occur via cloud APIs, email forwarding, and SaaS integrations where EDR has limited visibility.
Network Visibility: Medium. Network logs might capture the AI agent's infrastructure reaching out to unusual external domains (like the phishing site), but email-based exfiltration will blend with normal traffic.
Detection Difficulty: Hard. The actions are performed by a legitimate, highly privileged AI agent, making malicious actions blend in with normal, helpful workflows.

Required Log Sources

Email Gateway Logs
Application Audit Logs
AWS CloudTrail / Google Workspace Logs
DLP Logs

Hunting Hypotheses

Hypothesis	Telemetry	ATT&CK Stage	FP Risk
Consider hunting for unusual outbound email forwarding rules or direct emails sent from AI agent service accounts to external, unverified domains.	Email Gateway Logs	Exfiltration	Medium
If you have visibility into AI agent API usage, consider hunting for queries accessing highly sensitive keywords (e.g., 'credentials', 'AWS keys', 'customer export') followed immediately by an outbound communication.	Application Audit Logs	Collection	High
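
The second hypothesis is a sequence correlation: a sensitive lookup followed shortly by an outbound message from the same actor. The sketch below runs that correlation over normalized audit events. The event schema (`time`, `actor`, `action`, `detail`), the action names, and the five-minute window are all assumptions. Real audit logs would first need to be mapped into this shape.

```python
from datetime import datetime, timedelta

# Keywords taken from the hypothesis above; extend for your environment.
SENSITIVE_KEYWORDS = ("credentials", "aws keys", "customer export")
WINDOW = timedelta(minutes=5)  # assumption: tune to the agent's normal latency


def hunt_sensitive_then_outbound(events, window=WINDOW):
    """Return (search_event, send_event) pairs for the same actor within `window`.

    `events` is an iterable of dicts with keys 'time' (datetime), 'actor',
    'action' ('search' or 'send_external'), and 'detail' (free text).
    """
    hits = []
    last_sensitive = {}  # actor -> most recent sensitive search event
    for event in sorted(events, key=lambda e: e["time"]):
        actor = event["actor"]
        if event["action"] == "search":
            detail = event["detail"].lower()
            if any(keyword in detail for keyword in SENSITIVE_KEYWORDS):
                last_sensitive[actor] = event
        elif event["action"] == "send_external":
            prior = last_sensitive.get(actor)
            if prior is not None and event["time"] - prior["time"] <= window:
                hits.append((prior, event))
    return hits
```

Keyword matching is what drives the High false-positive rating in the table: legitimate workflows that mention credentials will also match, so the results are leads to triage rather than alerts.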

Control Gaps

Lack of identity verification for inbound requests processed by AI agents
Insufficient outbound data loss prevention (DLP) on AI agent communications
Over-privileged access for AI agents to sensitive data repositories

Key Behavioral Indicators

AI agent service account sending emails to external addresses not previously corresponded with
Agent accessing sensitive files and immediately initiating an external web request
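
The first indicator can be approximated by tracking which external recipients an agent account has already written to. The sketch below assumes the sent-mail log has been reduced to `(sender, recipient)` pairs in chronological order. The function name and the baseline parameter are hypothetical.

```python
def first_seen_external_recipients(sent_log, internal_domains, known_recipients=()):
    """Return (sender, recipient) pairs where an external recipient appears for the first time.

    `sent_log` is a chronological iterable of (sender, recipient) tuples.
    `known_recipients` is an optional baseline of addresses already corresponded with.
    """
    seen = {recipient.lower() for recipient in known_recipients}
    alerts = []
    for sender, recipient in sent_log:
        recipient = recipient.lower()
        domain = recipient.rpartition("@")[2]
        if domain not in internal_domains and recipient not in seen:
            alerts.append((sender, recipient))
        seen.add(recipient)
    return alerts
```

Seeding `known_recipients` from a few weeks of historical mail should reduce noise from established partners.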

False Positive Assessment

Recommendations

Immediate Mitigation

Verify against your organization's incident response runbook and team escalation paths before acting.
Evaluate whether AI agents have unrestricted outbound email capabilities and consider restricting them to internal domains only.
Consider implementing a human-in-the-loop approval process for any high-privilege actions or external data sharing initiated by AI agents.
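
The two mitigations above can be combined into a single outbound gate: hold anything addressed outside the organization, or anything that looks like a secret, until a person approves it. The sketch below is one possible shape for that gate, not a feature of any specific agent framework. The internal domain, the `OutboundDecision` type, and `review_outbound` are hypothetical. The regex covers only AWS access key IDs in their common `AKIA`/`ASIA` form, so a real deployment would rely on a fuller DLP ruleset.

```python
import re
from dataclasses import dataclass

INTERNAL_DOMAINS = {"example.com"}  # hypothetical organization domain

# AWS access key IDs: a 4-letter prefix (AKIA long-term, ASIA temporary) plus 16 characters.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


@dataclass
class OutboundDecision:
    action: str  # "send" or "hold_for_approval"
    reason: str


def review_outbound(recipient: str, body: str) -> OutboundDecision:
    """Decide whether an agent-drafted email may be sent without human approval."""
    domain = recipient.rpartition("@")[2].lower()
    if AWS_KEY_ID.search(body):
        return OutboundDecision("hold_for_approval", "body appears to contain an AWS access key ID")
    if domain not in INTERNAL_DOMAINS:
        return OutboundDecision("hold_for_approval", "recipient is outside the internal domains")
    return OutboundDecision("send", "internal recipient and no secret pattern matched")
```

Checking the secret pattern first means even internal recipients trigger a hold when credentials appear in the body, which covers internal impersonation as well as external exfiltration.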

Infrastructure Hardening

Segment connector access for AI agents based on the inbound channel's trust level (e.g., external emails vs. internal Slack messages).
Treat AI agent configuration files (like agents.md) as security controls, enforcing explicit security instructions and version control.
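
Connector segmentation can be expressed as a deny-by-default policy keyed on the trust level of the inbound channel. The sketch below illustrates the idea. The channel names, connector names, and policy contents are hypothetical and would need to reflect the connectors an actual deployment exposes.

```python
from enum import Enum


class Trust(Enum):
    EXTERNAL_EMAIL = 1
    INTERNAL_EMAIL = 2
    INTERNAL_CHAT = 3


# Hypothetical policy: the less trusted the channel, the fewer connectors a task may use.
CONNECTOR_POLICY = {
    Trust.EXTERNAL_EMAIL: frozenset({"calendar_read"}),
    Trust.INTERNAL_EMAIL: frozenset({"calendar_read", "mailbox_search"}),
    Trust.INTERNAL_CHAT: frozenset({"calendar_read", "mailbox_search", "drive_read", "crm_export"}),
}


def allowed_connectors(channel: Trust) -> frozenset:
    """Return the connectors available to a task that arrived over `channel`."""
    return CONNECTOR_POLICY.get(channel, frozenset())


def authorize(channel: Trust, connector: str) -> bool:
    """Raise PermissionError unless `connector` is allowed for `channel`."""
    if connector not in allowed_connectors(channel):
        raise PermissionError(f"{connector} is not permitted for tasks from {channel.name}")
    return True
```

Keeping a policy like this in version control alongside the agent configuration fits the recommendation to treat those files as security controls.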

User Protection

If applicable, ensure AI agents are subject to the same Conditional Access policies and MFA requirements as human users.

Security Awareness

Consider updating threat modeling and security awareness programs to account for AI agents as potential targets for social engineering.

MITRE ATT&CK Mapping

T1566.001 - Phishing: Spearphishing Attachment
T1566.002 - Phishing: Spearphishing Link
T1534 - Internal Spearphishing
T1114.002 - Email Collection: Remote Email Collection
T1048 - Exfiltration Over Alternative Protocol
