2026-06-01 | 5 min | critical

Pwn2Own Berlin 2026: On the Ground With TrendAI™ ZDI's Biggest AI Showdown Yet

At Pwn2Own Berlin 2026, security researchers demonstrated 47 unique zero-day vulnerabilities across AI platforms and traditional enterprise software. Notable exploits included root-level code execution in AI agents via trust boundary failures, a SYSTEM-level RCE in Microsoft Exchange, a pre-authentication RCE in SharePoint, and a cross-tenant guest-to-host escape in VMware ESXi.

Conf: high | Analyzed: 2026-06-01 | Google

Detection / Hunter | Google

What Happened

At the Pwn2Own Berlin 2026 hacking competition, security researchers demonstrated 47 previously unknown vulnerabilities in popular software and AI tools. Major AI platforms and established enterprise systems such as Microsoft Exchange, SharePoint, and VMware ESXi were successfully compromised. The results show that both emerging AI technologies and mature software contain critical flaws that attackers could exploit. Organizations should keep their security monitoring up to date, watch for vendor patches over the next 90 days, and apply them as soon as they become available.

Key Takeaways

Pwn2Own Berlin 2026 yielded 47 unique zero-days across 10 categories, with a record $1.29M in payouts.
AI products (OpenAI Codex, LiteLLM, LM Studio, NVIDIA Megatron Bridge) were heavily targeted and consistently failed because of 'trust boundary' issues when interacting with external tools.
Critical enterprise vulnerabilities were demonstrated, including a SYSTEM-level RCE in Microsoft Exchange, a pre-authentication RCE in SharePoint, and a cross-tenant guest-to-host escape in VMware ESXi.
AI is now actively used by researchers for vulnerability discovery and exploit development, accelerating the exploit lifecycle.
TrendAI released network security filters for several of the demonstrated vulnerabilities ahead of official vendor patches.

Affected Systems

OpenAI Codex
LiteLLM
LM Studio
NVIDIA Megatron Bridge
Microsoft Exchange
Microsoft SharePoint
Microsoft Edge
VMware ESXi
Claude
Cursor
NVIDIA Container Toolkit
Oracle AI Database
Windows 11
Red Hat Enterprise Linux

Attack Chain

Researchers used AI-assisted tooling to discover and exploit vulnerabilities across multiple platforms. Against AI agents, they exploited trust boundary failures in which the agent unconditionally trusted external tools or protocols, leading to root-level code execution. Against enterprise systems, they chained logic bugs to escape browser sandboxes, sent unauthenticated HTTP requests to achieve remote code execution on web servers, and exploited memory safety failures to escape from guest to host on hypervisors.
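The source does not publish details of the individual bugs, so the following is only a generic illustration of the trust boundary idea, not a reconstruction of any exploit shown at the event. It sketches a gate that an agent runtime could apply before executing a command suggested by an external tool. The allowlist contents and the function name is_permitted are hypothetical.

```python
import shlex

# Hypothetical allowlist: the only programs the agent may launch on behalf
# of an external tool. A real deployment would define its own set.
ALLOWED_COMMANDS = {"git", "ls", "cat"}

# Characters that let one command line smuggle in a second command.
SHELL_METACHARACTERS = set(";|&`$><\n")


def is_permitted(command_line: str) -> bool:
    """Return True only if the command is a single allowlisted program call."""
    # Reject anything that could chain, pipe, or substitute commands.
    if any(ch in SHELL_METACHARACTERS for ch in command_line):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes and similar parse errors are treated as untrusted.
        return False
    # An empty command line is rejected; otherwise check the program name.
    return bool(argv) and argv[0] in ALLOWED_COMMANDS
```

The point of the sketch is the default: input that crosses the boundary from an external tool is rejected unless it matches an explicit policy, rather than executed unless it looks suspicious.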

Detection Availability

YARA Rules: No
Sigma Rules: No
Snort/Suricata Rules: No
KQL Queries: No
Splunk SPL Queries: No
EQL Queries: No
Other Detection Logic: Yes
Platforms: TrendAI TippingPoint

TrendAI has released TippingPoint network security filters to detect and block exploitation of several vulnerabilities demonstrated at the event.

Detection Engineering Assessment

EDR Visibility: Medium. EDR can detect post-exploitation activities such as anomalous shell spawns from AI agents, but may lack visibility into the initial logic or memory corruption exploits.
Network Visibility: Medium. Network inspection is effective for unencrypted web exploits (e.g., SharePoint) via IPS filters, but visibility into local AI agent traffic or encrypted hypervisor communications is limited.
Detection Difficulty: Hard. These are newly discovered zero-day vulnerabilities with no public proof-of-concept code available yet, making signature-based detection difficult outside of vendor-provided virtual patches.

Required Log Sources

Process Creation (Event ID 4688 / Sysmon 1)
Network Traffic Logs
File System Integrity Logs

Hunting Hypotheses

Hypothesis	Telemetry	ATT&CK Stage	FP Risk
Consider hunting for anomalous child processes (e.g., cmd.exe, sh, bash) spawning from AI agent runtimes or local LLM processes.	Process Creation (Event ID 4688 / Sysmon 1)	Execution	Low
Evaluate network telemetry for unexpected inbound HTTP requests to SharePoint servers that result in unusual w3wp.exe child processes.	Web Server Logs, Process Creation	Initial Access	Medium
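Both hypotheses above reduce to a parent/child process check, which can be sketched over already-parsed process creation events (Event ID 4688 or Sysmon 1). This is a minimal sketch under stated assumptions: the ProcessEvent fields, the AI parent process names, and the choice to treat only shell children as "unusual" are hypothetical and need tuning to the runtimes and log schema in your environment.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical parent image names for AI agent runtimes; replace with the
# binaries actually deployed in your environment.
AI_PARENT_NAMES = {"codex", "litellm", "lm-studio", "cursor"}
# IIS worker process that hosts SharePoint web applications.
WEB_WORKER_NAMES = {"w3wp.exe"}
# Child processes treated as suspicious in this sketch (an assumption).
SHELL_NAMES = {"cmd.exe", "powershell.exe", "sh", "bash"}


@dataclass
class ProcessEvent:
    """Subset of fields from a process creation record (assumed schema)."""
    parent_image: str
    image: str
    command_line: str


def basename(path: str) -> str:
    """Return the lowercase file name from a Windows or POSIX path."""
    return path.replace("\\", "/").rsplit("/", 1)[-1].lower()


def classify(event: ProcessEvent) -> Optional[str]:
    """Label shell spawns from AI runtimes or w3wp.exe; None otherwise."""
    if basename(event.image) not in SHELL_NAMES:
        return None
    parent = basename(event.parent_image)
    if parent in AI_PARENT_NAMES:
        return "ai-agent-shell-spawn"
    if parent in WEB_WORKER_NAMES:
        return "w3wp-shell-spawn"
    return None
```

A hunt would run classify over each exported event and review the labeled hits, for example classify(ProcessEvent("/usr/local/bin/codex", "/bin/bash", "bash -c id")). Coding agents can spawn shells legitimately, so expect to baseline normal command lines before alerting.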

Control Gaps

Lack of network segmentation for AI tools and local inference servers
Insufficient process-lineage monitoring for AI agents
Missing hypervisor-level behavioral monitoring for cross-tenant activity

Key Behavioral Indicators

Unexpected shell spawns from AI agent processes
Cross-tenant modifications in hypervisor environments

False Positive Assessment

Recommendations

Immediate Mitigation

Verify against your organization's incident response runbook and team escalation paths before acting.
If using TrendAI TippingPoint, ensure the latest filter package is applied to protect against the demonstrated LiteLLM, Edge, Exchange, and SharePoint vulnerabilities.
Monitor vendor security advisories closely over the next 90 days to apply official patches for the affected systems as they are released.

Infrastructure Hardening

Consider implementing strict network segmentation for AI coding agents and local inference servers to limit their access to sensitive internal resources.
Evaluate hypervisor hardening and segmentation strategies for multi-tenant virtualization environments to mitigate the impact of guest-to-host escapes.

User Protection

Consider deploying endpoint behavioral monitoring with process-lineage alerting specifically tailored for AI applications.
Evaluate file-system integrity monitoring on directories where AI agents have write access.
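File-system integrity monitoring on an agent-writable directory can be approximated with a hash baseline and a later comparison. The following is a minimal sketch, not a replacement for a dedicated FIM product; the function names snapshot and diff_snapshots are hypothetical, and it assumes the directory is small enough to hash in full.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(
    baseline: Dict[str, str], current: Dict[str, str]
) -> Dict[str, List[str]]:
    """Report files added, removed, or modified since the baseline."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    modified = sorted(
        p for p in set(baseline) & set(current) if baseline[p] != current[p]
    )
    return {"added": added, "removed": removed, "modified": modified}
```

In practice the baseline would be taken while the agent is idle and stored outside the directory the agent can write to, so that a compromised agent cannot rewrite it.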

Security Awareness

Consider updating security training for developers to highlight the risks of AI agents interacting with untrusted external tools and protocols (the trust boundary problem).

MITRE ATT&CK Mapping

T1190 - Exploit Public-Facing Application
T1068 - Exploitation for Privilege Escalation
T1611 - Escape to Host
T1203 - Exploitation for Client Execution