Pwn2Own Berlin 2026: On the Ground With TrendAI™ ZDI's Biggest AI Showdown Yet
At Pwn2Own Berlin 2026, security researchers demonstrated 47 unique zero-day vulnerabilities across AI platforms and traditional enterprise software. Notable exploits included root-level code execution in AI agents via trust boundary failures, a SYSTEM-level RCE in Microsoft Exchange, a pre-authentication RCE in SharePoint, and a cross-tenant guest-to-host escape in VMware ESXi.
What Happened
At the Pwn2Own Berlin 2026 hacking competition, security researchers demonstrated 47 previously unknown vulnerabilities in popular software and AI tools. Contestants successfully compromised major AI platforms as well as traditional enterprise systems such as Microsoft Exchange, SharePoint, and VMware ESXi. The results show that both emerging AI technologies and established software contain critical flaws that attackers could exploit. Organizations should keep their security monitoring up to date and apply vendor patches as they are released over the next 90 days.
Key Takeaways
- Pwn2Own Berlin 2026 yielded 47 unique zero-days across 10 categories, with a record $1.29M in payouts.
- AI products (OpenAI Codex, LiteLLM, LM Studio, NVIDIA Megatron Bridge) were heavily targeted, consistently failing due to 'trust boundary' issues when interacting with external tools.
- Critical enterprise vulnerabilities were demonstrated, including a SYSTEM-level RCE in Microsoft Exchange, a pre-authentication RCE in SharePoint, and a cross-tenant guest-to-host escape in VMware ESXi.
- AI is now actively used by researchers for vulnerability discovery and exploit development, accelerating the exploit lifecycle.
- TrendAI released network security filters for several of the demonstrated vulnerabilities ahead of official vendor patches.
Affected Systems
- OpenAI Codex
- LiteLLM
- LM Studio
- NVIDIA Megatron Bridge
- Microsoft Exchange
- Microsoft SharePoint
- Microsoft Edge
- VMware ESXi
- Claude
- Cursor
- NVIDIA Container Toolkit
- Oracle AI Database
- Windows 11
- Red Hat Enterprise Linux
Attack Chain
Researchers used AI-assisted tooling to discover and exploit vulnerabilities across multiple platforms. Against AI agents, they exploited trust boundary failures in which the agent unconditionally trusted external tools or protocols, which led to root-level code execution. Against enterprise systems, they chained logic bugs into browser sandbox escapes, sent unauthenticated HTTP requests to achieve remote code execution on web servers, and exploited memory safety failures for guest-to-host hypervisor escapes.
Detection Availability
- YARA Rules: No
- Sigma Rules: No
- Snort/Suricata Rules: No
- KQL Queries: No
- Splunk SPL Queries: No
- EQL Queries: No
- Other Detection Logic: Yes
- Platforms: TrendAI TippingPoint
TrendAI has released TippingPoint network security filters to detect and block exploitation of several vulnerabilities demonstrated at the event.
Detection Engineering Assessment
- EDR Visibility: Medium. EDR can detect post-exploitation activities such as anomalous shell spawns from AI agents, but may lack visibility into the initial logic or memory corruption exploits.
- Network Visibility: Medium. Network inspection is effective for unencrypted web exploits (e.g., SharePoint) via IPS filters, but visibility into local AI agent traffic or encrypted hypervisor communications is limited.
- Detection Difficulty: Hard. These are newly discovered zero-day vulnerabilities with no public proof-of-concept code available yet, making signature-based detection difficult outside of vendor-provided virtual patches.
Required Log Sources
- Process Creation (Event ID 4688 / Sysmon 1)
- Network Traffic Logs
- File System Integrity Logs
Hunting Hypotheses
| Hypothesis | Telemetry | ATT&CK Stage | FP Risk |
|---|---|---|---|
| Consider hunting for anomalous child processes (e.g., cmd.exe, sh, bash) spawning from AI agent runtimes or local LLM processes. | Process Creation (Event ID 4688 / Sysmon 1 on Windows; auditd execve records on Linux) | Execution | Low |
| Evaluate network telemetry for unexpected inbound HTTP requests to SharePoint servers that result in unusual w3wp.exe child processes. | Web Server Logs, Process Creation | Initial Access | Medium |
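Both hypotheses above reduce to a parent/child process-lineage check, which can be sketched roughly as follows. This is a minimal illustration, not a vendor rule: the `ProcEvent` fields, the `hunt` function, and especially the names in `AI_PARENTS` are hypothetical placeholders that would need to be replaced with the field names of your telemetry and the actual binary names of the AI runtimes deployed in your environment. The shell list is also only a starting point.

```python
import ntpath
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Child processes treated as suspicious when spawned by the parents below.
SHELLS = {"cmd.exe", "powershell.exe", "sh", "bash"}

# ASSUMPTION: placeholder names only, not verified binary names.
AI_PARENTS = {"codex", "cursor.exe", "lm-studio.exe", "litellm"}

# IIS worker process that hosts SharePoint web applications.
WEB_PARENTS = {"w3wp.exe"}


@dataclass(frozen=True)
class ProcEvent:
    """One process-creation record (hypothetical, normalized schema)."""
    host: str
    parent_image: str
    image: str
    command_line: str


def basename(path: str) -> str:
    # ntpath splits on both "\\" and "/", so Windows and POSIX paths both work.
    return ntpath.basename(path).lower()


def hunt(events: Iterable[ProcEvent]) -> List[Tuple[str, ProcEvent]]:
    """Return (hypothesis_label, event) pairs for shell spawns of interest."""
    hits: List[Tuple[str, ProcEvent]] = []
    for ev in events:
        if basename(ev.image) not in SHELLS:
            continue
        parent = basename(ev.parent_image)
        if parent in AI_PARENTS:
            hits.append(("ai_agent_shell_spawn", ev))
        elif parent in WEB_PARENTS:
            hits.append(("w3wp_shell_spawn", ev))
    return hits
```

Coding agents often spawn shells as part of normal operation, so in practice a check like this would likely need an allowlist of expected command lines per host before its false-positive rate matches the table above.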
Control Gaps
- Lack of network segmentation for AI tools and local inference servers
- Insufficient process-lineage monitoring for AI agents
- Missing hypervisor-level behavioral monitoring for cross-tenant activity
Key Behavioral Indicators
- Unexpected shell spawns from AI agent processes
- Cross-tenant modifications in hypervisor environments
False Positive Assessment
- Low
Recommendations
Immediate Mitigation
- Verify against your organization's incident response runbook and team escalation paths before acting.
- If using TrendAI TippingPoint, ensure the latest filter package is applied to protect against the demonstrated LiteLLM, Edge, Exchange, and SharePoint vulnerabilities.
- Monitor vendor security advisories closely over the next 90 days to apply official patches for the affected systems as they are released.
Infrastructure Hardening
- Consider implementing strict network segmentation for AI coding agents and local inference servers to limit their access to sensitive internal resources.
- Evaluate hypervisor hardening and segmentation strategies for multi-tenant virtualization environments to mitigate the impact of guest-to-host escapes.
User Protection
- Consider deploying endpoint behavioral monitoring with process-lineage alerting specifically tailored for AI applications.
- Evaluate file-system integrity monitoring on directories where AI agents have write access.
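As a rough illustration of the file-system integrity idea, the sketch below hashes every file under a directory and compares two snapshots. It assumes a simple baseline-and-compare model; the function names `snapshot` and `diff_snapshots` are hypothetical, and a production deployment would more likely rely on a dedicated FIM or EDR product than on a script like this.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(baseline: Dict[str, str],
                   current: Dict[str, str]) -> Dict[str, List[str]]:
    """Report files added, removed, or modified since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(k for k in shared if baseline[k] != current[k]),
    }
```

A baseline taken before an agent session and a second snapshot taken afterwards would surface any file the agent created or changed, which can then be reviewed against what the task was expected to touch.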
Security Awareness
- Consider updating security training for developers to highlight the risks of AI agents interacting with untrusted external tools and protocols (the trust boundary problem).
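For training material, the trust boundary problem can be illustrated with a small example of the opposite behavior: checking a tool request against an explicit policy before acting on it. The sketch below is purely illustrative and does not reflect how any of the products named in this report work or how the demonstrated exploits operated; `ToolPolicy`, `TOOL_POLICIES`, `check_tool_call`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ToolPolicy:
    """Arguments a tool is permitted to receive (hypothetical policy shape)."""
    allowed_args: FrozenSet[str]


# ASSUMPTION: example allowlist; tool names are illustrative only.
TOOL_POLICIES: Dict[str, ToolPolicy] = {
    "read_file": ToolPolicy(frozenset({"path"})),
    "search_docs": ToolPolicy(frozenset({"query", "limit"})),
}


def check_tool_call(name: str, args: Dict[str, Any]) -> Tuple[bool, str]:
    """Reject tool calls that are not allowlisted or carry unexpected arguments."""
    policy = TOOL_POLICIES.get(name)
    if policy is None:
        return False, f"tool {name!r} is not on the allowlist"
    unexpected = sorted(set(args) - policy.allowed_args)
    if unexpected:
        return False, f"unexpected arguments for {name!r}: {unexpected}"
    return True, "ok"
```

The point for developers is the default: a request from an external tool or protocol is denied unless a policy explicitly permits it, instead of being executed because it arrived through a channel the agent already uses.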
MITRE ATT&CK Mapping
- T1190 - Exploit Public-Facing Application
- T1068 - Exploitation for Privilege Escalation
- T1611 - Escape to Host
- T1203 - Exploitation for Client Execution