2026-05-137 mincritical

Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild

Adversaries are actively exploiting web-based Indirect Prompt Injection (IDPI) to manipulate Large Language Models (LLMs) and AI agents. By embedding hidden or obfuscated instructions within benign web content, attackers can coerce AI systems into performing unauthorized actions such as data destruction, SEO poisoning, and bypassing content moderation when the AI processes the webpage.

Conf:highAnalyzed:2026-03-03reports

Authors: Unit 42

IOCs · 6

.txt .json

domain
1winofficialsite[[.]]inDomain utilizing IDPI for SEO poisoning, impersonating a popular betting platform.
domain
cblanke2[.]pages[[.]]devWebsite hosting an IDPI payload attempting to execute a Linux file system deletion and a classic fork bomb for Denial of Service.
domain
splintered[[.]]co[[.]]ukWebsite hosting a critical-severity IDPI script attempting to coerce AI agents into executing database destruction commands.
url
buy.stripe[.]com/7sY4gsbMKdZwfx39Sq0oM00Payment processing URL targeted by an IDPI script attempting to force unauthorized donations.
url
hxxps[:]//reviewerpress[.]com/advertorial-maxvision-can/?lang=enHosted the first observed real-world IDPI attack designed to bypass an AI-based product ad review system.
url
llm7-landing.pages[.]dev/_next/static/chunks/app/page-94a1a9b785a7305c.jsJavaScript file hosting an IDPI payload that attempts to force an AI agent to initiate an unauthorized 'pro plan' subscription.

Key Takeaways

Web-based Indirect Prompt Injection (IDPI) is actively being weaponized in the wild, moving beyond theoretical proof-of-concepts.
Attackers are using IDPI for high-severity and critical impacts, including AI ad review evasion, SEO poisoning, data destruction, and unauthorized transactions.
Prompt delivery methods heavily utilize visual concealment (e.g., zero-sizing, off-screen positioning), HTML attribute cloaking, and dynamic JavaScript execution.
Jailbreak techniques rely predominantly on social engineering (85.2%) to bypass AI safeguards, often framing malicious requests as authoritative system overrides.
Defending against IDPI requires web-scale intent analysis and context-aware parsing, as traditional signature-based scanners cannot reliably distinguish between benign content and obfuscated prompts.

Affected Systems

Large Language Models (LLMs)
AI Agents
Web Browsers with AI Integrations
Search Engines
Automated Content-Processing Pipelines

Attack Chain

Attackers embed malicious prompt instructions within benign-looking web pages using techniques like visual concealment (zero-sizing, off-screen positioning), HTML obfuscation, or dynamic JavaScript execution. When an AI agent or LLM processes the web page for routine tasks like summarization or content moderation, it ingests the hidden text alongside the legitimate content. Because the LLM cannot distinguish between untrusted web data and its core instructions, it executes the attacker's payload. This results in the AI agent performing unauthorized actions, such as approving malicious ads, executing destructive system commands, or initiating fraudulent transactions on behalf of the user.

Detection Availability

YARA Rules: No
Sigma Rules: No
Snort/Suricata Rules: No
KQL Queries: No
Splunk SPL Queries: No
EQL Queries: No
Other Detection Logic: No

The article does not provide specific detection rules but emphasizes the need for web-scale intent analysis, prompt visibility assessment, and behavioral correlation to detect IDPI.

Detection Engineering Assessment

EDR Visibility: Low — EDRs monitor endpoint processes and file systems, but IDPI occurs within the context windows of cloud-hosted LLMs or web-based AI agents parsing HTML, which is largely invisible to traditional EDR. Network Visibility: Medium — Network tools and WAFs can inspect HTML for hidden text patterns or base64 encoded prompts, but TLS encryption and dynamic JavaScript rendering complicate reliable detection. Detection Difficulty: Hard — Distinguishing between benign web content and malicious IDPI requires semantic understanding and context-aware parsing. Attackers use heavy obfuscation, multilingual instructions, and dynamic execution to evade standard signature-based scanners.

Required Log Sources

Web Proxy Logs
WAF Logs
LLM Application Audit Logs
AI Agent Prompt/Response Logs

Hunting Hypotheses

copy:

Hypothesis	Telemetry	ATT&CK Stage	FP Risk
Search web proxy or WAF logs for HTML responses containing known prompt injection keywords (e.g., 'IGNORE ALL PREVIOUS INSTRUCTIONS') hidden within zero-sized fonts or off-screen CSS.	WAF/Proxy Logs	Delivery	Low
Monitor LLM application logs for sudden shifts in agent behavior, such as unexpected approvals in moderation queues or attempts to execute system-level shell commands.	Application Logs	Execution	Medium
Identify web pages containing Base64 encoded strings within data attributes that decode to imperative LLM instructions or system override commands.	Web Proxy Logs	Delivery	Low

Control Gaps

Lack of strict separation between instruction and data in LLM context windows.
Inability of traditional web scanners to parse dynamically rendered or semantically obfuscated prompts.
Over-privileged AI agents capable of executing system commands or initiating transactions without human-in-the-loop verification.

Key Behavioral Indicators

HTML elements with font-size: 0px, opacity: 0, or left: -9999px containing imperative commands.
Base64 encoded strings in HTML data attributes that decode to LLM instructions.
Use of zero-width Unicode characters, homoglyphs, or Unicode bi-directional overrides in web text.
XML/SVG files containing CDATA sections with prompt injection payloads.

False Positive Assessment

Medium. Legitimate web pages may use hidden text for accessibility (e.g., screen readers) or benign SEO purposes, which could trigger simple CSS-based IDPI detection heuristics. Semantic analysis is required to confirm malicious intent.

Recommendations

Immediate Mitigation

Implement strict input validation and sanitization for any web content ingested by AI agents.
Apply the principle of least privilege to AI agents, restricting their ability to execute system commands, access sensitive databases, or initiate financial transactions.

Infrastructure Hardening

Adopt design-level defenses such as 'spotlighting' to separate untrusted web text from trusted system instructions.
Utilize newer LLMs hardened with instruction hierarchy and adversarial training to reduce prompt injection susceptibility.
Implement human-in-the-loop (HITL) verification for critical actions proposed by AI agents.

User Protection

Deploy advanced URL filtering and browser-based protections to block known IDPI-hosting domains.
Educate users on the risks of relying solely on AI-generated summaries of untrusted web pages.

Security Awareness

Train development teams on the OWASP LLM Prompt Injection Prevention guidelines.
Incorporate IDPI threat modeling into the lifecycle of AI agent deployment and integration.

MITRE ATT&CK Mapping

T1562.001 - Impair Defenses: Disable or Modify Tools
T1485 - Data Destruction
T1499 - Endpoint Denial of Service
T1566 - Phishing
T1059.004 - Command and Scripting Interpreter: Unix Shell

Additional IOCs

Domains:
- dylansparks[[.]]com - Website hosting IDPI attempting sensitive information leakage.
- leroibear[[.]]com - Website identified in telemetry as containing IDPI.
- myshantispa[[.]]com - Website hosting IDPI attempting review manipulation to force positive reviews.
- perceptivepumpkin[[.]]com - Website hosting IDPI attempting unauthorized transactions (sending $5,000 to an attacker account).
- reviewerpressus[.]mycartpanda[[.]]com - Redirect destination for the deceptive scam advertisement linked to the AI ad review bypass.
- shiftypumpkin[[.]]com - Website hosting IDPI attempting unauthorized transactions.
- storage3d[[.]]com - Domain hosting IDPI attempting unauthorized transactions.
- trinca[.]tornidor[[.]]com - Website hosting IDPI attempting recruitment manipulation and benign anti-scraping.
- turnedninja[[.]]com - Website hosting IDPI attempting to force irrelevant output from AI agents.
- runners-daily-blog[[.]]com - Website hosting IDPI attempting to force the purchase of running shoes.
Urls:
- ericwbailey[.]website/published/accessibility-preference-settings-information-architecture-and-internalized-ableism - URL hosting IDPI attempting minor resource exhaustion.
- buy.stripe[.]com/9B600jaQo3QC4rU3beg7e02 - Payment processing URL used by websites containing IDPI.
- paypal[.]me/shiftypumpkin - Payment processing URL used by websites containing IDPI.
- storage3d[.]com/storage/2009.11 - URL hosting IDPI attempting unauthorized transactions.
- token.llm7[.]io/?subscription=show - OAuth login URL targeted by an IDPI script for forced subscriptions.
Command Lines:
- Purpose: Attempted Linux file system deletion via IDPI payload | Tools: rm | Stage: Execution | rm -rf --no-preserve-root
- Purpose: Classic fork bomb designed to crash systems by exhausting CPU and process resources | Tools: bash | Stage: Denial of Service | :(){ :|:& };:

Stay currentSubscribe via RSS