3 min | low

Mutation testing for the agentic era

Trail of Bits has introduced MuTON and mewt, advanced mutation testing tools designed to identify untested code paths in smart contracts and blockchain applications. These tools leverage Tree-sitter for accurate syntax parsing and integrate with AI agents to optimize testing configurations and triage results, addressing the historical performance limitations of mutation testing.

Conf: high | Analyzed: 2026-04-01 | reports

Authors: Trail of Bits

Source: Trail of Bits

Key Takeaways

  • Code coverage is an insufficient security metric because it measures code execution rather than the actual verification of logic.
  • Trail of Bits released MuTON and mewt, new mutation testing tools optimized for AI agents and blockchain languages.
  • MuTON provides first-class support for TON blockchain languages (FunC, Tolk, Tact), while mewt is a language-agnostic core supporting Solidity, Rust, and Go.
  • The tools use Tree-sitter for accurate syntax-tree parsing and SQLite for persistent storage of mutants and test results.
  • A new configuration optimization skill lets AI agents set up mutation testing campaigns efficiently and triage the results.
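To make the persistent-storage takeaway concrete, here is a minimal sketch of keeping mutants and their outcomes in SQLite so that an interrupted campaign can resume where it stopped. The actual schema used by MuTON and mewt is not described in this summary; the table, column names, status values, and the `vault.sol` file name below are all hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical mutant store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS mutants (
            id       INTEGER PRIMARY KEY,
            file     TEXT NOT NULL,
            line     INTEGER NOT NULL,
            original TEXT NOT NULL,
            mutated  TEXT NOT NULL,
            status   TEXT NOT NULL DEFAULT 'pending'
                     CHECK (status IN ('pending', 'caught', 'uncaught'))
        )
        """
    )
    return conn


def add_mutant(conn, file, line, original, mutated):
    """Store one mutant and return its row id."""
    cur = conn.execute(
        "INSERT INTO mutants (file, line, original, mutated) VALUES (?, ?, ?, ?)",
        (file, line, original, mutated),
    )
    conn.commit()
    return cur.lastrowid


def record_result(conn, mutant_id, status):
    """Record whether the test suite caught the mutant."""
    conn.execute("UPDATE mutants SET status = ? WHERE id = ?", (status, mutant_id))
    conn.commit()


def pending(conn):
    """Ids of mutants that still need a test run, e.g. after a restart."""
    rows = conn.execute("SELECT id FROM mutants WHERE status = 'pending' ORDER BY id")
    return [row[0] for row in rows]


conn = open_store()
first = add_mutant(conn, "vault.sol", 42, "amount <= balance", "amount < balance")
second = add_mutant(conn, "vault.sol", 57, "x + y", "x - y")
record_result(conn, first, "uncaught")
print(pending(conn))  # only the second mutant is still pending
```

Because results are written as each mutant finishes, a later run only needs to query the pending rows instead of regenerating and retesting everything.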

Affected Systems

  • TON blockchain (FunC, Tolk, Tact)
  • Smart Contracts (Solidity, Rust, Go)

Vulnerabilities (CVEs)

  • Arkis protocol: a high-severity vulnerability discovered via mutation testing

Attack Chain

This article outlines a defensive testing methodology rather than an offensive attack chain. The process begins by parsing the target source code (for example, Solidity or a TON blockchain language) with Tree-sitter to build a concrete syntax tree. The mutation engine then systematically introduces small, syntactically valid bugs, called mutants, into the codebase. Finally, the existing test suite is run against each mutant: if a test fails, the mutant is caught; if every test still passes, the mutant has exposed a gap in what the tests actually verify.

Detection Availability

  • YARA Rules: No
  • Sigma Rules: No
  • Snort/Suricata Rules: No
  • KQL Queries: No
  • Splunk SPL Queries: No
  • EQL Queries: No
  • Other Detection Logic: No

The article does not provide detection rules; it introduces defensive mutation testing tools (MuTON and mewt) for identifying untested code in software projects.

Detection Engineering Assessment

  • EDR Visibility: None. The article discusses static analysis and mutation testing tools used during the software development lifecycle, which do not generate EDR telemetry.
  • Network Visibility: None. Mutation testing occurs in local development or CI/CD environments and does not produce network traffic relevant to threat detection.
  • Detection Difficulty: Very Hard. Not applicable, as this is a defensive software development tool, not malicious behavior.

Hunting Hypotheses

  • Hypothesis: Threat actors may exploit logic flaws in smart contracts that have high code coverage but lack proper verification assertions, leading to unauthorized state changes or fund drainage.
  • Telemetry: Blockchain transaction logs and smart contract event emissions.
  • ATT&CK Stage: Impact
  • FP Risk: High

Control Gaps

  • Over-reliance on code coverage metrics, which measure execution rather than verification of logic.

False Positive Assessment

  • Low

Recommendations

Immediate Mitigation

  • Integrate mutation testing tools like MuTON or mewt into the development lifecycle for smart contracts to identify unverified code paths.
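One way to wire such a tool into a pipeline is to gate merges on the mutation score, meaning the fraction of mutants the test suite catches. The sketch below assumes the campaign results are already available as a list of status strings; the status values, the `gate` helper, and the 80% threshold are hypothetical and do not reflect MuTON or mewt output formats.

```python
def mutation_score(statuses):
    """Fraction of mutants caught by the tests (1.0 when there are none)."""
    if not statuses:
        return 1.0
    return sum(status == "caught" for status in statuses) / len(statuses)


def gate(statuses, threshold=0.8):
    """Return a process exit code: 0 if the score meets the threshold, else 1."""
    score = mutation_score(statuses)
    print(f"mutation score: {score:.0%} (threshold {threshold:.0%})")
    return 0 if score >= threshold else 1


# Hypothetical campaign outcome: three of four mutants were caught.
exit_code = gate(["caught", "uncaught", "caught", "caught"])
print("exit code:", exit_code)
```

A CI job would pass the returned code to the process exit so that a score below the threshold fails the build. The threshold is a team decision, and uncaught mutants still need triage, since some may be equivalent to the original code.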

Infrastructure Hardening

  • N/A

User Protection

  • N/A

Security Awareness

  • Educate development teams on the limitations of standard code coverage metrics and the benefits of mutation testing for verifying test quality.