2026-05-13 | 3 min | low

How we made Trail of Bits AI-native (so far)

Trail of Bits details their organizational shift to an AI-native workflow using Claude Code and autonomous agents. The post outlines their strategy for overcoming employee resistance, establishing an AI Maturity Matrix, and securing agent autonomy through sandboxing, curated marketplaces, and strict usage policies.

Analyzed: 2026-03-31 | reports

Authors: Dan Guido, Trail of Bits

AI Security, Claude Code, macOS

Key Takeaways

Trail of Bits successfully transitioned to an 'AI-native' organization by addressing psychological barriers to AI adoption, such as identity threat and intolerance for imperfection.
They implemented an AI Maturity Matrix to measure, incentivize, and track AI capability across different roles within the company.
Security and autonomy are balanced using mandatory sandboxing (devcontainers, native macOS sandboxing, Dropkit) and hardened defaults enforced via MDM.
The initiative resulted in significant productivity gains, including AI-augmented auditors finding up to 200 bugs a week on certain engagements.
They open-sourced their AI toolchain, including skills repositories, Claude Code configurations, and sandboxing tools to encourage secure adoption.

Affected Systems

macOS
Claude Code
LLM Agents

Detection Availability

YARA Rules: No
Sigma Rules: No
Snort/Suricata Rules: No
KQL Queries: No
Splunk SPL Queries: No
EQL Queries: No
Other Detection Logic: No

No detection rules are provided as this article discusses organizational AI adoption and security engineering rather than threat detection.

Detection Engineering Assessment

EDR Visibility: N/A (the article does not discuss threat detection or EDR visibility)
Network Visibility: N/A (the article does not discuss network-based threat detection)
Detection Difficulty: N/A (not applicable to this organizational security post)

Hunting Hypotheses

Hypothesis	Telemetry	ATT&CK Stage	FP Risk
Identify AI agent processes (e.g., Claude Code) executing outside of approved sandboxing mechanisms (like devcontainers or Dropkit) to detect policy violations or potential agent escapes.	Process execution logs, MDM compliance logs	Execution	Medium
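The hypothesis above can be sketched as a simple filter over process execution events. This is a minimal illustration, not detection logic from the article: the event fields, the `sandbox` label (assumed to be attached by your telemetry pipeline), the agent binary name, and the sandbox label values are all hypothetical and would need to be mapped to whatever your EDR or MDM actually records.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical values: real binary names and sandbox labels depend on your environment.
AGENT_PROCESS_NAMES = {"claude"}
APPROVED_SANDBOXES = {"devcontainer", "macos-sandbox", "dropkit"}


@dataclass(frozen=True)
class ProcessEvent:
    """One process execution record (field names are assumptions)."""
    host: str
    process_name: str
    # Label assumed to be added by the telemetry pipeline; None means no sandbox was seen.
    sandbox: Optional[str]


def find_unsandboxed_agents(events: Iterable[ProcessEvent]) -> List[ProcessEvent]:
    """Return agent process events that did not run inside an approved sandbox."""
    return [
        e for e in events
        if e.process_name.lower() in AGENT_PROCESS_NAMES
        and e.sandbox not in APPROVED_SANDBOXES
    ]


events = [
    ProcessEvent("mac-01", "claude", "devcontainer"),  # approved sandbox, not flagged
    ProcessEvent("mac-02", "claude", None),            # agent with no sandbox, flagged
    ProcessEvent("mac-03", "python", None),            # not an agent process, ignored
]
flagged = find_unsandboxed_agents(events)
print([e.host for e in flagged])  # ['mac-02']
```

Matching on process name alone is what drives the medium false-positive risk: unrelated binaries can share a name, and sandbox labels may be missing for benign reasons, so hits are better treated as leads for a compliance check than as alerts.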

Control Gaps

Prompt injection in client code
Lack of kernel-level policy enforcement for agents
Data privacy and confidentiality when using public LLMs

Recommendations

Immediate Mitigation

Standardize on a single supported AI agent workflow across the organization.
Establish a clear AI usage policy and handbook to remove ambiguity regarding sensitive data.

Infrastructure Hardening

Implement sandboxing (e.g., devcontainers, native macOS sandboxing, Dropkit) for autonomous AI agents.
Enforce secure package manager defaults and mandatory package cooldown policies via MDM (e.g., Jamf).
Utilize curated marketplaces for third-party AI skills to prevent supply chain attacks.
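To illustrate the package cooldown idea from the list above: a cooldown policy refuses to install a package release until it has been public for some minimum time, which gives the ecosystem a chance to catch malicious uploads. The sketch below shows only the age check; the seven-day window and the function name are assumptions, since the summary does not state the cooldown length or which package managers are covered.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed value for illustration; the source does not specify a cooldown length.
COOLDOWN = timedelta(days=7)


def passes_cooldown(
    published_at: datetime,
    now: Optional[datetime] = None,
    cooldown: timedelta = COOLDOWN,
) -> bool:
    """Return True if a release has been public for at least `cooldown`.

    Both datetimes must be timezone-aware so the subtraction is well defined.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published_at >= cooldown


published = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(passes_cooldown(published, now=datetime(2026, 1, 5, tzinfo=timezone.utc)))   # False
print(passes_cooldown(published, now=datetime(2026, 1, 10, tzinfo=timezone.utc)))  # True
```

In practice this check is applied by the package manager or a registry proxy, with the setting pushed through MDM so individual users cannot silently relax it.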

User Protection

Provide safe, copy-pasteable default configurations for AI tools to prevent user error.
Restrict AI web access when handling sensitive client data.

Security Awareness

Create an AI Maturity Matrix to measure and guide employee AI adoption.
Run short, focused hackathons to encourage hands-on usage and skill building.
Educate staff on the risk models behind AI usage policies, not just the rules themselves.

Additional IOCs

Domains:
- tinfoil[.]sh - Private inference provider mentioned for confidential computing
- continue[.]dev - Alternative AI coding assistant mentioned for specific client code scenarios
File Paths:
- ~/.claude/CLAUDE.md - Recommended configuration file for Claude Code
Other:
- trailofbits/skills - Public skills repository
- trailofbits/skills-curated - Curated third-party skills marketplace
- trailofbits/claude-code-config - Recommended Claude Code configurations
- trailofbits/claude-code-devcontainer - Devcontainer for sandboxed development
- trailofbits/dropkit - macOS sandboxing for agents
- trailofbits/slither-mcp - MCP server for Slither
