A Real ASI02 Gap We Caught Before Shipping

A Real ASI02 Gap We Caught Before Shipping

Stack Research
security oss

An adversarial test exposed unsafe text propagation across agent boundaries. We found it, fixed it, and kept the test.

“I found a real gap: reply-drafter-agent was echoing dangerous text from issue_summary, which is exactly the ASI02 class of failure. I’m patching runtime sanitization for both deterministic and LLM reply drafting, then rerunning ASI02 tests.”

Agent security incidents don’t start with dramatic exploits. They start with ordinary assumptions between components.

That’s what happened during development of an agent catalog. Related code is in stack-research/agents.

We were doing routine work: expanding the catalog, adding a second project, wiring local LLM testing. Functional tests were green. Classification, routing, and reply drafting all worked in both deterministic and model-driven paths.

Then we added adversarial tests aligned to the OWASP Top 10 for Agentic Applications.

The Gap

The system didn’t crash. The workflow still completed.

The problem: malicious operator-like text survived one boundary and was echoed in a customer-facing draft by the next agent. Under normal test data, it looked fine. Under ASI02 pressure, the unsafe path was obvious.

This wasn’t a model “going rogue.” It was a composition problem — two reasonable components created an unsafe propagation path when combined.

The Fix

We treated the relevant field as untrusted input and enforced sanitization in both drafting paths: deterministic and LLM-based. The previously failing ASI02 scenario passed, and the workflow no longer repeated risky text.

Security didn’t arrive as a late compliance gate. It arrived as a failing test in an active feature branch. That’s the loop we want: find controlled misses early, patch quickly, keep the fix as a permanent regression guard. In agent systems, boundaries are trust boundaries — even when the payload is “just text.”