Runtime boundary usage: inline guards before MCP/tool calls

#2
by armorerlabs - opened

I have been comparing prompt-injection defenses from the agent-runtime side, where the guard sits directly before memory writes, stored outputs, or MCP/tool calls.

Classification quality is not the only thing I am trying to optimize for. When the guard is on the hot path, its latency becomes product latency, and the runtime needs a small structured decision it can act on.

I wrote up the benchmark note for Armorer Guard here: https://armorerlabs.com/blog/armorer-guard-inline-prompt-injection-defense

In the default-threshold run, Guard completed 977 cases at 3.4 ms average / 4.3 ms p95 locally, with no scanner network calls.
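For anyone who wants to compare numbers on their own guard, here is a minimal sketch of how an average and a p95 like these can be computed from per-case timings. It is not the harness behind the figures above. `summarize_ms` and `time_scans` are hypothetical names, the scanner is whatever callable you pass in, and the p95 uses the nearest-rank method, which is only one of several common percentile definitions.

```python
import time
from typing import Callable, Sequence


def summarize_ms(latencies_ms: Sequence[float]) -> tuple[float, float]:
    """Return (average, p95) for a list of latencies in milliseconds.

    p95 uses the nearest-rank method: the smallest sample such that
    at least 95% of all samples are less than or equal to it.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    average = sum(ordered) / n
    # Integer ceiling of 0.95 * n, avoiding float rounding surprises.
    rank = (95 * n + 99) // 100
    return average, ordered[rank - 1]


def time_scans(scan: Callable[[str], object], cases: Sequence[str]) -> tuple[float, float]:
    """Time one scan call per case and summarize the wall-clock latencies."""
    latencies_ms = []
    for text in cases:
        start = time.perf_counter()
        scan(text)  # the guard under test; any callable taking a string
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize_ms(latencies_ms)
```

Timing each case separately matters here, because a single slow outlier on the hot path shows up in p95 long before it moves the average.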

For HF builders using prompt-injection models or local guards: what output shape has been most useful in actual agent runtimes? I am leaning toward suspicious, reasons, confidence, scan id, sanitized text, and enough metadata to tie the decision back to the tool call boundary.
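To make the question concrete, here is a rough sketch of the decision shape and how a runtime might act on it at the tool call boundary. Everything in it is an assumption for illustration: the field names, `GuardDecision`, `toy_scan`, and `guarded_call` are hypothetical and are not Armorer Guard's API, and the "scanner" is a single hard-coded phrase match standing in for a real detector.

```python
import re
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class GuardDecision:
    """Hypothetical structured decision returned by an inline guard."""
    suspicious: bool
    reasons: list[str]
    confidence: float
    scan_id: str
    sanitized_text: str
    # Enough context to tie the decision back to the tool call boundary.
    metadata: dict[str, str] = field(default_factory=dict)


_MARKER = re.compile(re.escape("ignore previous instructions"), re.IGNORECASE)


def toy_scan(text: str, tool_name: str, call_id: str,
             threshold: float = 0.5) -> GuardDecision:
    """Stand-in scanner: flags one hard-coded phrase. Not a real detector."""
    hit = _MARKER.search(text) is not None
    confidence = 0.9 if hit else 0.1  # placeholder scores, not model output
    return GuardDecision(
        suspicious=confidence >= threshold,
        reasons=["instruction_override"] if hit else [],
        confidence=confidence,
        scan_id=str(uuid.uuid4()),
        sanitized_text=_MARKER.sub("[removed]", text),
        metadata={"tool_name": tool_name, "call_id": call_id},
    )


def guarded_call(tool: Callable[[str], str], text: str, tool_name: str,
                 call_id: str) -> tuple[GuardDecision, Optional[str]]:
    """Scan the input first; only invoke the tool if it is not flagged."""
    decision = toy_scan(text, tool_name, call_id)
    if decision.suspicious:
        return decision, None  # blocked: the runtime decides what to do next
    return decision, tool(text)
```

The runtime gets back both the decision and the tool result (or `None` when blocked), so it can log `scan_id` next to the call, retry with `sanitized_text`, or escalate, without parsing free-form scanner output.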
