Tool Exfiltration Attacks, GenAI, and Why Control Matters
Recent discussion around tool exfiltration and indirect prompt injection attacks in Generative AI systems has raised valid concerns - particularly where platforms unexpectedly invoke tools or actions as a result of untrusted input. These concerns are worth taking seriously. But they are often framed in a way that blurs the distinction between the model, the application, and the execution platform.
This post clarifies that distinction, explains where risk actually lives, and outlines how enterprise AI systems can be designed to limit exposure through explicit control.
The LLM Is Just a Model
At its core, a large language model (LLM) does one thing: it takes an input and generates an output. That output may be plain text for a human, or it may be a structured response that an application interprets as a suggestion to call a tool.
The model itself has no execution capability, no awareness of trust boundaries, and no understanding of whether a tool call is appropriate or dangerous. If an LLM produces an instruction that resembles a tool invocation, that does not mean the model has "acted" - it means the system around the model has chosen to treat that output as executable.
Why Consumer AI Exposes More Surface Area
Consumer focused AI products typically prioritise convenience and flexibility. As a result, they often expose a wide range of tools by default:
- Email and messaging
- File access and sharing
- Web browsing
- Calendars and task systems
- Third-party plugins
The more tools that are available, the larger the attack surface becomes. If untrusted content is introduced into the model’s context , via retrieved documents, pasted text, or user input, the model may generate outputs that attempt to invoke tools in unexpected ways. Several well-known examples originate in environments where broad tool access is enabled by design.
Enterprise AI Has Different Requirements
Enterprise AI systems on the other hand should be built for specific outcomes, bounded workflows, and defined responsibility. In an enterprise context:
- Tools should be enabled only when required
- Tool schemas should be explicit and validated
- Execution paths should be constrained
- Behaviour should be observable and auditable
For example, if an AI workflow is performing structured data extraction or classification, there is typically no reason for it to have access to email, file-sharing, or outbound communication tools. Those capabilities should not exist in that execution context.
Security Is a Platform Property, Not a Model Feature
There is no such thing as a "secure LLM" in isolation. Security emerges from system design:
- What tools are available
- How inputs are validated
- How outputs are interpreted
- What actions are permitted
- What is logged and reviewed
An LLM can suggest an action, but the platform decides whether that action is allowed, how it is executed, and whether it is rejected.
How Zeaware Avalon Approaches Tool Control
Zeaware Avalon is designed on the assumption that capability must be explicit. From an engineering perspective, this means:
- Tools are not globally available enabled
- Each workflow, and each task in the work, explicitly declares which tools it may use
- Tool inputs are validated before execution
- Tool execution is controlled by the platform, not the model
- Outputs and decisions are captured for audit and review
In many enterprise scenarios, the safest configuration is one with no tools enabled at all, beyond retrieval and reasoning. When tools are required, they are treated as governed execution steps - not conveniences the model can freely explore.
LLM Suggestion vs Platform-Controlled Execution
Addressing Common Objections
"Can’t an LLM still generate malicious tool calls even with limited tools?"
Yes. Tool restriction stops models from accessing tools out of scope for the assigned task. But this alone is not a complete solution. Limiting tools reduces surface area, but validation and enforcement prevent misuse within that surface area.
"What about prompt injection through retrieved content?"
Retrieved content should be treated as untrusted. It should inform reasoning, not expand capability. Tool availability and execution authority must remain independent of retrieved data.
"Doesn’t the model still need to behave correctly?"
Models are probabilistic by nature. Enterprise systems should assume models may produce unexpected outputs, and ensure those outputs cannot trigger unauthorised actions. Guardrails are used to validate and revise, redo and fail unexpected results.
"What about incorrect reasoning or misleading outputs?"
That is a separate class of risk. Tool control does not eliminate reasoning errors or hallucinations - those require different mitigations such as evaluation, review, and governance processes.
A Balanced View of Risk
AI systems introduce new considerations, but they do not invalidate decades of security practice. There will always be risks to manage: untrusted inputs, misconfiguration, over-permissioned execution, and insufficient monitoring. The goal is to understand where risk lives, reduce exposure through design, and make behaviour observable.


