Tool Use: When Language Triggers Actions
Why execution changes accountability and requires guardrails.
Key takeaways
- Tool use turns language into actions, not just answers.
- The app is the gatekeeper for safety and validation.
- Execution introduces accountability and audit needs.
- Guardrails matter more once actions are triggered.
Tool use gives a Large Language Model the ability to interact with the outside world. It transforms the model from a passive text generator into an active agent that can query databases, call APIs, and perform actions. This capability is powerful, but it introduces significant new risks and responsibilities.
What changes when a model can use tools?
The moment a model can use tools, language stops being only output and starts becoming a request for execution. This page is for builders designing agentic or automated workflows, and the main shift is that application policy, validation, and auditability matter more than prompt cleverness.
In practice, clarity at boundaries reduces downstream errors more than late-stage tuning.
Act I: The fundamentals
The tool-use loop
By itself, an LLM can only process text. It cannot access real-time information, interact with other software, or affect the physical world. Tool use, also known as function calling, bridges this gap.
The process begins by providing the LLM with a list of available “tools.” Each tool is a function in your application’s code, described in natural language (e.g., get_current_weather(location: string)). When a user issues a prompt, the model can decide that it needs to use one of these tools. Instead of generating a text response, it generates a structured JSON object specifying the tool to call and the arguments to use.
Act II: The modern paradigm
Application as gatekeeper
The application code acts as an intermediary. It inspects the JSON object from the LLM and, if it deems the call safe and valid, executes the requested function. For example, if the LLM generates { "tool": "get_current_weather", "arguments": { "location": "Boston, MA" } }, the application would call its internal weather function for Boston.
The result of that function call (e.g., { "temperature": "72F", "conditions": "Sunny" }) is then passed back to the LLM as part of a new prompt. The LLM, now equipped with this real-time information, generates the final, human-readable response: “The current weather in Boston is 72°F and sunny.” This entire loop can happen in a single turn of conversation.
Act III: Principles in practice
Guardrails and accountability
The moment an LLM can trigger an action, the stakes become much higher. A bug is no longer just a poorly worded sentence; it could be an accidental purchase, a deleted file, or an incorrect database query. This elevates the importance of safety and accountability.
- Never trust the LLM’s output directly. Always validate the tool name and arguments against a strict schema before execution.
- Implement guardrails. Use confirmation steps for any destructive or costly actions. A prompt like “Are you sure you want to delete this file?” is a critical safety mechanism.
- Provide precise tool descriptions. The LLM’s decision to use a tool is based entirely on the description you provide. Vague or ambiguous descriptions will lead to incorrect tool usage.
- Plan for failure. The tool might fail, an API could be down, or the result might be an error. Your system must be able to handle these failures gracefully and report them back to the LLM.
For related systems context, see Systems 001: Foundations and From Prompt to Production. For an execution-focused companion, use the Engineering Agentic Systems deck. For a review artifact that turns guardrails into a repeatable checklist, use the Prompt Safety Checklist.
What this changes in practice
When you give a model tools, you must shift your focus from prompt quality to a system of validation, error handling, and safety guardrails.