An agent is an AI that doesn't just answer — it does. It plans, picks up the right tool, observes what happened, and tries again until the job is done. This guide walks you through how it works, using everyday analogies and step-by-step diagrams.
Featured skill: Tool use — the ability that turns a chatty language model into something that can actually act in the world.
Most AI you chat with is a language model: you give it text, it gives you text back. An agent goes further: it has goals, can decide what to do, and can act in the world through tools. The capabilities that make this possible are called agentic skills.
Calling calculators, search, code, calendars — anything with an API.
Breaking a big goal into smaller steps in the right order.
Remembering what happened earlier — and what it has learned.
Looking at its own work and deciding whether to try again.
Working through problems step by step instead of guessing.
Reading images, files, and other inputs beyond plain text.
Coordinating with other agents, each with their own specialty.
Finally pressing the button, sending the email, deploying the code.
A language model on its own is like a brilliant chef trapped in a kitchen with no utensils, no ingredients, and no phone. It knows how to cook — it just can't touch anything. Tools are the utensils: search, calculators, calendars, databases, code, browsers, even other software. With them, the agent can finally do.
Think of a handyman showing up to fix your sink. They don't carry every tool that exists — they carry a curated toolkit: a wrench for pipes, a multimeter for wires, a level for shelves. When you say "the tap drips," they pick the right tool, use it, observe the result, and grab another if needed.
An AI agent works the same way. We give it a list of available tools, each with a clear name, a description, and the inputs it needs. When the agent has a goal, it picks the best tool for the moment.
The model itself never gets bigger or smarter at the moment of action — but its reach explodes. By calling tools, it can know the weather right now, do precise arithmetic, query your bank, or read today's news. It borrows the strengths of every system it can connect to.
Every agent, no matter how complex, runs the same basic loop: plan, pick a tool, act, observe the result, and repeat. It keeps cycling until the goal is finished, or until it gives up after a few tries.
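The loop can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent framework works: `scripted_policy` is a hypothetical stand-in for the language model, `TOOLS` is a made-up one-tool toolkit, and `max_steps` plays the role of "gives up after a few tries".

```python
from typing import Callable

# Hypothetical toolkit with a single tool that adds two numbers.
TOOLS: dict[str, Callable[..., object]] = {
    "add": lambda a, b: a + b,
}

def scripted_policy(goal: str, observations: list) -> dict:
    """Stand-in for the model: choose the next step from what has been seen so far."""
    if not observations:
        # Nothing observed yet, so ask for a tool call.
        return {"action": "call", "tool": "add", "args": {"a": 2, "b": 3}}
    # A result is available, so the goal counts as finished.
    return {"action": "finish", "answer": f"{goal}: {observations[-1]}"}

def run_agent(goal: str, policy, tools, max_steps: int = 5):
    observations = []
    for _ in range(max_steps):
        decision = policy(goal, observations)   # plan: decide what to do next
        if decision["action"] == "finish":
            return decision["answer"]           # goal reached, stop cycling
        tool = tools[decision["tool"]]          # pick the tool
        result = tool(**decision["args"])       # act
        observations.append(result)             # observe the result
    return None                                 # out of tries: give up

print(run_agent("sum", scripted_policy, TOOLS))
```

In a real agent the policy would be a call to a language model, but the shape of the loop stays the same.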
Every tool an agent can use is described like a little recipe card. Three pieces matter most: what it does, what inputs it needs, and what comes back. The agent reads these descriptions to decide which tool fits the moment.
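As a rough sketch, a recipe card like this could be written down as a small Python data structure. The `ToolCard` class, its field names, and the `describe` helper are illustrative assumptions; real agent systems each have their own format for describing tools.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class ToolCard:
    """Hypothetical 'recipe card' for one tool."""
    name: str
    description: str         # what it does
    inputs: dict[str, str]   # input name -> short explanation
    returns: str             # what comes back
    run: Callable[..., Any]  # the function that does the work

def describe(card: ToolCard) -> str:
    """Render the card as the one-line summary an agent would read."""
    params = ", ".join(card.inputs)
    return f"{card.name}({params}): {card.description} Returns {card.returns}."

calculator = ToolCard(
    name="calculator",
    description="Adds a list of numbers.",
    inputs={"numbers": "list of numbers to add"},
    returns="the sum as a number",
    run=lambda numbers: sum(numbers),
)

print(describe(calculator))
print(calculator.run(numbers=[1, 2, 3]))
```

The description text matters as much as the function: it is the only thing the agent sees when deciding whether this tool fits the moment.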
Let's see an agent do something real. You say "Plan me a quiet weekend in Paris next month, under $800." The agent can't do that from its own knowledge — it needs to look things up, calculate, and write. Watch the loop run several times.
The agent decides the order: flights → weather → hotels → restaurants → itinerary.
→ no tool needed yet
Calls search_flights(origin, destination, dates). Gets three options back, picks the cheapest nonstop.
Calls get_forecast("Paris", dates). Sees rain on Saturday — notes it and shifts the outdoor plan to Sunday.
Calls search_hotels(city, budget, vibe="quiet"). Filters out noisy central districts. Picks one near a park.
Calls book_restaurant(name, time, party=2). Confirms a Saturday dinner at a small bistro.
Adds up the totals with the calculator tool, makes sure it's under $800, then drafts a tidy day-by-day plan for the user.
→ tools: calculator, write_file
A language model and an agent look similar in a chat window. The difference is what they can reach. A language model is a brilliant conversation partner trapped inside its text box. An agent is one that has been handed a phone, a calculator, a calendar, and a key to the filing cabinet.
The model only knows what was in its training data. Ask it the weather today and it has to guess. Ask it your bank balance and it makes one up. It is closed off from the live world.
The agent can ask other systems for fresh information, do exact math, read and write files, and trigger side effects like sending an email or placing an order. Its answers are grounded in real, current data.
In a single session, an agent might call five different tools — search, calculate, write, schedule, send — all while keeping track of what it's already tried. The "magic" isn't in any one tool; it's in how the agent strings them together to finish a job.
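Stringing tools together while keeping track of what has been tried might look something like the sketch below. Everything here is an assumption made for illustration: the tools are stubs, their signatures are simplified, and the prices are made-up placeholders, not real data.

```python
# Stub tools: each returns a made-up placeholder price, not real data.
def search_flights(origin, destination):
    return {"price": 320}

def search_hotels(city, budget):
    return {"price": 410}

def calculator(numbers):
    return sum(numbers)

history = []  # the agent's record of what it has already tried

def call(tool, **args):
    """Run a tool and log the call so later steps can look back at it."""
    result = tool(**args)
    history.append((tool.__name__, args, result))
    return result

# Chain the tools: each step feeds the next.
flight = call(search_flights, origin="home city", destination="Paris")
hotel = call(search_hotels, city="Paris", budget=800)
total = call(calculator, numbers=[flight["price"], hotel["price"]])
within_budget = total < 800  # the goal asked for "under $800"

print([name for name, _, _ in history])
print(total, within_budget)
```

None of the three tools is impressive on its own; the useful part is the chain, plus the history that lets the agent avoid repeating itself.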
When you use an agentic system, three patterns are worth noticing:
Agents aren't magic. They can't do things their tools can't do. If the toolkit doesn't include a way to read your email, the agent can't read your email. The model decides when and how to act — the tools decide what is possible.
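The split between "the model decides" and "the tools decide what is possible" can be sketched as a simple lookup. The `dispatch` function and the `read_email` tool name are hypothetical, chosen to mirror the email example above.

```python
def dispatch(toolkit, tool_name, **args):
    """Run the requested tool only if the toolkit actually contains it."""
    if tool_name not in toolkit:
        # The model may ask for anything, but a missing tool cannot be run.
        return {"ok": False, "error": f"no tool named '{tool_name}' in the toolkit"}
    return {"ok": True, "result": toolkit[tool_name](**args)}

# A toolkit with a calculator and nothing else.
toolkit = {"calculator": lambda numbers: sum(numbers)}

print(dispatch(toolkit, "calculator", numbers=[2, 3]))
print(dispatch(toolkit, "read_email", folder="inbox"))
```

However the model phrases its request, the second call fails because no email tool was ever handed over.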