What is an AI agent?

An AI agent is an LLM in a loop with tools. Give it a goal. It thinks. It picks a tool. It uses the tool. It reads the result. It thinks again. It repeats until the goal is done. That's the whole idea. Everything else (frameworks, memory systems, multi-agent orchestration) is scaffolding around this one loop. If you understand the loop, you understand agents.

Why this matters

A plain LLM call is a one-shot: you send a prompt, you get an answer, done. The model works with whatever context you crammed into the prompt. If it needs fresh information, it guesses. If it needs to take an action in the real world, it can't.

An agent breaks out of the one-shot. It can look things up. It can call APIs. It can check its own work. It can run until it actually finishes the task instead of producing a plausible-sounding sentence. That's the jump from "chatbot that sounds smart" to "system that gets things done."

The loop, drawn out

Four moving parts. Goal goes in. The LLM picks what to do. A tool runs. The result comes back. The LLM checks whether the goal is done. If not, it loops. That's the whole shape.

The three components, precisely

  1. The model: the reasoning engine. It takes the running conversation (goal + all prior tool calls + all tool results) and decides what to do next: call another tool, or stop and answer.
  2. Tools: functions the model can call. A tool has a name, a description, and a schema for its arguments. Examples: web_search(query), query_db(sql), send_email(to, subject, body), run_python(code). Tools are how the agent reaches the outside world.
  3. The loop: the orchestration code that runs the cycle. Send context to the model, parse its choice, execute the tool if one was called, append the result, repeat until the model says "I'm done" or a stop condition fires (max steps, timeout, cost cap).
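All three parts fit in surprisingly little code. Here's a minimal sketch, where `llm` is a hypothetical stand-in for a real model call (in practice you'd use a provider SDK's tool-calling API) and the tool registry holds one toy tool:

```python
from datetime import date

def get_current_date():
    return date.today().isoformat()

# Tool registry: name -> (description, callable). A real system would
# also attach a JSON schema for each tool's arguments.
TOOLS = {
    "get_current_date": ("Return today's date as YYYY-MM-DD", get_current_date),
}

def run_agent(goal, llm, max_steps=10):
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):            # stop condition: step cap
        choice = llm(messages, TOOLS)     # model picks a tool or answers
        if choice["type"] == "answer":    # model says "I'm done"
            return choice["content"]
        name, args = choice["tool"], choice.get("args", {})
        result = TOOLS[name][1](**args)   # execute the tool
        messages.append({"role": "assistant", "content": f"call {name}({args})"})
        messages.append({"role": "tool", "content": str(result)})  # append result
    return "stopped: step limit reached"
```

Everything agent-specific lives in the dozen lines of `run_agent`; the rest is an ordinary function table.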

Agent vs LLM call, side by side
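The contrast fits in a few lines. A sketch, with a hypothetical `llm()` standing in for any model call:

```python
# One-shot LLM call: one prompt in, one answer out. If the model needs
# today's date or live flight data, it has to guess.
def one_shot(llm, prompt):
    return llm(prompt)

# Agent: the same model, called repeatedly, with tool results fed back
# into the context until it declares the goal done.
def agent(llm, goal, tools, max_steps=10):
    context = [goal]
    for _ in range(max_steps):
        action = llm("\n".join(context))      # model chooses the next step
        if action.startswith("DONE"):         # model signals completion
            return action
        name, _, arg = action.partition(":")  # e.g. "search:flights to AUS"
        context.append(tools[name](arg))      # tool result re-enters context
    return "DONE: step limit"
```

Same model in both functions. The only difference is that `agent` loops and lets tool output flow back in.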

A worked example: "Book me a flight to Austin next Thursday"

Watch the loop run:

  1. Turn 1 - model thinks: "I need to know what Thursday means. Today's date would help." → calls get_current_date().
  2. Tool result: 2026-04-19. Appended to context.
  3. Turn 2 - model thinks: "Thursday is 2026-04-23. I need flights from the user's home airport to AUS on that date." → calls search_flights(from: "DFW", to: "AUS", date: "2026-04-23").
  4. Tool result: a list of 6 flights with times and prices.
  5. Turn 3 - model thinks: "Before I book, I should confirm preferences." → calls ask_user("Morning or evening? Budget?").
  6. User replies: "Morning, under $300."
  7. Turn 4 - model thinks: "The 8am flight at $247 matches." → calls book_flight(id: "AA1234").
  8. Tool result: confirmation number. Model stops. Replies to user with the booking.
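What the model actually sees at each turn is the growing conversation. Here's a sketch of the context going into Turn 4 (the message shapes are illustrative, not any particular provider's format):

```python
# Illustrative context before Turn 4 of the flight example. Every tool
# call and result has been appended, so each decision is made with the
# full history in view.
context = [
    {"role": "user",      "content": "Book me a flight to Austin next Thursday"},
    {"role": "assistant", "tool_call": "get_current_date()"},
    {"role": "tool",      "content": "2026-04-19"},
    {"role": "assistant", "tool_call": 'search_flights(from="DFW", to="AUS", date="2026-04-23")'},
    {"role": "tool",      "content": "[6 flights with times and prices]"},
    {"role": "assistant", "tool_call": 'ask_user("Morning or evening? Budget?")'},
    {"role": "user",      "content": "Morning, under $300"},
]
```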

The same model that would have hallucinated a fake flight number in a one-shot prompt now actually books a real flight. The difference is the loop and the tools.

The autonomy spectrum

"Agent" covers a wide range. Here's roughly where real systems sit:

Most production agents you'll build sit in the middle: a bounded loop with maybe 3-10 tools, running for 5-30 steps, with a timeout. The "fully autonomous agent that runs for days" is rare and hard. Don't start there.
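Those bounds are cheap to enforce in the loop itself. A sketch, assuming a hypothetical `step()` that runs one model-plus-tool cycle and reports its cost:

```python
import time

def bounded_agent(step, max_steps=30, timeout_s=120, cost_cap_usd=1.00):
    """Run step() until it returns a final answer or a bound fires.
    `step` is a stand-in for one model-call-plus-tool-execution cycle;
    it returns (answer_or_None, cost_in_usd)."""
    deadline = time.monotonic() + timeout_s
    spent = 0.0
    for n in range(max_steps):                 # bound 1: step cap
        if time.monotonic() > deadline:        # bound 2: wall-clock timeout
            return f"stopped after {n} steps: timeout"
        answer, cost = step()
        spent += cost
        if answer is not None:                 # model finished on its own
            return answer
        if spent > cost_cap_usd:               # bound 3: cost cap
            return f"stopped after {n + 1} steps: cost cap (${spent:.2f})"
    return f"stopped: hit max_steps ({max_steps})"
```

The three bounds are independent, so you can start with generous defaults and tighten whichever one your agent actually hits.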

When agents pay off

Pitfalls people hit early

What to do with this

Further reading

Watch

Andrej Karpathy - Intro to Large Language Models (1 hour)