🤖 Agents · Advanced Tutorial
Build an AI Agent from Scratch
⏱️ 45 minutes
12 steps
💻 Python 3.10+
Last updated: June 2026
Most "AI agent" tutorials stop at a chat loop. We'll build a real agent that breaks down goals,
uses tools, remembers context, and executes autonomously. By the end, you'll have a working agent that can
research topics and write structured reports, all from a single prompt.
🎯 What You'll Build
A command-line research agent that:
- Takes a high-level goal (e.g., "Research the latest trends in AI video generation")
- Breaks it into sub-tasks automatically
- Uses tools: web search, calculator, file writer
- Stores intermediate findings in memory
- Produces a structured markdown report
No frameworks. Just Python + OpenAI API. You'll understand every line.
Prerequisites
- Python 3.10+ installed
- OpenAI API key (get one at platform.openai.com)
- Basic Python knowledge (functions, dictionaries, lists)
- $2–5 in API credits (we use GPT-4o-mini to keep costs low)
💡 Tip: Using GPT-4o-mini keeps this tutorial under $0.50. If you want better reasoning, swap to GPT-4o; it'll cost ~$2–3 for the full tutorial.
Step 1: Project Setup
Create your project directory and install dependencies:
# Create project
mkdir ai-agent-tutorial && cd ai-agent-tutorial
python -m venv venv
# Activate (Mac/Linux)
source venv/bin/activate
# Activate (Windows)
venv\Scripts\activate
# Install dependencies
pip install openai requests python-dotenv
# Create files
touch agent.py tools.py memory.py .env
Add your API key to .env:
OPENAI_API_KEY=sk-your-key-here
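Before moving on, it can help to confirm the key is actually picked up. The following is a minimal sketch, assuming a hypothetical helper named `require_api_key` (not part of the tutorial's files) that raises a readable error early instead of letting the first API call fail:

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the key from an environment mapping, or raise a readable error.

    `require_api_key` is a hypothetical helper; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # Treat the tutorial's placeholder value as "not configured"
    if not key or key == "sk-your-key-here":
        raise RuntimeError(f"{name} is missing or still the placeholder; check your .env file")
    return key
```

Called right after `load_dotenv()`, this would surface a misconfigured `.env` before any request is sent.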
Step 2: The Agent Loop
The core of any agent is a simple loop: think → act → observe → repeat.
Open agent.py and add the skeleton:
import json
import os

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


class Agent:
    def __init__(self, goal):
        self.goal = goal
        self.memory = []
        self.tools = {}

    def think(self):
        """Decide next action based on goal + memory"""
        messages = [
            {"role": "system", "content": "You are a task-planning AI. Given a goal and past actions, decide the next step."},
            {"role": "user", "content": f"Goal: {self.goal}\nMemory: {self.memory}\nWhat should I do next? Respond with JSON: {{'action': 'tool_name', 'input': '...'}} or {{'action': 'finish', 'result': '...'}}"}
        ]
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            response_format={"type": "json_object"}
        )
        decision = json.loads(response.choices[0].message.content)
        return decision

    def run(self, max_steps=10):
        for step in range(max_steps):
            decision = self.think()
            print(f"Step {step + 1}: {decision}")
            if decision.get("action") == "finish":
                print(f"\n✅ Done: {decision.get('result')}")
                return decision.get("result")
            # No real tools yet: record a placeholder result
            result = f"Executed {decision.get('action')} with {decision.get('input')}"
            self.memory.append({"step": step, "action": decision, "result": result})
        return "Max steps reached"


# Test it
agent = Agent("Research AI video generation trends")
agent.run()
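💡 Tip: We use response_format={"type": "json_object"} so the model returns syntactically valid JSON. This is crucial: unstructured LLM responses break agent loops. JSON mode does not enforce particular keys, and the prompt itself must mention JSON, so still check the reply for the keys you expect.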
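Since valid JSON is not the same as a well-formed decision, the loop benefits from a small validation step. The following is a sketch under assumptions: `parse_decision` is a hypothetical helper (not part of the OpenAI SDK), and the required keys are the ones the prompt above asks for.

```python
import json


def parse_decision(raw, tool_names):
    """Parse the model's reply and check the keys the agent loop relies on.

    Returns (decision, error); exactly one of the two is None.
    `parse_decision` is a hypothetical helper for illustration.
    """
    try:
        decision = json.loads(raw)
    except (json.JSONDecodeError, TypeError) as e:
        return None, f"Not valid JSON: {e}"
    if not isinstance(decision, dict):
        return None, "Expected a JSON object"
    action = decision.get("action")
    if action == "finish":
        if "result" not in decision:
            return None, "'finish' needs a 'result' key"
        return decision, None
    if action not in tool_names:
        return None, f"Unknown action: {action!r}"
    if "input" not in decision:
        return None, f"'{action}' needs an 'input' key"
    return decision, None


if __name__ == "__main__":
    tools = {"web_search", "calculate"}
    print(parse_decision('{"action": "web_search", "input": "AI video"}', tools))
    print(parse_decision('{"action": "fly", "input": "x"}', tools))
```

One option is to append the error string to memory when validation fails, so the next `think()` call can correct itself instead of crashing the loop.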
Step 3: Tool System
Tools are just Python functions the agent can call. Create tools.py:
import requests


def web_search(query):
    """Search the web using DuckDuckGo (no API key needed)"""
    try:
        url = "https://html.duckduckgo.com/html/"
        params = {"q": query}
        headers = {"User-Agent": "Mozilla/5.0"}
        r = requests.get(url, params=params, headers=headers, timeout=10)
        r.raise_for_status()
        # Placeholder: the response body is not parsed here.
        # In production, use a real search API like Serper or Exa
        return f"Found results for: {query}. Top result summary would go here."
    except Exception as e:
        return f"Search error: {str(e)}"


def calculate(expression):
    """Evaluate a basic arithmetic expression"""
    try:
        # Keep only digits, operators, parentheses, and spaces.
        # This blocks names and calls, but eval can still hang on
        # inputs like 9**9**9**9, so treat it as a stopgap.
        safe = ''.join(c for c in expression if c in '0123456789+-*/.() ')
        result = eval(safe)
        return f"Result: {result}"
    except Exception:
        return "Invalid expression"


def write_file(filename, content):
    """Write content to a file"""
    try:
        with open(filename, "w") as f:
            f.write(content)
        return f"Wrote {len(content)} characters to {filename}"
    except Exception as e:
        return f"Error: {str(e)}"


# Tool registry
TOOL_REGISTRY = {
    "web_search": web_search,
    "calculate": calculate,
    "write_file": write_file,
}


def get_tool_descriptions():
    return """
Available tools:
- web_search(query): Search the web for information
- calculate(expression): Evaluate math expressions safely
- write_file(filename, content): Write text to a file
"""
Now update agent.py to use tools:
from tools import TOOL_REGISTRY, get_tool_descriptions

# In Agent.__init__, add:
self.tools = TOOL_REGISTRY

# In Agent.think(), update the system prompt:
system_msg = f"""You are a task-planning AI. Given a goal and past actions, decide the next step.
{get_tool_descriptions()}
Respond with JSON: {{'action': 'tool_name', 'input': '...'}} or {{'action': 'finish', 'result': '...'}}
For tools that take several parameters, make 'input' a JSON object keyed by parameter name."""

# In Agent.run(), replace the execution block:
if decision["action"] in self.tools:
    tool_fn = self.tools[decision["action"]]
    tool_input = decision.get("input", "")
    try:
        # A dict input maps to keyword arguments (write_file needs two)
        result = tool_fn(**tool_input) if isinstance(tool_input, dict) else tool_fn(tool_input)
    except TypeError as e:
        result = f"Bad arguments for {decision['action']}: {e}"
    print(f"  → {result}")
    self.memory.append({"step": step, "action": decision, "result": result})
else:
    print(f"  Unknown tool: {decision['action']}")
    self.memory.append({"step": step, "action": decision, "result": "Unknown tool"})
Step 4: Memory
Our current memory is just a list. For complex tasks, we need structured memory. Create memory.py:
class AgentMemory:
    def __init__(self):
        self.observations = []
        self.facts = set()

    def add(self, observation):
        self.observations.append(observation)
        # Extract key facts (simplified; in production use an LLM)
        if "found" in observation.lower() or "result" in observation.lower():
            self.facts.add(observation)

    def get_context(self, max_items=5):
        """Return recent observations + key facts"""
        recent = self.observations[-max_items:]
        facts_list = list(self.facts)[:max_items]
        return f"Recent: {recent}\nKey facts: {facts_list}"

    def __str__(self):
        return self.get_context()
Update agent.py to use structured memory:
from memory import AgentMemory
# In __init__:
self.memory = AgentMemory()
# In think(), pass memory context:
{"role": "user", "content": f"Goal: {self.goal}\nMemory: {self.memory}\nWhat next?"}
# In run(), use memory.add() instead of list append:
self.memory.add(f"Step {step}: {decision} → {result}")
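Two details of `AgentMemory` are worth knowing: facts live in a `set`, so `list(self.facts)[:max_items]` picks an arbitrary subset, and `observations` grows without bound. A possible variant, sketched here under the assumption that the most recent facts are the most useful, keeps insertion order and caps the history. `BoundedMemory` is a hypothetical name, not one of the tutorial's files.

```python
from collections import deque


class BoundedMemory:
    """Keeps the last `max_observations` entries and de-duplicated facts in insertion order."""

    def __init__(self, max_observations=50):
        # deque with maxlen silently drops the oldest entry when full
        self.observations = deque(maxlen=max_observations)
        # dict keys preserve insertion order and reject duplicates
        self.facts = {}

    def add(self, observation):
        self.observations.append(observation)
        # Same keyword heuristic as AgentMemory
        if "found" in observation.lower() or "result" in observation.lower():
            self.facts[observation] = None

    def get_context(self, max_items=5):
        """Return the most recent observations and the most recent facts."""
        recent = list(self.observations)[-max_items:]
        facts_list = list(self.facts)[-max_items:]
        return f"Recent: {recent}\nKey facts: {facts_list}"

    def __str__(self):
        return self.get_context()
```

Because it exposes the same `add`, `get_context`, and `__str__` methods, it can stand in for `AgentMemory` without changing the agent.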
Step 5: Task Decomposition
For complex goals, we need the agent to plan ahead. Add a planning step:
def plan(self):
    """Break goal into sub-tasks before execution"""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            # JSON mode returns an object, so ask for a "tasks" key
            {"role": "system", "content": "Break the user's goal into 3-5 concrete sub-tasks. Respond as JSON: {'tasks': ['...']}"},
            {"role": "user", "content": f"Goal: {self.goal}"}
        ],
        response_format={"type": "json_object"}
    )
    import json
    plan = json.loads(response.choices[0].message.content)
    return plan.get("tasks", [])

# In run(), before the loop:
def run(self, max_steps=10):
    tasks = self.plan()
    print(f"Plan: {tasks}\n")
    self.memory.add(f"Plan: {tasks}")
    for step in range(max_steps):
        # ... existing loop ...
💡 Tip: This simple planner is enough for most straightforward goals. For production, replace it with hierarchical planning (plan → sub-plan → execute).
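The planner's reply is still model output, so it may arrive under a different key or as a bare list. A tolerant parser is one way to absorb that. This is a sketch; `extract_tasks` is a hypothetical helper and the fallback rules are assumptions.

```python
import json


def extract_tasks(raw, max_tasks=5):
    """Return a list of task strings from a JSON reply, or [] if none can be found."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return []
    if isinstance(data, dict):
        # Prefer "tasks"; otherwise take the first value that is a list
        candidates = [data.get("tasks")] + list(data.values())
        data = next((v for v in candidates if isinstance(v, list)), [])
    if not isinstance(data, list):
        return []
    # Keep scalar entries only, trimmed, and drop empty strings
    tasks = [str(t).strip() for t in data if isinstance(t, (str, int, float))]
    return [t for t in tasks if t][:max_tasks]


if __name__ == "__main__":
    print(extract_tasks('{"sub_tasks": ["find sources", "summarize"]}'))
```

An empty result can then be treated as "no plan", letting the agent fall back to the plain think/act loop.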
Step 6: Running the Agent
Your complete agent.py should now look like this:
# agent.py: complete agent
import json
import os

from openai import OpenAI
from dotenv import load_dotenv

from tools import TOOL_REGISTRY, get_tool_descriptions
from memory import AgentMemory

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


class Agent:
    def __init__(self, goal):
        self.goal = goal
        self.memory = AgentMemory()
        self.tools = TOOL_REGISTRY

    def plan(self):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Break the goal into 3-5 sub-tasks. JSON: {'tasks': ['...']}"},
                {"role": "user", "content": f"Goal: {self.goal}"}
            ],
            response_format={"type": "json_object"}
        )
        return json.loads(response.choices[0].message.content).get("tasks", [])

    def think(self):
        # Literal braces are doubled so the f-string does not try to format them
        system = f"""Plan next action. {get_tool_descriptions()}
Respond with JSON: {{'action': 'tool_name', 'input': '...'}} or {{'action': 'finish', 'result': '...'}}
For tools that take several parameters, make 'input' a JSON object keyed by parameter name."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": f"Goal: {self.goal}\nMemory: {self.memory}"}
            ],
            response_format={"type": "json_object"}
        )
        return json.loads(response.choices[0].message.content)

    def run(self, max_steps=10):
        tasks = self.plan()
        print(f"Plan: {tasks}\n")
        self.memory.add(f"Plan: {tasks}")
        for step in range(max_steps):
            decision = self.think()
            action = decision.get("action")
            print(f"Step {step+1}: {action}({decision.get('input', '')})")
            if action == "finish":
                print(f"\n✅ Done: {decision.get('result')}")
                return decision.get("result")
            if action in self.tools:
                tool_input = decision.get("input", "")
                try:
                    # A dict input maps to keyword arguments (write_file needs two)
                    result = self.tools[action](**tool_input) if isinstance(tool_input, dict) else self.tools[action](tool_input)
                except TypeError as e:
                    result = f"Bad arguments for {action}: {e}"
                print(f"  → {result}")
            else:
                result = "Unknown tool"
                print(f"  Unknown tool: {action}")
            # Record every outcome so the model does not repeat a failed step
            self.memory.add(f"{decision} → {result}")
        return "Max steps reached"


if __name__ == "__main__":
    agent = Agent("Write a summary of AI coding tools to report.md")
    agent.run()
Run it:
python agent.py
Expected output: The agent plans sub-tasks, executes tools, and finishes with a result. With a real search API, it would actually research and write a file.
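To exercise the think → act → observe loop without spending API credits, the decision step can be swapped for a scripted stub. The sketch below is a standalone illustration rather than a drop-in for `Agent`: `run_loop` and `scripted` are hypothetical names, and the `echo` tool exists only for the demo.

```python
def run_loop(decide, tools, max_steps=10):
    """Run think -> act -> observe with an injectable `decide(memory)` callable."""
    memory = []
    for step in range(max_steps):
        decision = decide(memory)
        action = decision.get("action")
        if action == "finish":
            return decision.get("result"), memory
        tool = tools.get(action)
        result = tool(decision.get("input", "")) if tool else "Unknown tool"
        # Record the outcome either way so later steps can see failures
        memory.append(f"Step {step + 1}: {action} -> {result}")
    return "Max steps reached", memory


def scripted(decisions):
    """Return a decide() stub that replays a fixed list of decisions."""
    queue = iter(decisions)
    return lambda memory: next(queue, {"action": "finish", "result": "script exhausted"})


if __name__ == "__main__":
    tools = {"echo": lambda text: text.upper()}
    result, memory = run_loop(
        scripted([
            {"action": "echo", "input": "hello"},
            {"action": "finish", "result": "done"},
        ]),
        tools,
    )
    print(result, memory)
```

Separating the decision function from the loop this way also makes it straightforward to unit-test tool dispatch and the step limit.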
Step 7: Adding More Tools
Extend your agent by adding tools to tools.py:
def read_file(filename):
    """Read file contents"""
    try:
        with open(filename, "r") as f:
            return f.read()
    except FileNotFoundError:
        return f"File {filename} not found"
    except OSError as e:
        return f"Error reading {filename}: {e}"

# Add to TOOL_REGISTRY:
TOOL_REGISTRY["read_file"] = read_file
Update the tool descriptions so the LLM knows about new tools:
# In get_tool_descriptions(), add:
- read_file(filename): Read contents of a file
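Keeping the registry and the description text in sync by hand is easy to forget. One option, sketched here with a hypothetical `describe_tools` helper, is to generate the text from each function's signature and the first line of its docstring using the standard `inspect` module:

```python
import inspect


def describe_tools(registry):
    """Build the 'Available tools' text from function signatures and docstrings."""
    lines = ["Available tools:"]
    for name, fn in registry.items():
        # Parameter names in declaration order, e.g. "filename, content"
        params = ", ".join(inspect.signature(fn).parameters)
        # First docstring line only; fall back when a tool has no docstring
        doc = (inspect.getdoc(fn) or "No description").splitlines()[0]
        lines.append(f"- {name}({params}): {doc}")
    return "\n".join(lines)


if __name__ == "__main__":
    def word_count(text):
        """Count whitespace-separated words"""
        return len(text.split())

    print(describe_tools({"word_count": word_count}))
```

With this approach, adding a tool to `TOOL_REGISTRY` would be enough for it to show up in the prompt, provided each tool has a one-line docstring.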
⚠️ Security warning: Never let an agent delete files or execute shell commands without sandboxing. As written, write_file and read_file accept any path the model supplies, so confine them to the project directory and validate this before production use.
Next Steps
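A minimal sketch of that path check, assuming a single workspace directory and hypothetical helper names (`resolve_in_workspace`, `write_file_confined`); it is not a substitute for real sandboxing:

```python
from pathlib import Path


def resolve_in_workspace(filename, workspace="."):
    """Return an absolute path inside `workspace`, or raise ValueError."""
    root = Path(workspace).resolve()
    target = (root / filename).resolve()
    # resolve() collapses ".." and follows symlinks before the check
    if target != root and root not in target.parents:
        raise ValueError(f"{filename} is outside the workspace")
    return target


def write_file_confined(filename, content, workspace="."):
    """Like write_file, but refuses paths that escape `workspace`."""
    try:
        path = resolve_in_workspace(filename, workspace)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
        return f"Wrote {len(content)} characters to {path.name}"
    except (ValueError, OSError) as e:
        return f"Error: {e}"
```

The same `resolve_in_workspace` call could guard a file-reading tool, so that both tools share a single definition of "inside the project".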
You now have a working agent. Here's how to make it production-ready:
- Add a real search API: replace the DuckDuckGo placeholder with Serper.dev, Exa, or Tavily for reliable web search
- Add retry logic: LLM calls fail, so wrap them in exponential backoff
- Add observability: log every step to LangSmith, Langfuse, or a simple JSONL file
- Swap to a stronger model: GPT-4o or Claude 3.7 Sonnet for harder reasoning tasks
- Add human-in-the-loop: pause for approval on expensive or destructive actions
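As a starting point for the retry item, here is a small exponential-backoff wrapper. It is a sketch under assumptions: `with_backoff` is a hypothetical helper, the delays are illustrative, and in real use `retry_on` should be narrowed to the transient errors your client actually raises.

```python
import random
import time


def with_backoff(fn, max_attempts=4, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a listed exception wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Jitter spreads out retries from concurrent callers
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky():
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    print(with_backoff(flaky, base_delay=0.1, retry_on=(ConnectionError,)))
```

In the agent, the call inside `think()` could be wrapped as `with_backoff(lambda: client.chat.completions.create(...))`. The `sleep` parameter exists so tests can inject a stub instead of waiting.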
Further reading: Compare this to frameworks like LangGraph, CrewAI, and AutoGPT in our AI Agents Directory.