A structured, project-driven path from zero to building production-grade AI agents, tailored for developers who know JavaScript/TypeScript and want to move into the AI engineering space.
- What Is Agentic AI?
- Phase 0: Foundations (2-3 weeks)
- Phase 1: LLM APIs & SDKs (2-3 weeks)
- Phase 2: Orchestration Frameworks (4-6 weeks)
- Phase 3: Multi-Agent Systems (4-6 weeks)
- Phase 4: Local Model Tooling (3-4 weeks)
- Capstone Projects
- Skills Matrix
- Recommended Learning Order
Agentic AI refers to systems where an LLM doesn't just respond to a single prompt: it plans, uses tools, calls APIs, and iterates across multiple steps to accomplish a goal autonomously.
Think of it as the difference between:
- Chatbot: User asks, model answers, done.
- Agent: User gives a goal, model breaks it into steps, calls tools (web search, code execution, databases), reflects on results, and loops until the task is complete.
As a frontend dev, you already understand state machines, async flows, and UI components. Agentic AI is essentially that, but the "component" is an LLM making decisions.
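To make the chatbot/agent distinction concrete, here is a minimal sketch of the loop an agent runs. Everything in it is hypothetical: `fakeModel` stands in for a real LLM call, and `demoTools` holds a single toy tool. It only illustrates the control flow (decide, call a tool, record the result, repeat until done or a step limit is hit).

```typescript
// One step the (stubbed) model can decide on: call a tool or finish.
type AgentDecision =
  | { kind: "tool"; tool: string; input: string }
  | { kind: "final"; answer: string };

type ToolFn = (input: string) => string;

// Hypothetical tool registry; a real agent would call a search API, a sandbox, etc.
const demoTools: Record<string, ToolFn> = {
  add: (input) => {
    const parts = input.split("+").map(Number);
    return String((parts[0] ?? 0) + (parts[1] ?? 0));
  },
};

// Stand-in for an LLM call: picks the next step from the observations so far.
function fakeModel(goal: string, observations: string[]): AgentDecision {
  if (observations.length === 0) {
    return { kind: "tool", tool: "add", input: "2+3" };
  }
  const last = observations[observations.length - 1];
  return { kind: "final", answer: `${goal} -> ${last}` };
}

function runAgent(goal: string, maxSteps = 5): string {
  const observations: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const decision = fakeModel(goal, observations); // think
    if (decision.kind === "final") return decision.answer;
    const tool = demoTools[decision.tool];
    const result = tool ? tool(decision.input) : `unknown tool: ${decision.tool}`; // act
    observations.push(result); // observe, then loop again
  }
  return "stopped: step limit reached";
}

console.log(runAgent("What is 2 + 3?"));
```

The step limit matters in practice: without it, a model that never decides it is finished would loop (and bill you) indefinitely.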
Before touching any framework, get comfortable with how LLMs actually work at the API level.
- Tokens & context windows: why a 128k-token context window matters, how billing works
- System prompts vs user prompts: the "role" architecture of messages
- Temperature & sampling: controlling creativity vs determinism
- Tool use / function calling: how models trigger external actions
- Retrieval-Augmented Generation (RAG): grounding models in your own data
- Embeddings: turning text into vectors for semantic search
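For intuition about temperature, the sketch below applies a temperature-scaled softmax to a made-up set of token scores (logits). This is the textbook formulation, not any provider's actual sampling code; the logit values are invented for illustration.

```typescript
// Convert raw token scores into probabilities, scaled by temperature.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) throw new Error("temperature must be > 0");
  const scaled = logits.map((l) => l / temperature);
  const maxScaled = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - maxScaled));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Invented scores for three candidate next tokens.
const exampleLogits = [2.0, 1.0, 0.1];

console.log(softmaxWithTemperature(exampleLogits, 0.2)); // sharply peaked on the first token
console.log(softmaxWithTemperature(exampleLogits, 1.0)); // unscaled distribution
console.log(softmaxWithTemperature(exampleLogits, 2.0)); // flatter, so sampling is more varied
```

Low temperatures concentrate probability on the top-scoring token (more deterministic output); high temperatures spread it out (more varied output).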
| Term | What it means |
|---|---|
| Prompt | Input text given to the model |
| Completion | The model's output |
| Context window | Max tokens the model can "see" at once |
| Tool/Function call | Model requests an external action (e.g. search the web) |
| Agent loop | The cycle: think → act → observe → repeat |
| RAG | Injecting your own docs into the model's context |
| Embedding | A numeric vector representing text meaning |
| Fine-tuning | Training a model further on your own data |
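The "Embedding" and "RAG" rows come together in semantic search: embed the query, then rank stored vectors by similarity. A minimal sketch using cosine similarity follows. The three-dimensional vectors and document ids are made up; real embedding models return hundreds or thousands of dimensions.

```typescript
// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedDoc {
  id: string;
  vector: number[];
}

// Rank documents by similarity to the query vector and keep the best k.
function topK(query: number[], docs: EmbeddedDoc[], k: number): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Invented toy vectors standing in for real embeddings.
const toyDocs: EmbeddedDoc[] = [
  { id: "refund-policy", vector: [0.9, 0.1, 0.0] },
  { id: "shipping-times", vector: [0.1, 0.9, 0.1] },
  { id: "careers", vector: [0.0, 0.2, 0.9] },
];

console.log(topK([0.8, 0.2, 0.0], toyDocs, 2));
```

A vector database does essentially this ranking, with indexing structures that avoid comparing the query against every stored vector.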
Build a Next.js app that lets you:
- Toggle system prompt / user prompt
- Adjust temperature slider
- See token count live
- Compare outputs side by side
What you'll learn: How prompt structure affects output quality. This is the most underrated skill in AI engineering.
Install: npm install openai
The OpenAI SDK is the most widely used. Even if you end up using Claude or open-source models, most frameworks are modelled after OpenAI's API shape.
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Basic completion
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain RAG in one paragraph." },
],
});
console.log(response.choices[0].message.content);
const stream = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Write a haiku about TypeScript." }],
stream: true,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
const tools = [
{
type: "function" as const,
function: {
name: "get_weather",
description: "Get current weather for a location",
parameters: {
type: "object",
properties: {
location: { type: "string", description: "City name" },
},
required: ["location"],
},
},
},
];
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
tools,
tool_choice: "auto",
});
// Model will respond with a tool_call if it decides to use the tool
const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
const args = JSON.parse(toolCall.function.arguments);
// Call your actual weather API with args.location
}
A VS Code-style web editor (use Monaco Editor) that:
- Takes a code snippet as input
- Calls OpenAI to review it with a structured tool response (issues, suggestions, severity)
- Renders results in a sidebar panel
Stack: Next.js + OpenAI SDK + Monaco Editor + shadcn/ui
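The function-calling snippet earlier stops at parsing the arguments. A sketch of the remaining step, running the matching handler and packaging its result as a `role: "tool"` message to send back on the next request, might look like the following. The types are local stand-ins that mirror the shapes used above rather than imports from the SDK, and `toolHandlers` with its stubbed weather value is hypothetical.

```typescript
// Local stand-ins mirroring the tool-call shape used in the snippet above.
interface FunctionToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

interface ToolResultMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical handler registry; replace the stub with a real weather API call.
const toolHandlers: Record<string, ToolHandler> = {
  get_weather: (args) => ({ location: String(args["location"]), tempC: 21, note: "stubbed value" }),
};

function runToolCall(call: FunctionToolCall): ToolResultMessage {
  let content: string;
  try {
    const handler = toolHandlers[call.function.name];
    if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
    // The model can emit malformed JSON, so parsing is inside the try block.
    const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
    content = JSON.stringify(handler(args));
  } catch (err) {
    // Report the failure to the model instead of crashing the loop.
    content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
  }
  return { role: "tool", tool_call_id: call.id, content };
}

console.log(
  runToolCall({
    id: "call_1",
    type: "function",
    function: { name: "get_weather", arguments: '{"location":"Tokyo"}' },
  })
);
```

Returning errors as tool output is a design choice: it gives the model a chance to retry with corrected arguments rather than ending the conversation.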
Install: npm install @anthropic-ai/sdk
Claude's API has a slightly different shape but the same concepts. The key differences are:
- System prompt is a top-level field, not a message role
- Tool use returns tool_use content blocks (not tool_calls)
- Streaming uses server-sent events
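If an app needs to support both providers, one option is to normalise the two tool-call shapes into a single internal type. The sketch below assumes the response shapes described in this guide (a `tool_calls` array with JSON-string arguments on one side, `tool_use` content blocks with an already-parsed `input` on the other). The type and function names are hypothetical, not SDK exports.

```typescript
// Local stand-in for an OpenAI-style assistant message.
interface OpenAIStyleMessage {
  content: string | null;
  tool_calls?: {
    id: string;
    type: "function";
    function: { name: string; arguments: string };
  }[];
}

// Local stand-in for Anthropic-style content blocks.
type AnthropicStyleBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };

// The single shape the rest of the app works with.
interface NormalizedToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

function fromOpenAI(message: OpenAIStyleMessage): NormalizedToolCall[] {
  return (message.tool_calls ?? []).map((call) => ({
    id: call.id,
    name: call.function.name,
    args: JSON.parse(call.function.arguments) as Record<string, unknown>, // JSON string -> object
  }));
}

function fromAnthropic(blocks: AnthropicStyleBlock[]): NormalizedToolCall[] {
  const calls: NormalizedToolCall[] = [];
  for (const block of blocks) {
    if (block.type === "tool_use") {
      calls.push({ id: block.id, name: block.name, args: block.input }); // input is already an object
    }
  }
  return calls;
}

console.log(
  fromAnthropic([
    { type: "text", text: "Let me check." },
    { type: "tool_use", id: "toolu_1", name: "get_weather", input: { location: "Tokyo" } },
  ])
);
```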
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const message = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
system: "You are an expert code reviewer.",
messages: [
{
role: "user",
content: "Review this function: const add = (a,b) => a + b",
},
],
});
console.log(message.content[0].type === "text" && message.content[0].text);
const tools: Anthropic.Tool[] = [
{
name: "analyze_code",
description: "Analyze code and return structured feedback",
input_schema: {
type: "object",
properties: {
issues: {
type: "array",
items: { type: "string" },
},
score: { type: "number", minimum: 0, maximum: 10 },
},
required: ["issues", "score"],
},
},
];
const response = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
tools,
tool_choice: { type: "any" },
messages: [{ role: "user", content: "Analyze: for(let i=0;i<arr.length;i++)" }],
});
Build an app where users:
- Upload a PDF or paste text
- Ask questions about it
- Get answers with quoted citations from the original text
Stack: Next.js + Anthropic SDK + pdf-parse
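Since models can misquote, one safeguard for the citation feature is to check that every returned quote actually appears in the source text before rendering it. A minimal sketch, assuming you have asked the model to return an answer plus a list of verbatim quotes (the `CitedAnswer` shape is hypothetical):

```typescript
// Hypothetical structured output you would request from the model.
interface CitedAnswer {
  answer: string;
  quotes: string[];
}

// PDF extraction often changes line breaks and spacing, so compare on collapsed whitespace.
function collapseWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

function checkQuotes(source: string, cited: CitedAnswer): { quote: string; found: boolean }[] {
  const haystack = collapseWhitespace(source);
  return cited.quotes.map((quote) => ({
    quote,
    found: haystack.includes(collapseWhitespace(quote)),
  }));
}

const sourceText = "Refunds are available within 30 days\nof purchase. Shipping is free over $50.";
const modelOutput: CitedAnswer = {
  answer: "You can get a refund within 30 days.",
  quotes: ["Refunds are available within 30 days of purchase.", "Refunds take 5 business days."],
};

console.log(checkQuotes(sourceText, modelOutput));
```

Quotes that fail the check can be hidden or flagged in the UI rather than shown as if they were verified.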
These frameworks sit on top of raw LLM APIs and give you chains, memory, RAG pipelines, and agent abstractions.
LangChain is the most popular orchestration library. It's available in both Python and JavaScript (langchain, @langchain/core, @langchain/openai).
Key concepts: Chains → LCEL (LangChain Expression Language) → Agents → Memory
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
// LCEL pipe syntax: like Unix pipes for AI
const model = new ChatOpenAI({ model: "gpt-4o" });
const prompt = ChatPromptTemplate.fromTemplate(
"Summarize this in one sentence: {text}"
);
const parser = new StringOutputParser();
const chain = prompt.pipe(model).pipe(parser);
const result = await chain.invoke({
text: "LangChain is a framework for building LLM applications...",
});
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
// 1. Split your documents into chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.createDocuments([yourLongText]);
// 2. Embed and store
const vectorStore = await MemoryVectorStore.fromDocuments(
  docs,
  new OpenAIEmbeddings()
);
// 3. Create retrieval chain
const retriever = vectorStore.asRetriever();
const llm = new ChatOpenAI({ model: "gpt-4o" });
// The combine-documents prompt needs a {context} slot for the retrieved chunks;
// the retrieval chain passes the user's question as {input}
const ragPrompt = ChatPromptTemplate.fromTemplate(
  "Answer using only this context:\n{context}\n\nQuestion: {input}"
);
const combineDocsChain = await createStuffDocumentsChain({ llm, prompt: ragPrompt });
const chain = await createRetrievalChain({
retriever,
combineDocsChain,
});
const result = await chain.invoke({ input: "What is the refund policy?" });
Build a chatbot for a mock company's docs:
- Ingest markdown/PDF files into a vector store (start with in-memory, then try Pinecone or Supabase pgvector)
- Stream responses with source citations
- Add a conversation history sidebar
Stack: Next.js + LangChain.js + Pinecone (or Supabase) + Vercel AI SDK for streaming UI
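To see what `chunkSize` and `chunkOverlap` control during ingestion, here is a deliberately simplified fixed-width splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter tries to break on separators such as paragraphs and sentences); it only illustrates how overlap repeats the tail of one chunk at the head of the next so that context spanning a boundary is not lost.

```typescript
// Simplified character-based chunking with overlap (illustrative only).
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in the range [0, chunkSize)");
  }
  const chunks: string[] = [];
  const stride = chunkSize - chunkOverlap; // how far the window advances each step
  for (let start = 0; start < text.length; start += stride) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk already reached the end
  }
  return chunks;
}

const alphabet = "abcdefghijklmnopqrstuvwxyz";
console.log(chunkText(alphabet, 10, 4));
// → ["abcdefghij", "ghijklmnop", "mnopqrstuv", "stuvwxyz"]
```

Larger overlap improves recall at chunk boundaries but increases the number of chunks you embed and store.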
LlamaIndex specialises in data ingestion and indexing, and it's the go-to for RAG-heavy applications. Better than LangChain when your primary challenge is "getting data into the model intelligently."
import { Document, VectorStoreIndex, SimpleDirectoryReader } from "llamaindex";
// Load documents from a folder
const reader = new SimpleDirectoryReader();
const documents = await reader.loadData("./data");
// Build index
const index = await VectorStoreIndex.fromDocuments(documents);
// Query
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
query: "What are the main features?",
});
console.log(response.toString());
- Nodes: chunks of your document with metadata
- Index: organized structure for fast retrieval
- Query Engine: retrieves and synthesises an answer
- Chat Engine: adds memory/conversation history on top
A tool that lets you:
- Upload 5-10 research papers (PDF)
- Ask cross-document questions ("What do papers 2 and 4 say about X?")
- See which chunks were retrieved with relevance scores
Stack: Next.js + LlamaIndex TS + PDF.js
Haystack (by deepset) is enterprise-focused, built around pipelines and great for production RAG systems. It's Python-first, so you'd typically expose it as a FastAPI backend.
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
# Build a pipeline
store = InMemoryDocumentStore()
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))
pipeline.add_component("prompt_builder", PromptBuilder(template="""
Answer based on context:
{% for doc in documents %} {{ doc.content }} {% endfor %}
Question: {{ question }}
"""))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))
pipeline.connect("retriever", "prompt_builder.documents")
pipeline.connect("prompt_builder", "llm")
result = pipeline.run({"retriever": {"query": "..."}, "prompt_builder": {"question": "..."}})
Build a full support ticket system:
- Python/FastAPI backend with Haystack pipeline
- Classify tickets (billing, technical, general) with an LLM router
- Route to different RAG pipelines per category
- Next.js frontend that polls status and streams responses
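The classify-then-route step in this project can be sketched independently of Haystack. In the example below, `classifyTicket` uses keyword matching purely as a stand-in for the LLM router, and each entry in `ticketPipelines` is a stub where a real RAG pipeline call would go. All names are hypothetical.

```typescript
type TicketCategory = "billing" | "technical" | "general";

// Stand-in for an LLM classifier; keyword matching is for illustration only.
function classifyTicket(text: string): TicketCategory {
  const lower = text.toLowerCase();
  if (/(invoice|refund|charge|payment)/.test(lower)) return "billing";
  if (/(error|bug|crash|timeout)/.test(lower)) return "technical";
  return "general";
}

type TicketPipeline = (question: string) => string;

// One pipeline per category; each stub marks where a category-specific RAG call would go.
const ticketPipelines: Record<TicketCategory, TicketPipeline> = {
  billing: (q) => `[billing docs] ${q}`,
  technical: (q) => `[technical docs] ${q}`,
  general: (q) => `[general docs] ${q}`,
};

function routeTicket(text: string): { category: TicketCategory; answer: string } {
  const category = classifyTicket(text);
  return { category, answer: ticketPipelines[category](text) };
}

console.log(routeTicket("I was charged twice, can I get a refund?"));
console.log(routeTicket("The app shows an error on login"));
```

Typing the registry as `Record<TicketCategory, TicketPipeline>` means the compiler flags a missing pipeline whenever a new category is added.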
Single agents are powerful. Multiple specialised agents working together are transformative.
Microsoft's AutoGen lets you define conversable agents that talk to each other. One agent can be an LLM, another can run code, another can be a human proxy.
import autogen
# Config for GPT-4o
config_list = [{"model": "gpt-4o", "api_key": "your-key"}]
llm_config = {"config_list": config_list}
# Define agents
assistant = autogen.AssistantAgent(
name="assistant",
llm_config=llm_config,
system_message="You are a helpful AI assistant.",
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="TERMINATE", # prompt the human only when a termination condition is reached
code_execution_config={"work_dir": "workspace", "use_docker": False},
)
# Start conversation: agents will go back and forth
user_proxy.initiate_chat(
assistant,
message="Write a Python script to scrape Hacker News top stories, then run it.",
)
# Define specialist agents
planner = autogen.AssistantAgent("planner", llm_config=llm_config,
system_message="Break tasks into sub-tasks.")
coder = autogen.AssistantAgent("coder", llm_config=llm_config,
system_message="Write code only. No explanations.")
reviewer = autogen.AssistantAgent("reviewer", llm_config=llm_config,
system_message="Review code for bugs and security issues.")
# Group chat orchestrates who speaks when
groupchat = autogen.GroupChat(
agents=[user_proxy, planner, coder, reviewer],
messages=[],
max_round=10,
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
user_proxy.initiate_chat(manager, message="Build a REST API for a todo app.")
A web app where you:
- Describe a feature ("add user authentication")
- Watch a planner → coder → reviewer agent pipeline work through it
- Stream each agent's "thoughts" and output in real time to a terminal-style UI
Stack: FastAPI (AutoGen) + Next.js + WebSockets for live streaming
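Before wiring up the real backend, it can help to prototype the conversation loop with stubbed agents so the UI has a transcript to render. The sketch below is a simplified stand-in for a group chat: it assumes strict round-robin turn-taking (AutoGen's manager can choose the next speaker in other ways), a round limit like `max_round`, and a "TERMINATE" marker to end early. The agent replies are canned strings.

```typescript
interface ChatTurn {
  speaker: string;
  content: string;
}

interface StubAgent {
  name: string;
  reply: (transcript: ChatTurn[]) => string; // stand-in for an LLM call
}

function runGroupChat(agents: StubAgent[], task: string, maxRound: number): ChatTurn[] {
  const transcript: ChatTurn[] = [{ speaker: "user", content: task }];
  for (let round = 0; round < maxRound; round++) {
    const agent = agents[round % agents.length]; // round-robin speaker selection
    if (!agent) break; // no agents configured
    const content = agent.reply(transcript);
    transcript.push({ speaker: agent.name, content });
    if (content.includes("TERMINATE")) break; // an agent signalled completion
  }
  return transcript;
}

// Canned replies; each would be a model call in a real system.
const stubAgents: StubAgent[] = [
  { name: "planner", reply: () => "Plan: 1) data model 2) routes 3) tests" },
  { name: "coder", reply: (t) => `Implementing after ${t.length} prior messages` },
  { name: "reviewer", reply: () => "No blocking issues found. TERMINATE" },
];

for (const turn of runGroupChat(stubAgents, "Build a REST API for a todo app.", 10)) {
  console.log(`${turn.speaker}: ${turn.content}`);
}
```

Each pushed turn is a natural unit to send over the WebSocket to the terminal-style UI.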
CrewAI is higher-level and more intuitive than AutoGen: agents have roles, backstories, and goals. Great for workflows where you want human-readable agent personas.
from crewai import Agent, Task, Crew, Process
# Define specialist agents with personas
researcher = Agent(
role="Senior Research Analyst",
goal="Uncover cutting-edge developments in {topic}",
backstory="You work at a leading tech think tank. Your expertise is finding credible, recent information.",
verbose=True,
allow_delegation=False,
)
writer = Agent(
role="Tech Content Strategist",
goal="Craft compelling content about {topic}",
backstory="You're a renowned content strategist known for insightful, accessible tech articles.",
verbose=True,
allow_delegation=True,
)
# Define tasks
research_task = Task(
description="Investigate the latest AI agent frameworks in 2025. Find at least 5 key developments.",
expected_output="A detailed report with sources.",
agent=researcher,
)
write_task = Task(
description="Using the research, write a 500-word blog post for developers.",
expected_output="A polished blog post in markdown.",
agent=writer,
)
# Assemble the crew
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, write_task],
process=Process.sequential,
verbose=True,
)
result = crew.kickoff(inputs={"topic": "agentic AI frameworks"})
from crewai_tools import SerperDevTool, FileReadTool, WebsiteSearchTool
search_tool = SerperDevTool() # Google search
file_tool = FileReadTool()
web_rag_tool = WebsiteSearchTool()
researcher = Agent(
role="Researcher",
goal="Find accurate information",
backstory="Expert researcher with web access.",
tools=[search_tool, web_rag_tool],
)
Build an automated blog content pipeline:
- Researcher agent finds trending topics (with web search tool)
- Writer agent drafts the post
- Editor agent critiques and rewrites
- SEO agent adds metadata, keywords, alt text
Frontend: React dashboard showing each agent's status, output preview, and an "approve & publish" button.
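Running models locally means no per-token API costs, data that never leaves your machine, and full control over the model. That matters most for production systems with strict privacy or cost constraints.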
PyTorch is the foundational deep learning library. You don't need to train models from scratch, but understanding tensors, devices, and inference loops helps you debug everything else.
import torch
from torch import nn
# Check if GPU is available
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
print(f"Using: {device}")
# Tensors: the building block
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)
y = torch.tensor([[5.0, 6.0], [7.0, 8.0]], device=device)
print(x @ y) # Matrix multiply
# A simple neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.layers(x)
model = SimpleNet().to(device)
# Run inference (with a real pretrained model you would load its checkpoint first)
model.eval() # put dropout/batchnorm layers into inference mode
with torch.no_grad(): # don't track gradients (saves memory)
    input_tensor = torch.randn(1, 10).to(device)
    output = model(input_tensor)
print(output)
The transformers library by Hugging Face gives you thousands of pretrained models (LLaMA, Mistral, Gemma, Phi, etc.) with a unified API.
Install: pip install transformers accelerate torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "microsoft/Phi-3-mini-4k-instruct" # small but capable
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16, # half precision = half the VRAM
device_map="auto", # automatically uses GPU if available
)
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain vector embeddings in simple terms."},
]
# Apply chat template (each model has its own format)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
from transformers import AutoTokenizer, AutoModel
import torch
model_id = "BAAI/bge-small-en-v1.5" # excellent small embedding model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
def get_embedding(text: str) -> list[float]:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean pooling over token embeddings
    embedding = outputs.last_hidden_state.mean(dim=1).squeeze()
    return embedding.tolist()
vec = get_embedding("What is agentic AI?")
print(len(vec)) # 384 dimensions
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
quant_config = BitsAndBytesConfig(
    load_in_4bit=True, # 4-bit weights: roughly 4x less VRAM than float16
    bnb_4bit_compute_dtype=torch.float16,
)
# With 4-bit weights, LLaMA-3 8B needs roughly 5-6 GB including overhead, so a 6GB GPU is borderline and 8 GB gives headroom
model = AutoModelForCausalLM.from_pretrained(
"meta-llama/Meta-Llama-3-8B-Instruct",
quantization_config=quant_config,
device_map="auto",
)
Build an Ollama-powered coding assistant:
- Spin up ollama run codellama locally
- Build a Next.js frontend that streams responses
- Add a local vector store (LanceDB or ChromaDB) for "remember this code" functionality
- No API keys, runs entirely on your machine
Stack: Ollama + LanceDB + Next.js + Vercel AI SDK (local provider)
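When choosing which model to run locally, a quick back-of-the-envelope check is parameter count times bytes per parameter. The helper below is a rough sketch of that arithmetic only: it covers model weights and ignores the KV cache, activations, and quantization overhead, so real usage is higher. The 8-billion parameter count is a round figure used for illustration.

```typescript
// Rough weight-only memory estimate in gigabytes (1 GB = 1e9 bytes).
function estimateWeightMemoryGB(paramCount: number, bitsPerParam: number): number {
  const bytes = paramCount * (bitsPerParam / 8);
  return bytes / 1e9;
}

const eightBillion = 8e9; // round figure for an "8B" model

console.log(estimateWeightMemoryGB(eightBillion, 32)); // float32 → 32
console.log(estimateWeightMemoryGB(eightBillion, 16)); // float16 → 16
console.log(estimateWeightMemoryGB(eightBillion, 4)); // 4-bit → 4
```

The jump from 16 GB to about 4 GB of weights is why 4-bit quantization brings 7B-8B models within reach of consumer GPUs and laptops.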
These projects combine multiple phases and are portfolio-ready.
What it does: Analyzes a codebase for design inconsistencies.
Architecture:
- CrewAI with 3 agents: Scanner (reads files) → Analyzer (finds issues) → Reporter (writes suggestions)
- LangChain for chunking and indexing large codebases
- React frontend with a file tree + annotation overlay
Why it's impressive: Combines multi-agent orchestration, RAG on code, and a polished UI. Real-world utility for frontend teams.
What it does: Given a topic, independently searches the web, reads papers, synthesizes findings, and produces a structured report with citations.
Architecture:
- AutoGen GroupChat: Planner + Researcher + Critic + Writer
- Haystack for document ingestion and storage
- Next.js dashboard with live agent activity feed
Why it's impressive: Demonstrates full agentic loops with reflection and criticism.
What it does: A VS Code extension (or web IDE) powered entirely by local models.
Architecture:
- Transformers (Phi-3 or DeepSeek Coder) for code completion
- LlamaIndex for codebase RAG
- WebSocket server exposing the model, Next.js as the frontend
Why it's impressive: Privacy-first, no API costs, real technical depth.
| Skill | Phase | Priority | Notes |
|---|---|---|---|
| Prompt engineering | 0 | 🔴 Critical | Foundation of everything |
| OpenAI SDK | 1 | 🔴 Critical | Industry standard API shape |
| Anthropic SDK | 1 | 🟡 Important | Best-in-class for complex reasoning |
| Streaming responses | 1 | 🔴 Critical | Required for good UX |
| LangChain | 2 | 🟡 Important | Widest ecosystem, slightly verbose |
| LlamaIndex | 2 | 🟡 Important | Best for document-heavy RAG |
| Haystack | 2 | 🟢 Nice-to-have | Enterprise, Python-first |
| Vector databases | 2 | 🔴 Critical | Core infrastructure for RAG |
| AutoGen | 3 | 🟡 Important | Best for code-executing agents |
| CrewAI | 3 | 🟡 Important | Best for role-based workflows |
| PyTorch basics | 4 | 🟢 Nice-to-have | Helps with debugging |
| Transformers | 4 | 🟡 Important | Local model control |
| Quantization | 4 | 🟢 Nice-to-have | For resource-constrained environments |
Week 1-2: Phase 0 - Core concepts + Project 0 (Playground)
Week 3-4: Phase 1 - OpenAI SDK + Project 1A (Code Review Bot)
Week 5-6: Phase 1 - Anthropic SDK + Project 1B (Document Q&A)
Week 7-9: Phase 2 - LangChain + LlamaIndex + Projects 2A & 2B
Week 10-11: Phase 2 - Haystack + Project 2C (optional, Python exposure)
Week 12-14: Phase 3 - AutoGen + Project 3A (Dev Team Simulator)
Week 15-16: Phase 3 - CrewAI + Project 3B (Content Pipeline)
Week 17-18: Phase 4 - PyTorch basics + Transformers
Week 19-20: Phase 4 - Local models + Project 4A (Local Copilot)
Week 21-24: Capstone project of your choice
Many AI engineers have little UX experience. As a frontend developer, you have the opposite profile. Lean into it:
- Build beautiful streaming UIs (most AI tools look terrible)
- Focus on real-time feedback: agent thought logs, progress indicators
- Master Vercel AI SDK: it's the bridge between AI backends and React UIs
- Think about error states: agents fail, time out, and hallucinate. Handle it gracefully.
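For the error-state point, a common pattern is to wrap each model or agent call in a timeout and retry it with exponential backoff. The sketch below uses only standard promises and timers. The helper names and default values (500 ms base delay, 8 s cap, 3 attempts, 10 s timeout) are arbitrary assumptions, not recommendations from any SDK.

```typescript
// Delay before the next retry: doubles each attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Reject if the work does not settle within ms milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer); // avoid a dangling timer
  }
}

// Retry a failing async call, waiting longer between each attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  timeoutMs = 10000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await withTimeout(fn(), timeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError; // surface the final failure so the UI can show an error state
}

console.log([0, 1, 2, 3, 4, 5].map((attempt) => backoffDelayMs(attempt)));
// → [500, 1000, 2000, 4000, 8000, 8000]
```

Note that the timeout here only stops waiting; it does not cancel the underlying request. In the UI, the thrown error is the cue to render a retry button or fallback message instead of a spinner that never resolves.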
The best agentic AI products will be built by people who understand both the model layer and the experience layer. That's you.