Agentic AI Roadmap for Frontend Developers


A structured, project-driven path from zero to building production-grade AI agents, tailored for developers who know JavaScript/TypeScript and want to move into the AI engineering space.


Table of Contents

  1. What Is Agentic AI?
  2. Phase 0: Foundations (2–3 weeks)
  3. Phase 1: LLM APIs & SDKs (2–3 weeks)
  4. Phase 2: Orchestration Frameworks (4–6 weeks)
  5. Phase 3: Multi-Agent Systems (4–6 weeks)
  6. Phase 4: Local Model Tooling (3–4 weeks)
  7. Capstone Projects
  8. Skills Matrix
  9. Recommended Learning Order

What Is Agentic AI?

Agentic AI refers to systems where an LLM doesn't just respond to a single prompt: it plans, uses tools, calls APIs, and iterates across multiple steps to accomplish a goal autonomously.

Think of it as the difference between:

  • Chatbot → User asks, model answers, done.
  • Agent → User gives a goal, model breaks it into steps, calls tools (web search, code execution, databases), reflects on results, and loops until the task is complete.

As a frontend dev, you already understand state machines, async flows, and UI components. Agentic AI is essentially that, but the "component" is an LLM making decisions.
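
The think, act, observe cycle described above can be sketched without any SDK. This is a minimal illustration, not a framework API: `Policy` is a hypothetical stand-in for the LLM, `demoTools` is a stubbed tool registry, and `maxSteps` is an assumed safety limit so the loop always ends.

```typescript
// Hypothetical shapes for illustration; real SDKs use richer types.
type Action =
  | { kind: "tool"; name: string; input: string }
  | { kind: "finish"; answer: string };

type Tool = (input: string) => string;

// Stand-in for an LLM: picks the next action from the observations so far.
type Policy = (goal: string, observations: string[]) => Action;

function runAgent(
  goal: string,
  policy: Policy,
  tools: Record<string, Tool>,
  maxSteps = 5,
): { answer: string; steps: number } {
  const observations: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const action = policy(goal, observations); // think
    if (action.kind === "finish") {
      return { answer: action.answer, steps: step };
    }
    const tool = tools[action.name];
    const result = tool ? tool(action.input) : `Unknown tool: ${action.name}`; // act
    observations.push(result); // observe, then loop again
  }
  return { answer: "Stopped: step limit reached", steps: maxSteps };
}

// Scripted policy: search once, then answer from the observation.
const scriptedPolicy: Policy = (goal, observations) =>
  observations.length === 0
    ? { kind: "tool", name: "search", input: goal }
    : { kind: "finish", answer: `Based on: ${observations[0]}` };

const demoTools: Record<string, Tool> = {
  search: (query) => `3 results for "${query}"`,
};

const agentResult = runAgent("agentic AI", scriptedPolicy, demoTools);
console.log(agentResult);
```

In a real agent the policy would be a model call and the observations would be appended to the message history, but the control flow stays the same.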


Phase 0: Foundations (2–3 weeks)

Before touching any framework, get comfortable with how LLMs actually work at the API level.

Concepts to Understand

  • Tokens & context windows: why a 128k-token window matters, how billing works
  • System prompts vs user prompts: the "role" architecture of messages
  • Temperature & sampling: controlling creativity vs determinism
  • Tool use / function calling: how models trigger external actions
  • Retrieval-Augmented Generation (RAG): grounding models in your own data
  • Embeddings: turning text into vectors for semantic search
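
Temperature is easiest to understand as arithmetic on the model's raw scores. The sketch below assumes the common formulation in which logits are divided by the temperature before a softmax; the three logits are invented values, and `softmaxWithTemperature` is a hypothetical helper rather than part of any SDK.

```typescript
// Turn raw scores (logits) into probabilities, scaled by temperature.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) throw new Error("temperature must be > 0 in this sketch");
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const logits = [2.0, 1.0, 0.5]; // made-up scores for three candidate tokens
const sharp = softmaxWithTemperature(logits, 0.2); // low temperature: top token dominates
const flat = softmaxWithTemperature(logits, 2.0); // high temperature: probabilities even out

console.log(sharp.map((p) => p.toFixed(3)));
console.log(flat.map((p) => p.toFixed(3)));
```

Lower temperatures concentrate probability on the highest-scoring token, which is why they feel more deterministic; higher temperatures spread it across alternatives.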

Key Terms Cheat Sheet

| Term | What it means |
| --- | --- |
| Prompt | Input text given to the model |
| Completion | The model's output |
| Context window | Max tokens the model can "see" at once |
| Tool/Function call | Model requests an external action (e.g. search the web) |
| Agent loop | The cycle: think → act → observe → repeat |
| RAG | Injecting your own docs into the model's context |
| Embedding | A numeric vector representing text meaning |
| Fine-tuning | Training a model further on your own data |

🛠 Project 0: Prompt Engineering Playground

Build a Next.js app that lets you:

  • Toggle system prompt / user prompt
  • Adjust temperature slider
  • See token count live
  • Compare outputs side by side

What you'll learn: How prompt structure affects output quality. This is the most underrated skill in AI engineering.
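
For the live token counter, a rough estimate is often enough to start with. The sketch below assumes the widely quoted rule of thumb of about four characters per token for English text; it is not an exact tokenizer, and the price argument is a placeholder you would replace with your provider's published rate.

```typescript
// Rough heuristic: about 4 characters per token for English text.
// A real tokenizer gives exact counts; this is only a UI-friendly estimate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// pricePerMillionTokens is a placeholder value, not a real price.
function estimateCostUSD(tokens: number, pricePerMillionTokens: number): number {
  return (tokens / 1e6) * pricePerMillionTokens;
}

// Check whether a prompt leaves room for the reply inside the context window.
function fitsContext(promptTokens: number, maxOutputTokens: number, contextWindow: number): boolean {
  return promptTokens + maxOutputTokens <= contextWindow;
}

const systemPrompt = "You are a helpful assistant.";
const userPrompt = "Explain RAG in one paragraph.";
const promptTokens = estimateTokens(systemPrompt) + estimateTokens(userPrompt);

console.log(promptTokens, estimateCostUSD(promptTokens, 2.5), fitsContext(promptTokens, 1024, 128000));
```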


Phase 1: LLM APIs & SDKs (2–3 weeks)

OpenAI SDK

Install: npm install openai

The OpenAI SDK is the most widely used. Even if you end up using Claude or open-source models, most frameworks are modelled after OpenAI's API shape.

Core Concepts

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Basic completion
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain RAG in one paragraph." },
  ],
});

console.log(response.choices[0].message.content);

Streaming (important for UX)

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a haiku about TypeScript." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
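
In a UI you usually want the text accumulated so far, not just the latest delta. The sketch below shows that pattern against a fake stream so it runs without an API key; `Chunk` is a trimmed, assumed version of the chunk shape used above, and `fakeStream` and `collectStream` are hypothetical helpers.

```typescript
// Trimmed chunk shape, modelled on the fields read in the loop above.
type Chunk = { choices: { delta: { content?: string } }[] };

// Fake stream standing in for the SDK's async iterable.
async function* fakeStream(parts: string[]): AsyncGenerator<Chunk> {
  for (const part of parts) {
    yield { choices: [{ delta: { content: part } }] };
  }
  yield { choices: [{ delta: {} }] }; // a final chunk may carry no text
}

// Accumulate deltas and notify the UI after every chunk.
async function collectStream(
  stream: AsyncIterable<Chunk>,
  onUpdate: (textSoFar: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? "";
    onUpdate(text);
  }
  return text;
}

const updates: string[] = [];
const done = collectStream(fakeStream(["Types ", "guard ", "the code"]), (t) => {
  updates.push(t);
});
done.then((full) => console.log(full, `(${updates.length} updates)`));
```

In React, `onUpdate` would typically call a state setter so the message re-renders as it grows.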

Tool/Function Calling

const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Get current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: { type: "string", description: "City name" },
        },
        required: ["location"],
      },
    },
  },
];

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
  tools,
  tool_choice: "auto",
});

// Model will respond with a tool_call if it decides to use the tool
const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
  const args = JSON.parse(toolCall.function.arguments);
  // Call your actual weather API with args.location
}
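
The step after parsing the arguments is to run the tool and send the result back as a message with `role: "tool"` and the matching `tool_call_id`. The sketch below shows one way to structure that; `handlers` and `executeToolCall` are hypothetical names, the weather data is stubbed, and the types are trimmed to the fields used here.

```typescript
// Minimal shapes mirroring the fields used in the snippet above.
type ToolCall = { id: string; function: { name: string; arguments: string } };
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };

// Hypothetical local implementations keyed by tool name.
const handlers: Record<string, (args: Record<string, unknown>) => unknown> = {
  get_weather: (args) => ({ location: args.location, tempC: 21 }), // stubbed data
};

function executeToolCall(call: ToolCall): ToolMessage {
  let content: string;
  try {
    const handler = handlers[call.function.name];
    if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
    // Arguments arrive as a JSON string and may be malformed.
    const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
    content = JSON.stringify(handler(args));
  } catch (err) {
    // Report the failure to the model instead of crashing the loop.
    content = JSON.stringify({ error: (err as Error).message });
  }
  return { role: "tool", tool_call_id: call.id, content };
}

const toolMessage = executeToolCall({
  id: "call_1",
  function: { name: "get_weather", arguments: '{"location":"Tokyo"}' },
});
console.log(toolMessage);
```

To finish the round trip, append the assistant message that contained the tool call and this tool message to `messages`, then call the API again so the model can write its final answer.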

🛠 Project 1A: Smart Code Review Bot

A VS Code-style web editor (use Monaco Editor) that:

  • Takes a code snippet as input
  • Calls OpenAI to review it with a structured tool response (issues, suggestions, severity)
  • Renders results in a sidebar panel

Stack: Next.js + OpenAI SDK + Monaco Editor + shadcn/ui


Anthropic SDK

Install: npm install @anthropic-ai/sdk

Claude's API has a slightly different shape but the same concepts. The key differences are:

  • System prompt is a top-level field, not a message role
  • Tool use returns tool_use content blocks (not tool_calls)
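  • Streaming still arrives as server-sent events, but as typed events (for example content_block_delta) rather than chat-completion chunks

import Anthropic from "@anthropic-ai/sdk";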

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  system: "You are an expert code reviewer.",
  messages: [
    {
      role: "user",
      content: "Review this function: const add = (a,b) => a + b",
    },
  ],
});

// content is an array of blocks; print the first one only if it is text
const first = message.content[0];
if (first?.type === "text") console.log(first.text);
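
If you already have OpenAI-style message arrays, the first difference listed above (system prompt as a top-level field) is a small mechanical conversion. The sketch below is one possible helper; `toAnthropicParams` and both types are hypothetical and cover only plain string content.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type AnthropicParams = {
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
};

// Pull system messages out of the list and join them into the top-level field.
function toAnthropicParams(messages: ChatMessage[]): AnthropicParams {
  const systemParts: string[] = [];
  const rest: AnthropicParams["messages"] = [];
  for (const m of messages) {
    if (m.role === "system") {
      systemParts.push(m.content);
    } else {
      rest.push({ role: m.role, content: m.content });
    }
  }
  return systemParts.length > 0
    ? { system: systemParts.join("\n\n"), messages: rest }
    : { messages: rest };
}

const converted = toAnthropicParams([
  { role: "system", content: "You are an expert code reviewer." },
  { role: "user", content: "Review this function: const add = (a,b) => a + b" },
]);
console.log(converted);
```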

Tool Use with Claude

const tools: Anthropic.Tool[] = [
  {
    name: "analyze_code",
    description: "Analyze code and return structured feedback",
    input_schema: {
      type: "object",
      properties: {
        issues: {
          type: "array",
          items: { type: "string" },
        },
        score: { type: "number", minimum: 0, maximum: 10 },
      },
      required: ["issues", "score"],
    },
  },
];

const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools,
  tool_choice: { type: "any" },
  messages: [{ role: "user", content: "Analyze: for(let i=0;i<arr.length;i++)" }],
});
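
With `tool_choice` forcing a tool, the structured result arrives as a `tool_use` block inside `content`, and its `input` should be validated before the UI trusts it. The sketch below uses a mocked response so it runs offline; `ContentBlock` is a trimmed, assumed shape, and `isCodeAnalysis` and `extractAnalysis` are hypothetical helpers.

```typescript
// Trimmed content block shapes, for illustration only.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type CodeAnalysis = { issues: string[]; score: number };

// Runtime check: model output is untrusted until validated.
function isCodeAnalysis(value: unknown): value is CodeAnalysis {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    Array.isArray(v.issues) &&
    v.issues.every((issue) => typeof issue === "string") &&
    typeof v.score === "number"
  );
}

function extractAnalysis(content: ContentBlock[]): CodeAnalysis | null {
  for (const block of content) {
    if (block.type === "tool_use" && block.name === "analyze_code") {
      const input = block.input;
      return isCodeAnalysis(input) ? input : null;
    }
  }
  return null;
}

// Mocked response content, not real model output.
const mockContent: ContentBlock[] = [
  { type: "text", text: "Analyzing the loop." },
  {
    type: "tool_use",
    id: "toolu_1",
    name: "analyze_code",
    input: { issues: ["Prefer for...of when the index is unused"], score: 7 },
  },
];

const analysis = extractAnalysis(mockContent);
console.log(analysis);
```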

🛠 Project 1B: Document Q&A with Citations

Build an app where users:

  • Upload a PDF or paste text
  • Ask questions about it
  • Get answers with quoted citations from the original text

Stack: Next.js + Anthropic SDK + pdf-parse


Phase 2: Orchestration Frameworks (4–6 weeks)

These frameworks sit on top of raw LLM APIs and give you chains, memory, RAG pipelines, and agent abstractions.

LangChain

LangChain is the most popular orchestration library. It's available in both Python and JavaScript (langchain, @langchain/core, @langchain/openai).

Key concepts: Chains → LCEL (LangChain Expression Language) → Agents → Memory

import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// LCEL pipe syntax: like Unix pipes for AI
const model = new ChatOpenAI({ model: "gpt-4o" });
const prompt = ChatPromptTemplate.fromTemplate(
  "Summarize this in one sentence: {text}"
);
const parser = new StringOutputParser();

const chain = prompt.pipe(model).pipe(parser);

const result = await chain.invoke({
  text: "LangChain is a framework for building LLM applications...",
});
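
The pipe idea itself is plain function composition. The toy below is not LangChain's implementation; `Step` is a hypothetical class that only shows how the output of one stage becomes the input of the next, with a fake model so it runs offline.

```typescript
// A tiny stand-in for a runnable step: input in, output out.
class Step<I, O> {
  private readonly fn: (input: I) => Promise<O> | O;

  constructor(fn: (input: I) => Promise<O> | O) {
    this.fn = fn;
  }

  async invoke(input: I): Promise<O> {
    return this.fn(input);
  }

  // Compose: the output of this step becomes the input of the next.
  pipe<N>(next: Step<O, N>): Step<I, N> {
    return new Step<I, N>(async (input: I) => next.invoke(await this.invoke(input)));
  }
}

const promptStep = new Step((vars: { text: string }) => `Summarize this in one sentence: ${vars.text}`);
// Fake model: reports the prompt length instead of calling an LLM.
const modelStep = new Step((prompt: string) => ({ content: `[${prompt.length} chars received]` }));
const parserStep = new Step((msg: { content: string }) => msg.content);

const toyChain = promptStep.pipe(modelStep).pipe(parserStep);
const toyResult = toyChain.invoke({ text: "LangChain" });
toyResult.then((out) => console.log(out));
```

Because every stage shares the same `invoke` contract, a prompt, a model, and a parser can be swapped independently, which is the main appeal of the pipe syntax.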

RAG Pipeline with LangChain

import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";

// Placeholder: replace with the raw text you want to index
const yourLongText = "Paste or load your document text here.";

// 1. Split your documents into chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.createDocuments([yourLongText]);

// 2. Embed and store
const vectorStore = await MemoryVectorStore.fromDocuments(
  docs,
  new OpenAIEmbeddings()
);

// 3. Create retrieval chain
const retriever = vectorStore.asRetriever();
const llm = new ChatOpenAI({ model: "gpt-4o" });

// The prompt must expose {context} (retrieved chunks) and {input} (the question)
const prompt = ChatPromptTemplate.fromTemplate(
  "Answer using only this context:\n{context}\n\nQuestion: {input}"
);

const combineDocsChain = await createStuffDocumentsChain({ llm, prompt });
const chain = await createRetrievalChain({
  retriever,
  combineDocsChain,
});

const result = await chain.invoke({ input: "What is the refund policy?" });
console.log(result.answer);
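
To see what `chunkSize` and `chunkOverlap` do, here is a deliberately naive fixed-size splitter. It is a simplified stand-in, not the algorithm `RecursiveCharacterTextSplitter` uses (that one prefers paragraph and sentence boundaries); `splitWithOverlap` is a hypothetical helper.

```typescript
// Naive fixed-size splitter: each chunk repeats the last chunkOverlap
// characters of the previous chunk so context is not cut mid-thought.
function splitWithOverlap(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const pieces: string[] = [];
  const stride = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += stride) {
    pieces.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // the last chunk reached the end
  }
  return pieces;
}

const sample = "abcdefghijklmnopqrstuvwxyz";
const chunks = splitWithOverlap(sample, 10, 4);
console.log(chunks);
// → ["abcdefghij", "ghijklmnop", "mnopqrstuv", "stuvwxyz"]
```

Larger overlap improves the odds that a retrieved chunk contains a complete idea, at the cost of storing and embedding more text.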

🛠 Project 2A: Internal Knowledge Base Chatbot

Build a chatbot for a mock company's docs:

  • Ingest markdown/PDF files into a vector store (start with in-memory, then try Pinecone or Supabase pgvector)
  • Stream responses with source citations
  • Add a conversation history sidebar

Stack: Next.js + LangChain.js + Pinecone (or Supabase) + Vercel AI SDK for streaming UI


LlamaIndex

LlamaIndex specialises in data ingestion and indexing, which makes it the go-to for RAG-heavy applications. It is a better fit than LangChain when your primary challenge is "getting data into the model intelligently."

import { Document, VectorStoreIndex, SimpleDirectoryReader } from "llamaindex";

// Load documents from a folder
const reader = new SimpleDirectoryReader();
const documents = await reader.loadData("./data");

// Build index
const index = await VectorStoreIndex.fromDocuments(documents);

// Query
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "What are the main features?",
});

console.log(response.toString());

Key LlamaIndex Concepts

  • Nodes: chunks of your document with metadata
  • Index: organized structure for fast retrieval
  • Query Engine: retrieves and synthesises an answer
  • Chat Engine: adds memory/conversation history on top
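
The retrieval half of a query engine can be sketched as nodes plus a similarity ranking. This is an illustration of the concept, not LlamaIndex's internals: `IndexedNode` and `retrieveTopK` are hypothetical, and the three-dimensional vectors are hand-made stand-ins for real embeddings.

```typescript
// A node: a chunk of text, its metadata, and its embedding vector.
type IndexedNode = { id: string; text: string; source: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every node against the query and keep the k best.
function retrieveTopK(queryVector: number[], candidates: IndexedNode[], k: number) {
  return candidates
    .map((node) => ({ node, score: cosineSimilarity(queryVector, node.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hand-made vectors; real embeddings have hundreds of dimensions.
const nodes: IndexedNode[] = [
  { id: "n1", text: "Refunds are issued within 14 days.", source: "policy.md", vector: [0.9, 0.1, 0.0] },
  { id: "n2", text: "Our office is in Berlin.", source: "about.md", vector: [0.0, 0.2, 0.9] },
  { id: "n3", text: "Contact billing for refund status.", source: "faq.md", vector: [0.7, 0.6, 0.1] },
];

const hits = retrieveTopK([1, 0, 0], nodes, 2);
for (const hit of hits) {
  console.log(hit.node.source, hit.score.toFixed(3));
}
```

Keeping `source` on each node is what lets the UI show which document a retrieved chunk came from, alongside its relevance score.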

🛠 Project 2B: Multi-Document Research Assistant

A tool that lets you:

  • Upload 5–10 research papers (PDF)
  • Ask cross-document questions ("What do papers 2 and 4 say about X?")
  • See which chunks were retrieved with relevance scores

Stack: Next.js + LlamaIndex TS + PDF.js


Haystack

Haystack (by deepset) is enterprise-focused, built around pipelines and great for production RAG systems. It's Python-first, so you'd typically expose it as a FastAPI backend.

from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore

# Build a pipeline
store = InMemoryDocumentStore()
pipeline = Pipeline()

pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))
pipeline.add_component("prompt_builder", PromptBuilder(template="""
Answer based on context:
{% for doc in documents %} {{ doc.content }} {% endfor %}
Question: {{ question }}
"""))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))

pipeline.connect("retriever", "prompt_builder.documents")
pipeline.connect("prompt_builder", "llm")

result = pipeline.run({"retriever": {"query": "..."}, "prompt_builder": {"question": "..."}})

🛠 Project 2C: Customer Support Pipeline

Build a full support ticket system:

  • Python/FastAPI backend with Haystack pipeline
  • Classify tickets (billing, technical, general) with an LLM router
  • Route to different RAG pipelines per category
  • Next.js frontend that polls status and streams responses
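
The classify-then-route step is a lookup from category to pipeline. In the sketch below a keyword matcher stands in for the LLM router so the example runs offline; `classifyTicket`, `pipelines`, and `routeTicket` are hypothetical names, and each pipeline is a stub where a RAG pipeline would go.

```typescript
type Category = "billing" | "technical" | "general";
type Ticket = { id: string; body: string };

// Keyword stub standing in for the LLM classifier.
function classifyTicket(ticket: Ticket): Category {
  const text = ticket.body.toLowerCase();
  if (/(invoice|refund|charge|payment)/.test(text)) return "billing";
  if (/(error|bug|crash|timeout)/.test(text)) return "technical";
  return "general";
}

// One handler per category; each would wrap its own RAG pipeline.
const pipelines: Record<Category, (ticket: Ticket) => string> = {
  billing: (t) => `billing-pipeline handled ${t.id}`,
  technical: (t) => `technical-pipeline handled ${t.id}`,
  general: (t) => `general-pipeline handled ${t.id}`,
};

function routeTicket(ticket: Ticket): { category: Category; reply: string } {
  const category = classifyTicket(ticket);
  return { category, reply: pipelines[category](ticket) };
}

console.log(routeTicket({ id: "T1", body: "I was charged twice, please refund me." }));
console.log(routeTicket({ id: "T2", body: "The app crashes on login." }));
```

Typing `pipelines` as `Record<Category, ...>` makes the compiler flag any category that has no pipeline, which is useful once an LLM is producing the labels.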

Phase 3: Multi-Agent Systems (4–6 weeks)

Single agents are powerful. Multiple specialised agents working together are transformative.

AutoGen

Microsoft's AutoGen lets you define conversable agents that talk to each other. One agent can be an LLM, another can run code, another can be a human proxy.

import os

import autogen

# Config for GPT-4o; read the API key from the environment instead of hardcoding it
config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]
llm_config = {"config_list": config_list}

# Define agents
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="You are a helpful AI assistant.",
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",  # ask for human input only when the chat is about to end
    code_execution_config={"work_dir": "workspace", "use_docker": False},  # runs generated code on the host; prefer Docker
)

# Start conversation: agents will go back and forth
user_proxy.initiate_chat(
    assistant,
    message="Write a Python script to scrape Hacker News top stories, then run it.",
)

GroupChat: Multiple Agents

# Define specialist agents
planner = autogen.AssistantAgent("planner", llm_config=llm_config,
    system_message="Break tasks into sub-tasks.")
coder = autogen.AssistantAgent("coder", llm_config=llm_config,
    system_message="Write code only. No explanations.")
reviewer = autogen.AssistantAgent("reviewer", llm_config=llm_config,
    system_message="Review code for bugs and security issues.")

# Group chat orchestrates who speaks when
groupchat = autogen.GroupChat(
    agents=[user_proxy, planner, coder, reviewer],
    messages=[],
    max_round=10,
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="Build a REST API for a todo app.")
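
What the group chat manager does can be pictured as a loop that picks a speaker, appends the reply to a shared transcript, and stops at a round limit or a termination signal. The TypeScript sketch below assumes simple round-robin selection and a "TERMINATE" keyword; AutoGen's manager can also let an LLM choose the next speaker, which this toy does not model. All names here are hypothetical.

```typescript
type ChatTurn = { speaker: string; content: string };
// Stand-in for an LLM-backed agent: sees the transcript, returns a reply.
type ChatAgent = { name: string; reply: (transcriptSoFar: ChatTurn[]) => string };

function runGroupChat(agents: ChatAgent[], task: string, maxRound: number): ChatTurn[] {
  const turns: ChatTurn[] = [{ speaker: "user", content: task }];
  for (let round = 0; round < maxRound; round++) {
    const agent = agents[round % agents.length]; // round-robin speaker selection
    const content = agent.reply(turns);
    turns.push({ speaker: agent.name, content });
    if (content.includes("TERMINATE")) break; // assumed termination signal
  }
  return turns;
}

const team: ChatAgent[] = [
  { name: "planner", reply: () => "1. Define routes 2. Add storage" },
  { name: "coder", reply: (t) => `Code for: ${t[t.length - 1].content}` },
  { name: "reviewer", reply: () => "No blocking issues. TERMINATE" },
];

const transcript = runGroupChat(team, "Build a REST API for a todo app.", 10);
for (const turn of transcript) {
  console.log(`${turn.speaker}: ${turn.content}`);
}
```

The round limit matters as much as the termination check: without it, agents that never agree would keep spending tokens indefinitely.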

🛠 Project 3A: AI Dev Team Simulator

A web app where you:

  • Describe a feature ("add user authentication")
  • Watch a planner → coder → reviewer agent pipeline work through it
  • Stream each agent's "thoughts" and output in real time to a terminal-style UI

Stack: FastAPI (AutoGen) + Next.js + WebSockets for live streaming


CrewAI

CrewAI is higher-level and more intuitive than AutoGen: agents have roles, backstories, and goals. Great for workflows where you want human-readable agent personas.

from crewai import Agent, Task, Crew, Process

# Define specialist agents with personas
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover cutting-edge developments in {topic}",
    backstory="You work at a leading tech think tank. Your expertise is finding credible, recent information.",
    verbose=True,
    allow_delegation=False,
)

writer = Agent(
    role="Tech Content Strategist",
    goal="Craft compelling content about {topic}",
    backstory="You're a renowned content strategist known for insightful, accessible tech articles.",
    verbose=True,
    allow_delegation=True,
)

# Define tasks
research_task = Task(
    description="Investigate the latest AI agent frameworks in 2025. Find at least 5 key developments.",
    expected_output="A detailed report with sources.",
    agent=researcher,
)

write_task = Task(
    description="Using the research, write a 500-word blog post for developers.",
    expected_output="A polished blog post in markdown.",
    agent=writer,
)

# Assemble the crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "agentic AI frameworks"})
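
Two ideas in that example translate directly to TypeScript: `{topic}` placeholders filled from `inputs`, and a sequential process where each task receives the previous task's output. The sketch below illustrates both with stubbed task functions; it is not CrewAI's implementation, and `CrewTask`, `interpolate`, and `kickoffSequential` are hypothetical names.

```typescript
type CrewTask = {
  description: string;
  // Stub for an agent doing the work: gets the task text and prior output.
  run: (description: string, context: string) => string;
};

// Replace {name} placeholders with values from inputs; unknown keys stay as-is.
function interpolate(template: string, inputs: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    key in inputs ? inputs[key] : match,
  );
}

// Sequential process: each task's output becomes the next task's context.
function kickoffSequential(tasks: CrewTask[], inputs: Record<string, string>): string {
  let context = "";
  for (const task of tasks) {
    context = task.run(interpolate(task.description, inputs), context);
  }
  return context;
}

const crewTasks: CrewTask[] = [
  { description: "Research {topic}", run: (d) => `Report(${d})` },
  { description: "Write a post about {topic}", run: (d, ctx) => `Post(${d}) using ${ctx}` },
];

const finalOutput = kickoffSequential(crewTasks, { topic: "agentic AI frameworks" });
console.log(finalOutput);
```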

Adding Tools to CrewAI Agents

from crewai_tools import SerperDevTool, FileReadTool, WebsiteSearchTool

search_tool = SerperDevTool()  # Google search
file_tool = FileReadTool()
web_rag_tool = WebsiteSearchTool()

researcher = Agent(
    role="Researcher",
    goal="Find accurate information",
    backstory="Expert researcher with web access.",
    tools=[search_tool, web_rag_tool],
)

🛠 Project 3B: Content Pipeline Crew

Build an automated blog content pipeline:

  • Researcher agent finds trending topics (with web search tool)
  • Writer agent drafts the post
  • Editor agent critiques and rewrites
  • SEO agent adds metadata, keywords, alt text

Frontend: React dashboard showing each agent's status, output preview, and an "approve & publish" button.


Phase 4: Local Model Tooling (3–4 weeks)

Running models locally means no per-token API costs, data that stays on your machine, and full control over the model. That matters most for production systems with strict privacy or cost constraints.

PyTorch

PyTorch is the foundational deep learning library. You don't need to train models from scratch, but understanding tensors, devices, and inference loops helps you debug everything else.

import torch
from torch import nn

# Check if GPU is available
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
print(f"Using: {device}")

# Tensors: the building block
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)
y = torch.tensor([[5.0, 6.0], [7.0, 8.0]], device=device)
print(x @ y)  # Matrix multiply

# A simple neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.layers(x)

model = SimpleNet().to(device)

Running Inference (what you'll actually do most)

# Run inference with the model defined above (in practice, load pretrained weights first)
model.eval()  # inference mode: disables dropout, makes batchnorm use its running statistics
with torch.no_grad():  # don't track gradients (saves memory)
    input_tensor = torch.randn(1, 10).to(device)
    output = model(input_tensor)
    print(output)

Transformers (Hugging Face)

The transformers library by Hugging Face gives you thousands of pretrained models (LLaMA, Mistral, Gemma, Phi, etc.) with a unified API.

Install: pip install transformers accelerate torch

Running a Local LLM

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "microsoft/Phi-3-mini-4k-instruct"  # small but capable

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision = half the VRAM
    device_map="auto",           # automatically uses GPU if available
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain vector embeddings in simple terms."},
]

# Apply chat template (each model has its own format)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)

response = tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)

Embeddings for RAG

from transformers import AutoTokenizer, AutoModel
import torch

model_id = "BAAI/bge-small-en-v1.5"  # excellent small embedding model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

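def get_embedding(text: str) -> list[float]:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    # BGE models are trained for CLS pooling: take the first token's hidden state
    embedding = outputs.last_hidden_state[:, 0]
    # L2-normalize so that a dot product equals cosine similarity
    embedding = torch.nn.functional.normalize(embedding, p=2, dim=1).squeeze(0)
    return embedding.tolist()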

vec = get_embedding("What is agentic AI?")
print(len(vec))  # 384 dimensions

Quantization (run big models on small hardware)

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,              # 4-bit quantization (4x less VRAM)
    bnb_4bit_compute_dtype=torch.float16,
)

# LLaMA-3 8B now needs roughly 5-6 GB for weights instead of ~16 GB in fp16; a 6GB GPU is a tight fit
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=quant_config,
    device_map="auto",
)
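
The memory saving is simple arithmetic: parameter count times bits per weight. The sketch below covers weights only and uses a round 8 billion parameters as an assumption; activations, the KV cache, and quantization metadata add to the real footprint. `weightMemoryGB` is a hypothetical helper.

```typescript
// Weight memory only: parameters × bits per weight ÷ 8 bits per byte.
// Activations, KV cache, and framework overhead come on top of this.
function weightMemoryGB(parameterCount: number, bitsPerWeight: number): number {
  const bytes = (parameterCount * bitsPerWeight) / 8;
  return bytes / 1e9;
}

const params8B = 8e9; // round figure for an "8B" model

console.log(`fp16:  ${weightMemoryGB(params8B, 16)} GB`); // 16 GB
console.log(`8-bit: ${weightMemoryGB(params8B, 8)} GB`); // 8 GB
console.log(`4-bit: ${weightMemoryGB(params8B, 4)} GB`); // 4 GB
```

This is why 4-bit loading is described as roughly a 4x reduction relative to half precision.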

🛠 Project 4A: Local AI Coding Assistant

Build an Ollama-powered coding assistant:

  • Spin up ollama run codellama locally
  • Build a Next.js frontend that streams responses
  • Add a local vector store (LanceDB or ChromaDB) for "remember this code" functionality
  • No API keys, runs entirely on your machine

Stack: Ollama + LanceDB + Next.js + Vercel AI SDK (local provider)


Capstone Projects

These projects combine multiple phases and are portfolio-ready.

Capstone 1: AI-Powered Design System Auditor

What it does: Analyzes a codebase for design inconsistencies.

Architecture:

  • CrewAI with 3 agents: Scanner (reads files) → Analyzer (finds issues) → Reporter (writes suggestions)
  • LangChain for chunking and indexing large codebases
  • React frontend with a file tree + annotation overlay

Why it's impressive: Combines multi-agent orchestration, RAG on code, and a polished UI. Real-world utility for frontend teams.


Capstone 2: Autonomous Research Agent

What it does: Given a topic, independently searches the web, reads papers, synthesizes findings, and produces a structured report with citations.

Architecture:

  • AutoGen GroupChat: Planner + Researcher + Critic + Writer
  • Haystack for document ingestion and storage
  • Next.js dashboard with live agent activity feed

Why it's impressive: Demonstrates full agentic loops with reflection and criticism.


Capstone 3: Local-First Coding Copilot

What it does: A VS Code extension (or web IDE) powered entirely by local models.

Architecture:

  • Transformers (Phi-3 or DeepSeek Coder) for code completion
  • LlamaIndex for codebase RAG
  • WebSocket server exposing the model, Next.js as the frontend

Why it's impressive: Privacy-first, no API costs, real technical depth.


Skills Matrix

| Skill | Phase | Priority | Notes |
| --- | --- | --- | --- |
| Prompt engineering | 0 | 🔴 Critical | Foundation of everything |
| OpenAI SDK | 1 | 🔴 Critical | Industry standard API shape |
| Anthropic SDK | 1 | 🟡 Important | Best-in-class for complex reasoning |
| Streaming responses | 1 | 🔴 Critical | Required for good UX |
| LangChain | 2 | 🟡 Important | Widest ecosystem, slightly verbose |
| LlamaIndex | 2 | 🟡 Important | Best for document-heavy RAG |
| Haystack | 2 | 🟢 Nice-to-have | Enterprise, Python-first |
| Vector databases | 2 | 🔴 Critical | Core infrastructure for RAG |
| AutoGen | 3 | 🟡 Important | Best for code-executing agents |
| CrewAI | 3 | 🟡 Important | Best for role-based workflows |
| PyTorch basics | 4 | 🟢 Nice-to-have | Helps with debugging |
| Transformers | 4 | 🟡 Important | Local model control |
| Quantization | 4 | 🟢 Nice-to-have | For resource-constrained environments |

Recommended Learning Order

Week 1–2:   Phase 0 - Core concepts + Project 0 (Playground)
Week 3–4:   Phase 1 - OpenAI SDK + Project 1A (Code Review Bot)
Week 5–6:   Phase 1 - Anthropic SDK + Project 1B (Document Q&A)
Week 7–9:   Phase 2 - LangChain + LlamaIndex + Projects 2A & 2B
Week 10–11: Phase 2 - Haystack + Project 2C (optional, Python exposure)
Week 12–14: Phase 3 - AutoGen + Project 3A (Dev Team Simulator)
Week 15–16: Phase 3 - CrewAI + Project 3B (Content Pipeline)
Week 17–18: Phase 4 - PyTorch basics + Transformers
Week 19–20: Phase 4 - Local models + Project 4A (Local Copilot)
Week 21–24: Capstone project of your choice

Your Unfair Advantage as a Frontend Dev

Most AI engineers have poor UX skills. You have the opposite problem. Lean into it:

  • Build beautiful streaming UIs (most AI tools look terrible)
  • Focus on real-time feedback: agent thought logs, progress indicators
  • Master Vercel AI SDK: it's the bridge between AI backends and React UIs
  • Think about error states: agents fail, time out, and hallucinate. Handle it gracefully.
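
For the failure and timeout cases in the last point, one common approach is to wrap each agent call in a timeout and retry it with exponential backoff. The sketch below is one possible shape, not a prescribed pattern; `withTimeout`, `withRetry`, and the flaky stand-in agent are hypothetical, and the delays are kept tiny so the demo finishes quickly.

```typescript
// Reject if the promise does not settle within ms milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Retry with exponential backoff: baseDelayMs, then 2x, 4x, ...
async function withRetry<T>(
  task: () => Promise<T>,
  options: { attempts: number; timeoutMs: number; baseDelayMs: number },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < options.attempts; attempt++) {
    try {
      return await withTimeout(task(), options.timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < options.attempts - 1) {
        const delay = options.baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // surface the final error so the UI can show an error state
}

// Flaky stand-in for an agent call: fails twice, then succeeds.
let calls = 0;
const flakyAgent = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error("model overloaded");
  return "final answer";
};

const retried = withRetry(flakyAgent, { attempts: 4, timeoutMs: 1000, baseDelayMs: 10 });
retried.then((answer) => console.log(answer, `after ${calls} calls`));
```

Retries help with transient failures only; a hallucinated answer succeeds at the network level, so it needs validation or a review step rather than a retry.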

The best agentic AI products will be built by people who understand both the model layer and the experience layer. That's you.
