A practical guide for developers who want to use AI to produce more, without depending on "magic prompts".
Based on real experience, not hype.
Structured in 3 pillars and 5 phases.
If you're like most developers, you've already tried using AI to code. Maybe you ran some prompts in ChatGPT, installed an extension in VS Code, or even bought access to Cursor, Windsurf, or GitHub Copilot.
But the result was frustrating, wasn't it?
- Generic responses that don't work for your project.
- More time fixing AI code than writing from scratch.
- The feeling that "this doesn't work for real cases."
- Fear of falling behind while other devs "master" AI.
The truth is: you're not doing anything wrong. You just don't have a method.
McKinsey tested developers on real tasks and discovered that, with the right method, it's possible to be:
- 45-50% faster at code documentation.
- ~50% faster writing new code.
- Significantly more efficient at refactoring.
The research concludes by saying:
Generative AI is poised to transform software development in a way that no other tooling or process improvement has done. Using today's class of generative AI-based tools, developers can complete tasks up to two times faster - and this is just the beginning. As the technology evolves and is seamlessly integrated within tools across the software development life cycle, it is expected to further improve the speed and even quality of the development process. But as our research shows, tooling alone is not enough to unlock the technology's full potential. A structured approach encompassing generative AI training and coaching, use case selection, workforce upskilling, and risk controls can lay a solid foundation for organizations to pursue generative AI's promise of extraordinary productivity and unparalleled software innovation.
The conclusion is clear: AI multiplies productivity when used with method. The secret isn't in the tool, but in how you integrate it into your workflow through a structured process that combines practical training, targeted coaching, well-selected use cases, and rigorous quality controls.
Additionally, the research also mentions that developers using AI tools are twice as likely to report happiness, satisfaction, and better flow state. They attributed this to the tools' ability to automate repetitive tasks that prevented them from performing more satisfying activities, and to obtain information faster than searching for solutions on the web.
One sign of this: the number of new questions posted on Stack Overflow dropped sharply after ChatGPT's launch in 2022.

The problem isn't AI. It's how we're using it.
- Extreme Programming with AI: AI as a fixed pair programming partner.
- Intelligent Context Switching: Strategic context management.
- Don't Trust, Verify: Verification is key.
Extreme Programming (XP) was developed and formalized by Kent Beck in the 90s as an agile software development methodology focused on continuous delivery with quality and adaptability. One of XP's core practices is Pair Programming: two developers work on the same code, alternating between "driver" (who types) and "navigator" (who reviews the written code) every 15-30 minutes. While the navigator reviews the code, they can suggest improvements, discuss ideas and direction, and evaluate complexities. This is done to relieve the driver's focus on tactical tasks, allowing them to concentrate on the current task, using the navigator as a guide and safety net.
Why XP works:
- Quality: Continuous review reduces bugs
- Knowledge: Knowledge is shared between developers through constant discussion
- Focus: Constant discussion requires attention, meaning fewer distractions.
- Learning: Exposure to different approaches, ideas, and perspectives.
The concept of pair programming is powerful, and AI can be the perfect partner: available 24/7, no ego, vast knowledge, infinite patience, and it calmly tolerates your cursing 🤣.
When you use AI as a pair programming partner, you maintain all the benefits of traditional XP - quality from continuous review, shared knowledge, constant focus, etc.
However, the dynamic changes: Instead of rotating roles as in traditional XP, you establish a fixed partnership where you are always the navigator (make strategic decisions, define architecture, validate results) and AI is always the driver (suggests implementations, writes code, executes instructions, etc).
This means:
- Constant discussion: You and AI converse about each decision.
- Immediate review: AI analyzes and suggests improvements in real-time. You review, validate, and suggest changes until reaching the desired result, then authorize implementation - strategically directing the AI.
- Expanded knowledge: Access to millions of tested patterns and solutions and ease of finding information.
- Total availability: AI is always ready, without time or energy limitations, only depends on your willingness and pocket to pay for more tokens.
The practical result? You can understand complex projects, summarize extensive documentation in seconds, review code with multiple perspectives, write tests, receive implementation suggestions, and generate ideas for edge cases you didn't think of - all while maintaining control and quality.
Role Definition:
- You: Navigator, Architect, Validator, Decision Maker
- AI: Driver, Implementer, Typist, Focus on the execution
- Dialogue: Constant discussion during all stages
General Rules:
- Always discuss the approach before implementing
- Use AI to generate multiple options and choose the best
- Keep specialized conversations by context
- Review and validate each suggestion before accepting and implementing
- Document important decisions for future reference
Context switch is a concept that comes from computer science - when a processor needs to stop one task and start another, there's a cost to "switching context." Gerald Weinberg, in the book Quality Software Management: Systems Thinking, was one of the first to apply this concept to software development. He considered that context switching is the process of stopping work on a project and returning to it after performing a task on a different project. Just like in computing, this also has a cost for developers.
Weinberg discovered that when developers work on multiple projects or tasks simultaneously, productivity drops drastically, as attention is divided between different tasks. Thus, the "warm-up" time and cognitive effort to return to the previous task increase as more context switches are performed:
- 1 project: 100% productivity
- 2 projects: 40% productivity each (20% lost to context switching)
- 3 projects: 20% productivity each (40% lost)
- 4 projects: 10% productivity each (60% lost)
- 5 projects: <10% productivity each (80% lost)
Effort lost to context switching (source: https://www.sei.cmu.edu/blog/addressing-the-detrimental-effects-of-context-switching-with-devops)
Why is this devastating for developers?
- High cognitive load: Programming requires keeping dozens of variables, assumptions, architecture, and business rules in memory
- "Warm-up" time: It takes 15-30 minutes to resume the previous mental state
- Loss of flow state: Interruptions break the state of deep concentration
- Increased bugs: Frequent context changes lead to more errors from lack of focus
Every time you stop working on the frontend to fix a bug in the backend, or leave a feature to attend a meeting, you lose not only the interruption time, but also the time to "reconnect" your brain to the original problem.
The result? A developer working on 3 different projects loses 40% of their productivity just switching context, meaning if you spend 8 hours programming, only 4.8 hours are actually productive.
AI helps you partially solve this problem because it can keep all context available. While you need 15-30 minutes to "reconnect" to the project, AI keeps all context instantly available. Just a quick read and a few questions for you to resume the task exactly where you left off. You only need to keep important decisions and references documented before switching context.
Context Management
- Create a specific conversation for each context (e.g., Backend-Auth, Frontend-Dashboard)
- Use descriptive names that clearly identify the scope
- Keep conversations focused on a single domain/project
Context Loading
- Always start sessions by loading the relevant context
- Ask for summaries of main points from the previous session
- Provide documentation, architecture, and relevant code examples
Context Saving
- Document decisions and progress at the end of each session
- Save executive summaries to facilitate resumption
- Keep a history of changes and justifications
- I always write a short summary of the state I'm leaving the task in and what I intend to do when resuming it (see the sketch below)
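For illustration, here's a minimal sketch of what such a session-end summary could look like (the file name, dates, and details are hypothetical):

```markdown
<!-- .context/projects/project-x/tasks/task-auth-refresh.md (hypothetical) -->
## Session Summary - 2024-05-10

### Current state
- Refresh-token endpoint implemented in backend-api/src/api/auth.js; unit tests passing.
- Token rotation not yet wired into the frontend interceptor.

### Decisions
- Short-lived access tokens (15 min) with rotating refresh tokens, agreed during the discussion with the Agent.

### Next steps
- Update the frontend HTTP interceptor to retry once on 401.
- Add an integration test for the rotation flow.
```

A few lines like these are usually enough for the model (and for you) to rebuild the working context in the next session.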
The result? You drastically reduce "warm-up" time and maintain a good level of productivity even when alternating between different projects, because your memory has been extended and the model accelerates the process of resuming the task.
The maxim "Don't Trust, Verify" comes from the crypto community. It's a fundamental principle in decentralized networks to not trust third parties, always verify. When using LLM models for programming, we can apply the same principle because the nature of LLMs requires constant verification, as models can hallucinate.
Large Language Models (LLMs) like GPT, Claude, and Gemini are giant neural networks trained to predict the next word in a text. Given an initial excerpt, the model calculates which word is most likely to come next, and repeats this several times until forming a complete response.
A neural network is basically a set of mathematical functions called neurons, organized in layers. Each neuron receives information, does a calculation, and sends the result forward to the next neurons. Neurons are connected by synapses, which are represented by weights - values that indicate how much one neuron influences another. During training, the model learns by adjusting these weights millions (or billions) of times until identifying patterns and generating coherent responses.
Why do they hallucinate? OpenAI explains:
Hallucinations are not bugs, but direct consequences of how models are trained and evaluated. Traditional evaluation systems reward guessing instead of honesty about uncertainty.
It's like a multiple-choice test: if you don't know the answer but guess, you have a chance of getting it right. Leaving it blank guarantees zero points. Similarly, when models are evaluated only for accuracy, they're incentivized to guess instead of saying "I don't know."
Why is this problematic in development:
- Hallucinations: LLMs invent functions, APIs, or syntaxes that don't exist
- False confidence: They respond with certainty even when they're completely wrong
- Limited context: They can lose important details of your specific project
- Outdated patterns: Training may include obsolete or insecure practices
Real hallucination example:
// ❌ AI "invented" a function that doesn't exist
const result = await usersDb.findUserByEmailAndValidate(email);

// ✅ Real function that exists
const user = await usersDb.findOne({ email });
if (!user) throw new Error('User not found');

You can learn more about why LLM models hallucinate in this OpenAI blog post: Why language models hallucinate.
Before Accepting Any AI Code
Important: This is a practical checklist, not exhaustive. Use it as a starting point and adapt according to your project's specific needs and tech stack. The goal is to create a critical review habit that becomes natural in your workflow.
1. Immediate Technical Verification
- Does the code compile/execute without errors?
- Do the tests pass?
- Was fake code generated only to satisfy tests?
- Do all functions and methods actually exist?
- Are imports and dependencies correct?
2. Context Validation
- Does it make sense within your project's architecture?
- Does it follow established code patterns?
- Does it avoid breaking existing functionality?
- Is it really the best approach for this specific case?
3. Security Check
- Does it avoid introducing known vulnerabilities?
- Does it adequately handle errors and exceptions?
- Does it correctly validate all inputs?
- Does it follow security principles? (see the sketch below)
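To make these checks concrete, here's a minimal sketch of the kind of input validation and error handling you should confirm is present before accepting AI-generated code. It's an Express-style handler with hypothetical names; `usersDb` is the same illustrative data-access object used in the hallucination example above:

```js
// Hypothetical handler: the things to verify are input validation,
// explicit error handling, and not leaking more data than needed.
async function getUserByEmail(req, res) {
  const { email } = req.body ?? {};

  // Validate inputs before touching the database
  if (typeof email !== 'string' || !email.includes('@')) {
    return res.status(400).json({ error: 'Invalid email' });
  }

  try {
    const user = await usersDb.findOne({ email });
    if (!user) {
      return res.status(404).json({ error: 'User not found' });
    }
    // Return only the fields the caller needs
    return res.json({ id: user.id, email: user.email });
  } catch (err) {
    // Handle unexpected failures instead of letting them crash the request
    console.error('getUserByEmail failed', err);
    return res.status(500).json({ error: 'Internal error' });
  }
}
```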
Always ask for explanations
- "Explain the reasoning behind this solution"
- "What are the trade-offs of this approach?"
- "What edge cases does this implementation consider?"
- "What code or documentation served as the basis for this solution?"
Request alternatives
- "Show other different ways to solve this"
- "What would be a simpler version of this solution?"
- "How do other frameworks/languages/modules/libraries solve this problem?"
- "What would be a more secure version of this solution?"
Increment the context
- Provide relevant project documentation
- Share examples of similar code already working
- Include architecture diagrams when necessary
- Provide links to framework/library documentation
- Links to tickets, pull requests, etc.
There are three main ways to work with AI in software development. Each mode has its ideal use cases, advantages, and limitations. Understanding when to use each one is fundamental to maximizing productivity.
What it is: Direct interaction through web interfaces of ChatGPT, Claude, or other LLMs.
How it works:
- You copy code/context (Ctrl+C)
- Paste in chat (Ctrl+V)
- Receive the response
- Copy back to your editor
When to use:
- Research and exploration: Understanding concepts, APIs, frameworks
- Specific debugging: Analyzing a specific error or stack trace
- Code review: Asking for analysis of a specific snippet
- Brainstorming: Discussing approaches without implementing
- Documentation: Generating or reviewing documentation
Advantages:
- ✅ No additional cost (free versions available)
- ✅ Access to newer models quickly
- ✅ Requires no configuration or installation
- ✅ Useful for conceptual discussions
- ✅ Good for learning and exploration
Limitations:
- ❌ Manual and repetitive workflow (copy/paste)
- ❌ No direct access to project files
- ❌ Limited context (you need to copy everything manually)
- ❌ Doesn't execute code or tests
- ❌ Difficult to control history between sessions
What it is: Tools integrated into your development environment (Windsurf, Cursor, GitHub Copilot, etc.)
How it works:
- AI has direct access to project files
- Can read, modify, and create files
- Executes commands and tests
- Maintains context between iterations
- Native integration with git, terminal, etc.
When to use:
- Feature implementation: Complete development of functionalities
- Refactoring: Structural changes across multiple files
- Complex debugging: Problems involving several files
- Testing: Creating and running automated tests
- Fast iteration: Multiple attempts until getting it right
Advantages:
- ✅ Complete access to the project
- ✅ Integrated workflow (no copy/paste)
- ✅ Can execute code and tests
- ✅ Maintains context
- ✅ Real-time suggestions
- ✅ Granular control over changes
- ✅ Easy to control conversation history
Limitations:
- ❌ Additional cost (~$10-30/month)
- ❌ Tool learning curve
- ❌ May consume many resources (RAM/CPU)
- ❌ Depends on stable internet
Popular tools:
- Windsurf: $15/month - Lightweight, organized, checkpoints, great for complex projects
- Cursor: $20/month - Fast, practical, ideal for prototypes
- GitHub Copilot: $10/month - Intelligent autocomplete, GitHub integration
- Continue.dev: Free - Open source, customizable
Cost: $10-30/month + API costs (if using own models)
What it is: Autonomous specialized agents that solve specific tasks from start to finish, with minimal human intervention.
How it works:
- You define the task/objective
- Agent plans execution
- Executes all steps automatically
- Validates results
- Reports completion or problems
When to use:
- Repetitive tasks: Migrations, dependency updates, mass refactoring
- Code analysis: Security audits, performance analysis, code quality
- Boilerplate generation: Scaffolding, complete CRUD, configurations
- Automated testing: Generation of complete test suites
- CI/CD: Pipeline automation, deploys, rollbacks
Advantages:
- ✅ Fully automated
- ✅ Solves complex end-to-end problems
- ✅ Specialization in specific tasks
- ✅ Consistency and repeatability
- ✅ Frees time for strategic work
- ✅ Can run in background or CI/CD
Limitations:
- ❌ Requires advanced setup and configuration
- ❌ Less granular control
- ❌ Can make incorrect decisions without supervision
- ❌ Higher cost (multiple API calls)
- ❌ More complex debugging when it fails
- ❌ Needs rigorous validation of results
Tools and frameworks:
- Devin: Complete development agent
- AutoGPT: Framework to create custom agents
- LangChain Agents: Library for building specialized agents
- Sweep: Agent for GitHub issues
- Mentat: CLI agent for development
| Aspect | Chat Mode | IDE/CLI Mode | Agent Mode |
|---|---|---|---|
| Control | 🟢 Total | 🟡 High | 🔴 Medium |
| Speed | 🔴 Slow | 🟢 Fast | 🟢 Very Fast |
| Automation | 🔴 Manual | 🟡 Semi-auto | 🟢 Total |
| Cost | 🟢 Low | 🟡 Medium | 🔴 High |
| Setup | 🟢 Zero | 🟡 Simple | 🔴 Complex |
| Use Cases | Exploration | Development | Automation |
This guide is focused on integrated IDE/CLI mode, which offers the best balance between control, productivity, and quality for software development.
Why IDE/CLI is the ideal mode:
- ✅ Total control over each change
- ✅ Complete context of the project available
- ✅ Fast iteration with immediate feedback
- ✅ Natural integration with your workflow
- ✅ Organized and traceable history
- ✅ Easy to customize
When to use other modes as complement:
Chat Mode - For quick research when you don't want to load complete context:
- Understand a new concept quickly
- Validate an idea before implementing
- Explore alternatives without compromising the project
Agent Mode - For automation of specific and repetitive tasks:
- Execute mass code analysis
- Automate processes
Context Engineering is the design and management of all information provided to an AI system to optimize its performance and ensure relevant and accurate results. It goes beyond simple prompts, including structured data, tool outputs, memory, and rules. It's an evolution of prompt engineering. You can find more information in this Anthropic post: https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents .
Context Enrichment
The context you provide to the model determines the quality of responses. There are three main ways to enrich context:
1. Rules (Custom Rules)
Rules are instructions you define to guide AI behavior. Instead of using default tool settings (which consume context space with generic information), create your own rules focused on what really matters for your project.
Examples of useful rules (an illustrative rule file follows below):
- Your team's code patterns (naming, file structure)
- Project-specific security practices
- Commit and documentation conventions
- Technical restrictions (dependency versions, compatibility)
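As an illustration, a rule file can be a short markdown document. The content below is hypothetical, written for the rule-js-best-practices.md file that appears in the folder structure later in this guide:

```markdown
<!-- .context/.config/rules/rule-js-best-practices.md (illustrative content) -->
# JavaScript Rules for This Project

- Use ES modules (import/export); do not introduce require().
- Name files in kebab-case; name exported functions in camelCase.
- Every new function that touches the database must have a unit test.
- Do not add new dependencies without discussing them first.
- Follow the existing error-handling pattern: throw Error with a clear message, never fail silently.
```

Because you wrote the rules, every token they consume is information the model actually needs for your project.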
2. Output Templates
Templates define the expected format of AI responses. This ensures consistency and facilitates output integration into your workflow. For example, you can create templates for:
- Technical specifications (standardized structure for requirements, tasks, tests)
- Architectural decision documentation
- Implementation plans with estimates
- Code analysis reports
3. MCPs (Model Context Protocol)
MCPs are integrations that connect AI with external tools (databases, APIs, file systems). They expand AI capabilities beyond just reading and writing code.
Important caution: MCPs that access databases can expose sensitive data to the model. Use only with anonymized data or in isolated development environments. Never connect MCPs directly to production databases with real user data. A database MCP can make unwanted changes to the database.
Optimization tip: IDE/CLI tools usually load default configuration files automatically, consuming precious context tokens. By creating your own rules and templates, you control exactly what's loaded, optimizing context window usage for what really matters in your project.
Context Engineering Challenges
Working with context in AI isn't trivial. There are three main problems you need to manage:
1. Context Window Management
The context window is the maximum amount of information the model can process at once, measured in tokens. Tokens are pieces of text the model uses to process language - a word can be 1 token or several, depending on size and language.
Limit examples:
- GPT-4: ~128k tokens (~96k words)
- Claude 3.5 Sonnet: ~200k tokens (~150k words)
- Gemini 1.5 Pro: ~2M tokens (~1.5M words)
The problem: The more context you load (files, history, documentation), the faster you hit the limit. When this happens, the model starts to "forget" old information or simply fails.
Solution: Be strategic about what to include. Load only files relevant to the current task. Use the .context/ folder to keep external documentation that you load manually when needed, instead of letting the tool load everything automatically. Start a new conversation when you feel the model is "forgetting" old information or generating many inaccuracies. A good rule is to keep context between 40-50% of the limit: as you add more data to the context, response quality degrades and executions fail more often.
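You can't count tokens exactly without the model's tokenizer, but a rough heuristic of about 4 characters per token (for English text) is enough to check whether what you plan to load fits the 40-50% target. A minimal sketch with hypothetical numbers:

```js
// Rough token estimate: ~4 characters per token for English text.
// This is a heuristic only; exact counts require the model's tokenizer.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Hypothetical scenario: a ~200k-token model, targeting ~45% of the window
const contextWindow = 200_000;
const budget = Math.floor(contextWindow * 0.45);

// In practice, these would be the documents/files you plan to paste into the conversation
const filesToLoad = ['...overview.md contents...', '...spec.md contents...', '...module source...'];
const totalTokens = filesToLoad.reduce((sum, text) => sum + estimateTokens(text), 0);

if (totalTokens > budget) {
  console.warn(`~${totalTokens} tokens exceeds the ~${budget}-token budget; trim or summarize the context.`);
} else {
  console.log(`~${totalTokens} tokens, within the ~${budget}-token budget.`);
}
```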
2. Context Rot
Context rot is the phenomenon where an LLM's performance and accuracy decline as context size increases. The model becomes overloaded by data volume, leading to forgetting, hallucinations, repetitive responses, and inability to focus on relevant information. Even in simple tasks, the model may ignore important details, fixate on irrelevant information, or produce off-topic responses.
Common symptoms:
- AI suggests code that contradicts recent decisions
- References to files or functions that were already removed
- Repetition of solutions that already failed previously
- Confusion about the project's current state
Solution: When you notice context rot, start a new conversation. Before starting, load an updated summary of the project's current state (this is where the .context/ folder really helps - you maintain updated documents that you can load in new conversations). Always control context size to not hit the limit and try to stay within the 40-50% range of the model's token limit.
3. Hallucination
Hallucination occurs when an LLM generates confident and plausible information, but incorrect or invented. This happens because the model predicts the next word in a sequence to create coherent text, but doesn't possess real understanding or fact-checking ability. The result is invented details, false sources, or misleading statements - not from intent to deceive, but from limitations in training and lack of factual knowledge.
As we saw in Pillar 3, LLMs can invent code, functions, or APIs that don't exist. This gets worse when context is overloaded or confused, increasing the probability of hallucinations.
Mitigation Strategies:
- RAG (Retrieval-Augmented Generation): Connect the LLM to external databases to retrieve factual context before generating responses. In development, this means providing official documentation, existing code examples, and technical specifications.
- Constant verification: Religiously apply the "Don't Trust, Verify" principle. Check each suggestion, validate APIs and functions, run tests, and request existing code examples before accepting.
- Quality context: Provide real examples from your project, official framework/library documentation, and links to reliable sources. The better the context, the lower the chance of hallucination.
- Explicit feedback: When the AI hallucinates, correct it immediately and provide the right information. This doesn't retrain the model, but it does update the current conversation's context.
Another strategy: .context/ Folder
All these problems led to the creation of the .context/ folder - a centralized location to keep documentation, decisions, specs, and project context in an external and controlled way.
Instead of letting the tool automatically load everything (consuming tokens and creating rot), you maintain organized documents that you load manually when needed. This gives total control over what enters the context, when it enters, and keeps everything updated and versioned. Additionally, you do more efficient Context Window management, as you can control exactly what enters the context.
The .context/ folder contains all of your project's context files: everything you use to explain to the model what's happening in your project. It doesn't depend on the AI tool you're using, and you can keep it versioned in your repository or not. Ideally, you document your work and keep the context files updated.
I usually keep this folder at the monorepo root that I create to manage all projects I'll work on.
Here's an example of how I organize the .context/ folder:
<monorepo-root>/
├── .context/
│   ├── .config/
│   │   ├── rules/
│   │   │   ├── rule-js-best-practices.md
│   │   │   └── rule-swe-best-practices.md
│   │   └── templates/
│   │       ├── template-plan.md
│   │       └── template-spec.md
│   ├── data/
│   │   ├── data-anonymized-conversions.csv
│   │   ├── data-anonymized-customers.json
│   │   └── data-anonymized-sellers.json
│   ├── decisions/
│   │   └── decision-x.md
│   ├── projects/
│   │   └── project-x/
│   │       ├── overview.md
│   │       ├── bugs/
│   │       │   └── bug-x.md
│   │       ├── plans/
│   │       │   └── plan-x.md
│   │       ├── specs/
│   │       │   └── spec-feature-xyz.md
│   │       └── tasks/
│   │           └── task-x.md
│   └── temp/
│       └── <random>
├── frontend-app/
│   ├── src/
│   ├── public/
│   └── package.json
└── backend-api/
    ├── src/
    ├── tests/
    └── package.json
.config/ stores templates that will be used by AI agents to perform tasks, plus rules that can be added as context to the agent or in IDE rule settings. This folder is maintained as a separate git repository so I can version and share between projects.
templates/: Contains output templates to standardize AI responses.
- template-plan.md: Used for higher-level project planning with time and complexity estimates.
- template-spec.md: Used for technical detailing of what will be implemented and for creating specific tasks so the agent can execute instructions and implement code changes.
rules/: Contains rules that can be added as context to the agent or in IDE rule settings. Generally these rules are related to your experience as a developer and best practices the team adopted.
data/ contains example and anonymized data for testing and analysis. Useful for providing real context to the model without exposing sensitive information.
decisions/ documents important technical and architectural decisions for future reference. Helps maintain a history of the "why" behind critical project choices.
projects/ contains one folder per monorepo project, facilitating organization of bugs, specs, plans, tasks, and the overview.md file that serves as the main context to understand each project.
temp/ is only for storing temporary files and experiments that don't need permanent organization.
Monorepo projects (like frontend-app/ and backend-api/) are the actual projects where you'll make code modifications, maintaining clear separation between context/documentation and source code.
- Choose a tool (Windsurf/Cursor/etc.)
- Create a monorepo
- Configure the .context/ folder
- Check out other projects within the monorepo
- Create your templates and rules within the .context/.config/ folder
Now that you know the 3 fundamental pillars, learned about Context Engineering, and understood how to organize your workspace, I'll share the method I developed that consistently accelerated my productivity with AI.
Most developers use AI in a reactive and random way: they ask a question, copy the answer, test the application, and when it doesn't work, try reformulating the prompt hoping for a different result. It's a frustrating cycle of trial and error without clear direction.
No matter how advanced a prompt is, on its own it can't capture your project's context or what you actually want to accomplish. So learning to manage context and having a structured process is fundamental to accelerating your productivity with AI.
The problem isn't the tool - it's the lack of process.
A structured method solves this because:
- Consistency: You reach quality solutions predictably, not by luck
- Efficiency: Eliminates time wasted on unnecessary iterations
- Control: You maintain command of the process, AI is just a tool
- Flexibility: Works for both fixing simple bugs and implementing complex features
Important: Due to the non-deterministic nature of LLMs, this method doesn't guarantee the same response for the same input. What it guarantees is a consistent process that guides you to quality solutions, regardless of variations in the model's responses.
Think of the method as a decision-making framework, not a magic recipe. It structures your thinking and interaction with AI to maximize chances of success.
Results you can expect:
- 3x more productivity in the first weeks of use
- 5x or more productivity as you master the process and adapt to your needs
- Less frustration with code that doesn't work or needs to be redone
- More confidence in AI-generated solutions
There are various methods on the internet, and you can (and should) develop yours based on your experience. This is what works for me. Use it as a starting point and adapt as needed.
Objective: Fully understand the project before any changes
- Start the IDE in Chat mode asking for project explanation
- Ask questions about architecture, patterns, limitations
- Request examples of existing code
- Ask to generate document summarizing everything
- Save it in the context folder as <project-name>-overview.md
You can use the template I created for this: .config/templates/template-project-overview.md. Just include the document in chat and ask to generate an overview based on the template.
Objective: Discuss EVERYTHING before implementing any changes.
- Present the problem with maximum information
- Discuss solutions as in pair programming
- Question everything, never trust 100%
- Validate assumptions with code examples
- Map risks and edge cases
Prompt Template For Brainstorm (a filled-in example follows below):
Problem: [detailed description]
Objective: [what to solve]
Context: [project information]
Constraints: [technical/business limitations]
Documentation: [relevant files]
Related code: [files that may be affected]
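For example, a filled-in version might look like this (all project details here are hypothetical):

```
Problem: Login requests intermittently return 500 when the session cache is cold.
Objective: Make login reliable without increasing average latency.
Context: Node.js backend (backend-api/), Redis for sessions, roughly 5k logins/hour.
Constraints: No new infrastructure; the change must ship behind a feature flag.
Documentation: .context/projects/backend-api/overview.md
Related code: backend-api/src/api/auth.js, backend-api/src/utils/cache.js
```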
After the model's response, you should initiate discussion with it so it can explain its solution. Ask questions about:
- Other alternatives it didn't consider
- How to solve without breaking existing functionalities
- How it will impact the rest of the system or a specific module that uses the code to be changed
- Request code examples that led to the solution it proposed
- What risks to consider
- How to test adequately
- What can be simplified and/or improved
Use your knowledge and mastery of the project to mature the proposed solution. The first proposed solution will rarely be the best, but with discussion and feedback you can reach a more robust solution.
In phases 1 and 2 you'll spend more time, but it's essential to ensure the proposed solution is viable and robust. A bad solution can lead to serious problems in the future. If your understanding of the project is limited, AI can suggest solutions that don't meet the project's needs. So do your best to understand the project's business rules and technical limitations; that way you can guide the AI toward a viable solution.
If you already have a solution in mind, don't hesitate to share. But remember AI can suggest better alternatives, so don't rush to implement. The most important thing in these first two phases is to reach correct understanding about the problem and proposed solution. Because prompts, templates, rules or any other technique won't help when the initial context was built incorrectly.
In case of new discoveries about fundamentals, business rules or technical limitations, discuss with the Agent and update the project overview. Proceed to phase 3 when you're 100% satisfied with the proposed solution.
Objective: Transform discussion into professional document
Now that you understand the problem and proposed solution, you can create a planning document and/or specification describing what will be implemented. I usually use planning documents for larger projects involving multiple modules or functionalities, and specifications for smaller projects involving just one module or functionality. I also create specifications based on planning documents for larger projects. This is up to you, what's important is having a document detailing everything discussed in previous phases, and preparing tasks granularly so they can be executed by the Agent.
Planning documents and specifications should be saved in .context/projects/<project>/specs folder.
Instructions for the Agent
- Request structured specification based on previous discussions
- Iterate until it reads as professional: no emojis, clear, objective, with explicit decisions and no redundancy
- Always include: objective, requirements, assumptions, limitations, constraints, granular tasks, test cases, and obvious caveats and trade-offs
- Save it in the .context/projects/<project>/specs/ folder
Read and fully understand the specification. It needs to make sense to you; the goal is for it to be clear enough that a junior programmer new to the project could implement the tasks from it. If you don't understand something, ask questions and request examples or help. Often you'll need to validate business rules with your team, then continue iterating with the model to refine the specification.
If you made new discoveries, discuss them with the Agent and update the plan or specification if necessary. Proceed to phase 4 when you're 100% satisfied with the proposal.
Specification Template
Here's a simplified version of the template I use to create specifications:
# [Feature/Bug Fix Name]
## Problem
[Clear description of problem or need]
## Context
[Relevant information about project, architecture, limitations]
## Proposed Solution
[Description of chosen approach and justification]
## Affected Files
- `path/file1.js` - [what will be changed]
- `path/file2.js` - [what will be changed]
## Implementation Tasks
- [ ] Task 1: [specific and granular description]
- [ ] Task 2: [specific and granular description]
- [ ] Task 3: [specific and granular description]
## Test Cases
- [ ] Test 1: [success scenario]
- [ ] Test 2: [error scenario]
- [ ] Test 3: [important edge case]
## Constraints and Trade-offs
- [Technical limitation or design decision]
- [Important trade-off to consider]

Tip: Keep tasks granular. Each task should be self-contained enough for the Agent to execute it without additional context.
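To illustrate the difference with hypothetical tasks: a granular task names the file, the change, and the acceptance criterion, while an overly broad task forces the Agent to guess.

```markdown
<!-- Too broad -->
- [ ] Task 1: Add validation to the API

<!-- Granular and self-contained -->
- [ ] Task 1: In backend-api/src/api/auth.js, validate that email is a non-empty string
      containing "@" before calling usersDb.findOne; return HTTP 400 otherwise
- [ ] Task 2: Add a unit test covering the invalid-email case (expects HTTP 400)
```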
Objective: Implement tasks detailed in specification with safety and quality
- Present the final specification to the Agent in Chat mode
- Ask it to map the change points before starting implementation
- Validate the approach through discussion - Chat Mode doesn't make changes, only analyzes and discusses
- When satisfied, switch to Code mode to start implementation
- Instruct the Agent to implement one task at a time and to minimize the scope of changes
- Review each change before continuing
- Validate functionality at each checkpoint; if the application is in a consistent state, commit the change to record the successfully completed modification
Why Chat Mode first?
- Chat Mode: Discussion, analysis and approach validation without modifying code
- Code Mode: Implementation with direct access to project files
- Benefit: Avoids hasty changes and ensures approach is correct before modifying code
Implementation rules:
- Be specific: "Change only function X in file Y"
- One thing at a time: Don't mix functionalities
- Always validate: Test each change immediately
- Document decisions: Update context with important changes
Proceed to phase 5 to validate implementation. You'll probably do several iterations between phases 4 and 5 to ensure implementation is working correctly.
Objective: Ensure implementation is working correctly and application is in consistent state
- Create tests based on existing ones (patterns)
- Test basic cases for success and error
- Include edge cases and corner cases
- Validate integration with rest of system
- Document coverage in specification
- Validate functionality at each checkpoint; if the application is in a consistent state, ask the Agent to run the tests and proceed only when they pass
Whenever a test fails, understand why, discuss it with the Agent, find and validate a solution, and then update the specification if necessary.
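As a sketch of what these tests can look like, here's a minimal example using Node's built-in test runner; the module under test and its behavior are hypothetical, based on the user-lookup example from Pillar 3:

```js
// tests/find-user.test.js - minimal sketch using Node's built-in test runner
import { test } from 'node:test';
import assert from 'node:assert/strict';
import { findUserByEmail } from '../src/users.js'; // hypothetical module under test

test('returns the user when the email exists (success case)', async () => {
  const user = await findUserByEmail('alice@example.com');
  assert.equal(user.email, 'alice@example.com');
});

test('throws when the user does not exist (error case)', async () => {
  await assert.rejects(() => findUserByEmail('nobody@example.com'), /User not found/);
});

test('rejects malformed input (edge case)', async () => {
  await assert.rejects(() => findUserByEmail(''), /Invalid email/);
});
```

Asking the Agent to generate tests like these, and running them at every checkpoint, is how the "Don't Trust, Verify" pillar shows up in practice.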
- Never trust 100% - LLMs can and will hallucinate
- Always validate - Test everything before continuing (Agent itself can run tests for you at each checkpoint)
- Context is king - a balanced context window gives better results (keep the context token count below 50% of the model limit)
- Small granularity - Small changes = fewer errors
- Document everything - Context folder is your external memory
- Seniors benefit most: Experience + AI = exponential multiplication
- Juniors can also benefit: they still need to develop repertoire, but learning is accelerated
- Positive cost-benefit: Investment in tools vs time gained. You spend money to pay for tokens, but you're actually buying time
- Superior quality: Constant review + rigorous validation
Context Management
- Context Engineering >> Prompt Engineering: keep tokens below 50% of model limit
- Start new conversations after planning phase to keep context under control
- Excellent prompts are nothing without excellent context
- The higher the level of detail requested, the greater the chance of partial or truncated responses
Quality and Validation
- Use tests to validate each change
- Make commits in stable states
- Only allow the Agent to progress after you approve each change
- Monitor infinite loops: models can do/undo the same change repeatedly
Specifications and Documentation
- Write specs for junior developers: granular tasks with checkboxes
- Specifying the output format helps the model better understand what you expect
- Provide visual context and examples for inspiration (especially for UIs)
Optimization and Costs
- Use open source models for simpler tasks (cost savings)
- Use MCPs to give more power to models when appropriate
- Open questions with little context are good for exploring model "creativity"
Now that you know the method, here's a practical plan to start today:
Initial Setup
- Choose your tool (Windsurf, Cursor or Continue.dev)
- Create the basic .context/ folder structure in your project
- Configure basic rules (your team's code patterns)
First Simple Task
- Choose a small bug or simple feature
- Apply Phases 1-2 (Research + Brainstorm)
- Create your first specification using the simplified template
- Implement following Phases 4-5
Reflection and Adjustments
- Document what worked and what didn't
- Adjust your templates and rules based on experience
- Identify patterns you can reuse
Objectives:
- Apply the method to 2-3 tasks per week
- Refine your templates based on real use
- Start documenting important decisions in the .context/ folder
- Measure your productivity (time before vs. after)
- Start with simple tasks to understand how the process works and to refine the method and templates as needed
- Avoid complex tasks at the beginning
Progress Signs:
- ✅ You're spending 60-70% of time on planning (Research + Brainstorm + Plan)
- ✅ Implementation is getting faster with fewer bugs
- ✅ You're reusing specifications and patterns
- ✅ The .context/ folder is growing with useful documentation
- ✅ You feel you recover flow state faster after switching between tasks
- ✅ You feel you're evolving and adapting better to using AI
- ✅ You feel your productivity is increasing
- ✅ You notice your main focus is thinking about the task rather than typing
Start small, but start. Don't wait to have perfect setup. Choose a simple task tomorrow and apply only Phases 1-2 (Research + Brainstorm). You'll learn more by doing than by reading.
Remember: the goal isn't perfection, it's consistent progress. Each iteration will teach you something new about how to work better with AI.
# Project Overview Template
## 1. Project Summary
**Name:**
**Location in Monorepo:** `packages/<folder>` or `services/<folder>`
**Owner / Maintainer:**
**Status:** (active / deprecated / experimental / internal / public)
**Purpose:**
Briefly describe what this project does and why it exists.
Explain the problem it solves and how it fits into the broader system.
**Key Features:**
-
-
-
## 2. Context and Relationships
**Upstream Dependencies:**
List internal or external systems this project depends on.
Example:
- `@core/auth` (for token validation)
- `@infra/redis-client` (for caching)
- External: `Stripe API`, `AWS S3`
**Downstream Dependents:**
List projects or modules that depend on this one.
Example:
- `@app/backend-api`
- `@web/console`
**Integration Points:**
Describe how this project connects with others (shared schemas, APIs, queues, events, etc.).
**Bounded Context (if applicable):**
Summarize its domain role (e.g., "billing", "storage", "auth", "notifications").
## 3. Architecture Overview
**Type:** (library / service / worker / CLI / frontend app / shared module)
**Language / Stack:**
**Key Technologies Used:**
-
-
**Architecture Summary:**
Describe the internal structure and flow. If possible, include a diagram or flowchart.
Example:
> The project exposes a REST API with three main routes.
> It consumes messages from the `egress-events` queue, processes billing data, and updates Stripe via the Billing Meter API.
**Key Components or Directories:**
| Path | Description |
|------|--------------|
| `/src/handlers/` | Event processing logic |
| `/src/api/` | HTTP endpoints |
| `/src/utils/` | Shared helpers |
**Data Flow (if applicable):**
Outline how data moves through this module (from input to output).
## 4. Configuration & Environment
**Environment Variables:**
| Variable | Description | Required | Default |
|-----------|--------------|-----------|----------|
| `DATABASE_URL` | Connection string for Postgres | Yes | (none) |
| `REDIS_URL` | Redis connection URL | No | localhost |
**Build / Run Commands:**
**Deployment Target:**
(e.g., Cloudflare Worker, AWS Lambda, Kubernetes, Vercel, etc.)
**CI/CD:**
Summarize pipelines or automation connected to this project.
## 5. API / Event Interfaces
**Public APIs (if any):**
Describe endpoints or exported functions.
Example:
| Endpoint | Method | Description |
|-----------|--------|-------------|
| `/api/usage/record` | POST | Receives egress usage events |
**Events Produced / Consumed:**
| Event | Direction | Description |
|--------|-----------|-------------|
| `egress.recorded` | produced | Sent to SQS for billing updates |
| `user.revoked` | consumed | Used to invalidate delegations |
## 6. Data & Storage
**Database Tables or Collections:**
| Name | Purpose |
|------|----------|
| `usages` | Stores egress records |
| `revocations` | Tracks revoked delegations |
**External Resources:**
(e.g., KV namespaces, buckets, queues, etc.)
## 7. Security & Permissions
- List key permissions or auth requirements (e.g., UCAN capabilities, API keys, role checks).
- Identify potential sensitive operations or data handled by this project.
## 8. Observability
**Metrics / Dashboards:**
(e.g., Grafana dashboard URLs, key metrics like latency, error rate, queue depth)
**Logs & Alerts:**
Describe where logs are stored and how alerts are configured.
## 9. Development Notes
**Local Development:**
Describe how to run or test the project locally, including prerequisites.
Example:
> Requires Docker running Postgres and Redis.
> Use `pnpm dev` to start with hot reload.
**Testing Strategy:**
Mention test frameworks and coverage expectations if any.
## 10. Any Findings about Future Improvements
-
-
-
## 11. Cross-Project Links
| Related Project | Relationship | Notes |
|-----------------|---------------|-------|
| `@core/auth` | upstream | Provides user verification |
| `@infra/logger` | shared | Centralized logging module |
## 12. Revision History
| Date | Author | Summary |
|------|---------|----------|
| YYYY-MM-DD | name | Initial draft |
| YYYY-MM-DD | name | Updated dependencies section |
