@yifanzz
Created December 17, 2024 00:01
EP12 - The One File to Rule Them All

[Project Name]

Every time you choose to apply one or more rules, explicitly state them in the output. You can abbreviate the rule description to a single word or phrase.

Project Context

[Brief description]

  • [more description]
  • [more description]
  • [more description]

Code Style and Structure

  • Write concise, technical TypeScript code with accurate examples
  • Use functional and declarative programming patterns; avoid classes
  • Prefer iteration and modularization over code duplication
  • Use descriptive variable names with auxiliary verbs (e.g., isLoading, hasError)
  • Structure repository files as follows:
server/
└── src/
    ├── components/     # Shared React components
    ├── hooks/          # Custom React hooks
    ├── utils/          # Helper functions
    ├── types/          # TypeScript types
    └── lib/            # Shared libraries
extension/
└── src/
    ├── background/     # Service worker scripts
    ├── content/        # Content scripts
    ├── popup/          # Extension popup UI
    ├── options/        # Extension options page
    ├── components/     # Shared React components
    ├── hooks/          # Custom React hooks
    ├── utils/          # Helper functions
    ├── lib/            # Shared libraries
    ├── types/          # TypeScript types
    └── storage/        # Chrome storage utilities
shared/
└── src/
    ├── types/          # TypeScript types, only used for shared types between server and extension
    └── utils/          # Helper functions, only used for shared functions between server and extension

Tech Stack

  • React
  • TypeScript
  • Tailwind CSS
  • Shadcn UI
  • Chrome Extension
  • Express.js

Naming Conventions

  • Use lowercase with dashes for directories (e.g., components/form-wizard)
  • Favor named exports for components and utilities
  • Use PascalCase for component files (e.g., VisaForm.tsx)
  • Use camelCase for utility files (e.g., formValidator.ts)

TypeScript Usage

  • Use TypeScript for all code; prefer interfaces over types
  • Avoid enums; use const objects with an 'as const' assertion (see the sketch after this list)
  • Use functional components with TypeScript interfaces
  • Define strict types for message passing between different parts of the extension
  • Use absolute imports for all files @/...
  • Avoid try/catch blocks unless there's good reason to translate or handle error in that abstraction
  • Use explicit return types for all functions
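
A minimal sketch of the enum-free pattern above (the object and type names are illustrative, not from this project): a const object with an 'as const' assertion plus a union type derived from its values.

```typescript
// Hypothetical example: message types as a const object instead of an enum.
export const MessageType = {
  SyncState: 'SYNC_STATE',
  FetchData: 'FETCH_DATA',
} as const;

// Union of the object's values: 'SYNC_STATE' | 'FETCH_DATA'
export type MessageType = (typeof MessageType)[keyof typeof MessageType];
```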

Chrome Extension Specific

  • Use Manifest V3 standards
  • Implement proper message passing between components:
    interface MessagePayload {
      type: string;
      data: unknown;
    }
  • Handle permissions properly in manifest.json
  • Use chrome.storage.local for persistent data
  • Implement proper error boundaries and fallbacks
  • Use lib/storage for storage related logic
  • For the async injected scripts in content/,
    • they must not close over variables from the outer scope
    • they must not use imported functions from the outer scope
    • they must have wrapped error handling so the error message is returned to the caller
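
A minimal sketch of a self-contained injected function that satisfies the constraints above, using chrome.scripting.executeScript; the helper name and result shape are illustrative assumptions.

```typescript
// Hypothetical helper: runs a self-contained function in the target tab and
// returns a serializable result, so errors reach the caller instead of being lost.
interface InjectionOutcome {
  ok: boolean;
  value?: string;
  error?: string;
}

export async function readPageTitle(tabId: number): Promise<InjectionOutcome> {
  const [injection] = await chrome.scripting.executeScript({
    target: { tabId },
    // The injected function uses only its own scope and page globals:
    // no closures over outer variables, no imported helpers.
    func: (): InjectionOutcome => {
      try {
        return { ok: true, value: document.title };
      } catch (err) {
        return { ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    },
  });
  return injection.result as InjectionOutcome;
}
```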

State Management

  • Use React Context for global state when needed
  • Implement proper state persistence using chrome.storage (for extension)
  • Implement proper cleanup in useEffect hooks
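
A minimal sketch covering both points above: a hypothetical hook (useStoredValue) that persists state via chrome.storage.local and cleans up its listener in useEffect.

```typescript
import { useEffect, useState } from 'react';

// Hypothetical hook: loads a value from chrome.storage.local, stays in sync
// with storage changes, and removes its listener on unmount.
export function useStoredValue(key: string): string | undefined {
  const [value, setValue] = useState<string | undefined>(undefined);

  useEffect(() => {
    let cancelled = false;

    chrome.storage.local.get(key).then((items) => {
      if (!cancelled) setValue(items[key] as string | undefined);
    });

    const onChanged = (
      changes: { [k: string]: chrome.storage.StorageChange },
      areaName: string,
    ): void => {
      if (areaName === 'local' && key in changes) {
        setValue(changes[key].newValue as string | undefined);
      }
    };
    chrome.storage.onChanged.addListener(onChanged);

    // Cleanup: drop the listener and ignore any late async result.
    return () => {
      cancelled = true;
      chrome.storage.onChanged.removeListener(onChanged);
    };
  }, [key]);

  return value;
}
```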

Syntax and Formatting

  • Use "function" keyword for pure functions
  • Avoid unnecessary curly braces in conditionals
  • Use declarative JSX
  • Implement proper TypeScript discriminated unions for message types
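
A minimal sketch of a discriminated union for message types (the message names are illustrative, not from this project):

```typescript
// Hypothetical message union: the `type` field is the discriminant.
type ExtensionMessage =
  | { type: 'GET_SETTINGS' }
  | { type: 'SAVE_SETTINGS'; data: { theme: string } }
  | { type: 'REPORT_ERROR'; data: { message: string } };

function handleMessage(message: ExtensionMessage): void {
  switch (message.type) {
    case 'SAVE_SETTINGS':
      // `message.data.theme` is correctly narrowed here.
      console.log(message.data.theme);
      break;
    case 'REPORT_ERROR':
      console.error(message.data.message);
      break;
    case 'GET_SETTINGS':
      break;
  }
}
```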

UI and Styling

  • Use Shadcn UI and Radix for components
  • Use npx shadcn@latest add <component-name> to add new shadcn components
  • Implement Tailwind CSS for styling
  • Consider extension-specific constraints (popup dimensions, permissions)
  • Follow Material Design guidelines for Chrome extensions
  • When adding a new shadcn component, document the installation command

Error Handling

  • Implement proper error boundaries
  • Log errors appropriately for debugging
  • Provide user-friendly error messages
  • Handle network failures gracefully
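
A minimal sketch of graceful network-failure handling (the helper name and messages are illustrative assumptions); the try/catch exists specifically to translate raw errors into user-friendly ones.

```typescript
// Hypothetical helper: logs the underlying error for debugging and returns
// a user-friendly message instead of throwing.
export async function fetchJson<T>(url: string): Promise<{ data?: T; error?: string }> {
  try {
    const response = await fetch(url);
    if (!response.ok) {
      console.error(`Request to ${url} failed with status ${response.status}`);
      return { error: 'Something went wrong. Please try again.' };
    }
    return { data: (await response.json()) as T };
  } catch (err) {
    console.error('Network request failed', err);
    return { error: 'Unable to reach the server. Check your connection.' };
  }
}
```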

Testing

  • Write unit tests for utilities and components
  • Implement E2E tests for critical flows
  • Test across different Chrome versions
  • Test memory usage and performance

Security

  • Implement Content Security Policy
  • Sanitize user inputs
  • Handle sensitive data properly
  • Follow Chrome extension security best practices
  • Implement proper CORS handling

Git Usage

Commit Message Prefixes:

  • "fix:" for bug fixes
  • "feat:" for new features
  • "perf:" for performance improvements
  • "docs:" for documentation changes
  • "style:" for formatting changes
  • "refactor:" for code refactoring
  • "test:" for adding missing tests
  • "chore:" for maintenance tasks

Rules:

  • Use lowercase for commit messages
  • Keep the summary line concise
  • Include description for non-obvious changes
  • Reference issue numbers when applicable

Documentation

  • Maintain clear README with setup instructions
  • Document API interactions and data flows
  • Keep manifest.json well-documented
  • Don't include comments unless it's for complex logic
  • Document permission requirements

Development Workflow

  • Use proper version control
  • Implement proper code review process
  • Test in multiple environments
  • Follow semantic versioning for releases
  • Maintain changelog
@Jamie-BitFlight

@yifanzz Hi, thank you for sharing your work.
I have created, collected, and generated many of these rules files across my projects.
(Happy to share as well!)

Apart from just paying attention to the quality of the response and output, how are you methodically measuring the improvement that your rules provide?
Are there tools or techniques?

Anyway here's the tax:

# Framework Response Structure

This file outlines the structured response framework that should be used when addressing requests.

## Core Framework

When responding to requests, use the following structured approach:

1. **UNDERSTAND**: Restate what you believe the core request is
   - Clarify the central ask
   - Extract key requirements and constraints
   - Identify implied needs

2. **PLAN**: Outline the specific steps to address the request
   - Break down complex tasks into sequential steps
   - Specify the approach for each component
   - Include decision points where alternatives might be considered

3. **ASSESS**: For each step, evaluate if this is the most efficient approach
   - Consider alternatives and trade-offs
   - Evaluate performance implications
   - Identify potential risks or limitations

4. **EXECUTE**: Complete the task based on your refined plan
   - Implement the solution
   - Provide clear and concise code when needed
   - Document key decisions inline

5. **REVIEW**:
   - Consider: "If a domain expert reviewed this work, what would they critique?"
   - Critically examine output for:
     - Relevance and adherence to the original request
     - Completeness of the solution
     - Any assumptions made
     - Any potential edge cases missed

## Implementation Guidelines

- Begin every significant response with this framework
- Adapt the depth of each section based on the complexity of the request
- For simple requests, some sections may be very brief
- For complex requests, expand each section with appropriate detail
- Always make the UNDERSTAND section explicit to confirm alignment


Jamie-BitFlight commented May 23, 2025

I asked Claude Opus 4 to research the question I asked above, and wanted to share the results. I have no opinion on whether these are good or bad, and would like to hear from you all whether they are!

--- Response Start ---

I'll research techniques for validating and tuning context rules in development environments, particularly focusing on methods that could apply to Cursor IDE's rule system.

Based on my research, I've identified several comprehensive techniques for validating and tuning context rules in development environments like Cursor IDE. Here's a structured approach to measure and improve how well your rules are being followed:

1. Automated Testing and Validation Framework

Statistical Analysis Methods

  • BLEU and ROUGE scores rely on exact word overlaps, while fuzzy-match techniques such as token-based similarity evaluate how closely the generated text resembles a reference
  • Track consistency metrics across multiple generations with the same rules
  • Use embeddings to compare semantic similarity between expected and actual outputs
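
A minimal sketch of the embedding-comparison idea; how the embedding vectors are produced (which model or API) is left open.

```typescript
// Cosine similarity between two embedding vectors, e.g. one for an expected
// output and one for a generated output. 1.0 means identical direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```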

Multi-Path Validation

Self-consistency is a prompt engineering technique that improves reasoning in step-by-step problem-solving. Instead of relying on a single reasoning path, this method explores multiple paths and selects the most consistent answer:

  • Generate multiple outputs with the same rules
  • Check for consistency across outputs
  • Flag areas where outputs diverge significantly

2. LLM-as-Judge Evaluation

This approach uses a secondary large language model (LLM) to assess the outputs of the primary LLM, capitalizing on the capability of LLMs to provide meta-evaluations: essentially, one LLM judging the performance of another.

Implementation Steps:

  1. Create evaluation prompts that check if generated code follows your specific rules
  2. Use a different LLM model to evaluate compliance
  3. Score outputs on various dimensions (style adherence, pattern matching, security compliance)
  4. Track scores over time to identify rule effectiveness
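
A minimal, model-agnostic sketch of steps 1-3; the `judge` callback stands in for whatever secondary model you call, and the rubric wording and 1-5 scale are illustrative assumptions.

```typescript
interface JudgeResult {
  rule: string;
  score: number; // 1 = violates the rule, 5 = fully complies
}

// Hypothetical evaluator: builds one compliance prompt per rule and parses
// the judge model's numeric reply.
export async function scoreRuleCompliance(
  code: string,
  rules: string[],
  judge: (prompt: string) => Promise<string>,
): Promise<JudgeResult[]> {
  const results: JudgeResult[] = [];
  for (const rule of rules) {
    const prompt = [
      'You are reviewing generated code for compliance with a coding rule.',
      `Rule: ${rule}`,
      'Code:',
      code,
      'Reply with a single integer from 1 (violates) to 5 (fully complies).',
    ].join('\n');
    const reply = await judge(prompt);
    const score = Number.parseInt(reply.trim(), 10);
    results.push({ rule, score: Number.isNaN(score) ? 0 : score });
  }
  return results;
}
```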

3. Context-Specific Validation Components

In domain-driven design, using decoupled validator components makes your code much more reusable and lets the set of validation rules grow rapidly.

Key Practices:

  • Create modular validators for different rule categories
  • Implement context-aware validation that understands your specific domain
  • Use structured error reporting to identify which rules are being violated
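
A minimal sketch of decoupled validators with structured error reporting; the rule names and regex checks are illustrative assumptions, not a real linter.

```typescript
interface Violation {
  rule: string;
  message: string;
}

type Validator = (code: string) => Violation[];

// Hypothetical modular validators, one per rule category.
const noEnumValidator: Validator = (code) =>
  /\benum\s+\w+/.test(code)
    ? [{ rule: 'no-enums', message: 'Use const objects with `as const` instead of enums' }]
    : [];

const namedExportValidator: Validator = (code) =>
  /export\s+default/.test(code)
    ? [{ rule: 'named-exports', message: 'Favor named exports over default exports' }]
    : [];

// Structured report: every violation names the rule it breaks.
function validate(code: string, validators: Validator[]): Violation[] {
  return validators.flatMap((check) => check(code));
}
```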

4. Production Monitoring and Feedback Loops

Use an observability tool like Helicone to log your LLM requests and track key metrics such as usage, latency, cost, and time-to-first-token (TTFT). These tools provide dashboards that help you spot irregularities.

Monitoring Strategy:

  • Log all AI-generated code with metadata about which rules were active
  • Track rule compliance rates over time
  • Implement real-time validation checks
  • Collect developer feedback on rule effectiveness

5. A/B Testing Framework for Rules

Several platforms make it easy to A/B test prompts and models, edit and manage datasets in the cloud, provide LLM observability with online evals, produce publicly shareable testing reports, and automate human feedback collection.

Testing Approach:

  1. Create rule variations
  2. Test on identical code generation tasks
  3. Compare outputs for quality, compliance, and developer satisfaction
  4. Use statistical significance testing to validate improvements
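
A minimal sketch of step 4, comparing rule-compliance rates between two rule variants with a two-proportion z-test (the function name and threshold comment are illustrative):

```typescript
// Returns the z statistic for the difference between two compliance rates;
// |z| > 1.96 is roughly significant at the 5% level.
function twoProportionZTest(
  passesA: number,
  totalA: number,
  passesB: number,
  totalB: number,
): number {
  const rateA = passesA / totalA;
  const rateB = passesB / totalB;
  const pooled = (passesA + passesB) / (totalA + totalB);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  return (rateA - rateB) / standardError;
}
```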

6. Automated Compliance Checking

Security and Standards Validation

AI systems can analyze code for potential security risks such as SQL injection or buffer-overflow vulnerabilities, and they provide contextual feedback: unlike traditional tools, AI reviews offer suggestions based on the overall architecture and intent of the application.

Implementation:

  • Integrate static analysis tools to verify rule compliance
  • Use pattern matching to detect rule violations
  • Create automated test suites that verify generated code follows standards

7. Rule Effectiveness Metrics

Key Metrics to Track:

  1. Rule Hit Rate: How often each rule influences the output
  2. Compliance Score: Percentage of outputs that follow each rule
  3. Developer Override Rate: How often developers need to modify AI suggestions
  4. Performance Impact: Latency added by complex rule sets
  5. Error Reduction: Bugs prevented by following rules
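
A minimal sketch of computing two of these metrics from logged generation records; the record fields are illustrative assumptions about what your logging captures.

```typescript
interface GenerationRecord {
  rulesApplied: string[];   // rules active for this generation
  rulesViolated: string[];  // rules the output failed to follow
  developerModified: boolean;
}

// Compliance Score: share of outputs where the rule was active and followed.
function complianceScore(records: GenerationRecord[], rule: string): number {
  const relevant = records.filter((r) => r.rulesApplied.includes(rule));
  if (relevant.length === 0) return 0;
  return relevant.filter((r) => !r.rulesViolated.includes(rule)).length / relevant.length;
}

// Developer Override Rate: share of outputs developers had to modify.
function overrideRate(records: GenerationRecord[]): number {
  if (records.length === 0) return 0;
  return records.filter((r) => r.developerModified).length / records.length;
}
```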

8. Iterative Improvement Process

After getting results from each experiment, reason about them to determine the next action and refine the next iteration. Continue this loop until the rules perform as intended.

Continuous Improvement Steps:

  1. Baseline Measurement: Establish current rule compliance rates
  2. Hypothesis Formation: Identify underperforming rules
  3. Rule Refinement: Adjust rule phrasing and structure
  4. Testing: Validate improvements with controlled experiments
  5. Deployment: Roll out improved rules gradually
  6. Monitoring: Track impact in production

9. Human-in-the-Loop Validation

Human-in-the-loop (HITL) reviews add contextual judgment where automation is insufficient, particularly in high-stakes applications like healthcare or finance.

Review Process:

  • Sample generated code for manual review
  • Create rubrics for evaluating rule compliance
  • Track inter-rater reliability for consistency
  • Use findings to refine automated validators

10. Context Retrieval Validation

For Cursor specifically, ensure your rules are being properly retrieved and applied:

  • Verify rule loading through debug logs
  • Check that path-based rules match correctly
  • Monitor which rules are being included in the context
  • Test rule prioritization when multiple rules apply

Practical Implementation Tools

Based on the research, consider these tools and frameworks:

  • Logging/Monitoring: Helicone, custom observability solutions
  • Testing Frameworks: DeepEval, Patronus AI, custom evaluation harnesses
  • Version Control: Track rule changes alongside code changes
  • Analytics: Build dashboards to visualize rule effectiveness

Best Practices Summary

  1. Start with a small, well-defined set of rules and validate thoroughly
  2. Use multiple validation methods (automated + human review)
  3. Track metrics continuously, not just during development
  4. Create feedback loops from production usage
  5. Document which rules work well and which need improvement
  6. Consider the trade-off between rule complexity and performance
  7. Regularly review and update rules based on new patterns and requirements

By implementing these validation techniques, you can create a data-driven approach to improving your Cursor IDE context rules, ensuring they effectively guide AI code generation while maintaining high quality and compliance standards.
