Date: 2025-12-18
Model: Claude Opus 4.5 (both approaches)
Projects: 6 small-to-medium Go CLI tools (1-5k LOC each)
Full Report: spec-comparison-projects.md
Instead of going directly from prompt to code (prompt-driven), the spec-driven approach follows a structured workflow:
Prompt → Specification → Plan → Tasks → Implementation
The autospec CLI orchestrates this using Claude Code, generating YAML artifacts (spec.yaml, plan.yaml, tasks.yaml) before writing any code.
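To make the artifact step concrete, here is a purely illustrative sketch of what a generated spec.yaml might contain. The field names are hypothetical and are not autospec's actual schema; see the autospec docs for the real format.

```yaml
# Hypothetical example only — field names are illustrative,
# not autospec's real schema.
feature: url-shortener
summary: CLI that shortens URLs and stores mappings in a local JSON file
requirements:
  - id: R1
    description: "shorten <url> prints a short code and persists the mapping"
  - id: R2
    description: "expand <code> prints the original URL or exits non-zero"
acceptance:
  - Round-trip shorten/expand returns the original URL
  - Storage file survives process restarts
```

The plan.yaml and tasks.yaml artifacts then break such a spec into an implementation plan and discrete tasks before any code is written.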
| Metric | Spec-Driven | Prompt-Driven | Difference |
|---|---|---|---|
| Avg Time | 27.9 min | 8.4 min | 3.3x slower |
| Avg Quality | 87% | 71% | +16 pts better |
| Avg Go LOC | 4,078 | 1,932 | 2.1x more code |
| Avg Test LOC | 2,312 | 1,009 | 2.3x more tests |
| Build Success | 6/6 | 6/6 | Both work |
Quality breakdown by criterion (scores out of 10):

| Criterion | Spec | Prompt | Δ |
|---|---|---|---|
| Architecture | 9.5 | 6.3 | +3.2 |
| Documentation | 7.3 | 5.7 | +1.6 |
| Test Quality | 8.5 | 7.0 | +1.5 |
| Error Handling | 8.7 | 7.3 | +1.4 |
| CLI Experience | 8.6 | 7.2 | +1.4 |
| Edge Cases | 9.0 | 8.0 | +1.0 |
| Feature Completeness | 9.3 | 8.3 | +1.0 |
Spec-driven: Better architecture, modularity, tests, and edge case handling—at 3.3x the time cost.
Prompt-driven: 2.7x more efficient in quality points per minute, but 16 points lower quality on average.
If you value one quality point at ~1.2 minutes of dev time, the two approaches break even.
- Production code where quality matters → Use spec-driven
- Prototypes/POCs where speed matters → Use prompt-driven
Use spec-driven when:
- Building production/enterprise code
- Complex features with many edge cases
- Team projects requiring consistent patterns
- Multiple output formats or integrations needed
- Quality > speed
Use prompt-driven when:
- Building prototypes or POCs
- Simple utilities with clear requirements
- Time-constrained situations
- Exploring feasibility before spec-driven commitment
Where spec-driven excelled:
- Architecture (+3.2 pts): Consistent package organization, separation of concerns
- Documentation (+1.6 pts): Detailed READMEs with examples
- Test Quality (+1.5 pts): More test files, benchmarks, integration tests
Where prompt-driven was competitive:
- All 6 projects build and pass tests
- 3.3x faster delivery
- Simpler, easier to understand initially
Biggest spec advantage: Linkcheck project (+26% quality) due to concurrent HTTP handling, multiple output formats, and complex edge cases.
All projects were greenfield implementations (starting from scratch). Spec-driven advantages likely compound further for:
- Enterprise/large codebases (50k+ LOC)
- Team development (specs as living docs)
- Incremental features (adding to existing systems)
- Regulatory/compliance requirements
- Long-term maintenance needs
- URL Shortener - CLI for URL shortening with local JSON storage
- Linkcheck - Markdown link validator with concurrent HTTP checking
- Git Hooks Manager - Config-based git hooks installer
- Env Validator - Environment variable schema validator
- API Mock Server - OpenAPI-based mock HTTP server
- Cron Parser - Cron expression parser library
See full report for detailed per-project breakdowns, code samples, and raw metrics.
- autospec GitHub: https://github.com/ariel-frischer/autospec
- autospec Docs: https://ariel-frischer.github.io/autospec/