| Category | Count | Notes |
|---|---|---|
| Bugs | 30 | Core correctness issues |
| Enhancements | 55 | Behavior improvements, not new API |
| API Suggestions | 76 | Need API review process |
| API Approved | 6 | Ready to implement |
| Other / unlabeled | ~95 | No label beyond the area tag, or other label combinations |
Age: ~70% of bugs and enhancements are 2+ years old. These are the "long tail" -- not urgent enough to prioritize, but real enough not to close.
Special labels: 8 "help wanted", 5 "in-pr" (already have a PR), 2 regressions, 5 partner-impact, 18 wishlist.
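As a quick sanity check, the table's buckets should reconcile with the ~262-issue total used later in this plan. A trivial sketch (the ~95 remainder is approximate):

```python
# Bucket counts from the table above; "other" is the approximate remainder.
counts = {
    "bugs": 30,
    "enhancements": 55,
    "api_suggestions": 76,
    "api_approved": 6,
    "other": 95,  # unlabeled beyond the area tag, or other label combos
}

total = sum(counts.values())
print(total)  # 262 -- matches the backlog size referenced below
```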
The key insight: three bottleneck roles are currently human-bound:
1. Triage -- "Is this worth fixing?" / "Should we close this as won't-fix?"
2. Design -- "We want behavior A, not B" / "What's the right API shape?"
3. Implementation + Review -- write the code, write the tests, review the PR
AI can already do (3) for many issues. The bottleneck is (1) and (2), which are policy/judgment calls. But not all issues need all three steps: some have already been triaged, and the desired behavior is unambiguous.
These issues share four characteristics:
- The desired behavior is unambiguous (bug has clear repro, expected vs actual is obvious)
- The fix is localized (one file, one code path)
- Tests can be derived from the issue (repro steps = test case)
- No design decisions needed
Examples from the current bugs:
- #125237 Struct properties deserialized as null from stream (regression, has repro)
- #123372 Breaking change in JsonNodeConverter (regression)
- #113268 Deserialization failing for types that worked in net8 (regression)
- #110450 JsonException deserializing async to nullable types
- #75269 Polymorphic deserialization ignores PropertyNameCaseInsensitive
- #92780 Virtual property with JsonPropertyName serialized twice (already in-pr)
- #104700 Collection property without getter gives misleading error (help wanted)
- #71784 Duplicate object keys causing ArgumentException (help wanted)
Also: the 6 API-approved issues are fully designed and just need implementation.
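These selection criteria can be expressed as a first-pass filter over issue metadata. A minimal sketch, assuming a plain dict per issue with hypothetical `labels` and `has_repro` fields (the label names and rules here are illustrative, not dotnet/runtime's actual triage scheme):

```python
def is_tier1_candidate(issue: dict) -> bool:
    """First-pass heuristic: does this issue likely need only implementation?

    `issue` carries 'labels' (a list of label strings) and 'has_repro'
    (bool); both field names are assumptions for this sketch.
    """
    labels = set(issue.get("labels", []))
    # Regressions and API-approved issues are pre-triaged by definition:
    # the expected behavior is the old behavior or the approved spec.
    if "regression" in labels or "api-approved" in labels:
        return True
    # "help wanted" bugs with a clear repro are implementation-only work.
    if "help wanted" in labels and issue.get("has_repro", False):
        return True
    return False
```

A regression with a repro (like #125237 above) passes; an API suggestion does not -- those still need the review process.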
Why near-trivial to review: The reviewer's job is essentially "does the test match the issue, and does the fix make the test pass without breaking others." CI does most of that.
These need a single "which way do we go" judgment call, after which the implementation is mechanical.
Examples:
- #60560 Custom converter for Dict<string,object> collides with JsonExtensionData -- what should win?
- #50078 JsonIgnore not inherited on override -- should it be? (13 reactions, clearly wanted)
- #51165 JsonPropertyName inheritance preserves parent member -- intended?
- #96996 Properties overriding setter only are ignored -- should they be supported?
- #56212 JsonConverter inconsistency without public getter
Workflow: AI reads the issue + comments, proposes behavior A vs B to a human, human picks one in a comment, AI implements.
The 76 API suggestions + some enhancements. These need the API review process. AI cannot shortcut this -- it's a governance question, not a technical one.
However, AI can help by:
- Drafting API proposals in the correct format
- Doing feasibility analysis ("here's what the implementation would look like")
- Identifying which suggestions are duplicates or already addressed
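The duplicate scan in particular is automatable before any human look. A rough sketch using word-overlap (Jaccard) similarity on issue titles -- the 0.6 threshold is a guess, and a real pass would also compare issue bodies:

```python
def title_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two issue titles."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def likely_duplicates(titles: list[str], threshold: float = 0.6) -> list[tuple[int, int]]:
    """Return index pairs of titles whose word overlap meets the threshold."""
    return [
        (i, j)
        for i in range(len(titles))
        for j in range(i + 1, len(titles))
        if title_similarity(titles[i], titles[j]) >= threshold
    ]

titles = [
    "JsonIgnore not inherited on override",
    "JsonIgnore attribute not inherited on override",
    "Polymorphic deserialization ignores PropertyNameCaseInsensitive",
]
print(likely_duplicates(titles))  # [(0, 1)]
```

Flagged pairs go to a human for the actual "close as duplicate" call.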
AI could scan for:
- Issues from .NET 5/6 era that describe behaviors now changed
- Issues with no activity for 3+ years and no reactions
- Issues marked "wishlist" with very niche use cases
- Issues that are duplicates of other issues
- Issues where the workaround is trivial and the fix would be disproportionately complex
The 18 "wishlist" items are prime candidates for review.
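As a sketch, the staleness part of that scan might look like the following (the field names, the 3-year cutoff, and the wishlist rule are all assumptions; reaction counts and duplicate links would refine it):

```python
from datetime import date

def is_close_candidate(issue: dict, today: date) -> bool:
    """Flag issues that are stale and show no community interest.

    'last_activity' (ISO date string), 'reactions', and 'labels' are
    hypothetical field names for this sketch.
    """
    last_activity = date.fromisoformat(issue["last_activity"])
    years_stale = (today - last_activity).days / 365.25
    no_signal = issue.get("reactions", 0) == 0
    is_wishlist = "wishlist" in issue.get("labels", [])
    # Stale-and-silent, or wishlist with no support: nominate for review.
    return (years_stale >= 3 and no_signal) or (is_wishlist and no_signal)
```

This only nominates candidates; the close decision stays with a human.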
Opening 200 PRs at once is a non-starter. But this can be solved:
- AI analyzes all 262 issues and produces a spreadsheet/report categorizing each into the tiers above
- AI recommends ~20-40 for closure with justification
- Human reviews the triage recommendations in bulk (much faster than individual issue analysis)
- Result: Issue count drops to ~220-240 (262 minus the closures); remaining issues are categorized
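The spreadsheet/report itself can be a plain CSV, one row per issue. A minimal sketch (the tier numbers and columns are this plan's; the rationale strings are illustrative):

```python
import csv
import io

def triage_report(issues: list[dict]) -> str:
    """Render triage results as CSV: issue number, tier, one-line rationale."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["issue", "tier", "rationale"])
    for issue in issues:
        writer.writerow([issue["number"], issue["tier"], issue["rationale"]])
    return buf.getvalue()

report = triage_report([
    {"number": 125237, "tier": 1, "rationale": "regression with repro"},
    {"number": 60560, "tier": 2, "rationale": "needs an A-vs-B design call"},
])
print(report)
```

A human can scan a few hundred such rows in one sitting, which is the whole point of the bulk review step.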
- AI fixes Tier 1 issues in batches of 5-8 PRs at a time
- Each PR: fix + tests + links to issue
- Reviewer can review these quickly because the scope is small and behavior is unambiguous
- Pace: ~5 PRs/week, clears Tier 1 in 3-5 weeks
- API-approved items (6) get done first since they're fully spec'd
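The batching itself is mechanical: chunk the Tier 1 list and the pace math falls out. A sketch (issue numbers are placeholders):

```python
def batch(issue_numbers: list[int], size: int = 6) -> list[list[int]]:
    """Split issue numbers into consecutive batches within the 5-8 range."""
    return [issue_numbers[i:i + size] for i in range(0, len(issue_numbers), size)]

# ~20 Tier 1 issues at ~5 merged PRs/week clears in ~4 weeks,
# consistent with the 3-5 week estimate above.
tier1 = list(range(1, 21))  # placeholder issue numbers
print([len(b) for b in batch(tier1)])  # [6, 6, 6, 2]
```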
- AI posts a structured comment on each Tier 2 issue: "I can fix this. The design question is: A or B. Here's what each looks like."
- Human responds with a one-line decision
- AI implements based on the decision
- Pace: depends on human response time for decisions
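The structured comment can be templated so every Tier 2 issue gets the same shape. A sketch (the wording and option texts are illustrative; the #50078 example reuses the design question listed above):

```python
def decision_comment(issue: int, question: str, option_a: str, option_b: str) -> str:
    """Format the A-vs-B decision request to post on a Tier 2 issue."""
    return (
        f"I can fix #{issue}. The design question is: {question}\n"
        f"- Option A: {option_a}\n"
        f"- Option B: {option_b}\n"
        f"Reply A or B and I'll implement that behavior with tests."
    )

print(decision_comment(
    50078,
    "should [JsonIgnore] be inherited when a property is overridden?",
    "inherit the attribute (matches the 13-reaction ask)",
    "keep current behavior and document it",
))
```

A fixed template keeps the human's cost at one line per decision, which is what makes this tier's throughput depend only on response time.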
- AI drafts formal API proposals for promising suggestions
- AI identifies duplicates and recommends consolidation
- This feeds the existing API review process, just faster
- Regression bugs are the sweet spot. Clear before/after, test = repro, fix = restore old behavior.
- "Help wanted" bugs are pre-triaged as "yes, fix this." They're literally asking for someone to do the work.
- API-approved issues are fully designed. Pure implementation.
- Behavioral ambiguity. Many STJ bugs are really "is this the intended design?" questions. Edge cases in serialization where reasonable people disagree. AI can propose, but someone with context on STJ's design philosophy needs to decide.
- Cross-cutting concerns. Some fixes touch the source generator, reflection path, AND the core reader/writer. These need architectural awareness.
- Performance implications. STJ is perf-critical. Some "obvious" fixes (e.g., adding a null check) can have measurable perf impact on hot paths. Human judgment needed.
The "200 PRs" problem is a throughput issue, not a fundamental barrier. Solutions:
- Batch by subsystem -- group related fixes so one reviewer context-loads once
- AI-generated review summaries -- "This PR fixes #X. The change is: [one-sentence diff summary]. Test coverage: [list]. Perf impact: none (cold path)."
- Graduated trust -- as AI PRs prove reliable, review can become lighter
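The middle item is the cheapest to automate: the summary is a fixed template over facts the CI run already knows. A sketch (parameter names and the example values are illustrative):

```python
def review_summary(issue: int, diff_summary: str, tests: list[str], perf: str) -> str:
    """One-paragraph PR description a reviewer can scan in seconds."""
    return (
        f"This PR fixes #{issue}. The change is: {diff_summary}. "
        f"Test coverage: {', '.join(tests)}. Perf impact: {perf}."
    )

print(review_summary(
    104700,
    "replace the misleading getter-less collection error with a targeted message",
    ["Collection_NoGetter_ThrowsDescriptiveError"],  # hypothetical test name
    "none (cold path)",
))
```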
The real unlock isn't "AI fixes all bugs." It's AI collapses the triage-to-fix pipeline. Today:
1. User files issue (minutes)
2. Human triages (days to weeks)
3. Human prioritizes (weeks to never)
4. Human implements (hours to days)
5. Human reviews (hours to days)
With AI, steps 2-5 can happen in hours, with humans needed only for the judgment calls in steps 2-3. The backlog stops accumulating because the fix arrives almost as fast as the report.
- triage-sweep: Build an AI triage report for all 262 issues -- categorize each into Tier 1/2/3/4 with justification
- close-candidates: From the triage sweep, extract the "recommend closing" list with per-issue rationale
- tier1-pilot: Pick 3-5 Tier 1 bugs and have AI create actual fix PRs as a proof of concept
- api-approved-impl: Implement the 6 API-approved issues (these are unambiguous and ready)
- tier2-decisions: For Tier 2 issues, draft the "A vs B" decision comments for human review
- measure-review-cost: After the pilot PRs, measure actual review time to validate the "near-trivial" claim