@grahama1970
Created September 28, 2025 16:35
LiteLLM Smokes Review Bundle Prompt

External Review Prompt (Canonical v3)

Goal: Deliver a blunt, evidence‑backed production‑readiness assessment and a minimal patch set (unified diffs) with tests and doc updates. No broad refactors. Ship safety first.

Reviewer persona & tone

  • Principal SRE/DevOps + AppSec mindset; fluent with Linux userland (systemd, Docker/OCI, rsync/tar), CI, and Python/Node toolchains.
  • Be terse, specific, and fail‑closed. Unverified claims must be called out and marked 🔴 or 🟡.

Project context (declare at top of your report)

  • Project: <DevOps Agent | Memory Agent (Graph Memory) | Both>
  • Repo root:
  • Date: <YYYY‑MM‑DD>
  • Doc anchors (this repo):
    • Happy Path: docs/devops/HAPPYPATH_GUIDE_DEVOPS.md
    • Smokes: docs/devops/SMOKES_GUIDE_DEVOPS.md
    • Readiness: docs/devops/STATE_OF_PROJECT_DEVOPS.md (readiness overview)

Inputs you have

  • This review bundle: code + docs + smokes + any provided “state of project” and code snapshot files.
  • Treat docs as claims requiring evidence in code and tests. Mark missing/out‑of‑date artifacts as P0 doc‑debt with an explicit fix.

Research requirements (competitive landscape)

  • Identify 5–10 comparable systems; time‑box your research. Provide dated citations for each row.
  • Lenses:
    • DevOps Agent: “DevOps automation / self‑healing / runbook execution / remediation agents.” Include OSS + commercial.
    • Memory Agent: “Agentic long‑term memory / graph RAG / lessons learned.” Include OSS + managed.
  • Research log (required): 5–10 bullet queries + links so we can reproduce.

Focus areas & acceptance criteria (explicit pass/fail)

  • Shared (apply to all)
    • Safety gates: All mutating ops require --execute; default is plan/preview; re‑runs are idempotent
    • Subprocess safety: Prefer argv (no shell=True) except a tiny, documented allow‑list
    • Artifact safety: Default bind 127.0.0.1; --public required for 0.0.0.0; prefer Tailscale Serve
    • Secrets: Never in URLs; header/bearer tokens only; secret‑scan hook in CI
    • Scheduler semantics: Timers unambiguous; config writes are atomic (temp+rename)
    • Observability: Structured logs to ~/.local/state/devops-agent/logs/ (or documented path); fail signals visible in CI
  • Memory Agent specifics (if in scope)
    • FAISS proposes → LLM approves; proposer never auto‑activates
    • Active edges carry LLM rationale; weights clamped; human gate for deletes
    • Temporal decay conservative; point‑in‑time queries supported; MCP parity with CLI
  • DevOps Agent specifics
    • Docker hygiene safe by default; aggressive prune explicit and documented
    • Storage/backup: Verified plan→execute; restore checks green (nightly 3‑file)
    • Media ops (Jellyfin): Tokens via headers; HWA checks don’t leak secrets; commands reproducible
    • Research loop: Nightly synthesis/PRs scoped, toggleable, and observable
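The safety-gate and scheduler criteria above (plan by default, mutate only with --execute, atomic temp+rename config writes) can be sketched together. This is a minimal illustration, not the project's implementation: `write_config_atomic` and its `execute` parameter are hypothetical names standing in for whatever the CLI wires to `--execute`.

```python
import os
import tempfile
from pathlib import Path


def write_config_atomic(path: Path, text: str, execute: bool = False) -> str:
    """Hypothetical helper: preview by default, write only when execute=True."""
    if not execute:
        # Plan/preview mode: describe the action and touch nothing on disk.
        return f"PLAN: would write {len(text)} characters to {path}"

    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push data to disk before the rename
        os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial
    except BaseException:
        # On any failure, remove the temp file so no debris is left behind.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    return f"DONE: wrote {len(text)} characters to {path}"
```

Re-running with the same `text` leaves the same file contents, which is the idempotence property a reviewer would look for in a smoke test.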

Required commands to tie to readiness

  • Live readiness orchestrator (config‑driven):
    python3 -m devops_agent.cli project-ready-live
    
    Writes the report to ~/.local/state/devops-agent/artifacts/readiness/PROJECT_READY.md. Reads the [readiness] section (required/optional/strict/timeouts) from devops-agent.toml. In strict mode, any required check that FAILs (timeouts count as failures) fails the overall run.
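The strict-mode rule can be sketched as a small aggregation function. This is an illustration under assumptions, not the orchestrator's code: the function name, the status strings, and the check names in the usage note are hypothetical. The prompt does not say what non-strict mode returns, so the "WARN" result is an assumption, as is treating a missing required result as a failure (chosen to stay fail-closed).

```python
FAILING_STATUSES = {"FAIL", "TIMEOUT"}


def overall_verdict(results: dict[str, str], required: set[str], strict: bool) -> str:
    """Hypothetical aggregation of per-check statuses into one verdict."""
    # A required check with no recorded result is treated as FAIL (assumption: fail-closed).
    bad_required = sorted(
        name for name in required if results.get(name, "FAIL") in FAILING_STATUSES
    )
    if bad_required and strict:
        return "FAIL"  # strict mode: any required FAIL or timeout fails overall
    if bad_required:
        return "WARN"  # assumption: non-strict behaviour is not specified in the prompt
    return "PASS"  # optional checks never affect the verdict in this sketch
```

For example, `overall_verdict({"smokes": "PASS", "backup": "TIMEOUT"}, {"smokes", "backup"}, strict=True)` yields `"FAIL"`, while a failing check that is only optional leaves the verdict at `"PASS"`.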

Output format (strict)

  1. Executive verdict (1–2 paragraphs)
    • Readiness: 🔴/🟡/✅
    • Top 5 “will break in prod” items with file/line or CLI anchors
  2. Competitive landscape matrix (≥5 rows, dated citations)
  3. Project assessment by focus area (findings with evidence, impact, minimal patch as unified diff, tests/smokes list, doc diffs)
  4. Per‑file code review frame (Critical/Medium/Hygiene/Strengths)
  5. Patches grouped into ≤3 minimal PRs
  6. Test plan (deterministic smokes by default; non‑deterministic (ND) tests opt‑in)
  7. Research log & citations
  8. Readiness gate & score (0–100; Safety 35, Subprocess 15, Secrets 15, Artifact 10, Scheduler 10, Observability 10, Docs/DX 5)
  9. Submission checklist (all true):
    • All mutating paths enforce --execute; plan/preview deterministic
    • No shell=True in mutating paths (except documented allow‑list)
    • Default bind is 127.0.0.1; --public required for 0.0.0.0
    • Secrets never in URLs; scanner hook present and passing
    • Scheduler config writes are atomic; day‑of‑week (DOW) semantics documented
    • Logs land under documented path, namespaced per command/job
    • Patches include smokes and doc diffs; MCP/CLI parity preserved (if applicable)
    • Readiness targets generate artifacts and match docs
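The readiness score in item 8 is a weighted sum whose weights total 100. A minimal sketch of the arithmetic follows, assuming each area is graded as a fraction between 0 and 1 and scored linearly; the prompt gives only the weights, so the grading scheme and the `readiness_score` name are assumptions.

```python
# Weights copied from item 8 of the output format; they sum to 100.
WEIGHTS = {
    "Safety": 35,
    "Subprocess": 15,
    "Secrets": 15,
    "Artifact": 10,
    "Scheduler": 10,
    "Observability": 10,
    "Docs/DX": 5,
}


def readiness_score(fractions: dict[str, float]) -> float:
    """Hypothetical scorer: weight each area's 0..1 grade and sum to a 0..100 score."""
    if set(fractions) != set(WEIGHTS):
        raise ValueError("grade every focus area exactly once")
    total = 0.0
    for area, weight in WEIGHTS.items():
        grade = fractions[area]
        if not 0.0 <= grade <= 1.0:
            raise ValueError(f"{area}: grade must be between 0 and 1")
        total += weight * grade
    return total
```

Under this scheme a project that is perfect everywhere except Safety (graded 0) scores 65, which shows how heavily the weighting penalizes missing safety gates.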

Constraints & non‑negotiables

  • Keep changes minimal and surgical; do not add new runtime deps unless a P0 requires it
  • Preserve CLI/MCP surfaces unless a P0 forces change; document breaking changes with migrations
  • If evidence is missing, mark the area 🟡 and state what would prove it