@decagondev
Created June 1, 2026 11:16

Module 1 Challenger — Take-Home

Layered Architectures & the Division of Labor

In class we named the patterns. Here you apply them to a problem of your own. There's no single right answer — the grade is in the justification.

Time: ~45–60 min · Due: before the Module 2 session · Submit: post your write-up + diagram in the cohort channel.


The assignment

Part A — Design a layered pipeline (core)

Pick one problem:

  • Triaging incoming benefit-fraud tips (most are noise; a few are serious).
  • Routing citizen correspondence to the right team.
  • Bring your own — any high-volume problem where most cases are easy and a few are consequential.

⚠️ The support-inbox example worked below is off-limits as your submission — it's the demo. Pick something else.

Produce two things:

  1. A diagram of your pipeline — tiers top to bottom, the escalation spine, and the side-exits where confident cases resolve early. Any tool: a photo of a hand sketch, draw.io, a slide, or the Mermaid skeleton at the bottom of this doc.
  2. A write-up (~300–400 words) that:
    • names each layer (deterministic → probabilistic → AI/ML) and what it does,
    • justifies every layer's position by both cost and certainty,
    • marks the human boundary and ties it to consequence and reversibility,
    • flags one "resolve-too-early" risk — a cheap layer that might clear a case it can't be sure of — and a guard against it.

Part B — Three short reflections (a paragraph each)

  1. One decision you'd fully automate and one you wouldn't — what's the actual difference?
  2. Is escalation a sign the system failed? When is it the system working, and when is it a problem?
  3. Depth, breadth, or both for a detection problem you pick — and what about the target drives your answer?

Optional — Peer response (this is our discussion, async)

Reply to one classmate's Part A: find the single case their cheap layer would wrongly clear, and say how you'd close it.


Before you start: the four moves

Everything below is just these four ideas applied. Keep them in front of you:

  1. Order by cost and certainty. Cheapest, most certain methods first; most capable, least certain last. Not "AI is smartest, so it goes last" — say why cost and certainty put each layer where it is.
  2. Every layer resolves or escalates as early as possible, but no earlier. A layer resolves a case only when it's confident.
  3. The human is the decider, not a fallback. The boundary sits where consequence is high — not where the model happens to get weak.
  4. Escalation is the system working. A layer escalating means it correctly recognized its own limit. The real failure is resolving too early.
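
If it helps to see the four moves as control flow, here is a minimal Python sketch. It is one possible reading, not a reference design: the names (run_pipeline, Decision, LAYERS, has_legal_flag) and the toy layers are hypothetical. It also assumes two things the text above does not state: that a cheap consequence check can run up front, and that anything no layer settles goes to a person.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A layer returns an outcome when it is confident, or None to escalate.
Layer = Callable[[dict], Optional[str]]


@dataclass
class Decision:
    outcome: str  # what was decided
    layer: str    # who decided it


def run_pipeline(case: dict,
                 layers: List[Tuple[str, Layer]],
                 is_consequential: Callable[[dict], bool]) -> Decision:
    # Move 3: consequence is checked before any confidence score is consulted.
    if is_consequential(case):
        return Decision("human decides", "human")
    # Moves 1 and 2: layers arrive ordered cheapest and most certain first;
    # each one either resolves the case or passes it up.
    for name, layer in layers:
        outcome = layer(case)
        if outcome is not None:
            return Decision(outcome, name)
    # Move 4: no layer was confident, so the case escalates.
    # Assumption: unresolved residue also goes to a person.
    return Decision("human decides", "human")


# Hypothetical toy layers, for illustration only.
def rules(case: dict) -> Optional[str]:
    return "send tracking" if case.get("order_id") else None


def classifier(case: dict) -> Optional[str]:
    return case.get("label") if case.get("score", 0.0) >= 0.9 else None


def has_legal_flag(case: dict) -> bool:
    return bool(case.get("legal"))


LAYERS = [("deterministic", rules), ("probabilistic", classifier)]
```

Calling run_pipeline({"order_id": "A1"}, LAYERS, has_legal_flag) resolves at the deterministic layer. Adding "legal": True to the same case sends it to the human even though the cheap rule would have matched.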

A worked starter — on a different problem

Let's run the four moves on something that is not your assignment: triaging a customer-support inbox. Watch the moves, then make the same ones on your own problem.

Problem in one line: thousands of inbound emails a day; most are routine, a handful are high-stakes (legal threats, security reports, distressed customers). Get each to the right place cheaply, without mishandling the serious ones.

Sketch the tiers and reason out loud:

  • Tier 1 — Deterministic. Exact rules. "Where's my order?" + a valid order ID → send tracking. Password-reset → self-serve flow. Sender on a known-spam list → drop. Nearly free, and correct by construction when a rule fires. Clears the unambiguous majority.
  • Tier 2 — Probabilistic. A classifier scores each remaining message into a category (billing / technical / account / complaint) with a confidence. Route when confidence is high; pass the ambiguous ones up. Cheap, some uncertainty, still clears a large share.
  • Tier 3 — AI/ML. An LLM reads the full thread for genuinely ambiguous or mixed-intent messages and proposes a category and a draft reply — but never sends. Most capable, most expensive, least certain, so it only sees the hard residue.
  • Human boundary. Certain types go to a person regardless of confidence: a legal threat, a data-breach report, a mention of self-harm, a churn risk on a major account. Consequence overrides confidence.
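
To make "consequence overrides confidence" concrete, here is a small sketch of the Tier 2 routing decision. Everything named in it is an assumption for illustration: the 0.90 threshold, the flag names, and the idea that some upstream step has already attached consequence flags to the message.

```python
# Hypothetical flag names for the consequential types listed above.
CONSEQUENTIAL_TYPES = {"legal_threat", "data_breach", "self_harm", "major_account_churn"}

# Hypothetical cut-off; in practice it would be tuned on labelled mail.
ROUTE_THRESHOLD = 0.90


def route_scored(category: str, confidence: float, flags: set) -> str:
    """Decide where a classifier-scored message goes next."""
    # The consequence check comes first, so a high score cannot bypass it.
    if flags & CONSEQUENTIAL_TYPES:
        return "human"
    # Confident and low-stakes: resolve early via the side-exit.
    if confidence >= ROUTE_THRESHOLD:
        return "queue:" + category
    # Ambiguous: pass it up rather than guess.
    return "tier3"
```

A 0.97-confidence billing message with a legal_threat flag still returns "human". The order of the two checks is the whole point.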

The shape — the support-inbox pipeline:

flowchart TD
    IN([Incoming support email])
    T1["Tier 1 · Deterministic rules<br/>~free · certain"]
    T2["Tier 2 · Classifier, scored<br/>cheap · some uncertainty"]
    T3["Tier 3 · LLM reads full thread<br/>capable · costly · least certain"]
    H["Human · decides<br/>routed here regardless of confidence"]
    R1([Auto-resolved])
    R2([Routed to a queue])

    IN --> T1
    T1 -- confident match --> R1
    T1 -- no confident rule --> T2
    T2 -- high confidence --> R2
    T2 -- ambiguous or mixed --> T3
    T3 -. proposes a draft .-> R2
    T3 -- consequential type --> H

    style H stroke:#F5A623,stroke-width:2px
    style T3 stroke:#C97E12,stroke-width:1.5px

Justify the order by cost and certainty (one line each):

  • Deterministic first — it's ~free and right when it matches, so it should absorb everything it can.
  • Classifier next — cheap and certain enough to clear a big chunk, but not free and not always right.
  • LLM last — the costliest and least certain, so we spend it only on what the cheaper layers couldn't settle.
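
To put rough numbers on the cost half of that argument, here is a back-of-envelope sketch. Every figure in it is invented for illustration, not measured, and the function name expected_cost is hypothetical. It says nothing about certainty, which still needs its own justification.

```python
def expected_cost(tiers):
    """Average cost per case for tiers given in pipeline order.

    Each tier is (cost_per_case, share_resolved), where share_resolved is the
    fraction of the cases reaching that tier which it settles.
    """
    reach = 1.0   # fraction of all cases that get this far
    total = 0.0
    for cost, share_resolved in tiers:
        total += reach * cost          # only cases that reach a tier pay for it
        reach *= 1.0 - share_resolved  # the rest are passed up
    return total


# Hypothetical figures, invented for illustration only.
RULES = (0.0001, 0.60)      # nearly free, clears 60% of all cases
CLASSIFIER = (0.001, 0.75)  # cheap, clears 75% of what reaches it
LLM = (0.02, 1.00)          # costly; 1.00 only closes the sum here

layered = expected_cost([RULES, CLASSIFIER, LLM])
llm_only = expected_cost([LLM])
```

With these made-up figures the layered order works out to about 0.0025 per case, against 0.02 for sending everything straight to the LLM, because only a tenth of the cases ever reach the expensive tier.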

Spot one resolve-too-early risk + guard:

A Tier-1 rule auto-closes anything containing "refund" with a template. But "I want a refund and I've contacted my lawyer" is consequential — and it'd be wrongly auto-closed. Guard: the deterministic auto-close only fires when no high-consequence signals are present (legal / threat / large-account); otherwise the case escalates even though the cheap rule "matched."
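
A minimal sketch of that guard, assuming a keyword pattern stands in for the high-consequence signals. The pattern, the function name tier1_refund_rule, and the return labels are all hypothetical, and the keyword list is deliberately incomplete.

```python
import re

# Hypothetical, incomplete list of high-consequence signals.
HIGH_CONSEQUENCE = re.compile(
    r"\b(lawyer|attorney|legal action|sue|lawsuit|breach)\b",
    re.IGNORECASE,
)


def tier1_refund_rule(text: str, large_account: bool = False) -> str:
    """Return 'auto_close', 'escalate', or 'no_match' for one message."""
    if "refund" not in text.lower():
        return "no_match"    # the rule does not apply to this message
    if large_account or HIGH_CONSEQUENCE.search(text):
        return "escalate"    # the rule matched, but the stakes are too high
    return "auto_close"      # a plain refund request gets the template
```

Note that the guard is itself a cheap layer and can miss things: "I've spoken to my solicitor" slips past this pattern. That is exactly the kind of case to name in your own write-up.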

Notice what just happened: I named the tiers, justified the order by cost and certainty, drew the human boundary around consequence, and found a place the cheap layer fails. That's the whole of Part A. Your job is to make those same moves on your problem — your domain, your rules, your consequential cases.


Your turn — a template to fill in

Copy this into your doc and replace each blank. Keep it tight; the diagram carries the structure, the prose carries the why.

PROBLEM (one line): ______________________________________________

TIERS  (cheapest & most certain first)
  Tier 1 — Deterministic:  what it catches → __________________
     Why here (cost + certainty): ___________________________
  Tier 2 — Probabilistic:  what it catches → __________________
     Why here (cost + certainty): ___________________________
  Tier 3 — AI / ML:        what it catches → __________________
     Why here (cost + certainty): ___________________________

HUMAN BOUNDARY
  Always goes to a person: ________________________________
  Why (consequence / reversibility — NOT "model gets weak"): ____

RESOLVE-TOO-EARLY RISK
  Where a cheap layer might wrongly "clear" a case: ___________
  Guard against it: _______________________________________

Part B — starter prompts (answer in your own words; don't just restate these):

  1. Name your two decisions first. Then ask of each: if this call were wrong, who is harmed, and can it be undone? The difference you're looking for lives in that answer — not in how accurate the model is.
  2. Picture a layer passing a case up. When is that the layer being honest about its limit, and when is it the layer being lazy or mis-tuned? What distinguishes the two?
  3. Ask what your target looks like: is it fixed and structured, or messy and actively trying to evade you? Let that — not your tooling — pick depth, breadth, or both.

Make the diagram (a skeleton you can adapt)

Hand-drawn and photographed is completely fine. Or copy this Mermaid block and relabel it for your problem. Pasted into your Gist (or any Markdown file on GitHub) inside a fenced code block tagged mermaid, it renders as a real flowchart:

flowchart TD
    IN([Incoming case])
    T1["Tier 1 · Deterministic<br/>~free · certain"]
    T2["Tier 2 · Probabilistic<br/>cheap · some uncertainty"]
    T3["Tier 3 · AI / ML<br/>capable · costly · least certain"]
    H["Human · decides"]
    R1([Resolved])
    R2([Resolved])
    R3([Resolved])

    IN --> T1
    T1 -- confident --> R1
    T1 -- uncertain --> T2
    T2 -- confident --> R2
    T2 -- uncertain --> T3
    T3 -- confident --> R3
    T3 -- consequential --> H

    style H stroke:#F5A623,stroke-width:2px

Common traps (self-check before you submit)

  • Ordering by "smartness." If your only reason AI goes last is "it's the best," you've skipped certainty and cost. Redo the justification.
  • A cheap layer clearing what it can't be sure of. Resolving early feels efficient; it's where the quiet, dangerous errors come from.
  • Putting the human where the model is weak instead of where the stakes are high. The boundary is about consequence, not model accuracy.
  • Treating escalation as failure. If your design "minimizes" human involvement as a cost, you've built a fallback, not a decider.

What a strong submission shows

  • Cost and certainty both appear in the ordering: not just one of them, and not "smartness."
  • The human boundary is drawn around consequence and reversibility, with a clear reason.
  • A real resolve-too-early risk is named, with a concrete guard.

Good luck — bring the version of this you'd defend to a regulator, not the version that's easiest to build.
