andrew-templeton/context-packet.md

Created December 23, 2025 18:32


SLT VP Product & Technology — Take-Home Case Study (Candidate Materials)

context-packet.md

Sur La Table (SLT) — VP Product & Technology Take-Home: Working Context Brief

This document is provided to candidates alongside the case study. It defines the operating environment, constraints, and how to interpret the data.

1) What you should assume is true

You are joining as VP of Product Management & Tech at Sur La Table.
You own product + software platforms end-to-end: e-commerce (Next.js/headless), BFF/API gateway, services, OMS, POS software, data warehouse/ETL, loyalty, personalization, ERP integrations.
SLT-only operating model: there is no CSC matrixed engineering to rely on.
Store hardware/network operations are owned day-to-day by the IT Director, but you set standards/security policy and approve major infra changes.
This take-home case uses synthetic but internally consistent metrics. Your job is to make decisions and plans using what's here, while clearly stating bounded assumptions.

2) Org, team, and operating model

2.1 Reporting lines (your directs)

Sr. Director of Engineering (software engineering, SRE/platform, QA)
Head of Product (PMs, UX partnership)
IT Director (store infrastructure/devices, networking, endpoint management)

2.2 Team size (this is the reality you inherit)

Function	Headcount	Notes
Software Engineers	8	Full-stack and backend; no dedicated frontend
Product Managers	2	One Sr PM (web/checkout), one PM (store systems)
QA	4	Manual + automation; owns release confidence
SRE/Platform	2	SLOs, observability, incident response, infra
App Support	2	L1/L2 triage, escalation to engineering
Total	18

Implication: With 8 engineers, you cannot pursue every initiative. Prioritization under constraint is the test. The 2 allowed contractors (per budget rules) are a critical lever.

Contractor economics: $180/hr fully loaded; 2-week onboarding; ~3 weeks productive time before freeze

QA capacity: Can support 2 major initiatives in parallel; beyond that requires sequencing or external help

2.3 Team topology (current state)

No formal squads — engineers are assigned to work as needed
Typical splits: ~4 engineers on web/checkout, ~2 on OMS/services, ~2 on data/integrations
You may propose a different structure in your plan

2.4 Decision rights (for this case)

You are final decision maker on roadmap trade-offs, release gates, and investment prioritization.
IT Director must approve changes affecting store networks/devices; you co-sign major changes impacting store-cloud segmentation.
Architecture & Experiment Review Board (ARB) is your mechanism for enforcing standards/guardrails.
- ARB composition: VP (chair) + Sr Dir Eng + Head of Product
- Cadence: Meets weekly; fast-track approval available for critical items

2.5 Realistic initiative limits

8 engineers × 5 weeks (40 eng-weeks) + 2 contractors × ~3 productive weeks (~6 eng-weeks, conservative) ≈ 46 eng-weeks available.

Rough sizing:

Initiative	Typical Effort
Variant A ship (with guardrails)	4-6 eng-weeks (+2 if payment fallback)
OMS reliability (timeout fixes)	6-8 eng-weeks
CWV quick wins (ISR, caching)	4-6 eng-weeks
Peak capacity (scaling + load test)	6-8 eng-weeks
Legacy OrderBridge (partial)	8-12 eng-weeks

Implication: You can realistically complete 2-3 medium initiatives pre-freeze. Proposing 5+ parallel initiatives signals unrealistic planning.
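The capacity arithmetic above can be sketched as a small budget check. This is only an illustration: the effort ranges are copied from the rough-sizing table, while `focus_factor` (the share of raw eng-weeks left after on-call, support, and meetings) and the function names are hypothetical and not part of the case data.

```python
# Effort ranges in eng-weeks, copied from the rough-sizing table.
INITIATIVES = {
    "Variant A ship (with guardrails)": (4, 6),
    "OMS reliability (timeout fixes)": (6, 8),
    "CWV quick wins (ISR, caching)": (4, 6),
    "Peak capacity (scaling + load test)": (6, 8),
    "Legacy OrderBridge (partial)": (8, 12),
}

ENGINEERS = 8
WEEKS_TO_FREEZE = 5
CONTRACTORS = 2
CONTRACTOR_PRODUCTIVE_WEEKS = 3
QA_PARALLEL_LIMIT = 2  # from Section 2.2: QA supports 2 major initiatives in parallel


def available_eng_weeks() -> int:
    """Raw ceiling: engineers plus the productive part of contractor time."""
    return ENGINEERS * WEEKS_TO_FREEZE + CONTRACTORS * CONTRACTOR_PRODUCTIVE_WEEKS


def plan_effort(selected: list[str]) -> tuple[int, int]:
    """Sum the low and high effort estimates for the chosen initiatives."""
    low = sum(INITIATIVES[name][0] for name in selected)
    high = sum(INITIATIVES[name][1] for name in selected)
    return low, high


def check_plan(selected: list[str], focus_factor: float = 1.0) -> dict:
    """Compare a plan against the budget; focus_factor is a hypothetical haircut."""
    budget = available_eng_weeks() * focus_factor
    low, high = plan_effort(selected)
    return {
        "budget": budget,
        "low": low,
        "high": high,
        "fits_worst_case": high <= budget,
        "needs_qa_sequencing": len(selected) > QA_PARALLEL_LIMIT,
    }
```

The raw total is a ceiling rather than a forecast. It ignores the on-call rotation and QA's two-initiative limit, so a plan that fits the ceiling on paper may still need sequencing.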

3) Business context: Sur La Table

3.1 What SLT sells

Premium cookware, kitchenware, and gourmet foods — Williams-Sonoma competitor positioning
Cooking classes — in-store and online; meaningful revenue line (~10% of total)
Gift registry — weddings, housewarmings

3.2 Channel mix

E-commerce: ~55% of revenue (growing)
Retail stores: ~50 locations nationally; ~35% of revenue
Cooking classes: ~10% of revenue (high margin, drives store traffic)

3.3 Seasonality

Q4 is everything: Nov-Dec represents ~40% of annual revenue
Key dates: Black Friday, Cyber Monday, holiday gifting
Cookware as gifts: High AOV, high consideration purchase

3.4 Competitive pressure

Williams-Sonoma, Crate & Barrel, Amazon (commoditization)
Differentiation through expertise, classes, curated assortment

4) Systems landscape (high-level)

4.1 Customer-facing digital

Web storefront: Next.js headless application (current; needs performance work)
Key journeys: Home → PLP → PDP → Cart → Checkout
Performance instrumentation: Core Web Vitals from real-user monitoring (RUM)
Current capacity: ~400 RPS sustained at SLO latency (load tested in Q3)
Q4 pre-provisioned: Auto-scaling configured to 1,200 RPS burst; 800 RPS sustained (provisioned; load test validation pending)
Remaining gap to BFCM (2,625 RPS): CDN optimization, caching tuning, and load test validation required

4.2 API layer and services

BFF/API gateway: request aggregation/composition, auth/session handling, rate limiting/back-pressure, response caching headers
Core services: cart/pricing/promo, catalog/search integration, loyalty/personalization, payment orchestration

4.3 OMS and fulfillment

OMS: order create, inventory reservation/commit, fulfillment orchestration; integrates with ERP and store systems
Primary reliability KPI: OMS Success % (see definitions below)
Known issues: Timeouts (47% of failures), inventory mismatch (28%), payment gateway (15%)

4.4 Store systems (POS software)

POS software: store transactions (sales/returns), payment flows, device interactions
Primary reliability KPI: POS Success % during store hours
PCI scope includes POS
Current state: 99.86% success; minimal gap to 99.9% target — low priority unless other initiatives complete early

4.5 Data platform

Warehouse + ETL/CDC: supports BI, experiment analysis, operational dashboards
Primary latency KPI: ops feeds (orders/inventory) available in DWH within target window
Known issue: p95 lag is 48 min (target ≤ 15 min); contributes to 0.6% oversell rate

4.6 Legacy surface area (what "legacy" means here)

"OrderBridge" — legacy order/checkout integration layer being strangled
"Legacy endpoints" = APIs still required for order completion or downstream fulfillment
You will propose a strangler + parallel run + cutover plan with decommission milestones
Realistic expectation: Full cutover may extend past the 5-week pre-freeze window. Plan what's achievable pre-freeze vs. Q1.

5) Tooling you may assume is available

You do not need to pick vendors. Describe capabilities.

5.1 Experimentation and feature flags

Platform capable of: randomization, exposure logging, guardrails, sequential decisioning, phased rollouts, kill switch
Assume something like LaunchDarkly or Statsig

5.2 Observability

RUM: CWV p75 metrics (LCP/INP/CLS) and route-level breakdown
APM/tracing: service latency, dependency maps, error rates (Datadog-like)
Logs: structured logs with correlation IDs across web/BFF/services/OMS
Synthetics: checkout probes and key journey monitors

5.3 CI/CD and releases

Automated pipelines for web and services
Support for feature flags, canaries, and rapid rollback
Ability to enforce release checklists and block deploys when guardrails fail

5.4 Infrastructure

Cloud: AWS
Compute: ECS for services; considering Lambda@Edge for Next.js edge rendering
CDN: CloudFront (current); edge caching not fully optimized
CDN optimization note: Quick wins (cache rules, TTLs) achievable pre-freeze; advanced optimization (Lambda@Edge, custom origins) should be phased post-peak

5.5 Operational baselines (current state)

Metric	Current	Target
P1 incidents	1.2/week	≤1/week
MTTR	45 min	≤30 min
On-call load	2 eng/week rotation	Maintain
Release cadence	Web 3×/week, services 2×/week, OMS weekly	Maintain
Change failure rate	8%	<5% (stretch)
Rollback rate	6%	<5%

Note: Rollback plans must be tested in staging before production use.

6) Governance constraints and dates

6.1 Peak / holiday window

Code freeze begins Nov 10
Exceptions allowed only via Major Release Go/No-Go with:
- SLO targets satisfied or explicitly risk-accepted
- Rollback tested and time-bounded (< 30 min)
- Dashboards/alerts in place for primary metrics + guardrails
- Named owner on-call during change window
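One way to make the four exception criteria mechanical is a checklist function that returns the reasons a request is blocked. This is a minimal sketch: the class and field names are hypothetical, and only the four criteria and the 30-minute rollback bound come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ROLLBACK_MINUTES = 30  # "time-bounded (< 30 min)"


@dataclass
class FreezeExceptionRequest:
    slo_targets_met: bool
    slo_risk_accepted: bool          # explicit risk acceptance recorded
    rollback_tested: bool
    rollback_minutes: float          # measured rollback duration
    dashboards_and_alerts_ready: bool
    on_call_owner: Optional[str]     # named owner for the change window


def blocking_reasons(req: FreezeExceptionRequest) -> list[str]:
    """Return the unmet criteria; an empty list means Go."""
    reasons = []
    if not (req.slo_targets_met or req.slo_risk_accepted):
        reasons.append("SLO targets neither satisfied nor explicitly risk-accepted")
    if not req.rollback_tested or req.rollback_minutes >= MAX_ROLLBACK_MINUTES:
        reasons.append("Rollback not tested or not bounded under 30 min")
    if not req.dashboards_and_alerts_ready:
        reasons.append("Dashboards/alerts missing for primary metrics or guardrails")
    if not req.on_call_owner:
        reasons.append("No named owner on-call during the change window")
    return reasons
```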

6.2 Budget and resourcing

Opex remaining pre-peak: $1.2M (for contractors + cloud/infra + tooling through Q4; engineer salaries separate)
Net new hires frozen; you may use up to 2 contractors
Avoid engineering becoming a procurement bottleneck

Current infra run rate: ~$180K/month (compute: $110K, CDN: $35K, observability: $25K, misc: $10K)

Peak scaling assumption: 1.5-2× compute cost during Nov-Dec (~$165-220K/month for compute alone)
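The budget figures above can be combined into a rough opex check. The sketch below makes several assumptions that are not in the brief: October runs at the normal rate, only compute scales during Nov-Dec, contractors bill 40 hours per week for all 5 pre-freeze weeks (onboarding included), and no new tooling spend is counted.

```python
OPEX_REMAINING = 1_200_000
MONTHLY_RUN_RATE = {
    "compute": 110_000,
    "cdn": 35_000,
    "observability": 25_000,
    "misc": 10_000,
}
CONTRACTOR_RATE_PER_HOUR = 180


def infra_cost(normal_months: int, peak_months: int, compute_multiplier: float) -> float:
    """Infra spend when only compute scales during peak months (assumption)."""
    base = sum(MONTHLY_RUN_RATE.values())
    peak = base + MONTHLY_RUN_RATE["compute"] * (compute_multiplier - 1)
    return normal_months * base + peak_months * peak


def contractor_cost(contractors: int = 2, weeks: int = 5, hours_per_week: int = 40) -> int:
    """Fully loaded contractor spend; hours_per_week is an assumption."""
    return contractors * weeks * hours_per_week * CONTRACTOR_RATE_PER_HOUR


def remaining_opex(compute_multiplier: float) -> float:
    """Opex left after Oct (normal) plus Nov-Dec (peak) infra and 2 contractors."""
    return OPEX_REMAINING - infra_cost(1, 2, compute_multiplier) - contractor_cost()


low_case = remaining_opex(2.0)   # compute doubles in Nov-Dec
high_case = remaining_opex(1.5)  # compute grows 1.5x in Nov-Dec
```

Under these assumptions, roughly $368K to $478K remains uncommitted.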

6.3 Escalation and exceptions

Freeze exceptions require VP approval + President notification
Emergency changes during peak require VP + Sr Dir Eng sign-off
On-call engineer has authority to rollback without approval

6.4 Key dates

Today (case context): Early October 2025
Nov 10: Code freeze begins
Nov 24-Dec 2: BFCM / peak week
You have ~5 weeks to execute pre-freeze initiatives

7) Metric definitions (authoritative for the exercise)

Note on basis points (bps): 1 bps = 0.01 percentage points. Example: CVR moving from 2.00% to 2.10% is a +10 bps improvement.
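As a quick illustration of the conversion, a helper like the following turns two percentage values into a basis-point delta. The function name is hypothetical, and the rounding only suppresses floating-point noise.

```python
def bps_delta(before_pct: float, after_pct: float) -> float:
    """Change between two percentages in basis points (1 bps = 0.01 pp)."""
    return round((after_pct - before_pct) * 100, 6)


cvr_example = bps_delta(2.00, 2.10)    # the CVR example from the note
oms_example = bps_delta(99.20, 98.90)  # an OMS success drop expressed the same way
```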

Metric	Definition
CVR (site-wide)	Successful orders / sessions
AOV	Revenue / successful orders
Checkout p95 latency	p95 end-to-end time from "Place order" click to confirmation render
CWV p75	p75 on mobile from RUM: LCP, INP, CLS
OMS Success %	Successful order-creates / order-create attempts. Each automatic retry counts as a separate attempt. Manual customer retries (re-submitting after error page) count as new order attempts.
POS Success %	Successful store transactions / attempts during store hours
P1 MTTR	Median time from page to mitigation for P1 incidents
Error budget burn	% of quarterly error budget consumed
Ops → DWH latency	p95 minutes from source commit to warehouse availability
Change failure rate	% of deploys causing customer-impacting degradation
Rollback rate	% of deploys rolled back
Cost per order	All-in cloud/platform/observability cost ÷ orders
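The OMS Success % definition is easy to misread, because retries inflate the denominator. A minimal sketch of the counting rule follows, assuming a hypothetical attempt log in which every order-create call is one record, whether it is an initial call, an automatic retry, or a manual customer retry.

```python
from dataclasses import dataclass


@dataclass
class OrderCreateAttempt:
    order_ref: str
    kind: str        # "initial", "auto_retry", or "manual_retry" (hypothetical labels)
    succeeded: bool


def oms_success_pct(attempts: list[OrderCreateAttempt]) -> float:
    """Successful order-creates / all attempts; every retry is its own attempt."""
    if not attempts:
        raise ValueError("no order-create attempts in the window")
    successes = sum(1 for a in attempts if a.succeeded)
    return 100.0 * successes / len(attempts)


# One order that fails once and succeeds on its automatic retry:
log = [
    OrderCreateAttempt("order-1", "initial", False),
    OrderCreateAttempt("order-1", "auto_retry", True),
]
pct = oms_success_pct(log)
```

The customer in this example gets their order, yet the metric reads 50% for the window, because the failed first attempt still counts.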

8) Security, compliance, and segmentation

PCI-DSS applies to checkout/payment flows and includes POS
Store-cloud network segmentation required:
- Store networks segmented from corporate and public
- Least privilege connectivity to cloud endpoints
- Logging/monitoring for store-origin traffic
GDPR/CPRA applies for customer data handling

9) What is explicitly in scope for candidate proposals

You may propose changes across:

Next.js rendering strategy and caching (SSR/SSG/ISR)
Edge/CDN configuration
BFF/API gateway composition/back-pressure
Service reliability improvements (timeouts/retries/circuit breakers, idempotency, queueing)
OMS integration robustness (inventory reservation timing, contract versioning, failure handling)
Release governance (guardrails, go/no-go, freeze exceptions)
Team structure and contractor allocation

10) What is intentionally not provided

We are not providing full system diagrams, vendor names, or code. When information is missing:

State a minimal assumption
Propose 1-2 options
Pick one and explain trade-offs and verification
Keep plans consistent with stated constraints

11) The core tension you must navigate

With 8 engineers and ~5 weeks to code freeze, you face hard trade-offs:

Option	Risk
Ship Variant A for CVR lift	OMS reliability drops; fulfillment failures during peak
Fix OMS first, delay Variant A	Miss CVR opportunity; may not hit conversion targets
Do both in parallel	Spread 8 engineers too thin; neither done well
Hire contractors for one workstream	Onboarding time; quality risk

Note: Variant A (BFF changes) and CWV initiatives (Next.js/BFF optimization) share infrastructure and engineering resources. Consider sequencing when planning.
There is no "right" answer — there are defensible trade-offs. We're evaluating your judgment, not your ability to do everything.

instruction-packet.md

SLT — VP Product & Technology: Take-Home Case Study

Company: Sur La Table (SLT Lending SPV, Inc.)
Role: Vice President of Product Management & Tech
Take-home timebox: 4-5 hours (please do not exceed)
Live review: 60 minutes (exec readout + deep dives)

What we're evaluating

Prioritization under constraint — With 8 engineers and 5 weeks to freeze, you cannot do everything. Choosing what NOT to do is as important as what you do.
Trade-off reasoning — Especially conversion vs. reliability, speed vs. safety.
Technical judgment — Next.js/headless, caching, BFF patterns, SLOs, observability.
Operational rigor — Peak readiness, guardrails, rollback, incident response.
Communication — Can you tell a clear story to the President and Finance?

The constraint you must internalize

Resource	Reality
Engineers	8 total
Time to freeze	~5 weeks (Nov 10)
Contractors allowed	2 max
Budget	$1.2M opex remaining

You will not complete every possible initiative. The test is whether you make defensible choices about what to prioritize and what to defer.

Target State (for your plan)

Your plan must show a path toward these targets:

Category	Target
Checkout latency	p95 ≤ 800 ms
Core Web Vitals	LCP p75 ≤ 2.5s, INP p75 ≤ 200ms, CLS p75 ≤ 0.1
OMS reliability	≥ 99.5% success
POS reliability	≥ 99.9% success (store hours)
Peak headroom	≥ 2.5× p95 RPS for BFCM (design to ≥ 2,625 RPS)
Change safety	CFR < 5%, rollback < 5%

Data Appendix

All data is synthetic but internally consistent. Use it to make quantified decisions.

A1. Product outcomes (weekly baseline)

Week Start	Sessions	CVR %	AOV $	Orders	Revenue $
2025-08-25	1,200,000	2.10	92	25,200	2,318,400
2025-09-01	1,150,000	2.05	91	23,575	2,145,325
2025-09-08	1,180,000	2.08	93	24,544	2,282,592
2025-09-15	1,220,000	2.12	92	25,864	2,379,488
2025-09-22	1,210,000	2.06	94	24,926	2,343,044
2025-09-29	1,190,000	2.04	92	24,276	2,233,392

Baseline CVR: ~2.07% | Baseline AOV: ~$92 (weekly range $91-94)

A2. Web performance & reliability (weekly)

Week Start	Checkout p95 (ms)	LCP p75 (s)	INP p75 (ms)	CLS p75	OMS %	POS %
2025-08-25	1,250	3.20	240	0.12	99.20	99.86
2025-09-01	1,210	3.10	235	0.11	99.15	99.84
2025-09-08	1,180	3.05	230	0.11	99.25	99.87
2025-09-15	1,275	3.30	250	0.13	99.10	99.83
2025-09-22	1,230	3.15	245	0.12	99.18	99.85
2025-09-29	1,205	3.00	238	0.11	99.22	99.86

Gap to target (latest week, 2025-09-29): Checkout p95 ~400 ms over; LCP ~0.5 s over; INP ~40 ms over; CLS ~0.01 over; OMS ~30 bps under; POS ~4 bps under

B1. Checkout experiment (14-day sample)

Experiment ran Sept 15-28, overlapping with weekly baseline data above. Control metrics align with weekly averages.

Arm	Sessions	Orders	CVR % (95% CI)	AOV $	Revenue $	OMS % (95% CI)	Checkout p95	LCP p75
Control (legacy)	1,200,000	24,960	2.08 (2.06-2.10)	92.0	2,296,320	99.20 (99.15-99.25)	1,220 ms	3.10 s
Variant A (BFF + new payment)	1,200,000	25,680	2.14 (2.12-2.16)	92.5	2,375,400	98.90 (98.85-98.95)	950 ms	2.60 s

The tension: Variant A improves CVR (+6 bps), latency (-270ms), and LCP (-0.5s), but OMS drops 30 bps (99.20% → 98.90%).

Funnel breakdown (add-to-cart, cart-to-checkout) not available — focus analysis on end-to-end CVR and OMS impact.

Math you should do:

Control: 24,960 orders at 99.20% OMS → ~25,161 attempts → 201 failed
Variant A: 25,680 orders at 98.90% OMS → ~25,965 attempts → 285 failed
Net: +720 orders, but +84 additional fulfillment failures
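The attempt and failure counts above follow from dividing successful orders by the OMS success rate. The sketch below reproduces that arithmetic without intermediate rounding; the function and variable names are illustrative only.

```python
def implied_attempts(successful_orders: int, oms_success_pct: float) -> float:
    """Order-create attempts implied by a success count and a success rate."""
    return successful_orders / (oms_success_pct / 100)


control_orders, variant_orders = 24_960, 25_680

control_attempts = implied_attempts(control_orders, 99.20)
variant_attempts = implied_attempts(variant_orders, 98.90)

control_failed = control_attempts - control_orders
variant_failed = variant_attempts - variant_orders

extra_orders = variant_orders - control_orders
extra_failed = variant_failed - control_failed
```

Unrounded, the failure counts are about 201.3 and 285.6, so the figures quoted above are truncated values and the difference is about 84.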

Statistical context:

Historical CVR standard deviation: ±3 bps
With 1.2M sessions per arm, α=0.05, power=0.80: experiment can detect ±2 bps differences
The +6 bps CVR lift is statistically significant (p < 0.001); focus your analysis on the risk/reward trade-off, not statistical validity

B2. Variant A OMS degradation — root cause analysis

Post-experiment investigation identified the primary driver:

Factor	Control	Variant A	Impact
Payment integration latency (p95)	200 ms	350 ms	+150 ms
Payment provider success rate	99.4%	99.1%	-30 bps
End-to-end OMS request time (p95)	1,850 ms	2,000 ms	+150 ms

Root cause: New payment provider in Variant A has lower reliability (99.1% vs 99.4%). The 30 bps OMS degradation is primarily driven by payment failures, not timeouts or inventory issues.

Mitigation options to consider:

Ship with payment provider fallback logic (+2 eng-weeks; requires vendor contract amendment)
Gate and defer to Q1 (vendor has committed to reliability improvements by Feb)
Ship with guardrails + aggressive rollback threshold (accepts higher failure cost during ramp)

C1. Peak forecast

Week of	Sessions	p95 Sustained RPS
2025-11-10	2,000,000	750
2025-11-17	2,800,000	900
2025-11-24 (BFCM)	3,600,000	1,050
2025-12-01	2,600,000	820

Current tested capacity: ~400 RPS sustained at SLO latency (Q3); Q4 pre-provisioned to 800 RPS sustained (load test validation pending)

Requirement: Headroom ≥ 2.5× the BFCM p95 sustained forecast (1,050 RPS) → design to ≥ 2,625 RPS. The 2.5× multiplier reflects 2024 BFCM traffic spikes of 2.2× plus a safety buffer.

Note: RPS reflects total requests, not sessions. Assume ~15 requests per session (page loads, API calls, assets). 3.6M sessions × 15 requests ÷ 604,800 seconds ≈ 89 RPS average; peak at 1,050 RPS = ~12× peak/average ratio.

Forecast confidence: BFCM sustained RPS forecast is 1,050 (p50). Historical forecast error shows p75 = 1,260 RPS (+20%), p95 = 1,575 RPS (+50%). The 2.5× buffer (2,625 RPS) covers the p95 worst-case forecast (1,575 RPS) plus margin for instantaneous traffic spikes above sustained load.
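The capacity figures in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the helper names are hypothetical.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604,800

BFCM_SESSIONS = 3_600_000
REQUESTS_PER_SESSION = 15
BFCM_SUSTAINED_RPS = 1_050        # forecast (p50 of forecast error)
HEADROOM_MULTIPLIER = 2.5


def average_rps(sessions_per_week: int, requests_per_session: int) -> float:
    """Week-long average request rate implied by the session forecast."""
    return sessions_per_week * requests_per_session / SECONDS_PER_WEEK


def scale_up_needed(capacity_rps: float, target_rps: float) -> float:
    """How many times the given capacity must grow to reach the target."""
    return target_rps / capacity_rps


bfcm_average_rps = average_rps(BFCM_SESSIONS, REQUESTS_PER_SESSION)
peak_to_average = BFCM_SUSTAINED_RPS / bfcm_average_rps
design_target_rps = BFCM_SUSTAINED_RPS * HEADROOM_MULTIPLIER

vs_tested = scale_up_needed(400, design_target_rps)        # Q3 load-tested capacity
vs_provisioned = scale_up_needed(800, design_target_rps)   # provisioned, not yet validated
margin_over_p95_forecast = design_target_rps / 1_575       # p95 worst-case forecast
```

On these numbers the design target is about 6.6× the tested capacity and about 3.3× the provisioned sustained capacity. It leaves roughly 1.67× margin over the p95 worst-case forecast.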

C2. OMS failure distribution (last 30 days, Control baseline)

Timeouts: 47%
Inventory mismatch / oversell: 28%
Payment gateway (systemic): 15%
Other: 10%

Note: This distribution reflects Control (legacy payment provider). In Variant A, payment gateway failures increase to ~38% of total failures due to new provider's lower reliability.

OMS retry policy: 2 retries with exponential backoff; timeouts not auto-retried (customers see immediate error)
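The stated retry policy can be sketched as follows. The exception classes, the 0.2-second base delay, and the `create_order` callable are hypothetical. Only the two-retry limit, the exponential backoff, and the rule that timeouts are not retried come from the policy above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OmsTimeout(Exception):
    """Order-create timed out; per policy this is surfaced immediately."""


class OmsTransientError(Exception):
    """Retryable failure (hypothetical category, e.g. a 5xx from a dependency)."""


def create_order_with_retry(
    create_order: Callable[[], T],
    max_retries: int = 2,
    base_delay_s: float = 0.2,          # assumed; not specified in the brief
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call create_order, retrying transient errors with exponential backoff."""
    retries_used = 0
    while True:
        try:
            return create_order()
        except OmsTimeout:
            raise  # timeouts are not auto-retried; the customer sees the error
        except OmsTransientError:
            if retries_used >= max_retries:
                raise
            sleep(base_delay_s * (2 ** retries_used))  # 0.2 s, then 0.4 s
            retries_used += 1
```

Because each automatic retry counts as a separate attempt in OMS Success %, any change to this policy moves the metric even when customer outcomes stay the same. Retrying order-create safely would also presumably need idempotency on the OMS side.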

C3. Inventory feed lag

p95 lag: 48 min (target ≤ 15 min)
Oversell rate: 0.6% (target ≤ 0.1%)

Note: Lag → oversell relationship is not linear; fixing may require both pipeline improvements and reservation logic changes

Candidate Initiatives (for prioritization)

These are the initiatives you should consider prioritizing or deferring:

Variant A ship — checkout experiment with CVR lift but OMS risk
OMS reliability — fix timeouts, inventory mismatch, payment gateway issues
CWV improvements — LCP, INP, CLS optimization for web platform
Peak capacity — scale to 2.5× headroom for BFCM
Data pipeline latency — reduce 48 min lag to ≤15 min
Legacy OrderBridge cutover — strangler migration to new services
POS improvements — already at 99.86%, minimal gap to 99.9% (low priority; defer unless other initiatives complete early)

You cannot do all of these with 8 engineers in 5 weeks. Choose wisely.

Your 5 Tasks

Task 1: Outcome Targets + Pre-Freeze & Q1 Roadmap (required)

Deliverable: Slides 1-3 of your deck + spreadsheet tab

Set specific targets: CVR (bps improvement), checkout p95, CWV, OMS %
Identify 2-3 major initiatives (not 10) that move these metrics given 8 engineers and 5 weeks
For each: problem → hypothesis → outcome → owner → cost (eng weeks + $)
Show what you're explicitly deferring and why
Note: QA can support 2 major initiatives in parallel (see Context Brief, Section 2.2). A 3rd initiative requires sequencing, reduced test coverage, or contractor QA support.

Task 2: Experiment Decision — Variant A (required)

Deliverable: Slides 4-5 of your deck + 1-page experiment plan (PDF)

Decision: Ship immediately / Ship with guardrails / Gate / Iterate — with explicit rationale
If shipping: what guardrails and rollback triggers?
If gating: what OMS fixes are required first? Timeline?
If iterating: what's the re-test hypothesis?
Include: stat-sig requirements (assume α=0.05, power=0.80), alert thresholds, auto-rollback conditions
Quantify the trade-off: expected revenue gain vs. fulfillment failure cost
Assume failed order cost: $85 avg (CS: $25, refund/restock: $40, LTV impact: $20)
Note: We evaluate reasoning quality, not the specific decision. Ship, Gate, or Iterate can all score well if justified.
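A first-pass version of the trade-off quantification might look like the sketch below. It uses the 14-day experiment figures from B1 and the +84 additional failures from the B1 math. The `margin_rate` parameter is a hypothetical knob, since the brief gives revenue but no margin, and the default of 1.0 simply treats revenue as the gain.

```python
FAILED_ORDER_COST = 25 + 40 + 20  # CS + refund/restock + LTV impact = $85

CONTROL_REVENUE = 2_296_320
VARIANT_REVENUE = 2_375_400
EXTRA_FAILED_ORDERS = 84          # from the B1 math, per 14 days per arm


def net_value(incremental_revenue: float, extra_failed_orders: float,
              margin_rate: float = 1.0) -> float:
    """Incremental gain minus incremental failure cost over the same period."""
    return incremental_revenue * margin_rate - extra_failed_orders * FAILED_ORDER_COST


incremental_revenue = VARIANT_REVENUE - CONTROL_REVENUE
failure_cost = EXTRA_FAILED_ORDERS * FAILED_ORDER_COST
net_14_days = net_value(incremental_revenue, EXTRA_FAILED_ORDERS)
```

At experiment traffic this gives about $79K of incremental revenue against about $7K of failure cost per 14 days. Peak-season volumes and a realistic margin rate would change both sides, so the sketch is a starting point for the model tab and not a conclusion.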

Task 3: Peak / Holiday Readiness (required)

Deliverable: Slide 6 of your deck + spreadsheet tab

Capacity plan: How do you get from current capacity to ≥ 2,625 RPS?
Load test plan: Profiles, pass/fail criteria, timeline (must complete before Nov 10)
DR drill: What, when, pass criteria
Freeze governance: Exception process, rollback requirements, on-call coverage
Incident response: Runbooks, escalation, page budget (target ≤ 2 pages/eng/week)

Task 4: Legacy Transition — OrderBridge Cutover (required)

Deliverable: Slide 7 of your deck

Strangler approach: What gets migrated first? Contract versioning?
Parallel run: How long? What SLIs determine readiness?
Cutover checklist: Go/no-go criteria, rollback plan (< 30 min)
Decommission milestones: % legacy endpoints retired by when?
Given 8 engineers, be realistic about timeline (may extend past 90 days)
Show phasing: What's achievable pre-freeze (Nov 10) vs. what extends into Q1?

Task 5: Web Platform — Path to CWV Targets (required)

Deliverable: Slide 8 of your deck

Rendering strategy: SSR/SSG/ISR by route (Home, PLP, PDP, Cart, Checkout)
Caching: Edge/CDN rules, cache keys, TTLs, invalidation
BFF responsibilities: Composition, back-pressure, timeout budgets
Target path: TTFB p75 → LCP p75 → INP p75 (show the math)
Rollout safety: Feature flags, canary %, automated rollback

Deliverables (strict limits)

Artifact	Format	Limit
Executive deck	PDF	10 slides max
Model / calculations	XLSX	1 file, multiple tabs OK
Experiment plan	PDF	1 page

File naming:

SLT_VP_Case_<LastName>_Deck.pdf
SLT_VP_Case_<LastName>_Model.xlsx
SLT_VP_Case_<LastName>_ExperimentPlan.pdf

Live Review Agenda (60 min)

Segment	Time	Focus
Exec walkthrough	20 min	Present as if to President: strategy, trade-offs, risks
Deep dive: Experiment + OMS	15 min	Variant A decision, guardrails, OMS failure modes
Deep dive: Peak readiness	10 min	Capacity math, load test, freeze governance
Deep dive: Web platform	10 min	Caching strategy, TTFB→LCP path
Wrap-up / Q&A	5 min	Anything we didn't cover

What "good" looks like

Makes a clear call on Variant A — not "it depends" without a recommendation
Shows the math — net orders, failure cost, capacity headroom
Acknowledges constraints — "With 8 engineers, we cannot do X before freeze; here's when we will"
Has specific guardrails — "Auto-rollback if OMS < 99.0% over 15-min window"
Defers explicitly — "Data pipeline latency is deferred to Q1; here's why it's lower priority"
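A guardrail such as "auto-rollback if OMS < 99.0% over 15-min window" could be evaluated with a sliding window like the one sketched here. The class name and the `min_attempts` floor (added so that a handful of early failures cannot trigger a rollback) are assumptions; the threshold and window length come from the example guardrail.

```python
from collections import deque
from typing import Optional


class OmsGuardrail:
    """Sliding-window OMS success check over order-create attempts."""

    def __init__(self, threshold_pct: float = 99.0, window_s: float = 15 * 60,
                 min_attempts: int = 500):
        self.threshold_pct = threshold_pct
        self.window_s = window_s
        self.min_attempts = min_attempts   # hypothetical noise floor
        self._events = deque()             # (timestamp_s, succeeded)
        self._successes = 0

    def record(self, timestamp_s: float, succeeded: bool) -> None:
        """Add one attempt and evict attempts that fell out of the window."""
        self._events.append((timestamp_s, succeeded))
        self._successes += int(succeeded)
        cutoff = timestamp_s - self.window_s
        while self._events and self._events[0][0] <= cutoff:
            _, old_succeeded = self._events.popleft()
            self._successes -= int(old_succeeded)

    def success_pct(self) -> Optional[float]:
        if not self._events:
            return None
        return 100.0 * self._successes / len(self._events)

    def should_rollback(self) -> bool:
        """True only when the window has enough attempts and is under threshold."""
        if len(self._events) < self.min_attempts:
            return False
        return self.success_pct() < self.threshold_pct
```

In practice the decision would come from the experimentation or observability platform's own guardrail feature; the sketch only pins down what "over a 15-min window" means.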

Model expectations: Your spreadsheet should include (1) Variant A ROI calculation, (2) capacity math, (3) initiative cost estimates

What will hurt your score

Proposing a plan that requires 20+ engineers
Ignoring OMS degradation in Variant A decision
No rollback plan for any major change
Vague targets ("improve performance") instead of specific numbers
Hand-waving peak readiness ("we'll load test")

FAQ

Can I make assumptions? Yes — list them explicitly and bound uncertain decisions with options + trade-offs.

Do I need to write code? No. Diagrams, decision frameworks, and rollout plans are sufficient.

What if I think the targets are unrealistic? Say so — and propose what IS achievable with the constraints. That's a valid answer.

Should I address compliance/security/data pipelines? Only if directly relevant to your 5 tasks. We'll ask about these in the live session if needed.

What about backup slides for the live review? Use your 10-slide deck for the entire review; no backup slides needed.