@lethain
Created May 23, 2025 04:37

Engineering Architecture Decision-Making Strategy

Reading this document

To apply this strategy, start at the top with Policy. To understand the thinking behind this strategy, read sections in reverse order, starting with Explore, then Diagnose and so on.

More detail on this structure is available in "Making a readable Engineering Strategy document."

Policy & Operation

Our policies for engineering architecture decision-making are:

Federated Decision Authority Framework

We will implement clear decision rights at three organizational levels:

  • Team-Level Decisions: Product engineering teams have full authority over architecture decisions that affect only their services and don't create cross-team dependencies
  • Cross-Team Decisions: Architecture changes affecting multiple teams require input from our Architecture Advisory Group (AAG), which is composed of Staff+ engineers representing each major domain
  • Organization-Level Decisions: Technology choices that impact company-wide standards (new programming languages, major infrastructure changes) require CTO approval with AAG recommendation
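
As a rough illustration, the three decision levels above can be expressed as a small routing function. This is a minimal sketch, not part of the policy itself: the `Proposal` fields, the `classify` function, and the `REQUIRED_REVIEW` mapping are hypothetical names chosen for the example, and the rule that company-wide standards take precedence over the other checks is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionLevel(Enum):
    TEAM = "team"
    CROSS_TEAM = "cross-team"
    ORGANIZATION = "organization"


@dataclass(frozen=True)
class Proposal:
    # Hypothetical inputs; the policy does not prescribe these exact fields.
    affected_teams: int
    creates_cross_team_dependency: bool
    changes_company_standard: bool


def classify(proposal: Proposal) -> DecisionLevel:
    """Map a proposal onto one of the three decision levels."""
    # Assumption: company-wide standards are checked first.
    if proposal.changes_company_standard:
        return DecisionLevel.ORGANIZATION
    if proposal.affected_teams > 1 or proposal.creates_cross_team_dependency:
        return DecisionLevel.CROSS_TEAM
    return DecisionLevel.TEAM


# Who must be involved at each level, paraphrasing the policy text.
REQUIRED_REVIEW = {
    DecisionLevel.TEAM: "team decides",
    DecisionLevel.CROSS_TEAM: "AAG input required",
    DecisionLevel.ORGANIZATION: "CTO approval with AAG recommendation",
}
```

For example, `classify(Proposal(affected_teams=3, creates_cross_team_dependency=False, changes_company_standard=False))` returns `DecisionLevel.CROSS_TEAM`, which maps to "AAG input required".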

Mandatory Architecture Decision Records (ADRs)

All significant architecture decisions must be documented using ADRs within 48 hours of the decision. ADRs must include context, options considered, decision made, and rationale. ADRs are stored in a searchable, central repository accessible to all engineers. Teams cannot proceed with implementation until the ADR is published and reviewed.
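
One way to picture the ADR requirements is as a record with the four required sections plus the two timing rules. The sketch below is illustrative only: the `ADR` class, its field names, and its methods are assumptions made for this example and do not describe the actual repository format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# The 48-hour documentation window stated in the policy.
ADR_DEADLINE = timedelta(hours=48)


@dataclass
class ADR:
    # Field names are hypothetical; the policy only lists the required sections.
    title: str
    context: str
    options_considered: List[str]
    decision: str
    rationale: str
    decided_at: datetime
    published_at: Optional[datetime] = None
    reviewed: bool = False

    def missing_sections(self) -> List[str]:
        """Return the names of required sections that are still empty."""
        required = {
            "context": self.context,
            "options_considered": self.options_considered,
            "decision": self.decision,
            "rationale": self.rationale,
        }
        return [name for name, value in required.items() if not value]

    def published_on_time(self) -> bool:
        """True if the ADR was published within 48 hours of the decision."""
        if self.published_at is None:
            return False
        return self.published_at - self.decided_at <= ADR_DEADLINE

    def may_implement(self) -> bool:
        """Implementation is blocked until the ADR is complete, published, and reviewed."""
        return (
            self.published_at is not None
            and self.reviewed
            and not self.missing_sections()
        )
```

A record with an empty `rationale` reports `["rationale"]` from `missing_sections()` and keeps `may_implement()` false, even if it has been published and reviewed.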

Weekly Architecture Advisory Sessions

We will hold 30-minute weekly "Architecture Office Hours" where any engineer can present decisions for feedback. AAG members rotate facilitation duties to prevent bottlenecks. This is a non-binding advisory format where teams receive feedback but make final decisions. If 2+ AAG members strongly disagree with a team's decision, it escalates to CTO review.

Technology Choice Governance

We maintain a default technology stack approved for new projects. Experimental technology trials require AAG approval and a sunset review after 6 months. Adopting a new language or framework requires a demonstrable advantage over existing options and a commitment to long-term support. Exception requests require written justification addressing operational impact, team expertise, and migration costs.
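
The six-month sunset review can be scheduled with simple calendar arithmetic. The helper below is a sketch under two assumptions: "6 months" means six calendar months from the AAG approval date, and a day that does not exist in the target month is clamped to that month's last day. The function names are hypothetical.

```python
import calendar
from datetime import date

# Trial length stated in the policy.
TRIAL_LENGTH_MONTHS = 6


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def sunset_review_due(approved_on: date) -> date:
    """Date by which an experimental technology trial needs its sunset review."""
    return add_months(approved_on, TRIAL_LENGTH_MONTHS)
```

Under these assumptions, a trial approved on 31 August 2025 would be due for review on 28 February 2026.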

Architecture Advisory Group Structure

The AAG consists of 5-7 Staff+ engineers representing domains (Frontend, Backend, Infrastructure, Data, Security). Members are nominated by teams and confirmed by engineering leadership based on technical expertise and collaborative judgment. Terms are 18 months to prevent entrenchment. Time commitment is 2-3 hours per week for office hours, retrospectives, and decision review.

Escalation and Exception Handling

The standard escalation path is Team → AAG feedback → CTO review if unresolved. Emergency decisions can be made without the full process but require a retroactive ADR within 24 hours. Teams can request CTO review of AAG recommendations they strongly disagree with.
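
The escalation rules can be summarized in a few lines of code. This sketch combines the path described here with the "2+ AAG members strongly disagree" trigger from the Weekly Architecture Advisory Sessions policy. The function names and the string return values are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Thresholds taken from the policy text.
STRONG_DISAGREEMENT_THRESHOLD = 2
EMERGENCY_ADR_WINDOW = timedelta(hours=24)


def next_step(strong_aag_objections: int, team_requests_cto_review: bool = False) -> str:
    """Decide what happens after a team has received AAG feedback."""
    if strong_aag_objections >= STRONG_DISAGREEMENT_THRESHOLD or team_requests_cto_review:
        return "cto_review"
    # Office hours are advisory, so otherwise the team makes the final call.
    return "team_decides"


def retroactive_adr_due(decided_at: datetime) -> datetime:
    """Deadline for the retroactive ADR that follows an emergency decision."""
    return decided_at + EMERGENCY_ADR_WINDOW
```

A single strong objection leaves the decision with the team, while two objections, or an explicit request from the team, route it to CTO review.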

Measurement and Review

We track leading indicators (ADR completion rate, Architecture Office Hours attendance, escalation frequency) and lagging indicators (cross-team integration issues, technical debt metrics, engineer satisfaction). Metrics are reviewed monthly in AAG retrospectives and presented quarterly to engineering leadership. Success criteria include 90% ADR compliance, a decision escalation rate below 5%, and improved engineering satisfaction scores.
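
The success criteria above lend themselves to a simple check during the monthly review. The sketch below is illustrative: the function name and parameters are hypothetical, and it assumes both rates are measured against the count of significant decisions in the period, which the policy does not specify.

```python
from typing import Dict


def meets_success_criteria(
    adrs_published: int,
    significant_decisions: int,
    escalations: int,
    satisfaction_now: float,
    satisfaction_before: float,
) -> Dict[str, bool]:
    """Evaluate each success criterion for one review period."""
    if significant_decisions <= 0:
        raise ValueError("significant_decisions must be positive")
    return {
        # At least 90% of significant decisions have an ADR.
        "adr_compliance": adrs_published * 100 >= 90 * significant_decisions,
        # Fewer than 5% of decisions were escalated (assumed denominator).
        "escalation_rate": escalations * 100 < 5 * significant_decisions,
        # Satisfaction improved relative to the previous measurement.
        "satisfaction_improved": satisfaction_now > satisfaction_before,
    }
```

With 45 ADRs for 50 decisions, 2 escalations, and satisfaction rising from 3.8 to 4.1, all three criteria pass. With 3 escalations the escalation criterion fails, since 3 of 50 is 6%.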

Diagnose

Based on our analysis of the current state, we've identified the following root causes and constraints:

Technical Constraints

  • Decision Authority Ambiguity: There is no clear framework for determining who has final authority on architecture decisions, leading to inconsistent outcomes where the most persistent voices prevail rather than the most informed ones
  • Inconsistent Technical Standards: Without coordinated decision-making, teams make incompatible technology choices that create integration challenges, operational overhead, and knowledge fragmentation
  • Lack of Decision Documentation: Architecture decisions are made in meetings, Slack discussions, or informal conversations, leaving future engineers without context for why systems were designed as they were

Organizational Constraints

  • Staff+ Engineer Utilization: Senior engineers spend significant time in reactive architectural debates rather than proactive technical leadership, reducing their impact on strategic technical initiatives
  • Team Autonomy vs. Coordination Tension: Teams want sufficient autonomy to move quickly, but the absence of coordination mechanisms creates downstream problems that ultimately slow everyone down
  • Onboarding and Context Transfer: New engineers struggle to understand architectural patterns and decision-making precedents, leading to repeated debates about previously settled questions

Business Impact

  • Reduced Engineering Velocity: The combination of unclear decision rights and lack of precedent documentation means architectural questions consume disproportionate engineering time
  • Technical Debt Accumulation: Inconsistent architectural decisions create technical debt that becomes expensive to resolve as the codebase grows
  • Talent Retention Risk: Experienced engineers become frustrated with inefficient decision-making processes, while newer engineers feel excluded from important technical discussions

Constraints We Must Work Within

  • Team Size and Growth: We cannot significantly expand the number of senior engineers dedicated to architecture coordination, so any solution must scale efficiently
  • Existing Technical Diversity: We already have multiple programming languages and architectural patterns in production, so we cannot impose uniform standards retroactively
  • Product Development Pressure: Product teams have aggressive delivery timelines that cannot accommodate lengthy approval processes

Explore

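Based on research across technology organizations, three distinct approaches have emerged for managing architecture decision-making, along with two supporting practices (Architecture Decision Records and technology strategy patterns):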

Advisory Architecture Process (Stripe/Netflix Model)

In Stripe's approach, implementing teams make their own architecture decisions, and a group of senior engineers provides feedback and guidance. No formal approval is required, but teams are expected to incorporate feedback. Netflix follows "Freedom and Responsibility," where engineers make decisions within their sphere of responsibility and share context broadly through documentation and RFCs.

Federated Architecture Councils (Amazon/Google Model)

Amazon's "Two-Pizza Team" model allows each service team to own its architecture decisions within the bounds of the "Well-Architected Framework." Google uses Technical Lead Networks, in which Technical Leads coordinate architecture decisions in each area. Regular Architecture Review Committees handle company-wide decisions, with a strong emphasis on written design documents.

Centralized Architecture Authority (Traditional Enterprise)

Traditional enterprise approaches like Microsoft's historical model use central architecture boards that approve all significant decisions, with detailed architecture standards and governance processes. Technology choices are limited to approved stacks with strong emphasis on consistency and risk management.

Architecture Decision Records (ADRs)

Michael Nygard's ADR pattern provides lightweight documentation of architecture decisions, capturing context, options considered, and rationale. This creates an immutable record that lets future teams understand the reasoning behind past decisions. The pattern has been successfully implemented at ThoughtWorks, Spotify, and numerous startups.

Technology Strategy Patterns

Eben Hewitt's "Architectural Decision Authority" pattern emphasizes clearly defined decision rights at different organizational levels, with escalation paths for cross-cutting concerns. It balances autonomy and coordination and offers specific implementation guidance for organizations of different sizes.


This strategy combines the team autonomy model successfully used at Netflix and Amazon with the coordinated oversight from Google's Technical Lead networks, while avoiding the bureaucratic overhead that makes traditional enterprise architecture governance ineffective. The federated approach directly addresses our diagnosed constraints while providing the documentation and coordination mechanisms necessary to scale architectural decision-making as the organization grows.
