@calebamiles
Last active April 26, 2017 15:40
Possible solutions to currently identified problems with Kubernetes governance

PROBLEM: The structure of the project is opaque to newcomers

  • project structure added to governance repository under Kubernetes GitHub organization
  • create "Front Desk" inspired by Debian Project to welcome new contributors
  • create mentorship requirement in contributor ladder

PROBLEM: There is no clear technical escalation path / procedure

  • adopt RFC process inspired by Rust language governance
  • create technical steering committee bootstrapped from "top level OWNERS" as penultimate arbiters with power vested in Kubernetes community to make all final decisions
  • affirm decision by technical steering committee in case of the community's inability to reach a decision
  • devolve primary decision making and consensus building authority to SIGs
  • require SIGs to create policies for exemption from RFC process for "minor" changes
  • establish Technical Lead role for SIGs responsible for final decision within SIG
  • establish formal escalation from SIG to technical steering committee
  • deprecate "features" process and repository
  • place component/area ownership under SIGs, define escalation from component/area owners to SIGs
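
The escalation order implied by the bullets above can be sketched in a few lines. This is only an illustration under assumptions: the labels in `ESCALATION_PATH` and the function name `next_arbiter` are hypothetical, not existing Kubernetes tooling, and the order (component/area owners, then SIG, then technical steering committee, then the community as final arbiter) is one reading of this list.

```python
from typing import Optional

# Hypothetical labels for the escalation levels described in the list above.
# Within a SIG, the proposed Technical Lead would make the final call.
ESCALATION_PATH = [
    "component/area owners",
    "SIG",
    "technical steering committee",
    "community",
]


def next_arbiter(current: str) -> Optional[str]:
    """Return the body a dispute escalates to, or None at the final level."""
    index = ESCALATION_PATH.index(current)  # raises ValueError for unknown bodies
    if index + 1 < len(ESCALATION_PATH):
        return ESCALATION_PATH[index + 1]
    return None
```

Keeping the path as plain data would let a SIG charter or the governance repository state it once and have tooling read it, rather than encoding the order in prose alone.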

PROBLEM: There aren't consistent / official decision-making procedures for ~anything: consensus, lazy consensus, consensus-seeking, CIVS voting, etc.

  • empower the "PM Group" to serve as a policy clearinghouse which operates under consensus-seeking model
    • other SIGs chartered under the "PM Group" and/or a technical steering committee will make policy decisions related to a particular area (e.g. contributor experience, release management, harassment) but the "PM Group" should reconcile possibly conflicting policy prescriptions
  • devolve technical decision making from technical steering committee to SIGs which operate under consensus-seeking model unless otherwise specified in their SIG charter
  • require SIG charter to specify decision making specifics (e.g. TTL, majority thresholds)
  • require supermajority voting on issues before technical steering committee
  • affirm all decisions made by PM Group (or delegates) or technical steering committee by lazy consensus
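
To make the "decision making specifics" concrete, here is a rough sketch of two of the rules named above: lazy consensus with a TTL, and a supermajority vote. The function names, the two-thirds default threshold, and the handling of abstentions are all assumptions for illustration; the list above does not fix any of these values, and a SIG charter would have to.

```python
from datetime import datetime, timedelta
from fractions import Fraction


def lazy_consensus_reached(opened: datetime, now: datetime,
                           ttl: timedelta, objections: int) -> bool:
    """A proposal passes once the TTL has elapsed with no objections raised."""
    return objections == 0 and now - opened >= ttl


def supermajority_reached(votes_for: int, votes_against: int,
                          threshold: Fraction = Fraction(2, 3)) -> bool:
    """True when the share of 'for' votes meets the threshold.

    Abstentions are ignored here, which is an assumption; the default
    two-thirds threshold is also hypothetical.
    """
    total = votes_for + votes_against
    if total == 0:
        return False
    return Fraction(votes_for, total) >= threshold
```

For example, a charter could state "lazy consensus, 7 day TTL" for policy changes and "two-thirds supermajority" for issues before the technical steering committee, and both would map directly onto these two checks.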

PROBLEM: There are no official processes for adding org members, reviewers, approvers, maintainers, project leaders, etc

  • define minimum requirements to be added to project repositories
  • establish "Front Desk" for project wide GitHub administrative tasks
  • empower SIGs to perform additional GitHub administration tasks independently. Ideally each SIG should have at least one person who is a GitHub Organization owner. Git is a powerful time machine, and I feel that we should be generally permissive in granting access to SCM in order to support a "merge then review" development workflow, which differs from the "review then merge" workflow currently adopted by the project
  • require SIGs to specify component/area ownership change rules in SIG charter
  • require elections for all project leadership positions

PROBLEM: There is no official / regularly meeting body to drive overall technical vision of the project.

  • establish technical steering committee

PROBLEM: We don't agree on the right level/types of engagement for leaders. Some feel that leaders should be recused from responsibilities such as SIG leadership, while others feel they need to be deeply involved in releases, etc.

  • split technical, administrative, and product/project roles at SIG level
  • establish permanent bodies to handle testing and release management; add or reserve at least one seat for release management leader and testing leader on any technical steering committee
    • SIG Testing already exists but should be empowered to prescribe policies including blocking submit queue until test failures are addressed
    • Establish SIG Release focusing on release management and infrastructure

PROBLEM: There aren't official technical leads for most subareas of the project

  • formally devolve component/area ownership to SIGs
  • establish mentorship requirements for project leadership to grow new leaders
  • split technical, administrative, and product/project roles at SIG level to encourage broader participation

PROBLEM: There is no centralized / authoritative means of resolving non-technical problems on the project, including staffing gaps (engineering, docs, test, release, ...), effort gaps (tragedy of the commons), expertise mismatches, priority conflicts, personnel conflicts, etc.

  • establish permanent body to handle moderation issues
  • establish staffing requirements for corporate inclusion in project materials (e.g. https://kubernetes.io/partners/) to address gaps
  • establish body under permanent "PM Group" to aggregate feature requests
  • allow removal from any role of leadership
  • establish "bounties" or PR multipliers for contributions to poorly staffed areas of the project

PROBLEM: [In particular, there is] insufficient effort on contributor experience (e.g., github tooling, project metrics), code organization and build processes, documentation, test infrastructure, and test health. Some on the project have argued that there is insufficient backpressure and/or incentives for people employed to deliver customer-facing features to spend time on issues important to overall project health.

  • establish "bounties" or PR multipliers for contributions to poorly staffed areas of the project
  • establish body under permanent "PM Group" to aggregate feature requests
  • establish staffing requirements for corporate inclusion in project materials (e.g. https://kubernetes.io/partners/) to address gaps

PROBLEM: [A related issue is] counterbalancing technical- and product-based decision-making.

  • establish body under permanent "PM Group" to aggregate feature requests
  • empower permanent bodies focused on release management and testing to directly influence SIG priorities
  • empower SIGs to set their own priorities with veto power held by technical steering committee
  • formalize SIG PM representative role as a project rather than product management role

PROBLEM: Visibility across the entire project is lacking.

  • require basic project management as a precondition of SIG chartering
  • require PM representative as a precondition of SIG chartering
  • require PM Group and a "Technical Steering Committee" to produce a "weather forecast" for Kubernetes
    • weather forecast format will hopefully allow for longer term communication of project direction while acknowledging that deadlines slip and no feature is guaranteed to arrive in a particular release
    • weather forecast for external consumers of the project could be combined with a monthly development report based on changes actually merged into SCM

PROBLEM: Metrics, metrics, metrics, metrics. We're flying blind.

  • require metrics on intended effect to be produced before implementing project policy changes
  • require corporate staffing of body focused on contributor experience
  • create PR multiplier for contributions for contributor experience

PROBLEM: There is no documented proposal process.

  • consolidate proposal and feature process into single RFC process inspired by Rust

PROBLEM: There isn't a documented process for advancing APIs through alpha, beta, stable stages of development.

  • reduce stability levels to two categories (unstable and stable), with the technical steering committee deciding on graduation
  • require technical steering committee to ungate functionality

PROBLEM: Project technical leaders (de facto or otherwise) are not available via office hours.

  • require communication SLAs
  • establish regularly meeting technical steering committee
  • require SIG office hours as a precondition for charter

PROBLEM: We don't have processes or documentation for onboarding new contributors.

  • establish "Front Desk" inspired by Debian project to guide new contributors through ladder
  • establish mentorship requirements for project leaders

PROBLEM: There are no official safeguards to prevent control over the project by a single company.

  • establish quotas for corporate contributions for all project leadership positions
  • route corporate engagements through single body, possibly "PM Group"

PROBLEM: Project leadership lacks diversity.

  • establish "Front Desk" to help guide new contributors through process
  • establish term limits for elected project leadership roles, but possibly not for "merit" based component/area ownership roles
  • create permanent engagement body to reach out to underrepresented communities

PROBLEM: There is no conflict of interest policy regarding leading/directing both the open-source project and commercialization efforts around the project.

  • require project members to act in project interest
  • disallow corporate affiliation of contributions (i.e. work is done by individuals, not corporations)
  • allow for removal from all elected roles of project leadership
  • allow for removal from component/area ownership due to conflict of interest

PROBLEM: There is no consistent, documented process for rolling out new processes and major project changes (e.g., requiring two-factor auth, adding the approvers mechanism, moving code between repositories).

  • empower "PM Group" to make process and project changes

PROBLEM: Nobody has taken responsibility to think about and improve the structure of the project, processes, values, etc. A few people have been working on this part time, but it needs more and more consistent attention given the rate of growth of the project.

  • consider having a corporate dues collecting umbrella organization sponsor people to improve structure of project
  • establish corporate staffing requirements for contributor experience body under "PM Group" and technical steering committee

PROBLEM: [We're also] lacking people to drive, implement, communicate, and roll out improvements (and test, measure, rollback, etc.).

  • require metrics collection as part of policy and process changes
  • establish corporate staffing requirements for contributor experience body under "PM Group" and technical steering committee

PROBLEM: There isn't a sufficiently strong feedback loop between technical contributors/leadership and the PM group.

  • have "PM Group" and technical steering committee jointly determine release priorities
  • empower SIGs to draft their own priorities for each release
  • empower permanent body focused on release management to focus on releasing Kubernetes project continuously to reduce "fear of missing out" when developing features against a time bounded release window
  • allow "PM Group" and technical steering committee to veto proposed SIG priorities

PROBLEM: Nobody has taken responsibility for legal issues, license validation, trademark enforcement, etc.

  • use CNCF or create Kubernetes specific umbrella organization to handle legal issues

PROBLEM: Technical conventions/principles are not sufficiently documented.

  • empower contributor experience body under "PM Group" and technical steering committee to document project wide technical conventions
  • require changes to technical conventions to follow RFC process
  • require SIGs to document conventions/principles in their charter

PROBLEM: Development practices and conventions across our repositories are not consistent.

  • empower technical steering committee to create body focused on standardizing development practices and conventions

PROBLEM: Our communication media are highly fragmented, which makes it hard to understand past decisions.

  • make source control the canonical source of truth for all past decisions
  • ensure all policy and process changes also follow RFC process with results committed to source control
@bgrant0607

Thanks for the detailed proposal. I'm going to address points incrementally.

Structure of the project:

Agree with these points. I tried to document the structure in the governance.md PR. As for the Front Desk and mentorship, I think the challenge is to figure out the right incentives to make them happen.

@bgrant0607

I think the Rust RFC process could be a good starting point for formalizing our proposal process.

@bgrant0607

I agree areas of ownership of the project should be divided amongst SIGs.

kubernetes/community#402 (comment)

I also think there's an agreement that we need a concept of Technical Leads for SIGs. Whether one or multiple per SIG is TBD.

Yes, the "features" process needs improvement.

@bgrant0607

Comments on lazy consensus vs consensus seeking vs variants:

kubernetes/community#402 (comment)

Needs more work.

@bgrant0607

Agree that we need a regularly meeting Technical Oversight/Steering Committee. Actually, I like Steering, because it's more active and due to the Kubernetes tie-in. :-)

@bgrant0607

I think not all "policy" will fall under one SIG. At least PM, contributor experience, testing, scalability, API machinery, release, and governance are likely to own some policies.

@bgrant0607

Github permissions are an unworkable disaster. What "github administration" tasks did you have in mind?

@bgrant0607

+1 to "split technical, administrative, and product/project roles at SIG level"

@bgrant0607

"establish permanent bodies to handle testing" should be SIG Testing

@bgrant0607

+1 to "establish permanent body to handle moderation issues"

"establish staffing requirements for corporate inclusion in project materials (e.g https://kubernetes.io/partners/) to address gaps": Worth a try.

"establish body under permanent "PM Group" to aggregate feature requests": How do you imagine this would work and how it would differ from the current features process?

bounties: Hasn't worked in the past. k8sport is trying gamification. I think we should start with an effort to better curate which tasks we want picked up (desirable, straightforward, medium priority, etc.) and make areas in need of more contributors more visible.

@bgrant0607

"empower permanent bodies focused on release management and testing to directly influence SIG priorities": As we've seen with the "flaky test" discussions, there are details that need to be figured out regarding how we make that happen.

@bgrant0607

How do you see the "weather forecast" as different from what the PM group has been doing for K8s?

@bgrant0607

"require corporate staffing of body focused on contributor experience": Would love to figure out how to make that happen.

"require metrics...": While I desperately want metrics, I don't want the lack of metrics to necessary block efforts to stop the bleeding. If only a half a person is working on Contributor Experience, I have to work with what I can.

@bgrant0607

Agree we need to document a lot more decisions.

@bgrant0607

It's not practical to bottleneck all features on TSC approval.

I think SIG ownership of APIs is a reasonable goal, but there are some issues that need to be resolved: kubernetes/community#419 (comment)

@calebamiles
Author

I think the Rust RFC process could be a good starting point for formalizing our proposal process.

Will work with @pwittrock on trying to adapt Rust's RFC process to our project

I also think there's an agreement that we need a concept of Technical Leads for SIGs. Whether one or multiple per SIG is TBD.

I can imagine having both a "technical lead" for a single SIG and a "technical director" for multiple SIGs once we scale to needing the additional structure

Comments on lazy consensus vs consensus seeking vs variants:

kubernetes/community#402 (comment)

Needs more work.

I think we could decide that different decision making processes should be followed for different types of decisions and for different decision making bodies. I would think that a decision made by the community at large could follow a voting process whereas most SIGs which contribute to SCM could operate under a consensus seeking model. I think that many policy decisions could largely be made by a lazy consensus model with some fallback mechanism

I think not all "policy" will fall under one SIG. At least PM, contributor experience, testing, scalability, API machinery, release, and governance are likely to own some policies.

Agreed, but I also believe that there needs to be a single body responsible for serving as a policy clearinghouse, which should delegate policy for specific areas to a few cross-cutting SIGs

Github permissions are an unworkable disaster. What "github administration" tasks did you have in mind?

Given our development workflow I agree that GitHub permissions are not nearly granular enough. There's a lot of prior art where direct access to SCM is a general right of contributors, and I much prefer a "merge then review" model over a "review then merge" model, given the latter seems to incur a fair amount of administrative overhead (e.g. conducting timely code reviews) in practice

"establish body under permanent "PM Group" to aggregate feature requests": How do you imagine this would work and how it would differ from the current features process?

I would like use cases to be consolidated under a single body to

  • aggregate user requests
  • help guard against commercial conflicts of interest where a Product Manager from Company Y talks directly to a TL or developer also employed by Company Y so there's no transparency for anyone not employed by Company Y as to why a feature is important

In my ideal world I would imagine a relationship between the PM Group + TSC and a SIG to be similar to the relationship between the legislative and executive branches of the US government where the PM Group and TSC decide what's important across the project and the SIG agrees/vetoes the plan and is responsible for execution of the plan. This arrangement would require some mechanism for overriding a SIG veto and enforcing compliance

"empower permanent bodies focused on release management and testing to directly influence SIG priorities": As we've seen with the "flaky test" discussions, there are details that need to be figured out regarding how we make that happen.

Agreed, we need to hammer out some details. One idea is to give SIG Testing and a future SIG Release the power to make developers feel pain if project wide health drops below some threshold, as a "nuclear" option. I don't know yet how to make individual SIGs responsible for working on globally optimizing for project health (e.g. test flakes)

How do you see the "weather forecast" as different from what the PM group has been doing for K8s?

I think it would primarily signal uncertainty of deliverable dates while providing some insights into overall project direction. I think that monthly reports on changes which actually have landed would also be helpful

"require corporate staffing of body focused on contributor experience": Would love to figure out how to make that happen.

Sneak it into a Corporate CLA? 😄

It's not practical to bottleneck all features on TSC approval.

I think SIG ownership of APIs is a reasonable goal, but there are some issues that need to be resolved: kubernetes/community#419 (comment)

I wouldn't want to block users or developers from having access to features but I think the default enablement of features should be centralized. If members of the TSC cannot discharge their responsibility to steer the project we should have mechanisms for communicating no confidence from the community and changing the members.

@bgrant0607

One quick comment: Not sure whether it's your intention or not, but the outcome of "merge then review" would be that almost nothing got reviewed. 2x+ the rate of change we have now, with no additional reviewers, with nothing to force people to spend time reviewing, with as much noise as we have in github notifications today means it would be infeasible.
