
@cwgem
Created July 21, 2018 14:46
GitHub Issues Workflow Enhancement Proposal

Problem

GitHub has become a major platform for hosting code and making it more visible. Many projects hosted on GitHub rely on the Issues feature to let users report problems with the project. Unfortunately, the barrier to entry for posting an issue is low, so the amount of noise grows with the popularity of the project. A solid example of this getting out of hand is the Visual Studio Code issue tracker, with over 5000 issues in total.

While a new feature was added around May to provide an initial filter, it's not a complete solution, and it even offers an easy way to fall back to the original tracker. There is also a lack of rich metadata around issues. One can only look for labels (which have to be applied by someone with permissions) or attempt some sort of natural language parsing. This barrier means that automating issue filtering is not feasible for the majority of repository administrators.

The lack of structure after the initial bug category filter can make the issue reporting experience tedious. A user is often forced to parse issue templates and figure out what they actually need to adjust, what's relevant to the issue, and what's just commentary that they can delete. In many cases it's easier to simply delete the entire template and describe the issue in free form. Users need a richer experience for posting issues, one that improves the overall quality of the report.

User Stories

  • As a repository owner I want to enable automation powered by rich issue metadata to provide a way to do simple triage without requiring time from my contributors
  • As a project contributor I want a way to see what issues affect components I work on without requiring someone to apply labels first
  • As a repository owner I want a way to direct users through a workflow that brings the issue close to a high quality report
  • As a bug reporter I want interface components to direct me towards what I actually need to fill out

Solution 1: External Issue Tracker

Using an external issue tracker is a possibility for larger projects. Instead of using GitHub Issues the project can use bug tracking solutions which are more feature rich (Jira, Bugzilla, etc.) to filter their issues.

Pros

  • Software may provide rich metadata out of the box for automation
  • No customization required on the GitHub side
  • All features will be geared towards bug tracking instead of GitHub as a whole

Cons

  • Workflow breakup going back and forth between GitHub and the external system
  • Self-hosted solutions require time and maintenance to keep running, so a project would most likely need a hosted solution that meets a number of requirements
  • Depending on tracker functionality, yet another login for users to work with
  • A lot of tedious UI customization may be required to not overwhelm users

Solution 2: Machine Readable Bug Reports

A bug reporter formats their issue in a way that is machine readable such as JSON or YAML.
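As a rough illustration of what automation against such a report could look like, the sketch below parses a JSON issue body and checks it for a few fields. The field names (`component`, `version`, `steps_to_reproduce`) and the `parse_report` helper are assumptions made for this example; the proposal does not define a schema.

```python
import json

# Hypothetical required fields and their expected types (not defined by the proposal).
REQUIRED_FIELDS = {"component": str, "version": str, "steps_to_reproduce": list}


def parse_report(issue_body: str) -> tuple[dict, list[str]]:
    """Parse a JSON issue body and return (report, list of validation errors)."""
    try:
        report = json.loads(issue_body)
    except json.JSONDecodeError as exc:
        # A single formatting mistake invalidates the whole report.
        return {}, [f"body is not valid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(report, dict):
        return {}, ["top-level value must be an object"]

    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in report:
            errors.append(f"missing field: {name}")
        elif not isinstance(report[name], expected_type):
            errors.append(f"field {name} must be of type {expected_type.__name__}")
    return report, errors
```

Note how a single stray comma makes the whole body unparsable, which is the kind of mistake the cons below are concerned with.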

Pros

  • No need to adjust the existing Issues interface
  • Allows for automation against issue reports
  • Contributors may be able to develop tooling around it for their workflow

Cons

  • Enforcing validation means that a user may have to redo a substantial part of their report
  • A bug reporter could easily make a mistake while trying to format (CLI tooling could help them out with it)
  • More education is required for users to report their issues properly
  • Users may give up early reporting issues due to the unusual workflow process

Solution 3: Issue Workflow Enhancement

This solution would give repository owners and core contributors a way to use files in a machine-parsable format (JSON/YAML/etc.) to describe a workflow for their project's issue submission. The functionality around this solution would require a few base features:

  • A way to describe components in a way that easily maps to an HTML element (select, text, etc.)
  • Ability to enforce workflows as the only way to post issues
  • Triggers based on issue submission as a whole (mostly implemented)
  • Dynamic display of components based on selection (subcategory selection shown after category selection)
  • Ability to set required state for certain fields, including basing it on state of other fields
  • Allowing different component views and field editing based on very top level constraints (repository owner, user submitter, contributor, etc)
  • Unique identifiers for base elements providing a reliable method for accessing them
  • Grouping of elements (mostly for dynamic component display) with a unique identifier
  • Support for includes to enable shared components across different workflows (so you don't have to keep asking for the version of your app across every single workflow)
  • Enhancement: Ways to validate fields in certain ways including min/max length, regex, content type (numeric, float, string)
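A minimal sketch of how a few of these base features might fit together: unique identifiers, dynamic display based on another field, required state, and simple validation. The `Field` class, its attribute names, and the example workflow are all hypothetical and only serve to make the list above concrete.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Field:
    id: str                                    # unique identifier for the element
    kind: str                                  # "text", "select", ...
    required: bool = False
    show_if: Optional[tuple[str, str]] = None  # (other field id, value) that makes this field visible
    options: list[str] = field(default_factory=list)
    pattern: Optional[str] = None              # regex the whole value must match
    max_length: Optional[int] = None


def visible(f: Field, answers: dict) -> bool:
    """A field without a show_if condition is always displayed."""
    if f.show_if is None:
        return True
    other_id, value = f.show_if
    return answers.get(other_id) == value


def validate(fields: list[Field], answers: dict) -> list[str]:
    errors = []
    for f in fields:
        if not visible(f, answers):
            continue  # hidden fields are never required or validated
        value = answers.get(f.id, "")
        if not value:
            if f.required:
                errors.append(f"{f.id}: required")
            continue
        if f.kind == "select" and value not in f.options:
            errors.append(f"{f.id}: not an allowed option")
        if f.max_length is not None and len(value) > f.max_length:
            errors.append(f"{f.id}: longer than {f.max_length} characters")
        if f.pattern and not re.fullmatch(f.pattern, value):
            errors.append(f"{f.id}: does not match pattern")
    return errors


# Hypothetical workflow: the engine version is only asked for engine bugs.
BUG_WORKFLOW = [
    Field("category", "select", required=True, options=["api", "engine", "client"]),
    Field("engine_version", "text", required=True,
          show_if=("category", "engine"), pattern=r"\d+\.\d+\.\d+"),
    Field("summary", "text", required=True, max_length=80),
]
```

With this definition, an `api` report needs only a category and a summary, while an `engine` report is rejected until a version such as `1.4.2` is supplied.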

Essential components for the new workflow process would include:

  • Basic text fields for mostly free entry (name of package)
  • Custom GitHub fields, such as user selection for assignment or related bug/PR linking
  • Input fields for large amounts of text (steps to reproduce, debug pastes)
  • Drop down selection for a specific range of selections (engine version, category)
  • Multi-selection for "all that apply" type situations
  • Checklists to ensure users have taken proper steps before posting the issue (searched existing reports, attempted with latest version, ran test suites to ensure sane environment)
  • Nice to have: File upload fields (uploading logs, Dockerfiles, code files)

Format Considerations

The file description of what elements constitute a workflow and how elements may interact with each other would need to have the following properties:

  • Basic data types (arrays, objects, strings, integers)
  • Mostly human readable and yet machine parsable
  • Somewhat compact so as not to make repositories too bulky
  • Support of relationships
  • Easy escaping options in case of lengthy text content or nested components (JSON in JSON for example)
  • Existing support in languages (you'd want to avoid a custom format here unless you plan to add language support)

The filesystem storage of these could mostly take off from the existing issue template layout. Everything would exist under the .github folder somewhat like:

.github
|_ ISSUE_WORKFLOWS
   |_ COMMON
      |_ basic_info.workflow
   |_ bugs.workflow
   |_ bugs_WORKFLOWS
      |_ api.workflow
      |_ engine.workflow
      |_ client.workflow
   |_ enhancements.workflow
   |_ documentation.workflow

ISSUE_WORKFLOWS is the root of the entire workflow process. When filing an issue the UI would look at what's under ISSUE_WORKFLOWS and present all .workflow files as a sort of top-level filter. Selecting a workflow that has a _WORKFLOWS folder with the same prefix would repeat the process recursively. To prevent too much screen switching, workflow authors may instead use dynamic view components to change the workflow UI based on the selected category.
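The recursive lookup could be sketched roughly as follows. The `discover` function is a hypothetical helper; it assumes the naming convention from the layout above, where `bugs.workflow` pairs with a sibling `bugs_WORKFLOWS` folder.

```python
from pathlib import Path


def discover(folder: Path) -> dict:
    """Return {workflow name: nested sub-workflows} for one *_WORKFLOWS folder."""
    tree = {}
    # Only .workflow files directly inside the folder are offered as choices,
    # so the COMMON subfolder is never listed as a workflow.
    for path in sorted(folder.glob("*.workflow")):
        name = path.stem  # "bugs.workflow" -> "bugs"
        children = folder / f"{name}_WORKFLOWS"
        tree[name] = discover(children) if children.is_dir() else {}
    return tree
```

For the layout above, calling `discover` on the `ISSUE_WORKFLOWS` folder would yield `bugs`, `enhancements`, and `documentation` at the top level, with `api`, `engine`, and `client` nested under `bugs`.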

To reduce repetition of certain components, COMMON folders would be allowed in any _WORKFLOWS directory, exposing shared components. These imports would be exposed in a cascading manner, so top-level shared components would be available to child workflows. Aliasing and namespaces would keep imports unique should they share the same name.
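One possible reading of the cascading behavior, sketched with Python's `ChainMap`. The shadowing rule (the nearest COMMON folder wins for an unqualified name) and the `namespace.name` qualified form are assumptions made for this example; the proposal leaves the exact aliasing semantics open.

```python
from collections import ChainMap


def resolve_common(scopes: list[tuple[str, dict]]) -> ChainMap:
    """Build the shared components visible to a workflow.

    scopes is ordered from the top-level COMMON folder down to the one
    closest to the workflow, as (namespace, {component name: definition}).
    """
    layers = []
    for namespace, components in scopes:
        layer = dict(components)  # unqualified names
        # Qualified names keep same-named components from different levels reachable.
        layer.update({f"{namespace}.{name}": c for name, c in components.items()})
        layers.append(layer)
    # ChainMap searches its maps left to right, so the innermost scope goes first.
    return ChainMap(*reversed(layers))
```

A child workflow would then see everything its ancestors share, while still being able to reach a shadowed component through its qualified name.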

Frontend Considerations

All workflow components should map to a specific HTML element and nothing else. All text fields and input fields should allow GitHub Flavored Markdown for formatting purposes. There should be a system-imposed limit on the maximum number of components per workflow to prevent a malicious user from, say, creating an issue workflow with 20,000 text input fields (effectively bringing less powerful systems to their knees).
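A small sketch of the one-component-to-one-element mapping with a component cap. The component kinds, the dictionary shape, and the `MAX_COMPONENTS` value are hypothetical; the proposal only states that such a limit should exist.

```python
from html import escape

MAX_COMPONENTS = 200  # hypothetical system-imposed limit


def render_component(component: dict) -> str:
    """Map one workflow component to exactly one HTML element."""
    cid = escape(component["id"], quote=True)
    kind = component["kind"]
    if kind == "text":
        return f'<input type="text" id="{cid}" name="{cid}">'
    if kind == "textarea":
        return f'<textarea id="{cid}" name="{cid}"></textarea>'
    if kind == "checkbox":
        return f'<input type="checkbox" id="{cid}" name="{cid}">'
    if kind == "select":
        options = "".join(
            f"<option>{escape(option)}</option>" for option in component.get("options", [])
        )
        return f'<select id="{cid}" name="{cid}">{options}</select>'
    # Anything else is rejected rather than rendered as arbitrary markup.
    raise ValueError(f"unsupported component kind: {kind}")


def render_workflow(components: list[dict]) -> str:
    if len(components) > MAX_COMPONENTS:
        raise ValueError(f"workflow has more than {MAX_COMPONENTS} components")
    return "\n".join(render_component(c) for c in components)
```

Rejecting unknown kinds and escaping all author-supplied text keeps the workflow file from injecting arbitrary markup into the issue form.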

Collaborators should have the ability to show/hide individual components. For example, if the bug workflow requests verbose log output that's deemed unnecessary for the issue at hand, they can simply hide that particular component. As far as permissions go, those with full access to edit issues can edit any field they want. Granular permissions should also be possible, allowing only a select set of fields to be modified (this would be different from the top-level constraints on fields, allowing for locking down based on business concerns/compliance purposes).
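To make the distinction between full access and granular permissions concrete, here is a minimal sketch. The role names, field ids, and policy shape are invented for illustration and are not part of the proposal.

```python
# Hypothetical policy: roles and field ids are illustrative only.
POLICY = {
    "full_access_roles": {"owner"},
    "editable_by": {
        "contributor": {"category", "engine_version"},
        "reporter": {"summary", "steps_to_reproduce"},
    },
}


def can_edit(role: str, field_id: str, policy: dict = POLICY) -> bool:
    """Full-access roles may edit everything; other roles only their listed fields."""
    if role in policy["full_access_roles"]:
        return True
    return field_id in policy["editable_by"].get(role, set())


def visible_fields(report: dict, hidden: set) -> dict:
    """Return the report without the components a collaborator has hidden."""
    return {key: value for key, value in report.items() if key not in hidden}
```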

Backend Storage Considerations

This is probably the more interesting part of the solution. I believe that storing the submission as a JSON blob and associating the issue with it (or having it as a property of the issue) would be the best option. While it would be possible to map the fields onto relational database tables, the freeform nature of the workflow would make that sort of solution highly complex. JSON blobs would preserve the object structure of the submission, which makes querying them more feasible.
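A minimal sketch of the blob approach, using an in-memory SQLite table with the metadata stored as JSON text. The table layout and the `create_issue` and `find_issues` helpers are assumptions for this example; a production system would more likely push the filter into the database's own JSON functions instead of filtering in application code.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT, metadata TEXT)")


def create_issue(title: str, metadata: dict) -> int:
    """Store the workflow answers as one JSON blob attached to the issue."""
    cursor = conn.execute(
        "INSERT INTO issues (title, metadata) VALUES (?, ?)",
        (title, json.dumps(metadata)),
    )
    return cursor.lastrowid


def find_issues(**criteria) -> list[int]:
    """Return ids of issues whose metadata matches every key/value given."""
    matches = []
    for issue_id, blob in conn.execute("SELECT id, metadata FROM issues ORDER BY id"):
        metadata = json.loads(blob)
        if all(metadata.get(key) == value for key, value in criteria.items()):
            matches.append(issue_id)
    return matches
```

Because each workflow can define different fields, no schema migration is needed when a repository owner adds or removes a component.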

Implementation Reference

The AWS CloudFormation service is an example of machine parsable formats mapping to UI units, as well as grouping and element validation.

Note that this is only a reference and does not endorse JSON being used as the implementation format.

Pros

  • Higher quality bug reports
  • Enablement of automation and tooling around bug reports through rich metadata
  • Bug reporters are presented with an easier-to-digest workflow which assures them their bug report will contain the minimum information needed
  • End result would be simple files to keep in the GitHub repository
  • Much more feasible to convert issues to feature rich bug tracking software issues
  • Visual mapping to metadata could define a way for contributors to hide certain elements of the report (they could hide version output for example if it was deemed unnecessary in reasoning about the issue)

Cons

  • Substantial changes to the codebase
  • API automation around issue creation would require additional modifications to support the custom workflow process
  • Any sort of tooling around reporting automation would not work properly without adjustments (GitHub Desktop for example)
  • Server side validation would become more complex in order to ensure a valid workflow
  • The bootstrapping of workflows would require substantial time to set up for the project owner/core contributor
  • Some level of lock in if a user wanted to migrate their issues to a system that didn't support it (this might be workable with an API call that takes all the metadata and formats it out to a single comment type format)
  • Issue migration tools would require high levels of customization to properly export issues over
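The lock-in concern could be softened by flattening the metadata into a single comment, as suggested above. The sketch below shows one way that export might look; `metadata_to_comment` and the sample field names are hypothetical, not an existing GitHub API.

```python
def metadata_to_comment(metadata: dict) -> str:
    """Flatten workflow metadata into plain text suitable for a single comment."""
    lines = []
    for key, value in metadata.items():
        label = key.replace("_", " ").capitalize()
        if isinstance(value, list):
            # Multi-value fields become a labeled bullet list.
            lines.append(f"{label}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{label}: {value}")
    return "\n".join(lines)
```

The output loses the machine-readable structure, but it keeps every answer readable in a tracker that only supports free-form text.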