The Azure SDK GitHub repository experiences recurring support challenges, notably repetitive customer inquiries and inconsistent responsiveness across internal teams. Users submit issues already addressed in documentation or previously resolved issues, contributing to longer response times and unnecessary workloads. Additionally, engagement levels vary significantly among the teams responsible for different Azure SDK packages, leading to uneven customer experiences.
We propose implementing AI-driven support automation to streamline issue management and improve customer satisfaction. This solution will reduce time-to-resolution, deliver a consistent support experience, and enable engineering teams to focus on high-impact development efforts. Other key benefits include:
- Improved automated triage classification of incoming GitHub issues.
- Detection and intelligent response to common, previously solved questions.
- Proactive redirection to relevant documentation and resources.
- Increased focus on unresolved issues, improving team responsiveness.
- Implementation of feedback mechanisms to measure customer sentiment and support quality.
The Azure SDK is developed and maintained in a monorepo on GitHub, serving a large and diverse developer community. However, the support process currently faces several key challenges:
- **Recurring questions**: Many support requests involve questions or problems that have already been addressed in existing documentation, FAQs, or past GitHub issues. Users frequently open new issues without reviewing these resources, resulting in unnecessary duplication and increased demand on engineering teams.
- **Inconsistent team responsiveness**: Engagement levels vary across internal teams maintaining different Azure SDK packages. While some teams respond promptly, others may leave customer issues unresolved for extended periods, contributing to a fragmented and unpredictable customer experience.
- **Inefficient use of engineering resources**: Engineers spend significant time addressing repetitive questions and simple support requests that could be handled more efficiently, diverting focus away from complex development and higher-value customer needs.
- Improve the accuracy of automated triage by leveraging AI for label prediction, replacing the outdated ML models used by our current automation.
- Update issues with an immediate solution as part of initial triage for recurring questions and those with well-known outcomes. This will greatly improve time to resolution for customers and avoid disrupting engineering teams with unplanned work. Roughly 15% of incoming issues are estimated to fall into this category.
- Provide a suggestion for next steps as part of initial triage when AI-generated content is high-quality and relevant but we are not confident that it is the solution. This will improve the perceived responsiveness to issues and potentially allow the issue author to unblock themselves while they wait for human assistance. The target is to provide AI-generated suggestions for 20% of the unanswered incoming issues.
- Encourage issue authors to register feedback on the quality of predicted solutions and suggestions. Leverage feedback and sentiment to continually improve outcomes by refining the knowledge base, tuning confidence thresholds, and evaluating the success of automated assistance.
- Increase engineering productivity by freeing engineers from answering recurring issues and those with well-known outcomes. Delegating these issues to AI will give our engineers greater focus on development efforts.
The knowledge base will focus on quality over quantity, leveraging a curated set of sources with trusted and authoritative answers to Azure SDK inquiries. Over time, we will continue to refine how we identify authoritative answers and build on that to grow the knowledge base using additional resources.
By explicitly curating the knowledge base rather than including a wider diversity of sources, we can ensure high-quality AI-generated content. This will give us confidence in the generated content and lower the risk that official Microsoft-provided responses to an issue are incorrect. With higher-quality answers, we can achieve our goal of quicker response times by avoiding the need for a human subject matter expert to review AI-generated content before it is shared on issues.
To ensure that we offer developers a quality support experience end-to-end, any AI-generated content must be relevant to the issue. Developers will have a better support experience waiting for a human subject matter expert than wading through low-quality or irrelevant AI content.
To enable automation to evaluate AI-generated content and filter out content that does not meet the expected quality bar, a means of judging confidence is needed. This rating will take multiple aspects into account, such as the content's relevance to the issue, the quality of its structure, and how likely it is to be an exact solution to the issue.
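As a minimal sketch, one way such a rating could be computed is a weighted combination of the individual signals; the signal names and weights below are illustrative assumptions to be tuned with prototype feedback, and the upstream evaluation that produces the signals (e.g., an LLM-based grader) is not shown.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceSignals:
    """Individual quality signals, each normalized to the range [0, 1].

    These signal names are illustrative assumptions; an upstream
    evaluation step would produce them.
    """
    relevance: float           # how relevant the content is to the issue
    structure_quality: float   # how well-formed the generated content is
    solution_likelihood: float # how likely the content is an exact solution


def confidence_score(signals: ConfidenceSignals) -> float:
    """Combine signals into a single confidence rating in [0, 1].

    A weighted average is the simplest approach; the weights below
    are placeholders to be tuned during the prototype.
    """
    return (
        0.4 * signals.relevance
        + 0.2 * signals.structure_quality
        + 0.4 * signals.solution_likelihood
    )
```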
We will leverage the Azure AI Content Safety offering to reduce the risk of prompt injection and inappropriate AI-generated content.
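A minimal sketch of a moderation gate using the `azure-ai-contentsafety` package follows; the endpoint, key, and severity threshold are placeholders, and the service's Prompt Shields capability for prompt-injection detection is a separate path not shown here.

```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key; real values come from the service deployment.
client = ContentSafetyClient(
    endpoint="https://<resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<api-key>"),
)


def is_safe(text: str, max_severity: int = 0) -> bool:
    """Return False if any harm category exceeds the allowed severity.

    The severity threshold is an assumption to be tuned; prompt-injection
    detection (Prompt Shields) would be an additional check.
    """
    result = client.analyze_text(AnalyzeTextOptions(text=text))
    return all(
        (analysis.severity or 0) <= max_severity
        for analysis in result.categories_analysis
    )
```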
As part of issue triage, use the issue details as the basis for AI-based categorization, predicting the labels to associate with the issue from the knowledge base. Each prediction should be associated with a confidence rating. If the prediction meets or exceeds the confidence threshold, consider the labels to be the correct categorization for the issue and safe to use for automatic triage; otherwise, ignore the prediction and leave the issue uncategorized.
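A minimal sketch of this threshold gate, assuming the label prediction itself (e.g., from a model call against the knowledge base) is produced elsewhere; the threshold value is a placeholder.

```python
from dataclasses import dataclass

# Assumed starting point; tuned with prototype feedback.
LABEL_CONFIDENCE_THRESHOLD = 0.8


@dataclass
class LabelPrediction:
    labels: list[str]
    confidence: float


def triage_labels(prediction: LabelPrediction) -> list[str] | None:
    """Apply the predicted labels only when confidence clears the bar.

    Returning None signals that the issue should fall back to manual
    triage with no AI-generated content added.
    """
    if prediction.confidence >= LABEL_CONFIDENCE_THRESHOLD:
        return prediction.labels
    return None
```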
If the issue remains uncategorized by AI label prediction, issue triage will be performed manually. No AI-generated content will be added to the issue. Follow the remaining steps of the triage process.
If labels were confidently predicted, use them and the issue details as the basis for AI generation of a solution to the issue from the knowledge base. The content should be associated with a confidence rating. Evaluate the generated content as follows (a sketch of the decision flow appears after this list):
- If the content meets or exceeds the confidence threshold for a solution to the issue, add it as a comment and mark the issue as addressed. Follow the remaining steps of the triage process.
- If the confidence was not high enough for a solution but meets or exceeds the suggestion threshold, add it as a comment and mark the issue as needing human follow-up. Follow the remaining steps of the triage process.
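A minimal sketch of the two-threshold evaluation described above; the threshold values and outcome names are placeholders to be tuned during the prototype.

```python
from enum import Enum

# Assumed starting thresholds; the suggestion bar sits below the solution bar.
SOLUTION_THRESHOLD = 0.9
SUGGESTION_THRESHOLD = 0.7


class TriageOutcome(Enum):
    SOLUTION = "issue-addressed"         # comment posted, issue marked addressed
    SUGGESTION = "needs-human-follow-up" # comment posted, human follow-up needed
    NO_COMMENT = "manual-triage"         # content discarded, nothing posted


def classify_generated_content(confidence: float) -> TriageOutcome:
    """Map a content confidence rating onto the triage outcomes above."""
    if confidence >= SOLUTION_THRESHOLD:
        return TriageOutcome.SOLUTION
    if confidence >= SUGGESTION_THRESHOLD:
        return TriageOutcome.SUGGESTION
    return TriageOutcome.NO_COMMENT
```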
In order to gather feedback and tune results, we'll first launch a prototype using the .NET and Python repositories. These were chosen because they are heavily monitored and issue triage is consistently performed by the same people. This will allow us to see consistent feedback as we make adjustments and ensure that AI-related mistakes are corrected quickly.
Because the Azure SDK repositories rely on custom automation to apply a number of rules and workflows, we'll want to ensure that the AI integrations live in a stand-alone service which can be customized by other teams. This will allow it to be leveraged without our specific automation.
To aid other teams, we will also create:
- A TypeSpec definition of the AI integration service contract (a hypothetical sketch of the contract's shape follows this list)
- A Bicep file that automates creation of the Azure resources needed
- A set of GitHub action definitions to use as a template for integration into other repositories
- An example repository that demonstrates a basic GitHub Action-based integration
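To make the intended contract concrete, here is a hypothetical sketch of what the service's triage endpoint might look like, using FastAPI purely for illustration; the actual contract will be defined in TypeSpec, and all endpoint, field, and type names here are assumptions.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class TriageRequest(BaseModel):
    title: str
    body: str
    repository: str


class TriageResponse(BaseModel):
    labels: list[str]
    label_confidence: float
    generated_content: str | None = None   # solution/suggestion text, if any
    content_confidence: float | None = None


@app.post("/issues:triage", response_model=TriageResponse)
def triage_issue(request: TriageRequest) -> TriageResponse:
    # Stub implementation; the real service would run label prediction
    # and content generation against the knowledge base.
    return TriageResponse(labels=[], label_confidence=0.0)
```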
- Collect customer feedback via GitHub by encouraging developers to upvote or downvote the AI-generated content as part of the associated comment (the first sketch after this list shows how these reactions could be tallied). This channel of feedback is monitored by our product management team and included in repository health tracking.
- Track the count of issues being reopened after automatic resolution. Our "issue addressed" workflow provides the ability to mark an issue as unresolved via a slash command, with instructions to do so provided as part of resolution. If AI-generated content was inadequate to resolve an issue, it is expected that developers would use this mechanism to engage for additional assistance. This would require additional tracking by our product management team.
- For issues with AI-generated content (solutions and suggestions), analyze the other issue comments using sentiment analysis (the second sketch after this list). This would provide an indicator of whether developers engaging with the issue feel positively or negatively. This would require additional tooling to be developed as well as additional tracking by our product management team.
- Ensure that AI inquiries not meeting the quality bar are logged and can be reviewed by the development team to track the average confidence of generation over time and capture the success rate. This will provide insight into the quality of the knowledge base and the algorithm used for confidence rating, with the intent to inform performance tuning.
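A sketch of tallying up/down votes on the bot's comment via the GitHub REST API reactions endpoint, assuming the `requests` package and a token with read access to the repository; pagination and error handling are omitted.

```python
import requests


def feedback_counts(owner: str, repo: str, comment_id: int, token: str) -> dict:
    """Count +1 / -1 reactions on an issue comment (first page only)."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/issues/comments/{comment_id}/reactions"
    )
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    }
    reactions = requests.get(url, headers=headers, timeout=30).json()
    counts = {"up": 0, "down": 0}
    for reaction in reactions:
        if reaction["content"] == "+1":
            counts["up"] += 1
        elif reaction["content"] == "-1":
            counts["down"] += 1
    return counts
```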
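And a sketch of the sentiment-analysis tooling, assuming the Azure AI Language service via the `azure-ai-textanalytics` package; the endpoint and key are placeholders.

```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key; real values come from the service deployment.
client = TextAnalyticsClient(
    endpoint="https://<resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<api-key>"),
)


def comment_sentiments(comments: list[str]) -> list[str]:
    """Return 'positive' / 'neutral' / 'negative' / 'mixed' per comment."""
    results = client.analyze_sentiment(documents=comments)
    return [doc.sentiment for doc in results if not doc.is_error]
```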
- Act on feedback from the prototype to tune confidence thresholds
- Explore more robust algorithms for a confidence score
- Migrate to an AppConfiguration-based system
- Include additional issue categorization (bug, question, feature-request)
- Allow for more advanced experimentation, such as using a specific configuration for 20% of incoming issues (see the sketch below)
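A sketch of how issues might be deterministically bucketed into an experimental configuration; the percentage and hashing scheme are illustrative assumptions.

```python
import hashlib


def use_experimental_config(issue_number: int, percent: int = 20) -> bool:
    """Stable assignment: the same issue always lands in the same bucket."""
    digest = hashlib.sha256(str(issue_number).encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100 < percent
```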