Brainstorm document — April 2026
Agent skills today are agent-generated through user interaction and by service SMEs, producing agent-consumable markdown grounded in Microsoft Learn documentation. This works well for individual skills, but leads to inconsistent content coverage across services — some skills have deep SDK references and troubleshooting tables while others have minimal coverage. The Microsoft Learn MCP tools (`microsoft_docs_search`, `microsoft_docs_fetch`, `microsoft_code_sample_search`) already exist and return rich, structured content — but they are used inconsistently in skill authoring today: only a few skills leverage them, while most rely on manual knowledge capture.
Flip the authoring model: start from Learn docs and templates. Use the Learn MCP tools as the primary input to a skill generator, supplemented by domain-specific MCP tools for dynamic content generation, to produce structured skill packages.
A skill-generation CLI that assists domain-specific skill authors (e.g., Azure Functions, AKS, Cosmos DB SMEs) in producing consistent, high-quality skills — and enables customers to create their own custom skills by combining Microsoft Learn content with their own knowledge bases through their own MCP servers.
- Keep skills grounded in official documentation — Every code sample, best practice, and troubleshooting pattern traces back to a Learn MCP source, ensuring accuracy and auditability
- Produce token-efficient, compacted content — Learn MCP tools today return full article content (e.g., `microsoft_docs_fetch` returns 36KB+ per page). The CLI should distill this into compacted, agent-consumable references — extracting only the relevant patterns, tables, and snippets rather than embedding entire articles
- Accelerate skill authoring for domain SMEs — Reduce time-to-first-draft by providing a scaffold pre-populated with Learn content, so authors spend time on secret sauce (orchestration, safety patterns) rather than manual documentation gathering
- Enable customers to create custom skills — Provide a CLI that customers can use to generate skills combining Microsoft Learn content with their own knowledge bases through their MCPs
- Consistent content coverage across services — Every generated skill starts from the same Learn-powered discovery process, eliminating gaps where one service has deep SDK references while another has minimal coverage
- Enforce structural consistency — Generated skills follow the same template (frontmatter, Quick Reference, When to Use, MCP Tools, Workflow, Error Handling), token budgets, and progressive disclosure patterns regardless of author
- Support multi-domain skill generation — Not limited to Azure; the same CLI works across .NET, M365, Power Platform, SQL, Security, and any domain Microsoft Learn covers
- Composable with domain-specific MCP tools — The CLI uses Learn as the foundation but plugs in domain-specific tools (`bicepschema_get`, `wellarchitectedframework`, customer MCPs) for specialized content, if the tools are present in the generation context (Copilot CLI, GitHub repo)
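As a minimal sketch of the compaction goal above, assuming markdown input with `##` section headings: `compact_article` and `estimate_tokens` are hypothetical helpers (a crude chars-per-token heuristic, not a real tokenizer, and not part of any Learn MCP tool), kept only to show the distill-to-budget idea.

```python
import re

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (assumption, not a real tokenizer).
    return len(text) // 4

def compact_article(markdown: str, topics: list[str], budget_tokens: int) -> str:
    """Keep only '## ' sections that mention a requested topic, until the budget is spent."""
    sections = re.split(r"(?m)^(?=## )", markdown)
    kept, used = [], 0
    for section in sections:
        if not any(t.lower() in section.lower() for t in topics):
            continue  # drop sections irrelevant to the skill being generated
        cost = estimate_tokens(section)
        if used + cost > budget_tokens:
            break  # budget exhausted; remaining content stays a runtime fetch
        kept.append(section.strip())
        used += cost
    return "\n\n".join(kept)

# Toy stand-in for a fetched Learn article.
article = (
    "## Retry policies\nUse exponential backoff.\n"
    "## Pricing\nSee the pricing page.\n"
    "## Managed identity auth\nPrefer DefaultAzureCredential.\n"
)
print(compact_article(article, ["retry", "auth"], budget_tokens=50))
```

The real generator would work on actual `microsoft_docs_fetch` output and a real tokenizer; this only illustrates the keep-relevant, trim-to-budget shape.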
- 24 skills with hundreds of manually-curated reference files (Bicep, Terraform, SDK, troubleshooting, best practices)
- A few skills actively use `microsoft_docs_search`/`microsoft_docs_fetch` at runtime
- Most content is static markdown with hardcoded `learn.microsoft.com` URLs
- SDK reference files are generated from the upstream microsoft/skills repo
- No content generation pipeline from Learn exists today
| Content Type | Right Tool | Learn MCP Role |
|---|---|---|
| Bicep/Terraform snippets | `bicepschema_get` | ❌ Not Learn's job — use schema tools |
| SDK quick-references | `microsoft_code_sample_search` | ✅ Primary source |
| Troubleshooting tables | `microsoft_docs_search` + `_fetch` | ✅ Primary source |
| Auth best practices | `microsoft_docs_fetch` (specific URLs) | ✅ Primary source |
| CLI references | `microsoft_docs_search` | ✅ Good supplement |
| KQL patterns | `microsoft_docs_search` | ✅ Good supplement |
| Naming conventions | `microsoft_docs_fetch` | ✅ Proven pattern (enterprise-infra-planner) |
| Architecture guidance | `wellarchitectedframework_*` + Learn | ✅ Combination |
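The routing above could be captured as a small lookup table in the generator; `ROUTES` and `pick_tool` are hypothetical names, and the role strings are shorthand for the table's third column.

```python
# Hypothetical lookup mirroring the routing table: content type → (tool, Learn MCP role).
ROUTES = {
    "iac_snippets":        ("bicepschema_get", "not Learn's job"),
    "sdk_quick_reference": ("microsoft_code_sample_search", "primary source"),
    "troubleshooting":     ("microsoft_docs_search + _fetch", "primary source"),
    "auth_best_practices": ("microsoft_docs_fetch", "primary source"),
    "cli_reference":       ("microsoft_docs_search", "good supplement"),
    "kql_patterns":        ("microsoft_docs_search", "good supplement"),
    "naming_conventions":  ("microsoft_docs_fetch", "proven pattern"),
    "architecture":        ("wellarchitectedframework_* + Learn", "combination"),
}

def pick_tool(content_type: str) -> str:
    """Return the preferred tool name for a given content type."""
    tool, _role = ROUTES[content_type]
    return tool

print(pick_tool("sdk_quick_reference"))  # microsoft_code_sample_search
```

Encoding the routing as data (rather than prose in a prompt) lets the generator validate that every planned section has a known source before any tool calls are made.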
Based on the existing azure-enterprise-infra-planner skill, which already uses this pattern:
- Input: "Plan Azure infrastructure for hub-spoke network with private endpoints"
- `microsoft_docs_search` → discover available content (networking, VNets, firewalls, private endpoints)
- `microsoft_docs_fetch` → fetch full articles for naming rules and architecture patterns
- `bicepschema_get` → get IaC schema for VNets, subnets, firewalls, private endpoints
- `wellarchitectedframework` → get WAF service guide for reliability, security, performance
- `get_azure_bestpractices_get` → get best practices for code generation and deployment
- Secret sauce: plan schema, pairing checks, resource constraints, WAF checklist, verification
- Output: infrastructure plan JSON + Bicep or Terraform code
- Input: "Create skill for Azure Storage Blob SDK"
- `microsoft_docs_search("Azure Blob Storage SDK overview authentication patterns")` → discover SDK documentation, auth guidance, feature coverage
- `microsoft_code_sample_search("azure-storage-blob upload download", language="python")` → pull Python samples
- `microsoft_code_sample_search("azure-storage-blob upload download", language="javascript")` → pull JS/TS samples
- `microsoft_code_sample_search("azure-storage-blob upload download", language="csharp")` → pull C# samples
- `microsoft_docs_search("Azure Blob Storage best practices performance tuning")` → pull optimization guidance
- `microsoft_docs_fetch` → fetch full articles for deep patterns (retry policies, connection pooling, managed identity auth)
- Secret sauce: deduplicate snippets across languages, normalize auth patterns to `DefaultAzureCredential`, map to skill sections, apply token budget
- Output: skill package with per-language SDK reference files, auth best practices, and common operation patterns
The SDK example highlights how `microsoft_code_sample_search` becomes the primary driver — called multiple times with language filters to build out the `references/sdk/` tree. Learn docs fill in the conceptual gaps (when to use which API, performance tuning, error handling strategies) that code samples alone don't cover.
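The per-language loop plus deduplication step could be sketched as follows; `fake_search` is a stub standing in for `microsoft_code_sample_search`, and dedup-by-title is an assumed heuristic for collapsing near-duplicate snippet variations.

```python
from typing import Callable

def collect_sdk_samples(
    search: Callable[[str, str], list[dict]],
    query: str,
    languages: list[str],
) -> dict[str, list[dict]]:
    """Call the (stand-in) sample search once per language and drop
    snippets that share a title within each language."""
    tree: dict[str, list[dict]] = {}
    for lang in languages:
        seen, unique = set(), []
        for sample in search(query, lang):
            key = sample["title"].lower()
            if key in seen:
                continue  # variation of a pattern we already kept
            seen.add(key)
            unique.append(sample)
        tree[lang] = unique
    return tree

# Stub standing in for microsoft_code_sample_search (real results are richer).
def fake_search(query: str, language: str) -> list[dict]:
    return [
        {"title": "Upload a blob", "code": "..."},
        {"title": "upload a blob", "code": "..."},  # duplicate variation
        {"title": "Download a blob", "code": "..."},
    ]

tree = collect_sdk_samples(fake_search, "azure-storage-blob upload download",
                           ["python", "javascript", "csharp"])
print({lang: len(samples) for lang, samples in tree.items()})
# {'python': 2, 'javascript': 2, 'csharp': 2}
```

A real implementation would rank duplicates (e.g., prefer the newest programming model) rather than keeping the first; this only shows the fan-out/collapse shape of the `references/sdk/` build.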
A complex, multi-concern skill that spans infrastructure, code samples, deployment workflows, and troubleshooting. Content breaks down roughly as:
- ~40% Learn docs searchable (concepts, hosting plans, runtimes, troubleshooting)
- ~20% Code sample searchable (triggers, SDK patterns, Durable patterns)
- ~15% Bicep schema derivable (IaC resource definitions)
- ~25% Secret sauce (orchestration, decision trees, safety patterns, agent UX)
From `microsoft_docs_search` — conceptual foundations:
| Content | Learn Query |
|---|---|
| Hosting plans comparison (Flex Consumption, Y1, Premium, Dedicated) | "Azure Functions hosting plans comparison" |
| Runtime stacks & version mappings (Node 20, Python 3.11, .NET 8) | "Azure Functions supported languages runtimes" |
| Durable Functions patterns (fan-out/fan-in, chaining, human interaction) | "Durable Functions application patterns" |
| Deployment slots guidance (which plans support them) | "Azure Functions deployment slots" |
| Durable Task Scheduler vs Azure Storage backend | "Durable Task Scheduler overview" |
| AZD deployment common errors | "Azure Developer CLI common errors troubleshooting" |
From `microsoft_code_sample_search` — multi-language trigger samples:
| Content | Learn Query |
|---|---|
| HTTP trigger (v4 model, streaming, App Insights) | ("Azure Functions HTTP trigger", language="javascript") |
| Timer trigger (v2 decorators, retry policies, cron) | ("Azure Functions timer trigger cron", language="python") |
| Service Bus queue/topic triggers (managed identity) | ("Azure Functions Service Bus trigger", language="python") |
| Durable orchestrator (chaining, compensation, timeouts, retry) | ("Durable Functions orchestrator activity", language="javascript") |
| Blob trigger (EventGrid source, SDK binding) | ("Azure Functions blob trigger", language="javascript") |
Each query returns ~10 production-ready snippets covering latest programming models (v4 Node.js, Python v2 decorators, new Durable Task SDK).
⚠️ Caveat: `microsoft_code_sample_search` URL accuracy per language

The code snippets returned by `microsoft_code_sample_search` are accurate and language-correct. However, the accompanying Learn URLs can be inaccurate per language. Many Learn articles use pivot-based language tabs, and the URLs returned sometimes contain malformed pivot parameters (e.g., `?pivots=programming-language-javascript%20programming-language-typescript` with a space-separated multi-pivot). When opened in a browser, these URLs fall back to the default language pivot (often Python), not the language of the returned snippet.
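One possible mitigation, sketched with the standard library: rewrite the returned URL's pivot parameter to match the snippet's language. `fix_pivot_url` is a hypothetical helper; the example URL mirrors the malformed multi-pivot shape described above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

def fix_pivot_url(url: str, snippet_language: str) -> str:
    """Force the ?pivots= parameter to the snippet's language,
    collapsing malformed space-separated multi-pivot values."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["pivots"] = [f"programming-language-{snippet_language}"]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))

bad = ("https://learn.microsoft.com/azure/azure-functions/functions-reference-node"
       "?pivots=programming-language-javascript%20programming-language-typescript")
print(fix_pivot_url(bad, "javascript"))
# .../functions-reference-node?pivots=programming-language-javascript
```

This assumes the `programming-language-<lang>` pivot naming convention holds for the article in question; a safer generator would verify the rewritten URL resolves before embedding it in a reference file.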
From `bicepschema_get` — IaC resource definitions:
| Content |
|---|
| Flex Consumption: Storage + App Insights + Service Plan (FC1) + Function App + RBAC |
| Consumption Linux/Windows: Y1 SKU + staging slots |
| Premium Plan: EP1 + elastic instances |
| Service Bus integration: namespace + role assignments |
🔴 NOT from Learn — Secret Sauce:

| Content | Why It's Secret Sauce |
|---|---|
| Composition algorithm (base + recipe template selection) | Proprietary 12-indicator decision tree |
| Mandatory `--no-prompt` enforcement for `azd` | Agent-specific UX pattern |
| Pre-deploy checklist (8-step MCP tool orchestration) | Workflow logic |
| Live RBAC verification post-deployment | Cross-skill validation |
| `curl -o /dev/null` instead of `curl -I` for Functions testing | Agent-specific HTTP testing pattern |
| Plan status enforcement (Validated → Deploying) | Three-skill pipeline orchestration |
| Idempotent SQL patterns for managed identity grants | Deployment safety |
This example shows that even complex skills have ~75% of raw content derivable from Learn MCP tools + Bicep schema, but the 25% secret sauce — orchestration, safety guardrails, agent UX patterns — is what makes it a skill rather than just docs.
The MCP tools give you raw material. The value-add is:
- Section mapping — knowing that `docs_search` results about "queues vs topics" become the "When to Use" section, not a reference file
- Deduplication — `code_sample_search` returns 10 Python snippets that are variations of the same pattern; pick the best one
- Token budgeting — enforce `.token-limits.json`; the generator must know how much content fits in SKILL.md (~5000 tokens) vs references (loaded on demand)
- Progressive disclosure — stable content (architecture patterns) baked into references; volatile content (SDK versions) left as runtime fetch hooks
- Cross-skill consistency — auth patterns should match what other skills use, not be regenerated from scratch each time
- Combining domain-specific knowledge from MCP knowledge sources and tools — skill authors bring their own MCP servers (internal docs, proprietary APIs, customer-specific configurations) and weave that knowledge together with Learn content to produce skills that go beyond what public documentation alone can provide
- Packaging style — deciding skill granularity: one skill per service (e.g., `azure-servicebus`), one skill per operation (e.g., `azure-servicebus-deploy`, `azure-servicebus-troubleshoot`), or one skill per workflow (e.g., `azure-messaging` spanning Service Bus + Event Hubs). The right packaging depends on token budgets, activation patterns, and how users think about tasks
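The token-budgeting value-add could be enforced with a small check at generation time; the `.token-limits.json` shape and the chars-per-token heuristic used here are assumptions, not an existing format.

```python
import json

# Assumed .token-limits.json shape; the real file format may differ.
LIMITS = json.loads('{"SKILL.md": 5000, "references/*": 12000}')

def check_budget(path: str, text: str, limits: dict) -> tuple[bool, int]:
    """Return (within_budget, estimated_tokens) for a generated file."""
    tokens = len(text) // 4  # rough chars-per-token heuristic
    limit = limits.get(path) or limits.get("references/*", 0)
    return tokens <= limit, tokens

ok, used = check_budget("SKILL.md", "x" * 30000, LIMITS)
print(ok, used)  # False 7500
```

A failing check would push content out of SKILL.md into an on-demand reference file, which is exactly the progressive-disclosure split described above.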
Context7 CLI solves the same problem for open-source libraries. Key patterns to borrow:
| Context7 Pattern | Learn Skill Generator Equivalent |
|---|---|
| `ctx7 library react` → resolve library ID | `microsoft_docs_search("Service Bus")` → discover content |
| `ctx7 docs /facebook/react "useEffect"` → get snippets | `microsoft_code_sample_search("service bus python")` → SDK samples |
| `ctx7 skills generate` → AI-driven skill creation | The missing piece — `learn skills generate` for the Microsoft ecosystem |
| `ctx7 skills suggest` → scan project deps | Scan `azure.yaml` / Bicep / Terraform → recommend skills |
| Two-step resolve → query | Don't assume content exists; search first, then fetch |
| Generation + human review loop | Produce draft → author curates → finalize |
Key difference: Context7 resolves a single library (`react`, `prisma`). Microsoft skills span multiple data sources (Learn docs + Bicep schemas + WAF + SDK samples) that need to be composed together.
Microsoft Learn has roughly 100,000 articles across all domains. The `microsoft_docs_search` and `microsoft_code_sample_search` tools don't filter by Azure — they search all of Learn. The generator pattern is domain-agnostic.
| Domain | Example Generator Input | Learn Content Available |
|---|---|---|
| .NET / C# | "Create skill for ASP.NET Core authentication" | Middleware, Identity, JWT, Blazor, EF Core |
| Microsoft 365 | "Create skill for Graph API email integration" | Graph SDK, permissions, webhooks, Teams apps |
| Power Platform | "Create skill for Power Automate custom connectors" | Connector authoring, Dataverse, Power Fx |
| SQL Server | "Create skill for query performance tuning" | Execution plans, indexing, DMVs, Always On |
| Windows | "Create skill for WinUI 3 app development" | XAML, packaging, MSIX, Win32 interop |
| DevOps | "Create skill for Azure DevOps pipeline optimization" | YAML pipelines, artifacts, environments |
| Security | "Create skill for Entra conditional access" | Zero Trust, MFA, RBAC, token lifecycle |
| Dynamics 365 | "Create skill for Dataverse plugin development" | Plugin registration, entity model, business rules |
| Fabric | "Create skill for Microsoft Fabric lakehouse" | Spark notebooks, data pipelines, semantic models |
The only Azure-specific step in the workflow is `bicepschema_get`. For other domains, that slot gets filled by whatever structured reference exists:
| Domain | "Schema" Equivalent |
|---|---|
| Azure | bicepschema_get |
| .NET | NuGet package metadata, API reference |
| M365 | Graph API OpenAPI specs |
| SQL | DMV catalog, system views |
| Power Platform | Connector definition schema |
| Everything else | microsoft_docs_search covers it |
Different domains need different skill structures:
- Azure Infra skill: `SKILL.md` + `references/services/{svc}/bicep.md` + `terraform.md` + `sdk/`
- .NET skill: `SKILL.md` + `references/api/` + `references/patterns/` + `references/migration/`
- M365 skill: `SKILL.md` + `references/graph-api/` + `references/permissions/` + `references/webhooks/`
- SQL skill: `SKILL.md` + `references/queries/` + `references/dmvs/` + `references/optimization/`
- Security skill: `SKILL.md` + `references/policies/` + `references/identity/` + `references/compliance/`
These templates are the real IP — knowing how to structure knowledge for each domain so agents can consume it effectively.
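The per-domain layouts could be encoded as a template registry the generator scaffolds from; `DOMAIN_TEMPLATES` and `scaffold` are hypothetical names, and `{svc}` is a placeholder filled at generation time.

```python
# Hypothetical registry of per-domain skill layouts, mirroring the list above.
DOMAIN_TEMPLATES = {
    "azure-infra": ["SKILL.md", "references/services/{svc}/bicep.md",
                    "references/services/{svc}/terraform.md", "references/sdk/"],
    "dotnet":      ["SKILL.md", "references/api/", "references/patterns/",
                    "references/migration/"],
    "m365":        ["SKILL.md", "references/graph-api/", "references/permissions/",
                    "references/webhooks/"],
    "sql":         ["SKILL.md", "references/queries/", "references/dmvs/",
                    "references/optimization/"],
    "security":    ["SKILL.md", "references/policies/", "references/identity/",
                    "references/compliance/"],
}

def scaffold(domain: str, **params: str) -> list[str]:
    """Expand a domain's layout into concrete paths (placeholders filled from params)."""
    return [path.format(**params) for path in DOMAIN_TEMPLATES[domain]]

print(scaffold("azure-infra", svc="servicebus"))
```

Keeping the layouts as data means adding a new domain is a registry entry plus content-routing rules, not a new code path.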
```
learn skills generate                  # interactive
learn skills generate --domain azure   # scoped to Azure
learn skills generate --domain dotnet  # scoped to .NET
learn skills generate --domain m365    # scoped to M365
learn skills suggest                   # scan project, infer domain
learn skills refresh <skill-name>      # re-query Learn, update stale content
learn skills validate <skill-name>     # check URLs, detect content drift
```

The existing skill-authoring skill in the repo could be extended with a "start from Learn" step 0, using the Learn MCP tools interactively to discover and collect content before templating.
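The proposed command surface could be prototyped with `argparse`; this is a sketch of the CLI shape only, with no generation logic behind it, and the domain list is illustrative.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the proposed `learn skills ...` command surface (not implemented)."""
    parser = argparse.ArgumentParser(prog="learn")
    top = parser.add_subparsers(dest="group", required=True)

    skills = top.add_parser("skills", help="skill generation commands")
    cmds = skills.add_subparsers(dest="command", required=True)

    gen = cmds.add_parser("generate", help="generate a skill from Learn content")
    gen.add_argument("--domain", choices=["azure", "dotnet", "m365"], default=None)

    cmds.add_parser("suggest", help="scan project, infer domain")

    for name, help_text in [("refresh", "re-query Learn, update stale content"),
                            ("validate", "check URLs, detect content drift")]:
        sub = cmds.add_parser(name, help=help_text)
        sub.add_argument("skill_name")

    return parser

args = build_parser().parse_args(["skills", "generate", "--domain", "azure"])
print(args.command, args.domain)  # generate azure
```

Nesting `skills` as a sub-command group leaves room for future groups (e.g., content or validation tooling) under the same `learn` entry point.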
Context7: OSS libraries → docs → skills (React, Prisma, Next.js)
Learn CLI: Microsoft ecosystem → docs → skills (Azure, .NET, M365)
Combined: Full-stack coverage
A developer building a Next.js app on Azure with Graph API integration would use Context7 for Next.js/React patterns and Learn CLI for Azure deployment, Graph API, and Entra auth.
| Layer | What | Who |
|---|---|---|
| Learn MCP tools | Raw material (docs, code samples, schemas) | Already exists in @azure/mcp |
| Generator / CLI | Discover → Collect → Template → Output | Needs to be built |
| Secret sauce | Section mapping, dedup, token budgets, consistency | The hard part / real IP |
| Domain templates | Skill structures per domain | Extensible framework |
| microsoft/skills repo | Upstream skill factory | Consumes the generator |
| Consumer repos | This repo + others | Consume generated skills |