Goal: Build an advanced RAG sample showing how LMP (Language Model Programs) and MEDI (Microsoft.Extensions.DataIngestion) coexist and layer naturally. MEDI handles the pipeline infrastructure (ingestion, vector search, RRF, tree traversal). LMP handles the intelligence layer (optimizable typed predictors for every LLM call). LMP optimizers then tune the whole thing end-to-end.

Domain: Survival/equipment documentation (matching the advanced-rag sample).

Contents:
- The Big Picture — How LMP and MEDI Coexist
- Architecture Deep Dive
- The Three Integration Seams
- Project Structure
- Step 0: Project Setup
- Step 1: Signature Types
- Step 2: LMP-Powered MEDI Adapters
- Step 3: The AdvancedRagModule
- Step 4: Training & Dev Data
- Step 5: Program.cs
- Step 6: Build & Run
- Design Decisions & Rationale
- Glossary
## The Big Picture — How LMP and MEDI Coexist

The advanced-rag sample has roughly seven processors that make LLM calls (query expansion, entity extraction, topic classification, reranking, CRAG, etc.). Each one calls `IChatClient.GetResponseAsync()` directly with a hand-written prompt string. If query expansion produces bad results, you hand-tune the prompt; there is no systematic way to improve these LLM calls.
LMP solves this: every LLM call becomes a `Predictor<TInput, TOutput>` with learnable parameters (instructions plus few-shot demos). An optimizer can discover better prompts and examples automatically from labeled data.
```
┌─────────────────────────────────────────────────────────┐
│ YOUR APPLICATION │
│ │
│ ┌───────────────────────────────────────────────────┐ │
│ │ LMP Layer (Intelligence) │ │
│ │ │ │
│ │ • Typed Predictors for every LLM call │ │
│ │ • Learnable instructions + few-shot demos │ │
│ │ • Validation guards (LmpAssert) │ │
│ │ • End-to-end optimization (BootstrapFewShot) │ │
│ │ • Save/load optimized state │ │
│ └──────────┬────────────────────────────────────────┘ │
│ │ LMP predictors power MEDI processors │
│ ┌──────────▼────────────────────────────────────────┐ │
│ │ MEDI Layer (Infrastructure) │ │
│ │ │ │
│ │ • Ingestion pipeline (PDF→chunks→enrichment→store)│ │
│ │ • Retrieval pipeline (expand→search→RRF→rerank) │ │
│ │ • Vector store orchestration (Qdrant) │ │
│ │ • Tree traversal (RAPTOR hierarchies) │ │
│ │ • Reciprocal Rank Fusion (multi-query merge) │ │
│ │ • Metadata propagation & diagnostics │ │
│ └───────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
LMP does NOT replace MEDI. MEDI handles pipeline orchestration, vector search plumbing, RRF deduplication, tree traversal, metadata propagation — things that have nothing to do with LLM prompts. LMP replaces the raw IChatClient calls inside MEDI processors with typed, optimizable predictors.
| Stays in MEDI (Infrastructure) | Moves to LMP (Intelligence) |
|---|---|
| IngestionPipeline orchestration | Entity extraction LLM call |
| RetrievalPipeline orchestration | Topic classification LLM call |
| SemanticSimilarityChunker | Hypothetical query generation LLM call |
| Qdrant vector store operations | Query expansion LLM call |
| Reciprocal Rank Fusion algorithm | LLM reranking scoring call |
| Tree traversal grouping/sorting | CRAG confidence scoring LLM call |
| PDF reading (PdfPig) | Answer generation LLM call |
| OpenTelemetry diagnostics | Self-RAG critique LLM call |
| Metadata propagation on chunks | HyDE hypothetical answer LLM call |
Rule of thumb: if it calls `IChatClient`, it's a candidate for LMP. If it's data plumbing, it stays in MEDI.
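To make the contrast concrete, here is the same query-expansion call in both styles. This is a minimal sketch using the `QuestionInput`/`ExpandedQueries` types defined in Step 1; it assumes an `IChatClient client` and a `string question` are in scope, and the raw prompt text is illustrative.

```csharp
using Microsoft.Extensions.AI;

// Before: a hand-written prompt string against IChatClient.
var response = await client.GetResponseAsync(
    $"Generate three alternative phrasings of this question: {question}");
// From here you parse response.Text by hand; improving quality means editing the string.

// After: the same call as a typed, optimizable LMP predictor (types from Step 1).
var expander = new Predictor<QuestionInput, ExpandedQueries>(client) { Name = "expand_query" };
ExpandedQueries expanded = await expander.PredictAsync(new QuestionInput(question));
// expanded.Alternatives is a string[]; the instructions and demos are learnable state.
```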
## Architecture Deep Dive

### Before (advanced-rag today)

```
User Query
│
▼
┌──────────────────────────────────────────────┐
│ MEDI RetrievalPipeline │
│ │
│ QueryProcessors (pre-search): │
│ MultiQueryExpander ──── IChatClient ──→ prompt string
│ TreeSearchRetriever ── metadata annotation │
│ │
│ Vector Search: Qdrant ──→ RRF merge │
│ │
│ ResultProcessors (post-search): │
│ LlmReranker ──── IChatClient ──→ prompt string
│ CragValidator ── IChatClient ──→ prompt string
└──────────────────────────────────────────────┘
│
▼
SelfRagOrchestrator ──── IChatClient ──→ prompt string
│
▼
Answer
```
Every `IChatClient` arrow is a hardcoded prompt string. No typed inputs, no validation, no learning, no optimization.
### After (with LMP)

```
User Query
│
▼
┌──────────────────────────────────────────────────────────────┐
│ AdvancedRagModule : LmpModule<QuestionInput, GroundedAnswer> │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ MEDI RetrievalPipeline (infrastructure) │ │
│ │ │ │
│ │ QueryProcessors: │ │
│ │ LmpQueryExpander ─── Predictor<QI, EQ> (learnable) │ │
│ │ TreeSearchRetriever ─ metadata (no LLM) │ │
│ │ │ │
│ │ Vector Search: Qdrant → RRF merge (no LLM) │ │
│ │ │ │
│ │ ResultProcessors: │ │
│ │ LmpReranker ─── Predictor<RI, RJ> (learnable) │ │
│ │ LmpCragValidator ─ Predictor<CI, CC> (learnable) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ RetrievalResults │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ ChainOfThought<AnswerInput, GroundedAnswer> │ │
│ │ (learnable: instructions + demos + reasoning) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ All predictors traced ──┼── Trace records every LLM call │
│ All predictors learnable┼── Optimizer injects demos │
│ All predictors validated┼── LmpAssert guards output ranges │
└──────────────────────────┼──────────────────────────────────────┘
▼
GroundedAnswer { Answer, Citations }
│
BootstrapRandomSearch optimizes ALL predictors
```
The AdvancedRagModule owns both:
- The MEDI RetrievalPipeline (for infrastructure)
- The LMP Predictors inside the MEDI processors (for intelligence)
When you run `optimizer.CompileAsync(module, trainSet, metric)`, the optimizer:
- Runs the full pipeline (MEDI infrastructure + LMP predictors) on training data
- Collects traces from every predictor call (expand, rerank, CRAG, answer)
- Identifies which traces led to high-scoring answers
- Injects those successful traces as few-shot demos into each predictor
- Returns an optimized module where every LLM call has learned from data
The MEDI pipeline doesn't change. The LLM calls inside it get smarter.
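In miniature, the whole optimization step is the two calls below (Step 5 shows the full flow; the argument values match what's used there, and `module`, `trainSet`, and `untypedMetric` are defined as in Step 5):

```csharp
var optimizer = new BootstrapRandomSearch(numTrials: 8, maxDemos: 4, metricThreshold: 0.3f);
var optimizedModule = await optimizer.CompileAsync(module, trainSet, untypedMetric);
// optimizedModule's four predictors (expand_query, rerank, crag, answer) now
// carry the demos and instructions that scored best across the trials.
```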
## The Three Integration Seams

There are exactly three places where LMP meets MEDI. Each is an adapter class that subclasses a MEDI processor base but uses an LMP `Predictor` internally.
### Seam 1: RetrievalQueryProcessor (pre-search)

```
MEDI calls:    processor.ProcessAsync(RetrievalQuery query)
Adapter does:  predictor.PredictAsync(typed input from query)
               writes result back into query.Variants / query.Metadata
Used for:      Query expansion, HyDE
```
### Seam 2: RetrievalResultProcessor (post-search)

```
MEDI calls:    processor.ProcessAsync(RetrievalResults results, RetrievalQuery query)
Adapter does:  predictor.PredictAsync(typed input per chunk)
               updates results.Chunks ordering / results.Metadata
Used for:      LLM reranking, CRAG validation
```
### Seam 3: Module wraps pipeline

```
Module calls:  _pipeline.RetrieveAsync(collection, query, topK, ...)
               → runs all MEDI processors (which internally use LMP predictors)
               → returns RetrievalResults
Then calls:    _answer.PredictAsync(context built from results)
               → LMP ChainOfThought for answer generation
Used for:      Answer generation (the only LMP predictor not inside a MEDI processor)
```
The adapter pattern preserves everything MEDI gives you for free:

- `RetrievalPipeline` orchestrates processors in order
- RRF deduplication runs between query processors and result processors
- Tree traversal runs after vector search but before result processors
- OpenTelemetry diagnostics (`ActivitySource` with structured tags)
- Logging (`ILoggerFactory` integration)
- The `UseXxx()` fluent builder DI pattern

You get all of this without reimplementing it in LMP.
## Project Structure

```
LMP.Samples.AdvancedRag/
├── LMP.Samples.AdvancedRag.csproj
├── Program.cs ← Entry point: ingest → predict → evaluate → optimize
├── Types.cs ← All [LmpSignature] output types + input records
├── Adapters/
│ ├── LmpQueryExpander.cs ← MEDI QueryProcessor backed by LMP Predictor
│ ├── LmpReranker.cs ← MEDI ResultProcessor backed by LMP Predictor
│ └── LmpCragValidator.cs ← MEDI ResultProcessor backed by LMP Predictor
├── AdvancedRagModule.cs ← LmpModule wrapping MEDI RetrievalPipeline
├── IngestedChunk.cs ← Vector store record schema (from advanced-rag)
└── data/
    ├── train.jsonl              ← Training examples (survival/equipment Q&A)
    └── dev.jsonl                ← Dev evaluation examples
```
The actual documents (Example_Emergency_Survival_Kit.pdf, Example_GPS_Watch.md)
come from the wwwroot/Data/ directory — same as advanced-rag.
Total new files: 7 (.cs) + 2 (.jsonl) + 1 (.csproj)
## Step 0: Project Setup

```xml
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<UserSecretsId>lmp-samples-advanced-rag</UserSecretsId>
</PropertyGroup>
<!-- LMP library references (adjust paths to your lmp-dotnet clone) -->
<ItemGroup>
<ProjectReference Include="..\..\src\LMP.Modules\LMP.Modules.csproj" />
<ProjectReference Include="..\..\src\LMP.Optimizers\LMP.Optimizers.csproj" />
<ProjectReference Include="..\..\src\LMP.SourceGen\LMP.SourceGen.csproj"
OutputItemType="Analyzer"
ReferenceOutputAssembly="false" />
</ItemGroup>
<!-- MEDI: Ingestion + Retrieval pipelines -->
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.DataIngestion" Version="10.5.1-dev" />
<PackageReference Include="Microsoft.Extensions.DataIngestion.Markdig" Version="10.5.0-preview" />
<PackageReference Include="MEDIExtensions" Version="1.0.0-dev" />
</ItemGroup>
<!-- Vector store -->
<ItemGroup>
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Qdrant" Version="1.74.0-preview" />
</ItemGroup>
<!-- Azure OpenAI -->
<ItemGroup>
<PackageReference Include="Azure.AI.OpenAI" />
<PackageReference Include="Azure.Identity" />
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" />
<PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" />
</ItemGroup>
<!-- Copy data files to output -->
<ItemGroup>
<None Update="data\**\*" CopyToOutputDirectory="PreserveNewest" />
</ItemGroup>
</Project>
```

Note: The exact package versions depend on the pre-release feeds. Check `nuget.config` in the advanced-rag repo for the local package source.
## Step 1: Signature Types

Every `[LmpSignature]` record is an LLM output type. The source generator reads the `[Description]` attributes and the signature string to build typed prompts. The optimizer can then improve on this baseline with learned demos and instructions.
```csharp
using System.ComponentModel;
using LMP;
namespace LMP.Samples.AdvancedRag;
// ════════════════════════════════════════════════════════════
// INPUT TYPES (plain records — NOT [LmpSignature])
// ════════════════════════════════════════════════════════════
/// <summary>Module input: the user's question.</summary>
public record QuestionInput(
[property: Description("The user's natural language question")]
string Question);
/// <summary>Reranker input: a (question, passage) pair to judge.</summary>
public record PassageJudgmentInput(
[property: Description("The original user question")]
string Question,
[property: Description("A retrieved text passage to evaluate for relevance")]
string Passage);
/// <summary>CRAG input: a (question, top passages) pair for confidence assessment.</summary>
public record ConfidenceInput(
[property: Description("The original user question")]
string Question,
[property: Description("The top retrieved passages, separated by newlines")]
string TopPassages);
/// <summary>Answer generation input: question + ranked context.</summary>
public record AnswerInput(
[property: Description("The original user question")]
string Question,
[property: Description("Retrieved and ranked context passages, separated by ---")]
string Context)
{
public override string ToString()
=> $"Question: {Question}\n\nContext:\n{Context}";
}
// ════════════════════════════════════════════════════════════
// OUTPUT / SIGNATURE TYPES ([LmpSignature] — LLM outputs)
// ════════════════════════════════════════════════════════════
/// <summary>
/// Query expansion output. The LLM generates alternative search queries
/// to improve retrieval recall across the survival/equipment domain.
/// </summary>
[LmpSignature("Given a user question about survival equipment or emergency preparedness, generate alternative search queries that would help find relevant information. Each alternative should approach the topic from a different angle — synonyms, related concepts, or more specific/general phrasings.")]
public partial record ExpandedQueries
{
[Description("Three alternative phrasings of the question, each approaching it differently")]
public required string[] Alternatives { get; init; }
}
/// <summary>
/// Reranker output. The LLM scores a single passage for relevance.
/// Used by LmpReranker adapter inside the MEDI ResultProcessor pipeline.
/// </summary>
[LmpSignature("Given a question and a text passage about survival equipment or emergency preparedness, judge how relevant the passage is to answering the question.")]
public partial record PassageJudgment
{
[Description("Relevance score from 1 (not relevant at all) to 5 (directly answers the question)")]
public required int Score { get; init; }
}
/// <summary>
/// CRAG confidence output. The LLM assesses whether retrieved passages
/// are sufficient to answer the question confidently.
/// </summary>
[LmpSignature("Given a question and top retrieved passages, assess whether the passages provide enough information to confidently answer the question. Consider factual coverage, specificity, and directness.")]
public partial record ConfidenceClassification
{
[Description("Confidence level: 'correct' (passages directly answer the question), 'ambiguous' (partially relevant, may need refinement), or 'incorrect' (passages do not address the question)")]
public required string Confidence { get; init; }
[Description("Brief reasoning for the confidence assessment")]
public required string Reasoning { get; init; }
}
/// <summary>
/// Final answer output. The LLM produces a grounded answer with citations.
/// This is the module's top-level output type.
/// </summary>
[LmpSignature("Given a question about survival equipment or emergency preparedness and supporting context passages, generate a comprehensive answer that is fully grounded in the provided context. Cite specific passages to support your claims.")]
public partial record GroundedAnswer
{
[Description("A comprehensive answer derived from the context passages")]
public required string Answer { get; init; }
[Description("Direct quotes or close paraphrases from context that support the answer")]
public required string[] Citations { get; init; }
}
```

| Type | Used By | MEDI Seam |
|---|---|---|
| `QuestionInput` | Module input | — |
| `ExpandedQueries` | `LmpQueryExpander` adapter | Seam 1: QueryProcessor |
| `PassageJudgmentInput` + `PassageJudgment` | `LmpReranker` adapter | Seam 2: ResultProcessor |
| `ConfidenceInput` + `ConfidenceClassification` | `LmpCragValidator` adapter | Seam 2: ResultProcessor |
| `AnswerInput` + `GroundedAnswer` | `_answer` predictor in module | Seam 3: Module wraps pipeline |
## Step 2: LMP-Powered MEDI Adapters

These are the integration seam classes. Each one:

- Extends a MEDI processor base class (`RetrievalQueryProcessor` / `RetrievalResultProcessor`)
- Contains an LMP `Predictor<TIn, TOut>` with learnable state
- Bridges the MEDI `ProcessAsync()` contract to the LMP `PredictAsync()` contract
- Exposes its predictor via a public property so the `AdvancedRagModule` can register it with `GetPredictors()` for optimization
### Adapters/LmpQueryExpander.cs

Replaces `MultiQueryExpander` from MEDIExtensions. Same MEDI contract, but the LLM call is now a typed, optimizable predictor.

```csharp
using LMP;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion; // RetrievalQuery, RetrievalQueryProcessor
namespace LMP.Samples.AdvancedRag.Adapters;
/// <summary>
/// MEDI RetrievalQueryProcessor that uses an LMP Predictor for query expansion.
///
/// What MEDI sees: a standard QueryProcessor that populates query.Variants.
/// What LMP sees: a Predictor<QuestionInput, ExpandedQueries> with learnable state.
///
/// The MEDI RetrievalPipeline calls ProcessAsync() as part of its normal flow.
/// Internally, we delegate to the LMP predictor which has typed input/output,
/// validation, and learnable instructions + demos.
/// </summary>
public sealed class LmpQueryExpander : RetrievalQueryProcessor
{
/// <summary>
/// The LMP predictor powering this processor. Exposed so the AdvancedRagModule
/// can include it in GetPredictors() for optimization.
/// </summary>
public Predictor<QuestionInput, ExpandedQueries> Predictor { get; }
public LmpQueryExpander(IChatClient client)
{
Predictor = new Predictor<QuestionInput, ExpandedQueries>(client)
{
Name = "expand_query"
};
}
/// <summary>
/// MEDI contract: receive a RetrievalQuery, return it with Variants populated.
/// We bridge to the LMP predictor for the actual LLM call.
/// </summary>
public override async Task<RetrievalQuery> ProcessAsync(
RetrievalQuery query,
CancellationToken cancellationToken = default)
{
var result = await Predictor.PredictAsync(
new QuestionInput(query.Text),
validate: r =>
LmpAssert.That(r,
r => r.Alternatives is { Length: > 0 },
"Must generate at least one alternative query"),
maxRetries: 2,
cancellationToken: cancellationToken);
// Write LMP output back into the MEDI data model
query.Variants = [query.Text, .. (result.Alternatives ?? [])];
return query;
}
}
```

### Adapters/LmpReranker.cs

Replaces `LlmReranker` from MEDIExtensions. Instead of a single batch prompt that asks the LLM to rank all passages at once (fragile, hard to optimize), this adapter scores each passage individually with a typed predictor.

```csharp
using LMP;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion; // RetrievalResults, RetrievalResultProcessor, etc.
namespace LMP.Samples.AdvancedRag.Adapters;
/// <summary>
/// MEDI RetrievalResultProcessor that uses an LMP Predictor for reranking.
///
/// The original MEDIExtensions LlmReranker asks the LLM to rank all passages
/// in a single batch prompt. This adapter scores each passage individually,
/// giving the optimizer granular traces to learn from.
/// </summary>
public sealed class LmpReranker : RetrievalResultProcessor
{
private readonly int _maxResults;
/// <summary>
/// The LMP predictor powering this processor. Exposed for optimization.
/// </summary>
public Predictor<PassageJudgmentInput, PassageJudgment> Predictor { get; }
public LmpReranker(IChatClient client, int maxResults = 5)
{
_maxResults = maxResults;
Predictor = new Predictor<PassageJudgmentInput, PassageJudgment>(client)
{
Name = "rerank"
};
}
/// <summary>
/// MEDI contract: receive RetrievalResults + original query, return reranked results.
/// </summary>
public override async Task<RetrievalResults> ProcessAsync(
RetrievalResults results,
RetrievalQuery query,
CancellationToken cancellationToken = default)
{
if (results.Chunks.Count == 0)
return results;
// Score each chunk individually via the LMP predictor
var scored = new List<(RetrievalChunk Chunk, int Score)>();
foreach (var chunk in results.Chunks)
{
var judgment = await Predictor.PredictAsync(
new PassageJudgmentInput(query.Text, chunk.Content),
validate: r =>
LmpAssert.That(r,
r => r.Score >= 1 && r.Score <= 5,
"Score must be between 1 and 5"),
maxRetries: 2,
cancellationToken: cancellationToken);
scored.Add((chunk, judgment.Score));
}
// Sort by score descending, take top results
var reranked = scored
.OrderByDescending(x => x.Score)
.Take(_maxResults)
.ToList();
// Write back into MEDI data model
results.Chunks = reranked
.Select(x =>
{
x.Chunk.Score = x.Score; // Update the MEDI score field
return x.Chunk;
})
.ToList();
results.Metadata["reranked"] = true;
results.Metadata["reranked_count"] = results.Chunks.Count;
return results;
}
}
```

### Adapters/LmpCragValidator.cs

Replaces `CragValidator` from MEDIExtensions. The three-way routing logic (correct / ambiguous / incorrect) stays the same, but the LLM call that determines confidence is now an optimizable predictor.

```csharp
using LMP;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion;
namespace LMP.Samples.AdvancedRag.Adapters;
/// <summary>
/// MEDI RetrievalResultProcessor that uses an LMP Predictor for CRAG
/// (Corrective Retrieval-Augmented Generation) confidence assessment.
///
/// Routes on three paths:
/// "correct" → use results as-is
/// "ambiguous" → flag for refinement
/// "incorrect" → clear chunks, set low_confidence
/// </summary>
public sealed class LmpCragValidator : RetrievalResultProcessor
{
private readonly int _evaluateTopN;
/// <summary>
/// The LMP predictor powering this processor. Exposed for optimization.
/// </summary>
public Predictor<ConfidenceInput, ConfidenceClassification> Predictor { get; }
public LmpCragValidator(IChatClient client, int evaluateTopN = 3)
{
_evaluateTopN = evaluateTopN;
Predictor = new Predictor<ConfidenceInput, ConfidenceClassification>(client)
{
Name = "crag"
};
}
public override async Task<RetrievalResults> ProcessAsync(
RetrievalResults results,
RetrievalQuery query,
CancellationToken cancellationToken = default)
{
if (results.Chunks.Count == 0)
{
results.Metadata["crag_path"] = "incorrect";
results.Metadata["low_confidence"] = true;
return results;
}
// Build preview of top-N passages for the LLM to assess
var topPassages = string.Join("\n\n",
results.Chunks
.Take(_evaluateTopN)
.Select((c, i) => $"[{i + 1}] {c.Content[..Math.Min(c.Content.Length, 300)]}"));
var classification = await Predictor.PredictAsync(
new ConfidenceInput(query.Text, topPassages),
validate: r =>
LmpAssert.That(r,
r => r.Confidence is "correct" or "ambiguous" or "incorrect",
"Confidence must be 'correct', 'ambiguous', or 'incorrect'"),
maxRetries: 2,
cancellationToken: cancellationToken);
// Apply CRAG routing
results.Metadata["crag_path"] = classification.Confidence;
results.Metadata["crag_reasoning"] = classification.Reasoning;
switch (classification.Confidence)
{
case "incorrect":
results.Chunks.Clear();
results.Metadata["low_confidence"] = true;
break;
case "ambiguous":
results.Metadata["needs_followup"] = true;
break;
case "correct":
// Use results as-is
break;
}
return results;
}
}
```

| Without Adapters (raw MEDI) | With Adapters (MEDI + LMP) |
|---|---|
| Prompt strings in processor code | Typed [LmpSignature] with [Description] fields |
| No validation on LLM output | LmpAssert.That() with retry on failure |
| No learning from data | Predictor.Demos populated by optimizer |
| No instruction tuning | Predictor.Instructions evolved by MIPROv2 |
| No tracing of individual calls | Each PredictAsync recorded in Trace |
| Can't save/load tuned state | SaveStateAsync / ApplyStateAsync per predictor |
## Step 3: The AdvancedRagModule

This is the orchestration layer. It owns the MEDI RetrievalPipeline (with LMP adapters inside), plus a standalone LMP predictor for answer generation.

The class must be `partial` so the source generator can emit supporting members such as `CloneCore()`. `GetPredictors()` is overridden manually here, because the predictors live inside adapter objects where the generator can't discover them (see the note in the code and the rationale in Design Decisions). The override returns ALL four predictors (from the three adapters plus the answer predictor), enabling the optimizer to tune them all.

```csharp
using LMP;
using LMP.Samples.AdvancedRag.Adapters;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion;
using Microsoft.Extensions.VectorData;
namespace LMP.Samples.AdvancedRag;
/// <summary>
/// Advanced RAG module that composes:
/// • MEDI RetrievalPipeline (infrastructure: vector search, RRF, tree traversal)
/// • LMP Predictors (intelligence: query expansion, reranking, CRAG, answer gen)
///
/// The module owns both layers. When optimized, all four predictors get
/// learned instructions and few-shot demos.
///
/// IMPORTANT: Must be partial for source generator to emit GetPredictors() + CloneCore().
/// </summary>
public partial class AdvancedRagModule : LmpModule<QuestionInput, GroundedAnswer>
{
// ── Infrastructure (MEDI) ──────────────────────────────
private readonly RetrievalPipeline _pipeline;
private readonly VectorStoreCollection<Guid, IngestedChunk> _collection;
private readonly int _topK;
// ── Intelligence (LMP) — exposed via adapters ──────────
// These are the LMP-powered MEDI processors. Their Predictor
// properties are registered with GetPredictors() by the source gen.
private readonly LmpQueryExpander _expander;
private readonly LmpReranker _reranker;
private readonly LmpCragValidator _cragValidator;
// ── Standalone LMP predictor (not inside a MEDI processor) ──
private readonly ChainOfThought<AnswerInput, GroundedAnswer> _answer;
public AdvancedRagModule(
IChatClient client,
VectorStoreCollection<Guid, IngestedChunk> collection,
int topK = 5)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(collection);
Client = client;
_collection = collection;
_topK = topK;
// Create LMP-powered MEDI adapters
_expander = new LmpQueryExpander(client);
_reranker = new LmpReranker(client, maxResults: topK);
_cragValidator = new LmpCragValidator(client, evaluateTopN: 3);
// Wire adapters into the MEDI RetrievalPipeline
_pipeline = new RetrievalPipeline();
_pipeline.QueryProcessors.Add(_expander);
_pipeline.ResultProcessors.Add(_reranker);
_pipeline.ResultProcessors.Add(_cragValidator);
// Standalone answer predictor with chain-of-thought
_answer = new ChainOfThought<AnswerInput, GroundedAnswer>(client)
{
Name = "answer"
};
}
// ── Source generator needs to know about ALL predictors ──
// We override GetPredictors() manually since the predictors live
// inside adapter objects, not directly as fields on this class.
//
// NOTE: If the source generator can discover predictors in adapter
// fields automatically (via the [Predict] attribute or similar),
// you could simplify this. For now, explicit override is safest.
public override IReadOnlyList<(string Name, IPredictor Predictor)> GetPredictors()
{
return
[
(_expander.Predictor.Name, _expander.Predictor),
(_reranker.Predictor.Name, _reranker.Predictor),
(_cragValidator.Predictor.Name, _cragValidator.Predictor),
(_answer.Name, _answer),
];
}
/// <summary>
/// Full RAG pipeline:
/// 1. MEDI RetrievalPipeline handles: expand → search → RRF → rerank → CRAG
/// (with LMP predictors powering the LLM calls inside)
/// 2. LMP ChainOfThought handles: grounded answer generation
/// </summary>
public override async Task<GroundedAnswer> ForwardAsync(
QuestionInput input,
CancellationToken cancellationToken = default)
{
// ── Step 1: MEDI Retrieval Pipeline ──────────────────────────
// This single call runs the ENTIRE retrieval pipeline:
// a. LmpQueryExpander.ProcessAsync() → expands query via LMP predictor
// b. Qdrant vector search per variant → RRF merge
// c. LmpReranker.ProcessAsync() → scores each chunk via LMP predictor
// d. LmpCragValidator.ProcessAsync() → confidence gate via LMP predictor
//
// All LMP predictor calls are traced if this.Trace is set.
// The optimizer collects these traces for demo learning.
PropagateTraceToAdapters();
var results = await _pipeline.RetrieveAsync(
_collection,
input.Question,
topK: _topK,
contentSelector: chunk => chunk.Text,
cancellationToken: cancellationToken);
// Handle CRAG "incorrect" path — no confident results
if (results.Chunks.Count == 0)
{
return new GroundedAnswer
{
Answer = "I could not find sufficiently relevant information to answer this question confidently.",
Citations = []
};
}
// ── Step 2: Answer Generation ────────────────────────────────
// Build context from the MEDI pipeline's reranked, CRAG-validated results.
// ChainOfThought makes the LLM reason step-by-step before answering.
var context = string.Join("\n\n---\n\n",
results.Chunks.Select(c => c.Content));
var answer = await _answer.PredictAsync(
new AnswerInput(input.Question, context),
trace: Trace,
validate: result =>
{
LmpAssert.That(result,
r => !string.IsNullOrWhiteSpace(r.Answer),
"Answer must not be empty");
LmpAssert.That(result,
r => r.Citations is { Length: > 0 },
"Must include at least one citation from context");
},
maxRetries: 2,
cancellationToken: cancellationToken);
return answer;
}
/// <summary>
/// Passes the module's Trace down to adapters so every LMP predictor
/// call (expand, rerank, CRAG) is recorded for optimization.
/// </summary>
private void PropagateTraceToAdapters()
{
// The adapters' Predictor.PredictAsync() calls need a trace reference.
// Since the MEDI pipeline calls adapter.ProcessAsync() (not PredictAsync
// directly), we need to inject the trace before the pipeline runs.
//
// Implementation options:
// a. Adapters accept a Trace property and pass it to PredictAsync
// b. Adapters check a shared field on the module
// c. Use AsyncLocal<Trace> for ambient tracing
//
// For simplicity, adapters should expose a settable Trace property.
// TODO: Add `public Trace? Trace { get; set; }` to each adapter
// and pass it in PredictAsync calls.
}
}
```
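One way to resolve the TODO above (a sketch, not the only option): give each adapter a settable `Trace` property and forward it into `PredictAsync`, mirroring how `ForwardAsync` passes `trace: Trace` to the answer predictor. This assumes the adapter's `PredictAsync` accepts the same optional `trace:` argument.

```csharp
// In each adapter (LmpQueryExpander shown; LmpReranker and LmpCragValidator
// get the same property):
public Trace? Trace { get; set; }

// ...then thread it through the adapter's PredictAsync call:
var result = await Predictor.PredictAsync(
    new QuestionInput(query.Text),
    trace: Trace,                       // forwarded from the module
    validate: r => LmpAssert.That(r,
        r => r.Alternatives is { Length: > 0 },
        "Must generate at least one alternative query"),
    maxRetries: 2,
    cancellationToken: cancellationToken);

// ...and fill in the module's PropagateTraceToAdapters():
private void PropagateTraceToAdapters()
{
    _expander.Trace = Trace;
    _reranker.Trace = Trace;
    _cragValidator.Trace = Trace;
}
```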
### IngestedChunk.cs

This is the vector store schema — same as in advanced-rag. Shown here for completeness; copy from the advanced-rag repo.

```csharp
using Microsoft.Extensions.VectorData;
namespace LMP.Samples.AdvancedRag;
/// <summary>
/// Vector store record representing an ingested and enriched document chunk.
/// Schema matches the advanced-rag sample's Qdrant collection.
/// </summary>
public sealed class IngestedChunk
{
public const string CollectionName = "data-AdvancedRag-chunks";
public const int VectorDimensions = 1536;
public const string VectorDistanceFunction = "CosineSimilarity";
[VectorStoreKey]
public Guid Key { get; set; }
[VectorStoreData]
public string DocumentId { get; set; } = "";
[VectorStoreData]
public string Text { get; set; } = "";
[VectorStoreData]
public string Context { get; set; } = "";
// Entity metadata (from EntityExtractionProcessor)
[VectorStoreData]
public string EntitiesPeople { get; set; } = "";
[VectorStoreData]
public string EntitiesOrganizations { get; set; } = "";
[VectorStoreData]
public string EntitiesTechnologies { get; set; } = "";
[VectorStoreData]
public string EntitiesVersions { get; set; } = "";
// Topic metadata (from TopicClassificationProcessor)
[VectorStoreData]
public string TopicPrimary { get; set; } = "";
[VectorStoreData]
public string TopicSecondary { get; set; } = "";
// Tree metadata (from TreeIndexProcessor)
[VectorStoreData]
public int Level { get; set; }
[VectorStoreData]
public string ParentId { get; set; } = "";
// Hypothetical query metadata
[VectorStoreData]
public string ChunkType { get; set; } = "";
[VectorStoreData]
public string ParentChunkId { get; set; } = "";
// Vector embedding
[VectorStoreVector(VectorDimensions)]
public ReadOnlyMemory<float> Vector { get; set; }
}
```

## Step 4: Training & Dev Data

The advanced-rag sample processes Example_Emergency_Survival_Kit.pdf and
Example_GPS_Watch.md. Training data should be Q&A pairs about this content.
### data/train.jsonl

12 training examples. The optimizer runs the full MEDI+LMP pipeline on these and collects successful traces.

```jsonl
{"input": {"Question": "What items should be included in a basic emergency survival kit?"}, "label": {"Answer": "A basic emergency survival kit should include water (one gallon per person per day for at least three days), non-perishable food (at least a three-day supply), a battery-powered or hand-crank radio, a flashlight with extra batteries, a first aid kit, a whistle to signal for help, dust masks, plastic sheeting and duct tape for sheltering in place, moist towelettes and garbage bags for personal sanitation, a wrench or pliers to turn off utilities, a manual can opener, and local maps.", "Citations": ["one gallon per person per day for at least three days", "battery-powered or hand-crank radio, a flashlight with extra batteries, a first aid kit"]}}
{"input": {"Question": "How much water should you store for emergency preparedness?"}, "label": {"Answer": "You should store at least one gallon of water per person per day for a minimum of three days, for both drinking and sanitation purposes. Consider storing more for hot climates, pregnant women, or sick individuals.", "Citations": ["one gallon per person per day for at least three days", "for both drinking and sanitation"]}}
{"input": {"Question": "How does the GPS watch track your location?"}, "label": {"Answer": "The GPS watch uses satellite-based Global Positioning System technology to track your location. It receives signals from multiple GPS satellites to calculate your precise coordinates, elevation, and movement data. The watch can display your current position, track your route, and provide navigation back to a saved waypoint.", "Citations": ["satellite-based Global Positioning System technology", "receives signals from multiple GPS satellites"]}}
{"input": {"Question": "What first aid supplies should be in a survival kit?"}, "label": {"Answer": "A survival kit's first aid supplies should include adhesive bandages in various sizes, sterile gauze pads, adhesive tape, elastic bandages, antiseptic wipes, antibiotic ointment, burn cream, scissors, tweezers, a thermometer, pain relievers, anti-diarrhea medication, and any personal prescription medications.", "Citations": ["adhesive bandages", "sterile gauze pads", "antiseptic wipes", "personal prescription medications"]}}
{"input": {"Question": "What is the battery life of the GPS watch?"}, "label": {"Answer": "The GPS watch battery life varies by mode. In standard watch mode with basic features, battery life extends significantly. In full GPS tracking mode with continuous satellite communication, battery life is reduced. Power-saving modes that reduce GPS polling frequency can extend battery life during extended outdoor activities.", "Citations": ["battery life varies by mode", "Power-saving modes that reduce GPS polling frequency"]}}
{"input": {"Question": "How do you purify water in an emergency situation?"}, "label": {"Answer": "In an emergency, water can be purified by boiling it for at least one minute (three minutes at elevations above 6,500 feet), using water purification tablets, using a portable water filter, or adding 8 drops of unscented household bleach per gallon and waiting 30 minutes. Always filter cloudy water through a clean cloth before treating it.", "Citations": ["boiling it for at least one minute", "water purification tablets", "8 drops of unscented household bleach per gallon"]}}
{"input": {"Question": "What features does the GPS watch compass offer?"}, "label": {"Answer": "The GPS watch compass provides both magnetic and GPS-based bearing information. It displays cardinal and intercardinal directions, supports bearing lock for following a specific heading, and can be calibrated for local magnetic declination. The compass integrates with route navigation to show the direction to your next waypoint.", "Citations": ["magnetic and GPS-based bearing information", "bearing lock for following a specific heading"]}}
{"input": {"Question": "What type of food should be stored for emergencies?"}, "label": {"Answer": "Store non-perishable food that requires no refrigeration, minimal preparation, and little or no water to cook. Good choices include canned meats and fish, canned fruits and vegetables, protein bars, granola bars, dried fruit, nuts, peanut butter, crackers, and ready-to-eat canned soups. Ensure you have a manual can opener. Replace food annually to maintain freshness.", "Citations": ["non-perishable food that requires no refrigeration", "canned meats and fish", "Replace food annually"]}}
{"input": {"Question": "How do you use the GPS watch for route navigation?"}, "label": {"Answer": "To use route navigation on the GPS watch, first save waypoints at key locations along your planned route. Then create a route by connecting waypoints in sequence. During navigation, the watch displays the direction and distance to the next waypoint, your current bearing, and estimated time of arrival. Breadcrumb tracking records your actual path for backtracking.", "Citations": ["save waypoints at key locations", "displays the direction and distance to the next waypoint", "Breadcrumb tracking"]}}
{"input": {"Question": "What should you do with your emergency kit every six months?"}, "label": {"Answer": "Every six months, review and update your emergency survival kit. Check expiration dates on food, water, medications, and batteries and replace any that are expired or near expiration. Test flashlights and radio equipment. Update personal documents and contact information. Adjust contents for seasonal needs and any changes in family size or medical requirements.", "Citations": ["Check expiration dates on food, water, medications, and batteries", "Update personal documents and contact information"]}}
{"input": {"Question": "Can the GPS watch measure elevation and barometric pressure?"}, "label": {"Answer": "Yes, the GPS watch includes an altimeter that measures elevation using both GPS satellite data and a built-in barometric pressure sensor. The barometric altimeter provides more accurate real-time elevation readings, while GPS altitude is used for calibration. The barometric sensor can also display weather trend information based on pressure changes.", "Citations": ["altimeter that measures elevation using both GPS satellite data and a built-in barometric pressure sensor", "weather trend information based on pressure changes"]}}
{"input": {"Question": "How do you shelter in place during a chemical emergency?"}, "label": {"Answer": "To shelter in place during a chemical emergency, go indoors immediately and close all windows and doors. Turn off heating, ventilation, and air conditioning systems. Use plastic sheeting and duct tape to seal windows, doors, and vents. Move to an interior room with few windows on an upper floor if possible. Listen to emergency broadcasts for instructions on when it is safe to leave.", "Citations": ["plastic sheeting and duct tape to seal windows, doors, and vents", "interior room with few windows", "Listen to emergency broadcasts"]}}5 held-out examples for evaluation. These are NOT seen during optimization.
{"input": {"Question": "What communication devices should be in a survival kit?"}, "label": {"Answer": "A survival kit should include a battery-powered or hand-crank NOAA Weather Radio for emergency broadcasts, a whistle for signaling rescuers, and a fully charged cell phone with a portable battery charger. Consider including a two-way radio for areas without cell coverage. Keep extra batteries for all electronic devices.", "Citations": ["battery-powered or hand-crank radio", "whistle to signal for help", "extra batteries"]}}
{"input": {"Question": "Is the GPS watch waterproof?"}, "label": {"Answer": "The GPS watch is designed to be water-resistant and can withstand exposure to rain, splashing, and brief immersion. It is rated for use during water-based outdoor activities. However, it should not be used for deep-water diving as prolonged submersion beyond its rated depth may cause damage.", "Citations": ["water-resistant", "rated for use during water-based outdoor activities"]}}
{"input": {"Question": "How do you signal for help in a wilderness emergency?"}, "label": {"Answer": "Signal for help using a whistle (three blasts is the universal distress signal), a signal mirror to reflect sunlight toward aircraft or distant rescuers, bright-colored clothing or fabric spread on the ground, and a flashlight at night. Build a signal fire with green vegetation to create visible smoke during the day. The GPS watch can also share your coordinates for rescue teams.", "Citations": ["whistle", "three blasts is the universal distress signal", "signal mirror"]}}
{"input": {"Question": "How do you set waypoints on the GPS watch?"}, "label": {"Answer": "To set a waypoint on the GPS watch, navigate to the GPS/Navigation menu, select 'Mark Waypoint' or press the dedicated waypoint button. The watch captures your current GPS coordinates and allows you to name the waypoint for easy identification. You can also enter coordinates manually for a known location. Saved waypoints can be organized into groups and used for route creation.", "Citations": ["navigate to the GPS/Navigation menu", "captures your current GPS coordinates", "enter coordinates manually"]}}
{"input": {"Question": "What documents should be kept in an emergency kit?"}, "label": {"Answer": "Keep copies of important family documents in a waterproof container in your emergency kit, including identification (driver's licenses, passports), insurance policies, bank account records, medical records and prescriptions, proof of address, and emergency contact information. Include both physical copies and copies on a USB drive.", "Citations": ["copies of important family documents in a waterproof container", "identification", "insurance policies", "emergency contact information"]}}This demonstrates the full lifecycle: setup → ingest (MEDI) → predict (MEDI+LMP) → evaluate baseline → optimize (LMP) → evaluate optimized → save/load.
using Azure.AI.OpenAI;
using Azure.Identity;
using LMP;
using LMP.Optimizers;
using LMP.Samples.AdvancedRag;
using LMP.Samples.AdvancedRag.Adapters;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.VectorData;
// ══════════════════════════════════════════════════════════════
// 1. SETUP
// ══════════════════════════════════════════════════════════════
var config = new ConfigurationBuilder()
.AddUserSecrets<Program>()
.Build();
string endpoint = config["AzureOpenAI:Endpoint"]
?? throw new InvalidOperationException("Set AzureOpenAI:Endpoint in user secrets");
string deployment = config["AzureOpenAI:Deployment"]
?? throw new InvalidOperationException("Set AzureOpenAI:Deployment in user secrets");
IChatClient client = new AzureOpenAIClient(
new Uri(endpoint), new DefaultAzureCredential())
.GetChatClient(deployment)
.AsIChatClient();
// Qdrant vector store collection (assumes documents already ingested
// by the advanced-rag sample's MEDI ingestion pipeline).
//
// For a standalone demo without Qdrant, swap this with an in-memory
// vector store — the AdvancedRagModule doesn't care.
var qdrantClient = new Qdrant.Client.QdrantClient("localhost", 6334);
var vectorStore = new Microsoft.SemanticKernel.Connectors.Qdrant.QdrantVectorStore(qdrantClient);
var collection = vectorStore.GetCollection<Guid, IngestedChunk>(IngestedChunk.CollectionName);
await collection.EnsureCollectionExistsAsync();
Console.WriteLine("╔══════════════════════════════════════════════════════════╗");
Console.WriteLine("║ LMP + MEDI Advanced RAG ║");
Console.WriteLine("║ MEDI pipeline infrastructure + LMP optimizable intel ║");
Console.WriteLine("╚══════════════════════════════════════════════════════════╝");
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 2. SINGLE PREDICTION: Full MEDI+LMP pipeline in action
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Single Prediction ──────────────────────────────────");
Console.WriteLine();
var module = new AdvancedRagModule(client, collection);
var question = new QuestionInput(
"What items should I pack for emergency water purification?");
Console.WriteLine($"Q: {question.Question}");
Console.WriteLine();
var result = await module.ForwardAsync(question);
Console.WriteLine($"A: {result.Answer}");
Console.WriteLine();
Console.WriteLine("Citations:");
foreach (var citation in result.Citations)
Console.WriteLine($" • {citation}");
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 3. BASELINE EVALUATION
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Baseline Evaluation ────────────────────────────────");
Console.WriteLine();
var dataDir = Path.Combine(AppContext.BaseDirectory, "data");
var trainSet = Example.LoadFromJsonl<QuestionInput, GroundedAnswer>(
Path.Combine(dataDir, "train.jsonl"));
var devSet = Example.LoadFromJsonl<QuestionInput, GroundedAnswer>(
Path.Combine(dataDir, "dev.jsonl"));
Console.WriteLine($"Train: {trainSet.Count} examples | Dev: {devSet.Count} examples");
Console.WriteLine();
// Metric: keyword overlap + citation bonus
Func<GroundedAnswer, GroundedAnswer, float> metric = (predicted, expected) =>
{
if (string.IsNullOrWhiteSpace(predicted.Answer) ||
string.IsNullOrWhiteSpace(expected.Answer))
return 0f;
var stopWords = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"the", "a", "an", "is", "are", "was", "were", "in", "on", "at",
"to", "for", "of", "with", "and", "or", "but", "it", "its",
"by", "from", "as", "be", "this", "that", "can", "will",
"should", "your", "you", "have", "has", "not", "also"
};
var expectedWords = expected.Answer
.Split([' ', ',', '.', '!', '?', ';', ':', '(', ')', '-', '"'],
StringSplitOptions.RemoveEmptyEntries)
.Where(w => w.Length > 2 && !stopWords.Contains(w))
.Distinct(StringComparer.OrdinalIgnoreCase)
.ToHashSet(StringComparer.OrdinalIgnoreCase);
if (expectedWords.Count == 0) return 0f;
var matchCount = expectedWords.Count(kw =>
predicted.Answer.Contains(kw, StringComparison.OrdinalIgnoreCase));
float keywordScore = (float)matchCount / expectedWords.Count;
float citationBonus = predicted.Citations is { Length: > 0 } ? 0.1f : 0f;
return Math.Min(keywordScore + citationBonus, 1.0f);
};
var baselineModule = new AdvancedRagModule(client, collection);
var baselineResult = await Evaluator.EvaluateAsync(baselineModule, devSet, metric);
Console.WriteLine($"Baseline: {baselineResult.AverageScore:P1}");
Console.WriteLine($" Min: {baselineResult.MinScore:P1} Max: {baselineResult.MaxScore:P1}");
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 4. OPTIMIZATION
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Optimizing (BootstrapRandomSearch, 8 trials) ────────");
Console.WriteLine();
var untypedMetric = Metric.Create(metric);
var optimizer = new BootstrapRandomSearch(
numTrials: 8,
maxDemos: 4,
metricThreshold: 0.3f);
var optimizedModule = await optimizer.CompileAsync(
new AdvancedRagModule(client, collection),
trainSet,
untypedMetric);
// ══════════════════════════════════════════════════════════════
// 5. OPTIMIZED EVALUATION
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Optimized Evaluation ───────────────────────────────");
Console.WriteLine();
var optimizedResult = await Evaluator.EvaluateAsync(optimizedModule, devSet, metric);
Console.WriteLine($"Optimized: {optimizedResult.AverageScore:P1}");
Console.WriteLine($" Min: {optimizedResult.MinScore:P1} Max: {optimizedResult.MaxScore:P1}");
Console.WriteLine();
Console.WriteLine($"Improvement: {baselineResult.AverageScore:P1} → {optimizedResult.AverageScore:P1}");
Console.WriteLine();
// Show what the optimizer learned for each predictor
Console.WriteLine("─── Learned Parameters ─────────────────────────────────");
Console.WriteLine();
foreach (var (name, predictor) in optimizedModule.GetPredictors())
{
Console.WriteLine($" [{name}]");
Console.WriteLine($" Demos: {predictor.Demos.Count}");
Console.WriteLine($" Instructions: {Truncate(predictor.Instructions, 80)}");
}
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 6. SAVE & RELOAD
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Save & Reload ──────────────────────────────────────");
Console.WriteLine();
var statePath = Path.Combine(Path.GetTempPath(), "advanced-rag-optimized.json");
await optimizedModule.SaveStateAsync(statePath);
Console.WriteLine($"Saved to: {statePath}");
var reloaded = new AdvancedRagModule(client, collection);
await reloaded.ApplyStateAsync(statePath);
var reloadedResult = await Evaluator.EvaluateAsync(reloaded, devSet, metric);
Console.WriteLine($"Reloaded: {reloadedResult.AverageScore:P1} (should match optimized)");
Console.WriteLine();
Console.WriteLine("Done! ✓");
// ── Helper ──────────────────────────────────────────────────
static string Truncate(string text, int max)
    => text.Length <= max ? text : text[..max] + "…";
```

## Step 6: Build & Run

Prerequisites:

- Clone `luisquintanilla/lmp-dotnet`
- Clone `luisquintanilla/advanced-rag` (for the Qdrant data + documents)
- Run the advanced-rag ingestion first (so Qdrant has data to search)
- Place this sample in `lmp-dotnet/samples/LMP.Samples.AdvancedRag/`
- .NET 10 SDK + Docker (for Qdrant)

```bash
# 1. Start Qdrant (if not already running via Aspire)
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
# 2. Run advanced-rag ingestion first to populate Qdrant
# (Follow its README for Azure OpenAI + Aspire setup)
# 3. Navigate to the LMP sample
cd lmp-dotnet/samples/LMP.Samples.AdvancedRag
# 4. Set user secrets
dotnet user-secrets init
dotnet user-secrets set "AzureOpenAI:Endpoint" "https://YOUR-RESOURCE.openai.azure.com/"
dotnet user-secrets set "AzureOpenAI:Deployment" "gpt-4o-mini"
# 5. Build & run
dotnet build && dotnet run
```

If you don't want to run Qdrant, create an `InMemoryVectorStore` with passages from the knowledge base. The `AdvancedRagModule` accepts any `VectorStoreCollection<Guid, IngestedChunk>` — swap the implementation.
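A minimal sketch of that swap, assuming the Microsoft.SemanticKernel.Connectors.InMemory package (the exact package and namespace may differ across preview versions):

```csharp
using Microsoft.SemanticKernel.Connectors.InMemory;

var vectorStore = new InMemoryVectorStore();
var collection = vectorStore.GetCollection<Guid, IngestedChunk>(IngestedChunk.CollectionName);
await collection.EnsureCollectionExistsAsync();

// Upsert a handful of IngestedChunk records (Text + 1536-dim Vector) here,
// then pass `collection` to new AdvancedRagModule(client, collection) as usual.
```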
## Design Decisions & Rationale

### Why the adapter pattern?

- MEDIExtensions is a separate repo — you can't add LMP dependencies to it
- Adapters preserve the MEDI contract — the pipeline doesn't know LMP exists
- Clean separation — MEDI processors can still be used without LMP
- Composition over inheritance — the adapter wraps, it doesn't replace
### Why override GetPredictors() manually?

The LMP source generator discovers predictors stored as direct fields on the module class (e.g., `private readonly Predictor<A, B> _foo`). Our predictors are nested inside adapter objects, and the source generator can't see through the adapter, so we override `GetPredictors()` manually to expose all four predictors.

Future improvement: the source generator could be taught to discover predictors in objects that implement an `IHasPredictor` interface.
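That future interface might look like this (a hypothetical sketch; `IHasPredictor` does not exist in LMP today):

```csharp
// Hypothetical: if adapters implemented this, the source generator could walk
// module fields, spot IHasPredictor implementations, and emit GetPredictors()
// automatically instead of requiring the manual override.
public interface IHasPredictor
{
    IPredictor Predictor { get; }
}
```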
### Why per-passage scoring instead of batch reranking?

The original `LlmReranker` asks the LLM to rank ALL passages in one prompt (returning `{"rankedIndices": [3, 1, 5, ...]}`). This is efficient but:

- Hard to optimize — the optimizer can't learn from individual scoring decisions
- Fragile — if the LLM returns malformed indices, the whole batch fails
- No granular traces — one trace entry for N passages vs. N trace entries
Individual scoring gives the optimizer N traces per query, each showing "for this question + this passage, the correct score was X." This is much richer signal for few-shot demo selection.
The cost tradeoff (N calls vs. 1) is acceptable because:
- N is typically 5-10 (after RRF dedup, before final top-K)
- Each call is very short (one passage, one score)
- The optimizer can reduce calls by learning to score accurately in fewer retries
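The per-passage calls are also independent, so they can run concurrently if latency matters. A sketch of what would replace the sequential `foreach` in `LmpReranker.ProcessAsync` (mind your endpoint's rate limits):

```csharp
// Issue all passage judgments in parallel, then pair them back with their chunks.
var judgments = await Task.WhenAll(results.Chunks.Select(chunk =>
    Predictor.PredictAsync(
        new PassageJudgmentInput(query.Text, chunk.Content),
        validate: r => LmpAssert.That(r,
            r => r.Score >= 1 && r.Score <= 5,
            "Score must be between 1 and 5"),
        maxRetries: 2,
        cancellationToken: cancellationToken)));

var scored = results.Chunks.Zip(judgments, (chunk, j) => (Chunk: chunk, j.Score)).ToList();
```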
### Why the survival/equipment domain?

The advanced-rag sample uses Example_Emergency_Survival_Kit.pdf and Example_GPS_Watch.md. Matching the domain means:

- You can test against the same Qdrant collection (shared ingestion)
- Training data questions are realistic for the corpus
- The sample tells a coherent end-to-end story
### Why BootstrapRandomSearch instead of MIPROv2?

- Simpler to understand for a first sample
- Faster (8 trials vs. MIPROv2's 3-phase process)
- Good enough for 12 training examples
- MIPROv2 can be shown as an "upgrade path" exercise
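The upgrade would be a one-line swap in Step 5 (constructor arguments below are hypothetical; check the LMP.Optimizers API for the real ones):

```csharp
// Hypothetical MIPROv2 parameters, shown for illustration only.
var optimizer = new MIPROv2(numCandidates: 8, numTrials: 20);
var optimizedModule = await optimizer.CompileAsync(
    new AdvancedRagModule(client, collection), trainSet, untypedMetric);
```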
## Glossary

| Term | Definition |
|---|---|
| MEDI | Microsoft.Extensions.DataIngestion — pipeline framework for ingestion + retrieval |
| RetrievalPipeline | MEDI class that orchestrates query processors → vector search → result processors |
| RetrievalQueryProcessor | MEDI abstract class for pre-search processing (e.g., query expansion) |
| RetrievalResultProcessor | MEDI abstract class for post-search processing (e.g., reranking) |
| RetrievalQuery | MEDI data class with Text, Variants, and Metadata |
| RetrievalChunk | MEDI data class with Content, Score, and Record metadata |
| RetrievalResults | MEDI data class with Chunks list and Metadata dictionary |
| RRF | Reciprocal Rank Fusion — algorithm to merge rankings from multiple queries |
| CRAG | Corrective RAG — quality gate that classifies retrieval confidence |
| Adapter | Class that extends a MEDI processor but delegates LLM calls to an LMP Predictor |
| Predictor | LMP class: typed LLM call with learnable instructions + demos |
| LmpModule | LMP class: composes predictors and defines ForwardAsync |
| [LmpSignature] | Attribute marking a record as an LLM output type (source-gen trigger) |
| Trace | LMP class: log of all predictor calls during execution |
| BootstrapRandomSearch | LMP optimizer: N independent BootstrapFewShot trials, keep best |
| ChainOfThought | LMP predictor variant: LLM reasons step-by-step before answering |
| LmpAssert | LMP validation: throws on failure, triggers retry with error feedback |
## Implementation Checklist

- Create project directory and `LMP.Samples.AdvancedRag.csproj`
- Create `Types.cs` (4 input records + 4 `[LmpSignature]` output records)
- Create `Adapters/LmpQueryExpander.cs` (extends `RetrievalQueryProcessor`)
- Create `Adapters/LmpReranker.cs` (extends `RetrievalResultProcessor`)
- Create `Adapters/LmpCragValidator.cs` (extends `RetrievalResultProcessor`)
- Create `AdvancedRagModule.cs` (owns MEDI pipeline + LMP predictors)
- Create `IngestedChunk.cs` (copy from advanced-rag)
- Create `data/train.jsonl` (12 survival/equipment Q&A examples)
- Create `data/dev.jsonl` (5 held-out evaluation examples)
- Create `Program.cs` (full lifecycle: predict → eval → optimize → save)
- Resolve the `PropagateTraceToAdapters()` TODO (add a `Trace` property to adapters)
- Set user secrets (AzureOpenAI endpoint + deployment)
- Run the advanced-rag ingestion to populate Qdrant
- `dotnet build` (verify the source generator runs clean)
- `dotnet run` (verify baseline → optimization → improvement)