Goal: Build an advanced RAG sample showing how LMP (Language Model Programs) and MEDI (Microsoft.Extensions.DataIngestion) coexist and layer naturally. MEDI handles the pipeline infrastructure (ingestion, vector search, RRF, tree traversal). LMP handles the intelligence layer (optimizable typed predictors for every LLM call). LMP optimizers then tune the whole thing end-to-end.

Domain: Survival/equipment documentation (matching the advanced-rag sample).

Contents:
- The Big Picture — How LMP and MEDI Coexist
- Architecture Deep Dive
- The Three Integration Seams
- Project Structure
- Step 0: Project Setup
- Step 1: Signature Types
- Step 2: LMP-Powered MEDI Adapters
- Step 3: The AdvancedRagModule
- Step 4: Training & Dev Data
- Step 5: Program.cs
- Step 6: Build & Run
- Design Decisions & Rationale
- Glossary
## The Big Picture — How LMP and MEDI Coexist

The advanced-rag sample has roughly seven processors that make LLM calls (query expansion, entity extraction, topic classification, reranking, CRAG, etc.). Each one calls `IChatClient.GetResponseAsync()` directly with a hand-written prompt string. If query expansion produces bad results, you hand-tune the prompt; there is no systematic way to improve these LLM calls.
LMP solves this: every LLM call becomes a `Predictor<TInput, TOutput>` with learnable parameters (instructions plus few-shot demos). An optimizer can discover better prompts and examples automatically from labeled data.
```
┌─────────────────────────────────────────────────────────┐
│ YOUR APPLICATION │
│ │
│ ┌───────────────────────────────────────────────────┐ │
│ │ LMP Layer (Intelligence) │ │
│ │ │ │
│ │ • Typed Predictors for every LLM call │ │
│ │ • Learnable instructions + few-shot demos │ │
│ │ • Validation guards (LmpAssert) │ │
│ │ • End-to-end optimization (BootstrapFewShot) │ │
│ │ • Save/load optimized state │ │
│ └──────────┬────────────────────────────────────────┘ │
│ │ LMP predictors power MEDI processors │
│ ┌──────────▼────────────────────────────────────────┐ │
│ │ MEDI Layer (Infrastructure) │ │
│ │ │ │
│ │ • Ingestion pipeline (PDF→chunks→enrichment→store)│ │
│ │ • Retrieval pipeline (expand→search→RRF→rerank) │ │
│ │ • Vector store orchestration (Qdrant) │ │
│ │ • Tree traversal (RAPTOR hierarchies) │ │
│ │ • Reciprocal Rank Fusion (multi-query merge) │ │
│ │ • Metadata propagation & diagnostics │ │
│ └───────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
LMP does NOT replace MEDI. MEDI handles pipeline orchestration, vector search plumbing, RRF deduplication, tree traversal, metadata propagation — things that have nothing to do with LLM prompts. LMP replaces the raw IChatClient calls inside MEDI processors with typed, optimizable predictors.
| Stays in MEDI (Infrastructure) | Moves to LMP (Intelligence) |
|---|---|
| IngestionPipeline orchestration | Entity extraction LLM call |
| RetrievalPipeline orchestration | Topic classification LLM call |
| SemanticSimilarityChunker | Hypothetical query generation LLM call |
| Qdrant vector store operations | Query expansion LLM call |
| Reciprocal Rank Fusion algorithm | LLM reranking scoring call |
| Tree traversal grouping/sorting | CRAG confidence scoring LLM call |
| PDF reading (PdfPig) | Answer generation LLM call |
| OpenTelemetry diagnostics | Self-RAG critique LLM call |
| Metadata propagation on chunks | HyDE hypothetical answer LLM call |
Rule of thumb: if it calls `IChatClient`, it's a candidate for LMP. If it's data plumbing, it stays in MEDI.
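To make the contrast concrete, here is the same query-expansion call in both styles. This is a minimal sketch using the `QuestionInput`/`ExpandedQueries` types defined in Step 1; it assumes an `IChatClient client` and a `string question` are in scope, and the raw prompt text is illustrative.

```csharp
using Microsoft.Extensions.AI;

// Before: a hand-written prompt string against IChatClient.
var response = await client.GetResponseAsync(
    $"Generate three alternative phrasings of this question: {question}");
// From here you parse response.Text by hand; improving quality means editing the string.

// After: the same call as a typed, optimizable LMP predictor (types from Step 1).
var expander = new Predictor<QuestionInput, ExpandedQueries>(client) { Name = "expand_query" };
ExpandedQueries expanded = await expander.PredictAsync(new QuestionInput(question));
// expanded.Alternatives is a string[]; the instructions and demos are learnable state.
```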
## Architecture Deep Dive

### Before (advanced-rag today)

```
User Query
│
▼
┌──────────────────────────────────────────────┐
│ MEDI RetrievalPipeline │
│ │
│ QueryProcessors (pre-search): │
│ MultiQueryExpander ──── IChatClient ──→ prompt string
│ TreeSearchRetriever ── metadata annotation │
│ │
│ Vector Search: Qdrant ──→ RRF merge │
│ │
│ ResultProcessors (post-search): │
│ LlmReranker ──── IChatClient ──→ prompt string
│ CragValidator ── IChatClient ──→ prompt string
└──────────────────────────────────────────────┘
│
▼
SelfRagOrchestrator ──── IChatClient ──→ prompt string
│
▼
Answer
```
Every `IChatClient` arrow is a hardcoded prompt string. No typed inputs, no validation, no learning, no optimization.
### After (with LMP)

```
User Query
│
▼
┌──────────────────────────────────────────────────────────────┐
│ AdvancedRagModule : LmpModule<QuestionInput, GroundedAnswer> │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ MEDI RetrievalPipeline (infrastructure) │ │
│ │ │ │
│ │ QueryProcessors: │ │
│ │ LmpQueryExpander ─── Predictor<QI, EQ> (learnable) │ │
│ │ TreeSearchRetriever ─ metadata (no LLM) │ │
│ │ │ │
│ │ Vector Search: Qdrant → RRF merge (no LLM) │ │
│ │ │ │
│ │ ResultProcessors: │ │
│ │ LmpReranker ─── Predictor<RI, RJ> (learnable) │ │
│ │ LmpCragValidator ─ Predictor<CI, CC> (learnable) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ RetrievalResults │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ ChainOfThought<AnswerInput, GroundedAnswer> │ │
│ │ (learnable: instructions + demos + reasoning) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ All predictors traced ──┼── Trace records every LLM call │
│ All predictors learnable┼── Optimizer injects demos │
│ All predictors validated┼── LmpAssert guards output ranges │
└──────────────────────────┼──────────────────────────────────────┘
▼
GroundedAnswer { Answer, Citations }
│
BootstrapRandomSearch optimizes ALL predictors
```
The AdvancedRagModule owns both:
- The MEDI RetrievalPipeline (for infrastructure)
- The LMP Predictors inside the MEDI processors (for intelligence)
When you run `optimizer.CompileAsync(module, trainSet, metric)`, the optimizer:
- Runs the full pipeline (MEDI infrastructure + LMP predictors) on training data
- Collects traces from every predictor call (expand, rerank, CRAG, answer)
- Identifies which traces led to high-scoring answers
- Injects those successful traces as few-shot demos into each predictor
- Returns an optimized module where every LLM call has learned from data
The MEDI pipeline doesn't change. The LLM calls inside it get smarter.
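In miniature, the whole optimization step is the two calls below (Step 5 shows the full flow; the argument values match what's used there, and `module`, `trainSet`, and `untypedMetric` are defined as in Step 5):

```csharp
var optimizer = new BootstrapRandomSearch(numTrials: 8, maxDemos: 4, metricThreshold: 0.3f);
var optimizedModule = await optimizer.CompileAsync(module, trainSet, untypedMetric);
// optimizedModule's four predictors (expand_query, rerank, crag, answer) now
// carry the demos and instructions that scored best across the trials.
```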
## The Three Integration Seams

There are exactly three places where LMP meets MEDI. Each is an adapter class that subclasses a MEDI processor base but uses an LMP `Predictor` internally.
### Seam 1: RetrievalQueryProcessor (pre-search)

```
MEDI calls:    processor.ProcessAsync(RetrievalQuery query)
Adapter does:  predictor.PredictAsync(typed input from query)
               writes result back into query.Variants / query.Metadata
Used for:      Query expansion, HyDE
```
### Seam 2: RetrievalResultProcessor (post-search)

```
MEDI calls:    processor.ProcessAsync(RetrievalResults results, RetrievalQuery query)
Adapter does:  predictor.PredictAsync(typed input per chunk)
               updates results.Chunks ordering / results.Metadata
Used for:      LLM reranking, CRAG validation
```
### Seam 3: Module wraps pipeline

```
Module calls:  _pipeline.RetrieveAsync(collection, query, topK, ...)
               → runs all MEDI processors (which internally use LMP predictors)
               → returns RetrievalResults
Then calls:    _answer.PredictAsync(context built from results)
               → LMP ChainOfThought for answer generation
Used for:      Answer generation (the only LMP predictor not inside a MEDI processor)
```
The adapter pattern preserves everything MEDI gives you for free:

- `RetrievalPipeline` orchestrates processors in order
- RRF deduplication runs between query processors and result processors
- Tree traversal runs after vector search but before result processors
- OpenTelemetry diagnostics (`ActivitySource` with structured tags)
- Logging (`ILoggerFactory` integration)
- The `UseXxx()` fluent builder DI pattern

You get all of this without reimplementing it in LMP.
## Project Structure

```
LMP.Samples.AdvancedRag/
├── LMP.Samples.AdvancedRag.csproj
├── Program.cs ← Entry point: ingest → predict → evaluate → optimize
├── Types.cs ← All [LmpSignature] output types + input records
├── Adapters/
│ ├── LmpQueryExpander.cs ← MEDI QueryProcessor backed by LMP Predictor
│ ├── LmpReranker.cs ← MEDI ResultProcessor backed by LMP Predictor
│ └── LmpCragValidator.cs ← MEDI ResultProcessor backed by LMP Predictor
├── AdvancedRagModule.cs ← LmpModule wrapping MEDI RetrievalPipeline
├── IngestedChunk.cs ← Vector store record schema (from advanced-rag)
└── data/
    ├── train.jsonl              ← Training examples (survival/equipment Q&A)
    └── dev.jsonl                ← Dev evaluation examples
```
The actual documents (Example_Emergency_Survival_Kit.pdf, Example_GPS_Watch.md)
come from the wwwroot/Data/ directory — same as advanced-rag.
Total new files: 7 (.cs) + 2 (.jsonl) + 1 (.csproj)
## Step 0: Project Setup

```xml
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<UserSecretsId>lmp-samples-advanced-rag</UserSecretsId>
</PropertyGroup>
<!-- LMP library references (adjust paths to your lmp-dotnet clone) -->
<ItemGroup>
<ProjectReference Include="..\..\src\LMP.Modules\LMP.Modules.csproj" />
<ProjectReference Include="..\..\src\LMP.Optimizers\LMP.Optimizers.csproj" />
<ProjectReference Include="..\..\src\LMP.SourceGen\LMP.SourceGen.csproj"
OutputItemType="Analyzer"
ReferenceOutputAssembly="false" />
</ItemGroup>
<!-- MEDI: Ingestion + Retrieval pipelines -->
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.DataIngestion" Version="10.5.1-dev" />
<PackageReference Include="Microsoft.Extensions.DataIngestion.Markdig" Version="10.5.0-preview" />
<PackageReference Include="MEDIExtensions" Version="1.0.0-dev" />
</ItemGroup>
<!-- Vector store -->
<ItemGroup>
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Qdrant" Version="1.74.0-preview" />
</ItemGroup>
<!-- Azure OpenAI -->
<ItemGroup>
<PackageReference Include="Azure.AI.OpenAI" />
<PackageReference Include="Azure.Identity" />
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" />
<PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" />
</ItemGroup>
<!-- Copy data files to output -->
<ItemGroup>
<None Update="data\**\*" CopyToOutputDirectory="PreserveNewest" />
</ItemGroup>
</Project>
```

Note: The exact package versions depend on the pre-release feeds. Check `nuget.config` in the advanced-rag repo for the local package source.
## Step 1: Signature Types

Every `[LmpSignature]` record is an LLM output type. The source generator reads the `[Description]` attributes and the signature string to build typed prompts. The optimizer can then improve on this baseline with learned demos and instructions.
```csharp
using System.ComponentModel;
using LMP;
namespace LMP.Samples.AdvancedRag;
// ════════════════════════════════════════════════════════════
// INPUT TYPES (plain records — NOT [LmpSignature])
// ════════════════════════════════════════════════════════════
/// <summary>Module input: the user's question.</summary>
public record QuestionInput(
[property: Description("The user's natural language question")]
string Question);
/// <summary>Reranker input: a (question, passage) pair to judge.</summary>
public record PassageJudgmentInput(
[property: Description("The original user question")]
string Question,
[property: Description("A retrieved text passage to evaluate for relevance")]
string Passage);
/// <summary>CRAG input: a (question, top passages) pair for confidence assessment.</summary>
public record ConfidenceInput(
[property: Description("The original user question")]
string Question,
[property: Description("The top retrieved passages, separated by newlines")]
string TopPassages);
/// <summary>Answer generation input: question + ranked context.</summary>
public record AnswerInput(
[property: Description("The original user question")]
string Question,
[property: Description("Retrieved and ranked context passages, separated by ---")]
string Context)
{
public override string ToString()
=> $"Question: {Question}\n\nContext:\n{Context}";
}
// ════════════════════════════════════════════════════════════
// OUTPUT / SIGNATURE TYPES ([LmpSignature] — LLM outputs)
// ════════════════════════════════════════════════════════════
/// <summary>
/// Query expansion output. The LLM generates alternative search queries
/// to improve retrieval recall across the survival/equipment domain.
/// </summary>
[LmpSignature("Given a user question about survival equipment or emergency preparedness, generate alternative search queries that would help find relevant information. Each alternative should approach the topic from a different angle — synonyms, related concepts, or more specific/general phrasings.")]
public partial record ExpandedQueries
{
[Description("Three alternative phrasings of the question, each approaching it differently")]
public required string[] Alternatives { get; init; }
}
/// <summary>
/// Reranker output. The LLM scores a single passage for relevance.
/// Used by LmpReranker adapter inside the MEDI ResultProcessor pipeline.
/// </summary>
[LmpSignature("Given a question and a text passage about survival equipment or emergency preparedness, judge how relevant the passage is to answering the question.")]
public partial record PassageJudgment
{
[Description("Relevance score from 1 (not relevant at all) to 5 (directly answers the question)")]
public required int Score { get; init; }
}
/// <summary>
/// CRAG confidence output. The LLM assesses whether retrieved passages
/// are sufficient to answer the question confidently.
/// </summary>
[LmpSignature("Given a question and top retrieved passages, assess whether the passages provide enough information to confidently answer the question. Consider factual coverage, specificity, and directness.")]
public partial record ConfidenceClassification
{
[Description("Confidence level: 'correct' (passages directly answer the question), 'ambiguous' (partially relevant, may need refinement), or 'incorrect' (passages do not address the question)")]
public required string Confidence { get; init; }
[Description("Brief reasoning for the confidence assessment")]
public required string Reasoning { get; init; }
}
/// <summary>
/// Final answer output. The LLM produces a grounded answer with citations.
/// This is the module's top-level output type.
/// </summary>
[LmpSignature("Given a question about survival equipment or emergency preparedness and supporting context passages, generate a comprehensive answer that is fully grounded in the provided context. Cite specific passages to support your claims.")]
public partial record GroundedAnswer
{
[Description("A comprehensive answer derived from the context passages")]
public required string Answer { get; init; }
[Description("Direct quotes or close paraphrases from context that support the answer")]
public required string[] Citations { get; init; }
}
```

| Type | Used By | MEDI Seam |
|---|---|---|
| `QuestionInput` | Module input | — |
| `ExpandedQueries` | `LmpQueryExpander` adapter | Seam 1: QueryProcessor |
| `PassageJudgmentInput` + `PassageJudgment` | `LmpReranker` adapter | Seam 2: ResultProcessor |
| `ConfidenceInput` + `ConfidenceClassification` | `LmpCragValidator` adapter | Seam 2: ResultProcessor |
| `AnswerInput` + `GroundedAnswer` | `_answer` predictor in module | Seam 3: Module wraps pipeline |
## Step 2: LMP-Powered MEDI Adapters

These are the integration seam classes. Each one:

- Extends a MEDI processor base class (`RetrievalQueryProcessor` / `RetrievalResultProcessor`)
- Contains an LMP `Predictor<TIn, TOut>` with learnable state
- Bridges the MEDI `ProcessAsync()` contract to the LMP `PredictAsync()` contract
- Exposes its predictor via a public property so the `AdvancedRagModule` can register it with `GetPredictors()` for optimization
### Adapters/LmpQueryExpander.cs

Replaces `MultiQueryExpander` from MEDIExtensions. Same MEDI contract, but the LLM call is now a typed, optimizable predictor.

```csharp
using LMP;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion; // RetrievalQuery, RetrievalQueryProcessor
namespace LMP.Samples.AdvancedRag.Adapters;
/// <summary>
/// MEDI RetrievalQueryProcessor that uses an LMP Predictor for query expansion.
///
/// What MEDI sees: a standard QueryProcessor that populates query.Variants.
/// What LMP sees: a Predictor<QuestionInput, ExpandedQueries> with learnable state.
///
/// The MEDI RetrievalPipeline calls ProcessAsync() as part of its normal flow.
/// Internally, we delegate to the LMP predictor which has typed input/output,
/// validation, and learnable instructions + demos.
/// </summary>
public sealed class LmpQueryExpander : RetrievalQueryProcessor
{
/// <summary>
/// The LMP predictor powering this processor. Exposed so the AdvancedRagModule
/// can include it in GetPredictors() for optimization.
/// </summary>
public Predictor<QuestionInput, ExpandedQueries> Predictor { get; }
public LmpQueryExpander(IChatClient client)
{
Predictor = new Predictor<QuestionInput, ExpandedQueries>(client)
{
Name = "expand_query"
};
}
/// <summary>
/// MEDI contract: receive a RetrievalQuery, return it with Variants populated.
/// We bridge to the LMP predictor for the actual LLM call.
/// </summary>
public override async Task<RetrievalQuery> ProcessAsync(
RetrievalQuery query,
CancellationToken cancellationToken = default)
{
var result = await Predictor.PredictAsync(
new QuestionInput(query.Text),
validate: r =>
LmpAssert.That(r,
r => r.Alternatives is { Length: > 0 },
"Must generate at least one alternative query"),
maxRetries: 2,
cancellationToken: cancellationToken);
// Write LMP output back into the MEDI data model
query.Variants = [query.Text, .. (result.Alternatives ?? [])];
return query;
}
}
```

### Adapters/LmpReranker.cs

Replaces `LlmReranker` from MEDIExtensions. Instead of a single batch prompt that asks the LLM to rank all passages at once (fragile, hard to optimize), this adapter scores each passage individually with a typed predictor.

```csharp
using LMP;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion; // RetrievalResults, RetrievalResultProcessor, etc.
namespace LMP.Samples.AdvancedRag.Adapters;
/// <summary>
/// MEDI RetrievalResultProcessor that uses an LMP Predictor for reranking.
///
/// The original MEDIExtensions LlmReranker asks the LLM to rank all passages
/// in a single batch prompt. This adapter scores each passage individually,
/// giving the optimizer granular traces to learn from.
/// </summary>
public sealed class LmpReranker : RetrievalResultProcessor
{
private readonly int _maxResults;
/// <summary>
/// The LMP predictor powering this processor. Exposed for optimization.
/// </summary>
public Predictor<PassageJudgmentInput, PassageJudgment> Predictor { get; }
public LmpReranker(IChatClient client, int maxResults = 5)
{
_maxResults = maxResults;
Predictor = new Predictor<PassageJudgmentInput, PassageJudgment>(client)
{
Name = "rerank"
};
}
/// <summary>
/// MEDI contract: receive RetrievalResults + original query, return reranked results.
/// </summary>
public override async Task<RetrievalResults> ProcessAsync(
RetrievalResults results,
RetrievalQuery query,
CancellationToken cancellationToken = default)
{
if (results.Chunks.Count == 0)
return results;
// Score each chunk individually via the LMP predictor
var scored = new List<(RetrievalChunk Chunk, int Score)>();
foreach (var chunk in results.Chunks)
{
var judgment = await Predictor.PredictAsync(
new PassageJudgmentInput(query.Text, chunk.Content),
validate: r =>
LmpAssert.That(r,
r => r.Score >= 1 && r.Score <= 5,
"Score must be between 1 and 5"),
maxRetries: 2,
cancellationToken: cancellationToken);
scored.Add((chunk, judgment.Score));
}
// Sort by score descending, take top results
var reranked = scored
.OrderByDescending(x => x.Score)
.Take(_maxResults)
.ToList();
// Write back into MEDI data model
results.Chunks = reranked
.Select(x =>
{
x.Chunk.Score = x.Score; // Update the MEDI score field
return x.Chunk;
})
.ToList();
results.Metadata["reranked"] = true;
results.Metadata["reranked_count"] = results.Chunks.Count;
return results;
}
}
```

### Adapters/LmpCragValidator.cs

Replaces `CragValidator` from MEDIExtensions. The three-way routing logic (correct / ambiguous / incorrect) stays the same, but the LLM call that determines confidence is now an optimizable predictor.

```csharp
using LMP;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion;
namespace LMP.Samples.AdvancedRag.Adapters;
/// <summary>
/// MEDI RetrievalResultProcessor that uses an LMP Predictor for CRAG
/// (Corrective Retrieval-Augmented Generation) confidence assessment.
///
/// Routes on three paths:
/// "correct" → use results as-is
/// "ambiguous" → flag for refinement
/// "incorrect" → clear chunks, set low_confidence
/// </summary>
public sealed class LmpCragValidator : RetrievalResultProcessor
{
private readonly int _evaluateTopN;
/// <summary>
/// The LMP predictor powering this processor. Exposed for optimization.
/// </summary>
public Predictor<ConfidenceInput, ConfidenceClassification> Predictor { get; }
public LmpCragValidator(IChatClient client, int evaluateTopN = 3)
{
_evaluateTopN = evaluateTopN;
Predictor = new Predictor<ConfidenceInput, ConfidenceClassification>(client)
{
Name = "crag"
};
}
public override async Task<RetrievalResults> ProcessAsync(
RetrievalResults results,
RetrievalQuery query,
CancellationToken cancellationToken = default)
{
if (results.Chunks.Count == 0)
{
results.Metadata["crag_path"] = "incorrect";
results.Metadata["low_confidence"] = true;
return results;
}
// Build preview of top-N passages for the LLM to assess
var topPassages = string.Join("\n\n",
results.Chunks
.Take(_evaluateTopN)
.Select((c, i) => $"[{i + 1}] {c.Content[..Math.Min(c.Content.Length, 300)]}"));
var classification = await Predictor.PredictAsync(
new ConfidenceInput(query.Text, topPassages),
validate: r =>
LmpAssert.That(r,
r => r.Confidence is "correct" or "ambiguous" or "incorrect",
"Confidence must be 'correct', 'ambiguous', or 'incorrect'"),
maxRetries: 2,
cancellationToken: cancellationToken);
// Apply CRAG routing
results.Metadata["crag_path"] = classification.Confidence;
results.Metadata["crag_reasoning"] = classification.Reasoning;
switch (classification.Confidence)
{
case "incorrect":
results.Chunks.Clear();
results.Metadata["low_confidence"] = true;
break;
case "ambiguous":
results.Metadata["needs_followup"] = true;
break;
case "correct":
// Use results as-is
break;
}
return results;
}
}
```

| Without Adapters (raw MEDI) | With Adapters (MEDI + LMP) |
|---|---|
| Prompt strings in processor code | Typed [LmpSignature] with [Description] fields |
| No validation on LLM output | LmpAssert.That() with retry on failure |
| No learning from data | Predictor.Demos populated by optimizer |
| No instruction tuning | Predictor.Instructions evolved by MIPROv2 |
| No tracing of individual calls | Each PredictAsync recorded in Trace |
| Can't save/load tuned state | SaveStateAsync / ApplyStateAsync per predictor |
## Step 3: The AdvancedRagModule

This is the orchestration layer. It owns the MEDI RetrievalPipeline (with LMP adapters inside), plus a standalone LMP predictor for answer generation.

The class must be `partial` so the source generator can emit supporting members such as `CloneCore()`. `GetPredictors()` is overridden manually here, because the predictors live inside adapter objects where the generator can't discover them (see the note in the code and the rationale in Design Decisions). The override returns ALL four predictors (from the three adapters plus the answer predictor), enabling the optimizer to tune them all.

```csharp
using LMP;
using LMP.Samples.AdvancedRag.Adapters;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DataIngestion;
using Microsoft.Extensions.VectorData;
namespace LMP.Samples.AdvancedRag;
/// <summary>
/// Advanced RAG module that composes:
/// • MEDI RetrievalPipeline (infrastructure: vector search, RRF, tree traversal)
/// • LMP Predictors (intelligence: query expansion, reranking, CRAG, answer gen)
///
/// The module owns both layers. When optimized, all four predictors get
/// learned instructions and few-shot demos.
///
/// IMPORTANT: Must be partial for source generator to emit GetPredictors() + CloneCore().
/// </summary>
public partial class AdvancedRagModule : LmpModule<QuestionInput, GroundedAnswer>
{
// ── Infrastructure (MEDI) ──────────────────────────────
private readonly RetrievalPipeline _pipeline;
private readonly VectorStoreCollection<Guid, IngestedChunk> _collection;
private readonly int _topK;
// ── Intelligence (LMP) — exposed via adapters ──────────
// These are the LMP-powered MEDI processors. Their Predictor
// properties are registered with GetPredictors() by the source gen.
private readonly LmpQueryExpander _expander;
private readonly LmpReranker _reranker;
private readonly LmpCragValidator _cragValidator;
// ── Standalone LMP predictor (not inside a MEDI processor) ──
private readonly ChainOfThought<AnswerInput, GroundedAnswer> _answer;
public AdvancedRagModule(
IChatClient client,
VectorStoreCollection<Guid, IngestedChunk> collection,
int topK = 5)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(collection);
Client = client;
_collection = collection;
_topK = topK;
// Create LMP-powered MEDI adapters
_expander = new LmpQueryExpander(client);
_reranker = new LmpReranker(client, maxResults: topK);
_cragValidator = new LmpCragValidator(client, evaluateTopN: 3);
// Wire adapters into the MEDI RetrievalPipeline
_pipeline = new RetrievalPipeline();
_pipeline.QueryProcessors.Add(_expander);
_pipeline.ResultProcessors.Add(_reranker);
_pipeline.ResultProcessors.Add(_cragValidator);
// Standalone answer predictor with chain-of-thought
_answer = new ChainOfThought<AnswerInput, GroundedAnswer>(client)
{
Name = "answer"
};
}
// ── Source generator needs to know about ALL predictors ──
// We override GetPredictors() manually since the predictors live
// inside adapter objects, not directly as fields on this class.
//
// NOTE: If the source generator can discover predictors in adapter
// fields automatically (via the [Predict] attribute or similar),
// you could simplify this. For now, explicit override is safest.
public override IReadOnlyList<(string Name, IPredictor Predictor)> GetPredictors()
{
return
[
(_expander.Predictor.Name, _expander.Predictor),
(_reranker.Predictor.Name, _reranker.Predictor),
(_cragValidator.Predictor.Name, _cragValidator.Predictor),
(_answer.Name, _answer),
];
}
/// <summary>
/// Full RAG pipeline:
/// 1. MEDI RetrievalPipeline handles: expand → search → RRF → rerank → CRAG
/// (with LMP predictors powering the LLM calls inside)
/// 2. LMP ChainOfThought handles: grounded answer generation
/// </summary>
public override async Task<GroundedAnswer> ForwardAsync(
QuestionInput input,
CancellationToken cancellationToken = default)
{
// ── Step 1: MEDI Retrieval Pipeline ──────────────────────────
// This single call runs the ENTIRE retrieval pipeline:
// a. LmpQueryExpander.ProcessAsync() → expands query via LMP predictor
// b. Qdrant vector search per variant → RRF merge
// c. LmpReranker.ProcessAsync() → scores each chunk via LMP predictor
// d. LmpCragValidator.ProcessAsync() → confidence gate via LMP predictor
//
// All LMP predictor calls are traced if this.Trace is set.
// The optimizer collects these traces for demo learning.
PropagateTraceToAdapters();
var results = await _pipeline.RetrieveAsync(
_collection,
input.Question,
topK: _topK,
contentSelector: chunk => chunk.Text,
cancellationToken: cancellationToken);
// Handle CRAG "incorrect" path — no confident results
if (results.Chunks.Count == 0)
{
return new GroundedAnswer
{
Answer = "I could not find sufficiently relevant information to answer this question confidently.",
Citations = []
};
}
// ── Step 2: Answer Generation ────────────────────────────────
// Build context from the MEDI pipeline's reranked, CRAG-validated results.
// ChainOfThought makes the LLM reason step-by-step before answering.
var context = string.Join("\n\n---\n\n",
results.Chunks.Select(c => c.Content));
var answer = await _answer.PredictAsync(
new AnswerInput(input.Question, context),
trace: Trace,
validate: result =>
{
LmpAssert.That(result,
r => !string.IsNullOrWhiteSpace(r.Answer),
"Answer must not be empty");
LmpAssert.That(result,
r => r.Citations is { Length: > 0 },
"Must include at least one citation from context");
},
maxRetries: 2,
cancellationToken: cancellationToken);
return answer;
}
/// <summary>
/// Passes the module's Trace down to adapters so every LMP predictor
/// call (expand, rerank, CRAG) is recorded for optimization.
/// </summary>
private void PropagateTraceToAdapters()
{
// The adapters' Predictor.PredictAsync() calls need a trace reference.
// Since the MEDI pipeline calls adapter.ProcessAsync() (not PredictAsync
// directly), we need to inject the trace before the pipeline runs.
//
// Implementation options:
// a. Adapters accept a Trace property and pass it to PredictAsync
// b. Adapters check a shared field on the module
// c. Use AsyncLocal<Trace> for ambient tracing
//
// For simplicity, adapters should expose a settable Trace property.
// TODO: Add `public Trace? Trace { get; set; }` to each adapter
// and pass it in PredictAsync calls.
}
}
```
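One way to resolve the TODO above (a sketch, not the only option): give each adapter a settable `Trace` property and forward it into `PredictAsync`, mirroring how `ForwardAsync` passes `trace: Trace` to the answer predictor. This assumes the adapter's `PredictAsync` accepts the same optional `trace:` argument.

```csharp
// In each adapter (LmpQueryExpander shown; LmpReranker and LmpCragValidator
// get the same property):
public Trace? Trace { get; set; }

// ...then thread it through the adapter's PredictAsync call:
var result = await Predictor.PredictAsync(
    new QuestionInput(query.Text),
    trace: Trace,                       // forwarded from the module
    validate: r => LmpAssert.That(r,
        r => r.Alternatives is { Length: > 0 },
        "Must generate at least one alternative query"),
    maxRetries: 2,
    cancellationToken: cancellationToken);

// ...and fill in the module's PropagateTraceToAdapters():
private void PropagateTraceToAdapters()
{
    _expander.Trace = Trace;
    _reranker.Trace = Trace;
    _cragValidator.Trace = Trace;
}
```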
### IngestedChunk.cs

This is the vector store schema — same as in advanced-rag. Shown here for completeness; copy from the advanced-rag repo.

```csharp
using Microsoft.Extensions.VectorData;
namespace LMP.Samples.AdvancedRag;
/// <summary>
/// Vector store record representing an ingested and enriched document chunk.
/// Schema matches the advanced-rag sample's Qdrant collection.
/// </summary>
public sealed class IngestedChunk
{
public const string CollectionName = "data-AdvancedRag-chunks";
public const int VectorDimensions = 1536;
public const string VectorDistanceFunction = "CosineSimilarity";
[VectorStoreKey]
public Guid Key { get; set; }
[VectorStoreData]
public string DocumentId { get; set; } = "";
[VectorStoreData]
public string Text { get; set; } = "";
[VectorStoreData]
public string Context { get; set; } = "";
// Entity metadata (from EntityExtractionProcessor)
[VectorStoreData]
public string EntitiesPeople { get; set; } = "";
[VectorStoreData]
public string EntitiesOrganizations { get; set; } = "";
[VectorStoreData]
public string EntitiesTechnologies { get; set; } = "";
[VectorStoreData]
public string EntitiesVersions { get; set; } = "";
// Topic metadata (from TopicClassificationProcessor)
[VectorStoreData]
public string TopicPrimary { get; set; } = "";
[VectorStoreData]
public string TopicSecondary { get; set; } = "";
// Tree metadata (from TreeIndexProcessor)
[VectorStoreData]
public int Level { get; set; }
[VectorStoreData]
public string ParentId { get; set; } = "";
// Hypothetical query metadata
[VectorStoreData]
public string ChunkType { get; set; } = "";
[VectorStoreData]
public string ParentChunkId { get; set; } = "";
// Vector embedding
[VectorStoreVector(VectorDimensions)]
public ReadOnlyMemory<float> Vector { get; set; }
}
```

## Step 4: Training & Dev Data

The advanced-rag sample processes Example_Emergency_Survival_Kit.pdf and
Example_GPS_Watch.md. Training data should be Q&A pairs about this content.
### data/train.jsonl

12 training examples. The optimizer runs the full MEDI+LMP pipeline on these and collects successful traces.

```jsonl
{"input": {"Question": "What items should be included in a basic emergency survival kit?"}, "label": {"Answer": "A basic emergency survival kit should include water (one gallon per person per day for at least three days), non-perishable food (at least a three-day supply), a battery-powered or hand-crank radio, a flashlight with extra batteries, a first aid kit, a whistle to signal for help, dust masks, plastic sheeting and duct tape for sheltering in place, moist towelettes and garbage bags for personal sanitation, a wrench or pliers to turn off utilities, a manual can opener, and local maps.", "Citations": ["one gallon per person per day for at least three days", "battery-powered or hand-crank radio, a flashlight with extra batteries, a first aid kit"]}}
{"input": {"Question": "How much water should you store for emergency preparedness?"}, "label": {"Answer": "You should store at least one gallon of water per person per day for a minimum of three days, for both drinking and sanitation purposes. Consider storing more for hot climates, pregnant women, or sick individuals.", "Citations": ["one gallon per person per day for at least three days", "for both drinking and sanitation"]}}
{"input": {"Question": "How does the GPS watch track your location?"}, "label": {"Answer": "The GPS watch uses satellite-based Global Positioning System technology to track your location. It receives signals from multiple GPS satellites to calculate your precise coordinates, elevation, and movement data. The watch can display your current position, track your route, and provide navigation back to a saved waypoint.", "Citations": ["satellite-based Global Positioning System technology", "receives signals from multiple GPS satellites"]}}
{"input": {"Question": "What first aid supplies should be in a survival kit?"}, "label": {"Answer": "A survival kit's first aid supplies should include adhesive bandages in various sizes, sterile gauze pads, adhesive tape, elastic bandages, antiseptic wipes, antibiotic ointment, burn cream, scissors, tweezers, a thermometer, pain relievers, anti-diarrhea medication, and any personal prescription medications.", "Citations": ["adhesive bandages", "sterile gauze pads", "antiseptic wipes", "personal prescription medications"]}}
{"input": {"Question": "What is the battery life of the GPS watch?"}, "label": {"Answer": "The GPS watch battery life varies by mode. In standard watch mode with basic features, battery life extends significantly. In full GPS tracking mode with continuous satellite communication, battery life is reduced. Power-saving modes that reduce GPS polling frequency can extend battery life during extended outdoor activities.", "Citations": ["battery life varies by mode", "Power-saving modes that reduce GPS polling frequency"]}}
{"input": {"Question": "How do you purify water in an emergency situation?"}, "label": {"Answer": "In an emergency, water can be purified by boiling it for at least one minute (three minutes at elevations above 6,500 feet), using water purification tablets, using a portable water filter, or adding 8 drops of unscented household bleach per gallon and waiting 30 minutes. Always filter cloudy water through a clean cloth before treating it.", "Citations": ["boiling it for at least one minute", "water purification tablets", "8 drops of unscented household bleach per gallon"]}}
{"input": {"Question": "What features does the GPS watch compass offer?"}, "label": {"Answer": "The GPS watch compass provides both magnetic and GPS-based bearing information. It displays cardinal and intercardinal directions, supports bearing lock for following a specific heading, and can be calibrated for local magnetic declination. The compass integrates with route navigation to show the direction to your next waypoint.", "Citations": ["magnetic and GPS-based bearing information", "bearing lock for following a specific heading"]}}
{"input": {"Question": "What type of food should be stored for emergencies?"}, "label": {"Answer": "Store non-perishable food that requires no refrigeration, minimal preparation, and little or no water to cook. Good choices include canned meats and fish, canned fruits and vegetables, protein bars, granola bars, dried fruit, nuts, peanut butter, crackers, and ready-to-eat canned soups. Ensure you have a manual can opener. Replace food annually to maintain freshness.", "Citations": ["non-perishable food that requires no refrigeration", "canned meats and fish", "Replace food annually"]}}
{"input": {"Question": "How do you use the GPS watch for route navigation?"}, "label": {"Answer": "To use route navigation on the GPS watch, first save waypoints at key locations along your planned route. Then create a route by connecting waypoints in sequence. During navigation, the watch displays the direction and distance to the next waypoint, your current bearing, and estimated time of arrival. Breadcrumb tracking records your actual path for backtracking.", "Citations": ["save waypoints at key locations", "displays the direction and distance to the next waypoint", "Breadcrumb tracking"]}}
{"input": {"Question": "What should you do with your emergency kit every six months?"}, "label": {"Answer": "Every six months, review and update your emergency survival kit. Check expiration dates on food, water, medications, and batteries and replace any that are expired or near expiration. Test flashlights and radio equipment. Update personal documents and contact information. Adjust contents for seasonal needs and any changes in family size or medical requirements.", "Citations": ["Check expiration dates on food, water, medications, and batteries", "Update personal documents and contact information"]}}
{"input": {"Question": "Can the GPS watch measure elevation and barometric pressure?"}, "label": {"Answer": "Yes, the GPS watch includes an altimeter that measures elevation using both GPS satellite data and a built-in barometric pressure sensor. The barometric altimeter provides more accurate real-time elevation readings, while GPS altitude is used for calibration. The barometric sensor can also display weather trend information based on pressure changes.", "Citations": ["altimeter that measures elevation using both GPS satellite data and a built-in barometric pressure sensor", "weather trend information based on pressure changes"]}}
{"input": {"Question": "How do you shelter in place during a chemical emergency?"}, "label": {"Answer": "To shelter in place during a chemical emergency, go indoors immediately and close all windows and doors. Turn off heating, ventilation, and air conditioning systems. Use plastic sheeting and duct tape to seal windows, doors, and vents. Move to an interior room with few windows on an upper floor if possible. Listen to emergency broadcasts for instructions on when it is safe to leave.", "Citations": ["plastic sheeting and duct tape to seal windows, doors, and vents", "interior room with few windows", "Listen to emergency broadcasts"]}}5 held-out examples for evaluation. These are NOT seen during optimization.
{"input": {"Question": "What communication devices should be in a survival kit?"}, "label": {"Answer": "A survival kit should include a battery-powered or hand-crank NOAA Weather Radio for emergency broadcasts, a whistle for signaling rescuers, and a fully charged cell phone with a portable battery charger. Consider including a two-way radio for areas without cell coverage. Keep extra batteries for all electronic devices.", "Citations": ["battery-powered or hand-crank radio", "whistle to signal for help", "extra batteries"]}}
{"input": {"Question": "Is the GPS watch waterproof?"}, "label": {"Answer": "The GPS watch is designed to be water-resistant and can withstand exposure to rain, splashing, and brief immersion. It is rated for use during water-based outdoor activities. However, it should not be used for deep-water diving as prolonged submersion beyond its rated depth may cause damage.", "Citations": ["water-resistant", "rated for use during water-based outdoor activities"]}}
{"input": {"Question": "How do you signal for help in a wilderness emergency?"}, "label": {"Answer": "Signal for help using a whistle (three blasts is the universal distress signal), a signal mirror to reflect sunlight toward aircraft or distant rescuers, bright-colored clothing or fabric spread on the ground, and a flashlight at night. Build a signal fire with green vegetation to create visible smoke during the day. The GPS watch can also share your coordinates for rescue teams.", "Citations": ["whistle", "three blasts is the universal distress signal", "signal mirror"]}}
{"input": {"Question": "How do you set waypoints on the GPS watch?"}, "label": {"Answer": "To set a waypoint on the GPS watch, navigate to the GPS/Navigation menu, select 'Mark Waypoint' or press the dedicated waypoint button. The watch captures your current GPS coordinates and allows you to name the waypoint for easy identification. You can also enter coordinates manually for a known location. Saved waypoints can be organized into groups and used for route creation.", "Citations": ["navigate to the GPS/Navigation menu", "captures your current GPS coordinates", "enter coordinates manually"]}}
{"input": {"Question": "What documents should be kept in an emergency kit?"}, "label": {"Answer": "Keep copies of important family documents in a waterproof container in your emergency kit, including identification (driver's licenses, passports), insurance policies, bank account records, medical records and prescriptions, proof of address, and emergency contact information. Include both physical copies and copies on a USB drive.", "Citations": ["copies of important family documents in a waterproof container", "identification", "insurance policies", "emergency contact information"]}}This demonstrates the full lifecycle: setup → ingest (MEDI) → predict (MEDI+LMP) → evaluate baseline → optimize (LMP) → evaluate optimized → save/load.
using Azure.AI.OpenAI;
using Azure.Identity;
using LMP;
using LMP.Optimizers;
using LMP.Samples.AdvancedRag;
using LMP.Samples.AdvancedRag.Adapters;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.VectorData;
// ══════════════════════════════════════════════════════════════
// 1. SETUP
// ══════════════════════════════════════════════════════════════
var config = new ConfigurationBuilder()
.AddUserSecrets<Program>()
.Build();
string endpoint = config["AzureOpenAI:Endpoint"]
?? throw new InvalidOperationException("Set AzureOpenAI:Endpoint in user secrets");
string deployment = config["AzureOpenAI:Deployment"]
?? throw new InvalidOperationException("Set AzureOpenAI:Deployment in user secrets");
IChatClient client = new AzureOpenAIClient(
new Uri(endpoint), new DefaultAzureCredential())
.GetChatClient(deployment)
.AsIChatClient();
// Qdrant vector store collection (assumes documents already ingested
// by the advanced-rag sample's MEDI ingestion pipeline).
//
// For a standalone demo without Qdrant, swap this with an in-memory
// vector store — the AdvancedRagModule doesn't care.
var qdrantClient = new Qdrant.Client.QdrantClient("localhost", 6334);
var vectorStore = new Microsoft.SemanticKernel.Connectors.Qdrant.QdrantVectorStore(qdrantClient);
var collection = vectorStore.GetCollection<Guid, IngestedChunk>(IngestedChunk.CollectionName);
await collection.EnsureCollectionExistsAsync();
Console.WriteLine("╔══════════════════════════════════════════════════════════╗");
Console.WriteLine("║ LMP + MEDI Advanced RAG ║");
Console.WriteLine("║ MEDI pipeline infrastructure + LMP optimizable intel ║");
Console.WriteLine("╚══════════════════════════════════════════════════════════╝");
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 2. SINGLE PREDICTION: Full MEDI+LMP pipeline in action
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Single Prediction ──────────────────────────────────");
Console.WriteLine();
var module = new AdvancedRagModule(client, collection);
var question = new QuestionInput(
"What items should I pack for emergency water purification?");
Console.WriteLine($"Q: {question.Question}");
Console.WriteLine();
var result = await module.ForwardAsync(question);
Console.WriteLine($"A: {result.Answer}");
Console.WriteLine();
Console.WriteLine("Citations:");
foreach (var citation in result.Citations)
Console.WriteLine($" • {citation}");
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 3. BASELINE EVALUATION
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Baseline Evaluation ────────────────────────────────");
Console.WriteLine();
var dataDir = Path.Combine(AppContext.BaseDirectory, "data");
var trainSet = Example.LoadFromJsonl<QuestionInput, GroundedAnswer>(
Path.Combine(dataDir, "train.jsonl"));
var devSet = Example.LoadFromJsonl<QuestionInput, GroundedAnswer>(
Path.Combine(dataDir, "dev.jsonl"));
Console.WriteLine($"Train: {trainSet.Count} examples | Dev: {devSet.Count} examples");
Console.WriteLine();
// Metric: keyword overlap + citation bonus
Func<GroundedAnswer, GroundedAnswer, float> metric = (predicted, expected) =>
{
if (string.IsNullOrWhiteSpace(predicted.Answer) ||
string.IsNullOrWhiteSpace(expected.Answer))
return 0f;
var stopWords = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"the", "a", "an", "is", "are", "was", "were", "in", "on", "at",
"to", "for", "of", "with", "and", "or", "but", "it", "its",
"by", "from", "as", "be", "this", "that", "can", "will",
"should", "your", "you", "have", "has", "not", "also"
};
var expectedWords = expected.Answer
.Split([' ', ',', '.', '!', '?', ';', ':', '(', ')', '-', '"'],
StringSplitOptions.RemoveEmptyEntries)
.Where(w => w.Length > 2 && !stopWords.Contains(w))
.Distinct(StringComparer.OrdinalIgnoreCase)
.ToHashSet(StringComparer.OrdinalIgnoreCase);
if (expectedWords.Count == 0) return 0f;
var matchCount = expectedWords.Count(kw =>
predicted.Answer.Contains(kw, StringComparison.OrdinalIgnoreCase));
float keywordScore = (float)matchCount / expectedWords.Count;
float citationBonus = predicted.Citations is { Length: > 0 } ? 0.1f : 0f;
return Math.Min(keywordScore + citationBonus, 1.0f);
};
var baselineModule = new AdvancedRagModule(client, collection);
var baselineResult = await Evaluator.EvaluateAsync(baselineModule, devSet, metric);
Console.WriteLine($"Baseline: {baselineResult.AverageScore:P1}");
Console.WriteLine($" Min: {baselineResult.MinScore:P1} Max: {baselineResult.MaxScore:P1}");
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 4. OPTIMIZATION
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Optimizing (BootstrapRandomSearch, 8 trials) ────────");
Console.WriteLine();
var untypedMetric = Metric.Create(metric);
var optimizer = new BootstrapRandomSearch(
numTrials: 8,
maxDemos: 4,
metricThreshold: 0.3f);
var optimizedModule = await optimizer.CompileAsync(
new AdvancedRagModule(client, collection),
trainSet,
untypedMetric);
// ══════════════════════════════════════════════════════════════
// 5. OPTIMIZED EVALUATION
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Optimized Evaluation ───────────────────────────────");
Console.WriteLine();
var optimizedResult = await Evaluator.EvaluateAsync(optimizedModule, devSet, metric);
Console.WriteLine($"Optimized: {optimizedResult.AverageScore:P1}");
Console.WriteLine($" Min: {optimizedResult.MinScore:P1} Max: {optimizedResult.MaxScore:P1}");
Console.WriteLine();
Console.WriteLine($"Improvement: {baselineResult.AverageScore:P1} → {optimizedResult.AverageScore:P1}");
Console.WriteLine();
// Show what the optimizer learned for each predictor
Console.WriteLine("─── Learned Parameters ─────────────────────────────────");
Console.WriteLine();
foreach (var (name, predictor) in optimizedModule.GetPredictors())
{
Console.WriteLine($" [{name}]");
Console.WriteLine($" Demos: {predictor.Demos.Count}");
Console.WriteLine($" Instructions: {Truncate(predictor.Instructions, 80)}");
}
Console.WriteLine();
// ══════════════════════════════════════════════════════════════
// 6. SAVE & RELOAD
// ══════════════════════════════════════════════════════════════
Console.WriteLine("─── Save & Reload ──────────────────────────────────────");
Console.WriteLine();
var statePath = Path.Combine(Path.GetTempPath(), "advanced-rag-optimized.json");
await optimizedModule.SaveStateAsync(statePath);
Console.WriteLine($"Saved to: {statePath}");
var reloaded = new AdvancedRagModule(client, collection);
await reloaded.ApplyStateAsync(statePath);
var reloadedResult = await Evaluator.EvaluateAsync(reloaded, devSet, metric);
Console.WriteLine($"Reloaded: {reloadedResult.AverageScore:P1} (should match optimized)");
Console.WriteLine();
Console.WriteLine("Done! ✓");
// ── Helper ──────────────────────────────────────────────────
static string Truncate(string text, int max)
    => text.Length <= max ? text : text[..max] + "…";
```

## Step 6: Build & Run

Prerequisites:

- Clone `luisquintanilla/lmp-dotnet`
- Clone `luisquintanilla/advanced-rag` (for the Qdrant data + documents)
- Run the advanced-rag ingestion first (so Qdrant has data to search)
- Place this sample in `lmp-dotnet/samples/LMP.Samples.AdvancedRag/`
- .NET 10 SDK + Docker (for Qdrant)

```bash
# 1. Start Qdrant (if not already running via Aspire)
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
# 2. Run advanced-rag ingestion first to populate Qdrant
# (Follow its README for Azure OpenAI + Aspire setup)
# 3. Navigate to the LMP sample
cd lmp-dotnet/samples/LMP.Samples.AdvancedRag
# 4. Set user secrets
dotnet user-secrets init
dotnet user-secrets set "AzureOpenAI:Endpoint" "https://YOUR-RESOURCE.openai.azure.com/"
dotnet user-secrets set "AzureOpenAI:Deployment" "gpt-4o-mini"
# 5. Build & run
dotnet build && dotnet run
```

If you don't want to run Qdrant, create an `InMemoryVectorStore` with passages from the knowledge base. The `AdvancedRagModule` accepts any `VectorStoreCollection<Guid, IngestedChunk>` — swap the implementation.
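A minimal sketch of that swap, assuming the Microsoft.SemanticKernel.Connectors.InMemory package (the exact package and namespace may differ across preview versions):

```csharp
using Microsoft.SemanticKernel.Connectors.InMemory;

var vectorStore = new InMemoryVectorStore();
var collection = vectorStore.GetCollection<Guid, IngestedChunk>(IngestedChunk.CollectionName);
await collection.EnsureCollectionExistsAsync();

// Upsert a handful of IngestedChunk records (Text + 1536-dim Vector) here,
// then pass `collection` to new AdvancedRagModule(client, collection) as usual.
```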
## Design Decisions & Rationale

### Why the adapter pattern?

- MEDIExtensions is a separate repo — you can't add LMP dependencies to it
- Adapters preserve the MEDI contract — the pipeline doesn't know LMP exists
- Clean separation — MEDI processors can still be used without LMP
- Composition over inheritance — the adapter wraps, it doesn't replace
### Why override GetPredictors() manually?

The LMP source generator discovers predictors stored as direct fields on the module class (e.g., `private readonly Predictor<A, B> _foo`). Our predictors are nested inside adapter objects, and the source generator can't see through the adapter, so we override `GetPredictors()` manually to expose all four predictors.

Future improvement: the source generator could be taught to discover predictors in objects that implement an `IHasPredictor` interface.
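That future interface might look like this (a hypothetical sketch; `IHasPredictor` does not exist in LMP today):

```csharp
// Hypothetical: if adapters implemented this, the source generator could walk
// module fields, spot IHasPredictor implementations, and emit GetPredictors()
// automatically instead of requiring the manual override.
public interface IHasPredictor
{
    IPredictor Predictor { get; }
}
```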
### Why per-passage scoring instead of batch reranking?

The original `LlmReranker` asks the LLM to rank ALL passages in one prompt (returning `{"rankedIndices": [3, 1, 5, ...]}`). This is efficient but:

- Hard to optimize — the optimizer can't learn from individual scoring decisions
- Fragile — if the LLM returns malformed indices, the whole batch fails
- No granular traces — one trace entry for N passages vs. N trace entries
Individual scoring gives the optimizer N traces per query, each showing "for this question + this passage, the correct score was X." This is much richer signal for few-shot demo selection.
The cost tradeoff (N calls vs. 1) is acceptable because:
- N is typically 5-10 (after RRF dedup, before final top-K)
- Each call is very short (one passage, one score)
- The optimizer can reduce calls by learning to score accurately in fewer retries
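The per-passage calls are also independent, so they can run concurrently if latency matters. A sketch of what would replace the sequential `foreach` in `LmpReranker.ProcessAsync` (mind your endpoint's rate limits):

```csharp
// Issue all passage judgments in parallel, then pair them back with their chunks.
var judgments = await Task.WhenAll(results.Chunks.Select(chunk =>
    Predictor.PredictAsync(
        new PassageJudgmentInput(query.Text, chunk.Content),
        validate: r => LmpAssert.That(r,
            r => r.Score >= 1 && r.Score <= 5,
            "Score must be between 1 and 5"),
        maxRetries: 2,
        cancellationToken: cancellationToken)));

var scored = results.Chunks.Zip(judgments, (chunk, j) => (Chunk: chunk, j.Score)).ToList();
```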
### Why the survival/equipment domain?

The advanced-rag sample uses Example_Emergency_Survival_Kit.pdf and Example_GPS_Watch.md. Matching the domain means:

- You can test against the same Qdrant collection (shared ingestion)
- Training data questions are realistic for the corpus
- The sample tells a coherent end-to-end story
### Why BootstrapRandomSearch instead of MIPROv2?

- Simpler to understand for a first sample
- Faster (8 trials vs. MIPROv2's 3-phase process)
- Good enough for 12 training examples
- MIPROv2 can be shown as an "upgrade path" exercise
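The upgrade would be a one-line swap in Step 5 (constructor arguments below are hypothetical; check the LMP.Optimizers API for the real ones):

```csharp
// Hypothetical MIPROv2 parameters, shown for illustration only.
var optimizer = new MIPROv2(numCandidates: 8, numTrials: 20);
var optimizedModule = await optimizer.CompileAsync(
    new AdvancedRagModule(client, collection), trainSet, untypedMetric);
```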
## Glossary

| Term | Definition |
|---|---|
| MEDI | Microsoft.Extensions.DataIngestion — pipeline framework for ingestion + retrieval |
| RetrievalPipeline | MEDI class that orchestrates query processors → vector search → result processors |
| RetrievalQueryProcessor | MEDI abstract class for pre-search processing (e.g., query expansion) |
| RetrievalResultProcessor | MEDI abstract class for post-search processing (e.g., reranking) |
| RetrievalQuery | MEDI data class with Text, Variants, and Metadata |
| RetrievalChunk | MEDI data class with Content, Score, and Record metadata |
| RetrievalResults | MEDI data class with Chunks list and Metadata dictionary |
| RRF | Reciprocal Rank Fusion — algorithm to merge rankings from multiple queries |
| CRAG | Corrective RAG — quality gate that classifies retrieval confidence |
| Adapter | Class that extends a MEDI processor but delegates LLM calls to an LMP Predictor |
| Predictor | LMP class: typed LLM call with learnable instructions + demos |
| LmpModule | LMP class: composes predictors and defines ForwardAsync |
| [LmpSignature] | Attribute marking a record as an LLM output type (source-gen trigger) |
| Trace | LMP class: log of all predictor calls during execution |
| BootstrapRandomSearch | LMP optimizer: N independent BootstrapFewShot trials, keep best |
| ChainOfThought | LMP predictor variant: LLM reasons step-by-step before answering |
| LmpAssert | LMP validation: throws on failure, triggers retry with error feedback |
## Implementation Checklist

- Create project directory and `LMP.Samples.AdvancedRag.csproj`
- Create `Types.cs` (4 input records + 4 `[LmpSignature]` output records)
- Create `Adapters/LmpQueryExpander.cs` (extends `RetrievalQueryProcessor`)
- Create `Adapters/LmpReranker.cs` (extends `RetrievalResultProcessor`)
- Create `Adapters/LmpCragValidator.cs` (extends `RetrievalResultProcessor`)
- Create `AdvancedRagModule.cs` (owns MEDI pipeline + LMP predictors)
- Create `IngestedChunk.cs` (copy from advanced-rag)
- Create `data/train.jsonl` (12 survival/equipment Q&A examples)
- Create `data/dev.jsonl` (5 held-out evaluation examples)
- Create `Program.cs` (full lifecycle: predict → eval → optimize → save)
- Resolve the `PropagateTraceToAdapters()` TODO (add a `Trace` property to adapters)
- Set user secrets (AzureOpenAI endpoint + deployment)
- Run the advanced-rag ingestion to populate Qdrant
- `dotnet build` (verify the source generator runs clean)
- `dotnet run` (verify baseline → optimization → improvement)