| title | source | author | published | created | description | tags |
| --- | --- | --- | --- | --- | --- | --- |
| Design of the repoContextProvider MCP Server | | | 2025-06-15 | 2025-06-15 | | |
`repoContextProvider` is an MCP (Model Context Protocol) server that extracts context from code repositories and provides LLM-powered summarizations of that content. It is built as a production-ready, containerized Python package with multiple summarization backends and integration points for tools and agent frameworks. The design emphasizes modularity, testability, and extensibility, ensuring that new features (like additional tools or backends) can be added with minimal changes. The following sections detail the key features and architectural decisions of the system.
Figure: High-level MCP architecture. An MCP client (e.g. in an IDE or chatbot) connects to the MCP server (such as `repoContextProvider`) over a standardized interface (STDIO or HTTP via SSE). The server exposes Tools (actions the LLM can invoke), Resources (data the LLM can fetch), and Prompts (pre-defined prompt templates) (philschmid.de). This allows an LLM (the “Host” application) to seamlessly interact with external data sources like repositories through function calls, with the MCP client handling the connection and data exchange.
To accommodate different deployment scenarios and preferences, `repoContextProvider` supports multiple summarization backends. A common interface (e.g. an abstract base class or simple function protocol) is defined for summarization so that new backends can be added easily. The following backends are implemented:

- OpenAI API Backend: Uses OpenAI’s GPT models (e.g. GPT-4 or GPT-3.5) via the OpenAI Python SDK. This backend streams repository text (or extracted context) to the OpenAI API and returns a summary. It can optionally leverage LangChain’s OpenAI wrappers for convenience (e.g. using `ChatOpenAI` to manage API calls) (medium.com). This backend requires an API key and internet access. It’s ideal for high-quality summaries using OpenAI’s latest models.
- LangChain Summarization Backend: Leverages LangChain’s built-in summarization chains and tools. LangChain provides chain implementations like “stuff”, “map_reduce”, and “refine” for summarizing documents (python.langchain.com). For example, a Map-Reduce chain can summarize parts of a large repository (files or sections) and then combine those summaries (python.langchain.com). This backend might still use an LLM (OpenAI or other) under the hood, but it adds intelligent chunking and combination strategies via LangChain. It can be configured to use OpenAI models or local models through LangChain’s integrations. Using LangChain allows more complex workflows (such as first retrieving relevant parts of the repo, then summarizing) to be built easily.
- Hugging Face Transformers Backend: Runs local summarization models from Hugging Face Transformers. This backend does not require external API calls, making it suitable for offline or on-prem deployments. It can use pre-trained summarization models like `distilbart-cnn-12-6` or T5 via the Transformers pipeline. For example, using the 🤗 pipeline API with the `"summarization"` task will by default load a model like `distilbart-cnn-12-6` and generate a summary (medium.com). This backend may require larger dependencies (PyTorch/Transformers), so it is kept optional. It’s configurable by model name or path, so users can choose a specific fine-tuned model for their domain.
Each backend is implemented in a separate module under `repo_context/backends/` (e.g. `openai_backend.py`, `langchain_backend.py`, `local_backend.py`), and they all conform to a common interface (`summarize(text: str, **kwargs) -> str`). The server can choose the backend based on configuration – for instance, an environment variable `SUMMARIZER_BACKEND` (values: `"openai"`, `"langchain"`, `"local"`) selects the implementation. This design makes it easy to route summarization requests to the appropriate service.
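To make this contract concrete, here is a minimal sketch of the shared interface and the environment-driven selection; the class names (`SummarizerBackend`, `OpenAIBackend`, and so on) and the `get_backend` helper are illustrative rather than a fixed API:

```python
# repo_context/backends/base.py (illustrative sketch)
import os
from abc import ABC, abstractmethod


class SummarizerBackend(ABC):
    """Common interface that every summarization backend implements."""

    @abstractmethod
    def summarize(self, text: str, **kwargs) -> str:
        """Return a summary of the given repository text."""


def get_backend(name: str | None = None) -> "SummarizerBackend":
    """Pick a backend from SUMMARIZER_BACKEND, importing lazily so optional
    dependencies are only needed when that backend is actually used."""
    name = name or os.getenv("SUMMARIZER_BACKEND", "openai")
    if name == "openai":
        from .openai_backend import OpenAIBackend
        return OpenAIBackend()
    if name == "langchain":
        from .langchain_backend import LangChainBackend
        return LangChainBackend()
    if name == "local":
        from .local_backend import LocalBackend
        return LocalBackend()
    raise ValueError(f"Unknown SUMMARIZER_BACKEND: {name!r}")
```

The lazy imports inside each branch are what allow the package to run when only one optional dependency group is installed.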
Optional Dependencies: The project defines extras in its setup configuration to avoid installing all backends by default. In `setup.py`, the `extras_require` parameter is used to create groups of optional dependencies (sukhbinder.wordpress.com). For example:

- `repoContextProvider[openai]` – Installs dependencies for the OpenAI backend (e.g. the `openai` SDK and possibly LangChain core).
- `repoContextProvider[local]` – Installs dependencies for the local Transformers backend (e.g. `transformers`, `torch`).
- `repoContextProvider[agents]` – Installs packages for agent/tool integrations (e.g. `langchain` itself, `langgraph`, and `fastapi-mcp` for MCP integration).
- `repoContextProvider[all]` – A convenient meta-group that includes all of the above, pulling in all optional dependencies.
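A sketch of how these groups might be declared via `extras_require` in `setup.py` (package lists are indicative; exact pins and core requirements are up to the project):

```python
# setup.py (extras sketch; version pins omitted)
from setuptools import find_packages, setup

extras = {
    "openai": ["openai"],  # plus LangChain wrappers if desired
    "local": ["transformers", "torch"],
    "agents": ["langchain", "langgraph", "fastapi-mcp"],
}
# The "all" meta-group is simply the union of the other groups.
extras["all"] = sorted({pkg for group in extras.values() for pkg in group})

setup(
    name="repoContextProvider",
    packages=find_packages(),
    install_requires=["fastapi", "uvicorn", "pydantic"],
    extras_require=extras,
)
```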
Users can install only what they need. For instance, a user who only plans to use local summarization can do `pip install repoContextProvider[local]` and avoid pulling OpenAI or LangChain packages. Conversely, `[all]` is available for a full installation. Defining these extras makes the module more flexible and user-friendly (sukhbinder.wordpress.com), allowing minimal installations for smaller footprints. All core functionality (like the server and CLI) is designed to handle the absence of optional backends gracefully – if a backend is selected without its dependencies installed, the server returns an informative error or fallback response.
The project includes a Dockerfile and a docker-compose configuration to enable easy deployment in various environments. We use multi-stage builds in the Dockerfile to produce a slim final image. In the first stage, the image is based on a full Python environment (for example, `python:3.11-slim` plus build tools) where we install the application and compile any dependencies. In the final stage, we use a lightweight base (such as `python:3.11-alpine` or slim) and copy only the necessary files and installed packages from the builder. This approach ensures that heavy build-time artifacts (like `.cache` files, compilers, etc.) are not present in the runtime image (docs.docker.com). Multi-stage builds significantly reduce image size and attack surface by separating the build environment from the minimal runtime environment (docs.docker.com). The resulting container contains just the Python interpreter, the `repoContextProvider` package, and needed libraries – making it fast to pull and secure to run.
Docker Compose: A sample `docker-compose.yml` is provided to orchestrate the server and its supporting services. It can, for example, define two services: one for the `repoContextProvider` server and another for Redis (if caching or rate-limiting is enabled, see below). The compose file is configured to use an `.env` file to load environment variables (Docker Compose automatically reads a `.env` file and substitutes variables). We supply a `.env.example` documenting all configurable values (like `OPENAI_API_KEY`, `ALLOWED_ORIGINS`, `ENABLE_AUTH`, etc.), so that users can easily set up their own `.env`. Key environment configurations include:
- OpenAI API Key: e.g. `OPENAI_API_KEY` for the OpenAI backend. The container will read this and the OpenAI backend will use it for authentication.
- Backend Selection: as mentioned, an env var to choose the default summarizer backend.
- Auth Toggle: e.g. `API_KEY_AUTH=1` to require an API key, along with `API_KEY` or a list of `API_KEYS` for valid keys.
- CORS Allowed Origins: e.g. `CORS_ORIGINS="*"` or a comma-separated list of origins, which the server uses to configure CORS.
The FastAPI server code uses these environment variables (via `os.getenv` or Pydantic settings) so that no code changes are needed to reconfigure common options – everything can be adjusted through env vars or the `.env` file. This is especially important in container deployments to avoid baking secrets into images. For example, the server might use Python’s `python-dotenv` to load the file for local development convenience, but in Docker deployment, these would come from the environment directly.
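For illustration, a minimal Pydantic settings class covering the variables above might look like this (the `pydantic-settings` package, field names, and defaults are assumptions; the project could equally rely on plain `os.getenv` calls):

```python
# repo_context/server/settings.py (illustrative sketch using pydantic-settings)
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Configuration read from environment variables or a local .env file."""

    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    openai_api_key: str | None = None   # OPENAI_API_KEY
    summarizer_backend: str = "openai"  # SUMMARIZER_BACKEND
    api_key_auth: bool = False          # API_KEY_AUTH
    api_key: str | None = None          # API_KEY
    cors_origins: str = "*"             # CORS_ORIGINS (comma-separated)
    redis_url: str | None = None        # REDIS_URL


settings = Settings()
```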
API Key Authentication (Optional): The server can be launched in a mode that requires clients to supply an API key. This is implemented using FastAPI’s dependency injection. If enabled, a dependency function (e.g. `verify_api_key`) will be added to all route dependencies. This function checks for an `X-API-Key` header in the request and compares it against the expected key(s). If absent or incorrect, it raises an HTTP 401 Unauthorized error. We follow a pattern similar to known FastAPI API key auth implementations: for instance, using `APIKeyHeader` and a dependency that queries a database or list of keys (medium.com). In our simpler case, it might just check against an environment variable. We then add `app = FastAPI(dependencies=[Depends(verify_api_key)])` to enforce it globally (medium.com). This way, all endpoints are protected unless the correct key is provided, which is important if the service is exposed publicly or to multiple users.
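A minimal sketch of such a dependency, checking the `X-API-Key` header against a single key from the environment (the module path and variable names are illustrative):

```python
# repo_context/server/auth.py (illustrative sketch)
import os

from fastapi import Depends, FastAPI, HTTPException, Security, status
from fastapi.security import APIKeyHeader

api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)


async def verify_api_key(api_key: str | None = Security(api_key_header)) -> None:
    """Reject the request unless X-API-Key matches the configured key."""
    expected = os.getenv("API_KEY")
    if expected and api_key != expected:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or missing API key",
        )


# In main.py, when API_KEY_AUTH is enabled:
app = FastAPI(dependencies=[Depends(verify_api_key)])
```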
CORS Support: We enable Cross-Origin Resource Sharing via FastAPI’s `CORSMiddleware`. The allowed origin list is configurable (through the `CORS_ORIGINS` env var). By default, we might allow all origins in development or `localhost` origins, and in production expect the user to configure specific domains. This ensures that if the server’s REST API is called from a web browser application, the requests won’t be blocked by the browser. The middleware is added early in the FastAPI app initialization to apply to all routes.
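A sketch of the CORS setup driven by the `CORS_ORIGINS` variable:

```python
# repo_context/server/main.py (CORS sketch)
import os

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI(title="Repo Context Provider")

# "*" in development, or a comma-separated list of trusted domains in production.
origins = [o.strip() for o in os.getenv("CORS_ORIGINS", "*").split(",")]
app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_methods=["*"],
    allow_headers=["*"],
)
```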
Finally, the Docker image entrypoint is set to launch the server (e.g. using Uvicorn). For example, the container might run: `uvicorn repo_context.server.main:app --host 0.0.0.0 --port 8000`. We also include a healthcheck in the Docker configuration (using Dockerfile `HEALTHCHECK` or via orchestrator) that pings the `/health` endpoint (see Production Features below) to restart the container if needed.
The codebase is organized into a clear, modular structure to separate concerns and facilitate maintenance:
```plaintext
repo_context/ - Main Python package
├── server/ - FastAPI MCP server implementation
│ ├── __init__.py
│ ├── main.py - Creates FastAPI app, includes routes and MCP setup
│ ├── mcp_integration.py - (Optional) integration with FastAPI-MCP or FastMCP
│ └── routes/ - (If many endpoints, could organize into router modules)
├── cli/ - CLI tool implementation
│ ├── __init__.py
│ └── main.py - CLI entry point (e.g., using Typer or Click)
├── integrations/ - Integrations with external frameworks
│ ├── langchain_tool.py - Definition of a LangChain Tool for this service
│ ├── langgraph_workflow.py - Example LangGraph workflow using the service
│ └── ... - (Future: other integration helpers)
├── backends/ - Summarization backend implementations
│ ├── __init__.py
│ ├── openai_backend.py - Uses OpenAI API via openai or LangChain
│ ├── langchain_backend.py- Uses LangChain summarization chains
│ └── local_backend.py - Uses HuggingFace transformers for local models
├── utils/ - Utility modules
│ ├── __init__.py
│ ├── token_counter.py - Functions to count tokens (for limits)
│ ├── cache.py - Caching utility (in-memory & Redis)
│ ├── logging.py - Logging setup (formatters, levels)
│ └── validation.py - Extra validation helpers, if needed
├── __init__.py
└── MCPconfig/ - (Optional) If providing a sample MCP JSON config
examples/
├── langchain_agent.ipynb - Jupyter or scripts demonstrating usage as agent tool
└── repo_summary_demo.py - Example script using CLI or API to summarize a repo
tests/
├── test_server.py - Tests for API endpoints (using Starlette TestClient)
├── test_backends.py - Tests for each backend (with dummy data or mocking)
├── test_cli.py - Tests for CLI argument parsing and output
└── ... - Additional tests
docs/
├── usage.md - Documentation for using the server and CLI
└── development.md - Notes for developers (coding conventions, etc.)
setup.py - Setup file with install_requires and extras_require
pyproject.toml or requirements.txt - Project metadata and dependencies
README.md - Overview, installation instructions, quick start
```
Separation of Concerns: This layout groups similar functionality together. The `server` package contains everything related to running the FastAPI server and MCP interface. The `cli` package handles the command-line interface logic. `backends` contains the interchangeable summarization engines. By isolating these, we ensure that, for example, the server code can import all backends and decide which to use, but the backends themselves don’t depend on server internals (they could even be used independently). The `integrations` package is for any adapters or glue code that allows external frameworks to make use of our service – for instance, a LangChain Tool class or functions that format our API into LangChain’s expectations.
MCP Server Implementation: Within `repo_context/server`, we implement the MCP server using FastAPI along with the official MCP Python SDK. We take a hybrid approach using the `fastapi-mcp` integration library for convenience (medium.com). At startup, we create the FastAPI app, define our routes (or tools), then use `FastApiMCP` to mount a special endpoint (typically `/mcp`) that exposes the API according to MCP specifications. The `FastApiMCP(app, name="Repo Context Provider", ...)` will automatically generate the MCP schema from our routes and tool descriptions and mount the necessary handler (medium.com). This means an AI agent can query `GET /mcp` to discover the available tools (with names, descriptions, input schema, etc.) and then call the endpoints through the MCP client seamlessly. Each route in our FastAPI app that we want to expose as an MCP Tool is annotated with metadata like `operation_id` and `summary`, which `fastapi-mcp` uses to document the tool (medium.com). For example, we might have:
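(A minimal sketch of such a route, including the MCP mount; the helpers `extract_repo_text` and `get_backend` are illustrative and not tied to the exact module layout above:)

```python
# repo_context/server/main.py (route sketch)
from fastapi import FastAPI
from fastapi_mcp import FastApiMCP

from repo_context.backends import get_backend            # illustrative helper
from repo_context.utils.repo import extract_repo_text    # illustrative helper

app = FastAPI(title="Repo Context Provider")


@app.get("/summarize", operation_id="summarize_repo",
         summary="Summarize the contents of a code repository")
async def summarize_repo(path: str, max_tokens: int = 1000) -> dict:
    """Extract text from the repository at `path` and return an LLM-generated summary."""
    text = extract_repo_text(path, max_tokens=max_tokens)
    summary = get_backend().summarize(text)
    return {"path": path, "summary": summary}


# Expose the routes above as MCP tools at /mcp.
mcp = FastApiMCP(app, name="Repo Context Provider")
mcp.mount()
```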
This function `summarize_repo` becomes an MCP Tool with the name `"summarize_repo"` and description from the docstring or summary. An MCP client (like Claude Desktop or VSCode with Copilot) will be able to discover it and call it with arguments. Under the hood, `fastapi-mcp` handles serving a schema at `/mcp` and routing invocations to the `/summarize` endpoint. We benefit from using FastAPI as the web framework: we can test these routes with normal HTTP requests, use the interactive docs (Swagger UI) for debugging, and integrate middlewares (for auth, CORS, etc.) easily, while still conforming to the MCP protocol for agent use.
CLI Implementation: In `repo_context/cli/main.py`, we implement a console entry-point named `repo-context`. Using a library like Typer (which provides an intuitive way to build CLIs with `click`-style decorators) or Python’s built-in `argparse`, we create commands such as `repo-context summarize <path>` or `repo-context analyze <path>`. The CLI essentially wraps calls to the same core logic that the server uses. For example, when the user runs `repo-context summarize ./my_repo`, the CLI code will internally call something like `repo_context.server.main.summarize_repo(path="./my_repo", max_tokens=...)` and print the resulting summary to stdout. This reuse of logic ensures consistency – whether you use the CLI or the HTTP API, you get the same results. It also means the core functions (like `extract_repo_text` or the backend `summarize` functions) can be easily unit-tested in isolation. The CLI tool might have subcommands for various analyses (e.g., `summarize`, `stats` for repository statistics, etc.), making it a handy developer tool on its own. We set up the console script entry point in `setup.py` so that after installation, typing `repo-context` will invoke our CLI (sukhbinder.wordpress.com).
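A sketch of the Typer-based CLI, reusing the same core helpers (command names follow the text above; `extract_repo_text` and `get_backend` remain illustrative):

```python
# repo_context/cli/main.py (illustrative sketch using Typer)
import typer

from repo_context.backends import get_backend            # illustrative helper
from repo_context.utils.repo import extract_repo_text    # illustrative helper

app = typer.Typer(help="Repository context extraction and summarization tools.")


@app.command()
def summarize(path: str, max_tokens: int = 1000) -> None:
    """Summarize the repository at PATH and print the result to stdout."""
    text = extract_repo_text(path, max_tokens=max_tokens)
    typer.echo(get_backend().summarize(text))


@app.command()
def stats(path: str) -> None:
    """Print simple repository statistics (file count, languages, etc.)."""
    typer.echo(f"Collecting statistics for {path} ...")  # placeholder for real analysis


if __name__ == "__main__":
    app()
```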
Packaging and Installation: The `setup.py` (or `pyproject.toml`) defines the package details and optional extras as discussed. We include console scripts entry points for both the CLI and the server:

- `"repo-context=repo_context.cli.main:app"` (if using Typer, `app` is Typer’s CLI app object) or similar for the CLI.
- `"repo-context-server=repo_context.server.main:run_server"` to launch the server easily.
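In `setup.py` terms, those entry points could be declared roughly as follows (shown in isolation; in practice they sit alongside the extras shown earlier):

```python
# setup.py (console-script entry points, sketch)
from setuptools import find_packages, setup

setup(
    name="repoContextProvider",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            "repo-context=repo_context.cli.main:app",
            "repo-context-server=repo_context.server.main:run_server",
        ]
    },
)
```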
This means a user can do `pip install repoContextProvider[all]` and then use the commands directly. The README provides usage examples for both modes.
Several features are included to make the server robust in real-world deployments:
- Logging: The server uses Python’s `logging` module to log important events and errors with an appropriate level. We configure a logger (in `utils/logging.py`) that can be adjusted via environment (e.g. `LOG_LEVEL=DEBUG` for verbose output). By default, it logs INFO level and above to stdout. Each request is logged with details (method, path, response time), and exceptions are logged with stack traces. For more structured logging, one could integrate libraries like Loguru or the standard `logging.config` with JSON format, but we keep it simple and reliable. The key is that multiple log levels are available – debug logs help in development (tracing each step of summarization), info logs track normal operations (startup, requests handled), and warning/error logs capture issues. These logs can be collected by Docker or cloud logging tools. We also make sure to not pollute logs from dependencies; for example, we can adjust the log level of third-party loggers (like the `uvicorn` or `openai` libs) if necessary.
- Caching Layer: To improve performance and avoid redundant work, we implement caching for expensive operations. Both in-memory caching and Redis-based caching are supported:
  - In-Memory: We use a simple `functools.lru_cache` or a custom cache dictionary for quick caching within a single process. For instance, after summarizing a particular repository (or file) once, we can cache the result keyed by repo path and perhaps last modification time. Subsequent requests for the same repo summary can return instantly from cache. This cache is cleared on process restart, but provides a significant speedup for repeated queries in the meantime.
  - Redis Cache: If a Redis URL is provided (e.g. `REDIS_URL=redis://...`), the server will connect to Redis and use it as a shared cache. We incorporate a small utility using the `aioredis` or `redis` Python client to get/set cache entries with a TTL (time-to-live). This allows caching across multiple instances of the service (useful if the API is scaled horizontally) and persistence beyond process life. It also allows a larger cache size than in-memory might safely allow. We design the caching util to abstract these details – e.g. functions `get_cache(key)` and `set_cache(key, value, ttl)` that will use Redis if configured, otherwise fall back to an in-memory dict. The content cached could include the raw repository text extraction (so we don’t re-read files unnecessarily) and the summarization results. We set a reasonable expiration (for example, 1 hour) so that if the repository updates, within an hour the summary will refresh. Users can also manually bust the cache by providing a query param (like `?refresh=true`) on endpoints, which we then honor by bypassing the cache. This caching strategy improves throughput and cost: for instance, if using the OpenAI API, we don’t want to call it repeatedly for the same content. (One can also integrate a more advanced cache like an LRU with max size, but given the scope, a simple time-based cache suffices.)

  Note: We carefully consider what to cache – for sensitive data, caching in Redis should be done with security in mind (use AUTH on Redis or in-memory only). By default, caching is off unless configured, to avoid stale data issues in highly dynamic repos.
- Health Check Endpoint: A very simple GET endpoint `/health` is provided that returns a 200 status and a JSON body like `{"status": "ok"}`. This endpoint does minimal work (no dependencies on backends or external services) – it’s meant for load balancers or orchestration systems to check if the service is running. In Kubernetes, for example, one can set this as a liveness or readiness probe. The health check can be extended to perform internal checks (like verifying it can reach the OpenAI API or that Redis is responsive), but by default it just confirms the web server is up and able to respond.
- Metrics Endpoint: We include an endpoint (by convention `/metrics`) that exposes Prometheus-compatible metrics for the server. Using the prometheus_client library, we set up counters and histograms to track things like request counts, request durations, and backend-specific metrics. FastAPI doesn’t provide this out of the box, but it’s straightforward to integrate. For example, we create a Counter for total requests and increment it in a middleware or in each endpoint. The Prometheus client can automatically collect Python GC and process metrics as well. The `/metrics` endpoint will output these metrics in the Prometheus text format. This allows ops teams to scrape metrics and monitor the performance of the service (QPS, error rates, latency, cache hits, etc.) (carlosmv.hashnode.dev). We ensure that this endpoint is protected by the same auth if enabled (or we might leave it open if it's only accessible internally). By providing metrics, `repoContextProvider` can be seamlessly integrated into monitoring dashboards and alerting systems, which is crucial for production services.
- Rate Limiting: To prevent abuse or accidental overload (especially if using an API like OpenAI with rate costs), we integrate a rate limiting mechanism. We use the SlowAPI library, which is a FastAPI/Starlette adaptation of Flask-Limiter (slowapi.readthedocs.io). SlowAPI allows us to declare limits like “X requests per minute per IP”. We initialize a `Limiter` and attach it to the app. For example, we can set a global rate limit (default for all endpoints) such as 60 requests per minute per client IP, and specific stricter limits for heavy endpoints if needed; in code, it looks roughly like the sketch shown after this list. If the rate is exceeded, SlowAPI will automatically return HTTP 429 Too Many Requests (slowapi.readthedocs.io). SlowAPI supports in-memory counters or Redis/Memcached backends for distributed rate limiting (slowapi.readthedocs.io). In our setup, if Redis is configured, we initialize the limiter to use Redis for storing request counts (so that multiple instances share the rate limit pool; this is important in production with multiple replicas). Otherwise, it will use an in-memory cache by default (slowapi.readthedocs.io). Rate limit configuration (like requests per minute) can be adjusted via environment variables to fine-tune for different deployment scenarios. This ensures that a misbehaving client or script can’t overwhelm the service or incur excessive API costs.
- Request Validation: By virtue of using FastAPI + Pydantic, all request input (query parameters, JSON bodies, etc.) is validated automatically. Pydantic models or parameter types ensure that if a required field is missing or an incorrect type is provided, the server returns a clear 422/400 error with details, without even entering our route logic. For example, if our `summarize_repo` expects a string `path`, FastAPI will enforce that and generate an error if it is not supplied. We also add custom validation where appropriate, for instance verifying that the `path` exists or that a provided repository URL is reachable, and return meaningful errors if not. This defensive programming makes the API more robust and user-friendly – clients get immediate feedback if they misuse the API.
- Error Handling: We implement global exception handlers for known error types. For example, if the OpenAI API call fails (due to network or quota issues), we catch that exception and return a 502 Bad Gateway or 503 error with a message. If our code raises a custom `RepositoryNotFoundError`, we catch it and return a 404 with a message. FastAPI allows adding exception handlers easily, and we use this to ensure the API never leaks internals via uncaught exceptions – instead, every error is mapped to a clean HTTP response. Additionally, these errors are logged (with stack traces at debug level) so we can troubleshoot issues.
- Security Considerations: Aside from the optional API key auth and CORS described earlier, we take care to secure file system access (if the server is allowed to read local repos, we ensure it cannot read arbitrary paths outside allowed scopes – e.g., by sandboxing to certain directories or requiring explicit whitelisting). For any integrations with external APIs (like GitHub, OpenAI), sensitive keys are never logged. We also ensure that if the server spawns any subprocess (not in the current design, but perhaps for git operations), we handle it securely.
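As referenced in the Rate Limiting item above, a minimal SlowAPI setup might look like the following sketch (the `RATE_LIMIT` variable, the 60/minute default, and the per-route 10/minute limit are illustrative):

```python
# repo_context/server/main.py (rate-limiting sketch with SlowAPI)
import os

from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

# Shared counters in Redis when configured, otherwise per-process in-memory storage.
limiter = Limiter(
    key_func=get_remote_address,
    default_limits=[os.getenv("RATE_LIMIT", "60/minute")],
    storage_uri=os.getenv("REDIS_URL") or "memory://",
)

app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


@app.get("/summarize")
@limiter.limit("10/minute")  # stricter limit for the expensive endpoint
async def summarize_repo(request: Request, path: str) -> dict:
    # The `request` argument is required by SlowAPI's decorator.
    return {"path": path, "summary": "..."}
```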
The `repoContextProvider` is designed not just as a standalone service, but as a component that fits into larger AI systems and developer workflows:
- Command-Line Usage: As mentioned, the `repo-context` CLI provides a quick way for developers to invoke the repository context analysis on their own. For example, running `repo-context analyze --summary --stats /path/to/repo` could output a summary of the repository along with statistics (number of files, primary languages, etc.). The CLI has help documentation (`repo-context --help`) that describes all commands and options. This makes it easy to test the core functionality without deploying the full server. It also assists in debugging (since one can run the summarize logic directly in a terminal to see any errors or performance issues).
- FastAPI Server Launch: For convenience, a shortcut command `repo-context-server` is provided. This essentially does `uvicorn repo_context.server.main:app --host 0.0.0.0 --port 8000` under the hood, possibly reading env vars for host/port if needed. This saves the user from writing a custom Uvicorn command or Python script to start the service. We also ensure this entry point is compatible with MCP usage: if a user wants to run this server as part of Claude Desktop’s config, they can add an entry to the MCP JSON config that either starts the Docker container or, if running locally without Docker, launches this command directly. This would launch our server, and the MCP client would connect to `http://localhost:8000/mcp` as configured. We document such usage in the README.
- LangChain Tool Integration: One of the key integration points is making `repoContextProvider` available as a tool in LangChain or similar agent frameworks. We create a LangChain Tool in `integrations/langchain_tool.py`. If using LangChain’s newer utilities, this can be as simple as wrapping a call to our API in a `Tool` object (see the sketch after this list). This `repo_tool` can then be added to a LangChain agent. For instance, a developer can initialize an OpenAI function-calling agent with this tool so that the LLM can decide to use it when a user asks something about a repository. In LangChain’s function calling paradigm, the tool’s `name` and `description` become part of the OpenAI functions specification, and the agent will format inputs as needed (medium.com). We ensure our tool’s interface is simple (in this case, one string argument for the repo identifier) to make it easy for the LLM to invoke. We also provide an example in the docs or examples folder showing how to integrate: e.g., using `initialize_agent` with our tool and an LLM, then querying it with a question about a repository. This demonstrates that `repoContextProvider` can plug into complex workflows – the LLM can ask it for context and then continue the conversation with that context (a common pattern for AI coding assistants).
- LangGraph Workflows: For more advanced usage, we include a sample LangGraph workflow (LangGraph is an extension for stateful, graph-based LLM execution) in `examples/langgraph_workflow.py`. This could show how to incorporate our tool in a loop where the LLM first decides whether to call the tool, then calls it, then processes the result (medium.com). LangGraph’s `ToolExecutor` can invoke our tool similarly to any other, given that we have provided the LangChain `Tool` interface or an OpenAI function spec. By verifying compatibility with LangGraph (and writing a brief guide), we ensure that future AI agents that use LangGraph’s approach can readily use this server. In essence, any agent that can consume an OpenAPI or MCP description can use our server as well, thanks to the standardized interface we expose via MCP (medium.com).
- Extensibility for New Tools: The design anticipates adding more capabilities beyond summarization. For example, a Git history analyzer tool could be added to extract insights from commit history, or a test coverage reporter tool to summarize test results in the repo. To support this, we keep the Tools definitions modular. Each new function can be added in `server/routes` (and automatically picked up by FastApiMCP for the MCP schema) or even as separate routers. The `integrations` and `backends` structure can be mirrored – e.g., a new backend might not be needed for those tools (as they might just use git commands), but if we needed an LLM to interpret git logs, we could reuse the existing backends. We strive to follow the open/closed principle: new features can be added as new modules, without modifying the core of existing ones. The test suite can be expanded accordingly to cover new functionality. By keeping things decoupled (for instance, the CLI will automatically pick up new analysis commands if we add them, provided we register them), we make future enhancements straightforward.
- Documentation and Examples: The `docs/` directory and README include usage examples as mentioned. For a developer, seeing a concrete example of summarizing a repo using the CLI and via an HTTP call is very helpful. We also document how to configure the environment (for each backend, what variables or model files are needed). Additionally, we provide a note on performance considerations (like advising to use the local backend for privacy, OpenAI for best quality, etc.). This comprehensive documentation ensures that users can quickly get started and integrate `repoContextProvider` into their workflows.
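As referenced in the LangChain Tool Integration item, a minimal version of `integrations/langchain_tool.py` might look like this sketch (the HTTP call and the localhost URL are assumptions; the tool could just as well call the core functions in-process):

```python
# repo_context/integrations/langchain_tool.py (illustrative sketch)
import requests
from langchain.tools import Tool


def _summarize_repo(path_or_url: str) -> str:
    """Ask a running repoContextProvider server to summarize the given repo."""
    resp = requests.get(
        "http://localhost:8000/summarize",   # assumed local deployment
        params={"path": path_or_url},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["summary"]


repo_tool = Tool(
    name="summarize_repo",
    description="Summarize a code repository given its local path or URL.",
    func=_summarize_repo,
)
```

The resulting `repo_tool` can then be passed to `initialize_agent` (or bound to a function-calling model) alongside other tools, exactly as described above.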
In summary, `repoContextProvider` is a full-featured MCP server for repository analysis, combining the power of LLM summarization with practical engineering for deployment. It supports multiple summarization engines (from OpenAI’s state-of-the-art models to local Transformer models) and exposes a standard interface that AI agents can use out-of-the-box. The containerization and configuration support make it easy to deploy securely, while caching, rate limiting, and metrics provide the needed stability and observability in production. The modular structure keeps the codebase maintainable, and new features or tools can be added readily, future-proofing the system for evolving needs in AI-assisted development.
Sources:

- LangChain Summarization Chains (python.langchain.com), demonstrating strategies like map-reduce for summarizing multiple documents.
- Example of using Hugging Face Transformers pipeline for summarization (medium.com).
- Use of `extras_require` in setup.py to define optional dependency groups (sukhbinder.wordpress.com).
- FastAPI-MCP integration to expose FastAPI routes as MCP tools (medium.com).
- Docker multi-stage build best practice: separate build and runtime to slim down images (docs.docker.com).
- FastAPI with API key auth using dependency injection (x-api-key header) (medium.com).
- SlowAPI for rate limiting in FastAPI (supports in-memory or Redis backends) (slowapi.readthedocs.io).
- Prometheus metrics integration in FastAPI (using `prometheus_client`) (carlosmv.hashnode.dev).
- Model Context Protocol overview – enabling standardized tool access for LLMs (philschmid.de).
- LangGraph/ToolExecutor usage in agent workflows (example of tool invocation) (medium.com).
Citations:

- [Model Context Protocol (MCP) an overview](https://www.philschmid.de/mcp-introduction)
- [Summarize Text | LangChain](https://python.langchain.com/docs/tutorials/summarization/)
- Text Summarization with Hugging Face Transformers: A Beginner’s Guide | by Ganesh Lokare | Medium
- [A Short Primer On “extra_requires“ in setup.py – SukhbinderSingh.com](https://sukhbinder.wordpress.com/2023/04/09/a-short-primer-on-extra_requires-in-setup-py/)
- [Multi-stage builds | Docker Docs](https://docs.docker.com/get-started/docker-concepts/building-images/multi-stage-builds/)
- [FastAPI with API Key Authentication | by Joe Osborne | Medium](https://medium.com/@joerosborne/fastapi-with-api-key-authentication-f630c22ce851)
- [Integrating MCP Servers with FastAPI | by Ruchi | Medium](https://medium.com/@ruchi.awasthi63/integrating-mcp-servers-with-fastapi-2c6d0c9a4749)
- [Adding Prometheus to a FastAPI app | Python - Carlos Marcano's Blog](https://carlosmv.hashnode.dev/adding-prometheus-to-a-fastapi-app-python)
- [SlowApi Documentation](https://slowapi.readthedocs.io/en/latest/)
- [Optimizing initial calls in LangGraph workflows | by Aleksandr Lifanov | Medium](https://medium.com/@lifanov.a.v/optimizing-initial-calls-in-langgraph-workflows-ef529278cb06)