Created
September 10, 2025 13:23
| To implement a circuit breaker that tracks errors per backend plugin in Backstage, triggering a cooldown (circuit open) when a plugin returns too many errors (e.g., 5xx status codes), without requiring plugin developers to modify their code, override the core httpRouterService with a custom service factory. This leverages Backstage’s backend architecture, where each plugin gets its own isolated httpRouter instance (an Express Router) mounted at /api/<pluginId>. Because every plugin obtains its router from this service, the override can inject a per-plugin circuit breaker middleware that monitors response statuses for all plugins, including third-party ones. | |
| This approach uses Opossum to manage the circuit state per plugin, recording successes and failures based on response status codes (e.g., counting 5xx as failures). When the error threshold is exceeded, the circuit opens and that plugin’s routes return a 503 response during the cooldown. This follows the circuit breaker pattern for server-side resilience, keeping a faulty plugin from causing cascading failures. | |
| Step 1: Install Opossum | |
| In your packages/backend directory: | |
| yarn add opossum | |
| Step 2: Create the Custom HttpRouter Factory with Circuit Breaker Middleware | |
| Create a new file, e.g., packages/backend/src/services/customHttpRouterFactory.ts. This overrides the default httpRouterService factory to add a circuit breaker middleware to each plugin’s router. Pull global options from app-config.yaml (or use defaults). | |
| import { coreServices, createServiceFactory, HttpRouterServiceAuthPolicy } from '@backstage/backend-plugin-api'; | |
| import { Config } from '@backstage/config'; | |
| import { RequestHandler } from 'express'; | |
| import CircuitBreaker, { Options as CircuitBreakerOptions } from 'opossum'; | |
| import PromiseRouter from 'express-promise-router'; // For async handlers if needed | |
| // Helper to create the circuit breaker middleware | |
| function createCircuitBreakerMiddleware( | |
| config: Config, | |
| pluginId: string, | |
| ): { middleware: RequestHandler; breaker: CircuitBreaker } { | |
| // Global/per-plugin config (e.g., from app-config.yaml) | |
| const options: Partial<CircuitBreakerOptions> = { | |
| timeout: config.getOptionalNumber(`circuitBreaker.plugins.${pluginId}.timeout`) ?? 5000, // ms | |
| errorThresholdPercentage: config.getOptionalNumber(`circuitBreaker.plugins.${pluginId}.errorThresholdPercentage`) ?? 50, | |
| resetTimeout: config.getOptionalNumber(`circuitBreaker.plugins.${pluginId}.resetTimeout`) ?? 30000, // Cooldown ms | |
| volumeThreshold: config.getOptionalNumber(`circuitBreaker.plugins.${pluginId}.volumeThreshold`) ?? 10, // Min requests | |
| // Add more: rollingCountTimeout, etc. | |
| }; | |
| // The action receives a finished response's status code and throws on 5xx, | |
| // so Opossum records every outcome through its normal fire() path | |
| const breaker = new CircuitBreaker(async (statusCode: number) => { | |
| if (statusCode >= 500) { // Track server errors; adjust to >= 400 if client errors count as failures | |
| throw new Error(`Failure in ${pluginId}: ${statusCode}`); | |
| } | |
| }, options); | |
| // Optional: fallback value that fire() resolves with on failure or while open (the middleware sends the 503 itself) | |
| breaker.fallback(() => 'Plugin unavailable during cooldown'); | |
| // Optional: Event logging | |
| breaker.on('open', () => console.log(`Circuit opened for plugin: ${pluginId}`)); | |
| breaker.on('close', () => console.log(`Circuit closed for plugin: ${pluginId}`)); | |
| breaker.on('halfOpen', () => console.log(`Circuit half-open for plugin: ${pluginId}`)); | |
| const middleware: RequestHandler = (req, res, next) => { | |
| if (breaker.opened) { | |
| res.status(503).json({ error: `Plugin ${pluginId} unavailable during cooldown` }); | |
| return; | |
| } | |
| // Hook into the response to record its outcome once it has been sent | |
| res.on('finish', () => { | |
| // Any rejection from fire() has already been counted by the breaker, so swallow it here | |
| breaker.fire(res.statusCode).catch(() => {}); | |
| }); | |
| next(); | |
| }; | |
| return { middleware, breaker }; | |
| } | |
| export const customHttpRouterFactory = createServiceFactory({ | |
| service: coreServices.httpRouter, | |
| deps: { | |
| plugin: coreServices.pluginMetadata, | |
| config: coreServices.rootConfig, | |
| lifecycle: coreServices.lifecycle, | |
| rootHttpRouter: coreServices.rootHttpRouter, | |
| auth: coreServices.auth, | |
| httpAuth: coreServices.httpAuth, | |
| // Add more deps if using other middleware (e.g., rateLimit) | |
| }, | |
| async factory({ plugin, config, lifecycle, rootHttpRouter, auth, httpAuth }) { | |
| const router = PromiseRouter(); | |
| // Add your circuit breaker middleware first (applies to all routes in this plugin) | |
| const { middleware: circuitMiddleware } = createCircuitBreakerMiddleware(config, plugin.getId()); | |
| router.use(circuitMiddleware); | |
| // Mount the router to the root (as in default factory) | |
| rootHttpRouter.use(`/api/${plugin.getId()}`, router); | |
| // Add other default/custom middleware (e.g., from @backstage/backend-defaults/httpRouter) | |
| // Example: router.use(createRateLimitMiddleware({ pluginId: plugin.getId(), config })); | |
| // router.use(createAuthIntegrationRouter({ auth })); | |
| // router.use(createLifecycleMiddleware({ lifecycle })); | |
| // Return the API for plugins to add their handlers | |
| return { | |
| use(handler: RequestHandler): void { | |
| router.use(handler); | |
| }, | |
| addAuthPolicy(policy: HttpRouterServiceAuthPolicy): void { | |
| // Stub: policies passed here are ignored. Implement this (as the default factory does) if plugins rely on auth policies | |
| }, | |
| }; | |
| }, | |
| }); | |
| • Key Mechanics: | |
| ◦ Per-Plugin Isolation: The factory runs once per plugin, creating a unique breaker and middleware for each (based on plugin.getId()). | |
| ◦ Circuit Logic: Checks if the circuit is open before proceeding. On response finish, records success (non-5xx) or failure (5xx), updating Opossum’s stats to potentially open/close the circuit. | |
| ◦ Cooldown: When open, requests to that plugin’s routes get a 503. Opossum handles half-open/reset automatically after resetTimeout. | |
| ◦ Configurability: Use per-plugin overrides in app-config.yaml for tailored thresholds, or fall back to defaults. | |
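To make the circuit logic above concrete, here is a minimal, self-contained sketch of the closed / open / half-open state machine driven by response status codes. It is a simplified stand-in for illustration, not Opossum's implementation: `SketchBreaker`, `SketchOptions`, and the injected `now` clock are hypothetical names, there is no rolling window, and the exact threshold comparison in a real library may differ.

```typescript
// Simplified illustration only; all names here are hypothetical.
type State = 'closed' | 'open' | 'halfOpen';

interface SketchOptions {
  errorThresholdPercentage: number; // open once the failure % reaches this
  volumeThreshold: number;          // minimum recorded calls before the % is evaluated
  resetTimeout: number;             // cooldown in ms before a trial call is allowed
}

class SketchBreaker {
  private state: State = 'closed';
  private successes = 0;
  private failures = 0;
  private openedAt = 0;
  private readonly options: SketchOptions;
  private readonly now: () => number;

  constructor(options: SketchOptions, now: () => number = Date.now) {
    this.options = options;
    this.now = now;
  }

  /** Current state, moving open -> halfOpen once the cooldown has elapsed. */
  getState(): State {
    if (this.state === 'open' && this.now() - this.openedAt >= this.options.resetTimeout) {
      this.state = 'halfOpen';
    }
    return this.state;
  }

  /** Record the outcome of one finished response by its status code. */
  record(statusCode: number): void {
    const state = this.getState();
    if (state === 'open') return; // nothing is counted during cooldown
    const failed = statusCode >= 500;
    if (state === 'halfOpen') {
      // A single trial decides: success closes the circuit, failure reopens it.
      if (failed) this.trip(); else this.reset();
      return;
    }
    if (failed) this.failures++; else this.successes++;
    const total = this.successes + this.failures;
    const errorPct = (this.failures / total) * 100;
    if (total >= this.options.volumeThreshold && errorPct >= this.options.errorThresholdPercentage) {
      this.trip();
    }
  }

  private trip(): void {
    this.state = 'open';
    this.openedAt = this.now();
    this.successes = 0;
    this.failures = 0;
  }

  private reset(): void {
    this.state = 'closed';
    this.successes = 0;
    this.failures = 0;
  }
}
```

Passing the clock in as a parameter is only a convenience for exercising the cooldown without waiting; a middleware would call `getState()` before `next()` and `record(res.statusCode)` when the response finishes.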
| Add to app-config.yaml. The factory above reads only circuitBreaker.plugins.<pluginId>.*, so each plugin needs its own block; any value left out falls back to the defaults hard-coded in the factory: | |
| circuitBreaker: | |
|   plugins: | |
|     # The factory does not read this block (it would only match a plugin whose ID is "default"); copy values into specific plugins as needed | |
|     default: | |
|       timeout: 10000 | |
|       errorThresholdPercentage: 60 | |
|       resetTimeout: 60000 | |
|       volumeThreshold: 5 | |
|     catalog: # Example per-plugin override | |
|       errorThresholdPercentage: 70 | |
|       volumeThreshold: 20 | |
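If a shared `default` block should actually take effect, the lookup has to merge it explicitly. The following is a rough sketch under stated assumptions: `resolveBreakerSettings`, `BreakerSettings`, and `BUILT_IN_DEFAULTS` are hypothetical names, and the function works on a plain object (for example, whatever was read from `circuitBreaker.plugins`) rather than on the Backstage `Config` API.

```typescript
interface BreakerSettings {
  timeout: number;
  errorThresholdPercentage: number;
  resetTimeout: number;
  volumeThreshold: number;
}

// Shape of the `circuitBreaker.plugins` section once read into a plain object.
type PluginSettings = Record<string, Partial<BreakerSettings> | undefined>;

// Hard-coded fallbacks, mirroring the values used in createCircuitBreakerMiddleware.
const BUILT_IN_DEFAULTS: BreakerSettings = {
  timeout: 5000,
  errorThresholdPercentage: 50,
  resetTimeout: 30000,
  volumeThreshold: 10,
};

// Precedence (lowest to highest): built-in defaults, the shared `default` block,
// then the block for the specific plugin. Hypothetical helper, not a Backstage API.
function resolveBreakerSettings(plugins: PluginSettings, pluginId: string): BreakerSettings {
  return {
    ...BUILT_IN_DEFAULTS,
    ...(plugins['default'] ?? {}),
    ...(plugins[pluginId] ?? {}),
  };
}
```

With the example configuration, `catalog` would end up with its own `errorThresholdPercentage` and `volumeThreshold` plus the shared `timeout` and `resetTimeout`, while a plugin with no block of its own would get the shared values unchanged. One caveat of this layout is that a plugin literally named `default` could no longer be configured separately.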
| Step 3: Register the Factory in the Backend | |
| In packages/backend/src/index.ts: | |
| import { createBackend } from '@backstage/backend-defaults'; | |
| import { customHttpRouterFactory } from './services/customHttpRouterFactory'; // Adjust path | |
| const backend = createBackend(); | |
| // Override the default httpRouter with your custom factory | |
| backend.add(customHttpRouterFactory); | |
| // Add plugins as usual (circuit breaker applies automatically) | |
| backend.add(import('@backstage/plugin-catalog-backend')); | |
| // ... other plugins | |
| backend.start(); | |
| Additional Notes | |
| • Testing: Simulate errors in a plugin route (e.g., throw errors to trigger 500s) and send requests until the circuit opens. Monitor logs for state changes. | |
| • Customization: Adjust failure criteria (e.g., include 4xx if desired). Add Prometheus integration for metrics via Opossum’s stats (e.g., expose via a monitoring plugin). | |
| • Limitations: This tracks HTTP errors only; non-HTTP plugin logic (e.g., tasks) isn’t covered. For broader error tracking, integrate with Backstage’s logger or error handler middleware. | |
| • Alternatives if Needed: If Opossum doesn’t fit your needs, consider Cockatiel, which offers similar circuit breaker policies. | |
| This setup applies the circuit breaker to every plugin by default, with no plugin modifications required. | |
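The failure-criteria customization mentioned in the notes above could be factored into a small predicate, so the middleware asks one function whether a status code counts as a failure. This is only a sketch; `createFailurePredicate` and the `FailureCriteria` fields are hypothetical names, not part of Opossum or Backstage.

```typescript
interface FailureCriteria {
  countClientErrors?: boolean;   // also treat 4xx responses as failures
  ignoredStatusCodes?: number[]; // never count these, e.g. 404
}

// Returns a function that decides whether a response status counts as a failure.
function createFailurePredicate(
  criteria: FailureCriteria = {},
): (statusCode: number) => boolean {
  const ignored = new Set(criteria.ignoredStatusCodes ?? []);
  const floor = criteria.countClientErrors ? 400 : 500;
  return (statusCode: number) => !ignored.has(statusCode) && statusCode >= floor;
}
```

The default behaviour matches the middleware's `>= 500` check; passing `{ countClientErrors: true, ignoredStatusCodes: [404] }` would count client errors while leaving ordinary not-found responses out of the error rate.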