Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save lucapette/86cacae3dc6df6984306b5267b8a3d46 to your computer and use it in GitHub Desktop.

Select an option

Save lucapette/86cacae3dc6df6984306b5267b8a3d46 to your computer and use it in GitHub Desktop.
Analytics Service Improvement Report

Analytics Service Improvement Report

Executive Summary

The analytics-service is a Kotlin/Spring Boot gRPC service serving KPI, trends, and entity data from ClickHouse/PostgreSQL. The API is its primary consumer. While the architecture is well-documented (CLAUDE.md is excellent), there are significant code quality issues: a monolithic main class, pervasive cross-service duplication, and inconsistent patterns.


1. Critical: AnalyticsService.kt is a God Class

File: analytics-service/src/main/kotlin/com/heyacto/analytics/AnalyticsService.kt

At 1000+ lines, this file is a single @GrpcService class handling all 22 gRPC endpoints. Each handler method follows the same pattern: validate request → build timeframe → resolve product hierarchy → create repository → execute query → map to proto response. This creates massive intra-file duplication.

Action items:

  • Extract endpoint handlers into dedicated classes (e.g., OrderKpisHandler, LineItemKpisHandler, TimeseriesKpisHandler, TeamKpisHandler, EntityHandler). Each handler gets its own file with a single suspend fun handle(request): Response.
  • Extract the shared request-validation + timeframe + hierarchy-resolution preamble into a reusable builder/template, since getOrderKPIs, getLineItemKPIs, getLineItemTrends, getTimeseriesKPIs, and getOrderTrends all repeat the same 30-50 lines of preamble.
  • Extract proto-mapping extensions (toOrderKpisProto(), toTrendsProto(), etc.) into a separate mappers/ package — they're already extension functions but sit inside the service class.

2. High: Duplicate BaggageSpanProcessor (Exact Copy)

Both services contain an identical OpenTelemetry SpanProcessor:

  • analytics-service/.../observability/BaggageSpanProcessor.kt
  • api/.../config/BaggageSpanProcessor.kt

Action: Extract to a shared module (e.g., web-common or observability-common).


3. High: Triplicate TENANT_ID_HEADER Constant

The gRPC metadata key Metadata.Key.of("tenantid", Metadata.ASCII_STRING_MARSHALLER) is defined independently in 3 places:

  • analytics-service/.../TenantIdInterceptor.kt:39
  • api/.../TenantContextClientInterceptor.kt:19
  • api/.../AppEventsServerTenantInterceptor.kt:37

Action: Define once in a shared module (e.g., grpc-common or domain).


4. High: Duplicate AnalyticsConfigProvider (Same gRPC Call, Different Caching)

Both services call tenantManager.getAnalyticsConfig() and cache the result:

  • analytics-service: Guava CacheBuilder, 1-day TTL
  • api: ConcurrentHashMap, no TTL (stale data risk)

Action: Extract to a shared library with consistent TTL-based caching.


5. High: Duplicate ClickHouse Connection Mapping

Both services map TenantManagerProto.Tenant fields to ClickHouse coordinates:

  • analytics-service/.../DatabaseConnectionProvider.kt:120-127
  • api/.../TenantDatabaseCoordinatesProvider.kt:91-110

Near-identical except the api strips commas from hostnames.

Action: Extract to a shared utility in database/clickhouse.


6. Medium: Inconsistent Clock Timezones

  • analytics-service: Clock.systemUTC()
  • api: Clock.systemDefaultZone()

This is likely a latent bug. Many api files default to Clock.systemUTC() in constructor params, contradicting the bean.

Action: Align to Clock.systemUTC() everywhere. Audit which timezone is actually correct for the business logic.


7. Medium: Inconsistent TENANT_MANAGER_CLIENT_NAME

  • analytics-service: "tenantManager" (camelCase)
  • api: "tenant-manager" (kebab-case, raw strings in @GrpcClient)
  • tenant-migrator: "tenant-manager" (different value!)

Action: Standardize to one convention. Define in a shared constants file.


8. Medium: 10x Duplicated Error-Handling Pattern in API Client

api/.../AnalyticsServiceClient.kt has 10 methods that all follow:

suspend fun method(request: Req): Resp = withContext(MDCContext()) {
    try { stub.method(request) }
    catch (e: CancellationException) { throw e }
    catch (e: Exception) { throw AnalyticsServiceException("...", e) }
}

Action: Extract a generic suspend fun <T> gRPC(block: suspend () -> T): T wrapper.


9. Medium: 3 Different Date Validation Code Paths in analytics-service

Date format validation (YYYY-MM-DD) appears in:

  • ExceptionAndAlertHandler.kt:182-186 (via handleTimeframeException)
  • ParameterValidation.kt:412-425 (via LocalDate.parse())
  • AnalyticsService.kt:763-783 (inline in getTeamKpis)

Each has a subtly different error message.

Action: Consolidate into a single validateDateString(value: String): LocalDate utility.


10. Medium: Locale Validation Logic Not Shared

  • analytics-service/.../LocalePreferenceInterceptor.kt — gRPC header, regex validation, normalization
  • api/.../LocaleGraphQlInterceptor.kt — HTTP header, Locale.forLanguageTag(), fallback to German

The normalization logic (lowercase language, uppercase region) could be shared if the api needs it.

Action: Extract locale normalization to a shared utility if both services need consistent behavior.


11. Low: Logging Interceptor Duplication

Both services implement request timing + MDC logging:

  • analytics-service/.../LoggingInterceptor.kt (gRPC)
  • api/.../RequestLoggingFilter.kt (HTTP)

Structurally identical pattern but different protocols. The MDC key "tenant" (analytics) vs "user.tenant_id" (api) for the same concept is inconsistent.

Action: Standardize MDC key names across services.


12. Low: Exception Handling String Duplication

Both services use "An unexpected error occurred" as their catch-all error message:

  • analytics-service/.../ExceptionAndAlertHandler.kt:144
  • api/.../GlobalGraphQlExceptionHandler.kt:20

Action: Define in a shared constants file if these should remain in sync.


Summary: Prioritized Action Items

Priority Item Effort Impact
P0 Extract handlers from AnalyticsService.kt god class High Maintainability, testability
P1 Extract shared BaggageSpanProcessor to common module Low Eliminate exact copy
P1 Centralize TENANT_ID_HEADER constant Low Eliminate 3x duplication
P1 Fix api AnalyticsConfigProvider caching (add TTL) Low Bug fix (stale data)
P2 Extract shared AnalyticsConfigProvider Medium Eliminate near-copy
P2 Extract ClickHouse connection mapping to shared util Medium Eliminate near-copy
P2 Fix clock timezone inconsistency Low Bug fix
P2 Extract generic gRPC wrapper in API client Low Eliminate 10x boilerplate
P2 Consolidate date validation into single utility Low Eliminate 3x duplication
P3 Standardize MDC key names Low Observability consistency
P3 Standardize TENANT_MANAGER_CLIENT_NAME Low Eliminate inconsistency

What NOT to Change

The shared domain module (domain/src/.../kpis/model.kt) is well-structured with no duplication — the three-layer type mapping pattern (proto → domain → GraphQL) is clean. The product hierarchy system documented in CLAUDE.md is complex but correctly so — the ragged-level handling and rollup routing logic are well-reasoned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment