Let us define an observable microservice as one that exposes its internal state through events, metrics, and trace data. Ideally this data is aggregated in a centralized observability system for alerting and diagnostics. Some tools work much better than others, and I will use my preferred ones in this example, but the most important part is that you have a starting point: your services can be observed and alerted on somehow. If your company does not have the means to set up all of this infrastructure, consider buying one of the many SaaS options out there today.
Disclaimer: I am not a native .NET or F# developer; at this point in my career I've logged about two weeks on an F# dev team, so the implementations given below may not be idiomatic. My experience comes from being part of both DevOps and microservice dev teams in other organizations.
That being said, let's start building an "observable" microservice.
We can use the following Giraffe project as our starting point. In a later article we can see how these pieces fit together with an additional service; for now let's keep this a 30,000-foot view of the landscape.
open System
open Microsoft.AspNetCore.Builder
open Microsoft.AspNetCore.Hosting
open Microsoft.Extensions.DependencyInjection
open Giraffe

let webApp =
    choose [
        route "/ping" >=> text "pong"
        route "/"     >=> htmlFile "/pages/index.html" ]

let configureApp (app : IApplicationBuilder) =
    // Add Giraffe to the ASP.NET Core pipeline
    app.UseGiraffe webApp

let configureServices (services : IServiceCollection) =
    // Add Giraffe dependencies
    services.AddGiraffe() |> ignore

[<EntryPoint>]
let main _ =
    WebHostBuilder()
        .UseKestrel()
        .Configure(Action<IApplicationBuilder> configureApp)
        .ConfigureServices(configureServices)
        .Build()
        .Run()
    0
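With the project running you can sanity-check the routes with curl (this assumes Kestrel's default port of 5000; yours may differ):

curl http://localhost:5000/ping
# pong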
In this example we will be using Prometheus. Prometheus works by exposing metrics on an HTTP endpoint so that a scraping service can pull metrics on a fixed interval. By convention this endpoint is /metrics.
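On the Prometheus side, a minimal scrape configuration for a locally running instance of this service might look like the following; the job name and target address are assumptions:

scrape_configs:
  - job_name: 'giraffe-service'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:5000']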
To get started with Prometheus in F#, we will use the prometheus-net.AspNetCore package.
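The package can be added via the dotnet CLI:

dotnet add package prometheus-net.AspNetCore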
We can update our initial web app by adding a metrics server and HTTP metrics middleware in the configureApp function (both operate on IApplicationBuilder, so they belong there rather than in configureServices). This exposes HTTP-specific metrics such as request durations and request counts, as well as runtime metrics such as memory usage, CPU time, and open file descriptors.
// Requires: open Prometheus
let configureApp (app : IApplicationBuilder) =
    // Expose the /metrics endpoint for the Prometheus scraper
    app.UseMetricServer() |> ignore
    // Record default HTTP metrics (request durations, counts, etc.)
    app.UseHttpMetrics() |> ignore
    // Add Giraffe to the ASP.NET Core pipeline
    app.UseGiraffe webApp
Here is some example output of a service with some generated load.
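Scraping the /metrics endpoint returns plain text in the Prometheus exposition format. The numbers below are made up and the label sets are trimmed, but the metric names are among the defaults exported by prometheus-net:

# TYPE http_requests_received_total counter
http_requests_received_total{code="200",method="GET"} 42
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_sum{code="200",method="GET"} 0.25
http_request_duration_seconds_count{code="200",method="GET"} 42
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.52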
Exporting HTTP and process metrics is all well and good, but we can do better: we can use this as an opportunity to expose domain-specific metrics as well. One way to do that is to define a let-bound metric within a domain module. For example, if we have a scraping service we may want to export the number of attempted scrapes, a count of successful scrapes, errors, and latency.
Example:
// Requires: open Prometheus and open Microsoft.AspNetCore.Http
let pingCounter =
    Metrics.CreateCounter("pinger_ping_total", "The total number of pings")

let pingHandler : HttpHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->
        task {
            // Inc returns unit, so no ignore is needed
            pingCounter.Inc()
            return! text "pong" next ctx
        }
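Latency fits the same pattern with a histogram. Below is a minimal sketch using prometheus-net's NewTimer helper; the metric name and the doScrape function are hypothetical stand-ins:

let scrapeDuration =
    Metrics.CreateHistogram("scraper_scrape_duration_seconds", "Duration of scrape attempts in seconds")

let timedScrape url =
    task {
        // The timer records the elapsed time into the histogram when disposed
        use _ = scrapeDuration.NewTimer()
        // doScrape is a stand-in for your actual scraping logic
        return! doScrape url
    }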
As the number of services in a system grows, the ability to trace a request end-to-end becomes critical to quickly understanding where in a process errors occur and where latency is introduced. Two ways you might approach this are logging and OpenTracing.
Logging can be a great first step, especially while your service graph is small. I would suggest logging and propagating the original incoming request ID along with child request IDs; this allows you to piece together the resulting graph of upstream requests. The idea here is not to re-invent distributed tracing with a log aggregator but to provide actionable first steps.
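As a rough sketch of what propagation could look like in Giraffe (X-Correlation-ID is a common convention, not a requirement, and this assumes Giraffe's request/response header helpers):

open System
open Microsoft.AspNetCore.Http
open Giraffe

let withCorrelationId : HttpHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->
        // Reuse the incoming id if present, otherwise mint a new one
        let correlationId =
            match ctx.TryGetRequestHeader "X-Correlation-ID" with
            | Some id -> id
            | None -> Guid.NewGuid().ToString()
        // Echo it back so callers and log lines can be correlated
        ctx.SetHttpHeader("X-Correlation-ID", correlationId)
        next ctx

Wiring it in front of the whole app is then a matter of composition, e.g. let webApp = withCorrelationId >=> choose [ ... ].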
It is hugely beneficial to come up with a standard logging format: it lets you maintain a minimal set of parsers to correctly display and index your logs. This is not always possible when dealing with a large number of teams or many different programming languages, but prefer to standardize where you can.
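As an illustration, a standardized JSON log line might carry fields like these; the field names are an assumption, not an established standard:

{
  "timestamp": "2019-01-01T12:00:00Z",
  "level": "Information",
  "service": "pinger",
  "requestId": "c0ffee01",
  "parentRequestId": "c0ffee00",
  "message": "GET /ping responded 200"
}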