Skip to content

Instantly share code, notes, and snippets.

@noahfalk
Last active March 20, 2023 13:07
Show Gist options
  • Save noahfalk/85668c0e820e13650f9e344aafeba135 to your computer and use it in GitHub Desktop.
Save noahfalk/85668c0e820e13650f9e344aafeba135 to your computer and use it in GitHub Desktop.

Disclaimer: This is very much a work in progress and I haven't worked through the details or fretted over naming. The goal is to get feedback on the basics and iterate. Perhaps we'll find it is fatally flawed, but we got to start somewhere :)

Use cases

I think there are likely two different schools of thought for how people will want to use these APIs:

  1. Ad-hoc usage - These are folks who want to add some metrics to their code with minimal effort.
  • Localized code changes only, ideally within a single class in a single file
  • Add no new types
  • Minimize LoC required, simple to follow best practice
  • Easy to iterate and change in the future
  • Easy to learn, not much abstraction
  • Easily unit testable
  • Not bothered by mixing telemetry schema/metadata definition and usage in their code
  1. De-coupled usage - These are folks who want a strong boundary between defining the instrumentation schema/metadata and its usage.
  • Strongly typed API that defines available instrumentation
  • Configuration of names/descriptions/units/versions/tags centralized and separated from usage
  • Instrumentation can be easily registered and shared via DI
  • Easily unit testable
  • Accept a modest increase in abstractions/LoC/files touched in order to achieve the decoupling

My hope is to have a single API surface that can reasonably satisfy both use-cases depending on the pattern chosen, and that it shouldn't be too hard to switch between the patterns or mix and match within the same project as needs change.

Ad-hoc pattern

public class FruitStoreController
{
    Counter<int> _fruitSold;
    ObservableGauge<int> _ordersOutstanding;
  
    public FruitStoreController(IMeterFactory meterFactory, IOrderService orders)
    {
        Meter m = meterFactory.CreateMeter<FruitStoreController>(); // alternately you could specify a string name
        _fruitSold = m.CreateCounter<int>("fruit-sold", "Amount of fruit sold at our store");
        _ordersOutstanding = m.CreateObservableCounter<int>("orders-outstanding", orders.GetOutstanding, "Orders created and not yet shipped");
    }
  
    public IActionResult PlaceOrder(int fruitCount)
    {
        _fruitSold.Add(fruitCount);
        ...
    }
}

Decoupled Pattern

This one consists of three parts: defining the instrumentation, registering the instrumentation, and using the instrumentation

Define instrumentation

class FruitStoreInstrumentation
{
    public FruitStoreInstrumentation(IMeterFactory factory, IOrderService orders)
    {
        Meter m = meterFactory.CreateMeter("FruitCo.FruitStore"); // alternately you could specify a type name
                                                                  // but if this instrumentation is shared across several
                                                                  // components then there may not be a single type name which makes sense
        FruitSold = m.CreateCounter<int>("fruit-sold", "Amount of fruit sold at our store");
        OutstandingOrders = m.CreateObservableCounter<int>(orders-outstanding", orders.GetOutstanding, "Orders created and not yet shipped");
    }

    public Counter<int> FruitSold { get; init }
    public ObservableCounter<int> OutstandingOrders { get; init; }
}

Register instrumentation

services.AddSingleton<FruitStoreInstrumentation>();

Use instrumentation

public class FruitStoreController
{
    FruitStoreInstrumentation _instrumentation;
  
    public FruitStoreController(FruitStoreInstrumentation instrumentation)
    {
        _instrumentation = instrumentation;
    }
  
    public IActionResult PlaceOrder(int fruitCount)
    {
        _instrumentation.FruitSold.Add(fruitCount);
        ...
    }
}

How to test

[Fact]
public void OrderIncreasesFruitSold()
{
    // arrange
    using meterFactory = new MeterFactory();
    IOrderService orderService = new FakeOrderService();
    FruitStoreController controller = new FruitStoreController(meterFactory, orderService);
    using InstrumentRecorder<int> recorder = new InstrumentRecorder(meterFactory, typeof(FruitStoreController), "fruits-sold");
    
    // act
    controller.PlaceOrder(4);
    controller.PlaceOrder(18);
    
    // assert
    Assert.Equal(recorder.Measurements.Count, 2);
    Assert.Equal(recorder.Measurements[0], 4);
    Assert.Equal(recorder.Measurements[1], 18);
}

[Fact]
public void OutstandingOrdersReportsOrderServiceValue()
{
    // arrange
    using meterFactory = new MeterFactory();
    IOrderService orderService = new FakeOrderService();
    FruitStoreController controller = new FruitStoreController(meterFactory, orderService);
    using InstrumentRecorder<int> recorder = new InstrumentRecorder(meterFactory, typeof(FruitStoreController), "fruits-sold");
    
    // act
    // this takes a snapshot of the current value
    // we could do things that change the value and take multiple snapshots if necessary
    recorder.SnapshotObservableInstrument();
    
    // assert
    Assert.Equal(recorder.Measurements.Count, 1);
    Assert.Equal(recorder.Measurements[0], 49);
}

class FakeOrderService : IOrderService
{
    public int GetOutstandingOrders() => 49;
}

Different overloads of the InstrumentRecorder constructor are possible depending on what information is easily available:

  1. factory + Meter name + instrument name
  2. Meter reference + instrument name
  3. Instrument reference

In the decoupled case where we directly expose the instrument as part of the API the instrument references are probably easier to use. With the APIs from .NET 6 it is possible to implement (2) and (3), but we'd need something new as part of this feature if we want to identify a Meter by factory+name.

New API/behavior needed

  1. Create MeterFactory and IMeterFactory. The factory would need to cache any Meters it creates and Dispose them when the factory is disposed. It also needs to return pre-existing Meters when given the same name+version.
  2. We probably want an extension method on ServiceCollection to add the factory and call it by default from hosts, similar to logging.
  3. All the Meter.CreateXXX APIs need to start returning cached instruments when given the same argument values
  4. We need some support for a MeterListener to receive publish notifications for Meters from a certain factory rather than from all Meters
  5. The InstrumentRecorder either needs to be implemented in some assembly or it could be a small code snippet we document and people paste it into their tests.

What about interfaces?

This design does not add any interfaces for Meters or Instruments on the premise that they aren't needed to accomplish what devs want to do.

  1. For creating test mocks - the unit test above provides an easy alternative that requires no mocks.
  2. For capturing and transmitting metric measurements during production - use MeterListener to receive the data.
  3. For changing any other aspect of Meter/Instrument API behavior - I've never heard anyone ask and its probably by-design that you can't, but if anyone thinks there is an important scenario here being missed lmk.

Composing with other future features

Strongly typed instruments (dotnet/runtime#77516)

Either of the patterns could use more strongly typed instruments in place of the current ones.

Target-typed or named DI capabilities

Meter could be injected directly rather than MeterFactory if a good name can be automatically inferred. This eliminates needing to use a MeterFactory abstraction as part of the Meter/Instrument definition process, saves 1 LoC, and may improve naming consistency. However I am not confident that the enclosing type name will always be a good name so I expect the documentation will still need to show users how to use the MeterFactory to get a different name when needed. Subjectively I think this represents a modest improvement to the overall scenario but others felt this difference represented a more substantial improvement.

I have no opposition to either of these DI features but nor do the patterns above feel in critical need of them.

Telemetry from multiple DI containers

Today if you want to view the telemetry from outside the process then a FruitStore meter from DI container A looks identical to one in container B. Two approaches that might useful:

  • Similar to the InstrumentRecorder that was bound to a particular factory during testing, any telemetry library could scope itself to the Meters that came from one factory rather than all Meters in the process. This lets folks set up distinct pipelines per-container and similar to what OTel does for logging.
  • Customizing the IMeterFactory could allow tagging Meters with some container specific data that allows them to disambiguated later. This would also rely on Meters themselves gaining some functionality to store tags. Today tags are only defined on individual measurements, rather than the broader instrument or meter scopes.

Other forms of instrumentation

I deliberately didn't name the FruitStoreInstrumentation using the word Metrics. I think it would be perfectly reasonable for developers to put other instrumentation objects such as ILogger, ActivitySource, DiagnosticSource, etc in the same containing type. In the ad-hoc case I expect other instrumentation objects would be stored as fields in the controller (or whatever class is producing the data).

@julealgon
Copy link

luckily it's the same team ... but there are other points of views.

If it's the same team it baffles me even more that people don't see it for what it is: an obvious hack... we are now discussing expanding upon hacks to add similarly hacky APIs and experienced people here like you and James are seemingly ok with that. I'd think you'd be one of the people who'd want to fix the issues at their root.

It seems absurd to me that you'd consider this a "point of view", as well. This is just an objective observation of a significant shortcoming in the DI system that has been consistently worked around in a multitude of nasty ways by consuming libraries such as .Logging, .Options, .Http and now telemetry. The more hacks keep being built around it, the harder it will be to ever fix the underlying issue. We'll get to a point where you'd just say "oh well... now we can't change it... too much stuff depends on how it is... too bad".

I can see you are annoyed by my insistence here, but the reason I am so adamant about it is because this is an absolutely huge opportunity to streamline a significant portion of the DI systems presenting itself.. this happens rarely with DI.

I do wish you guys took community feedback slightly more seriously however, instead of making remarks such as "its just 1 LoC", "its a minor syntax optimization" or just another "point of view" for something so incredibly obvious and objectively bad as this topic.

@davidfowl
Copy link

I can see you are annoyed by my insistence here, but the reason I am so adamant about it is because this is an absolutely huge opportunity to streamline a significant portion of the DI systems presenting itself.. this happens rarely with DI.

I'm saying, we heard you opinion and are considering it. I'm just trying to tell you that you don't have to keep repeating it. I think you're trying to hear us acknowledge that it's a hack, I won't do that.

I do wish you guys took community feedback slightly more seriously however, instead of making remarks such as "its just 1 LoC", "its a minor syntax optimization" or just another "point of view" for something so incredibly obvious and objectively bad as this topic.

Once again, thank you for your well thought out opinion. You don't need to dismiss other opinions by calling them absurd, that's not a great way to communicate.

@noahfalk
Copy link
Author

An alternative to Meter could be to go with HttpClientFactory's approach of strongly typed HTTP clients. However, that would require the strongly typed clients to be registered in DI:

I thought about that pattern a bit but so far it doesn't seem as good to me. The HttpClientFactory style approach appears to split the configuration into two places with the meter name being defined in the DI container initialization code and the instrument names being defined in the constructor of the type. In the ad-hoc usage scenario that breaks the assumption that developers would like to edit just one file, and in the decoupled usage scenario it breaks the assumption that developers want all the configuration centralized in one place in code.

The other trouble I am anticipating on that path is it makes the strong type Meter-centric. I think code like this:

services.AddMeter<FruitStoreInstrumentation>(name: "FruitCo.FruitStore");

makes sense when Meter is the only thing being injected into the FruitStoreInstrumentation. If we think of an alternative use-case where there is also a Logger, ActivitySource, or DiagnosticSource in there and all of those things could use names now it would feel odd to me that Meter gets one strategy to set up name and all the others would use a different strategy. I'm guessing we'd instead avoid patterns that intermingle different kinds of telemetry in the same container type and devs would end up with types FruitStoreLogging, FruitStoreMetrics, FruitStoreDistributedTracing, and so on. It would work, but feels unnecessarily verbose to me.

Do other folks have similar opinions or you think .NET devs will see it differently?

Perhaps telemetry from IMeterFactory isn't available anywhere globally until the host (or whatever created the ServiceCollection) registers them centrally with a name.

That is possible but it wouldn't be my first choice. I was thinking along the lines:

  • Folks who create a globally scoped MeterListener still see all Meters in the process, regardless whether they are created by a factory
  • Folks who create a MeterListener scoped to a particular factory only see the Meters from that factory.
    My sense was there was no need to interfere with global observations in order to support more scoped observations. For example I didn't want dotnet-counters to stop showing the data when a dev switches from static global Meters to DI with a factory.

services.AddMetrics(options => option.DefaultMetricTags.Add("app1"));

Yeah, I think this one is more effective if you have multiple DI containers in the same app and you want to distinguish them. It does require a new Meter feature to support default tags but I'm hopeful that won't be a big deal. We are in somewhat murky territory when it comes to adding new features to the Meter API surface when OTel doesn't prescribe the feature. For example its possible we add the feature using one name for the API then OTel spec adds it later but named differently.

Thanks @JamesNK!

@JamesNK
Copy link

JamesNK commented Feb 16, 2023

Everything you said makes sense.

Is there any agreement or standards around tag values? I'm thinking about a tool like dotnet-counters would understand and make sense of the values its given. Even though in .NET there are separate meters and counters with different tags, dotnet-counters is still displaying totals in one list. Could there be a tag format that indicates to tooling that this set of counters belongs to a separate factory.

For example, today:

[Microsoft.AspNetCore.Hosting]
  Current requests          10

Counters split by factory using a known tag, e.g. collection:app1 and collection:app2:

[Microsoft.AspNetCore.Hosting - app1]
  Current requests          9

[Microsoft.AspNetCore.Hosting - app2]
  Current requests          1

@noahfalk
Copy link
Author

Today in dotnet-counters if you specify different tags then each unique tag set will be shown individually. dotnet-counters doesn't know what they mean, but it does know they aren't the same. For example:

[HatCo.HatStore]
    hats-sold (Count / 1 sec)
        Color=Blue,Size=19                             9
        Color=Red,Size=12                             18

If all instruments in factory A had a container=app1 tag and all instruments in factory B had a container=app2 tag then we might end up with:

[HatCo.HatStore]
    hats-sold (Count / 1 sec)
        Color=Blue,container=app1,Size=19              9
        Color=Red,container=app1,Size=12              18
        Color=Blue,container=app2,Size=19              3
        Color=Red,container=app2,Size=12              29

To me that seems OK, not awesome. If we wanted to go further we could make the dotnet-counters treat the "Container" tag in a special way, or give general purpose command-line options where users could filter and pivot the data based on tags. Before we did anything too fancy I'd want to figure out how often do we expect people to have production scenarios with multiple containers running SxS. My understanding was that multiple containers were common while doing parallel testing, but rare in production. If that is correct then I'd be tempted to start with a cheap UI experience like the one above and wait for customer feedback to decide if more improvements were merited.

@JamesNK
Copy link

JamesNK commented Feb 16, 2023

I didn't know that about dotnet-counters. I agree it's fine for now.

I think a convention for separating values by container (basically, a known tag name) would be useful to foster a good ecosystem. It could be represented in the .NET API by a strongly typed property on the meter configuration:

services.AddMetrics(options => option.ContainerTag = "app1");
// the equivalent to option.DefaultMetricTags.Add("container", "app1")

@julealgon
Copy link

julealgon commented Feb 16, 2023

From all your back and forth, I see the discussion appears to be assuming that having some sort of "instrumentation" container/wrapper is going to be the way to use instrumentation... which seems odd to me: that's not what I'm doing at all, for example.

You see, instead of:

public class FruitStoreController
{
    FruitStoreInstrumentation _instrumentation;
  
    public FruitStoreController(FruitStoreInstrumentation instrumentation)
    {
        _instrumentation = instrumentation;
    }
  
    public IActionResult PlaceOrder(int fruitCount)
    {
        _instrumentation.FruitSold.Add(fruitCount);
        ...
    }
}

I'd rather have:

public class FruitStoreController
{
    Counter _fruitsSold;
  
    public FruitStoreController(Counter fruitsSold)
    {
        _fruitsSold = fruitsSold;
    }
  
    public IActionResult PlaceOrder(int fruitCount)
    {
        _fruitsSold.Add(fruitCount);
        ...
    }
}

The additional indirection of having an "instrumentation" container class that holds the instruments for me just adds nothing semantically. It seems that, once again, you are introducing boilerplate and additional abstractions and indirections to solve the container-based problem that "there is no mechanism to inject different raw Counter types into certain specific services". I have to once again go back to the way HttpClient works here. You don't see this type of thing:

public class FruitStoreController
{
    ExternalApiConnection _externalApiConnection;
  
    public FruitStoreController(ExternalApiConnection externalApiConnection)
    {
        _externalApiConnection = externalApiConnection;
    }
  
    public async Task<IActionResult> PlaceOrderAsync(int fruitCount)
    {
        var orderFromExternalApi = await _externalApiConnection.Client.GetAsJsonAsync<Order>("order");
        ...
    }
}

public class ExternalApiConnection
{
    public ExternalApiConnection(IHttpClientFactory httpClientFactory)
    {
        Client = httpClientFactory.CreateClient("SomeName") { BaseAddress = "http://my.com/api" };
    }

    public HttpClient Client { get; }
}

And the main reason for not seeing this is that the indirection is unwanted... the controller (or any other consumer) shouldn't have to be forced to known about a certain indirection just to get to what it needs: an already configured HttpClient. Which is why we do this instead:

public class FruitStoreController
{
    HttpClient _client;
  
    public FruitStoreController(HttpClient externalApiConnection)
    {
        _client = client;
    }
  
    public async Task<IActionResult> PlaceOrderAsync(int fruitCount)
    {
        var orderFromExternalApi = await _client.GetAsJsonAsync<Order>("order");
        ...
    }
}

services.AddHttpClient<FruitStoreController>(c => c.BaseAddress = "http://my.com/api");

IMHO, the exact same needs to happen with instrumentation. I want the class that's going to use the instruments to:

  1. Not have to go through some indirection container class because of DI-related limitations/concerns
  2. Not have to be concerned with how the instruments are configured, as that responsibility should sit outside of the consumer of the instrument

Your decoupled version solves 2, but violates 1. My proposal below solves both.

Now... what is the main difference between HttpClient and Counter/ActivitySource/Histogram/Observable... usage? Usually, a class only depends on a single HttpClient, while it is fairly normal for a service to need to handle multiple instruments at once. This, of course, makes it that much harder to create a similar API as the one for HttpClient, because it only accepts a single client per service.

And this is why I've been mentioning the container itself so much: if the container provided a away to associate arbitrary configurable types to underlying services, you could configure multiple instruments and associate them with any given type similar to what AddHttpClient does today.

I believe an API that promotes configuration at the aggregation root level (like AddHttpClient) is vastly superior to something where I suddenly have to inject the "configurator" into my class and manually configure the instruments in the constructor: we don't do that for HttpClient, and we shouldn't have to do it for any other type of dependency.

My ideal API would probably look something like:

services
    .AddMeter(c => 
    {
        configure the meter here. container level, only one. need to thing how the api would look with multiple ones of course
    })
    .AddTransient<IMyService, MyService>(c => 
    {
        c.WithCounter("first-counter", whateverOtherSettings);
        c.WithCounter("second-counter", whateverOtherSettings);
        c.WithActivitySource(...)
    });

Which, again, would be a generalized but extensible version of what AddHttpClient does: WithCounter and WithActivitySource above would be extensions over some raw interface in the container that allows arbitrary association, such as:

services
    .AddTransient<IMyService, MyService>(c => 
    {
        c.With<string>("connectionString", value);
        c.With<HttpClient>("client", provider => provider.GetRequiredService<IHttpClientFactory>().CreateClient("MyWeirdClient"))
        ...
    });

public class MyService : IMyService
{
    public MyService(string connectionString, HttpClient client) { ... }
}

Which would tie back to the APIs I mentioned from Autofac, Structuremap, Unity, etc.

@tekian
Copy link

tekian commented Feb 27, 2023

Thanks @noahfalk for the proposal, these are all steps in the right direction.

I agree with @JamesNK that whether people use IMeter<T> (as an equivalent to ILogger<T>) or IMeterFactory (as an equivalent of ILoggerFactory) would be mostly based on their taste. There is little to no confusion with that established pattern.

The HttpClientFactory style approach appears to split the configuration into two places with the meter name being defined in the DI container initialization code and the instrument names being defined in the constructor of the type.

Right on!
(I actually think that HttpClientFactory style approach isn't doing a favor to how we configure HttpClient but that's a different discussion altogether.)

Note: there is an additional approach or a variation to the second de-coupled model, in which metric code is de-coupled from the business class, but not into an instrument class on its own, but rather into a decorator implementation.

public interface IFoo {
    Task<Result> GetBar(int id);
}

public class Foo : Foo { ... }
public class MeterFoo : Foo {
    MeterFoo(IFoo inner) { ... }
}

Helps keep the original business class focused on what matters.

@julealgon
Copy link

I agree with @JamesNK that whether people use IMeter<T> (as an equivalent to ILogger<T>) or IMeterFactory (as an equivalent of ILoggerFactory) would be mostly based on their taste. There is little to no confusion with that established pattern.

What about using another established pattern that doesn't introduce any extra indirections or abstractions, the HttpClient approach?

Just because a pattern is already in use, that doesn't mean it is a good pattern to expand upon.

The HttpClientFactory style approach appears to split the configuration into two places with the meter name being defined in the DI container initialization code and the instrument names being defined in the constructor of the type.

Right on!

This does not have to be the case. You can have 100% of the configuration aspects handled in the DI setup and leave 0 for the consuming class, again, as is done with HttpClient. Please check my example API above. Consumers of the HttpClient do not have to perform extra configuration steps, nor should consumers of Counter<int> for example.

@noahfalk
Copy link
Author

Thanks for the nice write up @julealgon! Sorry I've been away from this for a couple weeks and trying to get back to it a bit. @JamesNK is the main person driving the work at the moment.

Not have to go through some indirection container class because of DI-related limitations/concerns

As I understand it, you see the container type as solely existing to mitigate a DI limitation and that it has no other value. Based on my interactions with various devs and seeing code on different projects my understanding is that while your view is a perfectly legitimate one and might be optimal for some situations, it isn't a widely held view among devs asking for these metrics improvements. Most people who have indicated a desire for isolation are looking for a relatively strong form of it: strong typing, separation from other concerns (including presumably other DI configuration code), and potentially separate source file(s). Beyond just differing stylistic preferences others might like this strong isolation so that they can audit the telemetry surface compactly represented in one place, use 'Find All References' and 'Go-to Definition' to quickly browse between definition and usage, or use methods/properties on the container to do things with strongly typed source generators like this or this.

The pattern you want and the patterns I understand many others want certainly aren't mutually exclusive. My main concern around your proposal isn't an objection to the user experience of the API, it is based on limited perceived demand from others and the amount of time and effort involved in the implementation relative to alternatives. Although I claim no great expertise in .NET's DI implementation, I assume adding a new DI feature isn't solely a matter of changing the default implementation in Microsoft.Extensions.DI. It would also involve coordination with the owners of other popular DI implementations that plug into the same interface, planning for what happens if any of the popular alternate implementations doesn't support the new feature, coordinating the roll-out between .NET and external DI implementers, and analyzing the impact of the new feature on perpendicular efforts like NativeAOT that needs to avoid various reflection and dynamic codegen operations that are likely to show up in the implementations of high performance DI containers.

Although I think the odds are pretty low that the runtime team will implement the DI approach you are advocating for within the scope of this metrics feature, here are some things you could do to make headway towards your vision if you wanted to:

  1. Open a GH issue describing what a good version of the DI feature could look like and who would benefit, then encourage other .NET developers to vote that issue up to show that there is plenty of support. The advantage of having an issue that focuses on DI explicitly is that you can aggregate a bunch of different use-cases together rather than trying to justify it entirely based on just the usage it would get in one specific feature like this one.
  2. Check out https://github.com/dotnet/runtime/blob/main/docs/area-owners.md#areas to determine an area specific owner/expert for DI (@dotnet/area-extensions-dependencyinjection) and hopefully get their opinion on it. Ultimately the area owners are primary people to convince about the importance of a feature because they are going to have a big role in figuring out priorities and doing work in that area.
  3. If you are interested in contributing to design, code, or testing of a feature that can help lower the amount of effort involved from others and makes the feature more attractive when prioritizing.

Sorry I know that probably wasn't what you were aiming for, but hopefully it still helps somewhat. Even though we aren't acting on it now the DI area owners are going to see all your feedback.

@julealgon
Copy link

Sorry I know that probably wasn't what you were aiming for...

It definitely wasn't as I wished folks from inside the .NET team would also be interested in solving long-standing workarounds and cleanup the overall DI architecture. The fact that these are still not seen as workarounds at all is baffling to me, and very unfortunate since it basically means we not only will not fix it, but will perpetuate the bad patterns with more and more usages (like this one) making it even harder to fix in the future.

I appreciate the guidance regardless @noahfalk . I'll consider starting an issue in the DI space, but from what I'm seeing in this discussion, I doubt it will go very far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment