@lmolkova
Last active January 13, 2022 20:29
Configuration in Azure SDK for Java

Azure SDK configuration

Prior art

Goal

  • Define a way to apply non-code configuration to Azure SDK
    • support idiomatic configuration approaches for different frameworks/languages
  • Define guidelines for SDKs and languages to follow
    • property conventions: structure, naming principles
    • configuration API and extensibility points
    • ambient vs explicit configuration and priority
    • error handling and debugging

The main consumers of this feature are web-framework integrations with Azure SDKs, such as Azure Spring Cloud (or any future integrations with frameworks and environments).

Non-goals

  • Make explicit configuration fully-consistent across languages
  • Have great experience for mixed code and non-code configuration: it must still work reasonably and deterministically, but we're not optimizing for this case
  • Define all configuration properties in every language. Defining properties can be done for options/clients gradually as a next step

Examples

Overall structure

Configuration should support arbitrary formats: JSON, YAML, XML, or flat key-value files. The only common denominator is getting a property by name (+ metadata). Naming conventions differ between languages/frameworks: separators, casing, etc.

  • .NET: already supports configuration with MS.Extensions.Configuration; no changes are suggested, and the example is provided to demonstrate the variety of needs

    {
      "AzureDefaults": {
        "Retry": {
          "MaxRetries": 3,
          "Mode": "Exponential",
          "Delay": "00:00:01",
          "MaxDelay": "00:00:30"
        }
      },
      "KeyVault": {
        "VaultUri": "https://mykeyvault.vault.azure.net"
      },
      "Storage": {
        "ServiceUri": "https://mydemoaccount.storage.windows.net"
      }
    }
    services.AddAzureClients(builder =>
    {
      builder.AddBlobServiceClient(Configuration.GetSection("Storage"));
      builder.ConfigureDefaults(Configuration.GetSection("AzureDefaults"));
    });

Azure Core has different abstractions for retry options in different languages, so configuration properties might look different:

  • Java: new; a similar approach already exists in Azure Spring integrations, and Quarkus and Micronaut have similar configuration stories

    http.retry.strategy = exponential
    http.retry.strategy.exponential.max-retries = 7
    http.retry.strategy.exponential.base-delay = PT1S
    http.retry.strategy.exponential.max-delay = PT2S
    
    appconfiguration.http.proxy.host = appconfigproxy.contoso.com
    appconfiguration.http.proxy.port = 80
    appconfiguration.connection-string = DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey
    
    storage.connection-string = DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey

    or (Azure Spring configuration)

    spring:
      cloud:
        azure:
          storage:
            fileshare:
              account-name: myaccount
              account-key: mykey
              endpoint:  https://endpoint.com
    // Configuration source is new; we don't need to ship FileSource implementation in Core, abstraction is a good start
    com.azure.core.util.Configuration configuration = new Configuration(new org.example.FileSource("application.properties"));
    
    // API exists today, but supports a few common configuration properties
    new BlobServiceClientBuilder()
        .configuration(configuration)
        .build();
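
To make the flat-properties example above concrete, here is a minimal sketch of how the exponential retry properties could be read and converted. `RetrySettingsSketch` is a hypothetical holder, not an Azure SDK type, and the property names are the ones proposed above, which may still change. The only library calls used are `java.util.Properties.load` and `java.time.Duration.parse` (ISO-8601 durations such as `PT1S`).

```java
import java.io.IOException;
import java.io.StringReader;
import java.time.Duration;
import java.util.Properties;

// Hypothetical holder for retry settings; not an Azure SDK type.
final class RetrySettingsSketch {
    final int maxRetries;
    final Duration baseDelay;
    final Duration maxDelay;

    RetrySettingsSketch(int maxRetries, Duration baseDelay, Duration maxDelay) {
        this.maxRetries = maxRetries;
        this.baseDelay = baseDelay;
        this.maxDelay = maxDelay;
    }

    // Reads the exponential retry properties from flat key/value text.
    // Assumes all three properties are present; a real reader would validate.
    static RetrySettingsSketch fromProperties(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot read properties", e);
        }
        String prefix = "http.retry.strategy.exponential.";
        return new RetrySettingsSketch(
            Integer.parseInt(props.getProperty(prefix + "max-retries")),
            Duration.parse(props.getProperty(prefix + "base-delay")),
            Duration.parse(props.getProperty(prefix + "max-delay")));
    }
}
```

The source only hands back strings; the conversion to `int` and `Duration` happens on the options side, which matches the split described later in this document.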

Approach

  1. Define Configuration abstraction in azure core - it's an interface for clients to retrieve any kinds of properties

    • can retrieve property by name (and optional metadata)
    • support pluggable sources
    • support ambient (env vars) and explicit configuration
    • .NET is special here, it only needs to work with MS.Extensions.Configuration
  2. Require clients and core options to read themselves from Configuration.

    • Each client or core option defines configuration properties for itself and owns reading itself from Configuration
  3. Configuration property names and hierarchy follow the public APIs of the given client/language

    • They evolve with public APIs, have breaking changes expectations similar to public APIs
    • They should be documented similarly to public APIs (or point to the same documentation)
  4. Client config properties are local (specific to client). Core/common properties can be local (per-client) or global (per all clients).

  5. Allow applications and framework integrations to implement custom sources and explicitly add them to Configuration. Web frameworks have opinions on and support for

    • source: file, env vars, etc
    • file format: yml, json, xml, flat
    • conventions on naming: separators, casing, binding to property names in code
  6. Supporting popular frameworks and matching existing public APIs for given client/language for custom sources is more important than consistency across languages

    • Properties reasonably follow public API (naming and hierarchy)
    • When possible, properties should be consistent across languages
    • Defined environment variables must be consistent across languages (frameworks may support it in some other way - beyond our control)
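
The approach above can be sketched roughly as follows. This is an illustration under assumptions, not the prototype's API: `SourceSketch`, `MapSourceSketch` and `ConfigurationSketch` are hypothetical names, and the real `com.azure.core.util.Configuration` may look different. It shows pluggable sources, explicit sources taking priority over the ambient one, and a source list that is fixed at construction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical source abstraction: string values looked up by name.
interface SourceSketch {
    String getValue(String name); // returns null when the property is absent
}

// A trivial in-memory source; a file- or framework-backed source would look similar.
final class MapSourceSketch implements SourceSketch {
    private final Map<String, String> values;

    MapSourceSketch(Map<String, String> values) {
        this.values = Map.copyOf(values); // defensive copy keeps the source stable
    }

    @Override
    public String getValue(String name) {
        return values.get(name);
    }
}

final class ConfigurationSketch {
    private final List<SourceSketch> sources;

    // Sources are fixed at construction; earlier sources win over later ones.
    ConfigurationSketch(List<SourceSketch> explicitSources, SourceSketch ambient) {
        List<SourceSketch> all = new ArrayList<>(explicitSources);
        all.add(ambient); // ambient configuration (env vars) is consulted last
        this.sources = Collections.unmodifiableList(all);
    }

    String get(String name) {
        for (SourceSketch source : sources) {
            String value = source.getValue(name);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

In this sketch an ambient source backed by real environment variables could be written as `SourceSketch env = System.getenv()::get;`, since the interface has a single method.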

Property names and structure

A property uniquely identifies a specific public property/setter/constructor parameter that configures an option or client.

  1. ConfigurationSource can only retrieve property value by given name

    • MUST: get property value by name
    • MUST: return string values only (to enable all possible sources including env vars), parsing/conversion happen between options and Configuration
    • MUST: consistently return the same value for the same property name (e.g. if duplicate keys are present)
    • MUST NOT: change values
    • SHOULD NOT: remap property name segments
    • MAY: add prefixes to property names, map separators and casing for names, translate flat property name to path in file
  2. Client should be able to request a property from Configuration by string name

    • MUST: get value for given name
    • MUST: allow checking if value for given name exists OR allow retrieving with default value
    • SHOULD: allow requesting value with given type
    • MAY: retrieve per-client property and fallback to global configuration when applicable
    • MAY: get value for given list of property name aliases (for backward compatibility with existing properties if any)
  3. Property name MUST map unambiguously to corresponding property/setter/constructor parameter/etc in public API

    • MUST: include implementation class identifier or other means to instantiate specific options class
    • MUST: have the same name for the same API property (potentially prefixed) across clients and other options they are used in.
    • SHOULD: use separators, casing and conventions idiomatic for language
    • SHOULD: use annotations/attributes to describe, document and validate property names + metadata along with the code.
  4. Environment variable names, when defined, MUST be consistent across languages:

    • Configuration MUST only support retrieving known environment variables
    • new environment variables MAY be registered for new properties
    • new env vars names MUST keep following current conventions
  5. Configuration instance is immutable. Client options can read configuration when it's passed or do it lazily - it leads to the same resolved values

    • Single ConfigurationSource (or Configuration) instance MUST consistently return the same value for the same property name
      • Log level needs special treatment - it MAY dynamically change (not in scope of this spec)
      • There might be a few other dynamic properties in future
    • Configuration MUST NOT allow adding new sources or properties after it's initialized.
      • Caveat: Configuration already exists in some languages and needs more work to satisfy this requirement
  6. All property names MUST be documented along with their metadata, expected format and public API they represent.
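
A few of the rules above can be illustrated with a small sketch: the source side returns strings only and may add a prefix without remapping name segments, while the client side asks for a typed value with a default. `PrefixedLookupSketch` is a hypothetical name, and the `spring.cloud.azure.` prefix is used only as an example of what a framework-specific source might add.

```java
import java.util.Map;
import java.util.function.Function;

final class PrefixedLookupSketch {
    private final Map<String, String> raw; // stands in for a framework's property store
    private final String prefix;

    PrefixedLookupSketch(Map<String, String> raw, String prefix) {
        this.raw = Map.copyOf(raw); // immutable snapshot: same name, same value
        this.prefix = prefix;
    }

    // Source side: strings only; the name is prefixed, segments are not remapped.
    String getValue(String name) {
        return raw.get(prefix + name);
    }

    // Client side: existence check.
    boolean contains(String name) {
        return getValue(name) != null;
    }

    // Client side: typed retrieval with a default; conversion happens outside the source.
    <T> T get(String name, Function<String, T> converter, T defaultValue) {
        String value = getValue(name);
        return value == null ? defaultValue : converter.apply(value);
    }
}
```

For example, `get("http.retry.max-retries", Integer::valueOf, 3)` would return the configured value when `spring.cloud.azure.http.retry.max-retries` is present and `3` otherwise.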

Property metadata

Property metadata which MAY be passed to Configuration in addition to property name:

  • client name (prefix)
  • can be global (Configuration should fallback to global properties when property not available for given client)
  • default values
  • property name aliases (for backward compatibility)
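
One possible shape for this metadata is sketched below. `PropertySketch` and its fields are hypothetical and only mirror the list above: a name, aliases, a flag saying whether a global fallback is allowed, and a default value. A plain `Map` stands in for Configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical metadata holder; names are illustrative only.
final class PropertySketch {
    final String name;
    final List<String> aliases;
    final boolean global;
    final String defaultValue;

    PropertySketch(String name, List<String> aliases, boolean global, String defaultValue) {
        this.name = name;
        this.aliases = List.copyOf(aliases);
        this.global = global;
        this.defaultValue = defaultValue;
    }

    // Looks up the per-client value first, then the global one (if allowed), then the default.
    String resolve(Map<String, String> values, String clientPrefix) {
        List<String> names = new ArrayList<>();
        names.add(name);
        names.addAll(aliases);
        for (String n : names) {
            String value = values.get(clientPrefix + "." + n);
            if (value != null) {
                return value;
            }
        }
        if (global) {
            for (String n : names) {
                String value = values.get(n);
                if (value != null) {
                    return value;
                }
            }
        }
        return defaultValue;
    }
}
```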

Error handling and debugging

Behavior is similar to what code configuration does already, with more edge cases related to missing/invalid/conflicting properties and additional logic in Configuration to retrieve/fallback and cache.

  • Every retrieved property value MUST be validated: parsing, type, value range.
    • it's a client/option responsibility, but core may provide common helpers
    • invalid property value, even if optional, MUST raise an error and fail corresponding client/option setup
  • Parsing errors, missing required properties, conflicts MUST be accompanied with informative error log and configuration MUST fail with exception.
  • Configuration SHOULD log where it retrieved property from (name and source) with debug severity
  • Recoverable conflicts MUST be accompanied with warning log, best effort should be done to detect such cases. Example: attempt to override configuration with higher source priority.
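
As a rough illustration of the validation rule, a common helper in core might look like the sketch below. `PropertyValidationSketch` and `parseIntInRange` are hypothetical names; the point is only that missing, unparsable and out-of-range values all fail with a message that names the property.

```java
final class PropertyValidationSketch {
    // Parses an integer property and checks its range; fails with an informative message.
    static int parseIntInRange(String name, String value, int min, int max) {
        if (value == null) {
            throw new IllegalStateException("Required property '" + name + "' is missing");
        }
        final int parsed;
        try {
            parsed = Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Property '" + name + "' has invalid integer value '" + value + "'", e);
        }
        if (parsed < min || parsed > max) {
            throw new IllegalArgumentException(
                "Property '" + name + "' must be between " + min + " and " + max
                    + ", but was " + parsed);
        }
        return parsed;
    }
}
```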

Configuration sources priority

Priority: code > explicit > implicit

  • Implicit configuration (env vars) MUST be applied first, before any other configuration, so it has the lowest priority: it MUST be the last source consulted when retrieving a property.

  • Configuration with custom sources MUST be passed explicitly during client setup and can override implicit configuration

    • Clients/core options MUST expose API to set configuration
      • Options SHOULD support creating themselves from configuration (rather than exposing a setter for configuration) - this makes mixed in-code and non-code configuration more predictable
      • If requested options are not defined in configuration, caller MUST detect it and MUST keep previous value of such option when it was set.
  • Configuration properties provided through explicit Configuration can be changed in code.

  • When option/property was already set up in code, it MUST NOT be overridden with non-code configuration.

new BlobServiceClientBuilder()
    .httpClient(httpClient)
    // must not change httpClient even if fileConfig has relevant properties;
    // should result in a warning, but proceed
    .configuration(fileConfig)
    .build();

new BlobServiceClientBuilder()
    .configuration(fileConfig)
    // has higher priority than options in fileConfig
    .httpClient(httpClient)
    .build();
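
One way a builder could honor this priority is to remember which options were set in code. The sketch below is an assumption about a possible implementation, not how Azure SDK builders work today: `BuilderPrioritySketch` is a hypothetical class, a `Map` stands in for Configuration, and `http.proxy.host` is used as an illustrative property name.

```java
import java.util.Map;

final class BuilderPrioritySketch {
    private String proxyHost;
    private boolean proxyHostSetInCode;

    // Code-based setter: always wins and is remembered as "set in code".
    BuilderPrioritySketch proxyHost(String host) {
        this.proxyHost = host;
        this.proxyHostSetInCode = true;
        return this;
    }

    // Non-code configuration: applied only to options that were not set in code.
    BuilderPrioritySketch configuration(Map<String, String> config) {
        String fromConfig = config.get("http.proxy.host");
        if (fromConfig == null) {
            return this; // not defined in configuration: keep the previous value
        }
        if (proxyHostSetInCode) {
            // Recoverable conflict: warn and proceed with the code value.
            System.err.println("warning: 'http.proxy.host' ignored, proxy host was set in code");
            return this;
        }
        this.proxyHost = fromConfig;
        return this;
    }

    String build() {
        return proxyHost;
    }
}
```

With this bookkeeping both call orders shown above resolve to the value set in code, and the configured value is used only when nothing was set in code.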

API additions

APIView (Java): https://apiview.dev/Assemblies/Review/2a7d94c92389414da9448d6aa4c31b4f/4793132eab8749429479997d8501ac02?diffRevisionId=bb3ed257c464403bba88ce61e40bf068&doc=False&diffOnly=True It's a prototype with bare minimum APIs, polishing and additions are expected.

┌────────────────┐
│ ClientBuilder/ │
│ Options        ├───get(prop)/getSection(client)──┐
└────────────────┘                                 │                     ┌──────────────────────┐
                                                   │                  ┌──► Environment variables│ 
                                                   │                  │  └──────────────────────┘
                                         ┌─────────▼─────┐            │  ┌──────────────────────┐
                                         │ Configuration │──get(prop)─┼──►   System properties  │
                                         └───────────────┘            │  └──────────────────────┘
                                                                      │  ┌──────────────────────┐
                                                                      └──►    Custom source     │
                                                                         │  (e.g. Spring yaml)  │
                                                                         └──────────────────────┘

Appendix: code vs file config example

In this example property names are new - they only exist in prototype

  • Code based:

    ClientOptions clientOptions = new ClientOptions();
    clientOptions.setApplicationId("appconfig-application-id");
    
    List<Header> headers = new ArrayList<>();
    headers.add(new Header("header1", "v1"));
    headers.add(new Header("header2", "v2,v3"));
    clientOptions.setHeaders(headers);
    
    HttpLogOptions httpLogOptions = new HttpLogOptions();
    httpLogOptions.setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS);
    httpLogOptions.setAllowedHeaderNames(Set.of("header1", "header2"));
    httpLogOptions.setAllowedQueryParamNames(Set.of("p1", "p2"));
    httpLogOptions.setPrettyPrintBody(true);
    
    HttpClientOptions httpClientOptions = new HttpClientOptions();
    ProxyOptions proxyOptions = new ProxyOptions(ProxyOptions.Type.HTTP, InetSocketAddress.createUnresolved("localhost", 80));
    httpClientOptions.setProxyOptions(proxyOptions);
    
    HttpClient httpClient = HttpClient.createDefault(httpClientOptions);
    
    // most complicated part: retry count is custom, delays are default, but users have to specify them
    HttpPipelinePolicy retryPolicy = new RetryPolicy(new ExponentialBackoff(25,  Duration.ofMillis(800),  Duration.ofSeconds(8)), null, null);
    ConfigurationClientBuilder builder = new ConfigurationClientBuilder()
             .httpLogOptions(httpLogOptions)
             .clientOptions(clientOptions)
             .httpClient(httpClient)
             .retryPolicy(retryPolicy)
             .serviceVersion(ConfigurationServiceVersion.V1_0)
             .connectionString("...");
  • Proposed property-based config

    http-client.proxy.host=localhost
    http-client.proxy.port=80
    
    http-client.application-id=http-app-id
    http-client.headers=header1=v1;header2=v2,v3
    
    http.logging.level=BODY_AND_HEADERS
    http.logging.allowed-header-names=header3,header4
    http.logging.allowed-query-param-names=p1,p2
    http.logging.pretty-print-body=true
    
    http-client.retry.max-retries=5
    
    appconfiguration.service-version=V1_0
    appconfiguration.connection-string=...
    appconfiguration.http-client.application-id=appconfig-app-id
    Configuration configuration = new Configuration(new FileSource("application.properties"));
    return new com.azure.data.appconfiguration.ConfigurationClientBuilder().configuration(configuration);
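
The `http-client.headers` property above packs several headers into one string. A minimal sketch of how such a value might be parsed is shown below, assuming entries are separated by `;` and the header name is separated from its value by the first `=`. That format is inferred from the example and is not confirmed by the prototype; `HeaderPropertySketch` is a hypothetical helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeaderPropertySketch {
    // Parses "name1=value1;name2=value2" into an ordered map of header name to value.
    static Map<String, String> parseHeaders(String propertyValue) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String entry : propertyValue.split(";")) {
            if (entry.trim().isEmpty()) {
                continue; // tolerate a trailing or doubled separator
            }
            int eq = entry.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                    "Invalid header entry '" + entry + "', expected name=value");
            }
            // Everything after the first '=' is the value, so commas stay intact.
            headers.put(entry.substring(0, eq).trim(), entry.substring(eq + 1).trim());
        }
        return headers;
    }
}
```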
@kasobol-msft

Re: bulk code-configure. I'm a bit concerned about pushing internal usage towards reflective access. That might be a quick short-term solution to the problem, but it comes with all the problems of reflective access, i.e. we lose compiler support here. If we do that, we should plan for cleaning that debt.

@lmolkova
Author

lmolkova commented Jan 10, 2022

@kasobol-msft

bulk code-configure. I'm a bit concerned about pushing internal usage towards reflective access.

I agree reflection is not great, but I'm not convinced there is a real use case for bulk config or that we should solve this problem at all.
E.g. you want to set proxy for all clients

  • you have global properties for it that would work without any code
  • If you do code-based, you need to update all builders with your ProxyOptions instance - how many builders do you have in your app? Spring team has maybe a dozen, I think a typical user app has less than this. Also, they still need to instantiate all these builders anyway. So why not call configuration() in the same place?

The only place users deal with an arbitrary unknown builder is Spring BeanPostProcessor. I'm not sure if we have a lot of users interested in this case. Spring team can still write 10-20 instanceof switch if they need it.

What do you think?

@kasobol-msft

I see this as an opportunity to simplify Spring codebase. Potential improvement for external users is a by-product. End of the day Spring integration is a consumer of SDK apis, the workarounds proposed for that send strong signal that SDK public surface could use that improvement. Besides that sets a pattern for other "internal integrations" in the future which we might not see right away (although we may wait and see if they come).
If I were Spring integration dev, I'd rather contribute such change rather than maintain 20 instanceof and remember that I have to revisit that place every time I add new services - which in turn will bloat that piece every time new service is added. Not to mention that this approach requires extensive testing to prove all branches work as expected whereas the "right" alternative can ensure that via compiler enforced contract. I'm not sure which approach is cheaper in long term.

@lmolkova
Author

lmolkova commented Jan 10, 2022

@kasobol-msft
configuration testing has to happen regardless of where configure() call happens.

I.e.

  1. not having a base builder class means repeating this ~10 times

      @Bean
      ConfigurationClientBuilder configurationClientBuilder() {
          return new ConfigurationClientBuilder().configuration(configuration);
      }

    plus once somewhere

    ConfigurationSource.FileSource fileSource = new ConfigurationSource.FileSource("application.properties");
    Configuration configuration = new Configuration(fileSource);
  2. Having a base class is

    • this x10

        @Bean
        ConfigurationClientBuilder configurationClientBuilder() {
            return new ConfigurationClientBuilder();
        }

      and once this:

      @Bean
      public BeanPostProcessor postProc() {
            return new PostProcessor();
      }
      
      public class PostProcessor implements BeanPostProcessor {
          private final Configuration configuration;
      
          public PostProcessor() {
              ConfigurationSource.FileSource fileSource = new ConfigurationSource.FileSource("application.properties");
              configuration = new Configuration(fileSource);
          }
      
          @Override
          public Object postProcessAfterInitialization(Object bean, String beanName) throws BeansException {
              if (bean instanceof AzureClientBuilder) {
                  ((AzureClientBuilder) bean).configure(configuration);
              }
              return bean;
          }
      }

I agree 2nd is better than 1st, but I don't think it's a big improvement

@lmolkova
Author

Also, no new integrations were added over the past couple of years, and when a new client is added, you have to add code and tests for it regardless

@kasobol-msft

Thanks for the sample. If the Spring usage indeed boils down to this then I agree it's not worth it.

@lmolkova
Author

@kasobol-msft thanks! I hope we can make Spring usage as simple as that - it depends on the Spring team

@saragluna

Several scenarios to consider:

  • User's application uses several azure sdks at the same time and wants to use the same credential instance for each of them.
  • User's application uses several HTTP-based azure sdks at the same time and wants to provide their own HttpClient / HttpPipeline or customize HttpPipelinePolicy for these http-based clients.
  • User's application uses several azure sdks at the same time and configures common properties at the global level, such as http.client.connect-timeout, but wants to be able to override this configuration for a single sdk, say SecretClient.
  • To configure retry via properties, and the retry properties for each sdk client should follow the same pattern.

@lmolkova
Author

@saragluna thanks for the scenarios. I updated the gist (and prototype) to more precisely describe functionality.

  • any option including credentials, http client options, retry options, can be per-sdk-client or global. Different clients can have different values. We will fallback to global properties when a property is not defined for a given client. I.e.

    • client.application-id is global application id for all clients
    • http.client.application-id is global application id for all http clients, if not specified client.application-id is used
    • appconfiguration.http.client.application-id only applies to all appconfiguration clients, if not specified http.client.application-id is used
    • storage.http.client.application-id applies to all storage clients
    • appconfiguration.endpoint though would never fallback to global endpoint
  • regarding retries: AMQP and HTTP retry policies don't exactly match, and we'd have different properties for them and for fixed and exponential strategies as well. E.g.

Property names are not final of course, but the principle is that they would follow public SDK API names and hierarchy, so ideally users don't even need to look into the documentation to understand which API property a configuration property maps to.
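
The application-id fallback described above could be sketched like this. It is a hedged illustration, not the prototype's code: `FallbackChainSketch` is a hypothetical helper, a `Map` stands in for Configuration, and the chain of names is passed explicitly by the caller. It returns the name that matched together with the value, which is also what a debug log of "where the property came from" would need.

```java
import java.util.Map;
import java.util.Optional;

final class FallbackChainSketch {
    // Returns the first name in the chain that has a value, together with that value.
    static Optional<Map.Entry<String, String>> resolve(Map<String, String> values, String... chain) {
        for (String name : chain) {
            String value = values.get(name);
            if (value != null) {
                return Optional.of(Map.entry(name, value));
            }
        }
        return Optional.empty();
    }
}
```

For an appconfiguration client the chain would be `appconfiguration.http.client.application-id`, then `http.client.application-id`, then `client.application-id`, while `appconfiguration.endpoint` would be resolved with a single-element chain so that it never falls back to a global value.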
