
@mikekistler
Last active April 12, 2022 01:33
An alternate proposal for metadata handling

An Alternative Solution for Request/Response Metadata

This gist offers an alternative to Brian's proposal for handling metadata / solving Issue #182.

This proposal is to (mostly) "leave things as is", document them clearly, and then give users guidance on how to handle cases such as the use cases described in the Wiki page for this issue.

What do I mean by "leave things as is"?

The autorest & openapi3 emitters currently strip all @header, @query, @path, and @statusCode properties from a model when rendering it to a schema (which presumably is used for a request or response body).

These emitters also ignore any metadata properties that are not "at the top-level":

  • not an explicit parameter in the parameter list
  • not a top-level property of the return type

I think that this behavior is simple to understand and, so long as it is well documented, will not be a "surprise" to any users.
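As a rough illustration of the two behaviors described above, here is a toy model in Python. It is not the emitters' actual code; the `Prop`, `Model`, `to_schema`, and `top_level_metadata` names are hypothetical and exist only for this sketch. It assumes a property counts as metadata when it carries any of the four decorators, and it uses the "widgets" models from Use Case 1 below as sample data.

```python
from dataclasses import dataclass, field

# Decorators that mark a property as HTTP metadata rather than body content.
METADATA = {"header", "query", "path", "statusCode"}


@dataclass
class Prop:
    name: str
    type: object  # a type name (str) or a nested Model
    decorators: set = field(default_factory=set)


@dataclass
class Model:
    name: str
    props: list


def is_metadata(prop):
    return bool(prop.decorators & METADATA)


def to_schema(model):
    """Render a model as a body schema, dropping every metadata property."""
    schema = {}
    for prop in model.props:
        if is_metadata(prop):
            continue
        if isinstance(prop.type, Model):
            schema[prop.name] = to_schema(prop.type)
        else:
            schema[prop.name] = prop.type
    return schema


def top_level_metadata(model):
    """Collect metadata only from the model's own properties, never nested ones."""
    return [p.name for p in model.props if is_metadata(p)]


widget = Model("Widget", [
    Prop("name", "string"),
    Prop("weight", "float32"),
    Prop("Etag", "string", {"header"}),
])
# The real WidgetList holds Widget[]; a single nested Widget is enough here.
widget_list = Model("WidgetList", [
    Prop("value", widget),
    Prop("nextLink", "string"),
])
```

In this sketch the Etag header is visible when Widget is the return type itself, but it is neither in the body schema nor reported as a header when Widget is nested inside WidgetList.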

One thing we should consider changing is how the cadl-autorest and openapi3 emitters currently handle @visibility("read"): they generate these properties as readOnly: true. This may seem harmless, but it is really the first step onto the slippery slope of overloading the visibility framework. So I propose that we introduce an @readOnly decorator into the openapi library and use this, rather than @visibility("read"), to drive the openapi emitters to mark properties as readOnly: true.

But users will need guidance on good approaches for defining their APIs so that the definitions avoid verbosity and redundancy and remain amenable to protocols other than HTTP.

Rules for multi-protocol API definition

  1. Don’t put @statusCode in a resource model. “Intersect it” with the return type.

  2. Don’t put @query or @path in any model that might be used as a return type. Create a separate model with “is” or “extends” from the resource model for spreading into a parameter list.

  3. Don’t put request headers in any model that is not exclusively used as a spread source for parameters.

  4. Response headers generally should be intersected into the return type, but there are some exceptions:

    • Etag - since this could be considered a “read only” property of the resource, you could include it in the model for the return type. But if Etag is a required property, then it must be included (i.e. the header returned) on every operation that returns this resource. And if included, it should be marked “readOnly” if the model is ever used as a request body.
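Rules 1 and 2 are mechanical enough to check automatically. The following Python sketch shows what such a check might look like. It is purely hypothetical: no such linter is part of this proposal, the `check_rules` name and the dictionary layout are made up for illustration, and it simplifies rule 1 by treating every named model as a resource model.

```python
def check_rules(models, return_types):
    """Return violations of rules 1 and 2.

    models: {model name: {property name: set of decorator names}}
    return_types: names of models used as an operation return type
    """
    problems = []
    for model_name, props in models.items():
        for prop_name, decorators in props.items():
            # Rule 1: @statusCode belongs in the return type, not the model.
            if "statusCode" in decorators:
                problems.append(f"rule 1: {model_name}.{prop_name}")
            # Rule 2: no @query or @path in a model used as a return type.
            if model_name in return_types and decorators & {"query", "path"}:
                problems.append(f"rule 2: {model_name}.{prop_name}")
    return problems


models = {
    "Widget": {"id": {"path"}, "weight": set(), "code": {"statusCode"}},
    "WidgetParams": {"id": {"path"}},
}
violations = check_rules(models, {"Widget"})
```

Here Widget breaks both rules, while WidgetParams is fine because it is only used as a spread source for parameters and never as a return type.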

The rules in practice

How would these rules work in practice?

Use Case 1: Etag for a model

Put Etag in the model with @header. Etag will be dropped from the schema and thus not sent in requests. gRPC may need a way to flag it as not sent, or may just ignore it in a request message.

This covers the case of the Etag not returned in the list element, because metadata properties are ignored for inner models.

The "widgets" service illustrates this approach:

@route("/widgets")
namespace widgets {
  model Widget {
    name: string;
    weight: float32;

    @header
    @visibility("read")
    Etag: string;
  }

  model WidgetList {
    value: Widget[];
    nextLink?: string;
  }

  model WidgetPatch {
    @header contentType: "application/merge-patch+json";
    name?: string;
    weight?: float32;
  }

  @get
  op list(): WidgetList | Error;
  @put
  op create(@path name: string, @body body: Widget): Widget | Error;
  @get
  op get(@path name: string): Widget | Error;
  @patch
  op update(@path name: string, @body body: WidgetPatch): Widget | Error;
  @delete
  op delete(@path name: string): Widget | Error;
}

Scenario 1a:

Etag is a standard property in the resource -- returned on GET and LIST and PUT and PATCH.

This appears to be the case for some ARM resources: 'AzureEntityResource' and 'ResourceModelWithAllowedPropertySet'. (Hard to tell for sure without testing, since the Etag is not "required" so could be omitted in some cases.)

For this case, we'd define the ETag twice: as a regular property in the model and as a header property with a different property name, specifying the true header name in the decorator.

etag: string;
@header("Etag") etagHdr: string;

The "accounts" service illustrates this approach:

@route("/accounts")
namespace accounts {
  model Account {
    name: string;
    balance: float32;
    etag: string;

    @header("Etag")
    @visibility("read")
    etagHdr: string;
  }

  model AccountList {
    value: Account[];
    nextLink?: string;
  }

  model AccountPatch {
    @header contentType: "application/merge-patch+json";
    name?: string;
    balance?: float32;
  }

  @get
  op list(): AccountList | Error;
  @put
  op create(@path name: string, @body body: Account): Account | Error;
  @get
  op get(@path name: string): Account | Error;
  @patch
  op update(@path name: string, @body body: AccountPatch): Account | Error;
  @delete
  op delete(@path name: string): Account | Error;
}

Scenario 1b:

Etag is NOT a standard property in the resource -- it is only included in the list element.

I think we should discourage this pattern, but if needed I think it can be implemented with a separate model for the list element that “extends” or “is” the resource and just adds the Etag property.

The ETag header could be defined in the base resource model but could not have the same name as the ETag property, so it may need to specify the header name on the @header decorator. Or it could be defined in a separate model and intersected into the resource model where needed.

The "cookies" service illustrates this pattern:

@route("/cookies")
namespace cookies {
  model Cookie {
    name: string;
    calories: float32;

    @header("Etag")
    @visibility("read")
    etagHdr: string;
  }

  model CookieItem is Cookie {
    etag: string;
  }

  model CookieList {
    value: CookieItem[];
    nextLink?: string;
  }

  model CookiePatch {
    @header contentType: "application/merge-patch+json";
    name?: string;
    calories?: float32;
  }

  @get
  op list(): CookieList | Error;
  @put
  op create(@path name: string, @body body: Cookie): Cookie | Error;
  @get
  op get(@path name: string): Cookie | Error;
  @patch
  op update(@path name: string, @body body: CookiePatch): Cookie | Error;
  @delete
  op delete(@path name: string): Cookie | Error;
}

Use Case 2: "id" for a model

Rule #2 says don't put the path param in the model.

Specify the path param separate from the model (you have to do this anyway for "get"). If you want id in the response to "get", etc, then define it in the model with @readOnly.

Then you are free to define an id property in the model or not as needed.

The "widgets" service illustrates the pattern when id is passed as the final path segment:

@route("/widgets")
namespace widgets {
  model Widget {
    @visibility("read")
    id: string;
    weight: float32;
  }

  model WidgetList {
    value: Widget[];
    nextLink?: string;
  }

  model WidgetPatch {
    @header contentType: "application/merge-patch+json";
    weight?: float32;
  }

  @get
  op list(): WidgetList | Error;
  @put
  op create(@path id: string, @body body: Widget): Widget | Error;
  @get
  op get(@path id: string): Widget | Error;
  @patch
  op update(@path id: string, @body body: WidgetPatch): Widget | Error;
  @delete
  op delete(@path id: string): Widget | Error;
}

The "accounts" service illustrates the pattern where id is sent in the body (MC flavor):

@route("/accounts")
namespace accounts {
  model Account {
    id: string;
    weight: float32;
  }

  model AccountList {
    value: Account[];
    nextLink?: string;
  }

  model AccountPatch {
    @header contentType: "application/merge-patch+json";
    weight?: float32;
  }

  @get
  op list(): AccountList | Error;
  @post
  op create(@body body: Account): Account | Error;
  @get
  op get(@path id: string): Account | Error;
  @patch
  op update(@path id: string, @body body: AccountPatch): Account | Error;
  @delete
  op delete(@path id: string): Account | Error;
}

Scenario 3: Batch APIs

This feels like a special case that we should handle differently than the above.

For example, if we could decorate a model with @query, then we could define the model properties once, use "is" to get the query projection for the single document case and use the original model in the batch case.

I think this would be easier to understand and probably easier to implement.

The "analyze" service illustrates this pattern:

namespace analyze {
  model AnalyzeParameters {
    domain?: "phi";
    stringIndexType?: string;
  }

  // The following model would be the result of applying some new decorator or projection to AnalyzeParameters
  // that would make each property a query parameter, e.g.
  // @asQuery model AnalyzeQueryParameters is AnalyzeParameters;

  model AnalyzeQueryParameters {
    @query
    domain?: "phi";

    @query
    stringIndexType?: string;
  }

  model AnalyzeResponse {
    // Arbitrarily complex model
    foo: string;
    bar: int32;
    baz: float64;
  }

  @route("/analyze")
  @post
  op analyze(...AnalyzeQueryParameters): {
    @statusCode code: "200";
    @header xMsRequestId: string;
  } & AnalyzeResponse;

  model BatchJobs {
    jobs: AnalyzeParameters[];
  }

  model BatchJobResponse {
    code: string;
    xMsRequestId: string;
    result: AnalyzeResponse;
  }
  model BatchResponse {
    jobs: BatchJobResponse[];
  }

  @route("/batch")
  @post
  op batchAnalyze(@body body: BatchJobs): BatchResponse;
}
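The projection from AnalyzeParameters to AnalyzeQueryParameters amounts to "copy the model and mark every property as a query parameter". A minimal Python sketch of that idea follows. The `as_query` function is a hypothetical stand-in for the not-yet-existing decorator or projection, and the dictionary representation of a model is an assumption of this sketch.

```python
def as_query(props):
    """Return a copy of a model's properties with "query" added to each one.

    props: {property name: set of decorator names}
    """
    return {name: decorators | {"query"} for name, decorators in props.items()}


# Mirrors the AnalyzeParameters model above: no decorators on either property.
analyze_parameters = {"domain": set(), "stringIndexType": set()}
analyze_query_parameters = as_query(analyze_parameters)
```

The original model is left untouched, so it can still be used as-is for the batch case.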

Scenario 4: Logical model in headers

This scenario should "just work". Define all the properties in a model and decorate each one with @header.

gRPC can simply ignore the @header decorators and flow the data in the request & response message.

I think there should be a separate model for request headers and response headers, to avoid creating a convention or distinct decorators for which is which.

The "blob" service (my simple variant) illustrates this pattern:

@route("/blob")
namespace blob {

  // In practice there might be several parameter models -- one with common params,
  // one with params for create vs get, etc.
  model BlobParameters {
    @header
    snapshot: string;
    @header
    versionId: string;
    @header
    leaseIdOptional: boolean;
    @header
    encryptionKey: string;
    @header
    ifMatch: string;
    @header
    ifNoneMatch: string;
  }

  // In practice there might be several properties models -- one with common properties,
  // one with properties returned for create vs get, etc.
  model BlobProperties {
    @header
    lastModified: zonedDateTime;
    @header("x-ms-creation-time")
    creationTime: zonedDateTime;
    @header("x-ms-blob-type")
    blobType: "BlockBlob" | "PageBlob" | "AppendBlob";
    @header("x-ms-lease-state")
    leaseState: "available" | "leased" | "expired" | "breaking" | "broken";
    @header("Etag")
    etag: string;
    @header("x-ms-request-id")
    requestId: string;
    @header("x-ms-server-encrypted")
    isServerEncrypted: boolean;
    @header("x-ms-encryption-scope")
    encryptionScope?: string;
  }

  model OctetStream {
    @header
    contentType: "application/octet-stream";
    @body
    body: bytes;
  }

  @head
  op getProperties(...BlobParameters): {@statusCode code: 200} & BlobProperties | Error;

  @route("/pageBlob")
  namespace pageBlob {
    @put 
    op create(...BlobParameters, ...OctetStream): {@statusCode code: 201} & BlobProperties | Error;
  }
}

What are the benefits?

I think there are a few benefits of this proposal over Brian's:

  1. With the exception of use case 3, this is exactly how things work now.
  2. I think this is simpler to understand for users.
  3. It does not require "inventing names" for various forms of the models (you might say that we force the user to invent the names, but I still think that's preferable).
  4. We don't need conventions for which headers are request vs response, or special decorators.
  5. There's no "surprise" new header added when some model 5 levels deep in the return type adds a header.
  6. We don't have to decide how to handle conflicting headers from different sub-models of a return type.
  7. We don't have to "overload" the visibility framework, which is currently just a general feature that users can apply to a variety of situations, with new special semantics for metadata.

Create-Only properties

Brian's proposal supports "create-only" properties -- passed as input to a "create" (e.g. PUT) operation but not an "update" (e.g. PATCH) operation.

While there are cases of this in Azure, they aren't that common. One case is the "location" property in an ARM resource. And whether rare or not, Cadl already has support for these cases with the @visibility and @withVisibility decorators:

  model FooResource {
    id: string;
    @visibility("create")  // anything but "update"
    location: string;
    kind?: string;
    managedBy?: string;
    properties: FooProperties;
  }

  @withVisibility("update")
  model FooResourceUpdate is FooResource {};

  @put
  op create(@body body: FooResource, ...ArmRequestMetadata) : ArmResponseMetadata<201> & {@body body: FooResource} | Error;
  @patch
  op update(@body body: FooResourceUpdate, ...ArmRequestMetadata) : ArmResponseMetadata<200> & {@body body: FooResource} | Error;

Note that the "@body" in the return types is there to get the emitter to generate a $ref rather than an inline model.

Here's a link to the full example which is a minor swizzle on the ARM envelope example from Mark C.
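To make the effect of @withVisibility("update") on FooResource concrete, here is a small Python model of the filtering. It is a sketch, not Cadl's implementation: the `with_visibility` function is hypothetical, and it assumes that properties with no @visibility decorator are always kept while decorated properties are kept only if they list one of the requested visibilities.

```python
def with_visibility(props, *visibilities):
    """Keep properties that have no visibility or share one with `visibilities`.

    props: {property name: set of visibility names, or None if undecorated}
    """
    wanted = set(visibilities)
    return {
        name: vis
        for name, vis in props.items()
        if vis is None or wanted & vis
    }


# None means no @visibility decorator was applied to the property.
foo_resource = {
    "id": None,
    "location": {"create"},
    "kind": None,
    "managedBy": None,
    "properties": None,
}
foo_resource_update = with_visibility(foo_resource, "update")
```

Under these assumptions, FooResourceUpdate ends up with every property except location, which is what makes location "create-only".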

Cadl codegen

Brian's proposal includes an @noclient decorator that "declares a field that is sent from the service but isn't part of the "logical T"".

The intent here is apparently to guide the generation of clients directly from Cadl, indicating what fields should be included in the return type class vs made available as "metadata".

I think we could add the @noclient decorator to this proposal if needed, but I'm not convinced it is.

  • @statusCode should never be part of the "logical T", so it is ruled out.
  • If the "logical T" is returned by any operation (likely), then rule #2 above says it shouldn't contain @query or @path.
  • That leaves @header, and rule #4 says don't include them in the model unless they are part of the "logical T".

So in this proposal, there is a model that represents the "logical T", meaning that it contains no properties that are not part of the "logical T", so we don't need "@noclient" in the model of the "logical T".

All other components of the return type will be metadata, which, when specified outside a model we can consider as not part of the "logical T", so these don't need @noclient either.
