@johanste
Last active July 24, 2019 16:35
Assumptions on capabilities and requirements for distributed tracing for Azure Libraries, preview-2

A) For preview-2, the Azure libraries support only HTTP-based protocols (i.e. EventHubs/ServiceBus are not supported). In practical terms, this also means that we only have to worry about outgoing requests.

B) None of our libraries has a hard dependency on any third-party tracing library (including the opentelemetry/opencensus libraries) for preview-2. This applies to both code and packaging.
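
One way to honor assumption B in Python is to resolve the tracing package lazily and tolerate its absence. The following is a minimal sketch, not the libraries' actual mechanism: `load_tracing_backend` is a hypothetical helper, and `opencensus.trace` is used only as an example of an optional plugin module.

```python
import importlib


def load_tracing_backend(module_name="opencensus.trace"):
    """Return the tracing module if it is installed, otherwise None.

    Hypothetical helper: because the import happens on demand and a
    failure is swallowed, the library has no hard dependency on the
    tracing package in code or in packaging.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# A module that is not installed yields None instead of raising.
missing = load_tracing_backend("no_such_tracing_plugin_xyz")
```

Callers would then check for `None` and skip all tracing work when no backend is available.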

C) Whenever ambient contexts are supported in the language/runtime, our Azure libraries support them. Not supporting ambient contexts means that integrations that rely on said ambient context (e.g. opencensus for python’s requests integration) may not function correctly.

D) We never initiate a trace/create a span context from within our library for outgoing requests. We rely on it having been created based on context passed implicitly or explicitly to us.

Creating a trace means making decisions such as whether the trace is sampled in or not.
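
Assumptions C and D can be illustrated together. The sketch below assumes a plain dict as the span context and uses Python's `contextvars` as the ambient context; `_current_span` and `start_child_span` are hypothetical names, not part of any Azure library or tracing plugin.

```python
import contextvars

# Ambient span context; None means the caller has not started a trace.
# (Hypothetical stand-in for whatever the tracing plugin provides.)
_current_span = contextvars.ContextVar("current_span", default=None)


def start_child_span(name, explicit_parent=None):
    """Create a child span only if a parent context already exists."""
    if explicit_parent is not None:
        parent = explicit_parent          # context passed explicitly
    else:
        parent = _current_span.get()      # context passed implicitly
    if parent is None:
        # No trace is active: the library does not start one itself,
        # so it never has to make a sampling decision.
        return None
    return {"name": name, "parent": parent["name"], "trace_id": parent["trace_id"]}


# Without an active trace nothing is created.
no_trace = start_child_span("create_container")

# Once the application has started a trace, the library joins it.
_current_span.set({"name": "app-root", "trace_id": "abc123"})
with_trace = start_child_span("create_container")
```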

E) The libraries do not register any exporters.

F) The libraries only ever create spans for methods that make service requests.

G) If a trace is active/a valid span context exists (regardless of if it is sampled in or not), a new span is automatically created for the execution of each library method (that makes one or more service requests), and child spans are created for each outgoing request made from within the context of said library method.

  • The name of spans wrapping the library method invocation is of the form ..
  • Spans wrapping the individual outgoing requests are named by the URI path value of the request.
  • One span is created per retry attempt if retries occur.
  • If a call is made to AAD to refresh a token, it is a separate span.
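
The span layout described in G can be sketched with a small in-memory model. This is an illustration under stated assumptions: `Span` is a hypothetical stand-in for a real tracer's span type, `create_container` only simulates a library method, and the request span name reuses the URL shown in the example scenarios of this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Span:
    """Minimal in-memory span (hypothetical, not a real tracer type)."""
    name: str
    parent: Optional["Span"] = field(default=None, repr=False)
    children: List["Span"] = field(default_factory=list)

    def child(self, name):
        span = Span(name, parent=self)
        self.children.append(span)
        return span


def create_container(caller_span, responses, needs_token=False):
    """Simulate a library method; `responses` holds the status of each attempt."""
    method_span = caller_span.child(
        "azure.storage.blobs.StorageAccountClient.create_container"
    )
    if needs_token:
        # A token refresh call to AAD gets its own span.
        method_span.child("Refresh AAD token")
    for status in responses:
        # One child span per attempt, so each retry is visible.
        method_span.child("https://storage.account.com/accountname/blobname")
        if status < 400:
            break
    return method_span


root = Span("application request")
span = create_container(root, responses=[429, 201], needs_token=True)
names = [c.name for c in span.children]
```

Here the method span ends up with three children: the token refresh, the throttled attempt, and the successful retry.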

H) We don’t create links or a parent/child relationship between spans for pages in a paged operation.

I) We don’t create links or a parent/child relationship between spans for polling requests in a long running operation.

The following span attributes are to be set for outgoing service requests (per https://github.com/open-telemetry/opentelemetry-specification/blob/master/semantic-conventions.md):

| Attribute | Value |
|---|---|
| Component | "http" |
| http.method | |
| http.url | |
| http.status_code | |

In addition, the following attributes are to be set for individual requests:

| Attribute | Value | Comment |
|---|---|---|
| requestId | | Client side generated request Id |
| serviceRequestId | | Server generated request/correlation id. |
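
As a rough illustration, the attributes above could be collected like this. `request_span_attributes` is a hypothetical helper; the lowercase `component` key is an assumption based on the linked semantic conventions, and how the server-generated id is obtained is service specific and not defined here.

```python
import uuid


def request_span_attributes(method, url, status_code, service_request_id):
    """Build the attribute set for a span wrapping one outgoing request."""
    return {
        "component": "http",
        "http.method": method,
        "http.url": url,
        "http.status_code": status_code,
        # Generated on the client for every request.
        "requestId": str(uuid.uuid4()),
        # Returned by the service; passed in by the caller in this sketch.
        "serviceRequestId": service_request_id,
    }


attrs = request_span_attributes(
    "PUT", "https://storage.account.com/accountname/blobname", 201, "srv-1"
)
```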

The status for each finished span is to be taken from the list of canonical status codes: https://github.com/open-telemetry/opentelemetry-specification/blob/e9340d74f1ba0b651b3581d6bd5df6a92b772e18/specification/tracing-api.md#status.
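
A possible way to derive the span status from an HTTP response is sketched below. Only the 201 to OK and 429 to RESOURCE_EXHAUSTED pairs come from the example scenarios in this document; the remaining entries and the `UNKNOWN` fallback are assumptions, and `canonical_status` is a hypothetical helper.

```python
# Assumed mapping from HTTP status codes to canonical status names.
_SPECIFIC = {
    401: "UNAUTHENTICATED",
    403: "PERMISSION_DENIED",
    404: "NOT_FOUND",
    429: "RESOURCE_EXHAUSTED",
    501: "UNIMPLEMENTED",
    503: "UNAVAILABLE",
    504: "DEADLINE_EXCEEDED",
}


def canonical_status(http_status):
    """Return a canonical status name for an HTTP status code."""
    if http_status < 400:
        return "OK"
    # Anything without an explicit entry is reported as UNKNOWN (assumption).
    return _SPECIFIC.get(http_status, "UNKNOWN")
```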

Two categories:

  1. User has done nothing to enable distributed tracing (no tracing library or plugin referenced, no tracer configured/enabled).
  2. User is referencing a distributed tracing plugin and has configured/enabled tracing.

For category 1, there are no observable effects. No tracing information is emitted in outgoing requests (e.g. no Trace Context headers such as traceparent/tracestate). Spans may be created in memory, but since no exporter is created/referenced, no span information is exported. ISSUE: we could, in theory, still use the span information in logs.
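
The difference between the two categories on the wire can be sketched as follows. `outgoing_headers` is a hypothetical helper and the span context is assumed to be a plain dict; the `traceparent` value follows the W3C Trace Context layout of version, trace id, parent span id, and flags.

```python
def outgoing_headers(base_headers, span_context=None):
    """Return request headers, adding trace headers only when tracing is active."""
    headers = dict(base_headers)
    if span_context is None:
        # Category 1: nothing configured, so nothing is emitted.
        return headers
    # Category 2: propagate the caller's trace.
    headers["traceparent"] = "00-{}-{}-{}".format(
        span_context["trace_id"],
        span_context["span_id"],
        "01" if span_context["sampled"] else "00",
    )
    return headers


plain = outgoing_headers({"x-ms-version": "2019-02-02"})
traced = outgoing_headers(
    {"x-ms-version": "2019-02-02"},
    {
        "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
        "span_id": "00f067aa0ba902b7",
        "sampled": True,
    },
)
```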

For category 2, see below for what spans to create.

Example scenarios with spans created

Span parent/child relationship is indicated by the "dotted path" in the name. Span 1.1 is a child of Span 1. Please note that there is no expectation that Span 1.2 starts after 1.1 has completed - they may overlap in time.

Storage container create using SAS (no calls to AAD). All requests successful.

Span 1
name azure.storage.blobs.StorageAccountClient.create_container
status OK
type Client
Span 1.1
name https://storage.account.com/accountname/blobname
http.url https://storage.account.com/accountname/blobname
http.status_code 201
http.method PUT
status OK
type Client

Storage container create using AAD. All requests successful.

Attributes on span omitted when not relevant to describe the overall set of spans

Span 1
name azure.storage.blobs.StorageAccountClient.create_container
status OK
type Client
Span 1.1
name Refresh AAD token
http.url https://login.microsoftonline.com/{tenant}/oauth2/authorize
Span 1.2
name https://storage.account.com/accountname/blobname
http.url https://storage.account.com/accountname/blobname

Storage blob upload.

Span 1
name azure.storage.blobs.StorageAccountClient.create_container
status OK
type Client
Span 1.1
name Refresh AAD token
http.url https://login.microsoftonline.com/{tenant}/oauth2/authorize
Span 1.2
name https://myaccount.blob.core.windows.net/mycontainer/myblob?comp=block&blockid=id
http.url https://myaccount.blob.core.windows.net/mycontainer/myblob?comp=block&blockid=id

...

Span 1.n
name https://myaccount.blob.core.windows.net/mycontainer/myblob?comp=block&blockid=id
http.url https://myaccount.blob.core.windows.net/mycontainer/myblob?comp=block&blockid=id
Span 1.n+1
name https://myaccount.blob.core.windows.net/mycontainer/myblob?comp=blocklist
http.url https://myaccount.blob.core.windows.net/mycontainer/myblob?comp=blocklist

Storage container create using SAS (no calls to AAD). The first PUT fails (throttled), second request succeeds.

Span 1
name azure.storage.blobs.StorageAccountClient.create_container
status OK
type Client
Span 1.1
name https://storage.account.com/accountname/blobname
http.url https://storage.account.com/accountname/blobname
http.status_code 429
http.method PUT
status RESOURCE_EXHAUSTED
type Client
Span 1.2
name https://storage.account.com/accountname/blobname
http.url https://storage.account.com/accountname/blobname
http.status_code 201
http.method PUT
status OK
type Client

List storage containers.

Span 1
name azure.storage.blobs.StorageAccountClient.list_containers
status OK
type Client
Span 1.1
name https://myaccount.blob.core.windows.net/?comp=list
http.url https://myaccount.blob.core.windows.net/?comp=list
Span 2
name azure.storage.blobs.StorageAccountClient.list_containers
status OK
type Client
Span 2.1
name https://myaccount.blob.core.windows.net/?comp=list&marker=7
http.url https://myaccount.blob.core.windows.net/?comp=list&marker=7
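
The paging scenario above can be modeled as independent span records, one per page, with no link between them. This is only a sketch: `list_containers_pages` is a hypothetical function that builds the records as plain dicts rather than calling any service.

```python
def list_containers_pages(markers):
    """Return one independent span record per page fetched."""
    base_url = "https://myaccount.blob.core.windows.net/?comp=list"
    spans = []
    for index, marker in enumerate(markers, start=1):
        url = base_url if marker is None else "{}&marker={}".format(base_url, marker)
        spans.append({
            "id": "Span {}".format(index),
            "name": "azure.storage.blobs.StorageAccountClient.list_containers",
            "links": [],  # no link to the span of the previous page
            "children": [{"id": "Span {}.1".format(index), "name": url}],
        })
    return spans


# First page has no marker; the second continues from marker 7.
pages = list_containers_pages([None, 7])
```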

FAQ

Q: If a service client method calls other service client methods (i.e. is a "convenience" or "high-level" method), should the spans be nested?

A: No. But it is not a ship-stopper for preview 2 if they are.

Q: What is the relationship between spans for a paged operation?

A: For preview 2, no relationship is expected.

Q: What is the relationship between spans for a long running/polling operation?

A: For preview 2, no relationship is expected.
