@jsquire
Created October 2, 2020 20:25
Event Hubs: Extending EventData with BinaryData

Event Hubs: BinaryData Member for EventData

To integrate with the ObjectSerializer for Schema Registry support and to use a common idiom for data access and translation, the EventData type in the Event Hubs client library should adopt the BinaryData type introduced for the Azure.Core ecosystem.

Because Event Hubs is generally available and backwards compatibility needs to be preserved for existing applications using EventData, it is not possible to directly change the type of the existing Body member. This document outlines some of the considerations and options for exposing the event body as BinaryData.

Things to know before reading

  • The names used in this document are presented for discussion as potential final choices. Please provide feedback during discussion.

  • Full usage examples and details are not provided; because the changes are at the member level of an existing type, the Event Hubs samples already illustrate the full surface area.

Goal

Choose a name for the BinaryData member and a final structure for the EventData type that does not break backwards compatibility for existing applications and that, where possible, preserves consistency with the naming and structure of the type across languages.

Challenges

  • The EventData type is generally available and used by existing applications. The preferred name of the payload, Body, is used today across languages and within the track zero and track one libraries.

  • The Body member has a ReadOnlyMemory<byte> type. BinaryData converts implicitly to that type, but calling ReadOnlyMemory<byte> members such as ToArray on a BinaryData value requires an explicit cast.

  • A pattern such as var payload = Encoding.UTF8.GetString(eventData.Body.ToArray()); is a very common use case; it would be inadvisable to consider a breaking change here.
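
The conversion behavior described in the challenges above can be sketched without the real library types. In this minimal example, SketchPayload is a hypothetical stand-in for BinaryData that models only an implicit conversion to ReadOnlyMemory<byte>; it is an assumption for illustration and not the real BinaryData API.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for BinaryData; it models only an implicit
// conversion to ReadOnlyMemory<byte>, not the real BinaryData API.
public sealed class SketchPayload
{
    private readonly byte[] _bytes;

    public SketchPayload(string text) => _bytes = Encoding.UTF8.GetBytes(text);

    public static implicit operator ReadOnlyMemory<byte>(SketchPayload payload) => payload._bytes;
}

public static class ConversionSketch
{
    // The common pattern: the body is a ReadOnlyMemory<byte>, so ToArray is available directly.
    public static string Decode(ReadOnlyMemory<byte> body) => Encoding.UTF8.GetString(body.ToArray());

    public static void Main()
    {
        var payload = new SketchPayload("hello");

        // Assignment works through the implicit conversion.
        ReadOnlyMemory<byte> body = payload;
        Console.WriteLine(Decode(body));

        // payload.ToArray() would not compile here: member lookup does not apply
        // user-defined conversions, so an explicit cast is required.
        byte[] copy = ((ReadOnlyMemory<byte>)payload).ToArray();
        Console.WriteLine(copy.Length);
    }
}
```

The explicit cast in Main is the friction that existing callers of eventData.Body.ToArray() would hit if the type of Body were changed directly.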

Option: Add a new Member

This is the approach that was taken for the current preview. In order to minimize impact to existing customers and keep alignment with the current structure, a new property was added which conforms to the convention used by the existing stream accessor for the body.

Benefits:

  • Provides minimal churn for existing customers
  • Conforms to expectations from earlier generations of the .NET library
  • Keeps alignment across languages for the current generation

Concerns:

  • The name reads awkwardly
  • The name does not emphasize the BinaryData member, potentially impacting discoverability

// Azure.Messaging.EventHubs v5.3.0-beta.3 (Schema Registry Preview)
public class EventData 
{
    // New members
    public EventData(BinaryData eventBody);
    public BinaryData BodyAsBinaryData { get; }
    
    // Existing members
    public EventData(ReadOnlyMemory<byte> eventBody);
    public ReadOnlyMemory<byte> Body { get; }
    public Stream BodyAsStream { get; }
    public DateTimeOffset EnqueuedTime { get; }
    public long Offset { get; }
    public string PartitionKey { get; }
    public IDictionary<string, object> Properties { get; }
    public int? PublishedSequenceNumber { get; }
    public long SequenceNumber { get; }
    public IReadOnlyDictionary<string, object> SystemProperties { get; }
}

Option: Deprecate and Replace Body

This approach introduces a new BinaryData member, DataBody, and deprecates the existing Body and BodyAsStream members by marking them as non-browsable. Backwards compatibility is preserved, but the older members become harder to discover.

Benefits:

  • Accentuates the BinaryData member, helping discoverability
  • Body access is uniform, with a single member providing all operations
  • Aligns with the Service Bus approach

Concerns:

  • Cannot use the preferred name of Body
  • Changes a high-visibility member of a core type in a GA'd library
  • Impacts usage expectations from previous generations of the library
  • Breaks alignment with other languages
  • The proposed name DataBody potentially overloads a term/concept from AMQP

// Azure.Messaging.EventHubs v5.3.0-beta.3 (Schema Registry Preview)
public class EventData 
{
    // New members
    public EventData(BinaryData eventBody);
    public BinaryData DataBody { get; }
    
    // Changes to existing members
    [EditorBrowsable(EditorBrowsableState.Never)]
    public ReadOnlyMemory<byte> Body { get; }
    
    [EditorBrowsable(EditorBrowsableState.Never)]
    public Stream BodyAsStream { get; }
    
    // Existing members
    public EventData(ReadOnlyMemory<byte> eventBody);
    public DateTimeOffset EnqueuedTime { get; }
    public long Offset { get; }
    public string PartitionKey { get; }
    public IDictionary<string, object> Properties { get; }
    public int? PublishedSequenceNumber { get; }
    public long SequenceNumber { get; }
    public IReadOnlyDictionary<string, object> SystemProperties { get; }
}
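
As a rough illustration of why marking a member non-browsable preserves compatibility, the sketch below applies the same attribute to a hypothetical stand-in type. HiddenBodyEvent and the helper names are assumptions for illustration; only the Body member of EventData is modeled.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;
using System.Text;

// Hypothetical stand-in for EventData; only the Body member is modeled.
public class HiddenBodyEvent
{
    public HiddenBodyEvent(ReadOnlyMemory<byte> eventBody) => Body = eventBody;

    // Hidden from editor completion lists, but still public and callable.
    [EditorBrowsable(EditorBrowsableState.Never)]
    public ReadOnlyMemory<byte> Body { get; }
}

public static class BrowsableSketch
{
    // Code written against the original surface compiles unchanged.
    public static string ReadBody(HiddenBodyEvent eventData) =>
        Encoding.UTF8.GetString(eventData.Body.ToArray());

    // The attribute is metadata only; this reads it back via reflection.
    public static bool IsHidden(string memberName)
    {
        var property = typeof(HiddenBodyEvent).GetProperty(memberName);
        var attribute = property?.GetCustomAttribute<EditorBrowsableAttribute>();
        return attribute?.State == EditorBrowsableState.Never;
    }

    public static void Main()
    {
        var eventData = new HiddenBodyEvent(Encoding.UTF8.GetBytes("hello"));
        Console.WriteLine(ReadBody(eventData));
        Console.WriteLine(IsHidden(nameof(HiddenBodyEvent.Body)));
    }
}
```

The attribute is only a hint to tooling, so the compiler and runtime treat the member as before. Editors generally honor it only for members of referenced assemblies, so the member may still show up in completion lists inside the defining project.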

Option: Mirror ReadOnlyMemory<T> members on BinaryData

This approach attempts to align the API surface of BinaryData with ReadOnlyMemory<T> in order to take advantage of the implicit conversion.

Note: This option is most likely not feasible, as it is susceptible to compatibility issues should members be added to ReadOnlyMemory<T> that are not reflected in BinaryData. It is included here for completeness and discussion.

Benefits:

  • Possibly allows for changing the type of Body to BinaryData without a breaking change
  • Avoids the downsides of the other approaches

Concerns:

  • The surface area of ReadOnlyMemory<T> is not huge, but it is large enough that mirroring it is non-trivial
  • Some members of ReadOnlyMemory<T> do not make sense when applied to BinaryData itself
  • High risk of breaking changes that go undetected
  • Ongoing effort would be needed to keep the surface area aligned
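
The mirroring idea and its limits can be sketched with stand-in types. MirroredPayload and MirroredEvent below are hypothetical and are not the real BinaryData or EventData; the sketch mirrors only a few ReadOnlyMemory<byte> members and, as an assumption, considers source-level compatibility only.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for a BinaryData-like type that mirrors a few
// ReadOnlyMemory<byte> members; it is not the real BinaryData API.
public sealed class MirroredPayload
{
    private readonly ReadOnlyMemory<byte> _bytes;

    public MirroredPayload(ReadOnlyMemory<byte> bytes) => _bytes = bytes;

    // Mirrored members simply forward to the wrapped ReadOnlyMemory<byte>.
    public int Length => _bytes.Length;
    public bool IsEmpty => _bytes.IsEmpty;
    public byte[] ToArray() => _bytes.ToArray();
    public MirroredPayload Slice(int start, int length) => new MirroredPayload(_bytes.Slice(start, length));

    public static implicit operator ReadOnlyMemory<byte>(MirroredPayload payload) => payload._bytes;
}

// Hypothetical stand-in for EventData with the type of Body changed.
public class MirroredEvent
{
    public MirroredEvent(MirroredPayload eventBody) => Body = eventBody;
    public MirroredPayload Body { get; }
}

public static class MirrorSketch
{
    // Same source text as the common pattern against today's Body.
    public static string ReadBody(MirroredEvent eventData) =>
        Encoding.UTF8.GetString(eventData.Body.ToArray());

    public static void Main()
    {
        var eventData = new MirroredEvent(new MirroredPayload(Encoding.UTF8.GetBytes("hello")));
        Console.WriteLine(ReadBody(eventData));
        Console.WriteLine(eventData.Body.Slice(1, 3).Length);
    }
}
```

Members that are not mirrored here, such as Span or Pin, would fail to compile for any caller that uses them. The same would apply to any member added to ReadOnlyMemory<T> later, which is the alignment burden the concerns above describe.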

Reference Examples

The following serve as a reference for the EventData type as it exists across the languages in GA form.

.NET

// Azure.Messaging.EventHubs v5.2.0 (GA)
public class EventData 
{
    public EventData(ReadOnlyMemory<byte> eventBody);
    public ReadOnlyMemory<byte> Body { get; }
    public Stream BodyAsStream { get; }
    public DateTimeOffset EnqueuedTime { get; }
    public long Offset { get; }
    public string PartitionKey { get; }
    public IDictionary<string, object> Properties { get; }
    public long SequenceNumber { get; }
    public IReadOnlyDictionary<string, object> SystemProperties { get; }
}

// Microsoft.Azure.EventHubs v4.3.1 (GA - Track One)
public class EventData : IDisposable
{
    public EventData(byte[] array);
    public EventData(ArraySegment<byte> arraySegment);
    public ArraySegment<byte> Body { get; }
    public IDictionary<string, object> Properties { get; internal set; }
    public SystemPropertiesCollection SystemProperties { get; set; }
    public string ContentType { get; set; }
    public void Dispose();
}

Java

public class EventData {
    public EventData(byte[] body);
    public EventData(ByteBuffer body);
    public EventData(String body);
    public Map<String, Object> getProperties();
    public Map<String, Object> getSystemProperties();
    public byte[] getBody();
    public String getBodyAsString();
    public Long getOffset();
    public String getPartitionKey();
    public Instant getEnqueuedTime();
    public Long getSequenceNumber();
}

TypeScript

export interface EventData {
  body: any;
  properties?: { [key: string]: any; };
}
  
export interface ReceivedEventData {
  body: any;
  properties?: { [key: string]: any; };
  enqueuedTimeUtc: Date;
  partitionKey: string | null;
  offset: number;
  sequenceNumber: number;
  systemProperties?: { [key: string]: any; };
}

Python

class EventData(object):
    def __init__(self, body=None):
    def _from_message(cls, message):
    def _encode_message(self):
    def sequence_number(self):
    def offset(self):
    def enqueued_time(self):
    def partition_key(self):
    def properties(self):
    def properties(self, value):
    def system_properties(self):
    def body(self):
    def body_as_str(self, encoding="UTF-8"):
    def body_as_json(self, encoding="UTF-8"):