Microsoft Tackles Internet-of-Things With New Data Stream Processing Service
Last week at the Microsoft Worldwide Partner Conference, Microsoft took the wraps off of Azure Event Hubs. This service – in preview release until General Availability next month – is for high throughput ingress of data streams generated by devices and services. Event Hubs resembles Amazon Kinesis and uses an identical pricing scheme based on data processing units and transaction volume.
Microsoft added Event Hubs to its portfolio of messaging services called the Azure Service Bus. Event Hubs is similar to Service Bus Queues and Topics in that it supports first-in-first-out messaging, competing consumer scenarios, and data retention policies. Its client-side cursor, partitioned consumer support, and significant time-based retention options are unique within the Microsoft portfolio. An Event Hub can accept up to 1 GB per second of incoming data and store more than 2.5 PB over the course of a month. Microsoft allocates capacity via Throughput Units; each Throughput Unit delivers up to 1 MB per second of ingress, 2 MB per second of egress, and up to 84 GB of event storage per day. Microsoft states that you can provision up to 1024 Throughput Units (via special request) and up to 30 days of event storage for Event Hubs.
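To make the Throughput Unit arithmetic concrete, here is a minimal sizing sketch in Python. It relies only on the per-unit figures quoted above (1 MB/s ingress, 2 MB/s egress, 84 GB of storage per day). The names `Capacity`, `capacity_for`, `units_needed`, and `ingested_pb` are hypothetical helpers invented for illustration, not part of any Microsoft SDK, and the petabyte calculation assumes decimal units (1 PB = 1,000,000 GB).

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60

# Per-Throughput-Unit figures as quoted in the article.
INGRESS_MB_PER_S_PER_UNIT = 1
EGRESS_MB_PER_S_PER_UNIT = 2
STORAGE_GB_PER_DAY_PER_UNIT = 84


@dataclass(frozen=True)
class Capacity:
    ingress_mb_per_s: int
    egress_mb_per_s: int
    storage_gb_per_day: int


def capacity_for(throughput_units: int) -> Capacity:
    """Scale the quoted per-unit limits linearly by the number of units."""
    if throughput_units < 1:
        raise ValueError("at least one Throughput Unit is required")
    return Capacity(
        ingress_mb_per_s=throughput_units * INGRESS_MB_PER_S_PER_UNIT,
        egress_mb_per_s=throughput_units * EGRESS_MB_PER_S_PER_UNIT,
        storage_gb_per_day=throughput_units * STORAGE_GB_PER_DAY_PER_UNIT,
    )


def units_needed(ingress_mb_per_s: float) -> int:
    """Smallest number of units whose combined ingress covers the given rate."""
    return max(1, math.ceil(ingress_mb_per_s / INGRESS_MB_PER_S_PER_UNIT))


def ingested_pb(ingress_gb_per_s: float, days: int) -> float:
    """Data received at a sustained rate over `days`, in decimal petabytes."""
    return ingress_gb_per_s * SECONDS_PER_DAY * days / 1_000_000


if __name__ == "__main__":
    print(capacity_for(1024))    # limits at the quoted 1024-unit ceiling
    print(units_needed(2.5))     # units for a 2.5 MB/s device fleet
    print(ingested_pb(1, 30))    # sustained 1 GB/s for a 30-day month
```

Under these assumptions, a sustained 1 GB per second for 30 days works out to roughly 2.59 PB, which lines up with the "more than 2.5 PB" figure quoted for a month.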
An Event Hub is made up of multiple components, as described by Microsoft. Messages are sent to an Event Hub publisher endpoint via HTTP or AMQP. Microsoft defines partitions as the “scale mechanism of Event Hubs” and the default number of partitions is 16. Partitions can consume one Throughput Unit, and inbound messages can be targeted at a specific partition through a user-defined partition key value. Each Event Hub consumer has an endpoint that applications connect to in order to process a partition. Developers use a client-side pointer, or offset, to retrieve messages from a specific point in the event stream. The longer the Event Hub’s retention period, the farther back any consumer can replay the stream using an offset.
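The partition key and client-side offset ideas can be illustrated with a small in-memory model. This is a toy sketch and not the Event Hubs client API: the `ToyEventHub` class and its methods are hypothetical, the SHA-256-based key-to-partition mapping is an assumption (the article does not say how the service hashes keys), and a plain list index stands in for the offset. Retention is not modeled at all.

```python
import hashlib
from typing import List


class ToyEventHub:
    """In-memory stand-in: a fixed set of append-only partitions."""

    def __init__(self, partition_count: int = 16) -> None:
        # 16 mirrors the default partition count mentioned in the article.
        self.partitions: List[List[bytes]] = [[] for _ in range(partition_count)]

    def partition_for(self, partition_key: str) -> int:
        # Assumed scheme: a stable hash of the key, modulo the partition count,
        # so the same key always maps to the same partition.
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def send(self, partition_key: str, body: bytes) -> int:
        """Append an event to the partition chosen by its key; return the index."""
        index = self.partition_for(partition_key)
        self.partitions[index].append(body)
        return index

    def read(self, partition: int, offset: int = 0) -> List[bytes]:
        """Return every event in a partition from `offset` onward.

        The consumer, not the hub, remembers the offset, so it can rewind
        and replay by passing an earlier value.
        """
        return self.partitions[partition][offset:]


if __name__ == "__main__":
    hub = ToyEventHub()
    partition = hub.send("device-42", b"t=20.1")
    hub.send("device-42", b"t=20.4")
    hub.send("device-42", b"t=20.9")

    # A consumer that has already processed the first event resumes at offset 1.
    print(hub.read(partition, offset=1))
    # Rewinding to offset 0 replays the whole partition.
    print(hub.read(partition, offset=0))
```

Because the offset lives with the consumer, two readers can walk the same partition at different positions without interfering with each other. In the real service, how far back a reader can rewind is bounded by the retention period.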
Amazon Kinesis is the data stream processing service from cloud competitor AWS. It too supports partitions, which it calls shards, that can each receive up to 1 MB per second and deliver up to 2 MB per second of data to receivers. How does it differ from Azure Event Hubs? Kinesis has a 24-hour data retention period, integration with many other AWS services (such as the Redshift data warehouse, S3 object storage service, and Identity and Access Management), HTTP-only access, and a developer client library that takes care of complex stream reading activities. The data retention period is longer in Azure Event Hubs, and it also has the unique concept of publisher policies that can throttle and authenticate on a per-device basis.
Microsoft is providing a 50% discount throughout the preview period and is offering Azure Event Hubs in its North America, Europe, and Asia Pacific regions. To get a look at the code required to interface with this service, developers can review Microsoft-provided samples for Getting Started, building secure publishers, and creating event receivers.
John Krewson, Steve Ropa and Matt Badgley Nov 24, 2014