

Amazon Releases Kinesis Service Update

Amazon has recently announced an update to its Amazon Kinesis service. The update adds three features across Amazon Kinesis Streams and Amazon Kinesis Firehose: Amazon Elasticsearch Service integration, shard-level metrics, and time-based iterators.

Amazon Elasticsearch Service Integration

With the new integration, developers can move data from Amazon Kinesis Firehose delivery streams into an Amazon Elasticsearch Service cluster.

As data enters the delivery stream, Firehose buffers it according to the stream's configuration settings and then writes it to the Elasticsearch cluster in bulk inserts. Compression and encryption can also be applied to these delivery streams via the AWS Management Console.
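The buffer-then-bulk-insert behaviour can be illustrated with a small sketch. This is a toy model, not Firehose's actual implementation: the class name `BufferedDelivery`, its parameters, and the `sink` callback are all hypothetical, and the size and age thresholds simply stand in for whatever buffering settings the delivery stream is configured with.

```python
import time


class BufferedDelivery:
    """Toy buffer: flush when either a size or an age threshold is reached."""

    def __init__(self, max_bytes, max_seconds, sink, clock=time.monotonic):
        self.max_bytes = max_bytes      # hypothetical size threshold
        self.max_seconds = max_seconds  # hypothetical age threshold
        self.sink = sink                # callable that receives one batch (a "bulk insert")
        self.clock = clock              # injectable so the sketch can be tested
        self.records = []
        self.size = 0
        self.started = None             # arrival time of the oldest buffered record

    def put(self, record: bytes):
        if not self.records:
            self.started = self.clock()
        self.records.append(record)
        self.size += len(record)
        self.flush_if_due()

    def flush_if_due(self):
        if not self.records:
            return
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.started >= self.max_seconds
        if too_big or too_old:
            self.sink(self.records)
            self.records, self.size, self.started = [], 0, None
```

Whichever threshold is hit first triggers the flush, so a quiet stream still delivers its data after the age limit while a busy one delivers in size-bounded batches.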

Some use cases of this integration include indexing and analyzing server logs, clickstreams and social media traffic. Matt Wood, general manager, product strategy at AWS, suggests this new capability is a “great fit for log analytics and application monitoring.”

Once the data has been published to Elasticsearch, it can then be analyzed and visualized in tools like Kibana.

Shard-Level Metrics

A Kinesis Stream is made up of one or more shards. Each shard represents a fixed unit of read and write capacity: 1MB/sec of data input and 2MB/sec of data output. Shards are charged by the hour, and each can accept up to 1000 records per second for writes.
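As a worked example of these per-shard limits, the sketch below estimates how many shards a workload needs. The function name and its inputs are hypothetical; the only figures taken from the article are the 1MB/sec write, 1000 records/sec write, and 2MB/sec read limits.

```python
import math

# Per-shard limits quoted above.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_sec, write_records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD),
        math.ceil(write_records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_sec / READ_MB_PER_SHARD),
    )
```

The tightest of the three limits decides the answer: a stream of many tiny records can be bound by the record count long before it approaches the byte limit.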

New Shard-Level Metrics provide insight into the performance of each shard in a Kinesis Stream. Seven metrics are listed below, and their values are reported every minute. These metrics are charged using CloudWatch’s per-metric pricing model.

Amazon has published the details of each metric available for Kinesis Streams:

  • IncomingBytes – The number of bytes that have been successfully PUT to the shard.
  • IncomingRecords – The number of records that have been successfully PUT to the shard.
  • IteratorAgeMilliseconds – The age (in milliseconds) of the last record returned by a GetRecords call against a shard. A value of 0 means that the records being read are completely caught up with the stream.
  • OutgoingBytes – The number of bytes that have been retrieved from the shard.
  • OutgoingRecords – The number of records that have been retrieved from the shard.
  • ReadProvisionedThroughputExceeded – The number of GetRecords calls that have been throttled for exceeding the 5 reads per second or 2 MB per second shard limits.
  • WriteProvisionedThroughputExceeded – The number of records that have been rejected due to throttling for exceeding the 1000 records per second or 1 MB per second shard limits.
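A minimal sketch of turning these metrics on for a stream follows. It assumes a boto3 Kinesis client and its `enable_enhanced_monitoring` call; the helper name `enable_shard_metrics` and the stream name in the usage comment are hypothetical. The client is passed in so the helper can be exercised without AWS credentials.

```python
# The metric names listed above.
SHARD_LEVEL_METRICS = frozenset({
    "IncomingBytes",
    "IncomingRecords",
    "IteratorAgeMilliseconds",
    "OutgoingBytes",
    "OutgoingRecords",
    "ReadProvisionedThroughputExceeded",
    "WriteProvisionedThroughputExceeded",
})


def enable_shard_metrics(kinesis_client, stream_name, metrics):
    """Validate metric names locally, then request shard-level monitoring."""
    unknown = sorted(set(metrics) - SHARD_LEVEL_METRICS)
    if unknown:
        raise ValueError(f"unknown shard-level metrics: {unknown}")
    return kinesis_client.enable_enhanced_monitoring(
        StreamName=stream_name,
        ShardLevelMetrics=sorted(metrics),
    )


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   enable_shard_metrics(boto3.client("kinesis"), "example-stream",
#                        ["IncomingBytes", "IteratorAgeMilliseconds"])
```

Because each enabled metric is billed per shard, enabling only the metrics that are actually watched keeps the CloudWatch cost down.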

Organizations can use these insights to gauge the performance of their streams. One use case is detecting when an upstream application publishes data faster than a downstream consuming application can process it, which creates a throughput bottleneck.
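One way to spot such a bottleneck is to watch IteratorAgeMilliseconds per shard. The sketch below is an assumption-laden illustration: the function name, the threshold, and the input shape (a dict of shard ID to a list of per-minute samples, already fetched from CloudWatch by some other code) are all hypothetical.

```python
def lagging_shards(age_ms_by_shard, threshold_ms, min_samples=3):
    """Return shard IDs whose last `min_samples` iterator ages all exceed the threshold.

    Requiring several consecutive samples avoids flagging a single slow minute.
    """
    lagging = []
    for shard_id, samples in sorted(age_ms_by_shard.items()):
        recent = samples[-min_samples:]
        if len(recent) == min_samples and all(v > threshold_ms for v in recent):
            lagging.append(shard_id)
    return lagging


# Example input: one shard is caught up, the other keeps falling behind.
samples = {
    "shardId-000000000000": [0, 0, 500],
    "shardId-000000000001": [40000, 65000, 90000],
}
print(lagging_shards(samples, threshold_ms=30000))
```

A shard whose iterator age stays high is one where the consumer is reading records well after they were written, which is the symptom the paragraph above describes.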

Time-Based Iterators

As an application reads data from a stream, it needs to keep track of where it is in the stream in order to pull the right data, in the right order. In Kinesis Streams, an iterator performs this function. Prior to this release, an iterator could only be positioned at a sequence number, at the oldest record, or at the newest record.

With this update, customers can specify a timestamp at which to start processing their stream. This is useful when a downstream application takes planned downtime while the publisher(s) continue to publish to the stream. By default, a Kinesis Stream retains 24 hours of data, so a consuming application that returns within that window can resume from where it stopped by providing a timestamp.
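Resuming from a timestamp might look like the sketch below. It assumes a boto3 Kinesis client and its `get_shard_iterator` call with the `AT_TIMESTAMP` iterator type; the helper name, stream name, and shard ID are hypothetical, and the client is injected so the sketch can run without AWS access.

```python
from datetime import datetime, timezone


def iterator_from_timestamp(kinesis_client, stream_name, shard_id, resume_at):
    """Request a shard iterator positioned at the given point in time."""
    if resume_at.tzinfo is None:
        # Refuse naive datetimes so the resume point is unambiguous.
        raise ValueError("resume_at must be timezone-aware")
    response = kinesis_client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="AT_TIMESTAMP",
        Timestamp=resume_at,
    )
    return response["ShardIterator"]


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   stopped_at = datetime(2016, 4, 20, 2, 0, tzinfo=timezone.utc)  # example value
#   it = iterator_from_timestamp(boto3.client("kinesis"), "example-stream",
#                                "shardId-000000000000", stopped_at)
```

The consumer only has to record the time at which it stopped, rather than the last sequence number it read from every shard.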
