Hortonworks Addresses the IoAT with DataFlow Based on NiFi

by Abel Avram on Sep 25, 2015. Estimated reading time: 2 minutes

Hortonworks has quietly made available the DataFlow platform, which is based on Apache NiFi and attempts to address the processing needs of the Internet of AnyThing (IoAT).

Hortonworks has recently introduced the DataFlow (HDF) platform to an audience of oil and gas producers during a webinar. HDF is based on Apache NiFi, a real-time data streaming and processing system open sourced by the NSA last year. The initial name of the project was Niagarafiles. When NiFi was open sourced, several former NSA developers founded Onyara, a company that continued the development of the project and provided support. Hortonworks has recently bought Onyara and integrated those developers into its team.

Because NiFi can be used to stream data coming from a wide variety of sources, Hortonworks considers HDF fit for the Internet of AnyThing (IoAT). Data flowing in HDF is multidirectional and point-to-point, enabling users to interact with the stream of collected data and even reach out to its source, down to sensors and devices. HDF is complementary to HDP: the former deals with data-in-motion, while the latter, based on Hadoop, extracts insights from data-at-rest.

NiFi was built with a number of concepts in mind: managing the flow of information at a granular level, tracking everything that happens to the data (where it comes from and what happened to it along the way), and securing the control and data planes. NiFi’s main features are:

  • Guaranteed data delivery
  • Data buffering with a back-pressure mechanism
  • Prioritized queuing
  • QoS
  • Data provenance – NiFi keeps a log of every change a piece of data has gone through, enabling traceability, data recovery and replay, auditing, and evaluation
  • Logging detailed history of data
  • Interactive command and control console providing visual feedback for system changes
  • Flow templates
  • Pluggable/multi-role security
  • Extendability
  • Clustering
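
To make two of the items above more concrete, here is a minimal Python sketch of a buffer that combines back-pressure with prioritized queuing. It is not NiFi's API: the `BoundedPriorityQueue` class, its `max_items` threshold, and the `offer`/`poll` method names are all hypothetical, chosen only to illustrate the idea that a full queue refuses new items and that more urgent items are delivered first.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy buffer between two processing steps (hypothetical, not NiFi's API)."""

    def __init__(self, max_items):
        self.max_items = max_items        # back-pressure threshold (assumed name)
        self._heap = []
        self._order = itertools.count()   # tie-breaker: FIFO within one priority

    def is_full(self):
        return len(self._heap) >= self.max_items

    def offer(self, item, priority):
        """Try to enqueue; return False when back-pressure applies."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Return the most urgent item (lowest priority number), or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(max_items=2)
accepted = [
    queue.offer("routine reading", priority=5),
    queue.offer("alarm", priority=1),
    queue.offer("another reading", priority=5),  # refused: the queue is full
]
first = queue.poll()  # the alarm is delivered before the routine reading
```

In this sketch the producer is expected to slow down or retry when `offer` returns `False`; how a real system signals back-pressure upstream is a design choice the sketch does not try to model.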

NiFi is not just for IoT; it is useful for all sorts of real-time data processing needs: predictive analytics, fraud detection, big data ingest, resource evaluation, and others. NiFi comes out of the box with 90 data processors, including encoders, encrypters, compressors, and converters, as well as processors for creating Hadoop sequence files from data flows, interacting with AWS, sending messages to Kafka, getting messages from Twitter, and others. One can configure the data processors through a drag-and-drop visual UI, chaining them and using back-pressure between them to control the data flow. The tool has built-in scalability, request replication, load balancing and failover.
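
The idea of chaining processors while recording what happened to the data at each step can be sketched in a few lines of Python. This is an illustration only, not how NiFi is implemented: the `run_flow` function, the two toy processors, and the fields of each provenance event are assumptions made for the example.

```python
import zlib


def to_upper(data):
    """Toy 'converter' processor (hypothetical)."""
    return data.upper()


def compress(data):
    """Toy 'compressor' processor built on the standard zlib module."""
    return zlib.compress(data)


def run_flow(data, processors):
    """Apply each processor in order, recording one provenance event per step."""
    provenance = []
    for processor in processors:
        result = processor(data)
        provenance.append({
            "processor": processor.__name__,
            "bytes_in": len(data),
            "bytes_out": len(result),
        })
        data = result
    return data, provenance


output, events = run_flow(b"sensor reading: 42", [to_upper, compress])
for event in events:
    print(event["processor"], event["bytes_in"], "->", event["bytes_out"])
```

Keeping the event log separate from the data means the history of a payload can be inspected or audited without touching the payload itself.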

On the roadmap one can find: better configuration management of flows, an extension and template registry, first-class Avro support, interactive queue management, multi-tenant data flow, and others.

HDF can be tested in a sandboxed environment with Apache Ambari.
