Splunk's Hunk 6.1 Brings New Capabilities for Big Data Analytics
Performing ad-hoc analytics on top of big data and deriving useful insights from it can be challenging. Hunk addresses this challenge by providing a platform for rapidly exploring, analyzing, and visualizing data in Hadoop and other NoSQL data stores.
Hunk is somewhat analogous to Hive, an open-source SQL engine for querying data stored in Hadoop. Both Hive and Hunk take a user's query, compile it into a series of MapReduce jobs, and execute the jobs on the cluster. Hunk, however, differs from Hive in several key respects:
- Hunk uses Splunk's Search Processing Language (SPL) instead of SQL for querying.
- Hunk does not require a schema to be defined in advance. Instead, it creates a schema on-the-fly when the query is executed.
- Hunk does not wait for the MapReduce jobs to finish before displaying results. To provide a more interactive experience, it streams interim results immediately while the MapReduce jobs continue to run in the background.
- In addition to the querying engine, Hunk also includes a built-in visualization layer that lets users create interactive charts from their search results and save them.
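The schema-on-the-fly and interim-results behavior described above can be illustrated with a small Python sketch. This is not Hunk's implementation or API: the `key=value` field pattern, the function names `extract_fields` and `count_by_field`, and the sample events are all assumptions made for illustration. The sketch derives fields from raw lines only when a query runs, and yields a running total after each chunk of input so a caller could display partial results before all the data has been processed.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator, List

# Hypothetical extraction rule: treat every key=value token as a field.
FIELD_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_event: str) -> Dict[str, str]:
    """Derive fields from a raw line at query time, with no predefined schema."""
    return dict(FIELD_PATTERN.findall(raw_event))


def count_by_field(splits: Iterable[List[str]], field: str) -> Iterator[Dict[str, int]]:
    """Yield a running count after each split so interim results can be shown."""
    totals: Counter = Counter()
    for split in splits:
        for raw_event in split:
            fields = extract_fields(raw_event)
            if field in fields:
                totals[fields[field]] += 1
        # Emit a snapshot of the totals seen so far.
        yield dict(totals)


# Illustrative input: two chunks of raw events with differing fields.
splits = [
    ["status=200 path=/home", "status=404 path=/missing"],
    ["status=200 path=/about user=alice"],
]
previews = list(count_by_field(splits, "status"))
```

Each element of `previews` is a snapshot of the counts after one more chunk, and the last element is the final result. Events need not share the same fields, since nothing is declared up front.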
Splunk adds several new features to Hunk in the latest version:
- Report acceleration caches search results in Hadoop, improving reporting response times and performance. It can be enabled on a per-report basis.
- Dashboards and charts are now interactive and support charting overlays, pan-and-zoom controls, and drill downs.
- Charts and reports can now be embedded into third-party business applications.
- Hunk is no longer limited to just Hadoop. Streaming resource libraries allow developers to connect Hunk to any NoSQL data store, such as Apache Cassandra, MongoDB, and Neo4j.
- Improved security through pass-through authentication gives administrators the ability to control which Hunk users can submit MapReduce jobs and access HDFS files.
- Hunk adds support for new file formats, including sequence files, RCFile, ORC files, and Parquet.
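The idea behind per-report acceleration, as described in the list above, can be sketched as a cache that is consulted only for reports that have opted in. This is a simplified illustration, not how Hunk stores or refreshes cached results: the `ReportCache` class, its methods, and the in-memory dictionary standing in for storage in Hadoop are all hypothetical.

```python
from typing import Callable, Dict, List, Set


class ReportCache:
    """Toy model of report acceleration enabled on a per-report basis."""

    def __init__(self) -> None:
        self._accelerated: Set[str] = set()
        # Stand-in for cached search results; a real system would persist these.
        self._cache: Dict[str, List[dict]] = {}

    def enable(self, report_name: str) -> None:
        """Opt a single report in to acceleration."""
        self._accelerated.add(report_name)

    def run(self, report_name: str, compute: Callable[[], List[dict]]) -> List[dict]:
        """Return cached results for accelerated reports, else recompute."""
        if report_name in self._accelerated and report_name in self._cache:
            return self._cache[report_name]
        results = compute()
        if report_name in self._accelerated:
            self._cache[report_name] = results
        return results


calls = {"count": 0}


def expensive_search() -> List[dict]:
    # Stand-in for a search that would otherwise launch MapReduce jobs.
    calls["count"] += 1
    return [{"status": "200", "count": 2}]


cache = ReportCache()
cache.enable("errors_by_status")
first = cache.run("errors_by_status", expensive_search)
second = cache.run("errors_by_status", expensive_search)
cache.run("not_accelerated", expensive_search)
cache.run("not_accelerated", expensive_search)
```

In this sketch the accelerated report runs the expensive search once and is served from the cache afterwards, while the report that has not opted in recomputes every time. The sketch ignores cache invalidation and newly arriving data.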
The community reaction to the release has been positive. Here is a sample of quotes from Twitter:
Splunk, hunk, and hadoop all in one system.. way too much fun for a geek girl. @mskerryschaffer - Kerry Schaffer, information technology director at Marketing Associates.
#SplunkLive new product releases of Splunk Enterprise and Hunk continues to deliver for the AppDev world and outpace the competition. @aconcolino - Anthony Concolino, independent management consultant.
I tip my hat to @splunk for coming up with this clever product name: "Splunk Hunk for Hadoop" Rolls right off the tongue. @tobingilman - Tobin Gilman, big data practice lead at Bootstrap Marketing and Business Development.
Ben Linders May 28, 2015