Splunk .conf 2015 Keynote

Splunk opened their big data conference with an emphasis on “making machine data accessible, usable, and valuable to everyone”. This is a shift from their original focus: indexing arbitrary big data sources. Reasonably happy with their ability to process data, they want to ensure that developers, IT staff, and normal people have a way to actually use all of the data their company is collecting.

Splunk itself is a ten-year-old company with 10,000 customers, up from 1,000 customers in 2010.

Nate McKervey on Splunk Enterprise 6.3

Performance has improved significantly over the previous version, 6.2. Ad hoc searches are now roughly twice as fast. The scheduling agent has also been improved. Rather than saying when a search should be run, the user instead says when they want the search to be completed. The intelligent scheduler considers factors such as data size and server load to estimate when the search should be started.
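The keynote did not describe how the scheduler arrives at its estimate. As a rough illustration of the idea only, the sketch below works backwards from a completion deadline. The function name, the throughput figure, and the load factor are all hypothetical and are not Splunk's actual algorithm.

```python
from datetime import datetime, timedelta

def estimate_start_time(deadline, data_size_gb, gb_per_minute, load_factor):
    """Hypothetical deadline-based scheduling: work backwards from the deadline.

    load_factor >= 1.0 stretches the expected runtime on a busy server.
    None of these parameters come from Splunk; they are illustrative only.
    """
    if gb_per_minute <= 0:
        raise ValueError("gb_per_minute must be positive")
    if load_factor < 1.0:
        raise ValueError("load_factor must be at least 1.0")
    runtime_minutes = data_size_gb / gb_per_minute * load_factor
    return deadline - timedelta(minutes=runtime_minutes)

# The user asks for results by 08:00; the scheduler picks the start time.
deadline = datetime(2015, 9, 22, 8, 0)
start = estimate_start_time(deadline, data_size_gb=120, gb_per_minute=4, load_factor=1.5)
print(start)
```

With these made-up numbers the search needs an estimated 45 minutes, so it would be started at 07:15 to finish by 08:00.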

Indexing has also been improved. According to Cisco, indexing on Splunk Enterprise 6.3 is more than four times faster when running on the Cisco UCS platform. Cisco also reported a 6X improvement in search speed over 6.2 on the same hardware.

In terms of hardware utilization, version 6.2 required 20 indexers to handle 2TB of data per day. With the new version, the baseline recommendation for the same amount of data is down to eight indexers.
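To put those figures in per-indexer terms, the short calculation below divides the daily volume by the indexer count for each version. It assumes decimal units (1 TB = 1,000 GB) and an even spread of data across indexers, neither of which was stated in the keynote.

```python
# Assumption: 1 TB = 1,000 GB and data is spread evenly across indexers.
DAILY_VOLUME_GB = 2 * 1000

per_indexer_62 = DAILY_VOLUME_GB / 20  # baseline for version 6.2
per_indexer_63 = DAILY_VOLUME_GB / 8   # baseline for version 6.3
ratio = per_indexer_63 / per_indexer_62

print(per_indexer_62, per_indexer_63, ratio)
```

Under those assumptions each indexer goes from handling about 100 GB per day to about 250 GB per day, a 2.5-fold increase.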

HTTP Event Collector

In the past, Splunk was designed to read from arbitrary data sources. But of course that requires something to actually collect and log the events that make up the data. With this release, Splunk offers an HTTP Event Collector. This allows applications to push events directly into Splunk without the need for intermediaries. Nate claims that the HTTP Event Collector can scale to millions of events per second.
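The keynote did not show what a push looks like on the wire. The sketch below builds, but does not send, such a request using only the Python standard library. The `/services/collector` path, port 8088, and the `Authorization: Splunk <token>` header reflect Splunk's HTTP Event Collector documentation as best understood here and should be checked against the version in use. The host name, token, and event fields are placeholders.

```python
import json
import urllib.request

def build_hec_request(host, token, event, sourcetype="_json"):
    """Build a POST request for the HTTP Event Collector without sending it.

    Assumptions: default port 8088, the /services/collector endpoint, and
    token authentication via the "Splunk <token>" Authorization scheme.
    """
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=f"https://{host}:8088/services/collector",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder host and token; replace with real values before sending.
req = build_hec_request(
    host="splunk.example.com",
    token="00000000-0000-0000-0000-000000000000",
    event={"action": "login", "user": "alice"},
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the payload easy to inspect or test before any network traffic is involved.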

Data Archiving with Hunk 6.3

When working with big data, archiving has to be part of the picture. But cold archives are traditionally hard to work with, so data is often kept on the hot servers, bogging everything down. With Hunk 6.3 you can now archive your older data into Hadoop but still query it using the same tools you use with data indexed in Splunk Enterprise. Nate reports a 75% decrease in storage size when running in this mode.

Mark Olesen on Splunk in the Cloud

Another big push this year is the hosted offering known as Splunk Cloud. It currently has hundreds of customers, some of whom push several terabytes of data per day into Splunk's servers.

Splunk Cloud runs on top of Amazon AWS, so it has the same theoretical uptime guarantees. However, Splunk Cloud is going beyond that to promise a 100% uptime SLA. Beyond running across multiple AWS availability zones, Splunk promises to actively monitor each user's instance so that issues can be addressed proactively.

Snehal Antani on Business Analytics

On the business analytics side, the goal is to move companies away from “running their business on month old data”. Splunk’s Business Analytics division sees its role as replacing traditional ETL jobs and reports that run on a weekly or monthly basis with real-time dashboards and reports.

One of the reasons that ETL jobs exist is that moving all of the data into one place is really expensive in terms of time and network utilization. So rather than doing that, Splunk is developing a distributed search engine that queries the data where it lives. Essentially, this is a map-reduce job that spans not only multiple servers, but also multiple data centers and data formats.
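The general map-reduce pattern behind querying data in place can be sketched in a few lines. This is an illustration of the pattern only, not Splunk's implementation. The site names, event fields, and function names are hypothetical.

```python
from collections import Counter

def local_search(events, keyword):
    """Map step: runs where the data lives and returns only a small summary."""
    return Counter(e["host"] for e in events if keyword in e["message"])

def merge_results(partials):
    """Reduce step: the coordinator combines the per-site summaries."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Hypothetical events held in two separate data centers.
sites = {
    "dc-east": [
        {"host": "web1", "message": "error: timeout"},
        {"host": "web2", "message": "ok"},
    ],
    "dc-west": [
        {"host": "web1", "message": "error: disk full"},
        {"host": "db1", "message": "error: timeout"},
    ],
}

partials = [local_search(events, "error") for events in sites.values()]
combined = merge_results(partials)
print(combined)
```

Only the per-host counts cross the network, not the raw events, which is what saves the time and bandwidth that a centralizing ETL job would spend.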

Monzy Merza on Splunk Enterprise Security 4.0

The first feature Monzy discussed is the Investigator Timeline. Normally security analysts have to manually copy event information into notebooks or spreadsheets. With Investigator Timeline, any event of interest on any dashboard can be tagged with an investigation name. Once the relevant data is collected, each event appears on a timeline with links back to the dashboards where the event can be further analyzed.

A related feature is the Investigate Journal. This records every search that the security analyst makes so that others can see how he or she conducted the investigation. This is useful in a variety of scenarios, including legal reporting requirements, training, and simply helping analysts remember what they were doing.

Splunk User Behavior Analytics (UBA)

Like products offered by their partners, Splunk UBA relies on machine data and machine learning to detect changes in user behavior. Specific details were not covered in the keynote.

Jonathan Cervelli on Splunk IT Service Intelligence and Glass Tables

Splunk’s IT Service Intelligence product is designed to make service monitoring easier. The emphasis is on data visualization.

To start with Glass Tables, you upload a diagram that represents what you want to monitor. This is just a static image, which could be anything from a Visio network diagram to a photograph of a whiteboard. You can then drag and drop relevant searches onto the diagram.
