SpringXD being Re-architected and Re-branded to Spring Cloud Data Flow

by Charles Humble on  Sep 25, 2015

Pivotal announced a complete re-design of Spring XD, its big data offering, during last week’s SpringOne2GX conference, with a corresponding re-brand from Spring XD to Spring Cloud Data Flow. The new product is focussed on orchestration.

Splunk for DBAs

by Jonathan Allen on  Sep 24, 2015

The DBA’s primary job is to ensure that the business’s information is always available, with performance coming in a close second. We’ve already talked about optimizing distributed queries in Splunk and map-reduce queries in Hunk. In this report we expand upon that with more information that a DBA needs to know about Splunk databases.

Optimizing Distributed Queries in Splunk

by Jonathan Allen on  Sep 23, 2015

Optimizing queries in Splunk’s Search Processing Language is similar to optimizing queries in SQL. The two core tenets are the same: change the physics and reduce the amount of work done. Added to that are two precepts that apply to any distributed query.

Big Data Architecture: Push, Pull, or Search in Place?

by Jonathan Allen on  Sep 23, 2015

A surprisingly common theme at the Splunk Conference is the architectural question, “Should I push, pull, or search in place?”

Architecture, Tuning, and Troubleshooting a Splunk Indexer Cluster

by Jonathan Allen on  Sep 23, 2015

If you could handle all of the data you need to work with on one machine, then there is no reason to use big data techniques. So clustering is pretty much assumed for any installation larger than a basic proof of concept. In Splunk Enterprise, the most common type of cluster you’ll be dealing with is the Indexer Cluster.

Hunk/Hadoop: Performance Best Practices

by Jonathan Allen on  Sep 23, 2015

When working with Hadoop, with or without Hunk, there are a number of ways you can accidentally kill performance. While some of the fixes require more hardware, sometimes the problems can be solved simply by changing the way you name your files.

Introducing Splunk IT Service Intelligence

by Jonathan Allen on  Sep 22, 2015

Splunk is jumping into the service-monitoring sector with a new visualization called IT Service Intelligence.

Using Hunk+Hadoop as a Backend for Splunk

by Jonathan Allen on  Sep 22, 2015

Splunk can now store archived indexes on Hadoop. At the cost of performance, this offers a 75% reduction in storage costs without losing the ability to search the data. And with the new adapters, Hadoop tools such as Hive and Pig can process the Splunk-formatted data.

Splunk .conf 2015 Keynote

by Jonathan Allen on  Sep 22, 2015

Splunk opened their big data conference with an emphasis on “making machine data accessible, usable, and valuable to everyone”. This is a shift from their original focus: indexing arbitrary big data sources. Reasonably happy with their ability to process data, they want to ensure that developers, IT staff, and normal people have a way to actually use all of the data their company is collecting.

Google's Cloud Dataflow Enters General Availability

by Kent Weare on  Sep 17, 2015

On August 12, Google announced that its big data processing service has reached general availability. This managed service allows customers to build pipelines that manipulate data before it is processed by big data solutions. Cloud Dataflow supports both streaming and batch programming in a unified model.

Data Workflow Management Using Airbnb's Airflow

by Alex Giamas on  Sep 08, 2015

Airbnb recently open sourced Airflow, its own data workflow management framework. Airflow is used internally at Airbnb to build, monitor, and adjust data pipelines. Airflow’s creator, Maxime Beauchemin, and Siddharth Anand, Agari’s Data Architect and one of the framework’s early adopters, discuss Airflow, where it can be of use, and future plans.

Microsoft Releases Azure Data Factory

by Richard Seroter on  Aug 25, 2015

Any cloud provider that believes in data gravity is trying to make it easier to collect and store data in its facilities. To make data movement between cloud and on-premises endpoints easier, Microsoft recently announced the general availability of Azure Data Factory (ADF).

QCon SF 2015 Update: Workshops at a glance (Nov 19-20)

by Wesley Reisz on  Aug 13, 2015

At QCon San Francisco, we offer two days of workshops (Nov 19-20). Workshops focus on developing the technical skills that leverage technologies you heard about from our expert practitioners during the conference sessions. Here is a glimpse at some of the experts you can learn from at the QCon SF ‘15 workshops.

Data Quality at Prezi

by João Miranda on  Jul 18, 2015

For an organization to be data-driven, it's not enough to just dump mountains of data. That data needs to be accurate and meaningful. Julianna Göbölös-Szabó, data engineer at Prezi, shared how the company improved the quality of its log data. Their solution involved moving from unstructured to structured data with a lightweight, contract-based approach to nudge all teams in the right direction.

Basho Data Platform Supports In-Memory Analytics, Caching, Search and Integration with NoSQL

by Srini Penchikala on  Jul 05, 2015

Basho Data Platform supports integration with NoSQL databases like Redis, in-memory analytics, caching, and search. Basho Technologies, the company behind the Riak NoSQL database, announced in May the availability of the data platform, which can be used to deploy and manage Big Data, IoT, and hybrid cloud applications.

All content copyright © 2006-2015 C4Media Inc.