Big Data Content on InfoQ
-
Apache Drill - Interactive Query and Analysis at Scale
Michael Hausenblas introduces Apache Drill, a distributed system for interactive analysis of large-scale datasets, including its architecture and typical use cases.
-
A Guide to Python Frameworks for Hadoop
Uri Laserson reviews the available Python frameworks for Hadoop, comparing their performance, ease of use and installation, implementation differences, and other features.
-
Evolving Panorama of Data
Rebecca Parsons reviews some of the changes in how data is used and analyzed, looking at how data is used to track violence and to try to predict famine and other crises before they happen.
-
Leveraging Scriptable Infrastructures, Towards a Paradigm Shift in Software for Data Science
Karim Chine introduces Elastic-R, demonstrating some of its applications in bioinformatics and finance.
-
Data Science of Love
Vaclav Petricek digs up some of the nuggets about romantic interactions hidden in eHarmony's large collection of human relationships.
-
Leveraging Your Hadoop Cluster Better - Running Performant Code at Scale
Michael Kopp explains how to run performant code at scale with Hadoop and how to analyze and optimize Hadoop jobs.
-
Lessons Learned Building Storm
Nathan Marz shares lessons learned building Storm, an open-source, distributed, real-time computation system.
-
Building Applications using Apache Hadoop
Eli Collins gives an overview of how to build new applications with Hadoop and how to integrate Hadoop with existing applications, and provides an update on the state of the Hadoop ecosystem, frameworks, and APIs.
-
Copious Data, the "Killer App" for Functional Programming
Dean Wampler advocates using functional programming and its core operations to process large amounts of data, explaining why Java’s dominance in Hadoop is harming Big Data’s progress.
-
Cloud and Big Data: Unicorns All the Way Down
Francine Bennett keynotes on using big data in the cloud.
-
The Big Data Revolution
Claudia Perlich keynotes on M6D’s approach to Big Data, using data granularity to build predictive models used for user targeting, bid optimization and fraud detection.
-
The Why, What and How of Open Data
Jeni Tennison explains how to evaluate an organization's data assets as potential sources of open data, and how to deal with the thorny issues of derived and personal data.