Hive Content on InfoQ
Presentations
Data Science in the Cloud @StitchFix
Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, the algorithms used, keeping schemas in sync, and more.
-
Streaming Live Data and the Hadoop Ecosystem
Oleg Zhurakousky discusses the Hadoop ecosystem (Hadoop, HDFS, YARN) and how projects such as Hive, Atlas, and NiFi interact and integrate to support the variety of data used for analytics.
-
Achieving Mega-Scale Business Intelligence through Speed of Thought Analytics on Hadoop
Ian Fyfe discusses the different options for implementing speed-of-thought business analytics and machine learning tools directly on top of Hadoop.
-
The Game of Big Data: Scalable, Reliable Analytics Infrastructure at KIXEYE
Randy Shoup describes KIXEYE's analytics infrastructure from Kafka queues through Hadoop 2 to Hive and Redshift, built for flexibility, experimentation, iteration, testability, and reliability.
-
REEF: Retainable Evaluator Execution Framework
Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.
-
Apache Tez: Accelerating Hadoop Query Processing
Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.
-
Big Data Platform as a Service at Netflix
Jeff Magnusson details some of Netflix's key services: Franklin, Sting, and Lipstick.
-
Petabyte Scale Data at Facebook
Dhruba Borthakur discusses the different types of data used by Facebook and how they are stored, including graph data, semi-OLTP data, immutable data for pictures, and Hadoop/Hive for analytics.
-
Hadoop and Cassandra, Sitting in a Tree ...
Jake Luciani introduces Brisk, a Hadoop and Hive distribution using Cassandra for core services and storage, presenting the benefits of running Hadoop in a peer-to-peer masterless architecture.