Facebook’s Petabyte Scale Data Warehouse using Hive and Hadoop



Ashish Thusoo and Namit Jain explain how Facebook handles 12 TB of new compressed data every day with Hive’s help. Hive is an open source data warehousing framework built on Hadoop that lets developers analyze large datasets using SQL.


Ashish Thusoo currently manages the Facebook data infrastructure team. He is the project leader of Hive at Apache and a member of the Hadoop PMC. Namit Jain is a member of Facebook’s data infrastructure group and a Hive committer. He also worked for over 10 years at Oracle on streaming technologies, XML, replication and queuing.

About the conference

QCon is a conference organized by the community, for the community. The result is a high-quality conference experience in which a tremendous amount of attention and investment has gone into having the best content on the most important topics, presented by the leaders in our community. QCon is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.

Recorded at:

Feb 21, 2010