InfoQ

Presentation

Facebook’s Petabyte Scale Data Warehouse using Hive and Hadoop

Presented by Ashish Thusoo and Namit Jain on Feb 21, 2010. Length: 00:58:26
Sections: Architecture & Design, Enterprise Architecture
Topics: Data Warehousing, Performance & Scalability, Architecture
Tags: Data Analysis, QCon, QCon San Francisco 2009, Hadoop
Summary
Ashish Thusoo and Namit Jain explain how Facebook handles 12 TB of new compressed data every day with Hive’s help. Hive is an open source data warehousing framework built on Hadoop that lets developers analyze large datasets using SQL.

Bio
Ashish Thusoo currently manages the Facebook data infrastructure team. He is the project leader of Hive at Apache and a member of the Hadoop PMC. Namit Jain is a member of Facebook’s data infrastructure group and a committer for Hive. He previously worked for over 10 years at Oracle on streaming technologies, XML, replication, and queuing.

About the conference
QCon is a conference that is organized by the community, for the community. The result is a high-quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community. QCon is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.
  1. Cell phones

    by Brian Edwards

    I wish people could turn cell phones off for 45 minutes.

  2. Re: Cell phones

    by Gresham Paul

    A great strategy for that is a charity box and a polite notice telling people you'll ask them to donate $10 if their phone rings. Once you've disturbed 100 other delegates I've never known anyone to not pay. It tends to work well for us.

  3. Re: Cell phones

    by Hemlata Kalsha

    I completely agree the mobile sound is very disturbing

  4. Re: Cell phones

    by Archy Ty

    The cell phone interference makes the presentation looks non-informative. It destruct... Maybe every conference, they need to put a cellphone jammer... if that's fine...

  5. use a cell phone jammer

    by Patti Hych

    really boring with the noises of phones, just go and buy a cell phone jammer, it is cool, used it for a long time!!
    www.jammerall.com/
