Optimizing for Big Data at Facebook

Posted by Ashish Thusoo on  Apr 17, 2012

Hive co-creator Ashish Thusoo describes the Big Data challenges Facebook faced and presents solutions in two areas: reduction in the data footprint and CPU utilization. With Facebook generating 300 to 400 terabytes per day, data is stored in RC files, which are divided into blocks, with the data inside each block laid out by column to get better compression. He also talks about the current Big Data ecosystem and trends for companies going forward.
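
The block-then-column layout mentioned in this summary can be illustrated with a small sketch. It is not taken from the talk: the sample records, the block size, and the helper names are hypothetical, and simple run-length encoding stands in for whatever codec is actually used. The point is only that values from the same column tend to repeat, so storing them next to each other inside a block gives a compressor longer runs to work with.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def row_major(block):
    """Flatten a block row by row: r1c1, r1c2, ..., r2c1, r2c2, ..."""
    return [value for row in block for value in row]


def column_major(block):
    """Flatten a block column by column: r1c1, r2c1, ..., r1c2, r2c2, ..."""
    return [value for column in zip(*block) for value in column]


def split_into_blocks(rows, block_size):
    """Cut the table into horizontal blocks of at most block_size rows."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def encoded_runs(rows, block_size, layout):
    """Total number of RLE pairs needed when each block uses the given layout."""
    return sum(
        len(run_length_encode(layout(block)))
        for block in split_into_blocks(rows, block_size)
    )


# Hypothetical log records: (country, event_type, status).
rows = [("US", "click", "ok")] * 4 + [("US", "view", "ok")] * 4

print("row-major runs:   ", encoded_runs(rows, 4, row_major))
print("column-major runs:", encoded_runs(rows, 4, column_major))
```

With this toy data the column-by-column layout needs far fewer run-length pairs than the row-by-row layout, because each column inside a block holds a single repeated value. Real data is less uniform, so the gain in practice depends on how repetitive each column is.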

All things Hadoop

Posted by Ted Dunning on  Feb 02, 2012

In this interview, Ted Dunning talks about Hadoop, its current usage, and its future. He explains the reasons for Hadoop's success and makes recommendations on how to start using it.

Costin Leau on Spring Data, Spring Hadoop and Data Grid Patterns

Posted by Costin Leau on  Nov 23, 2011

In this interview, recorded at the JavaOne 2011 conference, Spring Hadoop project lead Costin Leau talks about the current state and upcoming features of the Spring Data and Spring Hadoop projects. He also talks about the Caching and Data Grid architecture patterns.

Jonas Bonér and Kresten Krab Thorup on Bringing Erlang's Fault Tolerance and Distribution to Java with Akka and Erjang

Posted by Jonas Bonér and Kresten Krab Thorup on  Oct 20, 2011

Jonas Bonér and Kresten Krab Thorup discuss some key aspects of Erlang like fault tolerance and reliability and how the Akka and Erjang projects try to bring them to the JVM.

Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco

Posted by Ville Tuulos on  Jun 24, 2011

Ville Tuulos talks about Disco, the Map/Reduce framework for Python and Erlang, real-world data mining with Python, the advantages of Erlang for distributed and fault tolerant software, and more.

Francesco Cesarini and Simon Thompson on Erlang

Posted by Francesco Cesarini and Simon Thompson on  May 26, 2011

Francesco Cesarini and Simon Thompson discuss how Erlang's design allows fault tolerance and resilience, modular error handling, details of the actor model implementation and distributed programming.

ECMAScript 5, Caja and Retrofitting Security, with Mark S. Miller

Posted by Mark S. Miller on  Feb 25, 2011

Mark S. Miller talks about the security considerations of JavaScript and how they are dealt with in ECMAScript 5 and the Caja project. He also mentions issues that have to do with HTML5 and compares the security characteristics of other languages like Java and Scheme.

Ron Bodkin on Big Data and Analytics

Posted by Ron Bodkin on  Jan 27, 2011

Ron Bodkin discusses big data architecture, real-time analytics, batch processing, map-reduce, and data science.

What’s Next for jclouds?

Posted by Adrian Cole on  Dec 23, 2010

Adrian Cole discusses his jclouds project, an open source library that helps Java developers get started in the cloud and reuse their Java development skills. Cole also talks about some of the challenges of creating a cloud-agnostic library, such as the use of different hypervisors and the fact that various cloud implementations are written in different languages, such as VB, Python, and Ruby.

Ralph Johnson, Joe Armstrong on the Future of Parallel Programming

Posted by Ralph Johnson, Joe Armstrong on  Jul 21, 2010

Ralph Johnson and Joe Armstrong discuss their ideas about parallel programming - whether shared memory is harmful, the place of message passing, fault tolerance, the importance of protocols and more.

Stefan Tilkov Talks REST, Web Services and More

Posted by Stefan Tilkov on  May 28, 2010

Stefan Tilkov discusses REST (Representational State Transfer) and RESTful web services based upon work he has done for clients of his consultancy. Stefan talks about the shortcomings of the WS-* specs and says he sees little need for WS-* web services any more. Stefan also talks about how web development frameworks are beginning to map to the RESTful model, and the concept of REST and security.

Billy Newport Discusses Parallel Programming in Java

Posted by Billy Newport on  Apr 16, 2010

Billy Newport talks to InfoQ about the need for higher level abstraction to do parallel programming with multi-core systems effectively. The interview explores some approaches taken with MapReduce products such as Cascading and Pig for a Hadoop cluster, explores the limitations of the actor model and message passing, and touches on IBM's WebSphere eXtreme Scale (ObjectGrid) product.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.