Optimizing for Big Data at Facebook

Posted by Ashish Thusoo on  Apr 17, 2012

Hive co-creator Ashish Thusoo describes the Big Data challenges Facebook faced and presents solutions in two areas: reducing the data footprint and reducing CPU utilization. With 300 to 400 terabytes generated per day, Facebook stores data in RC files, which are organized as blocks with the data laid out by column within each block for better compression. He also talks about the current Big Data ecosystem and trends for companies going forward.

Attila Szegedi on JVM and GC Performance Tuning at Twitter

Posted by Attila Szegedi on  Feb 09, 2012

Attila Szegedi talks about performance tuning Java and Scala programs at Twitter: how to approach GC problems, the importance of asynchronous I/O, when to use MySQL/Cassandra/Redis, and much more.

Hadoop and NoSQL in a Big Data Environment

Posted by Ron Bodkin on  Feb 03, 2012

Ron Bodkin of Big Data Analytics discusses early adoption of Hadoop, NoSQL and big data technologies. He covers common patterns and explains how developers can write low-level primitives to optimize MapReduce functions. Other topics include Hive, Pig, multi-tenancy, and security.

Gil Tene Discusses Garbage Collection, the OpenJDK and the JCP

Posted by Gil Tene on  Jan 18, 2012

Gil Tene talks to Charles Humble about different garbage collection techniques, and specific collectors including Azul's C4, IBM's Balanced GC, and Oracle's Garbage First, before moving on to discuss both the JCP and OpenJDK.

Hardware friendly, high performance Java-Applications

Posted by Martin Thompson & David Farley on  Jan 05, 2012

Martin Thompson and David Farley discuss how to use the scientific method to create high-performance systems by measuring performance and adapting the implementation to approach the limits of current hardware. The Disruptor architecture is an open-source result of their work on low-latency, high-throughput systems for the retail trading platform of LMAX Ltd.

Costin Leau on Spring Data, Spring Hadoop and Data Grid Patterns

Posted by Costin Leau on  Nov 23, 2011

In this interview recorded at the JavaOne 2011 conference, Spring Hadoop project lead Costin Leau talks about the current state and upcoming features of the Spring Data and Spring Hadoop projects. He also talks about the Caching and Data Grid architecture patterns.

Steve Vinoski and Bob Ippolito on Async I/O in Python and Node.js, Web Development in Erlang

Posted by Steve Vinoski, Bob Ippolito on  Oct 10, 2011

Steve Vinoski and Bob Ippolito discuss web development with MochiWeb and Yaws and extending Erlang with native code. Also: async I/O in Python and Node.js vs Erlang.

Jonas Bonér on Akka, Actors and Shared State, STM, Typesafe

Posted by Jonas Bonér on  Sep 02, 2011

Jonas Bonér explains the Akka project and the types of actors it offers as well as its transactional features. Also: a preview of how Akka 2.0 changes the management of (remote) actors.

Orion Henry on Heroku, Doozer and Paxos, Ruby

Posted by Orion Henry on  Aug 17, 2011

Orion Henry explains what makes Heroku's PaaS tick, in particular the new extensible Cedar stack as well as Doozer, the implementation of the Paxos algorithm created at Heroku.

Ari Zilka on RAM is the New Disk & BigMemory

Posted by Ari Zilka on  Aug 16, 2011

Terracotta creator Ari Zilka talks about the idea that RAM is the new disk and argues for scaling up before scaling out, comparing the architectural approaches of many JVMs with small heaps vs. a few JVMs with very large heaps. Ari introduces BigMemory, a Java add-on to Enterprise Ehcache, which allows app designs with huge amounts of memory accessible in-process, with minimal garbage collection.

Aaron Patterson on Rails 3.1 and Ruby Performance

Posted by Aaron Patterson on  Aug 09, 2011

Aaron Patterson talks about performance in Ruby and Rails, some of the challenges Rails and Rack pose for the Ruby GC, and much more.

Justin Sheehy and Damien Katz on Riak and CouchDB

Posted by Justin Sheehy and Damien Katz on  Aug 05, 2011

Justin Sheehy and Damien Katz discuss Riak and CouchDB, the strengths and trade-offs of different approaches to NoSQL, and why both databases are written in Erlang.
