Hive co-creator Ashish Thusoo describes the Big Data challenges Facebook faced and presents solutions in two areas: reducing the data footprint and CPU utilization. With 300 to 400 terabytes generated per day, Facebook stores data in RC files, which are split into blocks, with the data inside each block stored by column to get better compression. He also talks about the current Big Data ecosystem and trends for companies going forward.
Bryan talks about the challenges of operating Node.js in real production environments and his experiences working with it at Joyent. He also talks about DTrace, SmartOS, and V8, and compares Node.js with other platforms.
Attila Szegedi talks about performance tuning Java and Scala programs at Twitter: how to approach GC problems, the importance of asynchronous I/O, when to use MySQL/Cassandra/Redis, and much more.
John Nolan shows the state of hardware acceleration with GPUs and FPGAs, why it's hard to write efficient code for them, and why to favor polymorphism over if statements for performance.
Martin Thompson and David Farley discuss how to use the scientific method to create high-performance systems by measuring performance and adapting the implementation to approach the limits of current hardware. The Disruptor architecture is an open-sourced result of their work on low-latency, high-throughput systems for the retail trading platform of LMAX Ltd.
InfoQ catches up with Manik Surtani to discuss JSR 347, data grids, and Infinispan. Manik also discusses the overlap with NoSQL and support for the Memcached and Hot Rod wire protocols.
In this interview, recorded at the JavaOne 2011 conference, Spring Hadoop project lead Costin Leau talks about the current state and upcoming features of the Spring Data and Spring Hadoop projects. He also talks about the Caching and Data Grid architecture patterns.
Jonas Bonér and Kresten Krab Thorup on Bringing Erlang's Fault Tolerance and Distribution to Java with Akka and Erjang
Jonas Bonér and Kresten Krab Thorup discuss some key aspects of Erlang like fault tolerance and reliability and how the Akka and Erjang projects try to bring them to the JVM.
Larva is a runtime monitoring system that uses AspectJ to weave monitoring into Java code and can check the correctness of the program using an FSM; Elarva is an Erlang version of the tool.
Terracotta creator Ari Zilka talks about the idea that RAM is the new disk and argues for scaling up before scaling out, comparing the architectural approaches of lots of VMs with small heaps vs. a few JVMs with very large heaps. Ari introduces BigMemory, a Java add-on to Enterprise Ehcache, which allows app designs with huge amounts of memory accessible in-process, with minimal garbage collection.
Cliff Click discusses the Pauseless GC algorithm and how Azul's Zing implements it on plain x86 CPUs. Also: what keeps dynamic languages slow on the JVM, invokedynamic, concurrency and much more.
Jon Brisbin discusses his experience with virtualization and the reasons why companies would use private clouds, e.g. regulatory compliance. Also: the future role of operations, monitoring, and more.