52:24

Big Data Platform as a Service at Netflix

Posted by Jeff Magnusson  on  Nov 18, 2013

Jeff Magnusson takes a deep dive into key services of Netflix’s “data platform as a service” architecture, including RESTful services that: provide comprehensive metadata management across data sources (Franklin); enable visualization and caching of results of Hadoop jobs (Sting); and visualize the execution plans produced by languages such as Pig and Hive (Lipstick).

39:26

Exercises in Style

Posted by Crista Lopes  on  Nov 13, 2013

Crista Lopes writes a program in multiple styles (monolithic/OOP/continuations/relational/Pub-Sub/Monads/AOP/Map-reduce), showing the value of using more than one style in large-scale systems.

42:04

Stream Processing: Philosophy, Concepts, and Technologies

Posted by Dan Frank  on  Nov 10, 2013

Dan Frank discusses stream data processing and introduces NSQ – Bitly’s open source queuing system – and other new technologies used for communication between streaming programs.

41:19

"Big Data" Agile Analytics

Posted by Ken Collier  on  Oct 27, 2013

Ken Collier discusses Agile Analytics, a combination of sophisticated analytics techniques, lean learning principles, agile delivery methods, and "big data" technologies.

53:38

High Speed Smart Data Ingest into Hadoop

Posted by Oleg Zhurakousky  on  Oct 24, 2013

Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time high speed data ingest into Hadoop.

45:11

Making the Internet a Better Place: Scaling AppNexus

Posted by Mike Nolet  on  Oct 18, 2013

Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system, which processes 30 TB/day: Hadoop, a load balancer-free DNS architecture built on GSLB and Keepalived, and real-time data streaming built in C.

36:45

Apache Drill - Interactive Query and Analysis at Scale

Posted by Michael Hausenblas  on  Oct 13, 2013

Michael Hausenblas introduces Apache Drill, a distributed system for interactive analysis of large-scale datasets, including its architecture and typical use cases.

28:12

A Guide to Python Frameworks for Hadoop

Posted by Uri Laserson  on  Oct 03, 2013

Uri Laserson reviews the different available Python frameworks for Hadoop, including a comparison of performance, ease of use/installation, differences in implementation, and other features.

37:10

Evolving Panorama of Data

Posted by Rebecca Parsons  on  Oct 02, 2013

Rebecca Parsons reviews some of the changes in how data is used and analyzed, including new technology approaches, looking at how data is used to track election violence, movement of people after a natural disaster, and attempts to predict famine and other humanitarian crises before they happen.

43:33

Leveraging Scriptable Infrastructures, Towards a Paradigm Shift in Software for Data Science

Posted by Karim Chine  on  Oct 02, 2013

Karim Chine introduces Elastic-R, demonstrating some of its applications in bioinformatics and finance.

51:42

Data Science of Love

Posted by Vaclav Petricek  on  Aug 17, 2013

Vaclav Petricek digs up some of the nuggets about romantic interactions hidden in eHarmony's large collection of human relationships.

35:50

Leveraging Your Hadoop Cluster Better - Running Performant Code at Scale

Posted by Michael Kopp  on  Aug 16, 2013

Michael Kopp explains how to run performant code at scale with Hadoop and how to analyze and optimize Hadoop jobs.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.