38:03

Efficient Data Storage for Analytics with Parquet 2.0

Posted by Julien Le Dem  on  Mar 22, 2015

Julien Le Dem discusses the advantages of a columnar data layout, specifically the features and design choices Apache Parquet uses to achieve goals of interoperability, space and query efficiency.

01:28:53

GORM Inside and Out

Posted by Jeff Scott Brown  on  Mar 21, 2015

Jeff Scott Brown introduces GORM, a super powerful ORM tool that makes ORM simple by leveraging the flexibility and expressiveness of a dynamic language like Groovy.

33:44

Programming and Testing a Distributed Database

Posted by Reid Draper  on  Mar 20, 2015

Reid Draper shows how real-world distributed databases work, communicate, and are tested, trading RPC for messaging, unit tests for QuickCheck, and micro-benchmarks for multi-week stress tests.

01:23:00

Using a Graph Database for JVM Heap Analysis

Posted by James Richardson, Nat Pryce  on  Mar 19, 2015

James Richardson and Nat Pryce discuss some of the challenges they faced using Neo4J for interactive analysis of large data imports (80k nodes, 150k relationships) and how they overcame them.

01:06:43

Big Data in Memory

Posted by John Davies  on  Mar 14, 2015

John Davies shows a Spring workflow consuming 7.4kB XML messages, binding them to 25kB of Java but storing them in just 450 bytes each, keeping 10 million derivative contracts in memory on a laptop.

44:13

Gobblin: A Framework for Solving Big Data Ingestion Problem

Posted by Lin Qiao  on  Mar 12, 2015

Lin Qiao discusses the architecture of Gobblin, LinkedIn’s framework for addressing the need for high-quality, high-velocity data ingestion.

51:16

Remote Access Made Easy and Fast with Haskell

Posted by Simon Marlow  on  Mar 12, 2015

Simon Marlow explains how to use Haxl to automatically batch and overlap requests for data from multiple data sources.

35:16

Better Together - Using Spark and Redshift to Combine Your Data with Public Datasets

Posted by Eugene Mandel  on  Mar 12, 2015

Eugene Mandel discusses challenges of conforming data sources and compares processing stacks: Hadoop+Redshift vs Spark, showing how the technology drives the way the problem is modeled.

43:36

The New Features in MariaDB 10.0 and in the Upcoming MariaDB 10.1

Posted by Michael Widenius  on  Mar 11, 2015

Michael Widenius walks through the features of MariaDB 10.0 and 10.1, outlining the performance benefits resulting from switching to MariaDB.

01:29:34

Become a Data-driven Organization with Machine Learning

Posted by Peter Harrington  on  Mar 08, 2015

Peter Harrington explains what you do with machine learning and what the building blocks are for an application that uses machine learning, from collecting data to creating predictions for customers.

01:19:37

Apps + Data + Cloud: What Does It All Mean?

Posted by Matt Stine  on  Mar 08, 2015

Matt Stine shows how to combine Spring Boot, Spring Data, Spring Reactor, Spring XD, and Hadoop and run them in the cloud.

47:08

SQL Strikes Back! Recent Trends in Data Persistence and Analysis

Posted by Dean Wampler  on  Feb 24, 2015

Dean Wampler takes a look at SQL’s resurgence and at specific example technologies, including NewSQL, Hybrid SQL, SQL abstractions on top of file-based data, and SQL as a functional programming language.

InfoQ.com and all content copyright © 2006-2015 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.