35:44

Big-Data Analytics Misconceptions

Posted by Irad Ben-Gal  on  May 03, 2016

Irad Ben-Gal discusses Big Data analytics misconceptions, presenting a technology predicting consumer behavior patterns that can be translated into wins, revenue gains, and localized assortments.

39:15

How Comcast Uses Data Science and ML to Improve the Customer Experience

Posted by Jan Neumann  on  May 01, 2016

Jan Neumann presents how Comcast uses machine learning and big data processing for user search, capacity planning, and predictive caching.

36:19

The Mechanics of Testing Large Data Pipelines

Posted by Mathieu Bastian  on  Apr 24, 2016

Mathieu Bastian explores the mechanics of unit, integration, data and performance testing for large, complex data workflows, along with the tools for Hadoop, Pig and Spark.

38:27

Stream Processing with Apache Flink

Posted by Robert Metzger  on  Apr 07, 2016

Robert Metzger provides an overview of the Apache Flink internals and its streaming-first philosophy, as well as the programming APIs.

43:44

Rethinking Streaming Analytics for Scale

Posted by Helena Edelson  on  Apr 03, 2016

Helena Edelson addresses new architectures emerging for large-scale streaming analytics, based on Spark, Mesos, Akka, Cassandra and Kafka (SMACK), Apache Flink, or GearPump.

01:30:26

Developing Real-time Data Pipelines with Apache Kafka

Posted by Joe Stein  on  Mar 04, 2016

Joe Stein introduces developers to why and how to use Apache Kafka, a publish-subscribe messaging system rethought as a distributed commit log.

01:24:27

Apache Spark for Big Data Processing

Posted by Ilayaperumal Gopinathan, Ludwine Probst  on  Feb 14, 2016

Ilayaperumal Gopinathan and Ludwine Probst discuss Spark and its ecosystem, in particular Spark Streaming and MLlib, providing a concrete example, and showing how to use Spark with Spring XD.

49:07

The Lego Model for Machine Learning Pipelines

Posted by Leah McGuire  on  Jan 16, 2016

Leah McGuire describes the machine learning platform Salesforce wrote on top of Spark to modularize data cleaning and feature engineering.

54:52

Tuning Java for Big Data

Posted by Scott Seighman  on  Oct 28, 2015

Scott Seighman discusses the causes of common performance issues in Big Data environments, tuning guidelines for heap size, garbage collection and JVM reuse, and Big Data performance analysis tools.

44:53

Ground-up Introduction to In-memory Data

Posted by Viktor Gamov  on  Oct 10, 2015

Viktor Gamov covers in-memory technology, distributed data topologies, making in-memory data reliable, scalable and durable, when to use NoSQL, and techniques for big in-memory data.

44:41

Pulsar: Real-time Analytics at Scale

Posted by Sharad Murthy, Tony Ng  on  Sep 13, 2015

Sharad Murthy and Tony Ng present Pulsar, a real-time streaming system that can scale to millions of events per second with high availability and 4GL support.

48:35

Exploratory Data Analysis with R

Posted by Matthew Renze  on  Sep 13, 2015

Matthew Renze introduces the R programming language and demonstrates how R can be used for exploratory data analysis.

InfoQ.com and all content copyright © 2006-2016 C4Media Inc.