Big Data Content on InfoQ

Presentations

AI, ML & Data Engineering

Big-Data Analytics Misconceptions

Irad Ben-Gal discusses Big Data analytics misconceptions, presenting a technology predicting consumer behavior patterns that can be translated into wins, revenue gains, and localized assortments.

Irad Ben-Gal
on May 03, 2016

35:44
AI, ML & Data Engineering

How Comcast Uses Data Science and ML to Improve the Customer Experience

Jan Neumann presents how Comcast uses machine learning and big data processing for user search, capacity planning, and predictive caching.

Jan Neumann
on May 01, 2016

39:15
AI, ML & Data Engineering

The Mechanics of Testing Large Data Pipelines

Mathieu Bastian explores the mechanics of unit, integration, data and performance testing for large, complex data workflows, along with the tools for Hadoop, Pig and Spark.

Mathieu Bastian
on Apr 24, 2016

36:19
AI, ML & Data Engineering

Stream Processing with Apache Flink

Robert Metzger provides an overview of the Apache Flink internals and its streaming-first philosophy, as well as the programming APIs.

Robert Metzger
on Apr 07, 2016

38:27
AI, ML & Data Engineering

Rethinking Streaming Analytics for Scale

Helena Edelson addresses new architectures emerging for large-scale streaming analytics, based on Spark, Mesos, Akka, Cassandra and Kafka (SMACK), Apache Flink, or GearPump.

Helena Edelson
on Apr 03, 2016

43:44
AI, ML & Data Engineering

Developing Real-time Data Pipelines with Apache Kafka

Joe Stein introduces developers to why and how to use Apache Kafka, a publish-subscribe messaging system rethought as a distributed commit log.

Joe Stein
on Mar 04, 2016

01:30:26
AI, ML & Data Engineering

Apache Spark for Big Data Processing

Ilayaperumal Gopinathan and Ludwine Probst discuss Spark and its ecosystem, in particular Spark Streaming and MLlib, providing a concrete example, and showing how to use Spark with Spring XD.

Ilayaperumal Gopinathan, Ludwine Probst
on Feb 14, 2016

01:24:27
AI, ML & Data Engineering

The Lego Model for Machine Learning Pipelines

Leah McGuire describes the machine learning platform Salesforce wrote on top of Spark to modularize data cleaning and feature engineering.

Leah McGuire
on Jan 16, 2016

49:07
Java

Tuning Java for Big Data

Scott Seighman discusses the causes of common performance issues in Big Data environments, tuning guidelines for heap size, garbage collection and JVM reuse, and Big Data performance analysis tools.

Scott Seighman
on Oct 28, 2015

54:52
AI, ML & Data Engineering

Ground-up Introduction to In-memory Data

Viktor Gamov covers in-memory technology, distributed data topologies, making in-memory data reliable, scalable and durable, when to use NoSQL, and techniques for big in-memory data.

Viktor Gamov
on Oct 10, 2015

44:53
AI, ML & Data Engineering

Pulsar: Real-time Analytics at Scale

Sharad Murthy & Tony Ng present Pulsar, a real-time streaming system which can scale to millions of events per second with high availability and 4GL language support.

Tony Ng, Sharad Murthy
on Sep 13, 2015

44:41
Development

Exploratory Data Analysis with R

Matthew Renze introduces the R programming language and demonstrates how R can be used for exploratory data analysis.

Matthew Renze
on Sep 13, 2015

48:35
