40:48

Data Science in the Cloud @StitchFix

Posted by Stefan Krawczyk on Feb 17, 2017

Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.

43:06

Elastic Data Analytics Platform @Datadog

Posted by Doug Daniels on Feb 17, 2017

Doug Daniels discusses the cloud-based platform built at Datadog, how it differs from a traditional datacenter-based analytics stack, its pros and cons, and the tooling built around it.

45:26

Petabytes Scale Analytics Infrastructure @Netflix

Posted by Tom Gianos and Dan Weeks on Feb 15, 2017

Tom Gianos and Dan Weeks discuss Netflix's overall big data platform architecture, focusing on storage and orchestration, and how they use Parquet on AWS S3 as their data warehouse storage layer.

01:02:53

Big Data in the Real World: Technology and Use Cases

Posted by Mike Olson on Feb 09, 2017

Mike Olson presents several use cases where big data is collected and analyzed to gather insights from the automotive, insurance, financial, and other sectors.

38:49

Using Bayesian Optimization to Tune Machine Learning Models

Posted by Scott Clark on Feb 07, 2017

Scott Clark introduces Bayesian Global Optimization as an efficient way to optimize ML model parameters, explaining the underlying techniques and comparing it to other standard methods.

32:49

Machine Learning and End-to-End Data Analysis Processes in Spark Using Python and R

Posted by Debraj GuhaThakurta on Feb 05, 2017

Debraj GuhaThakurta discusses ML and data analysis processes in Spark using examples written in Python and R.

33:53

Streaming Live Data and the Hadoop Ecosystem

Posted by Oleg Zhurakousky on Jan 29, 2017

Oleg Zhurakousky discusses the Hadoop ecosystem (Hadoop, HDFS, YARN) and how projects such as Hive, Atlas, and NiFi interact and integrate to support the variety of data used for analytics.

27:42

Spring Data Hazelcast: Fluently Accessing Distributed Repositories

Posted by Victor Gamov and Neil Stevenson on Jan 26, 2017

Victor Gamov and Neil Stevenson present using Spring Data for a Hazelcast project, built on the KeyValue module and providing infrastructure components for creating repository abstractions.

50:21

Scaling Counting Infrastructure @Quora

Posted by Chun-Ho Hung and Nikhil Garg on Jan 22, 2017

Chun-Ho Hung and Nikhil Garg discuss Quanta, Quora's counting system powering their high-volume near-real-time analytics, describing the architecture, design goals, constraints, and choices made.

50:44

Java (SE) State of the Union

Posted by Gil Tene on Jan 17, 2017

Gil Tene presents the current state of Java SE and OpenJDK, the role of Java in big data and infrastructure components, the JCP, the ecosystem, trends, etc.

01:10:46

Cloud Native Streaming and Event-driven Microservices

Posted by Marius Bogoevici on Jan 14, 2017

Marius Bogoevici demonstrates how to create complex data processing pipelines that bridge big data and enterprise integration, and how to orchestrate them with Spring Cloud Data Flow.

55:24

Spring and Big Data

Posted by Thomas Risberg on Jan 08, 2017

Thomas Risberg discusses developing big data pipelines with Spring, focusing on the code needed, and covers how to set up a test environment both locally and in the cloud.
