47:47

Stream Processing & Analytics with Flink @Uber

Posted by Danny Yuan on Mar 25, 2017

Danny Yuan discusses how Uber is building its next-generation stream processing system to support real-time analytics as well as complex event processing.

39:21

Demystifying DynamoDB Streams

Posted by Akshat Vig, Khawaja Shams on Mar 25, 2017

Akshat Vig and Khawaja Shams discuss DynamoDB Streams and what it takes to build an ordered, highly available, durable, performant, and scalable replicated log stream.

49:06

Building a Data Science Capability from Scratch

Posted by Victor Hu on Mar 23, 2017

Victor Hu covers the challenges, both technical and cultural, of building a data science team and capability in a large, global company.

49:31

Data Cleansing and Understanding Best Practices

Posted by Casey Stella on Mar 23, 2017

Casey Stella talks about discovering missing values, values with skewed distributions, and likely errors within data, as well as a novel approach to finding data interconnectedness.

44:45

SQL Server on Linux: Will it Perform or Not?

Posted by Slava Oks on Mar 22, 2017

Slava Oks talks about SQL Server’s history and high-level architecture, and dives into the core of the I/O Manager, Memory Manager, and Scheduler. Topics include lessons learned and behind-the-scenes experiences.

51:31

Practical Data Synchronization Using CRDTs

Posted by Dmitry Ivanov on Mar 10, 2017

Dmitry Ivanov discusses basic CRDT implementations in Scala, explaining the advantages of these data structures in solving many synchronization problems, as well as their limitations.

54:36

ScyllaDB: Achieving No-Compromise Performance

Posted by Avi Kivity on Mar 07, 2017

Avi Kivity discusses ScyllaDB and the many design decisions behind it, from the programming language and programming model through low-level details up to the advanced cache design.

40:48

Data Science in the Cloud @StitchFix

Posted by Stefan Krawczyk on Feb 17, 2017

Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.

43:06

Elastic Data Analytics Platform @Datadog

Posted by Doug Daniels on Feb 17, 2017

Doug Daniels discusses the cloud-based platform built at Datadog, how it differs from a traditional datacenter-based analytics stack, its pros and cons, and the tooling built around it.

45:26

Petabytes Scale Analytics Infrastructure @Netflix

Posted by Tom Gianos, Dan Weeks on Feb 15, 2017

Tom Gianos and Dan Weeks discuss Netflix's overall big data platform architecture, focusing on storage and orchestration, and how they use Parquet on AWS S3 as their data warehouse storage layer.

01:02:53

Big Data in the Real World: Technology and Use Cases

Posted by Mike Olson on Feb 09, 2017

Mike Olson presents several use cases where big data is collected and analyzed to gather insights from the automotive, insurance, financial, and other sectors.

38:49

Using Bayesian Optimization to Tune Machine Learning Models

Posted by Scott Clark on Feb 07, 2017

Scott Clark introduces Bayesian Global Optimization as an efficient way to optimize ML model parameters, explaining the underlying techniques and comparing it to other standard methods.
