Lawrence Chernin describes best practices and validation methods used to deal with large unstructured data, including a suite of unit tests covering the implementations of algorithmic equations.
Beach Clark talks about the technological and cultural challenges of turning data science into a vital part of the business model at Georgia Aquarium.
Alok Aggarwal gives an overview of Artificial Intelligence and discusses a use case, “Voice of Cancer Patients”, that uses ML and NLP algorithms to analyze unstructured text written by cancer patients.
The panelists discuss how Data Science can help solve various problems for business.
Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming and Tigon.
Ali Jalali presents how to develop a machine learning engine for predictive analytics on big data.
Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.
Graeme Seaton discusses the drivers behind Big Data initiatives and how to approach them using the vast amounts of data available.
Andrew Psaltis talks about Apache Beam, which aims to provide a unified stream processing model for defining and executing complex data processing, data ingestion and integration workflows.
Kriti Sharma talks about how Barclays is solving some of the toughest big data challenges in financial services using scalable, open source technology.
Tim Wagner defines serverless computing, examines the key trends and innovative ideas behind the technology, and looks at design patterns for big data, event processing, and mobile using AWS Lambda.
Pushpraj Shukla discusses how Microsoft Bing predicts the future based on aggregate human behavior using one of the largest-scale data sets, and covers recent progress in large-scale deep learning models.