
Hydrator: Open Source, Code-Free Data Pipelines

Posted by Jonathan Gray on Oct 23, 2016

Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming and Tigon.


Developing a Machine Learning Based Predictive Analytics Engine for Big Data Analytics

Posted by Ali Jalali on Oct 16, 2016

Ali Jalali explains how to develop a machine learning based predictive analytics engine for big data analytics.


Exploring Wikipedia with Apache Spark: A Live Coding Demo

Posted by Sameer Farooqui on Aug 23, 2016

Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.


Applying Big Data

Posted by Graeme Seaton on Aug 07, 2016

Graeme Seaton discusses the drivers behind Big Data initiatives and how to approach them using the vast amounts of data available.


Apache Beam: The Case for Unifying Streaming APIs

Posted by Andrew Psaltis on Jul 30, 2016

Andrew Psaltis talks about Apache Beam, which aims to provide a unified stream processing model for defining and executing complex data processing, data ingestion and integration workflows.


Creating Customer-Centric Products Using Big Data

Posted by Kriti Sharma on Jul 15, 2016

Kriti Sharma talks about how Barclays is solving some of the toughest big data challenges in financial services using scalable, open source technology.


Server-Less Design Patterns for the Enterprise with AWS Lambda

Posted by Tim Wagner on Jul 08, 2016

Tim Wagner defines server-less computing, examines the key trends and innovative ideas behind the technology, and looks at design patterns for big data, event processing, and mobile using AWS Lambda.


Predicting the Future: Surprising Revelations from Truly Big Data

Posted by Pushpraj Shukla on May 24, 2016

Pushpraj Shukla discusses how Microsoft Bing predicts the future based on aggregate human behavior using one of the largest-scale data sets, and covers recent progress in large-scale deep learning models.


Netflix Keystone - How We Built a 700B/day Stream Processing Cloud Platform in a Year

Posted by Peter Bakas on May 19, 2016

Peter Bakas presents in detail how Netflix has used Kafka, Samza, Docker, and Linux to implement a multi-tenant pipeline processing 700B events/day in the AWS cloud.


Hunting Criminals with Hybrid Analytics

Posted by David Talby on May 10, 2016

David Talby demos using Python libraries to build an ML model for fraud detection and scaling it up to billions of events using Spark, and explains what it took to make the system performant and ready for production.


Resilient Predictive Data Pipelines

Posted by Sid Anand on May 06, 2016

Sid Anand discusses how Agari is applying big data best practices to the problem of securing its customers from email-borne threats, presenting a system that leverages big data in the cloud.


Big-Data Analytics Misconceptions

Posted by Irad Ben-Gal on May 03, 2016

Irad Ben-Gal discusses Big Data analytics misconceptions, presenting a technology predicting consumer behavior patterns that can be translated into wins, revenue gains, and localized assortments.
