Big Data Content on InfoQ
-
Exploring Wikipedia with Apache Spark: A Live Coding Demo
Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.
-
Applying Big Data
Graeme Seaton discusses the drivers behind Big Data initiatives and how to approach them using the vast amounts of data available.
-
Apache Beam: The Case for Unifying Streaming APIs
Andrew Psaltis talks about Apache Beam, which aims to provide a unified stream processing model for defining and executing complex data processing, data ingestion and integration workflows.
-
Creating Customer-Centric Products Using Big Data
Kriti Sharma talks about how Barclays is solving some of the toughest big data challenges in financial services using scalable, open source technology.
-
Server-Less Design Patterns for the Enterprise with AWS Lambda
Tim Wagner defines server-less computing, examines the key trends and innovative ideas behind the technology, and looks at design patterns for big data, event processing, and mobile using AWS Lambda.
-
Predicting the Future: Surprising Revelations from Truly Big Data
Pushpraj Shukla discusses how Microsoft Bing predicts the future based on aggregate human behavior using one of the largest-scale data sets, and covers recent progress in large-scale deep learning models.
-
Netflix Keystone - How We Built a 700B/day Stream Processing Cloud Platform in a Year
Peter Bakas presents in detail how Netflix has used Kafka, Samza, Docker, and Linux to implement a multi-tenant pipeline processing 700B events/day in the Amazon AWS cloud.
-
Hunting Criminals with Hybrid Analytics
David Talby demos using Python libraries to build an ML model for fraud detection and scaling it up to billions of events using Spark, and discusses what it took to make the system performant and ready for production.
-
Resilient Predictive Data Pipelines
Sid Anand discusses how Agari is applying big data best practices to the problem of securing its customers from email-borne threats, presenting a system that leverages big data in the cloud.
-
Big-Data Analytics Misconceptions
Irad Ben-Gal discusses Big Data analytics misconceptions, presenting a technology predicting consumer behavior patterns that can be translated into wins, revenue gains, and localized assortments.
-
How Comcast Uses Data Science and ML to Improve the Customer Experience
Jan Neumann presents how Comcast uses machine learning and big data processing for user search, capacity planning, and predictive caching.
-
The Mechanics of Testing Large Data Pipelines
Mathieu Bastian explores the mechanics of unit, integration, data and performance testing for large, complex data workflows, along with the tools for Hadoop, Pig and Spark.