Big Data Content on InfoQ
-
Spring and Big Data
Thomas Risberg discusses developing big data pipelines with Spring, focusing on the code needed, and covers how to set up a test environment both locally and in the cloud.
-
Uses of Big Data by a Non-Profit Engaged in Conducting Events Funded in Part by Third Party Sponsors
Thomas Grilk discusses how a non-profit can efficiently use data from customers/athletes in its marketing and sponsorship activities while respecting the privacy and confidentiality of its customers.
-
TensorFlow: A Flexible, Scalable & Portable System
Rajat Monga talks about why Google built TensorFlow, an open source software library for numerical computation using data flow graphs, and about some of the technical challenges in building it.
-
Visual Rules of the Road for Big Data Practitioners
David Fisher discusses via example how to build a data navigation language into visualizations, providing an intuitive user experience via the mechanism of subtle visual cuing.
-
Validation Methodology of Large Unstructured Unsupervised Learning Systems
Lawrence Chernin describes best practices and validation methods used to deal with large unstructured data, including a suite of unit tests covering the implementations of algorithmic equations.
-
How Predictive Analytics Boosts the Customer Experience at the Georgia Aquarium
Beach Clark talks about the technological and cultural challenges of turning data science into a vital part of the business model at Georgia Aquarium.
-
Overview of Artificial Intelligence and Its Use in Analyzing “Voice of Cancer Patients”
Alok Aggarwal gives an overview of Artificial Intelligence and discusses a use case, “Voice of Cancer Patients,” that uses ML and NLP algorithms to analyze unstructured text written by cancer patients.
-
Solving Business Problems with Data Science
The panelists discuss how Data Science can help solve various problems for business.
-
Hydrator: Open Source, Code-Free Data Pipelines
Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming and Tigon.
-
Developing a Machine Learning Based Predictive Analytics Engine for Big Data Analytics
Ali Jalali explains how to develop a machine learning predictive analytics engine for big data analytics.
-
Exploring Wikipedia with Apache Spark: A Live Coding Demo
Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.
-
Applying Big Data
Graeme Seaton discusses the drivers behind Big Data initiatives and how to approach them using the vast amounts of data available.