Data Analytics Content on InfoQ
-
Pulsar: Real-time Analytics at Scale
Sharad Murthy & Tony Ng present Pulsar, a real-time streaming system that can scale to millions of events per second with high availability and 4GL support.
-
Interactive Analytics at Scale with Druid
Julien Lavigne du Cadet discusses how Criteo uses Druid: an open-source, real-time data store designed to power interactive applications at scale, covering Druid's architecture and internals.
-
Analyzing Social Networks with F#
Evelina Gabasova explains how to run a social network analysis on Twitter and how to use data science tools to find out more about followers.
-
Customer Insight, from Data to Information
Thore Thomassen shares, from experience, how to combine structured data in a data warehouse (DWH) with unstructured data in NoSQL, and how to use parallel data warehouse appliances to boost analytical capabilities.
-
Customer Analytics on Hadoop
Bob Kelly presents case studies on how Platfora uses Hadoop to do analytics for several of their customers.
-
Dashboarding: The Developers’ Role in Data Analysis
Seth Juarez shares insight on how to create applications that use dashboards to drive value, convert raw data into answers, and simplify business processes.
-
Scaling Chartbeat from 8 Million Open Browsers to Realtime Analytics and Optimization
Wesley Chow presents Chartbeat's real-time analytics platform and how it is able to handle requests in a cost-efficient manner using a custom-written analytics engine in C and Lua.
-
Introduction to Data Science
Bryan Nehl introduces data science: data formats, ETL tools, NoSQL databases, languages, libraries, techniques, and approaches for exploring data and extracting value from it.
-
Persistence: A View from Stratosphere
Stefan Edlich discusses big data systems (Spanner, Presto) and the future of data persistence, data analytics, data formats, and NoSQL/NewSQL in general.
-
Haskell in the Newsroom
Erik Hinton discusses the successes and failures of making a cultural shift in the newsroom at NYT to accept Haskell and some of the projects Haskell has been used for.
-
R for Big Data
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
-
Functional Programming for Optimization Problems with City of Palo Alto Open Data
Paco Nathan reviews an example data analysis application, written in Cascalog, used for a recommender system based on City of Palo Alto Open Data.