-
Data Science in the Cloud @StitchFix
Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.
-
Scaling the Data Infrastructure @Spotify
Mārtiņš Kalvāns and Matti Pehrs give an overview of the data infrastructure at Spotify, diving into some of its components, such as Event Delivery, Datamon and Styx.
-
Data Microservices in the Cloud
Mark Pollack introduces Spring Cloud Data Flow, which enables the creation of pipelines for data ingestion, real-time analytics and data import/export, and demos apps deployed onto multiple runtimes.
-
Targeting Your Audience: Data Visualization to Communicate Data Insights
Randy Krum explains how to use the power of data visualization to convey actionable insights to an audience, making data clear and memorable by showing the audience what the data means.
-
Ingest & Stream Processing - What Will You Choose?
Pat Patterson and Ted Malaska talk about current and emerging data processing technologies, and the various ways of achieving "at least once" and "exactly once" timely data processing.
-
Journey from Data Integration to Data Science
Michael Wise discusses the journey from having data integrated across an organization, to employing data science to make good use of it.
-
Data Driven Action: A Primer on Data Science
S Aerni, S Ramanujam and J Vawdrey present approaches and open source tools for wrangling and modeling massive datasets, scaling Java applications for NLP on MPP through PL/Java and much more.
-
Data Structure Adventures
Joseph Blomstedt presents ongoing work to build a new set of high performance data structures for Erlang, including both single process data structures as well as various concurrent data structures.
-
Making Distributed Data Persistent Services Elastic (Without Losing All Your Data)
Joe Stein introduces Mesos and managing data services on it, presenting use cases for replacing classic solutions (like cold storage) with new functionality based on this technology.
-
Design vs. Data: Enemies or Friends?
Big Design Upfront was considered so evil in the early days of Agile that it acquired its own acronym. It’s time we relearned that great products start with asking the right questions.
-
Responding Rapidly When You Have 100GB+ Data Sets in Java
Peter Lawrey discusses data-driven reactive systems, profiling latency distribution in such an environment, finding rare bugs, implementing resilience and monitoring.
-
Evolving a Data System
Simon Metson addresses the problem of evolving a data system, covering patterns and anti-patterns both technical (polyglot systems, lambda architectures) and organisational (data silos, lava layers).