Data Content on InfoQ
-
Improve Feature Freshness in Large Scale ML Data Processing
Zhongliang Liang covers the impact of feature freshness on model performance, discussing various strategies and techniques that can be used to improve feature freshness.
-
The Rise of the Serverless Data Architectures
Gwen Shapira explores the implications of serverless workloads on the design of data stores, and the evolution of data architectures toward more flexible scalability.
-
Building High-Fidelity Data Streams
Sid Anand discusses how they built a lossless streaming data system that guarantees sub-second (p95) event delivery at scale with better-than-three-nines availability.
-
What is Derived Data? (and Do You Already Have Any?)
Felix GV explains what derived data is and dives into four major use cases that fit in the derived data bucket: graphs, search, OLAP, and ML feature storage.
-
Real-Time Machine Learning: Architecture and Challenges
Chip Huyen discusses the value of fresh data as well as different types of architecture and challenges of online prediction.
-
Taming the Data Mess, How Not to Be Overwhelmed by the Data Landscape
Ismaël Mejía reviews the current data landscape and discusses both technical and organizational ideas to avoid being overwhelmed by the current lack of consolidation in the data engineering world.
-
Data Versioning at Scale: Chaos and Chaos Management
Einat Orr discusses several technologies that version large data sets, the use cases they support and the technology developed to best support those use cases.
-
Modern Data Pipelines in AdTech—Life in the Trenches
Roksolana Diachuk discusses how to use modern data pipelines for reporting and analytics as well as the case of historical data reprocessing in AdTech.
-
Protecting User Data via Extensions on Metadata Management Tooling
Alyssa Ransbury gives an overview of the current state of metadata management tooling and details how Square implemented security on its data.
-
Building & Operating High-Fidelity Data Streams
Sid Anand discusses building high-fidelity nearline data streams as a service within a lean team.
-
Data-driven Development in the Automotive Field
Toshika Srivastava offers insight into how teams in the automotive field develop products with data and the challenges they face.
-
Data Mesh Paradigm Shift in Data Platform Architecture
Zhamak Dehghani introduces Data Mesh, a next-generation data platform that shifts to a paradigm drawing from modern distributed architecture.