Change Data Capture Content on InfoQ
News
Uber's CacheFront: Powering 40M Reads per Second with Significantly Reduced Latency
Uber developed CacheFront, a caching solution for its in-house distributed database, Docstore. CacheFront serves over 40M reads per second from online storage and reduces P75 latency by 75% and P99.9 latency by over 67%.
-
Netflix Creates Incremental Processing Solution Using Maestro and Apache Iceberg
Netflix created a new solution for incremental processing in its data platform. The incremental approach reduces the cost of computing resources and execution time significantly as it avoids processing complete datasets. The company used its Maestro workflow engine and Apache Iceberg to improve data freshness and accuracy and plans to provide managed backfill capabilities.
-
Distributed Materialized Views: How Airbnb’s Riverbed Processes 2.4 Billion Daily Events
Airbnb created Riverbed, a Lambda-like data framework for producing and managing distributed materialized views. The framework supports over 50 read-heavy use cases where data is sourced from multiple data sources within the company’s service-oriented architecture (SOA) platform. It uses Apache Kafka and Apache Spark for online and offline components, respectively.
-
Yelp Rebuilds Corrupted Cassandra Cluster Using Its Data Streaming Architecture
Yelp created a solution to sanitize data from a corrupted Apache Cassandra cluster using its data streaming architecture. The team explored many potential options to address the data corruption issue but ultimately had to move the data into a new cluster, removing the corrupted records in the process.
-
Debezium Releases Version 2.0 of Its Change Data Capture Tool
Debezium, an open-source distributed platform for change data capture (CDC), converts records from existing databases into event streams, enabling applications to detect and respond to database row-level changes. This release of version 2.0 introduces many changes: Java 11 is now required; incremental snapshots are improved [...]
-
Netflix Studio Search: Using Elasticsearch and Apache Flink to Index Federated GraphQL Data
Netflix engineers recently published how they built Studio Search, using Apache Kafka streams, an Apache Flink-based Data Mesh process, and Elasticsearch to manage the index. They designed the platform to take a portion of Netflix's federated GraphQL graph and make it searchable. Today, Studio Search powers a significant portion of the user experience for many applications within the organisation.
-
Uber Re-Architected Its Foundational Fulfilment Service
Uber recently shared how it re-architected its fulfilment service, one of Uber's foundational platform services. Following a two-year-long effort involving 30+ teams and hundreds of developers, Uber engineers "built a strong foundation for modelling various types of physical fulfilment categories in the new platform and migrated all existing transportation use cases."