Apache Kafka Content on InfoQ
-
Uber Implements Disaster Recovery for Multi-Region Kafka
In a recent blog post, Uber engineers highlight how they use a replication platform to implement disaster recovery at scale with a multi-region Kafka deployment. Uber has a large deployment of Apache Kafka, processing trillions of messages and multiple petabytes of data per day. With the multi-region setup, Uber's engineers provide business resilience and continuity in the face of natural and human-made disasters.
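The general idea of region failover can be sketched with a standard Kafka consumer whose bootstrap servers are selected per region. This is only an illustration, not Uber's replication platform; the broker addresses, topic, and group name below are hypothetical.

```python
# Minimal sketch of a region-aware consumer: not Uber's replication platform,
# just an illustration of pointing the same consumer group at a primary or
# failover region. All broker addresses and names are hypothetical.
import os
from confluent_kafka import Consumer

REGION_BOOTSTRAP = {
    "us-east": "kafka-us-east.example.com:9092",
    "us-west": "kafka-us-west.example.com:9092",  # failover region
}

region = os.environ.get("ACTIVE_REGION", "us-east")

consumer = Consumer({
    "bootstrap.servers": REGION_BOOTSTRAP[region],
    "group.id": "rider-events-processor",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["rider-events"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        # Process the event; translating consumer offsets across regions is the
        # hard part a replication platform has to solve and is not shown here.
        print(msg.topic(), msg.partition(), msg.offset())
finally:
    consumer.close()
```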
-
LinkedIn Migrates away from Lambda Architecture to Reduce Complexity
Software engineers from LinkedIn recently described how they migrated away from a Lambda architecture. The Lambda architecture added operational overhead and complexity to their solution, leading to slow product iteration times. As a result, the engineers chose to migrate to a Lambda-less architecture, resulting in significant development velocity improvements.
-
AWS Releases Amazon Timestream into General Availability
AWS recently announced the general availability of Amazon Timestream, a serverless purpose-built database that exposes time-series data through SQL. With Amazon Timestream, customers can save time and costs in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost-optimized storage tier based on user-defined policies.
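As a rough illustration of the retention tiers and SQL access (not taken from the announcement), the boto3 sketch below creates a table that keeps recent data in the memory store and ages it into the cost-optimized magnetic store, then runs a query; the database and table names and retention periods are hypothetical.

```python
# Sketch: define memory-store vs. magnetic-store retention for a Timestream table,
# then query it with SQL. Database/table names and periods are hypothetical, and
# the database is assumed to already exist.
import boto3

write_client = boto3.client("timestream-write")
write_client.create_table(
    DatabaseName="iot_metrics",
    TableName="sensor_readings",
    RetentionProperties={
        "MemoryStoreRetentionPeriodInHours": 24,    # recent data stays in memory
        "MagneticStoreRetentionPeriodInDays": 365,  # history moves to cheaper storage
    },
)

query_client = boto3.client("timestream-query")
result = query_client.query(
    QueryString="""
        SELECT measure_name, avg(measure_value::double) AS avg_value
        FROM "iot_metrics"."sensor_readings"
        WHERE time > ago(1h)
        GROUP BY measure_name
    """
)
print(result["Rows"])
```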
-
Infinite Storage & Retention for Apache Kafka in Confluent Cloud
Confluent, Inc. recently announced the Infinite Storage option for its standard and dedicated clusters. This offering is part of the Project Metamorphosis initiative, which is focused on bringing modern cloud properties to Kafka. With limitless storage and retention, organizations can maintain a centralized platform for all event data, supporting both real-time processing and historical analysis.
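In open-source Kafka terms, unlimited retention corresponds to setting a topic's retention to -1. The sketch below shows that with the Kafka AdminClient; the broker address and topic name are hypothetical, and this is not Confluent's provisioning workflow.

```python
# Sketch: create a topic whose data is never deleted by time or size,
# i.e. retention.ms = -1 and retention.bytes = -1. Names are hypothetical.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "broker.example.com:9092"})

topic = NewTopic(
    "orders-events",
    num_partitions=6,
    replication_factor=3,
    config={
        "retention.ms": "-1",      # no time-based deletion
        "retention.bytes": "-1",   # no size-based deletion
    },
)

for name, future in admin.create_topics([topic]).items():
    future.result()  # raises if topic creation failed
    print(f"Created topic {name} with unlimited retention")
```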
-
KSQL Now Available on Confluent Cloud
KSQL is the streaming SQL engine for Apache Kafka. In a recent blog post, Confluent announced the availability of Confluent Cloud KSQL, which makes KSQL available as a fully managed service on the Confluent Cloud platform for all customers on usage-based billing plans.
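A KSQL statement is SQL over Kafka topics; the sketch below submits one to a managed KSQL endpoint over its REST API. The endpoint URL, credentials, and stream definition are hypothetical.

```python
# Sketch: submit a KSQL statement to a managed KSQL endpoint via its REST API.
# Endpoint, API key/secret, and the stream definition are hypothetical.
import requests

KSQL_ENDPOINT = "https://ksql.example.confluent.cloud"  # hypothetical endpoint
API_KEY, API_SECRET = "my-key", "my-secret"

statement = """
    CREATE STREAM pageviews_enriched AS
      SELECT userid, pageid, viewtime
      FROM pageviews
      WHERE pageid IS NOT NULL;
"""

resp = requests.post(
    f"{KSQL_ENDPOINT}/ksql",
    auth=(API_KEY, API_SECRET),
    json={"ksql": statement, "streamsProperties": {}},
)
resp.raise_for_status()
print(resp.json())
```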
-
Confluent Offers Apache Kafka as a Service on the Azure Marketplace
In a recent blog post, Confluent announced the general availability of Confluent Cloud on Microsoft Azure. Confluent Cloud is a fully managed Apache Kafka service that frees engineers from the operational burden of managing Kafka.
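Connecting to a managed cluster works like connecting to any SASL_SSL-secured Kafka cluster; a minimal producer sketch follows, with a hypothetical bootstrap server, API key/secret, and topic.

```python
# Sketch: produce to a managed Kafka cluster over SASL_SSL with an API key/secret.
# The bootstrap server, credentials, and topic below are hypothetical.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "pkc-xxxxx.westeurope.azure.confluent.cloud:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

producer.produce("orders", key="order-42", value='{"amount": 19.99}')
producer.flush()  # block until delivery reports are received
```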
-
Experience Using Event Streams, Kafka and the Confluent Platform at Deutsche Bahn
To provide trip information to their rail passengers, Deutsche Bahn (DB) has created the RI-Plattform (Passenger Information Application) based on Apache Kafka and Kafka Streams, with a plan to feed all information channels through the system. In a blog post, Axel Löhn and Uwe Eisele describe the microservices-based design, how they build and run the system, and their experience from production.
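Kafka Streams itself is a Java library; to keep the examples here in Python, the sketch below shows the underlying consume-transform-produce pattern with plain Kafka clients. Topic names, broker address, and the enrichment logic are hypothetical and not DB's actual design.

```python
# Sketch of a consume-transform-produce loop (the pattern Kafka Streams automates).
# Topic names, broker address, and the 'enrichment' are hypothetical.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "broker.example.com:9092",
    "group.id": "ri-enricher",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "broker.example.com:9092"})
consumer.subscribe(["train-events"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        # Hypothetical enrichment: attach a human-readable delay message
        event["passenger_info"] = (
            f"Train {event.get('train_id')} delayed {event.get('delay_min', 0)} min"
        )
        producer.produce("passenger-information", value=json.dumps(event))
        producer.poll(0)  # serve delivery callbacks
finally:
    consumer.close()
    producer.flush()
```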
-
Confluent Offers Apache Kafka as a Service on the GCP Marketplace
In a recent blog post, Confluent announced the general availability of Confluent Cloud on the Google Cloud Platform (GCP) Marketplace. Confluent Cloud is a fully managed Apache Kafka service, which removes the burden on its users of managing Kafka themselves.
-
Oracle Expands Cloud Native Services, Adds Kafka Streaming, API Gateway and Logging Support
In a recent blog post, Oracle announced the limited availability of three new service offerings in its Oracle Cloud Native Services platform. The three new services include Kafka Compatibility for Oracle Streaming, an API Gateway for managing connectivity to serverless components and containers, and a Logging service that supports log management and analytics across resources and applications.
-
The Future of Data Engineering: Chris Riccomini at QCon San Francisco
At QCon San Francisco 2019, Chris Riccomini presented “The Future of Data Engineering”. The key takeaway of his talk is that the end goal of data engineering is a fully automated, decentralized data warehouse.
-
Delta – a Data Synchronization and Enrichment Platform by Netflix
Large systems often utilize numerous data stores. There is sometimes a need to keep some of these data stores in sync, and to enrich data in a store by calling external services. To address these needs, Netflix has created Delta, an eventually consistent, event-driven data synchronization and enrichment platform. In a blog post, the team behind Delta gives an overview of their design.
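The blog post describes the design at a high level; as a purely illustrative sketch (not Netflix's code), the pattern amounts to consuming change events, enriching them by calling an external service, and writing the result to a second store. All names and URLs below are hypothetical.

```python
# Purely illustrative sketch of the sync-and-enrich pattern described above:
# consume change events, enrich via an external service, write to another store.
# All topic names, URLs, and the secondary-store helper are hypothetical.
import json
import requests
from confluent_kafka import Consumer

def write_to_secondary_store(doc):
    # Hypothetical sink, e.g. an index or cache kept in sync with the source DB
    print("upsert:", doc["id"])

consumer = Consumer({
    "bootstrap.servers": "broker.example.com:9092",
    "group.id": "sync-enrich",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["movie-change-events"])  # hypothetical change-event topic

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    change = json.loads(msg.value())
    # Enrich the change event by calling an external (hypothetical) service
    details = requests.get(f"https://metadata.example.com/movies/{change['id']}").json()
    write_to_secondary_store({**change, **details})
```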
-
Jagadish Venkatraman on LinkedIn's Journey to Samza 1.0
At the recent ApacheCon North America, Jagadish Venkatraman spoke about how LinkedIn developed Apache Samza 1.0 to handle stream processing at scale. He described LinkedIn's use cases involving trillions of events and petabytes of data, then highlighted the features added for the 1.0 release, including stateful processing, high-level APIs, and a flexible deployment model.
-
Celia Kung on LinkedIn's Brooklin Data Streaming Service
Celia Kung from LinkedIn spoke at the QCon New York 2019 conference last week about Brooklin, a data streaming service that supports pluggable sources and destinations. These can be data stores or messaging systems, making the solution flexible and extensible. Brooklin is part of the streams infrastructure platform developed at LinkedIn.
-
Amazon Managed Kafka Aims to Simplify Kafka Streaming Setup and Use
Introduced as a public preview at AWS re:Invent 2018, Amazon Managed Streaming for Kafka (MSK) is now generally available. Amazon MSK aims to make it easy to build and run streaming applications based on Kafka.
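Operationally, an MSK cluster is used like any other Kafka cluster; the sketch below looks up the cluster's bootstrap brokers with boto3 and produces a message with a standard client. The cluster ARN and topic are hypothetical.

```python
# Sketch: look up an MSK cluster's bootstrap brokers, then use a standard Kafka client.
# The cluster ARN and topic name are hypothetical.
import boto3
from confluent_kafka import Producer

kafka_ctl = boto3.client("kafka")  # the MSK control-plane API
brokers = kafka_ctl.get_bootstrap_brokers(
    ClusterArn="arn:aws:kafka:us-east-1:123456789012:cluster/demo/abc-123"
)["BootstrapBrokerStringTls"]

producer = Producer({
    "bootstrap.servers": brokers,
    "security.protocol": "SSL",  # the TLS broker listeners are used above
})
producer.produce("clickstream", value='{"page": "/home"}')
producer.flush()
```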
-
Expo: Real Time A/B Testing and Monitoring with Spark Streaming and Kafka at Walmart Labs
The WalmartLabs engineering team developed a real-time A/B testing tool called Expo that collects and analyzes user engagement metrics. It uses Spark Structured Streaming to process the incoming data and stores the metrics in KairosDB.
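The general shape of such a pipeline (not Expo's actual code) is a Spark Structured Streaming job that reads engagement events from Kafka and computes windowed aggregates. The broker, topic, schema, and console sink below are hypothetical stand-ins, and the KairosDB sink is omitted.

```python
# Sketch of a Spark Structured Streaming job reading engagement events from Kafka
# and computing per-minute counts per experiment variant. Broker, topic, and schema
# are hypothetical; results go to the console instead of KairosDB.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("ab-test-metrics").getOrCreate()

schema = StructType([
    StructField("experiment", StringType()),
    StructField("variant", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker.example.com:9092")
    .option("subscribe", "engagement-events")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

counts = (
    events.withWatermark("event_time", "5 minutes")
    .groupBy(window(col("event_time"), "1 minute"), col("experiment"), col("variant"))
    .count()
)

query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```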