Apache Kafka Content on InfoQ
-
Uber Builds Scalable Chat Using Microservices with GraphQL Subscriptions and Kafka
Uber replaced a legacy architecture built on the WAMP protocol with a new solution that takes advantage of GraphQL subscriptions. The main drivers for the new architecture were challenges around reliability, scalability, and observability/debuggability, as well as technical debt impeding the team’s ability to maintain the existing solution.
-
Java News Roundup: New OpenJDK JEPs, Spring Functions Catalog, Apache Kafka, Quarkus, JReleaser
This week's Java roundup for February 26th, 2024, features news highlighting: JEP 468, Derived Record Creation (Preview); JEP 467, Markdown Documentation Comments; a new Spring Functions Catalog; end-of-life planned for the Spring Framework 6.0 and 5.3 release trains; and point releases for Apache Kafka, Quarkus and JReleaser.
-
Grab Improves Kafka on Kubernetes Fault Tolerance with Strimzi, AWS AddOns and EBS
Grab updated its Kafka on Kubernetes setup to improve fault tolerance and completely eliminate human intervention in case of unexpected Kafka broker terminations. To address the shortcomings of the initial design, the team integrated with AWS Node Termination Handler (NTH), used the Load Balancer Controller for target group mapping, and switched to EBS volumes for storage.
-
.NET Aspire Preview 3: Expanded Component Support with Azure OpenAI, MySQL, CosmosDB, Kafka and More
Last week, Microsoft announced the availability of the third preview of .NET Aspire. Preview 3 brings changes including UI improvements to the dashboard and new component support for Azure OpenAI, Kafka, Oracle, MySQL, CosmosDB, Orleans, and more.
-
Pinterest Open-Sources a Production-Ready PubSub Java Client for Kafka, Flink, and MemQ
Pinterest open-sourced its generic PubSub client library, PSC, which has been heavily used in production for a year and a half. The library helped engineering teams increase developer velocity and improved the scalability and stability of the services using it. Over 90% of Java applications have migrated to PSC with minimal changes.
-
Instacart Creates Real-Time Item Availability Architecture with ML and Event Processing
Instacart combined machine learning with event-based processing to create an architecture that provides customers with an indication of item availability in near real-time. The new solution helped to improve user satisfaction and retention by reducing order cancellations due to out-of-stock items. The team also created a multi-model experimentation framework to help enhance model quality.
-
Zendesk Moves from DynamoDB to MySQL and S3 to Save over 80% in Costs
Zendesk reduced its data storage costs by over 80% by migrating from DynamoDB to a tiered storage solution using MySQL and S3. The company considered different storage technologies and decided to combine the relational database and the object store to strike a balance between queryability and scalability while keeping costs down.
-
Expedia Uses WebSockets and Kafka to Query Near Real-Time Streaming Data
Expedia created a solution for querying clickstream data from its platform in near real-time, enabling product and engineering teams to explore live data while building new data-driven use cases and enhancing existing ones. The team used a combination of WebSockets, Apache Kafka, and PostgreSQL to stream query results continuously to users’ browsers.
-
Privacy Engineering at Scale: DoorDash’s Journey in Geomasking and Data Protection
DoorDash recently described how it proactively embeds privacy into its products. It explains the importance of Privacy Engineering, an often overlooked software architecture practice, and provides an example of geomasking users' address data to better protect their privacy.
-
How HubSpot Uses Apache Kafka Swimlanes for Timely Processing of Workflow Actions
HubSpot routes messages from the same producer over multiple Kafka topics (called swimlanes) to avoid a build-up of consumer group lag and to prioritize the processing of real-time traffic. Using a combination of automatic and manual detection of traffic spikes, the company ensures the majority of customers’ workflows execute without delays.
-
Goldsky’s Streaming-First Architecture for Blockchain Data with Flink, Redpanda and Kubernetes
Goldsky created a platform for the real-time processing of blockchain data. The platform allows clients to extract data from blockchains into their own databases to support product features, but without running the data pipeline infrastructure. The event-driven architecture (EDA) of Goldsky leverages Apache Flink, Redpanda, Kubernetes, and cloud provider services.
-
Contentsquare Uses Microservices and Apache Kafka for Notification Delivery
Contentsquare needed notification functionality for many use cases within its platform. The company created a generic solution spanning multiple services as part of its microservice architecture. During the implementation, the developers had to improve observability and overcome some scalability challenges.
-
Reddit Unveils REV2: Modernised Rule-Execution with Kubernetes, Kafka, and Flink Stateful Functions
Reddit's Safety Engineering team recently described how it modernised its Rule-Execution system, which detects and acts on policy-violating content in real time. The new architecture includes improvements like transitioning from legacy EC2-based systems to Kubernetes, better rule version control with GitHub and S3 storage, and the capability to scale more efficiently with Flink Stateful Functions.
-
Java News Roundup: Foreign Function & Memory API, OpenJDK JEPs, Apache Tomcat CVEs
This week's Java roundup for October 9th, 2023, features news from OpenJDK, JDK 22, Apache Tomcat CVEs, Devoxx Morocco, and milestone, point and release candidates of: Spring Framework; Spring Data; Micronaut; Quarkus; Micrometer Metrics; Micrometer Tracing; Apache Kafka; Apache Camel; Eclipse Vert.x; Project Reactor; JHipster Lite; Piranha; and RefactorFirst.
-
Distributed Materialized Views: How Airbnb’s Riverbed Processes 2.4 Billion Daily Events
Airbnb created Riverbed, a Lambda-like data framework for producing and managing distributed materialized views. The framework supports over 50 read-heavy use cases where data is sourced from multiple data sources within the company’s service-oriented architecture (SOA) platform. It uses Apache Kafka and Apache Spark for online and offline components, respectively.