Apache Kafka Content on InfoQ
-
Expedia Uses WebSockets and Kafka to Query Near Real-Time Streaming Data
Expedia created a solution for querying its platform's clickstream data in near real time, so that product and engineering teams can explore live data while building new data-driven use cases and enhancing existing ones. The team combined WebSockets, Apache Kafka, and PostgreSQL to stream query results continuously to users’ browsers.
-
Privacy Engineering at Scale: DoorDash’s Journey in Geomasking and Data Protection
DoorDash recently published how it proactively embeds privacy into its products. It explains the importance of Privacy Engineering, an often-overlooked software architecture practice, and provides an example of geomasking users' address data to better protect their privacy.
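As a rough illustration of what geomasking can look like, the sketch below shifts a coordinate by a random distance in a random direction. This is one common masking approach, not necessarily the technique DoorDash uses; the function name `geomask`, the `max_offset_m` parameter, and the flat-earth approximation are all assumptions made for the example.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for a small-offset approximation


def geomask(lat, lon, max_offset_m, rng):
    """Return (lat, lon) displaced by at most max_offset_m in a random direction.

    Hypothetical helper for illustration only; not DoorDash's implementation.
    """
    # sqrt() spreads points uniformly over the disc instead of clustering at the centre.
    distance = max_offset_m * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Convert the metre offsets to angular offsets (valid for small distances,
    # away from the poles).
    dlat = (distance * math.cos(bearing)) / EARTH_RADIUS_M
    dlon = (distance * math.sin(bearing)) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(dlat), lon + math.degrees(dlon)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    print(geomask(37.7749, -122.4194, 500.0, rng))
```

A production system would also need to decide how large the offset should be relative to address density, which this sketch does not address.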
-
How HubSpot Uses Apache Kafka Swimlanes for Timely Processing of Workflow Actions
HubSpot routes messages from the same producer over multiple Kafka topics (called swimlanes) to avoid a build-up of consumer group lag and to prioritize the processing of real-time traffic. Using a combination of automatic and manual detection of traffic spikes, the company ensures that the majority of customers’ workflows execute without delays.
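The swimlane idea can be sketched as a small router that picks a topic per message. Everything here is a hypothetical illustration rather than HubSpot's code: the topic names, the per-customer threshold, and the `SwimlaneRouter` class are assumptions, and the chosen topic would be handed to whatever Kafka producer the application already uses.

```python
# Hypothetical topic names; the source does not name HubSpot's actual topics.
REALTIME_TOPIC = "workflow-actions-realtime"
OVERFLOW_TOPIC = "workflow-actions-overflow"


class SwimlaneRouter:
    """Chooses a swimlane (topic) for each message based on recent customer volume."""

    def __init__(self, max_per_window):
        self.max_per_window = max_per_window  # assumed per-customer limit per window
        self.counts = {}                      # customer_id -> messages seen this window
        self.manual_overrides = set()         # customers manually pinned to overflow

    def choose_topic(self, customer_id):
        # Manual detection: an operator has flagged this customer as spiking.
        if customer_id in self.manual_overrides:
            return OVERFLOW_TOPIC
        # Automatic detection: count messages and divert once over the limit.
        count = self.counts.get(customer_id, 0) + 1
        self.counts[customer_id] = count
        if count <= self.max_per_window:
            return REALTIME_TOPIC
        return OVERFLOW_TOPIC

    def reset_window(self):
        """Start a new counting window (called periodically by the application)."""
        self.counts.clear()


if __name__ == "__main__":
    router = SwimlaneRouter(max_per_window=2)
    for _ in range(3):
        print(router.choose_topic("customer-a"))
```

Keeping bursty customers on a separate topic means their backlog only delays the overflow consumers, while the real-time lane keeps draining at its normal pace.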
-
Goldsky’s Streaming-First Architecture for Blockchain Data with Flink, Redpanda and Kubernetes
Goldsky created a platform for the real-time processing of blockchain data. The platform allows clients to extract data from blockchains into their own databases to support product features, without having to run the data pipeline infrastructure themselves. Goldsky's event-driven architecture (EDA) leverages Apache Flink, Redpanda, Kubernetes, and cloud provider services.
-
Contentsquare Uses Microservices and Apache Kafka for Notification Delivery
Contentsquare needed notification functionality for many use cases within its platform. The company created a generic solution spanning multiple services as part of its microservice architecture. During the implementation, the developers had to improve observability and overcome some scalability challenges.
-
Reddit Unveils REV2: Modernised Rule-Execution with Kubernetes, Kafka, and Flink Stateful Functions
Reddit's Safety Engineering team recently published how it modernised its Rule-Execution system, which detects and acts on policy-violating content in real time. The new architecture includes improvements like transitioning from legacy EC2-based systems to Kubernetes, better rule version control with GitHub and S3 storage, and the capability to scale more efficiently with Flink Stateful Functions.
-
Java News Roundup: Foreign Function & Memory API, OpenJDK JEPs, Apache Tomcat CVEs
This week's Java roundup for October 9th, 2023, features news from OpenJDK, JDK 22, Apache Tomcat CVEs, Devoxx Morocco, and milestone, point and release candidates of: Spring Framework; Spring Data; Micronaut; Quarkus; Micrometer Metrics; Micrometer Tracing; Apache Kafka; Apache Camel; Eclipse Vert.x; Project Reactor; JHipster Lite; Piranha; and RefactorFirst.
-
Distributed Materialized Views: How Airbnb’s Riverbed Processes 2.4 Billion Daily Events
Airbnb created Riverbed, a Lambda-like data framework for producing and managing distributed materialized views. The framework supports over 50 read-heavy use cases where data is sourced from multiple data sources within the company’s service-oriented architecture (SOA) platform. It uses Apache Kafka and Apache Spark for online and offline components, respectively.
-
Managing 238 Million Memberships of Netflix: Surabhi Diwan at QCon San Francisco
On the first day of QCon San Francisco 2023, Surabhi Diwan, a senior software engineer at Netflix, presented on managing Netflix's 238 million memberships. The talk was part of the “Architectures You’ve Always Wondered About” track. Diwan works on the backend of membership engineering, which is critical for both signups and streaming at Netflix.
-
Confluent Announces Apache Flink on Confluent Cloud in Open Preview
Confluent recently announced the open preview of Apache Flink on Confluent Cloud as a fully-managed service for stream processing. The company claims that the managed service will make it easier for companies to filter, join, and enrich data streams with Flink.
-
DigitalOcean Launches its Managed Kafka Service
DigitalOcean enters the arena of fully-managed Kafka services with a new offering aimed at simplifying management and maintenance of the popular event streaming platform. DigitalOcean Kafka targets startups and SMBs by offering them an all-inclusive, flat-rate pricing model.
-
Allegro Uses Control Theory for Workload Balancing in its Apache Kafka PubSub Platform
Allegro, the largest eCommerce platform in Poland, implemented dynamic workload balancing in Hermes, its open-source publish-subscribe message broker, built on top of Apache Kafka. The new workload balancing algorithm achieves more uniform resource utilization and lower infrastructure costs.
-
Running Apache Flink Applications on AWS KDA: Lessons Learnt at Deliveroo
Deliveroo introduced Apache Flink into its technology stack for enriching and merging events consumed from Apache Kafka or Kinesis Streams. The company opted to use the AWS Kinesis Data Analytics (KDA) service to manage Apache Flink clusters on AWS and shared its experiences from running Flink applications on KDA.
-
Grab Reduces Traffic Cost for Kafka Consumers on AWS to Zero
Grab took advantage of the ability of Apache Kafka consumers, introduced in Kafka 2.3, to connect to the broker node in the same availability zone (AZ), which reduced the traffic cost on AWS to zero for reconfigured consumers. The change has substantially reduced overall infrastructure costs for running Apache Kafka on AWS.
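For readers who want to see what such a reconfiguration might involve, the sketch below builds the relevant settings as plain Python dictionaries. The property names `client.rack`, `broker.rack`, and `replica.selector.class` are standard Apache Kafka settings for rack-aware fetching; the helper name, the placeholder values, and the assumption that the AZ identifier is used as the rack label are illustrative and not taken from Grab's setup.

```python
# Broker-side settings (one entry per broker in practice); values are placeholders.
BROKER_SETTINGS = {
    # Label each broker with the AZ it runs in.
    "broker.rack": "<az-id-of-this-broker>",
    # Let brokers point consumers at a replica in the consumer's own rack.
    "replica.selector.class": "org.apache.kafka.common.replica.RackAwareReplicaSelector",
}


def rack_aware_consumer_config(bootstrap_servers, group_id, availability_zone):
    """Build consumer properties that advertise the consumer's AZ to the brokers.

    Hypothetical helper; pass the result to the Kafka client of your choice.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # Must match the broker.rack value of brokers in the same AZ.
        "client.rack": availability_zone,
    }


if __name__ == "__main__":
    print(rack_aware_consumer_config("broker-1:9092", "orders-service", "az-1"))
```

Consumers that do not set `client.rack` keep fetching from partition leaders, which is why only reconfigured consumers see the cost reduction.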
-
Public Preview of JSON Schema Support in Azure Event Hubs Schema Registry for Kafka Applications
Microsoft recently announced that the Azure Event Hubs schema registry now includes JSON schema support, providing Kafka applications with a centralized repository for schema documents used in messaging-centric and event-driven applications. The JSON schema support is currently in public preview.