The Apache Software Foundation has released version 3.0 of Apache Pulsar, the distributed messaging and streaming platform. Pulsar 3.0 introduces Long-Term Support (LTS) releases. Under this new release cadence, the Pulsar community will provide maintenance fixes for 24 months and security vulnerability patches for an additional 12 months. The plan is to release a new LTS version every 18 months. This scheme is aimed at users who require stability and longer maintenance cycles. Feature releases are planned between two LTS releases.
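As a rough illustration of this cadence, the sketch below computes the support windows for a single LTS release. The release month used here (May 2023) and the helper names `add_months` and `lts_windows` are assumptions made for the example, not values taken from the announcement.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, pinning the day to 1 to avoid month-length issues."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def lts_windows(release: date) -> dict:
    """Derive the support milestones from the cadence described in the article."""
    maintenance_end = add_months(release, 24)       # 24 months of maintenance fixes
    security_end = add_months(maintenance_end, 12)  # plus 12 months of security patches
    next_lts = add_months(release, 18)              # a new LTS every 18 months
    return {
        "maintenance_end": maintenance_end,
        "security_end": security_end,
        "next_lts": next_lts,
    }


# Assumed release month, for illustration only.
windows = lts_windows(date(2023, 5, 1))
for name, when in windows.items():
    print(name, when.isoformat())
```

Because a new LTS arrives 18 months after the previous one while maintenance lasts 24 months, two consecutive LTS lines would receive maintenance fixes in parallel for roughly six months.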
Apache Pulsar is a high-performance, multi-tenant messaging and streaming platform with support for multiple clusters, low latency, seamless scalability, guaranteed message delivery with persistent message storage provided by Apache BookKeeper, and serverless connector frameworks for data processing and connectivity.
The image below shows the architecture of a Pulsar cluster:
One of the significant improvements in Pulsar 3.0 is the introduction of a new load manager implementation. The previous load manager had scalability issues when Pulsar clusters grew to thousands of brokers and millions of topics. The new load manager aims to balance cluster utilization more evenly while reducing latency and dependence on Apache ZooKeeper. It achieves this by storing load data for brokers and bundles in non-persistent topics, eliminating the need for N-replication.
Another enhancement concerns delayed message delivery. The previous implementation was limited by memory constraints and by the cost of rebuilding the delayed message index. The new mechanism supports delayed message index snapshots, which minimize the cost of rebuilding the index and reduce the memory needed to maintain it. This improvement enables efficient handling of large numbers of delayed messages and improves overall performance.
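The idea of snapshotting a delayed message index can be sketched conceptually as follows. This is a minimal in-memory illustration, not Pulsar's implementation: the `DelayedIndex` class, its method names, and the JSON snapshot format are all hypothetical.

```python
import heapq
import json


class DelayedIndex:
    """Toy delayed-message index: a min-heap ordered by delivery time."""

    def __init__(self):
        self._heap = []  # entries are (deliver_at_ms, message_id)

    def add(self, deliver_at_ms: int, message_id: str) -> None:
        heapq.heappush(self._heap, (deliver_at_ms, message_id))

    def pop_due(self, now_ms: int) -> list:
        """Remove and return the IDs of all messages due at or before now_ms."""
        due = []
        while self._heap and self._heap[0][0] <= now_ms:
            due.append(heapq.heappop(self._heap)[1])
        return due

    def snapshot(self) -> str:
        """Serialize the index so it can be restored without rescanning messages."""
        return json.dumps(sorted(self._heap))

    @classmethod
    def restore(cls, data: str) -> "DelayedIndex":
        """Rebuild the index directly from a snapshot."""
        idx = cls()
        idx._heap = [(deliver_at, msg_id) for deliver_at, msg_id in json.loads(data)]
        heapq.heapify(idx._heap)
        return idx


index = DelayedIndex()
index.add(300, "m3")
index.add(100, "m1")
index.add(200, "m2")

# After a restart, the index is loaded from the snapshot instead of being rebuilt.
restored = DelayedIndex.restore(index.snapshot())
print(restored.pop_due(200))
```

In this sketch the restore step reads only the snapshot, which is the property the article attributes to the new mechanism: recovery cost depends on the snapshot rather than on replaying every delayed message.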
Pulsar 3.0 also brings support for multi-arch Docker images. Docker images are now published for both Intel x86-64 and Arm64 architectures.
In terms of underlying optimizations, Pulsar 3.0 introduces enhancements to the BookKeeper direct IO logic. The new implementation bypasses the OS PageCache, reducing memory consumption and improving cache utilization.
Another optimization introduced in Pulsar 3.0 is the segmented snapshot optimization for the Transaction Buffer. The new segmented snapshot approach splits the snapshot into multiple parts, each with a fixed number of aborted transactions and a maxReadPosition identity. This enhancement improves transaction buffer recovery speed, reduces resource costs associated with large snapshots, and addresses write amplification issues.
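A conceptual sketch of the segmentation idea is shown below. It is not Pulsar's code: the `Segment` and `SegmentedSnapshot` names, the integer transaction IDs, and the integer read positions are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """An immutable snapshot part: a fixed number of aborted transactions
    plus the max read position recorded when the segment was sealed."""
    aborted_txn_ids: tuple
    max_read_position: int


class SegmentedSnapshot:
    def __init__(self, segment_size: int):
        self.segment_size = segment_size
        self.sealed = []     # segments already written, never rewritten
        self.open_txns = []  # aborted transactions not yet in a segment

    def record_abort(self, txn_id: int, max_read_position: int) -> bool:
        """Track an aborted transaction; return True if a segment was sealed."""
        self.open_txns.append(txn_id)
        if len(self.open_txns) == self.segment_size:
            self.sealed.append(Segment(tuple(self.open_txns), max_read_position))
            self.open_txns = []
            return True
        return False


snap = SegmentedSnapshot(segment_size=2)
for txn in range(1, 6):
    snap.record_abort(txn, max_read_position=10 * txn)

print(len(snap.sealed), "sealed segments,", len(snap.open_txns), "open transaction(s)")
```

The point of the sketch is that each sealed segment is written once and then left alone, so recording a new aborted transaction never requires rewriting the whole snapshot. That is one way to read the article's claim about reduced write amplification.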
Finally, Pulsar 3.0 introduces blue-green cluster deployment support. Blue-green deployment is a widely used approach for migrating live traffic from one cluster to another.
Pulsar 3.0 brings significant improvements in load balancing, delayed message support, Docker image availability, BookKeeper IO logic, transaction buffer snapshots, and blue-green cluster deployment. Together, these changes improve the performance, scalability, and usability of the Pulsar messaging system.