Distributed Systems Content on InfoQ

News

Architecture & Design

ClickHouse Keeper: Efficient Apache ZooKeeper Alternative Created with C++ and Raft

The ClickHouse project team created an in-house replacement for Apache ZooKeeper because it needed a more efficient implementation that would also address some of ZooKeeper's shortcomings. ClickHouse Keeper is now an essential part of the ClickHouse project and a cornerstone of this open-source analytical database, but it can also be used independently for many distributed coordination use cases.

Rafal Gancarz
on Dec 01, 2023
Architecture & Design

How DoorDash Rearchitected its Cache to Improve Scalability and Performance

DoorDash rearchitected the heterogeneous caching systems used across its microservices into a common, multi-layered cache that provides a generic mechanism and resolves a number of issues caused by fragmented caching.

Sergio De Simone
on Oct 28, 2023
DevOps

Disaster Recovery Across a Million Pieces: Michelle Brush at QCon San Francisco

On the second day of QCon San Francisco 2023, Michelle Brush, an engineering director for SRE at Google, presented a session on challenges, patterns, and practices for disaster recovery in massively distributed systems. The session was part of the "Designing for Resilience" track.

Steef-Jan Wiggers
on Oct 04, 2023
Architecture & Design

LinkedIn's Open-Source "iris-message-processor" Achieves 86.6x Faster Escalation Management Speeds

LinkedIn developed a new open-source service called "iris-message-processor" to enhance the performance and reliability of its existing Iris escalation management system. "iris-message-processor" significantly improves processing speeds, being ~4.6x faster under average loads and ~86.6x faster under high loads than its predecessor.

Eran Stiller
on Sep 11, 2023
Architecture & Design

Pinterest Revamps Its Asynchronous Computing Platform with Kubernetes and Apache Helix

Pinterest created the next-generation asynchronous computing platform, Pacer, to replace the older solution, Pinlater, which the company outgrew, resulting in scalability and reliability challenges. The new architecture leverages Kubernetes for scheduling job-execution workers and Apache Helix for cluster management.

Rafal Gancarz
on Aug 21, 2023
Architecture & Design

Cadence 1.0: Uber Releases Its Scalable Workflow Orchestration Platform

Uber released a major version of its workflow orchestration platform named Cadence after six years in development. Uber and other companies use Cadence to build stateful services at scale using native programming languages.

Rafal Gancarz
on Aug 07, 2023
Java

Apache Pulsar 3.0 Delivers a New LTS Version and Efficiency Improvements

The Apache Software Foundation has released version 3.0 of Apache Pulsar, the distributed messaging and streaming platform. Pulsar 3.0 introduces the Long-Term Support release and many performance and scalability improvements.

Andrea Messetti
on May 23, 2023
Architecture & Design

Preventing Serverless Vendor Lock-in with Design Patterns

Gregor Hohpe recently published an article proposing a paradigm shift to address vendor lock-in concerns in serverless cloud applications. Designing a solution using well-known patterns decouples its functional characteristics from the underlying cloud implementation, making it easier to avoid lock-in or to go multi-cloud.

Vasco Veloso
on Sep 24, 2022
Culture & Methods

A Distributed System is Knowable: an Impossible Thing for Developers

Failure in distributed systems is normal. Distributed systems can provide only two of three guarantees: consistency, availability, and partition tolerance. According to Kevlin Henney, this limits how much you can know about how a distributed system will behave. He gave a keynote about Six Impossible Things at QCon London 2022 and at QCon Plus May 10-20, 2022.

Ben Linders
on Sep 01, 2022
DevOps

Cloudflare D1 Provides Distributed SQLite for Cloudflare Workers

Soon to enter beta, D1 is Cloudflare's first step into the cloud-based SQL storage arena. D1 is built on top of SQLite with the addition of a distributed replication mechanism, batch operation support, embedded compute, automatic backups and redundancy, and more.

Sergio De Simone
on May 25, 2022
Architecture & Design

Dealing with Thundering Herd at Braintree

Braintree engineer Anthony Ross explained in a recent article how introducing some random jitter into retry intervals for failed tasks solved a thundering herd issue which was impacting the efficiency of their payment dispute management API.

Sergio De Simone
on May 19, 2022
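
As a rough illustration of the jitter idea in the Braintree item above, the sketch below randomizes retry delays so that tasks which failed at the same moment do not all retry at the same moment. It assumes exponential backoff with "full jitter"; the teaser only says that random jitter was added to retry intervals, so the backoff shape, the function names `backoff_with_jitter` and `retry`, and the `base` and `cap` parameters are all hypothetical and are not taken from Braintree's implementation.

```python
import random
import time


def backoff_with_jitter(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized delay in seconds for a 0-based retry attempt.

    Assumption: exponential backoff capped at `cap`, with the actual delay
    drawn uniformly from [0, ceiling] ("full jitter").
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def retry(task, max_attempts=5, base=1.0, cap=60.0, sleep=time.sleep, rng=random):
    """Call `task` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            # Re-raise on the final attempt; otherwise wait a jittered delay.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_with_jitter(attempt, base, cap, rng))
```

Because each caller draws its own random delay, retries that would otherwise land in the same instant are spread across the interval, which is the property that counters a thundering herd.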
DevOps

Managing Complex Dependencies with Distributed Architecture at eBay

The eBay engineering team recently outlined how they came up with a scalable release system. The release solution leverages distributed architecture to release more than 3,000 dependent libraries in about two hours. The team is using Jenkins to perform the release in combination with Groovy scripts.

Aditya Kulkarni
on Apr 08, 2022
Architecture & Design

Microservice Calls’ Critical Path Analysis with Jaeger and Uber’s CRISP

Discovering which services need to be optimised to reduce end-to-end latency in a microservices-based system can be challenging because call graphs may be too complicated to read. Uber described an open-source tool called CRISP built to solve this problem by finding the critical paths in these graphs. These paths identify those operations whose optimisation benefits the overall system.

Vasco Veloso
on Dec 20, 2021
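
To make the notion of a critical path in a call graph (the CRISP item above) more concrete, here is a minimal sketch that walks a trace backwards from the end of each span, picking the child that finished last and then any child that finished before that one started. The `Span` type and the `critical_path` function are hypothetical and are not CRISP's or Jaeger's API; the sketch also assumes well-formed timestamps with children nested inside their parent.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    start: float
    end: float
    children: list = field(default_factory=list)


def critical_path(span):
    """Return the names of spans on the critical path, root first.

    Assumption: a child is on the critical path if it is the latest one to
    finish before the current cursor; the cursor then moves to its start.
    """
    path = [span.name]
    cursor = span.end
    # Visit children from latest-finishing to earliest-finishing.
    for child in sorted(span.children, key=lambda c: c.end, reverse=True):
        if child.end <= cursor:
            path.extend(critical_path(child))
            cursor = child.start
    return path


# Two parallel calls (a, b) followed by a final call (c).
root = Span("root", 0, 100, [
    Span("a", 10, 40),
    Span("b", 10, 90),
    Span("c", 92, 98),
])
print(critical_path(root))  # ['root', 'c', 'b']
```

In this toy trace, `a` runs in parallel with the slower `b`, so speeding up `a` would not shorten the request; only `b` and `c` (and the root's own work) are on the critical path.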
Architecture & Design

Dapr Joins CNCF Incubator: Q&A with Yaron Schneider

The Cloud Native Computing Foundation (CNCF) recently announced that it accepted the Distributed Application Runtime (Dapr) as a CNCF incubating project. This follows Dapr's earlier announcement of the formation of the project's Steering and Technical Committee (STC).

Eran Stiller
on Nov 17, 2021
Architecture & Design

Reviewing the Eight Fallacies of Distributed Computing

In a recent article on the Ably blog, Alex Diaconu reviewed the eight fallacies of distributed computing and offered a number of hints on how to handle them. InfoQ took the opportunity to speak with Diaconu to learn more about how Ably engineers deal with the fallacies.

Sergio De Simone
on Sep 03, 2021
