Performance & Scalability Content on InfoQ
-
Cadence 1.0: Uber Releases Its Scalable Workflow Orchestration Platform
Uber released a major version of its workflow orchestration platform named Cadence after six years in development. Uber and other companies use Cadence to build stateful services at scale using native programming languages.
-
AWS Launches General Availability of Amazon EC2 P5 Instances for AI/ML and HPC Workloads
AWS recently announced the general availability (GA) of Amazon EC2 P5 instances powered by the latest NVIDIA H100 Tensor Core GPUs suitable for users that require high performance and scalability in AI/ML and HPC workloads. The GA is a follow-up to the earlier announcement of the development of the infrastructure.
-
Microsoft Azure Managed Lustre for HPC and AI Workloads Now Generally Available
Microsoft recently announced the general availability (GA) of Azure Managed Lustre, a managed file system for high-performance computing (HPC) and AI workloads.
-
How LinkedIn Serves over 4.8 Million Member Profiles per Second
LinkedIn introduced Couchbase as a centralized caching tier to scale member profile reads after traffic had outgrown its existing database cluster. The new solution achieved a hit rate of over 99%, reduced tail latencies by more than 60%, and cut annual costs by 10%.
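
As a rough illustration of how a caching tier in front of a database produces a hit rate, here is a minimal cache-aside sketch. It is not LinkedIn's design: the in-memory dictionary stands in for the cache, and `CacheAside` and `load_from_db` are hypothetical names introduced only for this example.

```python
class CacheAside:
    """Minimal cache-aside reader (illustrative only, not LinkedIn's implementation)."""

    def __init__(self, load_from_db):
        # load_from_db is a hypothetical callable that reads one record from the database.
        self._load = load_from_db
        self._cache = {}  # stand-in for an external caching tier
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # Serve from the cache when possible; fall back to the database on a miss.
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._load(key)
        self._cache[key] = value  # populate the cache for later reads
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Every read served from the cache is one fewer read against the database, which is why a high hit rate translates into lower database load.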
-
New Azure Cosmos DB Features to Boost Performance and Optimize Cost
Microsoft has recently unveiled several new features for Azure Cosmos DB to enhance cost efficiency, boost performance, and increase elasticity. These features are burst capacity, hierarchical partition keys, serverless container storage of 1 TB, and priority-based execution.
-
Datadog Creates Scalable Data Ingestion Architecture
Datadog created a dedicated data ingestion architecture offering exactly-once semantics for their third-generation event store, Husky. The event-driven architecture (EDA) can accommodate bursts in traffic in the multi-tenant platform with reasonable ingestion latency and acceptable operational costs.
-
Real-Time Messaging Architecture at Slack
Slack recently described how it sends millions of messages daily in real time across the globe. The company gives a detailed look at its architecture, which is designed to manage real-time messages at scale, and highlights the challenges of delivering those messages across different time zones and regions and how Slack's engineers designed the infrastructure to handle them.
-
Content Discovery at Scale with Hexagons and Elasticsearch at DoorDash
DoorDash recently published an article on how it is solving scaling challenges with content discovery using Elasticsearch and H3, a geospatial indexing system that partitions the world into hexagonal cells.
-
BBC New Serverless Platform Improves Scalability and Performance
One year into the transition to their new WebCore serverless platform, the BBC has started to reap the benefits of an architecture that removes the burden on engineers to solve performance and operational challenges and allows them to focus on the value they deliver to customers.
-
Netflix’s RENO Keeps Experience Consistent across Devices
Netflix has developed the Rapid Event Notification System (RENO) to create a consistent user experience across various platforms and devices. RENO reacts more quickly and consistently than the traditional request/response model to user-generated actions ranging from watching a title to changing profile information.
-
.NET 6: Threading Improvements
While numerous libraries exist to abstract away the complexities of asynchronous and concurrent programming, developers still need to drop down to lower-level thread-handling logic from time to time. Continuing our API changes for .NET 6 series, we look at some new tricks for multi-threading.
-
Introducing System.Threading.RateLimiting for .NET
While rate limiting is a well-known problem for web servers, there are many other situations where similar capabilities are needed. With the introduction of System.Threading.RateLimiting, developers will be able to add this capability without writing it themselves.
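
To make the idea concrete, here is a minimal token-bucket sketch, one common rate-limiting algorithm. It is written in Python for illustration and is not the System.Threading.RateLimiting API; `TokenBucket`, `try_acquire`, and the injectable `clock` parameter are hypothetical names chosen for this example.

```python
import time


class TokenBucket:
    """Illustrative token-bucket limiter (not the .NET API)."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so the limiter can be tested deterministically
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self, permits=1):
        # Add the tokens earned since the last call, capped at the bucket capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        # Grant the request only if enough tokens are available.
        if permits <= self._tokens:
            self._tokens -= permits
            return True
        return False
```

A caller would check `try_acquire()` before doing the limited work and back off or reject the request when it returns `False`. The same pattern applies outside web servers, for example when throttling calls to a downstream service.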
-
AWS Launches EC2 Auto Scaling Warm Pools
AWS recently released Warm Pools for EC2 Auto Scaling, which reduce the time and cost of scaling out applications (that is, horizontal scaling) by maintaining a pool of pre-initialized instances.
-
Google Announces the General Availability of A2 Virtual Machines
Google recently announced the general availability of A2 Virtual Machines (VMs) in Compute Engine, based on NVIDIA Ampere A100 Tensor Core GPUs. According to the company, the A2 VMs will allow customers to run their NVIDIA CUDA-enabled machine learning (ML) and high-performance computing (HPC) scale-out and scale-up workloads efficiently at a lower cost.
-
How Project Cyclop Enabled GitHub to Reduce Push Failures to Nearly Zero
GitHub spawned Project Cyclop several months ago to identify what caused occasional push failures and to find a fix. It turns out there was no single culprit, and a careful analysis led to identifying a number of changes that improved push traffic by at least an order of magnitude, according to GitHub.