-
PyPI Supply Chain Attack Compromises LiteLLM, Enabling the Exfiltration of Sensitive Information
Discovered by FutureSearch researcher Callum McMahon, a supply chain attack against LiteLLM on PyPI resulted in over 40 thousand downloads of a compromised version that installed a malicious payload capable of harvesting and exfiltrating sensitive information. LiteLLM is downloaded roughly 3 million times per day.
-
QCon London 2026: Team Topologies as the ‘Infrastructure for Agency’ with AI
At QCon London 2026, Matthew Skelton argued that AI success depends on organisational maturity. He highlighted bounded agency, security, and stewardship as key to managing AI agents. By using Innovation and Practices Enabling Teams, companies can drive knowledge diffusion and optimise internal processes to see real-world returns on their AI investments.
-
Discord Open Sources Osprey Safety Rules Engine Processing 2.3 Million Rules per Second
Discord open-sourced Osprey, a safety rules engine processing 400 million daily actions and 2.3 million rules per second. Osprey uses a polyglot architecture: a Rust coordinator manages traffic, while stateless Python workers execute logic using a Python-based domain-specific language called SML. This design allows trust and safety teams to deploy real-time threat mitigations at high scale.
-
Java News Roundup: GraalVM Build Tools, EclipseLink, Spring Milestones, Open Liberty, Quarkus
This week's Java roundup for March 23rd, 2026, features news highlighting: GA releases of GraalVM Native Build Tools 1.0 and EclipseLink 5.0; the March 2026 edition of Open Liberty; fourth milestone releases of Spring Boot, Spring Modulith and Spring AI; a point release of Quarkus; the first development release of Infinispan; and a maintenance release of GlassFish.
-
Microsoft Launches Azure Copilot Migration Agent to Accelerate Cloud Migration Planning
Microsoft has launched the Azure Copilot Migration Agent, an AI assistant built into the Azure portal that automates migration planning, agentless VMware discovery, and landing zone creation. Despite being billed as generally available, the agent is in public preview and cannot execute migrations. Replication and cutover remain manual tasks in Azure Migrate.
-
Discord Engineers Add Distributed Tracing to Elixir's Actor Model without Performance Penalty
Discord engineering detailed how they added distributed tracing to Elixir's actor model. Their custom Transport library wraps messages with trace context and uses dynamic sampling to handle million-user fanouts. CPU optimizations included skipping unsampled traces and filtering context before deserialization, recovering 10+ percentage points of overhead.
-
"Pick and Mix" Custom Regions: Cloudflare Introduces Fine-Grained Data Residency Control
Cloudflare recently introduced Custom Regions, an expansion of its Regional Services that lets customers precisely define where their data is processed. By selecting specific groups of data centers by country or region, customers can ensure that TLS termination and application-layer processing remain within chosen geographic boundaries for compliance and control.
-
Inside Agoda’s Storefront: a Latency-Aware Reverse Proxy for Improving DNS Based Load Distribution
Agoda engineers developed Storefront, a Rust-based S3-compatible reverse proxy that improves load balancing, request routing, and observability across large-scale object storage systems. The proxy addresses the limitations of DNS-based distribution; implements latency-aware routing, cross-data-center optimizations, IO safeguards, and credential-less authentication; and exposes telemetry via OpenTelemetry.
-
AWS S3 Introduces Account-Regional Namespaces, Ending 18 Years of Global Bucket Name Collisions
AWS introduced account-regional namespaces for S3, fixing global bucket name collisions that broke IaC automation for 18 years. The new format is {prefix}-{account-id}-{region}-an. CloudFormation gets the BucketNamePrefix property, and IAM gets the s3:x-amz-bucket-namespace condition key. This prevents confused-deputy attacks by making bucket names unpredictable to anyone who does not know the account ID.
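As a rough illustration of the naming pattern quoted above, the sketch below assembles a name of the form {prefix}-{account-id}-{region}-an. The helper name `account_regional_bucket_name` is hypothetical and not part of any AWS SDK; the only checks applied are the 12-digit account ID and the general 63-character S3 bucket name limit, and AWS may enforce further rules not shown here.

```python
def account_regional_bucket_name(prefix: str, account_id: str, region: str) -> str:
    """Assemble a name following the {prefix}-{account-id}-{region}-an pattern.

    Hypothetical helper for illustration only; it does not call AWS.
    """
    # AWS account IDs are 12 decimal digits.
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("expected a 12-digit AWS account ID")

    name = f"{prefix}-{account_id}-{region}-an"

    # General S3 bucket names are limited to 63 characters.
    if len(name) > 63:
        raise ValueError("resulting bucket name exceeds 63 characters")
    return name


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    print(account_regional_bucket_name("logs", "123456789012", "us-east-1"))
```

Because the account ID and region are embedded in the name, two accounts using the same prefix no longer compete for the same global name.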
-
Uber Launches IngestionNext: Streaming-First Data Lake Cuts Latency and Compute by 25%
Uber has launched IngestionNext, a streaming-first data lake ingestion platform that reduces data latency from hours to minutes and cuts compute usage by 25%. Built on Kafka, Flink, and Apache Hudi, it supports thousands of datasets, enabling faster analytics, experimentation, and machine learning workloads globally.
-
AWS Load Balancer Controller Reaches GA with Kubernetes Gateway API Support
AWS shipped GA support for Kubernetes Gateway API in its Load Balancer Controller, dumping annotation-based configuration for type-safe CRDs with proper validation. The release handles both L4 (TCP/UDP via NLB) and L7 (HTTP/gRPC via ALB) routing through the Gateway API spec. Teams get cross-namespace routing, automatic certificate discovery, and role separation without cluster-admin permissions.
-
QCon London 2026: Shielding the Core: Architecting Resilience with Multi-Layer Defenses
Anderson Parra, staff software engineer at SeatGeek, presented “Shielding the Core: Architecting Resilience with Multi-Layer Defenses” at QCon London 2026. Parra discussed strategies for handling significant traffic spikes that can overwhelm even a well-designed infrastructure.
-
Uber Automates Design Documentation with Agentic Systems
Uber’s uSpec uses AI agents and the Figma Console MCP to automate design specs, cutting documentation time from weeks to minutes. Integrated with the Michelangelo platform, it uses a GenAI Gateway for PII redaction, ensuring data stays local. This reflects a 2026 industry split between Uber’s "Visual-First" Figma workflow and a "Guide-First" approach favored by developers using agentic IDEs.
-
AI Coding Assistants Haven’t Sped up Delivery Because Coding Was Never the Bottleneck
Agoda recently published an observation arguing that while AI coding tools have measurably raised individual developer output, the resulting velocity gains at the project level have been surprisingly modest, because coding was never the real bottleneck. The post claims that the bottleneck has shifted upstream to specification and verification because these areas require human judgment.
-
Inside Netflix’s Graph Abstraction: Handling 650TB of Graph Data in Milliseconds Globally
Netflix engineers built Graph Abstraction, a high-throughput platform managing 650 TB of graph data with millisecond latency. Supporting services from Netflix Gaming’s social graphs to operational topology graphs, it maintains global availability via asynchronous replication. This article covers its architecture, caching, and traversal design for high-scale performance.