InfoQ Homepage Architecture Content on InfoQ
-
Beyond the Benchmark: a Metrics-Driven Approach to Sustained iOS Performance on Real Devices
iOS performance engineering often defaults to a mental model where performance is a property of a component. Performance is instead an emergent behavior of the interaction between application code, device hardware, OS resource management, network conditions, and user behavior patterns over time. This article gives a direct, first-party path to capturing performance issues using Xcode Instruments.
-
Three Pillars of Platform Engineering: a Virtuous Cycle
Platform engineering succeeds when reliability and ergonomics reinforce each other rather than compete. This article explores three foundational pillars: automated reliability, developer ergonomics, and operator ergonomics. Together, they establish a virtuous cycle that strengthens system stability, reduces operational burden, and empowers teams to scale infrastructure with confidence.
-
The DPoP Storage Paradox: Why Browser-Based Proof-of-Possession Remains an Unsolved Problem
DPoP closes a real gap in OAuth 2.0. Sender-constrained tokens are a meaningful upgrade over bearer tokens for any client that can implement them. But RFC 9449's silence on browser key storage creates the need for an architectural decision that each team must confront deliberately — there is no safe default that works everywhere.
-
When a Cloud Region Fails: Rethinking High Availability in a Geopolitically Unstable World
Sovereign fault domains are failure boundaries defined by legal, political, or physical jurisdiction rather than hardware topology. The article maps geopolitical events to known distributed-systems failure modes, argues multi-region should replace multi-AZ as the HA baseline for systems crossing jurisdictions, and outlines design patterns, chaos experiments, and an ALE model to justify the spend.
-
Building Production-Ready tRPC APIs: the TypeScript Alternative to Apollo Federation
This article details our migration from Apollo Federation to a TypeScript-based tRPC stack, which resulted in an 89% reduction in bugs and 67% faster response times. It also covers the mistakes we made, the unexpected performance gains, and an overview of the production architecture we use today to handle 2.4 million daily requests with 99.97% uptime.
-
Using AWS Lambda Extensions to Run Post-Response Telemetry Flush
At Lead Bank, synchronous telemetry flushing caused intermittent exporter stalls to become user-facing 504 gateway timeouts. By leveraging AWS Lambda's Extensions API and goroutine chaining in Go, flush work is moved off the response path, returning responses immediately while preserving full observability without telemetry loss.
-
The Spring Team on Spring Framework 7 and Spring Boot 4
InfoQ recently spoke with key members of the Spring team about the significant architectural and functional advancements in Spring Framework 7 and Spring Boot 4. This conversation explores the strategic shift toward core resilience by integrating features such as retry and concurrency throttling directly into the framework, alongside the performance benefits of modularizing auto-configurations.
-
Stateful Continuation for AI Agents: Why Transport Layers Now Matter
Agent workflows make transport a first-order concern. Multi-turn, tool-heavy loops amplify overhead that is negligible in single-turn LLM use. Stateful continuation cuts overhead dramatically. Caching context server-side can reduce client-sent data by 80%+ and improve execution time by 15–29% .
-
Replacing Database Sequences at Scale without Breaking 100+ Services
The article discusses the challenges faced during a migration from a relational database to NoSQL, focusing on the importance of database sequences for unique identifiers. It outlines the development of a new sequence service using DynamoDB and a two-tier caching architecture.
-
Beyond RAG: Architecting Context-Aware AI Systems with Spring Boot
This article introduces Context-Augmented Generation (CAG) as an architectural refinement of RAG for enterprise systems. It shows how a Spring Boot-based context manager can incorporate user identity, session state, and policy constraints into AI workflows, improving traceability, consistency, and governance without altering existing retrievers or LLM infrastructure.
-
Event-Driven Patterns for Cloud-Native Banking: Lessons from What Works and What Hurts
Event-driven architecture helps banks decouple systems, scale services, and create clear activity trails. But it also introduces complexity, new failure modes, and operational challenges. Chris Tacey-Green explains where it adds value in banking systems and the practical patterns, such as inbox/outbox and stable event contracts, needed to make it reliable.
-
Architecting Autonomy at Scale: Raising Teams without Creating Dependencies
Modern engineering needs a shift from "gates" to "guardrails." Scale via decentralized architecture that treats teams like adults—building judgment through Socratic coaching, shared platforms, and automated drift detection. Move beyond bottlenecks to an interdependent model where AI governance and ADRs preserve context without killing velocity. Empower autonomy while maintaining alignment.