Telemetry Content on InfoQ
News
-
Inside Agoda’s Storefront: a Latency-Aware Reverse Proxy for Improving DNS Based Load Distribution
Agoda engineers developed Storefront, a Rust-based S3-compatible reverse proxy that improves load balancing, request routing, and observability across large-scale object storage systems. The proxy addresses the limitations of DNS-based load distribution; implements latency-aware routing, cross-data-center optimizations, IO safeguards, and credential-less authentication; and exposes telemetry via OpenTelemetry.
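To make "latency-aware routing" concrete, here is a minimal sketch in Python. It is not Storefront's implementation (which is written in Rust and not described in this summary); the class name, the backend names, and the choice of an exponentially weighted moving average (EWMA) are all assumptions made for illustration.

```python
class LatencyAwareRouter:
    """Hypothetical router: keeps an EWMA of observed latency per backend
    and routes the next request to the backend with the lowest estimate."""

    def __init__(self, backends, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        self.ewma = {b: None for b in backends}

    def record(self, backend, latency_ms):
        # Blend the new sample into the running estimate.
        prev = self.ewma[backend]
        if prev is None:
            self.ewma[backend] = latency_ms
        else:
            self.ewma[backend] = self.alpha * latency_ms + (1 - self.alpha) * prev

    def pick(self):
        # Try unmeasured backends first so every backend gets a sample.
        for backend, estimate in self.ewma.items():
            if estimate is None:
                return backend
        return min(self.ewma, key=self.ewma.get)


# Backend names are invented for this example.
router = LatencyAwareRouter(["dc1-node-a", "dc1-node-b"])
router.record("dc1-node-a", 40.0)
router.record("dc1-node-b", 12.0)
chosen = router.pick()  # node-b currently has the lower estimate
```

Unlike DNS round-robin, which hands out addresses without regard to how each backend is performing, a scheme along these lines shifts traffic away from a backend as soon as its observed latency rises.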
-
QCon London 2026: Wrangling Telemetry at Scale, a Guide to Self-Hosted Observability
At QCon London 2026, Colin Douch discussed building and operating self-hosted monitoring stacks, surveyed the current tooling landscape, and explained how to build a coherent observability setup rather than treating logs, metrics, and traces as separate pillars.
-
Quesma Releases OTelBench to Evaluate OpenTelemetry Infrastructure and AI Performance
Quesma has launched OTelBench, an open-source suite to benchmark OpenTelemetry pipelines and AI-driven instrumentation. It evaluates collector performance under stress while testing how accurately LLMs handle complex SRE tasks like context propagation. Initial data shows AI agents often achieve success rates below 30%, highlighting the gap between code generation and production observability.
-
How to Enable Testing a Distributed System on a Single Environment Using Proxy Routing
Without a dedicated QA environment, teams faced technical and coordination issues when testing a distributed system. A slow, unmaintainable CLI led an organization to shift left with automated testing. They built a tool for versioned deployments using CI and proxy routing, enabling developers to run isolated tests on multiple versions to catch bugs earlier.
-
Learnings from Internal Tool Migrations to Support Software Engineering Efficiency
In her presentation at QCon San Francisco, Ying Dai shared two critical software engineering migration stories - one focused on production monitoring and the other on production deployments with automated validations. Both migrations were driven by the goal of enhancing engineering efficiency, but each came with its own challenges and lessons.
-
Effective and Efficient Observability with OpenTelemetry
Daniel Gomez Blanco, principal engineer at Skyscanner, spoke at QCon London about a large-scale observability initiative at his company: adopting OpenTelemetry across hundreds of services, and the motivation for and value gained from adopting open standards across the entire organization.
-
AWS Lambda Telemetry API Provides Enhanced Observability Data
AWS has released the AWS Lambda Telemetry API, a new way for extensions to receive enhanced function telemetry from the Lambda service. The new API simplifies collecting traces, logs, and custom and enhanced metrics from Lambda functions. In addition to several example extensions, extensions are available from third parties including Datadog, Dynatrace, Serverless, and Sumo Logic.
-
Comprehensive Kubernetes Telemetry with AWS Observability Accelerator
AWS recently created a new template within the AWS Observability Accelerator project that provides an integrated telemetry solution for Elastic Kubernetes Service (EKS) workloads.
-
Twitter Open Sources Its Telemetry Tool Rezolus for Detection of Short-Lived Anomalies
Twitter Engineering open-sourced its telemetry tool, Rezolus, which can detect anomalies in system performance metrics by sampling them at a higher rate.
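A small Python sketch shows why a higher sampling rate matters for short-lived anomalies. It does not use Rezolus or reflect how Rezolus works internally; the data is synthetic, and the helper name, the 90% threshold, and the window sizes are assumptions chosen for illustration.

```python
def downsample_mean(samples, window):
    """Average consecutive groups of `window` samples (hypothetical helper)."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples), window)]


# Synthetic data: 60 one-second CPU utilization readings, idle at 10%
# with a two-second spike to 100%.
per_second = [10.0] * 60
per_second[30] = per_second[31] = 100.0

# The same minute as a once-per-minute collector would report it.
per_minute = downsample_mean(per_second, 60)

threshold = 90.0  # assumed alerting threshold
fine_alerts = [i for i, v in enumerate(per_second) if v > threshold]
coarse_alerts = [i for i, v in enumerate(per_minute) if v > threshold]
```

Here the per-second series crosses the threshold at seconds 30 and 31, while the one-minute average is 13% and never triggers: the spike is averaged away at the coarser resolution.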
-
Surviving Success
Teams rarely consider success as a mode of failure, but not preparing for exceeding their goals can be just as dangerous as ignoring basic software and infrastructure needs. Mark Simms and Mark Souza discuss anti-patterns they've seen and some of the best ways to architect to win in spite of your own success.