QCon London 2026: Wrangling Telemetry at Scale, a Guide to Self-Hosted Observability

At QCon London 2026, Colin Douch discussed building and operating self-hosted monitoring stacks, surveyed the current tooling landscape, and explained how to build a coherent observability setup rather than treating logs, metrics, and traces as separate pillars.

Colin Douch at QCon London 2026

Douch, site reliability engineer at DuckDuckGo and formerly observability tech lead at Cloudflare, opened the session by challenging the audience:

"Do you ever feel the complexity demon creeping up to you? We've all developed hacks to try and escape it, but it's always there. Sometimes you have to confront it, and when you do, you need observability."

Highlighting a common paradox in modern observability, Douch explained that while observability tooling is meant to simplify debugging complex systems, the observability stack itself often becomes equally complex. While many organizations outsource the problem to SaaS vendors, the session focused on the realities of running observability infrastructure internally and what teams should understand before committing to it. Douch warned:

"Should I run my own Observability stack? No, at least not until you have exhausted each and every other option."

After stressing the challenges of self-hosted observability ("you need at least an extra 2-3 full-time engineers and significant money"), Douch outlined the typical components of a self-hosted observability stack.

For metrics, Douch suggested Prometheus, his preferred choice despite its horizontal scaling challenges, or VictoriaMetrics. He also highlighted that exemplars are an underused feature of modern metrics systems.
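To make the idea of exemplars concrete, the following is a minimal, hand-rolled sketch rather than anything shown in the talk. It assumes a hypothetical `CounterWithExemplar` class and a `trace_id` label, and it renders a single sample line in the style of the OpenMetrics text format, where an exemplar follows the sample value after a `#`. Real client libraries handle this for you.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exemplar:
    trace_id: str  # hypothetical label pointing at one concrete trace
    value: float   # the increment that produced this exemplar


class CounterWithExemplar:
    """Toy counter that remembers the most recent exemplar (illustrative only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.total = 0.0
        self.exemplar: Optional[Exemplar] = None

    def inc(self, amount: float = 1.0, trace_id: Optional[str] = None) -> None:
        # Always update the aggregate; keep a trace reference when one is given.
        self.total += amount
        if trace_id is not None:
            self.exemplar = Exemplar(trace_id, amount)

    def render(self) -> str:
        # OpenMetrics-style sample line: name, value, then "# {labels} value".
        line = f"{self.name}_total {self.total}"
        if self.exemplar is not None:
            line += f' # {{trace_id="{self.exemplar.trace_id}"}} {self.exemplar.value}'
        return line


requests = CounterWithExemplar("http_requests")
requests.inc()
requests.inc(trace_id="abc123")
print(requests.render())
```

The aggregate stays cheap to store, while the exemplar gives a direct pointer from a suspicious metric to one trace worth inspecting.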

He stressed the importance of structuring logs and storing them in a columnar database, suggesting VictoriaLogs or Loki, given their different approaches and design philosophies. While developers could cut out the middleware and ingest logs directly into the database, Douch advised against it unless such a database is already running. He also warned:

"Sprinkling in logs leads to a soup of unusable data that makes it nigh on impossible to solve problems."
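As an illustration of what structured logging can look like, here is a short sketch using only Python's standard `logging` and `json` modules. The `JsonFormatter` class, the `checkout` logger name, and the `fields` convention for passing structured data are assumptions made for this example, not something prescribed in the talk.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge structured fields passed via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("payment failed", extra={"fields": {"order_id": 1234, "reason": "card_declined"}})
print(buffer.getvalue(), end="")
```

Because every event carries named fields instead of free text, a columnar store can filter and aggregate on `order_id` or `reason` without parsing message strings.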

In practice, self-hosted observability often consists of loosely coupled systems built around projects such as metrics collectors, distributed tracing frameworks, and log aggregation tools. While this modular ecosystem provides flexibility, it also introduces operational overhead.

Douch also reviewed the current ecosystem of open-source tooling commonly used to assemble such stacks. Noting that "traces are just a fancy name for logs with some pre-agreed structure," he endorsed OpenTelemetry for traces, arguing the complexity is worth it, but advised against using it for metrics or logs, recommending the Prometheus text exposition format for metrics and JSON for logs instead.
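For readers unfamiliar with the Prometheus text exposition format, the sketch below renders a counter by hand to show how little structure it involves. The `render_counter` helper and the metric and label names are hypothetical and chosen for illustration; in practice a Prometheus client library would produce this output.

```python
from typing import Dict, Tuple

# A label set is an ordered tuple of (name, value) pairs.
Labels = Tuple[Tuple[str, str], ...]


def escape_label_value(value: str) -> str:
    # The text format escapes backslash, double quote, and newline in label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name: str, help_text: str, samples: Dict[Labels, float]) -> str:
    """Render one counter family as HELP, TYPE, and one line per label set."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels)
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


output = render_counter(
    "http_requests_total",
    "Total HTTP requests.",
    {(("method", "GET"), ("status", "200")): 1027},
)
print(output, end="")
```

The format is plain text, line-oriented, and easy to inspect with standard tools, which is part of its appeal for scraping over HTTP.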

Douch then discussed collectors and sampling, including the advantages and disadvantages of head and tail sampling. Rather than treating logs, metrics, and traces as independent data silos, he argued that the value of observability comes from connecting these signals. In his view, the three pillars overlap significantly: logs are a subset of traces, and metrics are aggregations over the same underlying data.
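The difference between head and tail sampling can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the approach presented in the talk: the `Span` type, the hash-based decision, and the `slow_ms` and `baseline_rate` parameters are all hypothetical. Head sampling decides from the trace ID alone before any work is observed, while tail sampling waits for the complete trace and can therefore keep errors and slow requests.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    trace_id: str
    duration_ms: float
    error: bool = False


def head_sample(trace_id: str, rate: float) -> bool:
    """Decide up front, using only the trace ID.

    Hashing the ID makes the decision deterministic, so every service
    that sees the same trace ID reaches the same verdict.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(rate * 2**64)


def tail_sample(spans: List[Span], slow_ms: float, baseline_rate: float) -> bool:
    """Decide after the whole trace has been collected."""
    if not spans:
        return False
    if any(span.error for span in spans):
        return True  # always keep failed traces
    if any(span.duration_ms >= slow_ms for span in spans):
        return True  # always keep slow traces
    # Otherwise fall back to a small deterministic baseline sample.
    return head_sample(spans[0].trace_id, baseline_rate)


trace = [Span("trace-1", 12.0), Span("trace-1", 950.0)]
print(tail_sample(trace, slow_ms=500.0, baseline_rate=0.01))
```

The trade-off is visible in the signatures: head sampling needs no buffering but is blind to outcomes, whereas tail sampling must hold every span of a trace until it completes.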

Throughout the talk, Douch emphasised that building an observability platform is less about selecting a single tool and more about designing a coherent telemetry pipeline.

About the Author

Renato Losio
