
Grafana Rearchitects Loki with Kafka and Ships a CLI to Bring Observability Into Coding Agents


At GrafanaCON 2026 in Barcelona, Grafana Labs announced Grafana 13, a new Kafka-backed ingestion architecture for Loki, and AI Observability in Grafana Cloud for monitoring and evaluating AI systems in real time. The company also introduced GCX, a new CLI designed to surface Grafana Cloud data inside agentic development environments.

The traditional Loki architecture achieves high availability through replication: every incoming log line is sent to three ingesters, for a replication factor of three. Straightforward enough on paper. The catch is that deduplication relies on file naming: if ingesters cover the same time range, they should produce identical file names, and those duplicates collapse in object storage.

Previous and current Loki architecture

Trevor Whitney, Staff Software Engineer at Grafana Labs, explained the mechanics during a briefing at GrafanaCON:

In a distributed system, the ingesters drift a bit, and any amount of drift in the time syncing of the ingesters results in those files not getting deduped by file name. Our internal metrics show that in reality, we end up storing on average 2.3x, for every log line that we ingest, we store it 2.3 times.

That 2.3x multiplier isn't an abstraction. It shows up on every line item: CPU at ingestion, memory pressure, network costs, object storage bills, and then again at query time when duplicates have to be reconciled on the fly.
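Whitney's point about drift can be illustrated with a toy model (the naming scheme below is hypothetical, invented for illustration, not Loki's actual chunk format): if a flushed file's name is derived from the stream and the time range it covers, perfectly synchronized ingesters produce identical names that collapse in object storage, while a few seconds of drift yields distinct names and duplicated data.

```python
# Toy model of filename-based deduplication across replicated ingesters.
# The naming scheme here is hypothetical, not Loki's real chunk format.

def chunk_name(stream: str, start: int, end: int) -> str:
    """Derive a file name from the stream and the time range it covers."""
    return f"{stream}-{start}-{end}"

def stored_files(clock_offsets: list[int]) -> set[str]:
    """Each replica flushes the same hour of logs; object storage
    keeps one copy per distinct file name."""
    base_start, base_end = 1_700_000_000, 1_700_003_600
    return {
        chunk_name("app-logs", base_start + off, base_end + off)
        for off in clock_offsets
    }

# Perfectly synced clocks: three replicas collapse to one stored copy.
print(len(stored_files([0, 0, 0])))  # 1
# A few seconds of drift: three distinct names, three stored copies.
print(len(stored_files([0, 2, 5])))  # 3
```

With zero drift the three replicas' files deduplicate to one object; any offset breaks the name match, which is why the observed storage multiplier lands between 1x and 3x rather than at either extreme.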

The new architecture replaces the replication-at-ingestion strategy with Kafka as the durability layer. Logs land in Kafka once, ingesters consume from the queue, and the effective replication factor drops to one. Combined with a redesigned query engine that distributes work across partitions and executes in parallel, Grafana claims up to 20x less data scanned and 10x faster performance on aggregated queries.
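The shift can be sketched in a simplified model (illustrative only, not Grafana's implementation): in the old path every log line is written by each of three ingesters, while in the Kafka-backed path the line is appended once to a partitioned log and a single consumer owns each partition.

```python
# Simplified comparison of the two durability strategies
# (illustrative model only, not Loki's actual code paths).
from collections import defaultdict

def ingest_with_replication(lines, replication_factor=3):
    """Old path: the distributor sends every line to N ingesters,
    each of which writes its own copy toward storage."""
    storage = []
    for line in lines:
        for replica in range(replication_factor):
            storage.append((replica, line))
    return storage

def ingest_with_kafka(lines, partitions=4):
    """New path: each line is appended once to a Kafka-style partition;
    one consumer owns each partition, so effective replication is 1."""
    topic = defaultdict(list)
    for i, line in enumerate(lines):
        topic[i % partitions].append(line)  # stand-in for hash -> partition
    # Ingesters consume their partitions and write each line exactly once.
    return [line for part in topic.values() for line in part]

lines = [f"log-{i}" for i in range(1000)]
print(len(ingest_with_replication(lines)))  # 3000 copies written
print(len(ingest_with_kafka(lines)))        # 1000 copies written
```

In this model durability moves from "three independent writes" to "one write into a replicated queue", which is what lets the effective replication factor at the ingester layer drop to one.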

There's a trade-off worth naming. Loki's original design principle was minimal dependencies: object storage and nothing else. The new architecture breaks that. Whitney acknowledged it directly:

Up until now, our only dependency has been object storage, and that's kind of been a goal of the project from the beginning. So yes, this does introduce a second dependency. You will now have object storage and Kafka for any distributed installation of Loki.

Single-binary deployments won't be affected; a local setup or home lab has no replication to orchestrate, so it runs fine with just file system or object storage. But anyone running Loki at scale needs to factor Kafka into their operational surface.

During GrafanaCON, GCX, a new agent-aware CLI for integrating observability into AI-driven workflows, was launched in public preview. The premise is simple: many engineers now spend most of their day inside Claude Code, Cursor, or GitHub Copilot, and when something breaks in production the workflow forces a context switch: out to Grafana, through dashboards, back to the editor, and then back to Grafana to verify the fix. GCX is designed to collapse that loop.

Ward Bekker, who led the GCX work, explained the rationale for a CLI tool during a live demo:

CLIs were never out of fashion, but they're definitely more in fashion now, especially because of the agentic coding tools. A lot of folks notice that if you're using CLIs on the command line in combination with Cursor or Claude Code, it's extremely effective.

Bekker walked through a representative scenario: a synthetic monitoring check detects failures on an e-commerce order flow; Grafana Assistant runs automated root cause analysis; GCX pulls that analysis into Claude Code alongside the relevant source files; Claude Code proposes and applies a fix; GCX then queries the synthetic monitoring metrics directly to confirm recovery. No browser tab required.

Grafana Labs is not betting on a single integration model. The team is shipping GCX as a CLI while also developing a remote MCP server in parallel, on the view that the two serve different audiences and use cases.

These announcements sit alongside Grafana 13, which ships dynamic dashboards as generally available, adds Git-based workflow support, and expands the data source ecosystem to over 170 integrations. Grafana Labs also launched an AI Observability product in public preview for teams monitoring LLM-powered applications in production.

Grafana 13 and the Loki updates are available now. GCX is in public preview. The AI Observability solution is also in public preview in Grafana Cloud.
