
Pull Queries and Connector Management Added to ksqlDB (KSQL) Event Streaming Database for Kafka

The new release of KSQL, an event streaming database for Apache Kafka, includes pull queries, which allow data to be read at a specific point in time using SQL syntax, and connector management, which enables direct control and execution of connectors built to work with Kafka Connect. Because the Confluent team behind KSQL considers this an especially significant release, they have decided to rename the tool to ksqlDB. The goal is to provide one mental model for everything needed to build a complete streaming app using ksqlDB, which in turn depends only on Kafka.

ksqlDB is used for continuously transforming streams of data. It reads messages from Kafka topics and can filter, process, and react to these messages and create new derived topics. Materialized views can then be created over these topics and are perpetually kept up to date as new events arrive. Until now, ksqlDB has only supported continuous queries, called push queries because they push out a continuous stream of changes. With the new release, ksqlDB can also read the current state of a materialized view using pull queries. These new queries can run with predictably low latency because the materialized views are updated incrementally as soon as new messages arrive.
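The difference between the two query styles can be illustrated with a small Python sketch. This is only a conceptual model, not ksqlDB code or its API: the class `MaterializedCount` and its methods `subscribe`, `on_event`, and `lookup` are hypothetical names made up for this illustration, and the view is assumed to be a simple per-key event count.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class MaterializedCount:
    """Toy materialized view: a running count of events per key (hypothetical)."""

    def __init__(self) -> None:
        self._state: Dict[str, int] = defaultdict(int)
        self._subscribers: List[Callable[[str, int], None]] = []

    def subscribe(self, callback: Callable[[str, int], None]) -> None:
        """Push query: the callback receives every change as it happens."""
        self._subscribers.append(callback)

    def on_event(self, key: str) -> None:
        """Apply one incoming event incrementally, then push the change out."""
        self._state[key] += 1
        for callback in self._subscribers:
            callback(key, self._state[key])

    def lookup(self, key: str) -> int:
        """Pull query: return the current value for a key and finish."""
        return self._state.get(key, 0)


view = MaterializedCount()
changes: List[Tuple[str, int]] = []
view.subscribe(lambda key, count: changes.append((key, count)))

for user in ["alice", "bob", "alice"]:
    view.on_event(user)

print(changes)               # [('alice', 1), ('bob', 1), ('alice', 2)]
print(view.lookup("alice"))  # 2
```

Because each event updates the stored count at once, the lookup is a plain key read and does not have to rescan earlier events, which is the intuition behind the low-latency claim.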

With the new connector management and its built-in support for a range of connectors, it’s now possible to control and execute these connectors directly from ksqlDB, instead of assembling separate solutions from Kafka, Connect, and KSQL to connect to external data sources. The motive for this feature is the development team's belief that building applications around event streams has been too complex. They want to achieve the same simplicity as when building applications using relational databases.

In a blog post, Jay Kreps, CEO of Confluent and one of the co-creators of Apache Kafka, describes the new features in more detail and explains why they are important for making event streaming a mainstream approach to application development. He thinks that providing a database built for streaming is part of achieving that. He sees events as a first-class citizen in a modern development stack and believes that building an event streaming application must be as simple as building a REST service or CRUD application.

Internally, the ksqlDB architecture is based on a distributed commit log that is used to synchronize data across nodes. To manage state, RocksDB is used to provide local queryable storage on disk. The commit log is used to apply state updates in sequence and to provide failover across instances for high availability.
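A minimal Python sketch can show how a shared log and local state fit together. It is a simplified model under stated assumptions, not ksqlDB's implementation: `CommitLog` and `Node` are hypothetical names, an in-memory list stands in for the distributed commit log, and a plain dictionary stands in for the RocksDB store.

```python
from typing import Dict, List, Tuple


class CommitLog:
    """Append-only, ordered record list; stands in for the distributed log."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, int]] = []

    def append(self, key: str, value: int) -> int:
        self.records.append((key, value))
        return len(self.records) - 1  # offset of the new record


class Node:
    """Holds local state (a dict standing in for RocksDB) built from the log."""

    def __init__(self, log: CommitLog) -> None:
        self.log = log
        self.state: Dict[str, int] = {}
        self.next_offset = 0

    def catch_up(self) -> None:
        """Apply all unseen records in log order to the local state."""
        while self.next_offset < len(self.log.records):
            key, value = self.log.records[self.next_offset]
            self.state[key] = value
            self.next_offset += 1


log = CommitLog()
primary = Node(log)
log.append("sensor-1", 20)
log.append("sensor-2", 31)
log.append("sensor-1", 22)
primary.catch_up()

# Failover: a fresh node rebuilds the same state by replaying the log.
replacement = Node(log)
replacement.catch_up()

print(replacement.state == primary.state)  # True
print(replacement.state)                   # {'sensor-1': 22, 'sensor-2': 31}
```

Since every node applies the same records in the same order, a replacement instance arrives at the same state as the one it takes over from.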

Kreps refers to a blog post by Martin Kleppmann, Turning the database inside-out with Apache Samza, and notes that Kafka provides the foundational ingredients for this kind of event-oriented model. He also points out that the new functionality in ksqlDB can make this architectural pattern easier to implement.

In a short video, Tim Berglund from Confluent runs a demo describing old and new features of ksqlDB.

ksqlDB is owned and maintained by Confluent Inc. as part of its Confluent Platform and is licensed under the Confluent Community License. The code is available on GitHub.
