InfoQ

Catching up with Esper: Event Stream Processing Framework

Posted by Floyd Marinescu, Thomas Bernhardt on Oct 12, 2007 08:40 AM

Community: Java
Topics: Messaging
Tags: Event Stream Processing, Esper

Esper (whose version 1.0 was announced on InfoQ more than a year ago) is an event stream processing (ESP) and event correlation (complex event processing, CEP) engine that triggers actions when event conditions occur among event streams. It can be thought of as a database turned upside down: statements are registered, and data streams flow through them. Event processing is a growing trend in the software industry, and several vendors have entered the market following a number of startups. Common use cases range from algorithmic trading, BAM, RFID, advanced monitoring systems and fraud detection to uses directly related to SOA. InfoQ caught up with Thomas Bernhardt and Alexandre Vasseur on recent developments with the project.
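
The "database turned upside down" idea can be illustrated with a minimal Java sketch. It is only an illustration of the concept, not the Esper API: the class name `StandingQueryEngine` and its `register` and `send` methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names come from Esper.
final class StandingQueryEngine<E> {
    // Registered statements: each one bundles a condition with the action it triggers.
    private final List<Consumer<E>> statements = new ArrayList<>();

    // Statements are stored up front, the way rows would be stored in a database.
    void register(Predicate<E> condition, Consumer<E> action) {
        statements.add(event -> {
            if (condition.test(event)) {
                action.accept(event);
            }
        });
    }

    // Each event flows past every stored statement; the engine does not keep it.
    void send(E event) {
        for (Consumer<E> statement : statements) {
            statement.accept(event);
        }
    }

    public static void main(String[] args) {
        StandingQueryEngine<Integer> engine = new StandingQueryEngine<>();
        List<Integer> alerts = new ArrayList<>();
        engine.register(price -> price > 100, alerts::add);
        engine.send(50);
        engine.send(150);
        engine.send(101);
        System.out.println(alerts); // prints [150, 101]
    }
}
```

A real engine adds windows, joins and patterns on top of this, but the inversion is the same: the queries are long-lived and the data is transient.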

According to the Esper team, Esper is currently the only pure Java open source ESP/CEP engine that is also commercially supported, by a company named EsperTech, which also maintains a .NET implementation.

Esper was licensed to BEA and modified for use in their WebLogic Event Server, which launched in June.  In light of some mixed reactions, Thomas commented to InfoQ:
I think the fact that Esper plays a role in BEA's product helps the Esper project in a couple of ways. First, feedback gained is incorporated back into an improved Esper. Second, the BEA product raises overall awareness of CEP/ESP technology hugely and thus enlarges the mindshare and market. Third, it's a great testimony to how open, extensible, and enterprise-grade Esper technology is. The Esper community and user base is really proud of that relationship.
With the growth of this market space and the presence of multiple competing implementations, standardization could be beneficial. Thomas commented on the potential for, and background of, CEP language standardization:
The CEP community clearly sees CEP and ESP as complementary, and recognizes that other approaches (e.g. Bayesian or neural networks) also apply to CEP problems. In light of the various approaches, and vendors not agreeing, the most relevant standard appears to be emerging from the work of the ANSI SQL standardization committee on extending SQL to provide "pattern matching in sequence of rows".

There will certainly be further work on this early-stage topic, and standardization will likely go beyond ESP/CEP language standardization.
Perhaps the most notable recent news related to Esper is the publication of a performance benchmark kit and results in mid-August:
Esper exceeds 500,000 events/s on dual-CPU 2 GHz Intel-based hardware, with average engine latency below 3 microseconds (below 10 µs with more than 99% predictability) on a VWAP benchmark with 1,000 statements registered in the system; this tops out at 70 Mbit/s at 85% CPU usage.
Although it is based on a rather simple use case, the publication of this benchmark is aimed at shaking up the industry, as it comes with a complete kit for replaying the benchmark. An Esper event server listens to remote clients sending stock market events over the network. The Esper engine is configured to compute the volume-weighted average price (VWAP) of the feeds in real time over a sliding window of time or events.
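
To make the benchmark's computation concrete, here is a minimal plain-Java sketch of a VWAP over a sliding window of the last N events. It is not the benchmark kit's code and does not use the Esper API; the class name `SlidingVwap`, the `onTick` method and the window size are assumptions made for this illustration, and a time-based window would need timestamps in place of the event count.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: VWAP over the last `capacity` ticks of one symbol.
final class SlidingVwap {
    private static final class Tick {
        final double price;
        final long volume;

        Tick(double price, long volume) {
            this.price = price;
            this.volume = volume;
        }
    }

    private final int capacity;
    private final Deque<Tick> window = new ArrayDeque<>();
    private double priceVolumeSum; // running sum of price * volume inside the window
    private long volumeSum;        // running sum of volume inside the window

    SlidingVwap(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds a tick, evicts the oldest one if the window is full, returns the current VWAP.
    double onTick(double price, long volume) {
        window.addLast(new Tick(price, volume));
        priceVolumeSum += price * volume;
        volumeSum += volume;
        if (window.size() > capacity) {
            Tick oldest = window.removeFirst();
            priceVolumeSum -= oldest.price * oldest.volume;
            volumeSum -= oldest.volume;
        }
        // Running sums keep each update O(1); a long-running system may want to
        // recompute them periodically to limit floating-point drift.
        return volumeSum == 0 ? Double.NaN : priceVolumeSum / volumeSum;
    }

    public static void main(String[] args) {
        SlidingVwap vwap = new SlidingVwap(2);
        System.out.println(vwap.onTick(10.0, 100));
        System.out.println(vwap.onTick(20.0, 100));
        System.out.println(vwap.onTick(30.0, 200)); // the first tick has left the window
    }
}
```

Keeping running sums rather than rescanning the window on every event is one common way to keep per-event cost constant, which matters at the event rates discussed here.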

Asked about the need for such a benchmark, the Esper team responded:
The CEP market has been a common place for vague information regarding performance and latency, with every vendor throwing its figures at the press without any details at all. No comparative benchmarks exist in this area yet.

Vague performance information in this industry had already been criticized by Progress Apama and others. Here is a compilation from the Apama blog:
* Skyler manages rates as high as 200,000 messages/second
* Key feature: Coral8 handles thousands to millions of events per second
* StreamBase extends performance leadership by processing over one million events per second with near zero latency
* Aleri Labs breaks sub-millisecond latency barrier
Apama itself claims to be "a high-performance, scalable processing engine that can process thousands of events per second". Similar claims can be found in BEA's wording of its WebLogic Event Server announcement, with lower yet more precise figures: "As we come out of the gate, we're going to provide 50,000 complex events per second".

Those results seem to confirm that "hundreds of thousands" of events per second is common and not exceptional in this area, and also show exactly how Esper performs on the given scenarios. They also give the user community valuable material to better assess performance instead of listening to the random vendor FUD commonly thrown at disruptive yet affordable open source software.
The Esper team has also published the details of all its runs on its wiki and updated its product website with a performance section and a performance best practices section. Another source of benchmarks may be the newly formed STAC benchmark council, which aims to put out customer-driven benchmark standards for trading technology.

See also InfoQ's previous coverage for good background material on Esper and CEP at: http://infoq.com/esper.
  1. kdb+

    Oct 15, 2007 6:40 AM by James Richardson

    No comparison to kdb+ here, which is pretty quick and has been applying vector processing to stock prices (among other things) for ages, e.g.:

        select amount wavg price by date from trade where stock=`BT.L

    Check out www.kdb.com/q for more info. I have no connection with kdb or 1st derivatives.

  2. Re: kdb+

    Oct 16, 2007 11:24 AM by Alex Vasseur

    I believe kdb+ (at kx.com, by the way) is primarily a time series database. As such it does not have features for causality (the "happened before" relation between events), and I expect it to be pretty hard to deal with "absence of events". By contrast, Esper features both event streams (dealing with sliding windows over real-time data streams) and complex events ("happened before" and other time guard constructs). The VWAP benchmark that illustrates Esper performance is certainly a simple enough use case to compare Esper with a time series database, but it is far from giving a complete overview of what can be achieved with Esper's event processing and continuous query capabilities. That said, kdb+ is an interesting piece of software, and I believe it should be quite feasible to integrate Esper with KX kdb+ for, e.g., event replay capabilities. Alex

  3. simple implementation

    Nov 18, 2007 3:29 PM by James Richardson

    Unrelated to the kdb+ question above: I had cause to use Esper recently to measure application performance. On a current SOA project I was able to determine application latency and throughput by matching the correlation-ids being output by the various systems. Adding a performance monitor with Esper took about three hours from concept to finished implementation. Good job, guys.

  4. Re: simple implementation

    Nov 18, 2007 3:35 PM by James Richardson

    Sorry, I forgot to mention: it did not require a single application change.

  5. no Esper at Gartner CEP event ...

    Sep 17, 2008 3:19 PM by Jim Falgout

    Just got back from the Gartner CEP event and did not hear about Esper there (not a big surprise, as there was not much talk about open source in general). This article is dated, so I'm wondering what is going on with Esper. Has it gained any momentum? As far as benchmarks go, it seems that with event processing, low latency and messages per second are stressed to the extreme. Performance in the microsecond range may be the cost of entry to play in the algorithmic trading space, but there are many, many more use cases for event processing where the events are measured in hundreds or thousands, not millions, per second. At the show, even many trading companies mentioned that performance was not their main criterion for picking a vendor ...
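
The correlation-id matching described in the "simple implementation" comment above can be sketched in plain Java. This is not the commenter's implementation and not the Esper API, where such matching would typically be written as a registered statement. The class `LatencyMatcher`, its method names and the millisecond timestamps are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch: pair "request" and "response" events by correlation id.
final class LatencyMatcher {
    // Requests that have been seen but not yet answered, keyed by correlation id.
    private final Map<String, Long> pending = new HashMap<>();
    private long matched;
    private long totalLatencyMillis;

    void onRequest(String correlationId, long timestampMillis) {
        pending.put(correlationId, timestampMillis);
    }

    // Returns the latency if a matching request was seen, otherwise empty.
    OptionalLong onResponse(String correlationId, long timestampMillis) {
        Long start = pending.remove(correlationId);
        if (start == null) {
            return OptionalLong.empty();
        }
        long latency = timestampMillis - start;
        matched++;
        totalLatencyMillis += latency;
        return OptionalLong.of(latency);
    }

    double averageLatencyMillis() {
        return matched == 0 ? Double.NaN : (double) totalLatencyMillis / matched;
    }

    // Requests still waiting for a response.
    int unanswered() {
        return pending.size();
    }

    public static void main(String[] args) {
        LatencyMatcher matcher = new LatencyMatcher();
        matcher.onRequest("a", 1_000);
        matcher.onRequest("b", 1_010);
        System.out.println(matcher.onResponse("a", 1_030)); // prints OptionalLong[30]
        System.out.println(matcher.onResponse("b", 1_060)); // prints OptionalLong[50]
        System.out.println(matcher.averageLatencyMillis()); // prints 40.0
    }
}
```

Because the matcher only reads events the systems already emit, it fits the commenter's point that no application change was needed. A production version would also expire pending entries that never receive a response.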
