High Throughput Stream Processing with ACID Guarantees
Summary
Terence Yim from Continuuity showcases a transactional stream processing system that supports full ACID properties without compromising scalability and high throughput.
Bio
Terence Yim has been an Apache Twill committer since its incubation. He is a Software Engineer at Continuuity, responsible for designing and building real-time processing systems on Hadoop/HBase.
About the conference
Software is Changing the World. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.
Community comments
I was disappointed by this talk...
by Edward Sargisson
The speaker describes a system where they built a Multi-Version Concurrency Control layer on top of Hive and then implemented transactional queueing on top of that.
They then claimed they could do a whole 100,000 events/s on 8 nodes.
What is remarkable is that there was no comparison with simply doing this on Postgres. While I don't have comparable figures for Postgres, I would expect that throughput from 1 or 2 nodes. Secondly, the speaker didn't seem to realize that queues are generally either full or empty, and rarely anywhere in between. This means that naive implementations suffer in performance because they contend on a lock protecting the start or the end of the queue.
There was also no comparison with Apache Flume, which also does transactional, durable transport (but not replicated).
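To illustrate the lock contention point raised in the comment, here is a minimal Python sketch of a queue that guards its head and its tail with separate locks, so that producers and consumers do not serialize on a single lock. It is not taken from the talk or from the comment, and it is not how Continuuity, Postgres, or Flume implement queueing. The class name `TwoLockQueue` and its methods are hypothetical, chosen only for this example.

```python
import threading


class _Node:
    """Singly linked list node used internally by the queue."""
    __slots__ = ("value", "next")

    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    """Illustrative FIFO queue with separate head and tail locks.

    A permanent dummy node sits at the head, so enqueue only touches
    the tail pointer and dequeue only touches the head pointer.
    """

    def __init__(self):
        dummy = _Node()
        self._head = dummy                     # always points at the dummy node
        self._tail = dummy                     # last node in the list
        self._head_lock = threading.Lock()     # taken by consumers only
        self._tail_lock = threading.Lock()     # taken by producers only

    def enqueue(self, value):
        node = _Node(value)
        with self._tail_lock:
            self._tail.next = node             # link the new node after the tail
            self._tail = node                  # advance the tail pointer

    def dequeue(self):
        """Return the oldest value, or None if the queue is empty.

        Assumption for this sketch: None is never enqueued as a value.
        """
        with self._head_lock:
            first = self._head.next
            if first is None:
                return None                    # queue is empty
            value = first.value
            first.value = None                 # drop the reference; node becomes the new dummy
            self._head = first
            return value


if __name__ == "__main__":
    q = TwoLockQueue()
    for i in range(3):
        q.enqueue(i)
    print([q.dequeue() for _ in range(4)])
```

With a single lock, every producer and every consumer would wait on the same mutex. Splitting it means enqueuers only contend with other enqueuers and dequeuers only with other dequeuers, which matters most in the nearly-empty case the comment describes.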