Ingest & Stream Processing - What Will You Choose?
Summary
Pat Patterson and Ted Malaska talk about current and emerging technologies. They evaluate each one and explain how it helps solve problems related to large-scale data processing and to joining and combining streams. They also talk about the various ways of achieving "at least once" and "exactly once" processing, and about how to make sure that data is processed in a timely fashion.
Bio
Pat Patterson has been working with Internet technologies since 1997, building software and working with communities at Sun Microsystems, Huawei, and Salesforce; he is now a community champion at StreamSets. Ted Malaska is a solutions architect at Cloudera. He has 18 years of professional experience working for start-ups, the US government, banks, commercial firms, bio firms, and retail firms.
About the conference
Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.
Community comments
Kafka streams limitations and future.
by Praveena Manvi,
Excellent introduction to the streaming solutions landscape.
It would have been nice if the comments on Kafka Streams had been elaborated.
- Kafka Streams is Samza 2.0 (a refactored copy-paste of old Samza code that carries its pros and cons)
- Kafka Streams has multi-tenancy support limitations
- Kafka Streams has scaling limitations compared to others, since it runs as a library in client JVM machines (although it can run in any container)
It would be great if anyone from Confluent could comment on this. With Kafka Streams being the de facto choice in most applications, it could be the most tempting solution (with <9K code), rather than introducing a Spark/Flink/Storm-based one, simply because of the deployment complexity those bring in.