Streaming Content on InfoQ

Articles

AI, ML & Data Engineering

Big Data Processing with Apache Spark - Part 3: Spark Streaming

In this article, the third installment of the Apache Spark series, author Srini Penchikala discusses the Apache Spark Streaming framework for processing real-time streaming data, using a log analytics sample application.

Srini Penchikala
on Jan 07, 2016
Storm Applied Review and Q&A with the Authors

Storm is a distributed, fault-tolerant, real-time computation system that was originally developed at BackType and later open sourced by Twitter. Storm Applied is a new book from Manning that aims to provide a practical guide on using Storm, both in a development and in a production setting. InfoQ has spoken with two of the book’s authors, Sean T. Allen and Matthew Jankowski.

Sergio De Simone
on Jul 27, 2015
AI, ML & Data Engineering

Big Data Processing with Apache Spark - Part 2: Spark SQL

Spark SQL, part of the Apache Spark big data framework, is used for structured data processing and allows running SQL-like queries on Spark data. In this article, Srini Penchikala discusses the Spark SQL module and how it simplifies running data analytics through a SQL interface. He also talks about the new features in Spark SQL, such as DataFrames and JDBC data sources.

Srini Penchikala
on Apr 16, 2015
AI, ML & Data Engineering

Big Data Processing with Apache Spark – Part 1: Introduction

Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics. In this article, Srini Penchikala talks about how the Apache Spark framework helps with big data processing and analytics through its standard API. He also discusses how Spark compares with traditional MapReduce implementations such as Apache Hadoop.

Srini Penchikala
on Jan 30, 2015
AI, ML & Data Engineering

Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse

This article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from.

Kai Wähner
on Sep 10, 2014
Architecture & Design

Using SEDA to Ensure Service Availability

This article presents a new strategy for incorporating event-driven architecture to improve the scalability and availability of services in the context of SOA. The strategy is based on queuing research pioneered for highly available and scalable services, initially in the Web context but now moving into the SOA and Web services context. An actual implementation is described in the context of Mule.

Rune Peter Bjornstad, Rune Schumann
on Oct 11, 2006


