Real-Time Data Management with Apache Spark

While Apache Spark has supplanted MapReduce for large-scale processing of streaming data, no comparable solution for real-time data management has emerged. Spark doesn’t have features for managing data or processing state. As a result, developers using Spark often write extensive code to ingest, prepare, enrich, store, and manage data and state.

GridGain and Ignite provide the ideal underlying in-memory data management technology for Apache Spark because they support both stored “data at rest” and streaming “data in motion” in memory. Learn how this simplifies many Spark tasks, including stream ingestion, data preparation and storage, stream processing, state management, streaming analytics, and machine learning and deep learning.
