Denis Magda on Continuous Deep Learning with Apache Ignite

At the recent ApacheCon North America, Denis Magda spoke on continuous machine learning with Apache Ignite, an in-memory data grid. Ignite simplifies the machine-learning pipeline by training and hosting models in the same cluster that stores the data, and it can perform "online" training to incrementally improve models as new data becomes available.

Magda, vice-president of product management at GridGain, began by describing some of the pain points of machine learning on large datasets, in particular the latency involved in moving data across the network from its storage location to the processors that perform training. Models also have to be deployed into a production system after they are trained, and retrained periodically after new data is collected. Because Ignite runs code on the same computers that host data, it can train, deploy, and update a machine-learning model without a time-consuming extract-transform-load (ETL) step.

Ignite is a "memory-centric distributed database, caching, and processing platform" originally open-sourced by GridGain. The key features used in the machine-learning scenario are its data storage and compute grid, which together allow training to occur on the machines that host the data. With many other systems, such as Apache Spark, a dataset must be extracted from its home-of-record and loaded into the system before machine learning can begin, which can be a time-consuming process. Instead of moving the data to the training computers, Ignite moves the training computation to the data. Further, because Ignite partitions the data across many servers, the training can be run in parallel to complete faster.
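The idea of moving the computation to the data can be illustrated with a small, self-contained sketch. This is not Ignite's API: the names `PartialStats`, `local_stats`, and `fit_line` are hypothetical, and plain Python lists stand in for the data partitions held by cluster nodes. Each "node" computes a compact summary of its own rows, and only those summaries are combined to fit a one-variable linear regression.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[float, float]  # (feature x, label y)


@dataclass
class PartialStats:
    """Sufficient statistics for simple linear regression over one partition."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other: "PartialStats") -> "PartialStats":
        # Combining two summaries is just element-wise addition.
        return PartialStats(
            self.n + other.n,
            self.sx + other.sx,
            self.sy + other.sy,
            self.sxx + other.sxx,
            self.sxy + other.sxy,
        )


def local_stats(partition: List[Row]) -> PartialStats:
    """Runs where the partition lives; the raw rows never leave the 'node'."""
    stats = PartialStats()
    for x, y in partition:
        stats.n += 1
        stats.sx += x
        stats.sy += y
        stats.sxx += x * x
        stats.sxy += x * y
    return stats


def fit_line(partitions: List[List[Row]]) -> Tuple[float, float]:
    """Merge the per-partition summaries and solve for slope and intercept."""
    total = PartialStats()
    for partition in partitions:  # in a real cluster these would run in parallel
        total = total.merge(local_stats(partition))
    denom = total.n * total.sxx - total.sx ** 2
    slope = (total.n * total.sxy - total.sx * total.sy) / denom
    intercept = (total.sy - slope * total.sx) / total.n
    return slope, intercept


# Three hypothetical partitions of points that lie on y = 2x + 1.
partitions = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0)], [(3.0, 7.0), (4.0, 9.0)]]
slope, intercept = fit_line(partitions)
```

Only five numbers per partition cross the "network" in this sketch, regardless of how many rows each partition holds, which is the property that makes co-located training attractive for large datasets.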

Ignite provides many common ML algorithms, including linear regression, k-means clustering, decision trees, and support vector machines (SVM). Ignite also includes a "vanilla" multi-layer perceptron implementation, but for most deep-learning tasks, developers will likely opt to use Ignite's TensorFlow integration. Ignite supports distributed training with TensorFlow "based upon the standalone client mode of distributed multi-worker training." In addition, Ignite's resilient architecture monitors for and restarts unhealthy cluster nodes, so that training is not interrupted by a machine failure.

Once the model is trained, Ignite can store the model, perform inference using the model, and retrain the model as new data is collected. Ignite ML models support an "update" interface that "provides relearning of an already trained model on a new portion of data using the state of the model trained earlier." This is known as "online learning," as the model is updated while it is being used (i.e., while it is "online"). Not all of Ignite's ML algorithms support this feature; for example, decision trees do not. Also, some algorithms require a batch of new data before updating; for example, k-means requires a batch of at least k samples.
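A minimal sketch of the online-learning pattern, assuming a simple linear model trained by stochastic gradient descent, may help make the "update" idea concrete. It does not use Ignite's interface: `LinearModel`, `update`, and the learning-rate and epoch values are hypothetical choices for illustration. The point is that retraining starts from the previously learned state rather than from scratch.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class LinearModel:
    """Immutable model state: y is predicted as weight * x + bias."""
    weight: float = 0.0
    bias: float = 0.0

    def predict(self, x: float) -> float:
        return self.weight * x + self.bias


def update(
    model: LinearModel,
    batch: Iterable[Tuple[float, float]],
    lr: float = 0.05,    # assumed learning rate, chosen for this toy data
    epochs: int = 200,   # assumed number of passes over the new batch
) -> LinearModel:
    """Return a new model refined on `batch`, starting from `model`'s state."""
    rows = list(batch)
    w, b = model.weight, model.bias
    for _ in range(epochs):
        for x, y in rows:
            err = (w * x + b) - y  # prediction error on this sample
            w -= lr * err * x      # gradient step for the weight
            b -= lr * err          # gradient step for the bias
    return LinearModel(w, b)


# Initial training on the first portion of data (points on y = 2x + 1).
model_v1 = update(LinearModel(), [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])

# Later, new data arrives; the earlier model is the starting point.
model_v2 = update(model_v1, [(3.0, 7.0), (4.0, 9.0)])
prediction = model_v2.predict(5.0)
```

Because `update` returns a new model object, the previous version can keep serving predictions until the refined one is ready to replace it.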

Magda concluded his talk with a list of upcoming features, including the ability to import models from Spark and XGBoost, as well as a full Python API for the ML features. Currently Ignite has only a "thin" Python client that uses a binary client protocol through a raw TCP socket. He also noted that Ignite is a "Top 5" project at Apache, with the second-most-active dev mailing list and third-most-active user mailing list.
