Splice Machine Data Platform 3.0 Supports Kubernetes Managed Service and New ML Manager

The latest version of Splice Machine, a distributed SQL data platform, supports a new Kubernetes managed service, a new version of Machine Learning Manager (v2.0), and automatic in-database model deployment.

Splice Machine released version 3.0 of the data platform, which also includes features such as disaster recovery, replication, time travel, and legacy database compatibility. The data platform supports native machine learning and AI capabilities, and unified deployment on-premises and in the cloud.

Splice Machine's ML Manager provides end-to-end lifecycle management for ML models. The new ML Manager 2.0 has Jupyter notebook integration with MLflow and in-database deployment. Native Jupyter support comes with JupyterHub as well as BeakerX Jupyter extensions. ML Manager helps with model workflow management by offering capabilities that range from bulk logging of model parameters and metrics to full visibility into pipeline stages and feature transformations.

Other new features in Splice Machine 3.0 include improvements in the following areas:

Workload Management: The Splice Machine data platform supports the use of multiple OLAP queues that allow users to reserve cluster capacity for specific queries, track resources consumed by each server/role, and manage resource capacity for specific kinds of queries. This helps isolate workloads when multiple resource-intensive queries are running simultaneously.
SQL Coverage: This includes support for DB2-specific SQL syntax and outer joins in queries, so queries written against a legacy database do not need to be rewritten. SQL enhancements also include "Time Travel - Point in Time" queries, which allow users to query the database as it existed at some time in the past. This feature is useful when working with slowly changing dimensions, in data auditing scenarios, and for database changes that may need to be unwound.
Replication and HA: This feature supports the ability to stand up multiple database clusters that are automatically kept in sync via active-passive replication, to meet business continuity targets such as recovery point objectives (RPOs) and recovery time objectives (RTOs).
Data Security: This includes schema access restrictions that control access to objects belonging to a specified schema, so that other users cannot view or access those objects without appropriate privileges. Another security feature is customized pattern matching for log redaction, which allows users to define regex-based patterns to redact sensitive information from system logs.
Kubernetes Support: Splice Machine on Kubernetes provides an abstraction from the underlying infrastructure, which enables hybrid cloud and multi-cloud deployments. The new Kubernetes support includes a native Spark data source (NSDS) that can be used to stream DataFrames across the container/network boundary to Splice, with the throughput solution implemented in Apache Kafka.
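
To make the regex-based log redaction idea concrete, here is a minimal Python sketch of pattern-driven redaction applied to log lines. It is only an illustration of the general technique: the rule list, the replacement tokens, and the redact function are hypothetical names chosen for this example, and they do not reflect Splice Machine's actual configuration format or implementation.

```python
import re

# Hypothetical redaction rules (illustrative only, not Splice Machine's format):
# each rule pairs a compiled regex with the text that replaces a match.
REDACTION_RULES = [
    # US Social Security number shaped values, e.g. 123-45-6789
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),
    # key=value style passwords; the key is kept, the value is masked
    (re.compile(r"(password\s*=\s*)\S+", re.IGNORECASE), r"\1[REDACTED]"),
]


def redact(line: str) -> str:
    """Apply every redaction rule, in order, to a single log line."""
    for pattern, replacement in REDACTION_RULES:
        line = pattern.sub(replacement, line)
    return line


if __name__ == "__main__":
    print(redact("INSERT INTO t VALUES ('123-45-6789')"))
    print(redact("connect user=bob password=hunter2 host=db1"))
```

Keeping the rules as data rather than code means new patterns can be added without touching the redaction logic, which is the general appeal of user-defined pattern matching for logs.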

This release also includes some platform upgrades with support for Cloudera 6.3 and HWX 3.2.3 as well as HDFS 3.0, HBase 2.0 and Apache Spark 2.4.1. For more information on Splice Machine 3.0, check out the blog post and the webinar.

This content is in the AI, ML & Data Engineering topic