Splice Machine Version 1.0 Supports Integration with Hadoop and Analytic Window Functions


Splice Machine version 1.0 supports analytic window functions and integration with the Hadoop ecosystem. The Splice Machine team recently released their Hadoop-based RDBMS, a data management solution that can be used for transactional workloads on Hadoop.

Its architecture combines the Apache Hadoop big data analytics engine, HBase, and the Apache Derby database to leverage the scalability of Hadoop technology. With support for ACID transactions, the database can be used for real-time applications and operational analytics over big data at scale.

Splice Machine released a public beta back in May and worked with beta customers to test the functionality and performance of the product before releasing version 1.0.

The main product features in this version are:

  • Analytic Window Functions: These functions provide SQL analytic capabilities based on the SQL-2003 standard. Analytics include running totals, moving averages, and Top-N queries. Supported window functions include RANK, DENSE_RANK, and ROW_NUMBER.
  • Integration with Hadoop Ecosystem: This integration includes the Apache HCatalog support to work with MapReduce, Hive, Pig, and Spark frameworks. HCatalog provides a relational view of data stored in the Hadoop Distributed File System (HDFS). Users can run queries against the data stored in Splice Machine, Spark, and Hive tables in HCatalog without knowing where and how each data set is stored.
  • Authentication and Authorization: Authentication support includes integration with the LDAP v3 standard and FIPS-compliant password hashing algorithms such as SHA-512 (the default). The authorization model allows DBAs to create new users with access to data stored in the Splice Machine database, and includes privileges that control read and write operations at the table or column level.
  • Native Backup & Recovery: This includes a transaction-aware data backup and restore feature to ensure business continuity. It is a hot backup capability that allows applications and workloads to remain available while a backup of the database is in progress.
  • Bulk, Parallel Export: This exports query results to text files in comma-separated values (CSV) format, using all cluster nodes in parallel to generate the results.
  • Splice Machine Management Console: The console provides insight into query performance, such as viewing explain traces for queries. An explain trace shows the timing of each query operation and the distribution of data.
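The window functions listed above follow the SQL-2003 syntax, so their behavior can be illustrated on any database that implements them. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25+) rather than Splice Machine itself, with an illustrative sales table; RANK and DENSE_RANK differ only in how they number the rows after a tie.

```python
import sqlite3

# Illustrative data: bob and cal tie at 20, so RANK and DENSE_RANK diverge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("ann", 30), ("bob", 20), ("cal", 20), ("deb", 10)])

rows = conn.execute("""
    SELECT rep, amount,
           RANK()       OVER (ORDER BY amount DESC) AS rnk,       -- ties share a rank; gaps follow
           DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rnk, -- ties share a rank; no gaps
           ROW_NUMBER() OVER (ORDER BY amount DESC) AS row_num,   -- unique ordinal per row
           SUM(amount)  OVER (ORDER BY amount DESC
                              ROWS UNBOUNDED PRECEDING) AS running_total
    FROM sales
    ORDER BY amount DESC, rep
""").fetchall()

for row in rows:
    print(row)
# first row → ('ann', 30, 1, 1, 1, 30)
```

A Top-N query, another analytic the release mentions, is then just a matter of filtering on one of these computed ranks (for example, keeping rows where the rank is at most N).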
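The SHA-512 password hashing mentioned under authentication can be sketched in isolation. Splice Machine's actual salting scheme and storage format are not described in the article, so the function names, salt length, and layout below are assumptions for illustration only.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # Hypothetical scheme: random 16-byte salt prepended to the password
    # before hashing with SHA-512 (a FIPS-approved digest).
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.sha512(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password, salt, expected):
    # Constant-time comparison avoids leaking timing information.
    candidate = hashlib.sha512(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # → True
print(verify_password("wrong", salt, digest))   # → False
```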
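The bulk export feature writes query results out as CSV text files, with all cluster nodes participating in parallel. A minimal single-process equivalent of the same idea, again using sqlite3 and an illustrative orders table rather than Splice Machine's own export command, looks like this:

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT, qty INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "widget", 4), (2, "gadget", 2)])

cur = conn.execute("SELECT id, item, qty FROM orders ORDER BY id")
buf = io.StringIO()  # stands in for the output file
writer = csv.writer(buf)
writer.writerow([col[0] for col in cur.description])  # header row from cursor metadata
writer.writerows(cur)                                 # one CSV line per result row

print(buf.getvalue())
```

In the parallel case each node would produce a file like this for its shard of the result set, which is why the export yields multiple files rather than one.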

A standalone version of the Splice Machine Hadoop RDBMS, version 1.0, is now available for download from the company's website. Splice Machine also offers a free version of the product to companies that are less than five years old and generate $10 million or less in revenue.

Splice Machine offers a data migration support program called "Safe Journey" to help enterprise customers deploying Splice Machine v1.0 migrate their database workloads.

For more information on Splice Machine's technical architecture, check out the Q&A interview InfoQ conducted earlier this year with Rich Reimer, VP of Marketing and Product Management.
