Olap4j 1.0: a Java API for OLAP Servers
After nearly five years of work, Business Intelligence vendor Pentaho has announced the release of olap4j 1.0, a new, common Java API for any online analytical processing (OLAP) server. The current release includes drivers for Mondrian and for XML for Analysis (XMLA) servers, the latter covering Microsoft SQL Server Analysis Services 2005 and 2008, Jedox Palo, and SAP/BW.
There have been attempts to provide a standard Java OLAP API before. In 2000, the Java Community Process officially requested a standard OLAP interface to alleviate the need to alter implementations when changing vendors. JSR 69: Java OLAP Interface (JOLAP) was intended to be "… a pure Java API for the J2EE environment that supports the creation and maintenance of OLAP data and metadata, in a vendor-independent manner." However, after receiving approval in 2004, the specification was never finalized, leaving the Java development community without a standard Java OLAP API.
InfoQ spoke to Julian Hyde, olap4j specification lead, Pentaho Lead Architect for Analysis and Mondrian Founder, to find out more about olap4j.
InfoQ: What are the main features of the olap4j 1.0 release?
Olap4j is an API that allows you to connect to an OLAP database, query its metadata, and execute MDX queries, from the Java environment. It is the first Java API that has allowed Java programmers to connect to OLAP servers from different vendors.
There are two drivers. The olap4j driver for XMLA, a SOAP-based standard, talks to any XML for Analysis (XMLA) server, including Microsoft SQL Server Analysis Services versions 2005 and 2008, Jedox Palo, and SAP/BW. The olap4j driver for Mondrian talks to a Mondrian engine in the same JVM; the Mondrian driver is the reference implementation for olap4j, and has in fact been Mondrian's primary API for several years.
The olap4j API and the two drivers are all stable and have been proven in production for several years.
InfoQ: Why is it important for the development community to have an open standard for interfacing with OLAP servers?
An open standard gives developers, integrators and end-users a choice. Without an open standard, if you want to use vendor X's OLAP server, you are restricted and must use vendor X's OLAP client. If you are building an application, that application is locked in to that vendor. With olap4j, you can switch to a different server if it is better, cheaper, faster, or simply because that is the server for which a particular customer already has a site license. And you have a diverse set of OLAP clients to choose from.
InfoQ: How does olap4j differ from JSR 69?
In my opinion, JSR 69 had a number of failings. It was a fairly high-level interface that required users to understand a very rich metamodel (based on the Common Warehouse Metamodel) and a rich query model. Queries were built by calling an API, not specified textually in a language such as SQL or MDX. It was sponsored by a group of vendors who had existing implementations, and I think they deadlocked trying to find a compromise among the various query-building APIs. Lastly, Microsoft, which produces the market-leading OLAP server, was not a contributor to the proposed specification.
Around the same time, Microsoft was authoring its own standards for OLAP: OLE DB for OLAP (for C and C++ environments), ADO MD (for .NET languages such as C# and VB), and XMLA (a SOAP dialect, for SaaS). These standards were simpler and, because they were based upon a query language (MDX), were more like the APIs that developers were used to for talking to relational databases (ODBC, JDBC). Those standards have become pervasive in their respective environments, but of course Microsoft had and has no interest in creating a Java standard.
olap4j aims to be immediately familiar to Java developers by adopting the paradigms they already know from JDBC. For example, the code to create a connection and to prepare and execute a statement looks very similar in olap4j and JDBC. The difference is that the query language is MDX, not SQL, and the result is a multidimensional CellSet, not a ResultSet consisting of rows and columns.
The fact that olap4j is literally an extension to JDBC means that you can use infrastructure like connection pools and directory services for olap4j connections just as you would for JDBC connections.
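To make the CellSet versus ResultSet distinction concrete, here is a minimal, self-contained sketch in plain Java. It does not use the olap4j library: ToyCellSet, its methods, and the sample member names are hypothetical, and the flat storage layout (columns varying fastest) is simply an assumption of this toy. The point is only that a cell is addressed by one coordinate per axis, rather than by a row number and a column name.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical toy model of a two-axis cell set; NOT the olap4j CellSet API.
final class ToyCellSet {
    private final List<String> columnLabels; // members on the COLUMNS axis
    private final List<String> rowLabels;    // members on the ROWS axis
    private final double[] cells;            // flat storage, columns vary fastest (assumption)

    ToyCellSet(List<String> columnLabels, List<String> rowLabels, double[] cells) {
        if (cells.length != columnLabels.size() * rowLabels.size()) {
            throw new IllegalArgumentException("cell count must equal columns x rows");
        }
        this.columnLabels = List.copyOf(columnLabels);
        this.rowLabels = List.copyOf(rowLabels);
        this.cells = cells.clone();
    }

    // Maps one coordinate per axis to a position in the flat array.
    int ordinal(int column, int row) {
        Objects.checkIndex(column, columnLabels.size());
        Objects.checkIndex(row, rowLabels.size());
        return row * columnLabels.size() + column;
    }

    // Looks a cell up by its axis coordinates.
    double cell(int column, int row) {
        return cells[ordinal(column, row)];
    }

    // Renders a cell together with the member on each axis.
    String describe(int column, int row) {
        return "(" + columnLabels.get(column) + ", " + rowLabels.get(row) + ") = "
                + cell(column, row);
    }

    public static void main(String[] args) {
        ToyCellSet cellSet = new ToyCellSet(
                List.of("Unit Sales", "Store Sales"),
                List.of("Drink", "Food"),
                new double[] {10.0, 25.5, 40.0, 99.0});
        // Walk the rows axis, then the columns axis, printing every cell.
        for (int row = 0; row < 2; row++) {
            for (int column = 0; column < 2; column++) {
                System.out.println(cellSet.describe(column, row));
            }
        }
    }
}
```

A real multidimensional result can have more than two axes, which is where a flat row-and-column ResultSet stops being a natural fit; the toy above keeps to two axes for brevity.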
InfoQ: How are those MDX queries generated?
Whereas SQL is typically hard-coded into a JDBC application, the purpose of OLAP is interactive data exploration, so queries need to be formed dynamically. Olap4j provides a query model to build queries step-by-step, and then convert those queries into MDX. The key difference between olap4j and query-model-centric APIs such as JSR-69 and Oracle's OLAP API is that in olap4j use of the query model is optional. If you are an application developer and your application always needs to show the same charts or data tables, you can write MDX queries by hand; if you are an OLAP tool developer, you can use olap4j's query model or you could build your own query model. Even as we reach version 1.0, olap4j's query model is a work in progress. The good news is that it is not a core part of the API, and it can evolve without breaking the rest of the API.
The result is that olap4j is smaller and simpler than Oracle's OLAP API is, or than JSR-69 would have been. That is good news for application developers learning the API, as well as server and driver developers trying to implement it.
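As a rough illustration of a query model that builds a query step by step and then converts it to MDX, consider the following plain-Java sketch. ToyMdxQuery and its methods are invented for this example and are not olap4j's query model API; the cube and member names are assumptions modeled on the kind of names found in a sample sales cube.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical step-by-step query builder that renders MDX text.
// This is NOT olap4j's query model; it only illustrates the idea.
final class ToyMdxQuery {
    private final String cube;
    private final List<String> columns = new ArrayList<>();
    private final List<String> rows = new ArrayList<>();
    private String slicer; // optional WHERE tuple

    ToyMdxQuery(String cube) {
        this.cube = cube;
    }

    // Adds a member or set expression to the COLUMNS axis.
    ToyMdxQuery onColumns(String expression) {
        columns.add(expression);
        return this;
    }

    // Adds a member or set expression to the ROWS axis.
    ToyMdxQuery onRows(String expression) {
        rows.add(expression);
        return this;
    }

    // Sets the slicer, e.g. a single member that filters the whole query.
    ToyMdxQuery where(String tuple) {
        this.slicer = tuple;
        return this;
    }

    // Converts the accumulated state into an MDX SELECT statement.
    String toMdx() {
        if (columns.isEmpty()) {
            throw new IllegalStateException("at least one COLUMNS expression is required");
        }
        StringBuilder mdx = new StringBuilder("SELECT ");
        mdx.append("{").append(String.join(", ", columns)).append("} ON COLUMNS");
        if (!rows.isEmpty()) {
            mdx.append(", {").append(String.join(", ", rows)).append("} ON ROWS");
        }
        mdx.append(" FROM [").append(cube).append("]");
        if (slicer != null) {
            mdx.append(" WHERE (").append(slicer).append(")");
        }
        return mdx.toString();
    }

    public static void main(String[] args) {
        String mdx = new ToyMdxQuery("Sales")
                .onColumns("[Measures].[Unit Sales]")
                .onColumns("[Measures].[Store Sales]")
                .onRows("[Product].[All Products].Children")
                .where("[Time].[1997]")
                .toMdx();
        System.out.println(mdx);
    }
}
```

Because the builder only produces a string, an application with fixed reports can skip it entirely and pass hand-written MDX to the server, which is the optionality described above.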
InfoQ: To date JOLAP has served mainly as a foundation for other efforts. For instance, Hyperion's implementation of XMLA uses a Java API based on the JOLAP specification. But the specification itself has never really become a standard as such. Are you hoping to revive the JOLAP specification itself?
No; see above. However, olap4j deliberately draws heavily on the most important current OLAP specification, XMLA. The metadata from both APIs is very similar, and the query language, MDX, is identical. The chief difference is that in a Java environment, olap4j is much easier to use than XMLA. And in many cases it is much more efficient than XMLA, because drivers are able to use caching or more efficient protocols than XML over HTTP.
InfoQ: Do you intend to submit the spec to the JCP, or do you intend to keep it as a stand-alone specification and process?
We would like olap4j to be considered a standard part of the Java runtime library like JDBC. However, we don't plan to submit it to the JCP at this time. One thing the failure of JSR-69 taught us is that the market can be more important than what comes out of the JCP. We hope that as olap4j drivers are added for more and more servers, olap4j will become a force in the market. As server vendors see the benefits of olap4j, hopefully they will support us in proposing olap4j within the JCP.
InfoQ: What are the major differences between olap4j and other Java OLAP interfaces, such as the Oracle OLAP Java API?
First, the other APIs, including Oracle's API, are not based on the MDX query language. Instead queries are built programmatically, and the API to build queries tends to be different for each vendor.
Second, other APIs lock you into one vendor.
Most, if not all, OLAP servers implement XMLA these days. This has two important consequences. First, it should be straightforward to create an olap4j driver for those servers, using the open source olap4j driver for XMLA as a starting point. Second, the fact that these servers implement XMLA means that they implement the MDX language. So, it should be possible to implement a native olap4j driver for these servers, which would be more efficient than the XMLA-based driver.
InfoQ: Are there plans for additional driver support for other OLAP servers in future releases of olap4j?
We have not committed to supporting any particular servers, but as an open source project we hope to receive valuable contributions from the community. In the past year, others have implemented the SAP/BW and Palo drivers; I am hopeful that we will have drivers for Oracle/Essbase in the next few months.
InfoQ: Is there any additional information the development community should know about olap4j?
The core API, specified in olap4j 1.0, is stable. We are committed to maintaining backwards compatibility as the API evolves in future. However, we are not stopping with olap4j 1.0. We will enhance the core API, in compatible ways, and there are a number of things we would like to add to the API in future versions.
We would like to extend and improve the query model. Recall that the query model is a client-side library that generates MDX, so it is easy to change the query model without changing the core parts of the API for querying metadata and executing queries.
There is an experimental facility for cell-writeback. This will allow 'what-if' scenarios and planning applications. It will always be an optional feature in olap4j, but we hope that at least Palo and Mondrian will implement it.
There is also an experimental facility to receive 'push' notifications the instant that data changes on the server. Imagine a web client where the cell values change in front of your eyes, and change color briefly to alert you to the change.
Since olap4j is authored using an open process, with contributors from a variety of projects and companies, we can expect more bright ideas to appear in future versions of the specification.
Olap4j was developed by an association of companies and open source projects. It is an open specification and is licensed under the Eclipse Public License (EPL).
Martin Thompson Jul 27, 2014