# Pivotal Open Sources Their Big Data Suite

Pivotal has decided to open source core components of their Big Data Suite and has announced the Open Data Platform, an initiative promoting open source and standardization for Big Data.

Pivotal came to the Big Data market later than earlier players such as Hortonworks, Cloudera and MapR. But now, to address “fragmentation and vendor lock-in” in the big data space, Pivotal has decided to open source a number of products from its Big Data Suite, namely Greenplum Database (a parallel processing data warehouse), HAWQ (an ANSI-compliant SQL-on-Hadoop query engine), and GemFire (a distributed in-memory NoSQL database).

Michael Cucchi, Sr. Director of Product Marketing at Pivotal, provided more details on this process and the reasons behind it. According to Cucchi, while customers liked “the Pivotal Big Data Suite’s flexibility”, their buying preferences changed because “customers these days are seeking to incorporate open source technology as much as possible, especially in the area of IT infrastructure.” As a result, Pivotal decided to open source the core of their Big Data Suite, and Cucchi mentioned their intent to open source all components of the suite.

According to Cucchi, the open sourcing process has been “in the works for some time” due to the complexity related to licensing, intellectual property and product governance. He provided some details on what is coming next:

> Our detailed plans are still being finalized, but we plan to begin release and incubation of Pivotal GemFire, Pivotal HAWQ, and Pivotal Greenplum Database in a quarterly cadence. We’re closing in now on the structure of ownership of GemFire, Greenplum Database, and HAWQ code to the most appropriate entity for working with the big data community. As we get closer to actual release milestones, we will continue to update on the progress of each product.

Pivotal has also announced the Open Data Platform (ODP), an initiative of 15 companies, including Hortonworks, IBM, Infosys, GE and SAS, that promotes open source and standardization in the Big Data space. The first components ODP will take on are Ambari, HDFS, MapReduce, and YARN, and it is quite likely that Pivotal will entrust the source code of their suite to ODP.

At the same time, the Big Data Suite has been enhanced with a number of services: the ability to deploy the suite on Cloud Foundry, taking advantage of the Operations Manager, and integration with Spring XD, Redis, and RabbitMQ. The next version of Pivotal HD will integrate with Spark and “all of the available Apache projects”, according to Cucchi.

by John Davies

What license are they using? Are they all going to be Apache projects?



by Abel Avram

I wanted initially to add a line or two about licenses, but then I left it out. The license is very complex. It's a combined license of all the licenses of the libraries included. The Pivotal HD license is dozens of pages long and includes all sorts of license types: Apache, EPL, GPL, LGPL, MIT, Oracle (Java), etc. Pivotal HD includes 100-200 libraries, each with its own license.

This is the page for all their open source licenses: pivotal.io/open-source
and this is the page for Pivotal HD's license: download.pivotal.io/pivotal-hd/1.1.1/PivotalHD1...



I don't think that answers the question.

The question that was asked is this: What license will Pivotal choose when it open sources its products?

The question that you answered is this: What licenses are represented by the open source code that Pivotal already uses as part of its products?

Also, the title of this thread says "Pivotal Open Sources Their Big Data Suite", but that is not (yet) true. Pivotal has announced that Pivotal has decided that Pivotal will eventually open source at least some parts of these Pivotal products over the next year or so.

According to EMC/VMware public filings and comments, Pivotal has been losing money at an incredible rate. Based on that burn rate and the amount that EMC/VMware and GE initially invested, Pivotal is likely to run out of runway within a year. The Pivotal business model isn't working, and nobody seems to have the courage to publicly point out just how bad the business model is (and just how ludicrously bad some of the VMware acquisitions were).

Instead of parroting press releases, let's be honest for once: EMC set up Pivotal to get money-losing products off of its balance sheet, and that hasn't worked. Most likely, EMC wanted to hastily get Pivotal to an IPO before all of the initial Pivotal investment got burned up, because an IPO would have recapitalized the money-losing Pivotal (allowing EMC to walk away from its money-losing ownership) -- and an IPO would have given EMC a huge (and arguably undeserved!) pay-out. Instead, EMC is now going to have to put a huge additional pile of cash into Pivotal, and/or Pivotal will have to drastically restructure in order to try to get to break-even.

Given this, I think that the Groovy divestment is just the tip of the iceberg: www.infoq.com/news/2015/01/Pivotal-Pulls-Groovy...

What I can't figure out is this: How will open sourcing all of this "big data suite" help Pivotal dramatically raise its revenue stream in the next few months? That has to be the goal, since Pivotal isn't a charity or a foundation -- they're a business, and they're losing money, and they're running out of runway.

Peace,

Cameron.
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.



by Abel Avram

Hi Cameron,
nice to see you commenting on InfoQ.
Regarding the licensing issues, you are partially correct. The data we have now can only tell us about the plans Pivotal has and the licensing of the contained packages. But I wonder if they can release the whole thing under a unique license because each package has its own license and many of them cannot be changed.

The title is correct. For brevity, it does not include the fact that at this point it is just an announcement, but we take them at their word and we have no reason to believe they won't follow through. Also, I said: "Cucchi mentioned their intent to open source all components of the suite." I said that based on his blog post:
blog.pivotal.io/big-data-pivotal/news-2/pivotal...

Regarding the business side of the issue, this is not the place to discuss it.




> I wonder if they can release the whole thing under a unique license because each package has its own license and many of them cannot be changed.

It's an interesting question. Theoretically, the copyright-holder (Pivotal in this case) can license in any way that they choose to, but by containing (even linking to) code covered by other licenses, that may restrict how they can license their own offerings. I know that this is a challenge with respect to "copyleft" licenses such as GPL that place requirements and restrictions on the licensing of code that embeds or links to any code that is GPL licensed.

Peace,

Cameron.



Interesting analysis, Cameron. Do you have any links so that I can read more about Pivotal's financials?

I'm not very good with Wall Street data but a simple Google search seems to show that although Pivotal may be losing money they are also growing.

"24% growth in net revenues to $58 million during 3Q14" and "Pivotal was EMC’s fastest-growing division". marketrealist.com/2014/12/emcs-pivotal-fared-3q14/

They may still be losing money but they also appear to be growing. I guess we'll see if they can keep up the growth despite their "bad business model".

Mike



I would also like to point out that "In less than a year, Pivotal Cloud Foundry has booked ~$40 million in software sales and secured a customer portfolio of top global brands".

see www.prnewswire.com/news-releases/pivotal-cloud-...



by John Davies

> "24% growth in net revenues to $58 million during 3Q14" and "Pivotal was EMC’s fastest-growing division".

At this level it's complex, $58m of revenue is just the total of what goes through the books. They could have a higher burn-rate (staff costs) and much of that could be services, for example through Accenture, where you bill the client $10m, pay Accenture (for example) $9m and then pay your own staff another $1m to sell and manage it. $10m revenue but nothing gained on the bottom line.

I'm not insinuating anything about Pivotal's numbers, only that you should not take headline figures at face value. You can still be loss-making and be the fastest growing division; the headline only means exactly what it says.

-John-
PS: Great to see Cam back on line, this could go back to the good ol' days of the ServerSide when Floyd was there.

