SAP's Cloud Strategy Evolves With In-Memory Computing Appliance
On Wednesday at the SAP SAPPHIRE 2011 conference, SAP announced the availability of its High-Performance Analytic Appliance (HANA) software over the cloud in a partnership with Dell and Intel. The partnership allows SAP customers to deploy their SAP applications on Dell’s Virtual Integrated System (VIS) datacenter platform. SAP CTO and executive board member Vishal Sikka also gave a preview of the upcoming HANA AppCloud, which will give customers access to SAP's Business Intelligence OnDemand, Carbon Impact and Sales & Operations Planning applications, among others. Apart from migrating its own applications, SAP will also certify third-party applications on HANA.
Over the past year, since HANA was first announced at SAP SAPPHIRE 2010, it has taken a front-and-center position in SAP's strategy roadmap. It is speculated that SAP's venture into in-memory computing is partly a response to Oracle's Exadata product: a majority of the SAP user base runs on the Oracle database and is likely to look at Exadata as its move to a next-generation data processing platform. Unlike Exadata, which is supported only on the Oracle stack, HANA has a wide range of vendors signed on to sell its appliances, including IBM, Fujitsu, Dell, Cisco and Hewlett-Packard.
According to ZDNet's Dennis Howlett, the case for SAP to invest in HANA over the cloud is even more compelling:
It is not simply a matter of revenue. SAP has already produced stats that show storage is one of the largest costs associated with running its systems, far outstripping the cost of running application servers for SAP. Test systems are more costly to operate than production systems. Running in the cloud could dramatically reduce TCO.
From a technical standpoint, IBM SAP consultant and prominent SAP blogger Vijay Vijayasankar has a few doubts:
The big issue with the HANA on cloud vision is “how do you do ETL, and will it be real time?”. Details remain to be seen – but the one use case I particularly worry about is the Sales and Ops Planning on Cloud. This is an ETL intensive activity generally, and speed is a known issue even for in-memory solutions. Add the bandwidth issue to access the cloud and concerns of security and privacy of base data, and I generally feel S&OP is better done on-premises than on-demand for most customers. There might be a few that have low data volumes etc which can use it – but it is hard to imagine this catching on with big SAP shops. Another concern I have about putting HANA on cloud in near future, is SAP’s ability to size the infrastructure. HANA has not spent enough time in production to get useful information on sizing, and there is a good chance of SAP under calling or over calling the size required to host HANA. And it will be bad in either case obviously.
HANA employs an in-memory, data-source-agnostic computing engine: data to be processed is held in RAM rather than read from secondary storage devices, which provides a performance boost. The platform also comes with a modeling studio that is easy enough for business users to use. According to the first official benchmark tests, run in partnership with IBM and audited by WinterCorp, HANA handles 10,000 queries per hour against 1.3 terabytes of data and returns results within seconds. The tests were conducted on an IBM x3850 X5 server with 32 cores, 0.5 terabytes of memory and a RAID 5 disk system. Because SAP HANA compresses data and stores it in columns, this configuration could handle up to 1.3 terabytes of data. According to SAP, HANA scales linearly: if you need more cores or memory, you simply add more nodes.
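To illustrate why storing data in columns tends to compress well, here is a minimal Python sketch of dictionary encoding, a generic column-store technique. It is an assumption chosen for illustration only: the article does not say which compression scheme HANA uses, and the function names below are hypothetical.

```python
def dictionary_encode(column):
    """Replace each value in a column with a small integer code.

    Returns (dictionary, codes), where dictionary[i] is the value
    represented by code i. Generic technique, not HANA's actual format.
    """
    dictionary = []   # distinct values, in order of first appearance
    index = {}        # value -> code, for fast lookup
    codes = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from its dictionary and codes."""
    return [dictionary[code] for code in codes]


def bits_per_code(dictionary):
    """Minimum number of bits needed to store one code."""
    return max(1, (len(dictionary) - 1).bit_length())


if __name__ == "__main__":
    country = ["DE", "US", "DE", "DE", "FR", "US"]
    dictionary, codes = dictionary_encode(country)
    print(dictionary, codes, bits_per_code(dictionary))
```

A column with few distinct values, such as a country code, needs only a handful of bits per row once encoded, which is one reason a column layout can hold more data in the same amount of memory.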
Vijay Vijayasankar shared a couple of remarks on HANA's touted real-time performance:
I do have a serious issue with calling HANA Real Time. A better word would be “Right Time”, as Ray Wang pointed out on twitter – at least for now, till HANA becomes the backbone of ECC and other products. The question is “will users have a real time experience?”. Most users will not sit next to HANA box in a server room – they will be on a WAN, VPN connection etc. The very fact that SAP hauled HANA systems all the way from their labs to the show floor, and not just connected to them remotely tells me that HANA will not give a real time feeling to users.
and compression capabilities:
I had read Hasso’s book cover to cover on its PDF version, and had tweeted earlier that I had disagreements with some of it. One issue was compression in HANA. Even in keynote, he gave the impression that customers can see a 10X compression of data. I find it hard to believe, since DB2, ORACLE etc does an excellent job of compressing data already. So if say DB2 compresses data 5 times, will HANA compress it 10 times over and above this? Hasso clarified to me that he meant raw data will be compressed 10 times on average, and not already compressed data. But remember, customers “sees” already compressed data and will be comparing HANA’s result to that. Also, database cannot be sized based on this compression ratio as was mentioned in the keynote – it needs extra space for various technical reasons.
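The arithmetic behind this objection can be made explicit with a short Python sketch. The 10x figure is the raw-data ratio from the keynote and the 5x figure is the hypothetical DB2 ratio from the quote; the 10-terabyte raw volume and the function names are assumptions added purely for illustration.

```python
def compressed_size(raw_size_tb, ratio):
    """Size on the platform after compressing raw data by the given ratio."""
    return raw_size_tb / ratio


def apparent_ratio(new_ratio, existing_ratio):
    """Compression a customer perceives when moving from a database that
    already compresses raw data by existing_ratio to one that compresses
    the same raw data by new_ratio."""
    return new_ratio / existing_ratio


if __name__ == "__main__":
    raw_tb = 10.0                               # hypothetical raw data volume
    existing = compressed_size(raw_tb, 5)       # quote's "say DB2 compresses 5 times"
    in_memory = compressed_size(raw_tb, 10)     # keynote's 10x on raw data
    print(existing, in_memory, apparent_ratio(10, 5))
```

Under these assumptions the customer starts from 2 terabytes, not 10, so the reduction they observe is 2x rather than 10x. The sketch also ignores the extra space the quote says the database needs for technical reasons.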
Vishal Sikka shared some details of the HANA architecture in an interview with PCWorld:
HANA is built on a superset of technologies with a long history at SAP, including the MaxDB database and TREX in-memory engine. Data held in memory by HANA is backed by a persistence layer that logs transactions and incorporates savepoints that create images of the database, according to an SAP document. This allows it to be restored in the event of a power outage or other disruption. HANA is compatible with any BI (business intelligence) application that supports common query languages like SQL and MDX.
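The combination of a transaction log and savepoints described above can be sketched in a few lines of Python. This is a toy model of the general technique, not HANA's persistence layer: the class and method names are hypothetical, and "durable storage" is simulated with an ordinary object that outlives the in-memory store.

```python
import copy


class DurableStorage:
    """Stands in for disk: its contents survive a simulated crash."""

    def __init__(self):
        self.savepoint = {}   # last full image of the data
        self.log = []         # changes recorded since that image


class InMemoryStore:
    """Key-value store whose working copy lives only in memory."""

    def __init__(self, storage):
        self.storage = storage
        self.data = {}
        self.recover()

    def put(self, key, value):
        # Log the change first so it survives a loss of memory.
        self.storage.log.append((key, value))
        self.data[key] = value

    def write_savepoint(self):
        # Persist a full image, after which the older log entries are redundant.
        self.storage.savepoint = copy.deepcopy(self.data)
        self.storage.log.clear()

    def recover(self):
        # Load the last image, then replay the changes logged after it.
        self.data = copy.deepcopy(self.storage.savepoint)
        for key, value in self.storage.log:
            self.data[key] = value


if __name__ == "__main__":
    disk = DurableStorage()
    store = InMemoryStore(disk)
    store.put("a", 1)
    store.write_savepoint()
    store.put("b", 2)
    store.put("a", 3)
    restarted = InMemoryStore(disk)   # simulates a restart after an outage
    print(restarted.data)
```

In this model the savepoint bounds how much of the log has to be replayed on restart, while the log covers every change made after the last savepoint.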