GigaSpaces XAP 6.0: Simplified, Spring-based API for Space-Based Architecture

by Ryan Slobojan on Sep 25, 2007

GigaSpaces recently released version 6.0 of its eXtreme Application Platform (XAP), an infrastructure software platform that lets applications scale out in distributed environments. InfoQ spoke with Geva Perry and Nati Shalom of GigaSpaces to learn more about this release and what has changed in this version.

First, we asked Perry and Shalom to describe the major changes in 6.0:

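  • OpenSpaces - the primary development platform for 6.0, which uses the Spring Framework to provide a POJO-driven development model as well as components for highly scalable, event-driven, service-oriented architectures. It also provides components such as an in-memory data grid, remoting, declarative event containers and transactions, and an OSGi-like deployment model.
  • Persistence as a Service (PaaS) - also known as the Mirror Service, it uses a back-end data service to ensure that the contents of the in-memory data grid are reliably persisted in an asynchronous manner. No changes to application code or configuration are required; the Mirror Service handles everything transparently.
  • JMS 1.1 interoperability - feeds can be sent directly through the JMS API, and these feeds are immediately converted into entries in the space. This reduces latency and, because fewer components are involved, simplifies the development and deployment of event-driven applications.
  • SLA-driven container - the service grid, which dynamically manages cluster instances according to service-level agreements (SLAs), has been greatly simplified through the use of Spring and is now integrated into every edition of the product (including the free Community Edition).
  • Enhanced .NET support - performance has improved, .NET is supported natively, and a new set of APIs provides seamless interoperability between .NET and Java by allowing both to run in embedded mode. GigaSpaces has also partnered with Microsoft to offer more packaged solutions for Microsoft technologies such as Excel, SharePoint, Visual Studio, and Windows Compute Cluster Service.
  • Amazon Elastic Compute Cloud (EC2) support - 6.0 is now available on the EC2 service at a cost of $0.10 per server per hour, so users can try it out and see how it scales across multiple servers.
  • Integrated Space-Based Architecture (SBA) support - in previous releases, implementing SBA required development and configuration effort; in 6.0 this has been simplified with SBA components such as the processing unit, which are now a core part of the API.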
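
The asynchronous persistence described for the Mirror Service can be pictured as a write-behind queue. The Python sketch below only illustrates that general idea; `WriteBehindGrid`, its methods, and the dictionary standing in for a database are hypothetical names chosen for this example and are not part of the GigaSpaces API.

```python
import queue
import threading


class WriteBehindGrid:
    """Toy in-memory grid that persists writes asynchronously (hypothetical)."""

    def __init__(self, backing_store):
        self._data = {}                 # in-memory copy served to the application
        self._pending = queue.Queue()   # writes waiting to be persisted
        self._store = backing_store     # stand-in for a back-end database
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, key, value):
        # The caller returns as soon as memory is updated; persistence happens later.
        self._data[key] = value
        self._pending.put((key, value))

    def read(self, key):
        return self._data.get(key)

    def flush(self):
        # Block until every queued write has reached the backing store.
        self._pending.join()

    def _drain(self):
        # Background loop that copies queued writes to the backing store.
        while True:
            key, value = self._pending.get()
            self._store[key] = value
            self._pending.task_done()


database = {}
grid = WriteBehindGrid(database)
grid.write("order-1", {"symbol": "ORCL", "qty": 100})
grid.flush()
```

The application code only calls `write` and `read`; the background thread is the only place that touches the backing store, which is the sense in which persistence stays out of the application's critical path in this sketch.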

We asked Perry and Shalom to explain in more detail what a processing unit is:

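In a space-based architecture, a processing unit is the unit of scaling and failover for an application. It typically includes all of the application services and middleware components that have strong latency or runtime dependencies on one another. SBA encapsulates these services under a single container (the processing unit) and maintains consistent scaling and failover behavior across all of these components as a whole. For example, a failure event automatically triggers the recovery process for both the middleware components (messaging, data grid) and the associated business logic. This avoids the partial failures and inconsistent behavior that arise when a failure occurs and the messaging system starts delivering events before the application service is ready to process them. From a latency perspective, encapsulating all of these components in the same runtime container reduces network overhead, because they interact purely in memory. Scaling becomes as simple as adding processing units; in other words, there is no need to scale the data, business logic, and/or messaging tiers separately.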
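
One way to picture the co-location described above is a set of units that each hold their own inbox, their own slice of the data, and the logic that connects the two. The Python sketch below is an illustration under assumptions, not GigaSpaces code: `ProcessingUnit`, `route`, and the CRC-based partitioning rule are hypothetical.

```python
import zlib
from collections import deque


class ProcessingUnit:
    """Hypothetical unit that co-locates messaging, data and business logic."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()   # messaging: events delivered to this unit
        self.data = {}         # data partition held in the same process

    def deliver(self, account, amount):
        self.inbox.append((account, amount))

    def process_all(self):
        # Business logic consumes local events and updates local state in memory.
        while self.inbox:
            account, amount = self.inbox.popleft()
            self.data[account] = self.data.get(account, 0) + amount


def route(units, key):
    # One routing rule for events and data, so an event lands where its state lives.
    return units[zlib.crc32(key.encode("utf-8")) % len(units)]


units = [ProcessingUnit(f"pu-{i}") for i in range(3)]
for account, amount in [("alice", 10), ("bob", 5), ("alice", 7)]:
    route(units, account).deliver(account, amount)
for unit in units:
    unit.process_all()

balances = {key: value for unit in units for key, value in unit.data.items()}
```

Because every event for a given key is routed to the unit that owns that key, no unit needs to reach across the network to finish its work. The sketch leaves out failover and the repartitioning that adding a unit would require.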

Perry and Shalom also described how this release fits into the bigger picture:

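6.0 is another important step toward realizing the vision we are currently promoting. At the heart of this vision is the recognition that the era of building applications with n-tier architectures and the J2EE stack is coming to an end. The architectures and middleware technologies that depend on them have hit a wall in their ability to support the scaling, reliability, and performance that today's business applications require.

In particular, the emerging architectures support horizontal scalability on low-cost hardware rather than vertical scalability. They use the in-memory data grid, not the RDBMS, as the real-time, online transactional system of record. They allow dynamic co-location of data and services to create self-sufficient "processing units" with "always available", fault-tolerant clustering.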

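Perry and Shalom noted that other large sites such as eBay, Google, MySpace, and Amazon have arrived at similar ideas, citing MapReduce, Hadoop, and memcached as examples. They made clear, however, that they take care to maintain interoperability with the J2EE world and with existing developer skills through support for JDBC, JMS, and Spring. They also pointed out that products such as XAP Community Edition and OpenSpaces target mainstream developers rather than being enterprise products used primarily by large companies.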

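We also asked how GigaSpaces compares with GridGain, Coherence, and Terracotta. They began by saying that all of these vendors are trying to educate the community about better ways to build and deploy applications. GigaSpaces, however, is designed to be a holistic application platform that addresses all aspects of scaling, performance, and high availability, whereas, in their words, the other vendors focus more on specific aspects of distributed computing, such as distributed caching. They also said that GigaSpaces is not an enterprise grid computing solution focused on batch applications; to offer that, GigaSpaces partners with companies such as GridGain, DataSynapse, and Platform Computing. In addition, GigaSpaces focuses more on the intra-application grid, which runs a single application in a distributed environment, than on the inter-application grid, which manages multiple applications on shared resources.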

We also asked Perry and Shalom for their opinion of Cameron Purdy's recent thoughts on the future of grid computing:

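Cameron's claim has always been that Tangosol is the best Data Caching technology because it has focused on doing just that: Data Caching. We believe it is now self-evident that solving the scalability, performance and availability of an entire application, end-to-end, not just the data bottleneck of that application, will require a lot more than just distributed caching functionality (even if it is called a Data Grid). Oracle have realized it, that's why they bought Tangosol and are attempting to add caching to their Fusion Middleware stack that will also include their legacy application server and messaging technologies.

But no matter how good the integration of disparate technologies is (and it remains to be seen whether it will be), it will always be fundamentally handicapped by the inherent lack of a common dynamic clustering model for scalability, performance and continuous-availability for all of the participating components. This is simply the wrong approach. Compare this with the holistic SBA approach available in GigaSpaces XAP 6.0.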

Finally, we asked them what the future holds for XAP:

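In addition to [EC2] and the Community Edition, we also plan to create a special start-up offering that will allow start-ups, open source projects, and non-profit organizations to use the product in production free of charge. This is the first time we are publicly mentioning this upcoming offering. It will be released in a few weeks, so stay tuned.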

GigaSpaces XAP 6.0: Simplified, Spring-based API for Space-Based ... by Julian Browne

I was lucky enough to get some early access to XAP 6.0, and have to say I've been very impressed. Congratulations to the Gigaspaces team on their achievement. The list above barely scratches the surface of what you can do with this technology. Think of processing units in the SBA as domain-specific entities and there's some rather elegant synergy with the DSL/LOP topics recently posted on InfoQ, with NFRs included.

One area where I think all in this space (pardon the pun) would agree with Cameron, is that success is when grid-think becomes an implicit part of the architecture toolbox. It's time to shed the 'specialist' and 'high-end' tags the approach has historically had, and to see it as just a good way to provide a bit of capability back to the business.

Oracle Coherence Data Grid by Cameron Purdy

Obviously I don't want to take anything away from Gigaspaces' announcement, and congratulations to them on their recent product release. Despite our differences, there's no doubt that competition in this space has pushed all of us far beyond our own initial expectations. So please permit me to disagree a little ;-)

"Cameron's claim has always been that Tangosol is the best Data Caching technology because it has focused on doing just that: Data Caching. We believe it is now self-evident that solving the scalability, performance and availability of an entire application, end-to-end, not just the data bottleneck of that application, will require a lot more than just distributed caching functionality (even if it is called a Data Grid)."


I have a slightly different point of view, which is hardly surprising ;-)

The truth is that Tangosol, a little company with no financial backing, was able to build a very successful product that solved a lot of important problems for a lot of customers, from small companies to many of the largest companies in the world. Back in 2001, we created the first coherent clustered cache product, Coherence, and it was extremely popular. In fact, it's so popular that even the Gigaspaces web site runs on it .. ;-)

By 2003, we were working with some of the largest and most successful web sites in the world to use Java to achieve continuous availability by clustering the living state and transactional data of applications. Our customers pioneered the notion that information in memory could be of higher reliability and availability than any of the traditional data management choices that were available to them. Predictable scalability with low latency was a hallmark of these applications, from stock markets and banking systems to telco applications and ecommerce websites.

It's clever to attempt to label it as "only" distributed caching, and we are definitely very successful in that space. We're also just as successful in the Data Grid space, which adds transactional data management to the grid environment, and when it comes to Event Driven Architectures (EDA) and eXtreme Transaction Processing (XTP), we still see no viable competition to Coherence.

"Oracle have realized it, that's why they bought Tangosol and are attempting to add caching to their Fusion Middleware stack that will also include their legacy application server and messaging technologies."


If Oracle's plans for Coherence were to "attempt to add caching to their Fusion Middleware stack", we wouldn't have even considered it.

It's true that Oracle is already extremely successful with Coherence in the marketplace -- it doesn't hurt to have an account manager dedicated to every major organization in the world! However, that isn't why Oracle selected Coherence as the technology to own in this space. The qualities of service (QoS) that Coherence provides are going to be the required building blocks for every new piece of infrastructure, for every service and for every major application from this point forward. The levels of availability and reliability that Coherence provides are simply unparalleled.

".. it will always be fundamentally handicapped by the inherent lack of a common dynamic clustering model for scalability, performance and continuous-availability for all of the participating components."


Just let me know when you catch up to the clustering model that we introduced in 2001 ;-)

Peace,

Cameron Purdy
Oracle Coherence: The Java Data Grid

Two very different approaches by Nikita Ivanov

Having worked with these two products rather closely (our project integrates natively with both of them) I have a strange feeling. First of all, I truly respect them both. It’s not a b/s statement; both products are established, very complex and proven in the marketplace. Working in a similar space and line of business, I can only appreciate what it took to get there…

Now, on the surface they both solve a somewhat similar problem: you can hear a lot about distributed heap, distributed caching, data grid, spaces, etc. However, what I found startling is that the technological approaches in these two products are so different that even though they solve similar (if not the same) problems, they do it VERY differently – thus driving very different reactions from different customers. The applicability and usage of these two products also vary dramatically.

You can, by the way, safely add Terracotta to this mix. It is yet another product that solves the same type of problem (sort of) in a VERY different way again.

We all have our biases based on our past experiences and preferences. I personally view these two products as very different (orthogonally different) approaches to a similar problem domain. No less – no more…

My 2 cents,
Nikita Ivanov.
GridGain - Grid Computing Made Simple

Xtreme Transaction Processing by Nati Shalom


"Obviously I don't want to take anything away from Gigaspaces' announcement, and congratulations to them on their recent product release."


Thanks, Cameron.


"Despite our differences, there's no doubt that competition in this space has pushed all of us far beyond our own initial expectations. So please permit me to disagree a little ;-)"


We're in agreement on that point ☺


"I have a slightly different point of view, which is hardly surprising ;-)"


Rather than speaking on our own behalf, I'd rather have others who used the product comment about it. For a start, look at the comment made by Julian Browne above:


"The list above barely scratches the surface of what you can do with this technology. Think of processing units in the SBA as domain-specific entities and there's some rather elegant synergy with the DSL/LOP topics recently posted on InfoQ, with NFRs included."


While I don't expect that we will share the same view on each other's products, I think that some clarity on what XTP (eXtreme Transaction Processing) actually is would be in order.

As an illustration, let's look at a typical transaction processing application, such as an order management system.

It is typically built using the following components in a J2EE environment:

1. Data feeds - typically from a JMS provider.
2. Message-Driven Bean - where the business logic resides.
3. Database - used for maintaining state to ensure recoverability as well as durability.

Building highly-available transaction processing with this model requires:
1. JMS + JMS Cluster to maintain high availability. In some cases, JMS high-availability is achieved by writing the state of the messaging system to disk.
2. Application Server clustering to ensure high-availability
3. Database clustering
4. XA transactions to ensure ACID properties are kept among these different components.

From what I hear from our customers and your own words, when you refer to XTP you're talking about enhancing (or partially replacing) the database with caching, or an extended version of caching called Data Grid. The equivalent of that in our product is the Enterprise Data Grid edition.

When we at GigaSpaces talk about XTP we’re referring to the complete application stack, including the messaging feed, the container and the data. By doing that we’re targeting not just the data bottleneck, but the end-to-end application scalability, latency and complexity challenges.

We do that by providing a JMS façade on top of the same cluster used for processing the business logic and the data. This way, we remove many of the moving parts and the multiple clustering models required with the previous approach, which is essentially tier-based.

Needless to say, there is a huge difference between solving the data I/O bottleneck and the scalability of the entire application.

It is, therefore, not surprising that users who recently evaluated the two technologies to address their XTP requirements realized quickly that on the XTP front, our products are really not comparable. The difference is so obvious that we are getting requests from customers who are already using various caching alternatives, including Coherence, to support these caches so that they will be able to benefit from the SLA-driven container, Space Based JMS and the full GigaSpaces platform. This is something we’re seriously considering.

For more information on what XTP really means refer to the following blog post on that topic.

Nati S.
GigaSpaces
Write Once Scale Anywhere

Re: Two very different approaches by Nati Shalom

Hi Nikita


"We all have our biases based on our past experiences and preferences. I personally view these two products as very different (orthogonally different) approaches to a similar problem domain. No less – no more…"


Interestingly enough, we share the same view.
I think that you have done a great job integrating the different products with GridGain despite those differences. Well done!

Nati S.

Re: GigaSpaces XAP 6.0: Simplified, Spring-based API for Space-Based ... by Nati Shalom

Julian - thanks for the kind words


"Think of processing units in the SBA as domain-specific entities and there's some rather elegant synergy with the DSL/LOP topics recently posted on InfoQ, with NFRs included."


I gave a presentation at the latest SpringOne event in Brussels which covers some of the topics that Julian is referring to. You can find the online presentation here.

That presentation triggered an interesting discussion on the Spring forum, "End of App Servers?", which highlights some of the points I was trying to make in this presentation:
"Given the availability of all the various commodity services which an
app server typically provides (connection pools, emailing, security,
transactions, etc), there doesn't seem like much value-add provided by
a war/ear/rar centric app servers any more.

Being able to just start up another OSGi container on another box, and
dynamically register OSGi components to it seems much more attractive,
in terms of manageability, scale-out/fault-tolerance."


If you're interested in playing with it, I'd recommend starting by looking at the following webcasts.


Interview with GigaSpaces by Jesse Chan

Since this press release, I have managed to get an exclusive interview with Geva Perry, Chief Marketing Officer at GigaSpaces. He talks more about the technology and what makes it different than other technologies such as caching, messaging, MapReduce, and so forth. GigaSpaces is not only powerful, but complementary to a lot of other technologies. You can read the interview here.

Re: Xtreme Transaction Processing by Cameron Purdy

Nati,

As you know, scaling the stateless parts of the application has never been the problem, but I'm glad to hear you can solve it nonetheless .. ;-)

In almost every real world case, the latency and throughput limiters on the stateful side -- including the transactional side -- are the obstacle to scale. These are the problems that Coherence solves, and despite your protestations to the contrary, these are the same problems that you are working to solve.

Once again, a sincere congratulations on the release, and I hope you don't take our disagreements personally; they are not intended to be so.

Peace,

Cameron Purdy
Oracle Coherence: The Java Data Grid

Compute grid vs. Data grid by John Davies

It's always healthy to see competition. I know both Nati and Cameron extremely well, and I know they have mutual respect for each other; the respect doesn't quite run as deep for each other's companies, though.

I have worked with both products, worked with their clients and even done talks on both products. Although there is a notable overlap, these are two different and distinct products fundamentally aimed, originally at least, at two different problems.

One, Tangosol, who recently merged with Oracle :-), is a data grid and the other, GigaSpaces, is a compute grid. There are a number of situations where I can envisage both products working together but it still comes down to the fact that you could do most of what the other does by extending the scope of what it was originally designed for. Still, at the end of the day I can still see clear advantages in both products.

I'm a huge fan of JavaSpaces. It has a beautifully simple API, but for some bizarre reason it never really made it into the mainstream, despite Sun trying to help it succeed by introducing EJBs. Tangosol adopted the Hashmap as an API, which made it easy to use, but to be honest JavaSpaces' four methods with identical parameters aren't exactly difficult to master.
The use of Spring in OpenSpaces brings JavaSpaces into the "standards" world where the programmer only has to learn one API.
Despite Spring being an order or two more complex than JavaSpaces, it is pretty neat the way they've integrated it into 6.0 and it's easy to knock up demos.
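
For readers who have not met the space model, its four core operations (write, read, take, notify) can be shown with a toy, single-process space. This Python sketch is only an illustration: `TupleSpace` and its dictionary-based templates are hypothetical, and the real JavaSpaces operations additionally take transaction and lease or timeout arguments, which are left out here.

```python
class TupleSpace:
    """Toy, single-process space; a sketch of the model, not the JavaSpaces API."""

    def __init__(self):
        self._entries = []
        self._listeners = []   # (template, callback) pairs

    @staticmethod
    def _matches(template, entry):
        # A template field of None is a wildcard; any other field must be equal.
        return all(v is None or entry.get(k) == v for k, v in template.items())

    def write(self, entry):
        # Store a copy of the entry and tell every listener whose template matches.
        self._entries.append(dict(entry))
        for template, callback in self._listeners:
            if self._matches(template, entry):
                callback(entry)

    def read(self, template):
        # Return a copy of the first match and leave it in the space.
        for entry in self._entries:
            if self._matches(template, entry):
                return dict(entry)
        return None

    def take(self, template):
        # Return the first match and remove it from the space.
        for index, entry in enumerate(self._entries):
            if self._matches(template, entry):
                return self._entries.pop(index)
        return None

    def notify(self, template, callback):
        # Register interest in future writes that match the template.
        self._listeners.append((template, callback))


space = TupleSpace()
seen = []
space.notify({"type": "order"}, seen.append)
space.write({"type": "order", "id": 1, "symbol": "ORCL"})
space.write({"type": "quote", "symbol": "ORCL"})
peeked = space.read({"type": "order", "id": None})
taken = space.take({"type": "order"})
gone = space.read({"type": "order"})
```

All four operations are driven by the same kind of argument, an entry or a template of the same shape, which is what keeps the model small.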

This is a huge market and it is growing by the day. It's great to see GigaSpaces where they are now, with over 100 customers, compared to a few years ago when I used to know them all by name and what version they were on.

Congratulations GigaSpaces, a great achievement and just keep up the fight with Tangosol, they're worthy advocates.

-John-

Re: Xtreme Transaction Processing by Nati Shalom

Hi Cameron


"As you know, scaling the stateless parts of the application has never been the problem, but I'm glad to hear you can solve it nonetheless .. ;-)"


This is probably the area where we differ most. My view is that there is no such thing as a *stateless* tier when it comes to transaction processing, or almost any distributed application that needs to be highly available and reliable. For example, if you send a message through JMS, someone needs to consume it at some later point in time; during that period the message becomes the *state* of the transaction. In addition, there can be failure scenarios that require you to store that intermediate state to ensure full recovery of the message from the exact point at which it failed. The same applies to the transaction coordinator and your session information.

Tests conducted by one of our partners proved that point beyond any doubt. In those tests they compared the tier-based approach, which used JMS + XA + caching, with an alternative that used our JMS facade, Spring as the abstraction layer, and caching, all running on our virtualized XAP middleware. The tests showed that with the tier-based approach the JMS tier, with its own clustering overhead, as well as the transaction coordinator had a huge impact on end-to-end latency, scalability and complexity. One of the reasons is pretty obvious: to be highly available, both had to maintain their state in the file system to ensure full recoverability. With this architecture we ended up with the following message flow for the tier-based alternative:

1. Send the message to JMS.
2. JMS writes it to disk (for its own high availability).
3. The MDB takes the message under a transaction (with the additional transaction overhead).
4. The message is written to the cache (under a transaction), and the cache replicates the data to a backup instance (again, for its own recoverability).

Any access to disk in the critical path of the business transaction limits throughput and adds latency, as you know, so no matter how fast you are in the data tier, your end-to-end scaling, latency and complexity are going to be determined by your weakest link.

In our case you simply send the message to the JMS facade, and it is immediately translated into an entry in our data grid.
See details here

Another area of complexity is affinity. With the tier-based approach you need to make sure that messages are routed to the appropriate queue, the one that corresponds to the appropriate caching partition. You need to do that explicitly, since the messaging and data clusters use different clustering and load-balancing models.

In our case there is a single clustering/fail-over semantics for the entire application as well as development and deployment model. This makes that challenge pretty much irrelevant.

You can imagine what will happen if you add scaling to the picture with the tier based approach.


Bottom line:
My point is that if you need true linear scaling (as opposed to performance optimization) you can't assume anything about the other tiers. You need to be able to handle the end-to-end transaction flow and not just part of it. IMO this is the only way to achieve true linear scalability.

The really good news is that with our latest release we added specific abstractions at the messaging layer (JMS, event containers) and the data layer (DAO, declarative transactions) that make the transition from the non-scalable tier-based model to a linear scale-out model relatively seamless. If you happen to use Spring, this will fit in as a very native extension to your existing development and runtime environment. In this way you can scale your entire application seamlessly just by changing the runtime platform.
If you're already using another caching alternative, you can benefit from our messaging and SLA-driven container to scale out your entire application.

Nati S.
GigaSpaces
Write Once Scale Anywhere

Re: Compute grid vs. Data grid by Nati Shalom

Hi John

Thanks for the feedback.


"One, Tangosol, who recently merged with Oracle :-), is a data grid and the other, GigaSpaces, is a compute grid."


To be more accurate, we position ourselves as a platform for scaling out stateful applications. We're positioning the parallel processing part that comes natively with JavaSpaces through the Master/Worker pattern for parallel transaction processing, as opposed to parallelization of batch processing. There is a fundamental difference between the two: the former tends to be low-latency and stateful in nature and often requires a very simple parallel processing pattern, whereas batch processing is more stateless in nature and requires more sophisticated parallel processing and resource scheduling based on different policies to improve utilization and reduce compute time in the data center.
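
The Master/Worker pattern mentioned here can be sketched in a few lines: a master writes tasks into a shared structure, workers take them, and results are written back. The Python example below is a minimal illustration using in-process queues in place of a space; `worker`, `run_master`, and the order tuples are hypothetical and do not reflect any GigaSpaces or JavaSpaces API.

```python
import queue
import threading


def worker(tasks, results):
    # Each worker repeatedly takes a task, processes it, and writes back a result.
    while True:
        task = tasks.get()
        if task is None:   # sentinel: no more work for this worker
            return
        order_id, quantity, price = task
        results.put((order_id, quantity * price))


def run_master(orders, worker_count=3):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for order in orders:   # the master writes tasks into the shared structure
        tasks.put(order)
    for _ in threads:      # one sentinel per worker so that each one stops
        tasks.put(None)
    for thread in threads:
        thread.join()
    collected = {}
    while not results.empty():
        order_id, value = results.get()
        collected[order_id] = value
    return collected


totals = run_master([(1, 10, 2.5), (2, 4, 10.0), (3, 1, 99.0)])
```

The workers need no scheduling policy beyond "take the next task", which matches the simple parallel pattern described for transaction processing; batch-style grids would add resource scheduling on top of this.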


"There are a number of situations where I can envisage both products working together but it still comes down to the fact that you could do most of what the other does by extending the scope of what it was originally designed for. Still, at the end of the day I can still see clear advantages in both products"


I can share with you that one of the leading investment banks chose to use GigaSpaces and Coherence, partly for the reasons I mentioned above.

These are interesting days:)


"Congratulations GigaSpaces, a great achievement.."


Thanks John,
Stay tuned, there is more to come.

Thanks! by Binil Thomas

I am reading this thread about 18 months after it was initially active. I find the polite arguments by Cameron, Nati and John very informative; you guys are role models for ideal internet debate. :-)

InfoQ.com and all content copyright © 2006-2014 C4Media Inc.