Designing and Developing Cross-Cutting Features
Every developer has had to integrate with another system, API or component. This article provides strategies to handle the change and for separating system boundaries.

Posted by David Pallmann on May 01, 2009
In this three-part series of articles, we're going to look at grid computing using the Azure cloud computing platform. In Part 1, we'll look at this from a design pattern and benefits perspective. In Parts 2 and 3, we will see a code example for a grid computing framework developed for Azure.

Not everyone is clear on the distinctions between grid computing and cloud computing, so let's begin with a brief explanation of each. While grid computing and cloud computing are not the same thing, there are many synergies between them and using them together makes a lot of sense.
Grid computing is about tackling a computing problem with an army of computers working in parallel rather than a single computer. This approach has many benefits:
Not all types of work lend themselves to grid computing. The work to be done is divided into smaller tasks, and a loosely-coupled network of computers works on the tasks in parallel. Smart infrastructure is needed to distribute the tasks, gather the results, and manage the system.
Not surprisingly, the early adopters of grid computing have been those who needed to solve mammoth computing problems. Thus you see grid computing used today in genetics, actuarial calculations, astronomical analysis, and film animation rendering. But that's changing: grid computing is getting more and more scrutiny for general business problems, and the onset of cloud computing is going to accelerate that. Computing tasks do not have to be gargantuan to benefit from a grid computing approach, nor are compute-intensive tasks the only kind of work eligible for grid computing. Any work that has a repetitive nature to it is a good candidate for grid computing. Whether you're a Fortune 500 corporation that needs to process 4 million invoices a month or a medium-sized business with 1,000 credit applications to approve, grid computing may well make sense for you.
Grid computing is a decade older than cloud computing, so much of today's grid computing naturally doesn't use a cloud approach. The most common approaches are:
Cloud computing allows for an alternative approach to grid computing that has many attractive characteristics, offers a flexible business model in which you scale up or down as you wish, and already provides much of the supporting infrastructure that traditionally has had to be custom-developed.
For grid computing to move into the mainstream, there need to be compelling business applications for it. Let's take a look at three kinds of business applications that lend themselves to a grid computing approach and also provide significant business value. The first example is data mining. Data mining and other forms of data analysis can identify interesting relationships and patterns from business data.
The second example is decisioning, where a battery of forward-chaining business rules needs to execute in order to make a business decision. In some scenarios, decisions need to be reached very quickly, yet the computations involved can be complex. The parallelism of the grid can be leveraged for a faster response time that doesn't degrade when the workload increases.
The third example is batch processing, where you occasionally have to handle bursts of large workloads but lack the in-house capacity for them.
These examples illustrate that grid computing is moving out of niche applications and is becoming a generally useful form of computing for businesses of all kinds to consider.
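To make the divide, distribute, and gather idea concrete, here is a minimal sketch in Python of the credit-application case mentioned earlier. It is only an illustration: a local thread pool stands in for the grid of machines, and the `approve` rule, its thresholds, and the record fields are hypothetical rather than taken from any real decisioning system.

```python
from concurrent.futures import ThreadPoolExecutor


def approve(application):
    """One grid task: decide a single credit application.

    The rule and thresholds here are hypothetical placeholders for
    whatever business rules a real decisioning task would run.
    """
    ok = application["score"] >= 650 and application["debt_ratio"] < 0.4
    return application["id"], ok


def run_grid(applications, workers=4):
    """Split the batch into one task per application, run the tasks
    in parallel, and gather the results into a single dictionary."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map distributes tasks to workers and yields results in input order
        return dict(pool.map(approve, applications))


applications = [
    {"id": 1, "score": 700, "debt_ratio": 0.2},
    {"id": 2, "score": 600, "debt_ratio": 0.1},
    {"id": 3, "score": 720, "debt_ratio": 0.5},
]
decisions = run_grid(applications)
print(decisions)
```

Because each application is decided independently of the others, the same task function could in principle run on any number of machines, which is the property that makes repetitive work a good fit for a grid.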
Cloud computing is about leveraging massive data centers with smart infrastructure for your computing needs. Cloud computing spans application hosting and storage, as well as services for communication, workflow, security, and synchronization. Benefits of cloud computing include the following:
Microsoft's cloud computing platform is called Azure, and currently it consists of 4 primary service areas:
Azure is new; at the time of this writing, it is in a preview period with a commercial release expected by the end of 2009.
Azure is designed to support many different kinds of applications and has no specific features for grid computing. However, Azure provides much of the functionality needed in a grid computing system. Making Azure a great grid computing platform requires only the right design pattern and a framework that provides grid-specific functionality. We'll look at the design pattern now, and in Part 2 we will explore a framework that supports this pattern.

The first thing you'll notice about this pattern is that there is some software/data in the Azure cloud and some on-premise in the enterprise. What goes where, and why?
The software actors in this pattern are:
The data actors in this pattern are:
Let's put all of this together and walk through how you would develop and run a grid computing application from start to finish using this pattern and a suitable framework:
1. A need for a grid computing application is established. The tasks that will be needed, input data, and results destinations are identified.
2. Using a framework, developers add the custom pieces unique to their project:
3. Azure projects for application hosting and storage are configured using the Azure portal. The Grid Worker package is deployed to cloud hosting, tested, and promoted to Production.
4. Using the Grid Manager console, the grid job run is defined and started. This starts the Loader running.
5. The Loader reads local enterprise data and generates tasks, writing each to the Task Queue.
6. The Grid Worker project in the Azure portal is started, which spawns multiple instances of Grid Workers.
7. Each Grid Worker continually receives a new task from the Task Queue, determines the task type, executes the appropriate code, and sends the task results to the Results Queue. The way Azure queues work is very useful here: if a worker has a failure and crashes in the middle of performing a task, the task will reappear in the queue after a timeout period and will get picked up by another Grid Worker.
8. The Aggregator reads results from the Results Queue and writes them to local enterprise storage.
9. While the grid is executing, administrators can use the Grid Manager console to watch status in near real-time as Grid Workers execute tasks.
10. When the Aggregator realizes all scheduled tasks have been completed, it provides notification of this condition via the console. At this point, the grid has completed its work and its results are safely stored in the enterprise.
11. The Grid Workers are suspended via the Azure portal to avoid incurring any additional compute-time charges. Cloud storage is already empty, as all queues have been fully read, so no additional storage charges will accrue.
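The flow in steps 5 through 10 can be sketched in a few lines of Python. This is an in-memory simulation under stated assumptions, not the Azure API or the framework covered in Part 2: `VisibilityQueue`, `loader`, `worker_step`, `aggregator`, and the `square` task type are hypothetical names, and the clock is passed in explicitly so the timeout behavior is easy to follow.

```python
from collections import deque


class VisibilityQueue:
    """In-memory stand-in for a cloud queue with a visibility timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.ready = deque()      # (message id, message) pairs visible to readers
        self.in_flight = {}       # message id -> (message, time it reappears)
        self.next_id = 0

    def put(self, message):
        self.ready.append((self.next_id, message))
        self.next_id += 1

    def get(self, now):
        # Messages that were read but never deleted become visible again.
        for mid, (msg, deadline) in list(self.in_flight.items()):
            if now >= deadline:
                del self.in_flight[mid]
                self.ready.append((mid, msg))
        if not self.ready:
            return None
        mid, msg = self.ready.popleft()
        self.in_flight[mid] = (msg, now + self.timeout)
        return mid, msg

    def delete(self, mid):
        self.in_flight.pop(mid, None)

    def empty(self):
        return not self.ready and not self.in_flight


HANDLERS = {"square": lambda v: v * v}   # hypothetical task type -> task code


def loader(records, task_queue):
    """Step 5: turn enterprise records into tasks on the Task Queue."""
    for value in records:
        task_queue.put({"type": "square", "value": value})
    return len(records)


def worker_step(task_queue, results_queue, now, crash=False):
    """Step 7: take one task, run it, queue the result, delete the task."""
    item = task_queue.get(now)
    if item is None:
        return
    mid, task = item
    if crash:
        return                    # simulated failure: the task is never deleted
    result = HANDLERS[task["type"]](task["value"])
    results_queue.put({"value": task["value"], "result": result})
    task_queue.delete(mid)


def aggregator(results_queue, expected, store, now):
    """Steps 8 and 10: drain results into local storage, report completion."""
    while True:
        item = results_queue.get(now)
        if item is None:
            break
        mid, res = item
        store[res["value"]] = res["result"]
        results_queue.delete(mid)
    return len(store) == expected


tasks, results, store = VisibilityQueue(30), VisibilityQueue(30), {}
expected = loader([1, 2, 3], tasks)
worker_step(tasks, results, now=0, crash=True)   # worker dies holding task 1
worker_step(tasks, results, now=1)
worker_step(tasks, results, now=2)
done_early = aggregator(results, expected, store, now=3)
worker_step(tasks, results, now=31)              # task 1 has reappeared
done = aggregator(results, expected, store, now=32)
```

In this sketch a task is deleted only after its result has been queued, so a crashed worker costs a delay rather than a lost task. The other side of that choice is that a task can occasionally run twice, which is why the results here are stored by key, so a repeated result overwrites rather than duplicates.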
The Azure platform does good things for grid computing, both technically and financially:
Next time, we'll see how this pattern is implemented in code using a grid computing framework developed for Azure.
David Pallmann is a consulting director for Neudesic, a Microsoft Gold Partner and National Systems Integrator. Prior to joining Neudesic, David worked on the WCF product team at Microsoft. He has published three technical books and maintains an active Azure blog. He is also a founding member of the Azure User Group.