Tapestry for Nonbelievers
A new article by I. Drobiazko and R. Zubairov introduces v. 5 of the Apache Tapestry component-oriented web framework. The tutorial shows how to create a component and covers IoC in Tapestry and Ajax.
Posted by Sebastien Auvray on Jan 29, 2008 12:00 PM
The MapReduce design pattern for distributing data processing was introduced by Google in 2004, along with a C++ implementation. A new Ruby implementation, Skynet, has now been released by Adam Pisoni.
There are notably two key differences between Google's design paper and Skynet:
Skynet is an adaptive, self-upgrading, fault-tolerant, and fully distributed system with no single point of failure.
If a worker dies or fails for any reason, another worker will notice and pick up that task. Skynet also has no special ‘master’ servers, only workers which can act as a master for any task at any time. Even these master tasks can fail and will be picked up by other workers.
Skynet is very easy to use and set up, which is the real strength of the MapReduce concept. Skynet also extends ActiveRecord with MapReduce features such as distributed_find.
> Model.distributed_find(:all, :conditions => "id > 20").each(:somemethod)
As long as you're running Skynet, it will execute :somemethod on each model, but in a distributed manner (on as many workers as you have). It does this without instantiating the models before distributing them, or even fetching all the ids ahead of time, so it can work on arbitrarily large data sets.
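To make the idea concrete, here is a minimal single-process sketch of the map and reduce steps in plain Ruby. It does not use Skynet or ActiveRecord: the helpers id_slices, map_slice and reduce_pairs are hypothetical names invented for illustration, and the "workers" are just sequential calls. It only shows how an id range can be cut into slices without listing every id up front, with each slice mapped independently and the partial results combined afterwards.

```ruby
# Local sketch of the MapReduce shape. No Skynet API is used here;
# every method name below is hypothetical.

# Cut an id range into slices of at most slice_size ids.
# Only one Range object per slice is created, not one entry per id.
def id_slices(first_id, last_id, slice_size)
  first_id.step(last_id, slice_size).map do |start|
    start..[start + slice_size - 1, last_id].min
  end
end

# Map step: a worker receives one slice and emits [key, value] pairs.
# Classifying ids as even or odd stands in for real per-model work.
def map_slice(range)
  range.map { |id| [id.even? ? :even : :odd, 1] }
end

# Reduce step: group the pairs by key and sum the values for each key.
def reduce_pairs(pairs)
  pairs.group_by(&:first).transform_values do |group|
    group.sum { |_key, count| count }
  end
end

slices = id_slices(21, 100, 25)                        # ids greater than 20, up to 100
pairs  = slices.flat_map { |range| map_slice(range) }  # would run on many workers
counts = reduce_pairs(pairs)
puts counts.inspect
```

In a distributed setting each slice would be handed to a different worker, and a slice whose worker failed could simply be handed out again, since the slices do not depend on one another.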
Any comment about performance? This is usually a problem in Ruby applications.
If my understanding is not completely wrong, this sounds like an Actors implementation.
Also, the following comment is a bit confusing to me:
Also there is some question as to how well starfish actually distributes tasks since Ruby actually can't marshal and send code blocks over the wire, only references to it.
I don't know what Google's implementation looks like, but I am having a hard time understanding how C++ would distribute code blocks and execute them on various machines.
./alex
--
.w( the_mindstorm )p.
The complaint about performance is leveled against almost all dynamic languages, including Java. In almost all cases, the critics have a point that these languages are slower than C, but they miss the point that the performance tradeoff is made consciously, for two reasons. The first is the belief that engineering resources are more valuable than computer resources. Obviously this argument has a limit, which brings us to the second: Ruby is being used in plenty of large-scale production environments where performance is important. Ruby is likely not slower than any other interpreted language, be it Java or Perl. A map/reduce framework written in Ruby will be slower than one written in C, but not necessarily slower than one written in Java.
Good point, and to be honest I'm not sure how they do this internally at Google. I guess there's just an assumption that you should be able to pass code blocks in a dynamic language, though this is a poor assumption. I actually haven't given up on implementing this in Skynet, though I still question how much utility it has. In a dynamic language like Ruby, you tend to write very little code that relies on a great deal of other code. How much code would you really want to send with each data slice? How much code would that code rely on? How would you know whether that code is on the worker machines? Given all of this ambiguity, it seems the idea of passing code has limited real-world value. That said, having a system that is self-upgrading is very useful. Skynet is self-upgrading in a rudimentary way, but we have big ideas for how it might upgrade more intelligently in the future.
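The marshalling limitation discussed in this thread can be checked in plain Ruby. The sketch below is not Skynet or Starfish code; the payload layout (a method name plus an array of data) is a hypothetical example. It shows that Marshal refuses to serialize a Proc, while a Symbol naming a method serializes fine, provided the receiving side already has the code behind that name.

```ruby
# A Proc cannot be serialized with Marshal, so a code block cannot be
# shipped to another process as-is.
block = proc { |x| x * 2 }
begin
  Marshal.dump(block)
  block_dumpable = true
rescue TypeError
  block_dumpable = false
end

# A method name (Symbol) plus plain data serializes without trouble.
# The receiver must already have the method's implementation loaded.
payload    = Marshal.dump([:upcase, ["a", "b"]])
name, data = Marshal.load(payload)
result     = data.map { |item| item.public_send(name) }

puts "Proc dumpable? #{block_dumpable}"
puts result.inspect
```

This is why a call such as each(:somemethod) passes a method name and not a block: only the name and the data travel, and the workers are expected to have the same code installed.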