Bindings, Platforms, and Innovation
This presentation focuses on the Internet and separating myth from fact, history from the future, and the mundane from the imaginative. Bob Frankston presents a vision of what could and should be.
Tracking change and innovation in the enterprise software development community

Posted by Ezra Zygmuntowicz on Jun 27, 2006 01:59 PM
Update 3rd March 2008: NOTE: this article is outdated - please refer to the documentation on the BackgrounDrb website.
The Role of Open Source in Data Integration
Effective Management of Static Analysis Vulnerabilities and Defects
Agile Development: A Manager's Roadmap for Success
Ruby on Rails is a great framework for developing many diverse types of web applications. As the problem domain of these web applications expands, you may need to run computationally intensive or long running background tasks. This poses a problem in that you are constrained to work within the request/response cycle of HTTP. So how can you run these long background tasks without your web server timing out? And how do you display the progress to your users?
Enter BackgrounDRb. This is a Rails plugin I wrote recently as one way to solve this problem. Ruby includes DRb (Distributed Ruby) as part of the standard library. DRb provides a simple API for publishing and consuming Ruby objects over TCP/IP networks or unix domain sockets. BackgrounDRb is a small framework that facilitates running background tasks in a separate process from Rails, thereby decoupling them from the request/response cycle. With DRb you can manage your tasks from Rails using hooks for progress bars or status updates to your users.
The BackgrounDRb server works by publishing a MiddleMan object. This object is the manager for your worker classes. It holds a @jobs hash composed of { job_key => running_worker_object } pairs and a @timestamps hash composed of { job_key => timestamp } pairs. The MiddleMan object straddles the interface between the DRb server and your Rails application. Here is a simple diagram to show the architecture.

This is a generic worker class as created by the worker generator provided by the plugin.
$ script/generate worker Foo
class FooWorker < BackgrounDRb::Rails
def do_work(args)
# This method is called in its own new thread when you
# call new worker. args is set to :args
end
end
When your FooWorker object is instantiated from rails via MiddleMan, the do_work method is automatically run in its own thread. We use a thread here so rails does not wait for the do_work method to finish before it continues on.
With BackgrounDRb, you usually create a new worker object with an AJAX request. Your view can then use periodically_call_remote to fetch the progress of your job and display it however you like. Let's flesh out the FooWorker class and show how you would create a new FooWorker object and retrieve its progress from within a rails controller.
class FooWorker < BackgrounDRb::Rails
attr_reader :progress
def do_work(args)
@progress = 0
calculate_the_meaning_of_life(args)
end
def calculate_the_meaning_of_life(args)
while @progress < 100
# calculations here
@progress += 1
end
end
end
Now in the controller:
class MyController < ApplicationController
def start_background_task
session[:job_key] =
MiddleMan.new_worker(:class => :foo_worker,
:args => "Arguments used to instantiate a new FooWorker object")
end
def get_progress
if request.xhr?
progress_percent = MiddleMan.get_worker(session[:job_key]).progress
render :update do |page|
page.call('progressPercent', 'progressbar', progress_percent)
page.redirect_to( :action => 'done') if progress_percent >= 100
end
else
redirect_to :action => 'index'
end
end
def done
render :text => "Your FooWorker task has completed
"
MiddleMan.delete_worker(session[:job_key])
end
end
And in your start_background_task.rhtml view file you could use something like this:
<%= periodically_call_remote(:url => {:action =>
'get_progress'}, :frequency => 1) %>
MiddleMan.new_worker returns a randomly generated job_key that you can store in the session for later retrieval. If you want to specify a named key instead of using the generated key you can do so like this:
# This will throw a BackgrounDRbDuplicateKeyError if the :job_key already exists.
MiddleMan.new_worker(:class => :foo_worker,
:job_key => :my_worker,
:args => "Arguments used to instantiate a new FooWorker object")
MiddleMan.get_worker :my_worker
Upon instalation, the plugin writes a config file into RAILS_ROOT/config/backgroundrb.yml. In this file there is a load_rails config option. If this is set to true then you will be able to use your ActiveRecord objects in your worker classes. When you start the server it will use your already existing database.yml file for database connection details.
This plugin can also be used for caching large or compute-intensive objects including ActiveRecord objects. You can store rendered views or large queries in the cache. In fact you can store any text or object that can be marshalled. Here is how you would use the cache:
# Fill the cache
@posts = Post.find(:all, :include => :comments)
MiddleMan.cache_as(:post_cache, @posts)
# OR
@posts = MiddleMan.cache_as :post_cache do
Post.find(:all, :include => :comments)
end
# Retrieve the cache
@posts = MiddleMan.cache_get(:post_cache)
# OR
@posts = MiddleMan.cache_get(:post_cache) { Post.find(:all, :include => :comments) }
MiddleMan.cache_get takes an optional block argument. If the cache located at the :post_cache key is empty, the results of evaluating the block are placed in the cache and assigned to @posts. If you don't supply a block and the cache is empty it will return nil.
In the current implementation, you are responsible for expiring your own caches and deleting your own workers from the main pool. This works two ways. You can either explicitly call MiddleMan.delete_worker(:job_key) or MiddleMan.delete_cache(:cache_key). There is also a MiddleMan.gc! method that takes a Time object and deletes all jobs with a time-stamp older than the one specified. Here is a script that can be run from cron to expire jobs older than 30 minutes:
#!/usr/bin/env ruby
require "drb"
DRb.start_service
MiddleMan = DRbObject.new(nil, "druby://localhost:22222")
MiddleMan.gc!(Time.now - 60*30)
In the near future there will be a timing mechanism built into BackgrounDRb. This will allow for jobs and garbage collection to be run at scheduled times and for specifying a time-to-live parameter when you create new jobs or caches.
There are Rake tasks as well as plain Ruby command line scripts to start and stop the daemon. On OS X, linux or BSD you can use the Rake tasks to start and stop the server:
$ rake backgroundrb:start
$ rake backgroundrb:stop
On Windows you currently have to keep a console window open while you run the backgroundrb server (Hopefully this will change in the near future). So on Windows, to start the daemon you would open a console and run the command like this:
> ruby script\backgroundrb\start
# ctrl-break to stop
So what are a few real world use cases, you ask? Here is a small list of things I am currently using BackgrounDRb for:
Plans for the future include the ability to fork new processes to handle larger jobs that require their own Ruby interpreter instance. Also work needs to be done to let BackgrounDRb run as a Windows service. Anyone who is familiar with Windows services that can offer some help here would be greatly appreciated. Suggestions and patches are also welcome.
Update 3rd March 2008: NOTE: this article is outdated - please refer to the documentation on the BackgrounDrb website.
In order to make your example work one needs to include the rails javascript libraries in the "start_background_task.rhtml" view: <%= javascript_include_tag :defaults %>
Brian Sletten just did a drb presentation at the Northern VA Ruby User's Group; slides and whatnot are on novarug.org.
Development has been swift... and some of the code in the article is now out-of-date.
progress_percent = MiddleMan.get_worker(session[:job_key]).progress
Should nowadays just be:
progress_percent = MiddleMan.worker(session[:job_key]).progress
when installing and creating new workers, remember to restart your rails server. I was trying to run the example but it didn't work, once I restarted WEBrick everything worked
Folks, Above Article on BackgrounDRb is completely obsolete. Please do not use instructions provided here as a tutorial for using BackgrounDRb. It has caused enough pain and agony for many users. New documentation for bdrb is available at, http://backgroundrb.rubyforge.org and one should look there before referring anything else.
This presentation focuses on the Internet and separating myth from fact, history from the future, and the mundane from the imaginative. Bob Frankston presents a vision of what could and should be.
This article explores the use of JBoss and jBPM to implement design solutions that effectively address the issue of orchestrating long running activities.
This presentation covers the use of graph databases as an optimal solution for data that is difficult to fit in static tables, rapidly evolving data or data that has a lot of optional attributes.
This session introduces Real Options and shows how it can help in running your project. Real Options is a decision-making process that can be used to manage risk.
This article discusses the use of bindings on services and references (including the instance of non-configured bindings) as the means to implement SCA communications in a Web and SOA environment.
After a short introduction to DSLs, Scott Davis plays with the keyboard showing how to approach the creation of a DSL by typing working snippets of Groovy code that get executed.
IBM Rational and InfoQ present, Scaling Agile with C/ALM, an eBook showing organizations how to become “finely tuned software delivery machines” by enabling team integration and scaling.
Amanda Laucher presents a real life enterprise application written in F#. She shows actual code snippets, explaining design decisions and suggesting how to use some of the F# constructs.
5 comments
Watch Thread Reply