Introduction to BackgrounDRb

Update 3rd March 2008: NOTE: this article is outdated. Please refer to the documentation on the BackgrounDRb website.


Ruby on Rails is a great framework for developing many diverse types of web applications. As the problem domain of these web applications expands, you may need to run computationally intensive or long-running background tasks. This poses a problem because you are constrained to the request/response cycle of HTTP. So how can you run these long background tasks without your web server timing out? And how do you display the progress to your users?

Enter BackgrounDRb. This is a Rails plugin I wrote recently as one way to solve this problem. Ruby includes DRb (Distributed Ruby) as part of the standard library. DRb provides a simple API for publishing and consuming Ruby objects over TCP/IP networks or Unix domain sockets. BackgrounDRb is a small framework that facilitates running background tasks in a separate process from Rails, thereby decoupling them from the request/response cycle. With DRb you can manage your tasks from Rails using hooks for progress bars or status updates to your users.
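To make the DRb part concrete, here is a minimal sketch of publishing an object and calling it through a proxy. It uses only Ruby's own DRb library (bundled as a gem in recent Ruby versions). The Counter class and the choice of port 0 (let the OS pick a free port) are assumptions made for this example and are not part of BackgrounDRb. In a real setup the client half would run in a separate process.

```ruby
require "drb/drb"

# A plain Ruby object to publish; the class name is made up for this example.
class Counter
  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end
end

# Server side: publish the object. Port 0 asks the OS for a free port.
server = DRb.start_service("druby://127.0.0.1:0", Counter.new)

# Client side: build a proxy from the server's URI and call it like any object.
remote_counter = DRbObject.new_with_uri(server.uri)
remote_counter.increment
result = remote_counter.increment
puts result

DRb.stop_service
```

The published object keeps its state between calls, which is what lets a long-lived server process hold on to running jobs while web requests come and go.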

The BackgrounDRb server works by publishing a MiddleMan object. This object is the manager for your worker classes. It holds a @jobs hash composed of { job_key => running_worker_object } pairs and a @timestamps hash composed of { job_key => timestamp } pairs. The MiddleMan object straddles the interface between the DRb server and your Rails application. Here is a simple diagram to show the architecture.

This is a generic worker class as created by the worker generator provided by the plugin.

$ script/generate worker Foo

class FooWorker < BackgrounDRb::Rails
  def do_work(args)
    # This method is called in its own new thread when you
    # call new worker. args is set to :args
  end
end

When your FooWorker object is instantiated from Rails via MiddleMan, the do_work method is automatically run in its own thread. We use a thread here so that Rails does not wait for the do_work method to finish before it continues.
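The effect of running do_work in a thread can be illustrated with a small standalone sketch. SketchWorker and SlowWorker are hypothetical stand-ins written for this example; they are not the plugin's base class, and the plugin's real implementation may differ.

```ruby
# Hypothetical stand-in for a worker base class; not the plugin's real code.
class SketchWorker
  attr_reader :thread

  def initialize(args = nil)
    # Run do_work in a new thread so the caller gets control back immediately.
    @thread = Thread.new { do_work(args) }
  end
end

class SlowWorker < SketchWorker
  attr_reader :result

  def do_work(args)
    sleep 0.2                  # stand-in for a long-running job
    @result = "done: #{args}"
  end
end

worker = SlowWorker.new("report")
result_before_join = worker.result   # still nil: the job is running in the background

worker.thread.join                   # wait here only so the example can show the result
puts worker.result
```

The constructor returns before the job finishes, which is the property that keeps a web request from blocking on the work.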

With BackgrounDRb, you usually create a new worker object with an AJAX request. Your view can then use periodically_call_remote to fetch the progress of your job and display it however you like. Let's flesh out the FooWorker class and show how you would create a new FooWorker object and retrieve its progress from within a Rails controller.

class FooWorker < BackgrounDRb::Rails
  attr_reader :progress

  def do_work(args)
    @progress = 0
    calculate_the_meaning_of_life(args)
  end

  def calculate_the_meaning_of_life(args)
    while @progress < 100
      # calculations here
      @progress += 1
    end
  end
end

Now in the controller:

class MyController < ApplicationController
  def start_background_task
    session[:job_key] =
      MiddleMan.new_worker(:class => :foo_worker,
                           :args => "Arguments used to instantiate a new FooWorker object")
  end

  def get_progress
    if request.xhr?
      progress_percent = MiddleMan.get_worker(session[:job_key]).progress
      render :update do |page|
        page.call('progressPercent', 'progressbar', progress_percent)
        page.redirect_to(:action => 'done') if progress_percent >= 100
      end
    else
      redirect_to :action => 'index'
    end
  end

  def done
    render :text => "Your FooWorker task has completed"
    MiddleMan.delete_worker(session[:job_key])
  end
end

And in your start_background_task.rhtml view file you could use something like this:









 

<%= periodically_call_remote(:url => {:action => 'get_progress'}, :frequency => 1) %>

MiddleMan.new_worker returns a randomly generated job_key that you can store in the session for later retrieval. If you want to specify a named key instead of using the generated key you can do so like this:

# This will raise a BackgrounDRbDuplicateKeyError if the :job_key already exists.
MiddleMan.new_worker(:class => :foo_worker,
                     :job_key => :my_worker,
                     :args => "Arguments used to instantiate a new FooWorker object")

MiddleMan.get_worker :my_worker
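The bookkeeping behind generated and named keys can be sketched as a pair of hashes, in the spirit of the @jobs and @timestamps hashes described earlier. SketchRegistry, DuplicateKeyError, and the add and get method names are hypothetical and chosen for this example only. They do not reproduce the plugin's code, and the random key format is an assumption.

```ruby
require "securerandom"

# Hypothetical error and registry classes, named for this sketch only.
class DuplicateKeyError < StandardError; end

class SketchRegistry
  def initialize
    @jobs = {}        # job_key => worker object
    @timestamps = {}  # job_key => time the job was registered
  end

  # Registers a worker under the given key, or under a random key if none is given.
  def add(worker, job_key = nil)
    job_key ||= SecureRandom.hex(8)
    raise DuplicateKeyError, "key already in use: #{job_key}" if @jobs.key?(job_key)
    @jobs[job_key] = worker
    @timestamps[job_key] = Time.now
    job_key
  end

  def get(job_key)
    @jobs[job_key]
  end
end

registry = SketchRegistry.new
generated_key = registry.add("worker A")              # random key, e.g. to keep in the session
named_key     = registry.add("worker B", :my_worker)  # caller-chosen key

begin
  registry.add("worker C", :my_worker)                # same key again
rescue DuplicateKeyError => e
  duplicate_message = e.message
end
puts duplicate_message
```

A named key is convenient when only one instance of a job should exist at a time, since a second attempt fails loudly instead of silently starting a duplicate.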

Upon installation, the plugin writes a config file to RAILS_ROOT/config/backgroundrb.yml. This file contains a load_rails config option. If it is set to true, you will be able to use your ActiveRecord objects in your worker classes. When you start the server, it will use your existing database.yml file for database connection details.

This plugin can also be used for caching large or compute-intensive objects including ActiveRecord objects. You can store rendered views or large queries in the cache. In fact you can store any text or object that can be marshalled. Here is how you would use the cache:

# Fill the cache
@posts = Post.find(:all, :include => :comments)
MiddleMan.cache_as(:post_cache, @posts)
# OR
@posts = MiddleMan.cache_as :post_cache do
  Post.find(:all, :include => :comments)
end

# Retrieve the cache
@posts = MiddleMan.cache_get(:post_cache)
# OR
@posts = MiddleMan.cache_get(:post_cache) { Post.find(:all, :include => :comments) }

MiddleMan.cache_get takes an optional block argument. If the cache located at the :post_cache key is empty, the results of evaluating the block are placed in the cache and assigned to @posts. If you don't supply a block and the cache is empty it will return nil.
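The block semantics described above can be sketched with a small in-memory class. SketchCache is hypothetical: it reuses the cache_as and cache_get names from the article to mirror the described behavior, but it is not the plugin's implementation.

```ruby
# Hypothetical in-memory cache that mimics the behavior described above.
class SketchCache
  def initialize
    @cache = {}
  end

  # Stores either the given value or the result of the block.
  def cache_as(key, value = nil)
    @cache[key] = block_given? ? yield : value
  end

  # Returns the cached value; on a miss, fills the cache from the block if one is given.
  def cache_get(key)
    return @cache[key] if @cache.key?(key)
    return nil unless block_given?
    @cache[key] = yield
  end
end

cache = SketchCache.new
miss   = cache.cache_get(:post_cache)                      # nil: empty cache and no block
filled = cache.cache_get(:post_cache) { ["post 1"] }       # block runs and fills the cache
hit    = cache.cache_get(:post_cache) { raise "not run" }  # block is skipped on a hit
```

Passing a block on every read means callers never have to check for a cold cache themselves.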

In the current implementation, you are responsible for expiring your own caches and deleting your own workers from the main pool. There are two ways to do this. You can explicitly call MiddleMan.delete_worker(:job_key) or MiddleMan.delete_cache(:cache_key), or you can call MiddleMan.gc!, which takes a Time object and deletes all jobs with a timestamp older than the one specified. Here is a script that can be run from cron to expire jobs older than 30 minutes:

#!/usr/bin/env ruby
require "drb"
DRb.start_service
MiddleMan = DRbObject.new(nil, "druby://localhost:22222")
MiddleMan.gc!(Time.now - 60*30)
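What a timestamp-based cleanup does with a cutoff like the one in this script can be sketched as follows. SketchJobs is a hypothetical class for illustration; only the gc! name and the cutoff arithmetic come from the article, and the plugin's actual implementation may differ.

```ruby
# Hypothetical registry showing timestamp-based cleanup; not the plugin's code.
class SketchJobs
  def initialize
    @jobs = {}
    @timestamps = {}
  end

  def add(job_key, worker, created_at = Time.now)
    @jobs[job_key] = worker
    @timestamps[job_key] = created_at
  end

  def keys
    @jobs.keys
  end

  # Deletes every job whose timestamp is older than the cutoff and
  # returns the keys that were removed.
  def gc!(cutoff)
    expired = @timestamps.select { |_key, stamp| stamp < cutoff }.keys
    expired.each do |job_key|
      @jobs.delete(job_key)
      @timestamps.delete(job_key)
    end
    expired
  end
end

now = Time.now
jobs = SketchJobs.new
jobs.add(:old_job, "worker", now - 60 * 45)   # 45 minutes old
jobs.add(:new_job, "worker", now - 60 * 5)    # 5 minutes old

removed = jobs.gc!(now - 60 * 30)             # same 30-minute cutoff as the cron script
```

Keeping timestamps in a separate hash lets the cleanup pass decide what to remove without touching the worker objects themselves.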

In the near future there will be a timing mechanism built into BackgrounDRb. This will allow for jobs and garbage collection to be run at scheduled times and for specifying a time-to-live parameter when you create new jobs or caches.

There are Rake tasks as well as plain Ruby command line scripts to start and stop the daemon. On OS X, Linux, or BSD you can use the Rake tasks to start and stop the server:

$ rake backgroundrb:start
$ rake backgroundrb:stop

On Windows you currently have to keep a console window open while you run the BackgrounDRb server (hopefully this will change in the near future). So on Windows, to start the daemon you would open a console and run the command like this:

> ruby script\backgroundrb\start
# ctrl-break to stop

So what are a few real world use cases, you ask? Here is a small list of things I am currently using BackgrounDRb for:

  • Downloading and caching RSS feeds for a feed aggregator.
  • Screen scraping automation using Watir to drive a web browser that navigates to other websites in the background to collect information.
  • Automating Xen VPS creation and sysadmin tasks.
  • Creating indexes in the background for the Hyper Estraier and Ferret search technologies.
  • Bridging Rails and IRC bots.

Plans for the future include the ability to fork new processes to handle larger jobs that require their own Ruby interpreter instance. Work also needs to be done to let BackgrounDRb run as a Windows service. Help from anyone familiar with Windows services would be greatly appreciated. Suggestions and patches are also welcome.

  • rubyforge project
  • Blog
  • install as plugin: script/plugin install svn://rubyforge.org//var/svn/backgroundrb


Community comments

  • Missing javascript includes in view

    by Oliver Kiessler,

    In order to make your example work one needs to include the rails javascript libraries in the "start_background_task.rhtml" view:

    <%= javascript_include_tag :defaults %>

  • drb and novarug

    by Tom Copeland,

    Brian Sletten just did a drb presentation at the Northern VA Ruby User's Group; slides and whatnot are on novarug.org.

  • Corrections, for new versions

    by Olle Jonsson,

    Development has been swift... and some of the code in the article is now out-of-date.


    progress_percent = MiddleMan.get_worker(session[:job_key]).progress


    Should nowadays just be:


    progress_percent = MiddleMan.worker(session[:job_key]).progress

  • installing problem

    by andrew d,

    when installing and creating new workers, remember to restart your rails server. I was trying to run the example but it didn't work, once I restarted WEBrick everything worked

  • This Article is Obsolete

    by hemant kumar,

    Folks,

    Above Article on BackgrounDRb is completely obsolete. Please do not use instructions provided here as a tutorial for using BackgrounDRb. It has caused enough pain and agony for many users. New documentation for bdrb is available at, backgroundrb.rubyforge.org and one should look there before referring anything else.
