Meet the Goliath of Ruby Application Servers
PostRank Labs has announced an open source release of the Ruby web server framework that powers PostRank. The project, named Goliath, is an asynchronous server designed for speed that leverages key features of Ruby 1.9+.
Unlike Ruby web servers such as Mongrel or Unicorn, Goliath is built on EventMachine, giving it an event-driven design. (Thin also uses EventMachine.) Combining this with MRI Ruby 1.9+ and Ruby's Fibers results in applications that are fast and manageable.
From the Goliath project web site:
Goliath is an open source version of the non-blocking (asynchronous) Ruby web server framework powering PostRank. It is a lightweight framework designed to meet the following goals: bare metal performance, Rack API and middleware support, simple configuration, fully asynchronous processing, and readable and maintainable code (read: no callbacks).
The framework is powered by an EventMachine reactor, a high-performance HTTP parser and Ruby 1.9 runtime. One major advantage Goliath has over other asynchronous frameworks is the fact that by leveraging Ruby fibers introduced in Ruby 1.9+, it can untangle the complicated callback-based code into a format we are all familiar and comfortable with: linear execution, which leads to more maintainable and readable code.
Each Goliath request is executed in its own Ruby fiber and all asynchronous I/O operations can transparently suspend and later resume the processing without requiring the developer to write any additional code. Both request processing and response processing can be done in fully asynchronous fashion: streaming uploads, firehose API's, request/response, and so on.
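The fiber-per-request idea described above can be illustrated with plain Ruby fibers. This is a toy sketch, not Goliath code: it assumes nothing beyond the standard library, the `handlers` list and the resume loop are hypothetical stand-ins for real requests and the EventMachine reactor, and `Fiber.yield` marks the point where a real handler would wait on I/O.

```ruby
require 'fiber' # needed for Fiber#alive? on older Rubies; harmless on newer ones

log = []

# Each "request" runs in its own fiber and yields where it would wait on I/O.
handlers = %w[a b].map do |name|
  Fiber.new do
    log << "#{name}: start"
    Fiber.yield # pretend to wait for a database or network reply
    log << "#{name}: finish"
  end
end

# Stand-in for the event loop: keep resuming fibers until all have completed.
until handlers.empty?
  handlers.each(&:resume)
  handlers.select!(&:alive?)
end

puts log
```

Both requests start before either one finishes, even though each handler body reads as straight-line code.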
InfoQ caught up with Ilya Grigorik, founder and CTO of PostRank, about Goliath and the technologies involved. PostRank developed Goliath for their own use and decided to open source the technology for the good of the community. Ilya announced the project on his blog last week.
Since the project has only just been released, we asked Ilya for the details.
Robert Bazinet (RB) : Can you tell readers what Goliath is?
Ilya Grigorik (IG) : Goliath is an open source version of the non-blocking (asynchronous) Ruby web server framework powering PostRank. It is both an app-server and a lightweight framework designed to meet the following goals: bare metal performance, Rack API and middleware support, simple configuration, fully asynchronous processing, and readable and maintainable code (these last two are arguably the ones that differentiate Goliath from other competing and similar frameworks).
RB : Why did PostRank choose to develop it?
IG : Technically, the released version of Goliath is the v4 of our internal web-stack we have developed at PostRank. Our first version dates back to early 2008 - at the time, we were not happy with the available alternatives in the Ruby app-server space, hence we picked up EventMachine and started developing our own. Over time, we've gone through three major versions, packing a lot of learning and improvements into each one.
Having said that, our primary goal for Goliath remained the same: it is a framework for developing web-services (API's). Meaning, we were not trying to compete or replace our Rails or Sinatra applications, rather, Goliath was designed to act as a data-source for these apps. Internally, we used Goliath to abstract our databases, scheduling systems, and so forth, behind clean, high-performance HTTP endpoints.
RB : You use it at PostRank. What problem does it solve for your company?
IG : The first versions date back to early 2008, but the latest iteration, which leverages Ruby 1.9 features has been in production since early 2010. It has been a rock solid performer for us - today, we serve 500+ req/s through Goliath, with uptime measured in months - hence our decision to open source it. Most of the internal use cases for Goliath at PostRank are the straightforward request/response style API's, but we also use it to provide "firehose" style API's to our partners where the data is streamed directly from an AMQP queue, filtered, and delivered as an HTTP stream.
Many of our own internal services rely on the same HTTP endpoints that we expose to our partners, and we tend to push a lot of data around, hence the reason why we wanted a server that can efficiently handle parallel request processing, support keep-alive and pipelining, and provide a simple and readable API.
RB : Would you explain some use cases for Goliath and what it is ideally suited for?
IG : If you are looking to bring up a web-service to decouple or abstract some resource, then Goliath is the perfect fit - don't think of it as a replacement for Rails or related frameworks, instead, think of it as a data source (JSON / XML endpoint) for your next web application. Goliath allows you to write high-performance API's which can support streaming, keep-alive, and all the other things you would expect from a full-featured, asynchronous HTTP 1.1 web server.
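To make the "data source" idea concrete, here is a minimal sketch of a JSON endpoint written against the plain Rack calling convention, which Goliath is stated to support. It deliberately does not use Goliath's own classes: `SCORES` and `ScoreEndpoint` are hypothetical names, and the in-memory hash stands in for whatever resource (a database, a scheduler) the service would abstract.

```ruby
require 'json'

# Hypothetical in-memory "resource" standing in for a real data store.
SCORES = { "42" => { "id" => 42, "score" => 7.5 } }

# A Rack-compatible app is any object that responds to call(env) and
# returns a [status, headers, body] triplet.
ScoreEndpoint = lambda do |env|
  id = env["QUERY_STRING"].to_s[/\bid=(\w+)/, 1]
  record = SCORES[id]
  headers = { "Content-Type" => "application/json" }

  if record
    [200, headers, [JSON.generate(record)]]
  else
    [404, headers, [JSON.generate({ "error" => "not found" })]]
  end
end

# Calling the endpoint directly, the way a Rack server would.
status, headers, body = ScoreEndpoint.call({ "QUERY_STRING" => "id=42" })
puts status
puts body.first
```

A Rails or Sinatra front end would consume an endpoint like this over HTTP rather than talking to the underlying resource itself.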
RB : Is Goliath the Ruby version of Node.js?
IG : Goliath belongs to the same category of servers as node.js: fully asynchronous and architected around an event-loop. In the case of node.js, you are using V8 as the runtime, and in the case of Goliath you are using Ruby 1.9 and Eventmachine. In that respect, yes, they are very similar.
That's where Ruby 1.9 and Fibers come in - that was our solution. But more on that in a second...
RB : Ruby 1.9 is required when using Goliath. What specific aspects of Ruby 1.9 are leveraged?
IG : The primary reason for Ruby 1.9 in Goliath is the introduction of Fibers. What are Ruby Fibers? They are continuations, which allow us to arbitrarily suspend and resume any processing function (aka, cooperative scheduling). What this gave us is the ability to transparently pause and resume any IO operation at will, without even exposing this interaction to the developer. Hence, by leveraging fibers, we could hide all the callback code under the hood, expose a "synchronous looking API" within Goliath, and at the same time preserve all the nice properties of running within an event-loop.
Net result? We could get rid of all the callbacks in our Goliath API's, which drastically simplified our code, and made it readable and much easier to maintain. In fact, new developers starting with Goliath at PostRank could be completely oblivious to the fact that they were writing an async app - at that point, we knew we hit on a model that is worth pursuing.
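The technique Ilya describes, hiding a callback behind a synchronous-looking call, can be sketched in a few lines of plain Ruby. This is an illustration under stated assumptions, not Goliath's implementation: `PENDING` is a hypothetical toy reactor (a queue of callbacks that fire later), and `async_fetch` and `fetch` are made-up names for a callback-style driver and its fiber-aware wrapper.

```ruby
require 'fiber' # needed for Fiber.current on older Rubies; harmless on newer ones

# Hypothetical toy reactor: callbacks queued here fire "later".
PENDING = []

# Callback-style API: the result is handed to a block at some later point.
def async_fetch(key, &callback)
  PENDING << -> { callback.call("value-for-#{key}") }
end

# Fiber-aware wrapper: looks synchronous to the caller.
def fetch(key)
  fiber = Fiber.current
  async_fetch(key) { |value| fiber.resume(value) }
  Fiber.yield # suspend here; the callback resumes us with the value
end

results = []

# The "request handler" reads top to bottom, with no visible callbacks.
Fiber.new do
  user  = fetch("user")
  posts = fetch("posts")
  results << user << posts
end.resume

# Drain the toy reactor, firing each pending callback in turn.
PENDING.shift.call until PENDING.empty?

puts results
```

The handler body never mentions a callback, yet control returns to the reactor at every `fetch`, which is the property that lets an event loop serve other work in the meantime.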
RB : How does PostRank utilize data stores such as MySQL or MongoDB?
IG : In most cases we actually try to stay "as close to the metal" as we can, meaning we tend to roll our own SQL and iterate over the result set, but that's also primarily because our SQL is usually pretty simple, but we do *a lot* of lookups. Having said that, we do use other drivers like Cassandra, etc., to abstract the actual muck.
RB : I'm curious, can I use existing models from a Ruby on Rails application, using ActiveRecord or some other OR/M such as DataMapper?
IG : There is no reason why you can't use any of those ORMs with Goliath, as long as they are using an async driver under the hood. Here's a simple example with AR.
RB : What features are missing from the project?
IG : As an app server, Goliath already supports most of the features that you would expect: keep-alive, pipelining, robust HTTP parser, async request processing and middleware support, etc. Moving forward, I would love to add support for handling and exposing websocket connections, and also to work on simplifying the deployment scenario for leveraging multiple cores.
Beyond that, Goliath is also a (barebones) framework for developing web-services, and there is a lot of room for improvement there in terms of improving the DSL, configuration syntax, testing infrastructure and so forth. The good news is, the Ruby ecosystem offers a lot of great examples of all of the above, and with Goliath we can leverage and integrate a lot of that work - that's also where we are looking for feedback and help from the community.
RB : How would you like to see Goliath evolve?
IG : Goliath is able to run on MRI Ruby, Rubinius and JRuby today. At the moment, MRI is the best performer, but there is some very promising work in JRuby that could make it an overall winner. I'm hoping we can really push the state of the art in that department, since that would open a number of interesting opportunities. For example, once JRuby and Rubinius are viable production platforms, then we are running on "non GIL'ed" environments, which means that we can look into making Goliath take advantage of multiple cores within the same VM.
And of course, as I mentioned earlier, there is a ton of room for improvement in the DSL and supporting frameworks - definitely something we're looking to improve moving forward.
RB : Ilya, thank you so much for your time.
Goliath looks like a way to solve some fundamental problems without having to use a full Ruby on Rails stack, by instead creating small services that serve data or implement an API for your application. In a recent thread in the Goliath Google Group asking how to deploy Goliath for a Rails application, Ilya responded that Goliath can be used in conjunction with a Rails or Sinatra application:
In practice, we use Goliath to be *data sources* for many of our Rails apps. So, think of that next metal endpoint, or a sinatra app, where you may want to abstract some external resources (a database, some custom logic, etc). Rails is fantastic for developing the user facing components, I tend to think of Goliath as "developer centric" components - in other words, endpoints that serve JSON, XML, etc.
The key point to understand here is that Goliath is backend plumbing used to create really fast Rack-based API's (web services), data services, and any other application needing bare-metal performance.
Developers interested in learning more about Goliath can find information at the Goliath.io website, the GitHub repository, the documentation and the Google Group. For those wondering how one might test an application built on Goliath, there's a very lengthy and detailed article by PostRanker Dan Sinclair. Dan gives readers the basics of Goliath and then dives right in, showing developers how to properly test Goliath applications.
About the Author
Robert Bazinet is a .NET and Ruby developer as well as a System Architect with over 20 years of experience working on small to enterprise-wide system development. He is an independent consultant and founder of the Still River Software Company, LLC, with clients ranging from small to Fortune 10 sized companies. Rob lives in Woodstock, CT with his wife and daughter.