RiCal: A New iCalendar Library for Ruby
RiCal is an implementation of RFC2445, better known as the iCalendar format. Although there are some existing Ruby libraries to read and write the iCalendar format, Rick DeNatale decided to write a new one. We talked to him to learn why he started the project and what obstacles he had to face.
Implementing a calendar format doesn't seem to be much fun, so why did you start RiCal?
I got interested in this problem during my last full-time job. For most of last year I worked for a startup company which had a large Rails application which provided team/group collaboration functions as a software service. The application had an existing calendar component, but it was considered inadequate for several reasons: the UI didn't stack up to competition like Google Calendar, and there was a requirement to import icalendar files from external calendar applications.
But the cost of properly importing general icalendar files turned out to be outside the budget. At the time, there were, and still are, two popular Ruby gems which provide a measure of support for the icalendar format (which is defined in RFC 2445). Both do a reasonable job of parsing and emitting icalendar data, but don't go very far in implementing the semantics.
One of the biggest gaps was support for recurring events and, in a related vein, time zones as defined in the RFC.
Since the company ran out of funding and ceased development operations, I've been finding paying work on a hit or miss basis, and have had more time than I'd otherwise like to work on RiCal.
The RFC is quite large and complex. Where did you start?
Yes, it's definitely a camel designed by a committee. The driving forces were Microsoft and Lotus Notes/IBM. In my IBM career I've been involved with other similar standards bodies for Smalltalk, Java, etc. so I know both how such beasts get bred and somewhat how to deal with the animal.
The first thing I looked at, because I was interested in it, was the problem of recurring events. RFC 2445 defines recurrence using a complex combination of rules and lists for generating and excluding times or periods from the occurrences of an Event, Journal entry or Todo. In addition, time zones within icalendar calendars have their transitions between standard and daylight periods defined using the same mechanism.
Have you had to start from scratch or were there any reusable pieces around?
There are a couple of Ruby gems for dealing with recurrence: Matthew Lipper's Runt and Jim Weirich's Texp. Both are based on Martin Fowler's Temporal Expressions pattern. Martin did a much cleaner job than the committee. Temporal expressions probably exhibit the 80/20 rule: they get 80% accomplished with 20% of the work/complexity.
I know several friends and colleagues who have tried to implement occurrence enumeration with one of these two libraries and didn't have a very enjoyable time, nor achieve the desired goal. The problem isn't with Martin's pattern, it's with the camel.
What does such a recurring event look like?
The biggest level of complexity in RFC 2445 is the recurrence rule. It specifies a basic frequency and interval, such as every Day, every 2 Months, etc. This is applied to a start time to produce a series of modified times. That's simple enough, but the rule can also add things like on the 2nd and last Mondays of the Month, expressed with rule parts like BYDAY=2MO,-1MO, which I call byparts in the code. These can be combined and interact in various ways with each other and the basic frequency and interval. Depending on how the bypart relates to the frequency, a bypart can either add additional occurrences or filter them out. As an example an event describing when US Presidential Elections take place starting with the 1996 election might look like this:
DTSTART;TZID=US-Eastern:19961105T090000
RRULE:FREQ=YEARLY;INTERVAL=4;BYMONTH=11;BYDAY=TU;BYMONTHDAY=2,3,4,5,6,7,8
The legal rule is that US Presidential elections happen every year which is divisible by 4, on the first Tuesday after the first Monday in November.
In fact the example I just gave comes straight out of RFC 2445.
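To make the interaction of the rule parts concrete, here is a minimal sketch in plain Ruby that expands exactly this rule by generating candidates and filtering them. It is not RiCal's code: the helper names parse_rrule and election_days are made up for illustration, and the sketch only understands the four rule parts used in this one example.

```ruby
require "date"

ELECTION_RRULE = "FREQ=YEARLY;INTERVAL=4;BYMONTH=11;BYDAY=TU;BYMONTHDAY=2,3,4,5,6,7,8"

# Two-letter weekday codes in the order used by Date#wday (0 = Sunday).
WDAY_CODES = %w[SU MO TU WE TH FR SA].freeze

# Hypothetical helper: split "KEY=a,b;KEY2=c" into
# { "KEY" => ["a", "b"], "KEY2" => ["c"] }.
def parse_rrule(text)
  text.split(";").map do |part|
    name, value = part.split("=", 2)
    [name, value.split(",")]
  end.to_h
end

# Hypothetical helper: expand the rule for the given number of yearly
# steps, starting with the year of +start+.
def election_days(rule, start, steps)
  interval = rule.fetch("INTERVAL").first.to_i
  months   = rule.fetch("BYMONTH").map(&:to_i)
  mdays    = rule.fetch("BYMONTHDAY").map(&:to_i)
  wdays    = rule.fetch("BYDAY").map { |code| WDAY_CODES.index(code) }

  (0...steps).flat_map do |step|
    year = start.year + step * interval
    months.flat_map do |month|
      # BYMONTHDAY expands each month into candidate days,
      # BYDAY then filters the candidates down to Tuesdays.
      mdays.map { |mday| Date.new(year, month, mday) }
           .select { |date| wdays.include?(date.wday) && date >= start }
    end
  end
end

rule = parse_rrule(ELECTION_RRULE)
puts election_days(rule, Date.new(1996, 11, 5), 4)
```

Because the days 2 through 8 of a month always contain exactly one Tuesday, each fourth year yields a single date, which is how the rule encodes "the first Tuesday after the first Monday".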
This really looks quite complex, so how do you analyze these rules?
I spent a lot of time thinking about how to interpret these rules before coming up with an initial algorithm which worked. The basic approach was to come up with a method which given the last occurrence time would produce the next. My first working approach used a single enumerator class with a series of methods which figured out whether the next occurrence needed to have its seconds changed, then its minutes, etc., until it got to the level of the basic frequency, and then filtered out times which didn't match byparts with longer durations than the basic frequency.
I wrote RSpec examples for each of the recurrence rule examples in the spec and got them all to work. And they worked fairly efficiently. The full spec suite ran in something like 13 seconds on my MacBook.
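The "given the last occurrence, produce the next" idea can be sketched for a much simpler case than the full RFC. The following is an illustration only, not the enumerator described above: next_occurrence is a hypothetical method that steps by a daily interval and filters candidates against a list of allowed weekdays.

```ruby
require "date"

# Hypothetical helper: step forward from +last+ in units of
# +interval_days+ and return the first candidate whose weekday is
# allowed. Weekdays repeat after at most 7 steps, so if no candidate
# matches within 7 tries, none ever will and nil is returned.
def next_occurrence(last, interval_days:, allowed_wdays:)
  candidate = last
  7.times do
    candidate += interval_days
    return candidate if allowed_wdays.include?(candidate.wday)
  end
  nil
end

monday = Date.new(2009, 6, 1)
# Every day, but only on Mondays (1) and Wednesdays (3).
puts next_occurrence(monday, interval_days: 1, allowed_wdays: [1, 3])
```

The sketch also hints at the cost of filtering: every rejected candidate is work thrown away.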
With that accomplished, what came next?
The next step was to tackle time zones. Time zones in general are a mess. There are no real standards for naming them, and there's a lot of confusion about what they are. Most Rubyists who have dealt with time zones are probably familiar with the TZInfo gem, which provides access to the Olson time zone database which is commonly used by most modern operating systems. For better or worse, the Olson database has lots of alternate identifiers for time zones so they aren't unique. And RFC 2445 punts on the matter of standardized time zone identifiers, which means that time zones can't be passed by reference between icalendar-using applications, but must have their definitions contained within the icalendar data.
So RiCal contains a class representing a VTIMEZONE component, which can actually convert times between UTC and the time zone represented by the instance. As I mentioned, determining whether a particular time is in a standard or daylight saving period requires using the recurrence rules to find the right period.
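As a rough illustration of why time zone conversion depends on recurrence, the sketch below decides the UTC offset for US Eastern time by computing the two yearly transitions. It is not RiCal's VTIMEZONE class: the method names are hypothetical, and the rules are hard-coded to the US schedule in effect since 2007 (second Sunday in March, first Sunday in November), whereas a real VTIMEZONE carries them as recurrence rules in the calendar data.

```ruby
require "date"

# Hypothetical helper: the n-th occurrence of weekday +wday+
# (0 = Sunday) in the given month.
def nth_weekday(year, month, wday, n)
  first = Date.new(year, month, 1)
  first + ((wday - first.wday) % 7) + 7 * (n - 1)
end

# Hypothetical helper: UTC offset in hours for US Eastern time.
# Assumes the post-2007 US rules; earlier years used other dates.
def us_eastern_utc_offset(time)
  utc  = time.getutc
  year = utc.year
  dst_start = nth_weekday(year, 3, 0, 2)   # second Sunday in March
  dst_end   = nth_weekday(year, 11, 0, 1)  # first Sunday in November
  # 02:00 standard time is 07:00 UTC; 02:00 daylight time is 06:00 UTC.
  starts_at = Time.utc(year, 3, dst_start.mday, 7)
  ends_at   = Time.utc(year, 11, dst_end.mday, 6)
  utc >= starts_at && utc < ends_at ? -4 : -5
end

puts us_eastern_utc_offset(Time.utc(2009, 7, 1, 12))
```

Each lookup has to find the transitions surrounding the time in question, which is why the speed of occurrence enumeration matters so much for time zones.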
The flip side of that coin is that if you create say an event with RiCal from your Ruby/Rails application, and then export it to icalendar format data which can be understood by another application, it has to be exported in a calendar which contains VTIMEZONE components defining any time zone or time zones needed by that event. So RiCal also has a class which represents a VTIMEZONE component which gets its time zone data from TZInfo, and knows how to export a valid RFC 2445 external representation of that VTIMEZONE component along with any periods and when they transition.
With these two hurdles taken, were there more problems?
When I got that all done, I realized that although the occurrence enumeration worked rather efficiently on average, a few cases were driving that average in the wrong direction, and unfortunately those cases tended to be used in time zone definitions. In many such cases I was enumerating potential occurrence times for every day of the year, and throwing away all but those which fell on, say, the first Sunday in April.
So I went back to the drawing board, or maybe it was the shower where I seem to do most of my best "aha!" thinking, and came up with a new approach which now uses a chain of objects, one for each bypart and frequency/interval in the rule. This effectively 'compiles' a hybrid object the first time a particular recurrence rule is enumerated. More importantly it let me 'fast-forward' those every 1st Sunday in April cases.
It did take quite a while to do that refactoring and get things back to green, but the results seem worth it. That spec suite went from running in around 13 seconds to about 4-5. The suite has grown a bit since then but that's primarily because of added specs for new function unrelated to recurrences.
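The difference between the two strategies can be shown with a small sketch for the "first Sunday in April" case. Both methods are hypothetical illustrations rather than RiCal's chain of objects: the first scans every day of the year and discards almost all of them, the second jumps straight to April and computes the answer arithmetically.

```ruby
require "date"

# Scanning: visit every day of the year and keep the one that matches.
def first_sunday_in_april_by_scanning(year)
  Date.new(year, 1, 1).step(Date.new(year, 12, 31)).find do |day|
    day.month == 4 && day.sunday? && day.mday <= 7
  end
end

# Fast-forwarding: start at April 1 and add the number of days
# needed to reach the next Sunday (0 if April 1 is itself a Sunday).
def first_sunday_in_april_by_jumping(year)
  april_first = Date.new(year, 4, 1)
  april_first + ((7 - april_first.wday) % 7)
end

puts first_sunday_in_april_by_scanning(2009)
puts first_sunday_in_april_by_jumping(2009)
```

Both return the same date, but the scanning version examines roughly a hundred days per year before it finds a match, while the jumping version does a constant amount of work.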
You also created a DSL to create calendars, could you elaborate on that?
The past few weeks I've been working on a 'builder dsl' for RiCal. I've tried to make this as compatible as I can with the Icalendar and Vpim gems, but I've extended it to do things like picking up a time zone from a TimeWithZone object if someone is using a recent enough version of Rails (or actually ActiveSupport) to have those. I've also tried to make RiCal insensitive to whether or not you are using it in Rails. I don't require ActiveSupport (which might have made my job a bit easier), and although I need TZInfo::Timezone, I don't care whether it comes from the tzinfo gem or ActiveSupport.
There are a few things about the DSL which I'm not completely happy with right now which are the last issues before I 'officially' publish RiCal on RubyForge as well as on github.
The Readme on GitHub contains an example of the DSL:
description "MA-6 First US Manned Spaceflight"
dtend DateTime.parse("2/20/1962 19:43:02")
location "Cape Canaveral"
description "Segment 51"
So what is there left to do in RiCal?
As I said, right now I'm trying to put the final touches on the DSL API.
After that I'll listen to users. Although RiCal should make it easier for people to write Ruby apps which interchange with other calendar applications, the complexity of the spec means that not every app interprets the spec in the same way in every detail. I expect people will discover things that need to be attended to.
Follow on work might involve things like a rails plugin to help with modeling calendars and calendar components in ActiveRecord.
And there's still room for some help in building calendar user interfaces, but that and the active record plugin are likely to be separate but related projects.
Thank you very much Rick for taking the time to talk to us.