Scout - Extensible Server and Application Monitoring
Scout is an extensible server and application monitoring service which focuses upon ease of installation and configuration. Scout offers default alerts to help administrators understand how the application is behaving under various loads as well as allowing developers to create plugins to extend Scout.
The power of Scout stems from its ability to be extended by writing Ruby plugins such as those available for monitoring Ruby on Rails, Phusion Passenger, Nginx, MySQL and others.
InfoQ had the opportunity to speak with Andre Lewis, co-founder of Scoutapp.com, to talk about Scout and how it takes a different approach to helping developers and sysadmins monitor not just their servers but extend it into monitoring applications.
Robert Bazinet (RB): First, tell us a bit about Scout?
Andre Lewis (AL): Scout is hosted server monitoring. We asked ourselves: what's the best part of open source monitoring solutions? The flexibility. What's the worst part? The setup and ongoing maintenance. So, we combined a flexible plugin system with a hosted service.
The result: Scout is the easiest way to monitor your production environments. You can track all the major pieces of your infrastructure, get email and SMS alerts based on trends and thresholds, and drill down with charts and graphs.
RB: How do you ease data security and privacy concerns from customers using your service?
AL: We use industry-standard security. All data is sent over SSL (the same level your bank uses). We control the data rigorously on our servers. Finally, so much software is available as a service these days, whether it's Gmail for mail, Salesforce for business info, GitHub for source code, or Mint for personal finance. So acceptance of the hosted model is really high.
RB: How is Scout similar to other tools on the market like New Relic RPM?
AL: If a server goes down, or you are running out of disk space, or a job queue is backed up, you probably won't find out about it through New Relic. Any one of these incidents can break a web application, and that's where Scout comes in. Scout and New Relic form a great performance duo - we have plenty of customers that use New Relic for detailed Rails performance analysis and Scout for all of the other things that can bring their web applications down.
Scout's pricing scales very well when you have a larger number of servers. New Relic RPM can cost between 4x and 10x more depending on your setup.
RB: How does Scout compare to more traditional tools like Nagios?
AL: I use a home renovation analogy: someday, I'd be interested in remodeling my bathroom. I would learn a lot but I'll surely run into some frustrating problems as well. Just envisioning my wife yelling "there is no hot water" scares me. At the end of the day, it's often simpler to have a specialist do the work. Scout is a specialist. The work on your end is minimal and it doesn't distract from your core job (which probably isn't setting up monitoring). Software like Nagios and Ganglia are powerful tools if you have time to harness them. With developers and system administrators stretched, often there's just not enough time.
Another nice-to-have with Scout: we're constantly improving and refining the product. Since it's hosted, you get those upgrades without having to re-install or re-configure anything.
RB: How does Monit fit into the picture?
AL: Monit is great for automated intervention (e.g., restart a process if its memory usage goes above 200MB). We use Monit ourselves, and it's pretty easy to set up simple cases. I posted a simple "getting started" with Monit.
RB: What are some of the technical components of Scout and what parts are built in-house?
AL: From a user's perspective, there's just one thing you need: the Scout Agent, which is a small Ruby program distributed as a Ruby gem. A cron job runs the Scout agent every few minutes, and it collects performance metrics from your system.
Those metrics are sent back to scoutapp.com via secure HTTP. A question we get sometimes is, "does the agent open up any ports or accept any incoming connections?" The answer is no -- it communicates outwardly only, and only over regular HTTP port 80.
On the server side, Scout is built on a pretty standard open-source stack: Ruby, MySQL, RRDTool, Apache, and Linux. We use both the Ruby on Rails and Sinatra application frameworks: Sinatra is very lightweight with low overhead, and it powers the data collection mechanism. Rails is more full-featured, and powers the UI you see when you log into Scoutapp.com.
Of all the technical components we use, RRDTool is the one that's probably least-known. RRDTool is fantastic for storing huge quantities of time-series data quickly and efficiently. RRDTool is what makes the great graphs you see for all your metrics on Scout possible. It's also what makes it possible for us to display a historical graph of metrics you collected a year or more ago, which is key for long-term trends and capacity planning. If you're a Ruby developer and you're interested in seeing what we've done with it, we've released our Ruby library for interfacing with RRDTool.
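To illustrate the kind of storage described above: RRDTool is a round-robin database, meaning it writes into a fixed amount of space and lets new samples replace the oldest ones. The Ruby below is only a minimal sketch of that idea, not RRDTool itself and not Scout's released library. The class name `RoundRobinSeries` and its methods are hypothetical.

```ruby
# Fixed-size round-robin store: the newest sample overwrites the oldest,
# so storage never grows. All names here are hypothetical.
class RoundRobinSeries
  def initialize(slots)
    @slots = Array.new(slots) # pre-allocated, fixed capacity
    @count = 0                # total number of samples ever recorded
  end

  # Write the sample into the next slot, wrapping around at the end.
  def record(value)
    @slots[@count % @slots.size] = value
    @count += 1
  end

  # Retained samples in oldest-to-newest order.
  def values
    return @slots.first(@count) if @count < @slots.size

    @slots.rotate(@count % @slots.size)
  end

  # Mean of the retained samples, or nil when nothing has been recorded.
  def average
    retained = values
    retained.empty? ? nil : retained.sum.to_f / retained.size
  end
end

series = RoundRobinSeries.new(3)
[10, 20, 30, 40].each { |sample| series.record(sample) }
puts series.values.inspect
puts series.average
```

With three slots and four samples, the first sample is overwritten and only the three most recent remain. RRDTool additionally consolidates older data into coarser archives, which this sketch leaves out.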
RB: How have open source tools such as RRDTool been an advantage to your company?
AL: RRDTool has been amazing -- we would be a year behind where we are now without RRDTool or an equivalent. Managing volumes of time-series data so efficiently is a huge problem, and I'm really happy to have the problem already solved.
In fact, we tried to re-engineer what RRDTool does at one point, and it was a disaster. I blogged some lessons learned from the experience.
AL: We use Sinatra to collect incoming data from the Scout agents, so it has to process a ton of requests, it has to process them quickly, and it has to be ultra-reliable. Sinatra runs like a champ for this. Its request-response cycle is very lightweight. The Sinatra infrastructure adds virtually no overhead to our business logic, while making the code easy to understand and maintain.
The main Scout application, on the other hand, is more like a traditional web application. This is where users create their account, view alerts, configure trigger thresholds, browse graphs, etc. This part of the application really benefits from the full Rails stack. Rails is heavier than Sinatra, but it provides lots of infrastructure that makes writing web applications a breeze.
Another benefit of separating the data collection mechanism (Sinatra) from the end-user site (Rails): we can update the user site with zero interference to data collection. We update scoutapp.com frequently with new features and user feedback. From an operational standpoint, we like being able to push updates knowing that the business-critical data collection mechanism isn't affected.
RB: Are there any risks to your business, products or customers by using open source components?
AL: I don't see any risks. Just the opposite, in fact -- if we had to rely on traditional closed-sourced technologies (for OS, database, application server, etc), our prices would have to be much higher to cover licensing costs. That would make it riskier for us to start the business, and a worse deal for our customers. Open source is a great way to build a technology company.
RB: Does Scout monitor only Rails applications?
AL: Scout monitors much more than Rails applications. Much of what Scout monitors is system-level, so it's useful regardless of what language or application framework you're using. Just a few of the things Scout can monitor for you: CPU load, disk usage, MySQL performance and slow queries, I/O stats on any number of devices, Apache status, Nginx, EC2 CloudWatch, and more. We find that when something goes wrong in a production system, it's just as likely to be one of the other pieces of infrastructure as it is in your application. Having visibility into all the moving parts is what makes Scout valuable. You can see correlations that are hard to catch otherwise. For example, you might see that throughput in your application has gone down while CPU load has increased. Or you might see an abnormally large number of slow MySQL queries, correlate it to your IOStat numbers, and discover that a disk drive is close to failure.
Sometimes it's the simplest things that save you. Every Scout install has a trigger that automatically emails you when a disk becomes 80% full (of course you can change the threshold if you want). It recently sent me a notice about one of my open-source projects. Having that 80% email was the perfect reminder to go in there and see what's going on. In this case I had forgotten to configure logrotate, so the disk was filling up with logs.
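A threshold trigger like the disk-usage one described above can be sketched in a few lines of Ruby. This is an illustration under stated assumptions, not Scout's trigger code: `DiskReading` and `disk_alerts` are hypothetical names, the readings are made-up sample values, and only the 80% default comes from the interview.

```ruby
# Hypothetical threshold trigger; only the 80% default mirrors the interview.
DiskReading = Struct.new(:mount, :used_bytes, :total_bytes) do
  # Percentage of the disk in use, rounded to one decimal place.
  def percent_used
    (used_bytes * 100.0 / total_bytes).round(1)
  end
end

# Return one alert message for each disk at or above the threshold.
def disk_alerts(readings, threshold: 80.0)
  readings
    .select { |reading| reading.percent_used >= threshold }
    .map { |reading| "#{reading.mount} is #{reading.percent_used}% full (threshold #{threshold}%)" }
end

# Made-up sample readings for illustration.
readings = [
  DiskReading.new("/", 45_000, 100_000),
  DiskReading.new("/var/log", 85_000, 100_000)
]
alerts = disk_alerts(readings)
puts alerts
```

Passing a different `threshold:` value corresponds to changing the trigger level, as the answer above says users can.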
RB: If Scout can be used to monitor database servers such as MySQL, it seems, based on how Scout is implemented, that Ruby must be installed on any device a user wishes to monitor?
AL: That's correct, you need to have Ruby there for Scout to run. Ruby is increasingly commonplace, however, so it hasn't been much of a problem. Many Linux distributions already have Ruby installed. And if Ruby isn't already installed, it's just an 'apt-get' or a 'yum install' away.
RB: I see there are many plugins available, with source code. What is the plugin architecture employed by Scout?
AL: Plugins are at the heart of Scout, and they're what makes Scout so versatile and extensible. Plugins are small Ruby scripts, built on a simple API we provide. You can use any of the plugins in our directory with confidence that each is vetted and approved by us. But, all plugins are also open-source, so you can see exactly what they're doing if you like. If you need to adjust the behavior of a plugin, it's very easy to do, because the plugin source is typically very short and straightforward.
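To give a feel for what "a small Ruby script built on a simple API" can look like, here is a minimal sketch. It does not reproduce Scout's real plugin API, which is documented on the Scout site. The base class `SketchPlugin`, its `option` and `report` helpers, and the `QueueDepthPlugin` example are all hypothetical stand-ins written for this illustration.

```ruby
require "tmpdir"

# Stand-in base class, NOT Scout's real API: it only gathers reported fields.
class SketchPlugin
  def initialize(options = {})
    @options = options
    @fields = {}
  end

  # Run the plugin once and return everything it reported.
  def run
    build_report
    @fields
  end

  private

  # Look up a user-supplied option by name.
  def option(name)
    @options[name]
  end

  # Record one or more named metrics for this run.
  def report(fields)
    @fields.merge!(fields)
  end
end

# A hypothetical plugin: counts files waiting in a spool directory.
class QueueDepthPlugin < SketchPlugin
  def build_report
    report(queued_files: Dir.children(option(:directory)).size)
  end
end

# Exercise the plugin against a temporary directory holding two files.
result = Dir.mktmpdir do |dir|
  2.times { |i| File.write(File.join(dir, "job#{i}"), "") }
  QueueDepthPlugin.new(directory: dir).run
end
puts result.inspect
```

The point of the shape is that a plugin author writes only the `build_report` method. Scheduling, transport, graphing, and alerting stay outside the plugin.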
RB: Are developers free to create their own plugins?
AL: Absolutely -- if you need to monitor something completely new, just write a plugin for it. You'll find detailed instructions on creating and testing your plugin on our site. Once your plugin is up and running, it's a first-class citizen within Scout: you can graph its data alongside anything else, create triggers, get emails when it exceeds boundaries you set, and capture significant upward and downward trends.
Furthermore, if you want to share a plugin you've created, just tell us and we'll review it for the directory. We've got some really smart folks in our community, and the quality of user-created plugins is outstanding. Two recent examples are the Delayed Job plugin (a background-task runner for Ruby applications) and the Redis plugin. New plugins are being written all the time, so customers constantly get new capabilities in Scout.
RB: How do you see the upcoming release of Rails 3 impacting your business? Or does it matter? The re-architecting of Rails gives so many new touch points in an application, so would Scout be the kind of tool which would be used to take advantage of that and give us much more detailed information?
AL: As a technologist, I'm incredibly excited about Rails 3. I already have it installed and am upgrading some side projects. Operationally, it won't change things too much for us at Scout. The primary interface between Scout and a Rails application is through the log file, so we have to make sure our Rails plugin keeps pace with any log format changes in Rails 3. We'll certainly be Rails 3 compatible by the time Rails 3 exits beta.
Regarding the improved instrumentation points that Rails 3 provides -- while I wouldn't rule it out in the future, we're not currently planning on building on this. For very deep Rails instrumentation, we recommend running New Relic RPM alongside Scout.
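Because the log file is the interface, a monitoring plugin is tied to the exact shape of each log line. The sketch below shows the general technique of extracting request timings from log lines with a regular expression. It is not Scout's Rails plugin: the `COMPLETED` pattern, the `summarize` method, and the sample lines are assumptions made for illustration, loosely modeled on a Rails 2 style "Completed in" line.

```ruby
# Assumed line shape (hypothetical sample, not taken from Scout's plugin):
#   Completed in 120ms (View: 80, DB: 30) | 200 OK [http://example.com/]
COMPLETED = /Completed in (\d+)ms.*\| (\d{3})/

# Summarize request count, mean duration, and 5xx responses from log lines.
def summarize(log_lines)
  durations = []
  server_errors = 0

  log_lines.each do |line|
    match = COMPLETED.match(line)
    next unless match # skip lines that are not request completions

    durations << match[1].to_i
    server_errors += 1 if match[2].to_i >= 500
  end

  return { requests: 0, average_ms: nil, server_errors: 0 } if durations.empty?

  { requests: durations.size,
    average_ms: durations.sum.to_f / durations.size,
    server_errors: server_errors }
end

# Made-up sample lines in the assumed format.
sample = [
  "Processing PostsController#index (for 127.0.0.1) [GET]",
  "Completed in 120ms (View: 80, DB: 30) | 200 OK [http://example.com/posts]",
  "Completed in 80ms (View: 10, DB: 60) | 500 Internal Server Error [http://example.com/posts/1]"
]
summary = summarize(sample)
puts summary.inspect
```

If a new framework version changes the wording or layout of that line, the pattern stops matching and the metrics quietly drop to zero, which is why the plugin has to track log format changes.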
RB: How is your company preparing for Rails 3 with respect to your products?
AL: From a business perspective, it will be a while before we upgrade our internal infrastructure to Rails 3. Fortunately, I have side projects and open-source work I can use to experiment with Rails 3.
What's going to have a bigger impact for us is Ruby 1.9, which is significantly faster than 1.8. This will make a difference for us because we do a lot of background computation on the metrics coming into Scout. Ruby 1.9 will let us do more on current hardware. The gain isn't just for us, it's for everyone who uses Ruby. I'm really excited to see the momentum build around Ruby 1.9.
RB: I see Scout is now available on the Rackspace Cloud. Could you give some details about that relationship and how Scout works in the cloud on the Rackspace platform?
AL: Sometimes making a partnership smooth requires a significant investment. We were lucky with Rackspace. When we initially started talking, we realized neither of us needed to change to make monitoring a cloud environment easier. The tools were already there on both sides.
Using Scout in the Rackspace Cloud is trivial - a single line in a crontab file ensures that new servers booted in the cloud are monitored automatically by Scout. Rackspace has one of the simplest interfaces for managing a cloud environment, so it's simple to do. Just save a backup with Scout in the crontab file and you're done.
Perhaps the biggest problem with monitoring a cloud environment is that it's so dynamic. Spinning up new instances is often scripted and you want those scripts to be as simple as possible. Because all of the monitoring logic is hosted by us and updated on our interface, it reduces the number of moving pieces in a cloud environment.
RB: How does Scout work if you have lots of cloud instances?
AL: Tasks that might take an entire day with traditional monitoring tools often take minutes with Scout. Monitoring new instances is automatic and you can update monitoring on all instances at once.
Whenever you fire up a new cloud instance running Scout, it automatically assumes the monitoring profile of whatever image it was created from. So, set up Scout once on your base image, and you're done. Likewise, if you have a hundred instances already running, you can easily install a new plugin across all of them. You can tell Scout to update all instances with a few mouse clicks.
RB: What type of things can customers expect from Scout in the future?
AL: Our focus is on making Scout your server smoke detector. With a smoke detector, you just put batteries in and it works. There's no maintenance. Our customers need Scout to be the most reliable part of their support system. So, most of our work is focused on making things clear, simple, and efficient. It might mean updating documentation for a plugin one day or determining more efficient ways to judge overall cluster health. We don't have a firm roadmap - so much of Scout's direction has been based off of customer feedback - but it's always going to be about a smoke detector for your servers.
RB: Andre, thank you very much for your time discussing Scout.
More information about server and application monitoring with Scout can be found on the Scout web site. The folks at Scout have offered users interested in Scout a $10 discount off their first paid month by using the coupon code ‘infoq’. Scout has a 30-day trial, so the $10 discount will be applied to the first month following the 30 days.
About Andre Lewis
Andre Lewis is co-founder of Scoutapp.com, a hosted server and application monitoring company. Andre is passionate about building agile, profitable businesses and doing more with less. His technical tool of choice is Ruby -- he's been working with Ruby and Ruby on Rails since 2006, is a published Ruby author, and maintains several open-source projects. Andre is based in San Francisco, CA.