Monitoring Microservices at Scale at Crisp

by Hrishikesh Barua on Mar 24, 2018. Estimated reading time: 2 minutes

Crisp's engineering team shared their experience in monitoring their microservices. Vigil, their open source monitoring project, combines poll and push probes that collect health data, push integrations for multiple languages, a status dashboard, and integration with some external alerting tools.

Crisp offers a live chat solution for websites. Its monitoring toolset, Vigil, consists of probes and a dashboard that displays the status of the various microservices as collected by the probes. Vigil's probes fall into two categories: poll and push. Poll probes periodically poll a service over TCP or HTTP, and check the response and the response time against expected values. Push probes are integrated into the microservice's source code, and send periodic status information to Vigil from inside the service process. Both patterns are common in monitoring systems; most systems support both but favor one. Vigil is written in Rust, and ran as an internal project for a couple of years before being released as open source.
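To make the poll pattern concrete, here is a minimal Python sketch of a TCP poll probe. It is not Vigil's code (Vigil is written in Rust); the names `PollResult` and `poll_tcp` and the default timeout are assumptions made for illustration only.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class PollResult:
    reachable: bool         # True if the TCP connection was established
    elapsed_seconds: float  # time spent on the connection attempt


def poll_tcp(host: str, port: int, timeout: float = 2.0) -> PollResult:
    """Try to open a TCP connection and measure how long the attempt takes.

    Hypothetical helper for illustration; not part of Vigil.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        reachable = False
    return PollResult(reachable, time.monotonic() - start)
```

A scheduler would call `poll_tcp` at a fixed interval for each monitored service and compare `elapsed_seconds` with the expected response time.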

Crisp serves more than a billion requests per month. Their backend has more than 40 different microservices, most of them non-HTTP. Inter-service communication happens over RabbitMQ. Some of the HTTP-based ones, like the REST API, are behind a load balancer. In addition, there are around 20 daemon processes like Postfix and MongoDB.

Each microservice runs on multiple nodes, and a node is identified by a replica identifier. The dashboard shows each node's status: healthy, sick or dead. A node is marked 'sick' either when its reported system load (CPU or RAM) is above a threshold in push mode, or when its response takes too long in poll mode. A 'dead' status indicates that the service might be down.
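The rules above can be sketched as two small Python functions. The threshold values, the function names, and the mapping of an unreachable service to 'dead' are assumptions for illustration; the article does not give Vigil's actual thresholds or logic.

```python
from enum import Enum


class Status(Enum):
    HEALTHY = "healthy"
    SICK = "sick"
    DEAD = "dead"


def classify_poll(reachable: bool, elapsed_seconds: float,
                  sick_above_seconds: float = 1.0) -> Status:
    """Poll mode: unreachable counts as dead, a slow response as sick."""
    if not reachable:
        return Status.DEAD
    if elapsed_seconds > sick_above_seconds:
        return Status.SICK
    return Status.HEALTHY


def classify_push(cpu_load: float, ram_load: float,
                  sick_above: float = 0.9) -> Status:
    """Push mode: a reported CPU or RAM load above the threshold means sick.

    Loads are assumed to be fractions between 0.0 and 1.0.
    """
    if cpu_load > sick_above or ram_load > sick_above:
        return Status.SICK
    return Status.HEALTHY
```

Keeping the thresholds as parameters reflects that an acceptable load or latency differs from one service to another.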

InfoQ reached out to Valerian Saliou, CTO of Crisp, to find out more about how Vigil does internal as well as external monitoring:

When a node in the web of nodes goes down, we'll know as those microservice nodes are monitored in push mode, which means that if one goes down, it won't report and will quickly trigger a 'Down' notification from Vigil to our Slack and to the public status page, pinpointing the node that went down.
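The behavior Saliou describes, where a node that stops reporting is flagged as down, amounts to a heartbeat timeout. A minimal Python sketch follows, assuming a hypothetical `HeartbeatTracker` class and a 30-second default timeout that are not taken from Vigil:

```python
import time
from typing import Dict, List, Optional


class HeartbeatTracker:
    """Tracks the last push report per replica (illustrative, not Vigil's API)."""

    def __init__(self, dead_after_seconds: float = 30.0) -> None:
        self.dead_after_seconds = dead_after_seconds
        self._last_seen: Dict[str, float] = {}

    def report(self, replica_id: str, now: Optional[float] = None) -> None:
        # Record the time at which this replica last pushed a status report.
        self._last_seen[replica_id] = time.monotonic() if now is None else now

    def dead_replicas(self, now: Optional[float] = None) -> List[str]:
        # A replica is considered dead once it has been silent for too long.
        current = time.monotonic() if now is None else now
        return sorted(
            replica
            for replica, seen in self._last_seen.items()
            if current - seen > self.dead_after_seconds
        )
```

For example, with explicit timestamps:

```python
tracker = HeartbeatTracker(dead_after_seconds=10.0)
tracker.report("api-1", now=0.0)
tracker.report("api-2", now=8.0)
print(tracker.dead_replicas(now=12.0))  # ['api-1']
```

Replicas returned by `dead_replicas` are the ones a notifier would report to a chat channel or a status page.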

For external monitoring of end user endpoints, Vigil "checks the API at https://api.crisp.chat from a poll probe to check public access is OK", says Saliou, adding that "the same API microservice is also reporting via push, which is why you see two references to the API on the Crisp status page, under the 'Web' group and the 'Relay' group."

Vigil's push integration is available for multiple languages: Rust, Node.js and Go. Vigil can also send alerts through Slack and email, but there is no support yet for other popular alerting systems like Nagios and PagerDuty. At Crisp, Vigil currently runs on a single node. Redundancy is not on the roadmap, since the goal is "to have a simple status page that does the job and give SaaS developers / sysadmins easy access to a status page that costs nothing", says Saliou.
