Facebook’s Switch from ntpd to chrony for a More Accurate, Scalable NTP Service

Facebook's engineering team described how they built a more accurate and scalable Network Time Protocol (NTP) service by replacing ntpd with chrony and adopting a multi-layered architecture.

Facebook switched from ntpd to chrony after their tests found it to be "more accurate and scalable" than ntpd, with improvements from "10 milliseconds to 100 microseconds". The tests were aimed at bringing synchronization accuracy into the microsecond-to-nanosecond range to support distributed databases and logging systems. The team also shared how they verified these test results. Many cloud and content delivery network providers offer public NTP servers that can be used by anybody.

The Network Time Protocol - which has been around since before 1985 - is the standard method for synchronizing clocks between computers. Implementations of NTP have to account for leap seconds: extra seconds, usually inserted on June 30th or December 31st, that compensate for irregularities in the Earth's rotation. One way to handle a leap second is "smearing" - spreading the extra second over a period of time instead of applying it all at once. Smearing is used by the Google, AWS, and Facebook NTP servers. Cloudflare's server, however, does not implement it. ntpd - the reference implementation of NTP - is widely used by servers to synchronize their clocks. chrony is another implementation of NTP as well as of the Extended NTP. The Extended NTP "enables NTP servers to provide their clients and peers with more accurate transmit timestamps".
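The smearing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear shape, the 24-hour window, and the function name `smear_offset` are chosen for the example and are not taken from Facebook's write-up, which may use a different curve or window length.

```python
def smear_offset(seconds_into_window: float,
                 window_seconds: float = 86_400.0,
                 leap_seconds: float = 1.0) -> float:
    """Return how much of the leap second has been absorbed so far.

    Assumption: a linear smear over a fixed window (24 hours by default).
    Real servers may use another window length or a non-linear curve.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Clamp progress to [0, 1]: nothing is applied before the window
    # starts, and the full leap second is applied once it has ended.
    progress = min(max(seconds_into_window / window_seconds, 0.0), 1.0)
    return leap_seconds * progress


if __name__ == "__main__":
    # Print the absorbed offset at the start, middle, and end of the window.
    for elapsed in (0.0, 43_200.0, 86_400.0):
        print(elapsed, smear_offset(elapsed))
```

With a smear like this, clients never see a repeated or 61st second; their clocks simply run very slightly slow for the duration of the window.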

Facebook's multi-layered time server architecture consists of satellites with precise atomic clocks at the top layer. Facebook's own atomic clocks sync with one of these, forming the second layer. A pool of NTP servers synchronizing with these clocks forms the next layer, and smearing is implemented here. The last layer consists of a number of servers that can serve more traffic and deal with smeared time only.

Facebook's test measurements included the estimated error when two computer clocks sync. chrony displayed a lower estimated error (in the range of microseconds) than ntpd (in the range of milliseconds). Such measurements are subject to certain assumptions about the network topology as well as the characteristics of the hardware, so the team also tested with dedicated hardware testing devices. Using hardware timestamps (present in some network cards) bypasses possible delays arising from CPU scheduling and host address resolution. After running tests inside their dedicated internal networks, Facebook reran them on public networks against other public time servers as well.
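To make the dependence on network assumptions concrete, here is a minimal Python sketch of the standard NTP offset and round-trip delay calculation from the four timestamps of one request/response exchange. The class and function names are hypothetical and the sample timestamps are invented for illustration; this is not the code of chrony or ntpd.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NtpExchange:
    t1: float  # client sends request (client clock)
    t2: float  # server receives request (server clock)
    t3: float  # server sends response (server clock)
    t4: float  # client receives response (client clock)


def clock_offset(x: NtpExchange) -> float:
    """Estimated server-minus-client offset, assuming a symmetric path."""
    return ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2.0


def round_trip_delay(x: NtpExchange) -> float:
    """Time spent on the network, excluding server processing time."""
    return (x.t4 - x.t1) - (x.t3 - x.t2)


if __name__ == "__main__":
    # Invented timestamps, in seconds, for illustration only.
    sample = NtpExchange(t1=0.0, t2=10.5, t3=10.75, t4=1.25)
    print("offset:", clock_offset(sample))
    print("delay:", round_trip_delay(sample))
```

The offset formula is exact only if the request and response take equally long to travel. If the path is asymmetric, the estimate can be wrong by up to half the round-trip delay, which is one reason timestamps taken closer to the wire, such as hardware timestamps, help.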

Facebook's public NTP servers are situated in five different locations across their global infrastructure.
