How GitHub Revamped its DNS Infrastructure

by Sergio De Simone on Jun 04, 2017. Estimated reading time: 1 minute

GitHub has moved from a simple DNS infrastructure, which served its requirements fairly well for many years, to a new architecture that better supports working at GitHub scale, writes GitHub senior infrastructure engineer Joe Williams.

Among the reasons that led GitHub to a new model for dealing with DNS, Williams mentions that many applications are sensitive to DNS resolution performance or availability. Problems with DNS can degrade performance for customers and may even cause outages, which was a particular risk with the old infrastructure whenever configuration or code changes were made. Additionally, it was difficult to identify the root cause of a malfunction, and the only tool engineers could use was tcpdump. Besides addressing those issues, GitHub engineers also aimed to:

  • Add flexibility to the way internal and external zones are served, keeping internal zones invisible from the outside unless specifically configured otherwise, while also guaranteeing that external zones can be reached from the inside without leaving the internal network.
  • Improve role isolation between caches and authorities.
  • Support both deploy-based and API-based workflows for automated changes.
  • Avoid any external dependencies to improve reliability.

The resulting architecture that GitHub designed included three kinds of nodes:

  • Caches, which live in data centers and are responsible for providing live data to applications without requiring them to cross data-center boundaries.
  • Edges, which are authorities at the regional level and act as a gateway for the data center by handling requests from the caches; they are also in charge of performing zone transfers.
  • Authorities, which serve as DNS masters and manage zone transfers from edge nodes, as well as providing an HTTP API to create, modify, or delete records.
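
The division of labor among the three kinds of nodes can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not GitHub's implementation: the class names `Authority`, `Edge`, and `Cache`, the serial-number check, and the sample record are all hypothetical, and real DNS concerns such as TTLs, record types, and the network itself are left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Authority:
    """Hypothetical DNS master: owns the records and accepts changes."""
    records: dict[str, str] = field(default_factory=dict)
    serial: int = 0

    def upsert(self, name: str, value: str) -> None:
        # Create or modify a record and bump the zone serial.
        self.records[name] = value
        self.serial += 1

    def delete(self, name: str) -> None:
        # Remove a record if present and bump the zone serial.
        self.records.pop(name, None)
        self.serial += 1


@dataclass
class Edge:
    """Hypothetical regional authority holding a transferred copy of the zone."""
    records: dict[str, str] = field(default_factory=dict)
    serial: int = -1

    def transfer_from(self, authority: Authority) -> bool:
        # Copy the whole zone only when the master's serial differs from ours.
        if authority.serial == self.serial:
            return False
        self.records = dict(authority.records)
        self.serial = authority.serial
        return True

    def query(self, name: str) -> str | None:
        return self.records.get(name)


@dataclass
class Cache:
    """Hypothetical data-center cache that asks its edge only on a miss."""
    edge: Edge
    answers: dict[str, str] = field(default_factory=dict)

    def resolve(self, name: str) -> str | None:
        if name not in self.answers:
            answer = self.edge.query(name)
            if answer is None:
                return None
            self.answers[name] = answer
        return self.answers[name]


# Example wiring: a change lands on the authority, is transferred to the
# edge, and is then served to applications through the cache.
authority = Authority()
authority.upsert("db.example.internal", "192.0.2.10")  # placeholder record
edge = Edge()
edge.transfer_from(authority)
cache = Cache(edge)
cache.resolve("db.example.internal")
```

In this sketch an application only ever talks to `Cache`, which mirrors the goal of keeping lookups inside the data center. Because the toy cache has no expiry, it would keep serving a record after it is deleted upstream, which a real cache avoids through TTLs.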

Another area where GitHub’s new DNS infrastructure has brought benefits is logging. Based on their logging requirements, GitHub engineers have chosen to use Unbound for caches, NSD for edge hosts, and PowerDNS for authorities.

As mentioned, external zones using the github.com domain can be accessed from internal zones, using the github.net domain, without ever communicating with the external DNS providers. This is made possible by Unbound, which additionally supports the option to access the external network in case the internal DNS fails.
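
The routing decision described above can be sketched as a small function. This is a simplified illustration, not Unbound's configuration or behavior: the names `in_zone`, `resolve`, and `UpstreamDown` are hypothetical, the two lookup callables stand in for the internal DNS hosts and the external resolvers, and the addresses are documentation placeholders.

```python
from __future__ import annotations

from typing import Callable, Iterable, Optional

# A lookup takes a query name and returns an address, or None if no record exists.
Lookup = Callable[[str], Optional[str]]


class UpstreamDown(Exception):
    """Hypothetical error raised when the internal DNS cannot be reached."""


def in_zone(qname: str, zone: str) -> bool:
    """Return True when qname is the zone apex or a name beneath it."""
    qname = qname.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    return qname == zone or qname.endswith("." + zone)


def resolve(
    qname: str,
    internal_zones: Iterable[str],
    internal: Lookup,
    external: Lookup,
) -> Optional[str]:
    """Answer configured zones internally, falling back to external on failure."""
    if any(in_zone(qname, zone) for zone in internal_zones):
        try:
            # Zones served internally never leave the internal network.
            return internal(qname)
        except UpstreamDown:
            pass  # Internal DNS failed: fall through to the external path.
    # Names outside the configured zones, or any name while the internal
    # DNS is down, are resolved externally.
    return external(qname)


# Example: both domains mentioned in the article are treated as internal zones.
zones = ["github.net", "github.com"]
answer = resolve(
    "api.github.com",
    zones,
    internal=lambda name: "192.0.2.20",    # placeholder internal answer
    external=lambda name: "203.0.113.20",  # placeholder external answer
)
```

Matching on whole labels rather than on a raw suffix matters here: a name such as `notgithub.com` must not be treated as part of `github.com`.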

There are a lot more details in Williams’ post, so make sure to read it in its entirety.
