
Scaling Global Traffic at Dropbox with Edge Locations and GSLB

by Hrishikesh Barua on Oct 27, 2018. Estimated reading time: 3 minutes

The Dropbox engineering team shared their experience of architecting and scaling their global network of edge locations. Located around the globe, these run a custom stack of NGINX and IPVS, and connect to the Dropbox backend servers over their backbone network. A combination of GeoDNS and BGP Anycast ensures availability and low latency for end users.

Dropbox manages exabytes of data and petabytes of metadata for half a billion users. They have 20 points of presence (POPs) across the world to provide low-latency downloads and uploads for end users. POPs, also known as edge servers, are used by Content Delivery Networks (CDNs) to serve content to end users from the location that is geographically closest to them. Dropbox initially stored users' files on Amazon S3, while their web servers and metadata servers ran in their self-managed datacenters. They later moved to their own storage infrastructure to fine-tune performance and customize the hardware and software used for block storage.

Dropbox's global network spans seven countries, supports IPv6, and has exchange points with ISPs and peering partners. In general, internet traffic from one network to another can flow either via transit agreements with ISPs, which is a paid service, or via peering agreements, which are usually free. The edge proxy architecture at Dropbox is part of this network: Dropbox deployed their servers in the end-user-facing POPs. Building their own POPs meant that they had to configure their own routing architecture. Routing uses two kinds of protocols. Interior gateway protocols like OSPF and IS-IS route traffic inside an autonomous system (AS). Routing between ASes over the internet uses exterior gateway protocols like BGP. Dropbox started with BGP and OSPF and had moved to IS-IS for internal routing by 2015. They also increased peering relationships with other networks while keeping their transit agreements, which gave them more control over traffic engineering.
 
A Dropbox edge location has several components that help it function. Global server load balancing (GSLB), Anycast and hybrid routing for BGP, and real user metrics (RUM) collection to assess actual performance are the key ones. Factors like backbone network capacity, peering connectivity, and undersea cables affect the process of setting up new POPs. Population density as well as the potential number of new users also plays a role.

Image courtesy: https://blogs.dropbox.com/tech/2017/06/evolution-of-dropboxs-edge-network/

GSLB is the entry point for edge locations, as it decides which POP a user request should be routed to. Dropbox uses multiple GSLB techniques, but the preferred one is a hybrid approach. BGP Anycast is the easiest to configure. In Anycast, multiple computers (or edge locations in this case) have the same IP address, and routing ensures that packets are sent to the location nearest to the user. Dropbox uses it mostly as a fallback mechanism because it has performance, control, and debugging issues.

Another technique, GeoDNS, relies on DNS to return the IP address of a POP based on the end user's location. However, if the DNS mapping changes to a different POP, it can take a long time for clients to resolve to the new IP even when DNS TTLs are set to low values, since many ISPs ignore the setting. The key difference between these two routing mechanisms is that with Anycast a name resolves to the same IP address everywhere and BGP routing picks the POP after that, whereas GeoDNS resolves the name to the IP address of the POP that is closest to the user.
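The GeoDNS idea can be illustrated with a minimal Python sketch. It is not Dropbox's implementation: the POP names, coordinates, and IP addresses (taken from the reserved documentation ranges) are hypothetical, and the client's location is assumed to be already known, whereas a real GeoDNS service has to estimate it, typically from the IP address of the DNS resolver making the query.

```python
import math

# Hypothetical POP table: name -> (IP address, latitude, longitude).
# The IPs come from the reserved documentation ranges, not real POPs.
POPS = {
    "pop-ams": ("192.0.2.10", 52.37, 4.90),
    "pop-sjc": ("198.51.100.10", 37.34, -121.89),
    "pop-nrt": ("203.0.113.10", 35.68, 139.65),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on Earth."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def resolve_geodns(client_lat, client_lon, pops=POPS):
    """Return (pop_name, ip) for the POP geographically closest to the client."""
    name = min(
        pops,
        key=lambda n: haversine_km(client_lat, client_lon, pops[n][1], pops[n][2]),
    )
    return name, pops[name][0]
```

With this table, a client near Paris (`resolve_geodns(48.85, 2.35)`) is mapped to the hypothetical `pop-ams`. Geographic distance is only a proxy for latency, which is one reason measurements from real users matter when tuning such a mapping.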

In the hybrid approach, GeoDNS maps multiple POP addresses to the same name, and BGP announces both the POPs' individual subnets and their supernet (a larger prefix that combines two or more subnets). When a POP goes down, its subnet is no longer announced, and routing sends users to an available POP via the supernet without the need to change the DNS mapping.
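The fallback behaviour follows from longest-prefix matching: routers prefer the most specific announced prefix that contains the destination address. The sketch below models that rule with Python's standard `ipaddress` module. The prefixes, POP names, and the `best_route` and `withdraw` helpers are assumptions for illustration only; real BGP path selection involves many more attributes, and prefixes longer than /24 are commonly filtered on the public internet.

```python
import ipaddress


def best_route(dest_ip, announcements):
    """Pick the POP whose announced prefix is the most specific match.

    announcements is a list of (prefix, pop_name) pairs. Returns None when
    no prefix covers dest_ip. Ties keep the first matching announcement,
    which stands in for BGP's real tie-breaking rules.
    """
    addr = ipaddress.ip_address(dest_ip)
    best = None
    for prefix, pop in announcements:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, pop)
    return best[1] if best else None


def withdraw(announcements, pop_name):
    """Simulate a POP going down: all of its announcements disappear."""
    return [(prefix, pop) for prefix, pop in announcements if pop != pop_name]


# Hypothetical setup: each POP announces its own subnet plus the shared supernet.
ANNOUNCEMENTS = [
    ("203.0.113.0/25", "pop-a"),    # pop-a's own subnet
    ("203.0.113.0/24", "pop-a"),    # shared supernet
    ("203.0.113.128/25", "pop-b"),  # pop-b's own subnet
    ("203.0.113.0/24", "pop-b"),    # shared supernet
]
```

While both POPs are up, `best_route("203.0.113.10", ANNOUNCEMENTS)` returns `"pop-a"` through its more specific subnet. After `withdraw(ANNOUNCEMENTS, "pop-a")`, the same address is still reachable, now at `"pop-b"` through the supernet, and the DNS record never had to change.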

Dropbox uses several tools to measure the performance of its global network and its desktop clients. The clients have a built-in measurement framework that captures latency data, anonymizes it, and sends it back for analysis. Users can also use "debug" sites like dropbox-debug to capture network characteristics and send them back to Dropbox.

Dropbox POPs are built with NGINX and IPVS and handle user-facing connections. SSL is terminated at the POP, which connects to the Dropbox backend servers. IPVS load balancers distribute the TCP traffic across multiple NGINX servers, which act as L7 (HTTP in this case) proxies. These proxies maintain encrypted, persistent connections to the Dropbox backend servers over the internal backbone network to serve content.
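To make the L4 step concrete, here is a small Python sketch of a load balancer handing each new connection to the next proxy in turn. Round robin is one of the scheduling algorithms IPVS offers, but the article does not say which scheduler Dropbox uses, so that choice, the `RoundRobinBalancer` class, and the backend addresses are all assumptions.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order, one per new connection."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        # itertools.cycle repeats the backend list indefinitely.
        self._cycle = itertools.cycle(self._backends)

    def pick(self):
        """Return the backend that should receive the next connection."""
        return next(self._cycle)


# Hypothetical NGINX proxy addresses inside one POP.
balancer = RoundRobinBalancer(["10.0.0.1:443", "10.0.0.2:443", "10.0.0.3:443"])
first_four = [balancer.pick() for _ in range(4)]
```

After three connections the rotation wraps around, so the fourth connection goes to the first proxy again. The balancer only decides where a TCP connection goes; the HTTP-level work happens in the proxy that receives it.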
