
Living on the Edge: Running Code and Serving Data with Edge Services


Summary

Erica Pisani discusses what the edge is, how running code and serving data on the edge can improve the performance of services, and how to leverage these tools to maximize performance.

Bio

Erica Pisani is currently a Sr. Software Engineer on the integrations team at Netlify. She's worked in a number of startups across a variety of industries throughout her career, including financial/small business technology, human resources software, and pharmaceutical research technology.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Pisani: Welcome to living on the edge, boosting your performance with edge computing. My name is Erica. I am a Senior Software Engineer at Netlify, working on the integrations team there.

We're first going to talk about what the edge is. Some of you might already be familiar with it in a cloud computing context. Some of you might never have heard about it before this conference, that's ok.

We're going to start from ground zero and bring everyone up to the same level of understanding before moving on to web, backend, and mobile application functionality on the edge. Then we're going to talk about data on the edge. Then there's a fun section to wrap up called the edgiest part of the edge. I'll leave that as a bit of a surprise.

What Is the Edge?

What is the edge? In order to understand that we need to take a quick step back and understand how cloud providers organize their data centers. On the broadest possible scope, they organize these things by region, so think your Canada Central, your U.S.-East 1. Within each of these regions, there are multiple availability zones. When we're talking about origin servers, we're typically talking about servers located in one of these availability zones.

The edge is made up of data centers that live outside of an availability zone. An Edge Function is a function that runs in one of these data centers, and data on the edge is data that is cached, stored, or accessed at one of them. Depending on the provider that you use, whether that's AWS, Google Cloud Platform, or Microsoft Azure, you might see different terminology: points of presence, or POPs, is one, and edge locations is another. They all mean the same thing at the end of the day; everyone just wants to use different terminology.

To get a sense of just how many more edge locations there are in the world relative to availability zones, this is a map that I took from AWS's documentation that shows all of their availability zones within their global network. The next slide shows all their edge locations. The blue dots are individual edge locations, the purple ones show multiple, and those yellowish-orangish circles are regional edge caches, which give you a bit more caching capability closer to your users without needing to go all the way to the origin server.

Just to look at these two things side by side, you can see at a glance that if you're able to handle a user request in its entirety, or even part of it through the edge network, you can improve your services' performance significantly due to the lower latency incurred from receiving and fulfilling the request.

In particular, I want to call out the Australia, Africa, and South America regions, where they each have one availability zone, which you can see on the left-hand side image, but multiple edge locations. If for some reason you have a business or regulatory requirement to host a user's data in the same region that they live in, and they live on the opposite end of that region relative to the origin server, the performance of what you're building can be significantly improved if you can handle just a few of the more popular requests at the edge.

To see where the edge fits into the lifecycle of a request: the user in this case could be someone making a request through a website, someone using a mobile application on their phone, or an Internet of Things device. Whichever one it is, it'll make a request, and the edge location will be the first place to pick that up. As mentioned before, in the best case scenario the edge location is able to handle the request in its entirety and send the response back to the user.

Let's say that it can't for whatever reason: a deploy recently went out and the cache was invalidated, or maybe this is a request with a lot of personalization, something that you don't generally cache. What will happen is the edge location will send a request to the origin server. The origin server will handle it as per usual, and then it will send back a response to the edge location.

If this is something that's more generalized, you can choose to cache that response at the edge before sending the response back to the user. If this is the more generalized response that you want to cache, all these other users in the area that likely want the exact same response to that request will benefit from significantly faster responses. The price that you're paying is just that initial cold start of having to reach the origin server and come back.

Now that we've done an overview of what the edge is, and where it fits in the lifecycle of a request, let's look at some functionality running at the edge using Edge Functions. I thought it'd be easiest to demonstrate the performance boosting capabilities of the edge by looking at some common performance challenges. Let's take a look at the first one. To set the scene here, rather than working as a software developer, I instead run a thriving global e-commerce business, perhaps a pet store.

I want to show different banner messages to users based on where they're located in the world. At the moment, the pages with the banner messages are created using server-side rendering. I'm looking for opportunities to remove load from the server and return cached responses without any major rearchitecting of my site and how it functions.

The solution where the edge can come into play to boost performance here is transitioning that server-side rendered page into a static one, so that it can be cached at build time on the Content Delivery Network, or CDN, for faster access going forward. That removes the load on the origin server that existed previously, where it had to render the page based on the data it received on every single request.

To go through a code example of what this looks like, I'm using a Next.js application for my site. You'll see on the left-hand side to accomplish this, I'm using some Next.js middleware. On the right-hand side, I've got my server-side rendered page. To walk through the code here, we're getting the geo object off the request, we're getting the country off the geo object. We're taking the value of the country and adding it to the URL before rewriting the request. Then on the right-hand side, we're getting the country value off the query object.
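As a rough sketch of what's on the slides (the file names, fallback country, and banner copy are my own illustration, not the exact talk code), the middleware and the server-side rendered page might look something like this:

    // middleware.ts -- rewrite the request with the user's country as a query param.
    import { NextRequest, NextResponse } from 'next/server';

    export function middleware(request: NextRequest) {
      // request.geo is populated by the hosting platform (e.g. Netlify, Vercel).
      const country = request.geo?.country ?? 'US';
      const url = request.nextUrl.clone();
      url.searchParams.set('country', country);
      return NextResponse.rewrite(url);
    }

    // pages/index.tsx -- rendered on the origin server on every request.
    export async function getServerSideProps({ query }: { query: { country?: string } }) {
      return { props: { country: query.country ?? 'US' } };
    }

    export default function Home({ country }: { country: string }) {
      return country === 'CA'
        ? <p>Welcome! Here is 50% off your order.</p>
        : <p>Hello World</p>;
    }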

Then this is where I can show different banner messages depending on what I want to promote, whether it's maybe a new storefront that's opening in someone's city, doing a sale, or just a friendly Hello World. In this case, I am showing most of the world a Hello World message. To celebrate me being here at QCon NYC, I'm giving my fellow Canadians a 50% off their order promo code.

As a refresher, the downsides to this approach are that the server has to render the page every single time, and the result is not cacheable on the CDN. If this is a small, fun hobby site, you could argue that you could create different static pages to route to in the middleware in order to make those pages cacheable. But I think we've all had the experience that even with a little bit of duplication, things get out of sync really quickly. We want to avoid that at all costs.

That brings us to our static and Edge Functions example. You'll notice that the middleware has a little bit more code in it now. One thing to note that is not obvious from the code itself is that instead of running on the origin server, like what was happening in the first example, the middleware function is now running on the edge. To go through this code: again, we're getting the geo object off the request, and the country off the geo object.

The const request = new MiddlewareRequest(...) line, we're going to put a pin in that for one moment; that functionality comes in at the end. The next line after that, const response = await request.next(), is making the outbound request to the origin server. Instead of actually hitting the origin server, though, it's getting the cached asset of the static page from the CDN. At this point in time, the value of response is the HTML showing the Hello World message.

Now, because we need to inject our localized content, assuming that the user is in Canada, at this point we'll create the message we want to show. This is where that new MiddlewareRequest variable comes in. The first thing we're going to do is call the replaceText method, which replaces the text in the HTML with the message we want to show. Then that setPageProp method right below it updates the Next.js page props before sending the response back to the user.
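Sketched out, assuming the experimental @netlify/next helpers that this example appears to use (the #banner selector and the exact message are illustrative):

    // middleware.ts -- now running at the edge, rewriting a cached static page.
    import { NextRequest } from 'next/server';
    import { MiddlewareRequest } from '@netlify/next';

    export async function middleware(nextRequest: NextRequest) {
      const country = nextRequest.geo?.country ?? 'US';
      const request = new MiddlewareRequest(nextRequest);

      // Fetches the pre-rendered page from the CDN rather than the origin server.
      const response = await request.next();

      if (country === 'CA') {
        const message = 'Welcome! Here is 50% off your order.';
        // Rewrite the cached HTML in place and keep the Next.js page props in sync.
        response.replaceText('#banner', message);
        response.setPageProp('message', message);
      }
      return response;
    }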

There's a bunch of little performance boosts that are happening as a result of this change. The first is that the middleware being on the edge means the request starts getting handled sooner. The page on the CDN means that that asset is cached and returned more quickly as well when you do have to make a request for it. Even though I'm not doing it explicitly here in this middleware function, you could cache that response at the edge, which will mean that future requests are even faster, like we saw in that sample request path slide.

Let's say that some of these high traffic pages on my website require a user to have an account. I happen to notice in my search for easy performance boosting wins, that the user session validation is taking more time than I'd like for some groups of users, because they're physically further away from the origin server where my site is hosted. It's often the case that those users are not initially signed in. How can I use the edge in this case to improve my site's performance for those users?

The answer is moving that user session validation into an Edge Function. Let's take a look at what that looks like. Building on the middleware example we just saw, I've added Auth0 as an import, and with the two added lines of code surrounded by the white boxes there, I'm protecting my whole site with Auth0. At the moment, this code just requires that users create an account to access the site, but you could do far more sophisticated logic. You could check for a roles value on something like a JSON Web Token or a session cookie, and verify that the user has the correct role before giving them access to particular pages on the site.
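A minimal sketch of that session-validation piece, assuming Auth0's edge-compatible Next.js SDK (the exact package and wiring from the slide aren't shown here):

    // middleware.ts -- every route now requires a valid Auth0 session,
    // checked at the edge before the request ever reaches the origin.
    import { withMiddlewareAuthRequired } from '@auth0/nextjs-auth0/edge';
    import { NextResponse } from 'next/server';

    export default withMiddlewareAuthRequired(async function middleware(req) {
      // The geolocation/banner logic from the previous example could run here.
      return NextResponse.next();
    });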

The last example I want to talk about in this section is this problem of routing a third-party integrations request to the correct region. This was something that I ran into in a previous role of mine. To get a good understanding of this, I want to walk through the problem with you all before talking about how the edge could be used to address this. We had two instances of our site. We had one in North America, which was our original instance, and one in the EU. Users and their data would live in one of these instances, but not both, because we were looking to comply with certain privacy laws and customer requests that involved hosting data in a particular region.

We also had third-party OAuth integrations that we needed to support, which wanted to access or modify our users' data. Because these integrations were potentially not located in the same region as our users (honestly, most of the time they weren't), we couldn't use geo detection to make an assumption about where the user's data lived, as we did in the earlier middleware example. To go through what an authorization request would have looked like, we're going to use an integration in Australia, really just to hammer home how bad this could get.

When the integration was being enabled by a user on the third-party integrations website, the request would go to my company's original origin server first in North America. We would check at that point to see if the user existed within that region. If it didn't, we would have to send a request to the EU instance to see if the user existed there. Assuming that the user existed, we would then send a response back to the integration with the data needed as part of the authorization flow.

For subsequent requests, the integrations expected that they could make a request to either instance, and we would redirect the request to the correct instance on our end to access and modify the user's data on their behalf. That is a lot of back and forth across the world. As a junior dev, I remember hearing as a rule of thumb that a highly performant request is under 300 milliseconds, and every across-the-ocean hop added 150 milliseconds of latency to a request.

You can imagine, going through that diagram, that the request just hung there on the authorization. While I think as users we're trained to expect that the initial authorization might take a little more time, it was completely unacceptable to have that happen for subsequent requests, so we focused mainly on how to improve those.

The first thing we considered was returning a URL corresponding to the instance where the user's data was hosted as part of the authorization response, so that the integration could then make requests directly to the correct instance without needing to be redirected. This meant leaking implementation details, which obviously isn't ideal, especially as we added more instances going forward. We also considered encoding the region on a JSON Web Token as part of the OAuth authorization response.

That meant we didn't need to query the database to determine if the user was hosted in that region, so we could take some load off of our database and save a little time on the overall request. The downside to this approach is that we would have had to run proxy servers in both the North American and EU regions to route to the correct region whenever a decoded token showed that the user's data belonged in a different region than the one receiving the request. How could the edge have helped here?

The answer is using the data that's encoded in the JSON Web Token, but reading it from an Edge Function rather than at an origin server. With the Edge Function approach we're still responsible for routing, but it drastically reduces the latency of the initial request made by the integration, because the request isn't going to the EU or North America only to bounce somewhere else again.

It would just go to the edge location and then directly to the EU, or directly to North America. And if this is a request that's made frequently and is a bit more general in nature, we can cache the response at the edge so that future requests are even faster to fulfill.
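To make that concrete, here is a sketch of what such routing could look like in an Edge Function. The region claim name, the origin URLs, and the use of the jose library are all assumptions for illustration; a real deployment would also verify the token's signature rather than just decoding it:

    import { decodeJwt } from 'jose';

    // Hypothetical regional instances.
    const ORIGINS: Record<string, string> = {
      na: 'https://na.example.com',
      eu: 'https://eu.example.com',
    };

    export default async function handler(request: Request): Promise<Response> {
      const token = request.headers.get('authorization')?.replace('Bearer ', '');
      if (!token) return new Response('Unauthorized', { status: 401 });

      // Read the region claim off the token -- no database lookup needed.
      const { region } = decodeJwt(token) as { region?: string };
      const origin = ORIGINS[region ?? 'na'];

      // Forward straight from the edge to the correct regional instance.
      const url = new URL(request.url);
      return fetch(new Request(origin + url.pathname + url.search, request));
    }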

Backend and Mobile Development and The Edge

While the problems we just looked at are a little more web-development centric, the edge isn't just for addressing challenges that web developers face; it can bring similar benefits to backend and mobile services. Both have some similar challenges, so we're going to quickly talk about backend applications first before switching to mobile. Backend applications may be handling direct requests from users.

As an example, let's say we're building an API service. This might be exposed to external users to consume, or it might be an internal API that's maybe powering a user facing website or frontend application. In both cases, maintaining a consistent backwards compatible API is a requirement because you want to still be able to evolve your services without breaking things for your users. This backend application might be making multiple requests to the database as part of fulfilling a request.

Continuing the example of an API, it could be for an e-commerce service. This endpoint is fetching user data from one table in the database and then querying a different table for all the orders that were made. The application might also be making requests to other backend services. Continuing with our example, based on those recent orders, maybe the next request is made to a ChatGPT-powered AI service to ask for items that should be recommended to the user to buy next.

To switch to mobile development, there are a couple of concerns that mobile developers have more top of mind than web or backend developers do. One of these is intermittent connectivity. We've likely all experienced a dead zone or a loss of signal at inconvenient moments. With this in mind, the speed at which requests are fulfilled is crucial to ensuring, with a higher degree of certainty, that the user gets a response to an action they've taken. The other thing is that people don't update their apps.

I think a number of us out there are famous for being on work calls with the red prompt on Chrome to update the browser. We've gotten a lot better as an industry by automating this, but there are still some users out there who can't update. As an example, my younger sister will hold on to her phone for as long as possible. In general, my family does this. I think my mom made it 8 years on a swivel Sony Ericsson phone, if people remember those, from the early 2000s.

My sister could not update her apps because she had no memory left on her phone. With the phone before that, she was afraid to update her operating system, again because she had no memory. She was worried that the additional hardware requirements of the operating system were essentially going to break her phone, and she would be without a phone, which she needed for work, as most of us do.

In order to account for user behavior like this, developers need to build things such that as much as possible lives on servers, so that they can deliver changes without relying on a user to take action. When a hotfix is required for a zero-day security issue, this architecture is particularly helpful. Because these developers are incentivized to deliver code from servers they control, and building on the previous point around intermittent connectivity, an origin server that is physically further away from the user could add enough time to fulfilling a request that the request gets dropped because the user lost connection.

It's more than just performance in this context; it's actually a question of service reliability. The problems facing backend and mobile developers can be summed up as: they want to fulfill requests as close as possible to the user, minimizing the latency from the potentially multiple requests being made to backend services, and they need to maintain backwards compatibility so they can evolve services without requiring a user to take action.

How can the edge help mobile and backend developers achieve these objectives? One possible solution is running an API layer in an Edge Function. This pattern isn't novel; it's been around for a while. Moving it to the edge from a data center in an availability zone can mean faster responses without major rearchitecting on the maintaining developer's part. This is a very trivial example that I put together with a fun little API framework a coworker of mine built.

If you were to change line 6 and line 10 to return something other than just a raw JSON body, it could be making a request to other services, or databases, or whatever you need to fulfill requests to the homepage or the login service. You can extend this further where rather than returning a raw API response, if it's a mobile application that's making this request, you can return view models that are easily mapped into native UI on the mobile client in order to reduce the need to update the mobile application itself even further.
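The coworker's framework isn't reproduced here, but a minimal sketch using only web-standard Request and Response shows the same shape, with the homepage and login handlers returning raw JSON bodies that could instead fan out to databases or other backend services:

    // A tiny API layer deployed as an edge function.
    export default async function handler(request: Request): Promise<Response> {
      const { pathname } = new URL(request.url);

      switch (pathname) {
        case '/api/home':
          // Could instead aggregate several backend calls into a view model here.
          return Response.json({ message: 'Welcome home' });
        case '/api/login':
          return Response.json({ message: 'Please log in' });
        default:
          return new Response('Not found', { status: 404 });
      }
    }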

There are some things to consider with this approach, though, and this is a common theme throughout this talk: you want to favor caching generalized requests over more personalized ones, especially because edge locations have smaller cache sizes compared to what's available on an origin server. If you have the option to choose between REST and GraphQL, REST is the better approach here. With how GraphQL queries work, where folks can go arbitrarily deep into nested data objects, each query is a form of personalization, which means the odds are higher that you will need to make a request to an origin server.

Then, depending on how many services you need to call when fulfilling the overall request, it may be more performant to place the Edge Function closer to the backend services. This is a common enough problem that Cloudflare, about a month ago, released a new feature called Smart Placement, which is currently in beta. It's intended to address the situation where functionality on the edge is by default deployed closest to the client making the request, but the majority of requests are being made to a backend service, so a lot of latency is incurred from those round trips.

You can see that at a glance with this diagram that I pulled from their documentation. Cloudflare Workers are Edge Functions, and a worker is by default deployed closest to the client. Let's say this client is in Sydney, Australia, and it's making a bunch of requests to a database in the EU; it would be better in this case to have the edge location closer to that database and that backend service.

How Smart Placement works is that it initially behaves like normal: it places the worker closest to the user. But then it analyzes the requests the worker makes to backend services whenever there's more than one round trip per request. Based on what it's seeing, it makes a best effort to place the worker in an optimal edge location relative to that backend. What you end up getting is something more like this, where the edge location is now in Germany rather than in Australia.

Some limitations with this tool: you have to opt in on a per-worker basis, and it doesn't work with globally distributed services like CDNs or distributed databases and APIs. Still, depending on your context, this can give you a real boost in performance, and it'll help reduce the impact of the latency incurred from multiple requests to a backend service.
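For reference, opting a Worker in is a small configuration change; at the time of the talk, the wrangler.toml syntax per Cloudflare's beta documentation looked like this:

    # wrangler.toml -- opt this Worker into Smart Placement (beta).
    [placement]
    mode = "smart"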

Data On the Edge

For those who have been developing in the web ecosystem for a while, you're probably familiar with the debates that have been made over the years about how to boost website performance through how a website is architected, and when and where data is fetched. Single page applications, islands architectures, and server-side rendering are part of those discussions.

Regardless of which camp you tend to gravitate towards here, it's fair to say that having the ability to load data on a server physically closer to the user can be of help here in boosting your performance. Some of the historical challenges of hosting data on the edge can be generally summed up by the limited number of connections that a database has and data inconsistency. How do you ensure that you don't run out of connections when you could suddenly be dealing with a spike in traffic that results in hundreds of thousands of serverless or Edge Functions trying to access your database at the same time, and it just falls over?

How do you ensure that when data is updated in one region or edge location, the cached values in other areas are invalidated and updated in a timely manner? We're going to take a look at the limited number of connections problem first. One of the ways to mitigate the potential for running out of database connections is through the use of a connection pool. For those who aren't familiar, a connection pool is a collection of open connections that are passed from operation to operation as needed. The benefit is that it reduces the cost of having to open and close a brand new connection every time you perform an operation on the database.

An example of a tool that leverages this approach in this context of serverless and edge environments is Prisma Data Proxy. They create an external connection pool, and requests for the database need to go through the proxy that's managing that pool before the request reaches the database. Another tool that uses the connection pooling approach, although they use it alongside an internal network, is PlanetScale.
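As a rough sketch of the developer experience with the Data Proxy (the model name is hypothetical, and the prisma:// URL is the piece that routes queries through the proxy's connection pool):

    // Prisma's edge client; DATABASE_URL points at the Data Proxy, e.g.
    //   DATABASE_URL="prisma://<region>.prisma-data.com/?api_key=..."
    import { PrismaClient } from '@prisma/client/edge';

    const prisma = new PrismaClient();

    export default async function handler(): Promise<Response> {
      // The proxy holds the real database connections; this function only
      // ever talks to the pool, so connections aren't exhausted under load.
      const products = await prisma.product.findMany({ take: 5 });
      return Response.json(products);
    }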

We'll get to PlanetScale's internal network shortly. With respect to the connection pooling part, they use Vitess, an open source database clustering system for MySQL, under the hood, and they leverage its connection pooling at a pretty low level within Vitess (specifically the VTTablet level, for those who are interested). By doing this, they can scale the connection pooling with the database cluster. If you're interested in the technical nitty-gritty details, I highly recommend reading a blog post they have where they talk about how they load tested this. They were able to open a million connections against their database without the database breaking a sweat; they could easily have handled far more traffic.

Switching gears to the challenge of data consistency: Cloudflare's Durable Objects are one approach to ensuring that consistent data lives as close as possible to the user. The approach involves having small logical units of data rather than a large monolithic database, much like serverless functions are to monolithic applications. When one of these objects is created, Cloudflare automatically determines the data center the object will live in, which will be the one closest to the user.

That's great, because as a developer you don't need to worry about which region to host the user's data in for optimal performance. And if you needed to, for reasons similar to the ones that motivated Smart Placement, these objects can be migrated between locations at a later time quite easily. These objects are globally unique, and they can only see and modify their own data, which ensures strong data consistency.

The side effect, though, is that this can mean a little more work on the developer's part: if you need data from multiple Durable Objects, you'll need to write those requests into your web app. You'd be accessing them through Cloudflare's internal network via Cloudflare Workers, which are on the edge, so that'll be a far faster request to fulfill than a standard one made to a third party over the public internet.
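A minimal sketch of that model, using Cloudflare's documented Workers API (types from @cloudflare/workers-types; the counter object is illustrative):

    // A Durable Object: a small, globally unique logical unit of data.
    export class Counter {
      constructor(private state: DurableObjectState) {}

      async fetch(request: Request): Promise<Response> {
        // Only this instance can read or modify its own storage,
        // which is what provides the strong consistency guarantee.
        let value = (await this.state.storage.get<number>('value')) ?? 0;
        value += 1;
        await this.state.storage.put('value', value);
        return new Response(String(value));
      }
    }

    // A Worker at the edge reaching the object over Cloudflare's internal network.
    export default {
      async fetch(request: Request, env: { COUNTER: DurableObjectNamespace }) {
        const id = env.COUNTER.idFromName('global-counter');
        return env.COUNTER.get(id).fetch(request);
      },
    };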

Coming back to PlanetScale and their internal network, they also use one to speed up data requests from edge and serverless environments. They describe their network as similar to a CDN, where a client connects to the closest geographic edge in PlanetScale's network, and then that request is backhauled over long-held connection pools in their internal network to reach the actual destination where the data lives.

We're going to take a look at fetching and caching data in an Edge Function using PlanetScale in this example. I tested this at Netlify, because that's where I work. You might notice some interesting lines where that first import statement doesn't quite look like Node; it's actually using Deno. You could make this the Node equivalent by updating that import statement and replacing Deno.env with process.env.

At the moment, this function queries my database for the top five most purchased items that my pet store still has in stock, every single time the function is invoked. Given that something like this is likely going to appear on a high-traffic web page, maybe the homepage, caching would be very valuable here. This is the function with caching included. I have to call out that this caching is a little bit experimental on Netlify.

It's not widely available yet. You can see that just by using web standards, the cache-control header on line 16, we can serve the result of the Edge Function directly from the cache. That means faster responses, and other requests can use the result as well. The other benefit, which is not insignificant, is that serving from the cache bypasses the function invocation altogether. If this is a high-traffic page, that'll save you a ton of money on reduced Edge Function invocations.
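Here is a sketch of the cached version, using PlanetScale's serverless driver in a Deno-flavored edge function. The table schema is hypothetical, and the plain cache-control header stands in for the experimental Netlify edge caching described above:

    // In Deno this import would typically go through an npm: or https: specifier.
    import { connect } from '@planetscale/database';

    const conn = connect({
      host: Deno.env.get('DATABASE_HOST'),
      username: Deno.env.get('DATABASE_USERNAME'),
      password: Deno.env.get('DATABASE_PASSWORD'),
    });

    export default async function handler(): Promise<Response> {
      // Top five most purchased items still in stock (hypothetical schema).
      const results = await conn.execute(
        'SELECT name, price FROM products WHERE stock > 0 ORDER BY purchases DESC LIMIT 5'
      );
      return new Response(JSON.stringify(results.rows), {
        headers: {
          'content-type': 'application/json',
          // Cache hits are served straight from the edge for an hour,
          // skipping the function invocation entirely.
          'cache-control': 'public, max-age=3600',
        },
      });
    }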

The Edgiest of The Edge

I want to mention a product that is part of the edge, but not in the way that you might think. With the exception of mobile applications, the main assumption we've made in everything we've talked about so far is that there's always reliable internet access. What if that isn't the case in a more extreme sense? What if what we're dealing with isn't occasional intermittent network access, but potentially non-existent internet? That brings us to AWS Snowball Edge.

This one blew my mind when I first heard about it, because this is an edge offering from AWS that makes cloud computing available in places with potentially non-existent internet, or serves as a way of migrating data to the cloud if you're limited on bandwidth. This is what the device looks like; to me, it's reminiscent of a computer from the '90s. You order it online, it gets mailed to you, and you can run Lambda functions on it once you've hooked it up to your servers.

When you're trying to transfer a significant amount of data, you just load it onto the device, mail it back to AWS, and it gets uploaded to the cloud for you. Some of the locations listed as places to use this are ships, windmills, and remote factories: places that, perhaps even for security reasons in the case of ships or remote factories, may be completely cut off from the internet. Given that there are still a lot of places in the world with unreliable or non-existent internet, we may need to consider delivering our products similarly to how Snowball Edge delivers edge and compute capabilities to these areas, and be so on the edge that we are literally on our users' doorsteps.

What Are the Limits of The Edge?

We've talked a lot about the positive aspects of the edge. What are the limits of it? Obviously, nothing's ever perfect. The first thing to call out is the lower CPU time available. This depends on the vendor, and on how much money you're willing to throw at the problem. Note that CPU time is a little bit different from wall time; lower CPU time just means that you can do fewer operations within that function compared to an origin serverless function.

Time spent on network requests doesn't count toward this. With that being said, the advantages of using an Edge Function might be lost when a network request is made, if that request is going to a very distant third-party origin server. There's also limited integration with other cloud services. I'm just going to talk about AWS, because that's what I'm more familiar with. In their marketing, they say that their serverless Lambda functions have really tight integrations with hundreds of their services, but their edge offering only has tight integrations with maybe a few dozen.

If having a tight integration between one cloud provider service and the serverless functions that you're using is a requirement for you, you may not be able to use the edge in your particular use case. Edge locations also have smaller caches than the origin. You can get around that a little bit by using regional edge caches to give yourself some more wiggle room there.

This might mean that you have to be a little more aggressive about which responses you cache at the edge versus at an origin server. It's also quite expensive to run these, relative to running functions or serving data on an origin server. If cost is a concern for you, before you say, "I'm going to put my whole application on the edge, it's going to be so blazing fast," choose what you're moving to the edge wisely; otherwise, our chief financial officers will be very upset with engineering.

Boosting Performance with The Edge

We've covered a lot on what edge capabilities are out there that can help boost performance for our applications. We've looked at a simple use case of serving localized content that went from being served by the origin server to being served on the CDN and at the edge using Edge Functions. We've also looked at running an API layer at the edge in order to minimize latency for mobile users so that the increased performance also translates to increased service reliability.

This API layer coupled with something like Cloudflare Smart Placement can help us improve performance for backend services by striking a balance between being closer to the client making the request, while not so close that too much latency is incurred from the multiple requests made to backend services.

We've looked at various tooling that allows us to access and cache data closer to our users distributed all over the globe, whether through something like Cloudflare Durable Objects, which free us from worrying about which region to host data in, or AWS Snowball Edge, where storage and compute capabilities can literally be shipped to our users' doorsteps and operate in areas with completely non-existent internet. While we've also looked at some of the limitations of the edge, we can see use cases where we should feel comfortable handling requests at the edge as the preferred default, rather than handling them at a potentially very distant origin server.

What You Can Do Today

If you're hearing about the edge and thinking, ok, I'm interested in seeing if I can incorporate this into my tech stack, there are some things you can do today. One is to take a look at some high-traffic functions and see if they can become Edge Functions. Really good candidates for this are validating user sessions, such as in the example we saw earlier, or setting cookies or custom request headers as part of fulfilling an overall request. See if they can live on the edge as a way of easily boosting your performance with minimal changes to your architecture.

You can also take a look at the groups of users that reside in locations furthest away from your origin server, and experiment with handling some of the more popular requests at the edge in order to improve performance for them. All this is to say, I think we're starting to enter an edge-first future. This is really exciting, because the more requests and data that you can serve closer to your users, the better the experience of your services will be, regardless of where a user is located in the world.

 


 

Recorded at:

Nov 21, 2023
