While rate limiting is a well-known problem for web servers, there are many other situations where similar capabilities are needed. For example, a client application could benefit from having its own built-in rate limiting so that it doesn’t trip the web server’s limits. Rate limits may also be needed for other shared resources, such as file and database servers, so that no single consumer monopolizes them.
While in theory developers can build their own rate-limiting apparatus using semaphores, this can be very error prone. So it was decided that .NET would offer a new package called System.Threading.RateLimiting. This package would define a common rate-limiting framework and provide several rate-limiting options out of the box.
The abstract base class all rate limiters should implement is simply called RateLimiter. It offers both a synchronous and asynchronous way of obtaining leases, plus a count of available leases. A lease, also known as a permit, is permission from the rate limiter to do work. It is represented by the RateLimitLease class, which should be released back to the pool once the current task is complete.
Four built-in rate limiters are currently being proposed. The simplest is the ConcurrencyLimiter, which, like a semaphore, only allows a set number of concurrent workers at one time.
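To make the lease idea concrete, here is a minimal sketch of a concurrency limiter. It is written in Python rather than C#, and it is not the System.Threading.RateLimiting API: the names `Lease`, `SimpleConcurrencyLimiter`, `try_acquire`, and `available` are hypothetical stand-ins chosen for illustration. It assumes a failed request still returns a lease object whose flag reports the failure.

```python
import threading


class Lease:
    """Hypothetical stand-in for a lease; `acquired` is the success flag."""

    def __init__(self, acquired, on_release=None):
        self.acquired = acquired
        self._on_release = on_release

    def release(self):
        # Run the release hook at most once, so a double release is harmless.
        callback, self._on_release = self._on_release, None
        if callback is not None:
            callback()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.release()


class SimpleConcurrencyLimiter:
    """Allows at most `limit` leases to be outstanding at one time."""

    def __init__(self, limit):
        self._lock = threading.Lock()
        self._free = limit

    def available(self):
        with self._lock:
            return self._free

    def try_acquire(self):
        with self._lock:
            if self._free == 0:
                return Lease(False)  # failed lease, nothing to give back
            self._free -= 1
        return Lease(True, self._give_back)

    def _give_back(self):
        with self._lock:
            self._free += 1


limiter = SimpleConcurrencyLimiter(limit=2)
first = limiter.try_acquire()
second = limiter.try_acquire()
third = limiter.try_acquire()  # no permits left
print(first.acquired, second.acquired, third.acquired)  # True True False
first.release()  # returns one permit to the pool
print(limiter.available())  # 1
```

Because the lease is also a context manager, `with limiter.try_acquire() as lease:` returns the permit automatically when the block exits, which avoids the forgotten-release bugs that make hand-rolled semaphore code error prone.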
The next two options are FixedWindowRateLimiter and SlidingWindowRateLimiter. As the names imply, these limit the number of requests for a given window of time. The former uses a simple timespan to reset the count. The latter offers a more sophisticated option where the time window is sliced into N segments, with a request counter for each segment.
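The difference between the two window strategies is easiest to see side by side. The following is a rough Python sketch of both ideas, not the .NET implementation; the class names, the `clock` parameter, and the fake clock used in the demo are all assumptions made for illustration.

```python
import time
from collections import deque


class FixedWindowLimiter:
    """Allows `limit` requests per window; the count resets when a new window starts."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def try_acquire(self):
        now = self.clock()
        if now - self.window_start >= self.window:
            # Jump to the start of the window that contains `now`.
            elapsed_windows = (now - self.window_start) // self.window
            self.window_start += elapsed_windows * self.window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLimiter:
    """Splits the window into segments, each with its own request counter."""

    def __init__(self, limit, window_seconds, segments, clock=time.monotonic):
        self.limit = limit
        self.segment_length = window_seconds / segments
        self.clock = clock
        self.counts = deque([0] * segments, maxlen=segments)
        self.current_segment = int(clock() // self.segment_length)

    def _advance(self):
        segment = int(self.clock() // self.segment_length)
        steps = min(segment - self.current_segment, self.counts.maxlen)
        for _ in range(steps):
            self.counts.append(0)  # maxlen drops the oldest segment
        self.current_segment = segment

    def try_acquire(self):
        self._advance()
        if sum(self.counts) < self.limit:
            self.counts[-1] += 1
            return True
        return False


# Fake clock so the demo is deterministic.
now = [0.0]
fake_clock = lambda: now[0]
fixed = FixedWindowLimiter(limit=2, window_seconds=10, clock=fake_clock)
sliding = SlidingWindowLimiter(limit=2, window_seconds=10, segments=5, clock=fake_clock)

now[0] = 9.0  # use up both limiters just before the fixed window ends
for _ in range(2):
    fixed.try_acquire()
    sliding.try_acquire()

now[0] = 10.0  # one second later
print(fixed.try_acquire(), sliding.try_acquire())  # True False
```

The demo shows the trade-off: the fixed window forgets the burst at t=9 as soon as the window rolls over, while the segmented window still counts it until its segment ages out.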
The final option is the TokenBucketRateLimiter, which is inspired by the token bucket algorithm. Normally a worker would acquire a single lease, but there are scenarios where multiple leases are needed at once. For example, each lease may represent permission to use a CPU core or block of memory. In this case, the TokenBucketRateLimiter may be more useful. This can organize requests based on the number of leases each worker wants and the available leases in each time-limited bucket.
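As an illustration of acquiring several permits at once, here is a small Python sketch of the general token bucket idea. It is not the TokenBucketRateLimiter itself; the class name, its parameters (`capacity`, `tokens_per_period`, `period_seconds`), and the injectable `clock` are assumptions for the example.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens and adds `tokens_per_period` every period."""

    def __init__(self, capacity, tokens_per_period, period_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.tokens_per_period = tokens_per_period
        self.period = period_seconds
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.last_refill = clock()

    def _refill(self):
        # Credit only whole periods that have elapsed since the last refill.
        periods = int((self.clock() - self.last_refill) // self.period)
        if periods > 0:
            self.tokens = min(self.capacity, self.tokens + periods * self.tokens_per_period)
            self.last_refill += periods * self.period

    def try_acquire(self, count=1):
        if count > self.capacity:
            raise ValueError("count can never be satisfied by this bucket")
        self._refill()
        if count <= self.tokens:
            self.tokens -= count
            return True
        return False


now = [0.0]
bucket = TokenBucket(capacity=4, tokens_per_period=2, period_seconds=1, clock=lambda: now[0])
print(bucket.try_acquire(3))  # True, one token left
print(bucket.try_acquire(2))  # False, only one token available
now[0] = 1.0                  # one period later, two tokens are added
print(bucket.try_acquire(2))  # True
```

A worker that needs, say, three CPU cores would ask for three tokens in a single call, so it either gets all of them or none, rather than holding a partial allocation.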
If the rate limiter cannot honor a lease request, its behavior varies. For the synchronous case, a RateLimitLease is returned with a flag indicating failure.
For the asynchronous case, requests may be queued while they wait for their turn. If the queue grows beyond a configurable limit, workers may be rejected from the queue. Programmers can configure whether the oldest or newest request will be rejected in this scenario.
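The queueing behavior can be sketched with Python's asyncio. This is only an illustration of the idea under stated assumptions, not the .NET API: the class `QueueingLimiter`, its `reject` option, and the choice to signal rejection by returning `False` are all hypothetical.

```python
import asyncio
from collections import deque


class QueueingLimiter:
    """Grants permits in arrival order; overflow rejects the newest or oldest request."""

    def __init__(self, permits, queue_limit, reject="newest"):
        if reject not in ("newest", "oldest"):
            raise ValueError("reject must be 'newest' or 'oldest'")
        self._permits = permits
        self._queue_limit = queue_limit
        self._reject = reject
        self._waiters = deque()

    async def acquire(self):
        """Return True once a permit is granted, or False if rejected from the queue."""
        if self._permits > 0 and not self._waiters:
            self._permits -= 1
            return True
        if len(self._waiters) >= self._queue_limit:
            if self._reject == "newest" or not self._waiters:
                return False  # the incoming request is turned away
            oldest = self._waiters.popleft()  # evict the longest-waiting request
            if not oldest.done():
                oldest.set_result(False)
        waiter = asyncio.get_running_loop().create_future()
        self._waiters.append(waiter)
        return await waiter

    def release(self):
        # Hand the permit to the next live waiter, skipping cancelled ones.
        while self._waiters:
            waiter = self._waiters.popleft()
            if not waiter.done():
                waiter.set_result(True)
                return
        self._permits += 1


async def demo():
    limiter = QueueingLimiter(permits=1, queue_limit=1, reject="oldest")
    await limiter.acquire()  # takes the only permit
    first = asyncio.ensure_future(limiter.acquire())
    await asyncio.sleep(0)   # let `first` join the queue
    second = asyncio.ensure_future(limiter.acquire())
    await asyncio.sleep(0)   # `second` arrives and evicts `first`
    limiter.release()        # the permit goes to `second`
    return await first, await second


print(asyncio.run(demo()))  # (False, True)
```

With `reject="newest"` the same sequence would instead leave the first waiter in the queue and turn the second one away immediately.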
One use case this library explicitly doesn’t support is aggregated limiters. From the Rate Limits proposal:
A separate abstraction different from the one proposed here is required to support scenarios where aggregated limiter are needed.
For high cardinality limiters such as rate limit by IP, we don't want to have a rate limit per bucket (i.e. one rate limiter per IP Address). As such, we need an API where you can pass in a id. This is in contrast to simpler limiters where a key is not necessary, such as a n requests/second limit, where requiring a default key to be passed in becomes awkward. Hence the simpler API proposed here is preferred.
However, we have not yet found any use cases for aggregated limiters in dotnet/runtime and hence it's not part of this proposal. If we eventually deem that such aggregated limiters are needed in the BCL as well, it can be added independently from the APIs proposed here.
Packages developed specifically for ASP.NET Core, such as AspNetCoreRateLimit, fill this role.