How NGINX Achieves Performance and Scalability

Owen Garrett, head of products at Nginx, Inc., has described on Nginx’s blog the design decisions that allow NGINX to provide best-in-class performance and scalability.

The overall architecture of NGINX is characterized by a collection of processes that work together:

Master process: responsible for performing privileged operations such as reading configuration files, binding sockets, and creating and signalling child processes.
Worker processes: responsible for accepting and serving connections, reading from and writing to disk, and communicating with upstream servers. These are the only processes that are busy when NGINX is active.
Cache loader: responsible for loading the disk cache into memory. This process runs at startup and then exits.
Cache manager: responsible for pruning entries from the disk caches to keep them within their configured size limits. It runs periodically.

The key to NGINX performance and scalability lies with two fundamental design choices:

The number of worker processes is constrained to minimize context switching. The recommended configuration is one worker process per CPU core, which makes efficient use of hardware resources.
Worker processes are single-threaded and handle multiple connections in a non-blocking fashion.
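The division of labour between the master and its workers can be illustrated with a minimal sketch. It is not NGINX's implementation: the names `start_master`, `worker_loop` and `stop_workers` are hypothetical, it assumes a POSIX system with `os.fork`, and each worker here serves one connection at a time with blocking calls purely to keep the example short.

```python
import os
import signal
import socket


def worker_loop(listener):
    """Worker: accept and serve connections on the inherited listen socket."""
    while True:
        try:
            conn, _ = listener.accept()
            with conn:
                conn.sendall(f"served by worker {os.getpid()}\n".encode())
        except OSError:
            continue  # a failed client must not take the worker down


def start_master(num_workers=None, host="127.0.0.1", port=0):
    """Master: bind the socket once, then fork one worker per CPU core."""
    num_workers = num_workers or os.cpu_count() or 1
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child process: becomes a worker
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
            try:
                worker_loop(listener)
            finally:
                os._exit(0)
        pids.append(pid)  # parent process: remember the worker
    return listener, pids


def stop_workers(pids):
    """Master: signal every worker to terminate, then reap it."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
    for pid in pids:
        os.waitpid(pid, 0)
```

Calling `start_master()` with no arguments forks one worker per core reported by `os.cpu_count()`. The parent never calls `accept` itself; it only creates and signals its children, which mirrors the master's role described above.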

Each worker process in NGINX handles multiple connections through a state machine that is implemented in a non-blocking fashion:

A worker process has a number of sockets to handle, which are either listen sockets or connection sockets.
When a new connection request arrives on a listen socket, a new connection socket is opened to handle communication with the initiating client.
When an event arrives on a connection socket, the worker responds to it promptly and moves on to handle any other events that have arrived on any of its sockets.
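The three steps above can be sketched as a small single-threaded event loop. This is an illustration under assumptions, not NGINX code: the `Worker` class and its method names are hypothetical, the "protocol" simply upper-cases whatever the client sends, and Python's standard `selectors` module stands in for the platform's event notification mechanism.

```python
import selectors
import socket


class Worker:
    """Single-threaded worker multiplexing listen and connection sockets."""

    def __init__(self, host="127.0.0.1", port=0):
        self.selector = selectors.DefaultSelector()
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind((host, port))
        self.listener.listen()
        self.listener.setblocking(False)
        # The callback to run for each socket is stored as the event's data.
        self.selector.register(self.listener, selectors.EVENT_READ, self._accept)
        self.address = self.listener.getsockname()
        self.open_connections = 0

    def _accept(self, listener):
        """Event on a listen socket: open a new connection socket."""
        conn, _ = listener.accept()
        conn.setblocking(False)
        self.selector.register(conn, selectors.EVENT_READ, self._read)
        self.open_connections += 1

    def _read(self, conn):
        """Event on a connection socket: respond, then return to the loop."""
        try:
            data = conn.recv(4096)
        except BlockingIOError:
            return  # nothing to read after all; wait for the next event
        except ConnectionError:
            data = b""
        if data:
            # Sketch only: assumes the short reply fits in the send buffer.
            conn.send(data.upper())
        else:
            self.selector.unregister(conn)
            conn.close()
            self.open_connections -= 1

    def run_once(self, timeout=0.1):
        """Wait for events on any socket and dispatch each one promptly."""
        events = self.selector.select(timeout)
        for key, _ in events:
            key.data(key.fileobj)
        return len(events)

    def close(self):
        for key in list(self.selector.get_map().values()):
            self.selector.unregister(key.fileobj)
            key.fileobj.close()
        self.selector.close()
```

A caller would create a `Worker` and invoke `run_once()` in a loop. The only place the process waits is `select`, and it waits there only when no socket has pending work.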

NGINX's design choices, Garrett says, make it depart radically from other web servers, which usually opt for a model in which each connection is assigned a separate thread. That model makes it very easy to handle multiple connections, since each connection can be thought of as a linear sequence of steps, but it has a cost in terms of context switching. Indeed, worker threads spend most of their time in a blocked state, waiting for the client or for an upstream server. The context-switching cost becomes non-trivial when the number of simultaneous connections or threads that want to, for example, execute I/O operations grows beyond a threshold, or when memory gets exhausted.
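For contrast, the thread-per-connection model can be sketched as follows. The function names `handle` and `serve_thread_per_connection` are hypothetical and the upper-casing echo is only a stand-in for request processing; the point is that each handler reads like a linear sequence of steps, yet occupies a whole thread that is blocked in `recv` most of the time.

```python
import socket
import threading


def handle(conn):
    """One connection, one thread: a linear sequence of blocking steps."""
    with conn:
        while True:
            data = conn.recv(4096)  # the thread sleeps here, waiting for the client
            if not data:
                break
            conn.sendall(data.upper())


def serve_thread_per_connection(listener, stop, handlers):
    """Accept connections until `stop` is set, spawning a thread for each."""
    listener.settimeout(0.1)  # wake up periodically to check the stop flag
    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue
        thread = threading.Thread(target=handle, args=(conn,), daemon=True)
        handlers.append(thread)
        thread.start()
```

With this design the number of threads grows with the number of open connections, even idle ones, which is where the scheduling and memory overhead described above comes from.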

NGINX's design, on the other hand, means that workers never block on network traffic unless there is no work to be done. Additionally, every new connection consumes very few resources: just a file descriptor and a small amount of memory in the worker process.

Overall, this allows NGINX to handle up to hundreds of thousands of concurrent HTTP connections per worker process, given appropriate system tuning.
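Because each connection costs a file descriptor, one piece of that system tuning is the per-process limit on open files. As a rough sketch, assuming a Unix-like system where Python's `resource` module is available, a process can estimate how many connection sockets it may hold open. The helper name `connection_budget` and the figure of 64 descriptors reserved for logs, listen sockets and other files are illustrative assumptions, not values from the article.

```python
import resource


def connection_budget(reserved_fds=64):
    """Estimate how many connection sockets this process may hold open."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None  # no per-process limit is configured
    # Every connection needs one descriptor; keep some back for other files.
    return max(soft - reserved_fds, 0)
```

If the estimate is well below the intended number of concurrent connections, the descriptor limit has to be raised before a single process can approach the figures quoted above.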
