Microservice Threading Models and their Tradeoffs

Architects designing Micro-Service Architectures typically focus on patterns, topology, and granularity, but one of the most fundamental decisions to make is the choice of threading model. With the proliferation of so many viable open source tools, programming languages, and technology stacks, software architects have more choices to make now than ever before.

It is very easy to get lost in the details of nuanced language and/or library differences and lose sight of what is important.

Choosing the right threading model for your micro-services and how it relates to database connectivity can mean the difference between a solution that’s good enough and a product that’s amazing.

Paying attention to the threading model is an effective way to focus the architect on considering the trade-offs between efficiency and code complexity. As a service is decomposed into parallel operations with shared resources, the application will become more efficient and its responses will exhibit less latency (within limits, see Amdahl’s Law).  Parallelizing operations and safely sharing resources introduces more complexity into the code.

However, the more complex the code, the harder it is for engineers to fully comprehend, and the more likely developers are to introduce new bugs with every change.

One of the most important responsibilities of the architect is to find a good balance between efficiency and code complexity.

Single Threaded, Single Process Threading Model

The most basic threading model is the single threaded, single process model. This is the simplest way to write code.

A single threaded, single process service cannot execute on more than one core at a time, while a modern, bare metal server typically has up to 24 cores. A service built around this model will therefore never utilize more than one server core. The throughput of these services will not increase with additional load, and their CPU utilization will not rise above a single-digit percentage. With so much underutilization, a compensating tactic is to run larger server pools in order to handle the load.

This approach works, but is wasteful and ultimately expensive. The most popular cloud computing vendors offer single virtual core instances fairly cheaply in order to facilitate this approach’s more granular scaling needs.
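To make this concrete, here is a minimal Python sketch (the function names and request format are illustrative, not from any specific framework): a single threaded, single process service handles requests strictly one at a time, so every request waits behind all of the work queued in front of it.

```python
import time

def handle_request(request):
    # Only one request is ever in flight; any time spent here
    # (simulated with sleep) delays every request that arrives later.
    time.sleep(request.get("cost", 0))
    return {"status": "ok", "echo": request["payload"]}

def serve(requests):
    # A strictly sequential loop: throughput is capped by one core
    # and does not rise no matter how much load arrives.
    return [handle_request(r) for r in requests]
```

No locks or shared state are needed, which is exactly why this model is the simplest to write.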

Single Threaded, New Multi-Process Threading Model

The next step up in both complexity and efficiency would be the single threaded, multi-process, threading model where a new process gets created for each request. Code for this type of micro-service is relatively simple, but it does contain more complexity than the previous model.

The overhead of process creation, and of constantly having to create and destroy database connections, can steal processor time and thereby increase latency across all collocated services. This threading model creates more database connections because database connections are per-process and cannot be shared across process boundaries; and because the process lives only as long as the request, each request has to reconnect to each database.
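As a rough sketch of this model (illustrative, not from the article), the snippet below uses Python's subprocess module to stand in for a CGI-style server: every request pays the full cost of starting a brand-new process, which must do all of its own setup and is destroyed when the response is done.

```python
import subprocess
import sys

def handle_in_new_process(payload):
    # The child is a brand-new interpreter: it must perform all of its
    # own initialization (in a real service, that includes opening its
    # own database connections), and everything dies when it exits.
    child_code = "import sys; print('handled:' + sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, payload],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The per-request startup cost visible here is the overhead described above; in interpreted stacks it also includes re-parsing code, which is what opcode caches avoid.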

Micro-services that run in this threading model should delay connecting to databases until the connections are needed; there is no reason to incur the cost of a database connection if the code path does not require it. While database connections cannot be cached across processes, some environments support a cross-process opcode cache where you can store your service’s configuration data, such as the host IP and credentials for connecting to a database; two popular examples of opcode caches are Zend OpCache and APC.
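A lazy connection can be sketched as follows, using Python's built-in sqlite3 as a stand-in for a real database driver (the class name is illustrative): the connection is opened only on first use, so code paths that never touch the database never pay the connection cost.

```python
import sqlite3

class LazyDatabase:
    """Opens the underlying connection on first use only."""

    def __init__(self, dsn=":memory:"):
        self._dsn = dsn
        self._conn = None          # nothing has been opened yet

    @property
    def connection(self):
        if self._conn is None:     # connect on demand, exactly once
            self._conn = sqlite3.connect(self._dsn)
        return self._conn

db = LazyDatabase()
# No connection exists until some code path actually asks for one.
row = db.connection.execute("SELECT 1").fetchone()
```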

Single Threaded, Reused Multi-Process Threading Model

The next increase in code complexity and efficiency is a threading model that is single threaded and multi-process, where new requests reuse existing worker processes. This differs from the previous threading model, which always created a new process for each request. In this model, once a worker process has been provisioned, no new process is created for subsequent requests.

The service’s code is still relatively simple, but extra orchestration code must be involved to manage the worker process life-cycle, and the code must correctly re-initialize itself with each request. For example, programmers might maintain static variables instead of passing around a lot of extra data as parameters. That makes for simpler code and is fine as long as those static variables are reset with each new request; if the code doesn’t reset them, then behavior will be based on previous requests instead of the current one. The last bit of additional code complexity is that logic for recovering from stale database connections must be included. A database connection can go stale when the database disconnects it, most likely due to inactivity.
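The reset requirement can be sketched like this (the names are illustrative): module-level state makes the handler code simpler, but it must be cleared at the start of every request, or data from the previous request leaks into the current one.

```python
# Convenient module-level ("static") state shared by helper functions,
# instead of threading extra parameters through every call.
request_context = {}

def reset_context():
    # Must run at the start of every request in a reused worker,
    # otherwise leftovers from the previous request survive.
    request_context.clear()

def handle_request(user_id):
    reset_context()
    request_context["user"] = user_id
    return request_context["user"]
```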

Because each process can service multiple requests, there is no need to reconnect to each database with each request; database connections get reused which reduces latency by avoiding connection costs. But each process still has to create and manage its own database connections. Because processes cannot share database connections, shared databases maintain more open connections. Excessive open connections can degrade database performance. That is because database connections are stateful so the database application has to allocate resources in its own process for each connection.

Multi-Threaded, Single Process Threading Models

There is a way to better protect the databases with a configurable number of connections: connection pooling in the multi-threaded, single long-lived process model. Although a database connection cannot be shared across multiple processes, it can be shared across multiple threads in the same process.

Here is an example: if you have 100 single threaded processes on each of 10 servers, then the database will see 100 × 10 = 1,000 connections. If instead you have one process with 100 threads on each of 10 servers, and each process holds 10 connections in its connection pool, then the database will see only 10 × 10 = 100 connections while the service can still achieve high throughput. Cross-thread connection pooling is very efficient for both the service and the database.
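The arithmetic in the example above can be checked directly (the server, process, and pool counts are the article's example figures, not a recommendation):

```python
servers = 10

# 100 single threaded processes per server, one connection each:
processes_per_server = 100
connections_without_pooling = servers * processes_per_server   # 1000

# One process per server, 100 threads sharing a 10-connection pool:
pool_size = 10
connections_with_pooling = servers * pool_size                 # 100

assert connections_without_pooling == 1000
assert connections_with_pooling == 100
```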

This connection pooling technique achieves high throughput while protecting the databases, but comes at the cost of extra code complexity. Because threads must share stateful database connections, developers must be able to identify and fix concurrency bugs such as deadlock, livelock, thread starvation, and race conditions. One way to address these bugs is to serialize access, but serializing access too much reduces parallelism. These types of bugs can be difficult for junior developers to identify and correct.
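A minimal cross-thread pool can be sketched with Python's thread-safe queue.Queue, again using sqlite3 as a stand-in driver (a server-based database driver would not need the check_same_thread flag):

```python
import queue
import sqlite3
import threading

class ConnectionPool:
    """A fixed-size pool shared by every thread in one process.
    queue.Queue is already thread-safe, so borrowing and returning
    connections needs no additional locking here."""

    def __init__(self, size):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(
                sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self):
        return self._pool.get()    # blocks while all connections are busy

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(size=2)
results = []

def handle_request():
    conn = pool.acquire()
    try:
        results.append(conn.execute("SELECT 1").fetchone()[0])
    finally:
        pool.release(conn)         # always give the connection back

threads = [threading.Thread(target=handle_request) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Five concurrent requests complete against only two database connections; the pool size, not the request count, bounds what the database sees.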

Multi-threaded, single long-lived process models come in two flavors: a dedicated thread per request, or a single thread shared across all requests. In the former, an extra thread is tied up with each request, which limits the number of requests that can be processed in parallel. Too many threads can also lead to inefficiencies due to excessive task switching in the operating system’s CPU scheduler.

In the latter threading model, there is no need to have an extra thread for each request but I/O bound tasks must run in a separate thread pool in order to prevent the entire service from hanging on the first slow operation that it encounters. If the results must be returned to the caller, then the request handler must wait for the results from the thread pool to finish.

With the no dedicated thread per request approach, expect high throughput and low latency for asynchronous operations but no real performance gains over the dedicated thread per request approach for synchronous operations.
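This shared-event-loop model can be sketched with Python's asyncio (blocking_query is a hypothetical stand-in for a driver without async support): blocking work is pushed onto a thread pool so the single event-loop thread stays free, and the handler awaits the result before replying.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Blocking I/O must never run on the event-loop thread, or the first
# slow operation would stall every request behind it.
io_pool = ThreadPoolExecutor(max_workers=4)

def blocking_query(request_id):
    time.sleep(0.01)               # stand-in for slow, blocking I/O
    return f"result-{request_id}"

async def handle_request(request_id):
    loop = asyncio.get_running_loop()
    # Offload to the thread pool and wait for the answer; meanwhile
    # the event loop keeps serving other requests.
    return await loop.run_in_executor(io_pool, blocking_query, request_id)

async def main():
    # Many requests in flight at once on a single event-loop thread.
    return await asyncio.gather(*(handle_request(i) for i in range(3)))

results = asyncio.run(main())
```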

Summary

Threading model: single threaded, single process
Efficiency concerns: The service will not be able to fully utilize the server cores. Expect throughput not to increase with additional load and CPU utilization not to rise above 10%.
Code complexity issues: The simplest and easiest approach to understand.

Threading model: single threaded, multi-process, new process for each request
Efficiency concerns: The overhead of process creation, and of constantly creating and destroying database connections, can raise latency.
Code complexity issues: Database connections should be lazily loaded. Consider using an opcode cache.

Threading model: single threaded, multi-process, requests reuse worker processes
Efficiency concerns: The databases see more open connections because connections cannot be shared across process boundaries. Excessive open connections can degrade database performance.
Code complexity issues: Extra code must be present to manage the worker process lifecycle. The code must be able to recover from stale connections. Static variables must be reset with each request.

Threading model: multi-threaded, single long-lived process, dedicated thread per request
Efficiency concerns: Cross-thread connection pooling is very efficient for both the service and the database, but an extra thread is tied up with each request, which limits the number of requests being processed in parallel.
Code complexity issues: Because threads must share stateful database connections, developers must be able to identify and fix concurrency bugs such as deadlock, livelock, thread starvation, and race conditions.

Threading model: multi-threaded, single long-lived process, no dedicated thread per request
Efficiency concerns: Cross-thread connection pooling is very efficient for both the service and the database. Expect high throughput for asynchronous operations.
Code complexity issues: I/O-bound tasks must run in a separate thread pool. If results must be returned to the caller, then the request handler must wait for the thread pool to finish.

Conclusion

Before thinking about libraries and languages, software architects should reflect on the choice of threading model most appropriate to their engineering culture and competency. Striking the right balance between code complexity and efficiency will help sort out the confusion and give direction in choosing between the various technology stacks available. Because each micro-service has less scope than a monolithic application, consider leaning a little more towards code complexity in order to achieve higher efficiency.

About the Author

Glenn Engstrand is the Technical Lead for the Architecture Team at Zoosk. His focus is server side application architectures that need to run at B2C web scale with manageable operational and deployment costs. Glenn was a breakout speaker at the 2012 Lucene Revolution conference in Boston. He specializes in breaking monolithic applications up into micro-services and in deep integration with Real-Time Communications infrastructure.

Community comments

  • Threading is not specific for microservices, but ...

    by Dong Liu,

    I saw microservices only in the title. The article should discuss more about what is specific for microservices as the title suggested.

  • Re: Threading is not specific for microservices, but ...

    by Glenn Engstrand,

    I count five points about threading models in micro-services here in this article. It is true that the same guidance provided by computer science applies equally to both micro-services and to other forms of application development.

    Did you read all the way to the end? There is a recommendation to lean more towards complexity in micro-services due to their limited scope.

  • Re: Threading is not specific for microservices, but ...

    by Dong Liu,

    Sorry if I sounded like a troll.

    I think the topic is very important and interesting to me. And it is also a hard topic to discuss, because threading itself is already complicated.

    I would like to see more discussion about how the design decisions of microservices impact what we already know about threading. For example, what if a microservice owns its data storage, and how would that change the result compared to a situation where a number of microservices share one storage service?

  • Re: Threading is not specific for microservices, but ...

    by Sean Wiley,

    Hi Dong,

    I believe you would be interested in taking a look at the single-thread single-owner data model of Baratine (doc.baratine.io/v0.11/architecture/service-arch... ) . A service has a single inbox, where requests are queued and answered by a single thread on an event loop. In this model, requests are nonblocking and the encapsulation of data being accessed only by a single thread prevents possible concurrency issues. In short, you no longer need huge synchronization blocks when accessing data due to an improved encapsulation model encompassing the thread + data.

  • How about in case of PaaS(Cloud) model?

    by Praneeth Yerrawar,

    Hi Glenn,
    Thank you for a nice article. I am wondering what your choice of threading model would be for a microservices architecture on a Platform as a Service hosting model (Heroku, Cloud Foundry, Azure Service Fabric, etc.)?

    12factor.net (the Concurrency factor) suggests choosing the single threaded, multi-process threading model over the others because scaling out is simple and reliable. Do you second their opinion, or do you have a different perspective?

    Thanks
    Praneeth

  • Re: How about in case of PaaS(Cloud) model?

    by Glenn Engstrand,

    Thanks for asking these questions, Praneeth. I believe that the concurrency factor is about scaling out with multiple processes. The Heroku folks are actually quite neutral when it comes to threading models. The point that they are trying to make is that a single process won't scale no matter how many threads you use. My article here on InfoQ focuses on threading models. It was not my intention to advocate for systems where everything runs in a single process. I completely agree with the concurrency factor. Design systems where you can scale out on multiple, horizontally partitionable, share nothing processes. In any single process, you still need to decide on which threading model is best for your situation.

    I find a lot of sentiment on the web that single threaded models are superior. In this article, I summarize that there is a complexity vs efficiency trade off. Single threaded apps are simpler but less efficient. You may be wondering if efficiency is all that relevant anymore since you can always scale out on more instances in the cloud. That is true but instances cost money. When you are staring at a six figure monthly AWS bill, you suddenly realize just how important efficiency still is.

    Is there any concrete evidence that can back up my claims? At the beginning of this year, I conducted some research into this very question. I ran load tests on AWS against two functionally identical micro-services. One micro-service was written in Java using the DropWizard framework and the other was written in javascript using the node.js framework. You might be interested in my findings.

    glennengstrand.info/software/performance/nodejs...
