
Scaling the Stack Overflow Monolithic App by Obsessing Over Performance

by Daniel Bryant on Jun 21, 2015

At QCon New York 2015, David Fullerton presented a deep dive into the monolithic C# / MS SQL architecture that powers the Stack Exchange web applications, including the Stack Overflow website, which serves over 4 billion requests per month. Fullerton argued that by focusing on performance, scalability came ‘almost for free’, and that by minimising the number of external application services, the team has avoided paying the ‘SOA tax’.

Fullerton, VP of engineering at Stack Exchange, opened the presentation by stating that although the architecture that powers the Stack Exchange family of websites is boring, the methodology used to keep it boring is very interesting. Stack Exchange own and operate several community-driven ‘question and answer’-style websites, including the popular Stack Overflow developer Q&A portal.

The Stack Exchange team operate in a fully remote manner, and even team members who are co-located are encouraged to act as if they were not, for example by exclusively using instant messaging and distributed bug-tracking applications. Fullerton discussed how the mentality of ‘hiring smart people and getting out of their way’ has led to the creation of a high-performing team of full stack developers and sysadmins who share responsibility for building the websites and keeping them running.

The Stack Exchange family of websites are designed using a ‘monolith plus’ architecture, in which almost everything happens in the monolithic C# web tier application and its associated MS SQL database. There are a few exceptions to this rule: a ‘tag engine’ service has been extracted from the monolith, several Redis servers provide caching, and ElasticSearch servers provide fulltext search.

The website application stack is deployed into two datacenters in order to increase fault tolerance: New York hosts the primary datacenter and Oregon the secondary. Application deploys occur all day, every day, via rolling deploys across the web tier servers. Testing of new functionality is primarily conducted using a cohort of real users, the selection of which is controlled by a series of feature flags.

We turn [new functionality] on for a subset of sites to see how it performs. This works for us. We have a read-heavy load centered on one page [the question/answer page], not as much customised content as some sites, and a forgiving community of users.
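A feature flag that turns functionality on for a subset of sites can be sketched roughly as follows. This is a minimal illustration in Python rather than the team's C# code; the flag table, the site names, and the per-user percentage rollout are all assumptions made for the example and do not come from the talk.

```python
import hashlib

# Hypothetical flag table: the sites a feature is enabled on, and the
# share of users on those sites who see it. Names are illustrative only.
FLAGS = {
    "new-editor": {"sites": {"meta.example", "cooking.example"}, "percent": 10},
}


def bucket(flag: str, user_id: int) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, site: str, user_id: int) -> bool:
    """Return True if the flag is on for this site and this user's bucket."""
    rule = FLAGS.get(flag)
    if rule is None or site not in rule["sites"]:
        return False
    return bucket(flag, user_id) < rule["percent"]
```

Hashing the flag name together with the user id keeps each user's assignment stable across requests, so the same cohort sees the feature while it is being measured under real load.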

The development mantra at Stack Exchange is ‘start with what we know, measure it live and fix the slow’. The original developers knew C# and MS SQL, and therefore this is the development stack that is still in use today. The initial web application utilised several off-the-shelf tools: ASP.NET MVC, LINQ to SQL, MS SQL fulltext search, and built-in caching. Fullerton stated that at Stack Exchange, performance is a feature - primarily because of user experience, but also because search engines apply positive weighting to performant sites. Stack Exchange testing is typically performed under real load, measurements are always taken (guessing and assumptions are not allowed), and the team treat slow performance as a bug that must be fixed as soon as possible.

[The architecture] scales pretty well for us. We handle 4 billion requests per month, 3000 req/s peak, and 800M SQL queries per day, 8500/s peak.
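To put the quoted request figures in perspective, a quick back-of-the-envelope calculation helps. The sketch below assumes a 30-day month and evenly spread traffic, neither of which is stated in the talk, so the results are rough orders of magnitude only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: the talk does not define "month"

# Figures as quoted in the talk.
requests_per_month = 4_000_000_000
peak_requests_per_second = 3000
sql_queries_per_day = 800_000_000

# Average request rate if traffic were spread evenly over the month.
avg_requests_per_second = requests_per_month / (DAYS_PER_MONTH * SECONDS_PER_DAY)

# How far the quoted peak sits above that average.
peak_to_average = peak_requests_per_second / avg_requests_per_second

# SQL queries issued per request, on average.
queries_per_request = sql_queries_per_day * DAYS_PER_MONTH / requests_per_month

print(f"average requests/s: {avg_requests_per_second:.0f}")
print(f"peak / average:     {peak_to_average:.2f}")
print(f"SQL queries/request: {queries_per_request:.1f}")
```

Under these assumptions the average works out to roughly 1,500 requests per second, about half the quoted peak, with around six SQL queries per request.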

Fullerton explained that, over time, major parts of the initial stack were replaced: caching was added using Redis, ElasticSearch was added to improve fulltext searching, SQL access was improved by replacing the object mapper with a custom intermediate language generation framework, and a tag engine service was created by extracting code from the monolith because of scalability and new functionality requirements.
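The talk summary does not describe how the Redis cache is used, but a common way to put a cache in front of a database is the cache-aside pattern, sketched below. This is an illustration only: an in-memory dictionary stands in for Redis, and the class name, key format, and time-to-live are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Cache-aside helper; a dict stands in for an external store such as Redis."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get_or_load(self, key: str, loader: Callable[[], object]) -> object:
        """Return the cached value, or call loader() on a miss and store the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: the slow loader is skipped
        value = loader()  # miss or expired: fall through to the slow path
        self._entries[key] = (now + self._ttl, value)
        return value
```

A caller would wrap an expensive query, for example `cache.get_or_load("question:42", lambda: run_query(42))`, where `run_query` is a placeholder for whatever performs the database read. For a read-heavy workload, most requests then never reach the database.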

Tooling has been instrumental in identifying and monitoring performance issues, such as MiniProfiler for request profiling and bottleneck detection, Opserver for monitoring, and Dapper for request tracing and breakdown. Fullerton stated that the focus on performance has many benefits.
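The idea behind per-request profiling, timing named steps within a request and surfacing the slowest one, can be sketched in a few lines. This is not MiniProfiler's API; the class and method names below are hypothetical, and the sleeps stand in for real work such as a SQL query or view rendering.

```python
import time
from contextlib import contextmanager


class RequestProfile:
    """Collects wall-clock timings for named steps within a single request."""

    def __init__(self):
        self.steps = []  # list of (step name, elapsed seconds)

    @contextmanager
    def step(self, name):
        """Time the enclosed block and record it, even if the block raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.steps.append((name, time.perf_counter() - start))

    def slowest(self):
        """Return the (name, seconds) pair that took longest, or None if empty."""
        return max(self.steps, key=lambda s: s[1]) if self.steps else None


profile = RequestProfile()
with profile.step("render"):
    time.sleep(0.001)  # stand-in for rendering a view
with profile.step("sql"):
    time.sleep(0.05)   # stand-in for a slow query

name, seconds = profile.slowest()
print(f"slowest step: {name} ({seconds * 1000:.1f} ms)")
```

Recording timings on every live request, rather than only in test runs, is what makes it possible to ‘measure it live and fix the slow’ instead of guessing where the time goes.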

You can optimise for performance and get scale thrown in (almost for free). Your monolith can scale further than you think.

Fullerton concluded by stating that the ‘monolith plus’ architecture has been very successful for Stack Exchange. Although the architecture may appear boring, it is the process that the team use to keep it this way that is interesting. The microservices architectural style may be a popular trend at the moment, but Fullerton cautioned against overusing this pattern because of the inherent ‘SOA tax’ that must be paid when building distributed systems.

SOA is not the only way. Know your own problem space. Fix actual problems, and extract services that solve real problems, not imagined ones.

Additional information about David Fullerton’s QCon New York talk “Scaling Stack Overflow: Keeping it Vertical by Obsessing Over Performance” can be found on the conference website.

I agree! by Mircea Z

This article shows an important point and I agree with it. There is always a balance. Not everything has to have 100 microservices. It depends.

non-coding architects by Amel Music

From my experience, only non-coding architects tend to create systems with too many microservices or layers. Either because they haven't actually tried to write even the simplest things on their proposed architecture or they are justifying their role by making things complex.

Re: non-coding architects by Daniel Bryant

An interesting comment Amel, but in reality I've seen quite a few 'hands-on' architects that do the same thing. I totally agree that coding is an essential part of being an architect, but so is learning about core architecture skills, how to balance trade-offs, and also how to do 'just enough' up front design.

A recurring characteristic by Pierre-Luc Maheu

There seems to be a key for performance, and it is a simple one: caring about it. Make a list of teams/companies caring for performance, and chances are the systems they work on perform well. Pretty much the same thing Nori Heikkinen said at some point in her talk.

InfoQ.com and all content copyright © 2006-2016 C4Media Inc.