MySpace Explains How They Use the Concurrency and Coordination Runtime

by Jonathan Allen on Oct 12, 2009

The CCR, or Concurrency and Coordination Runtime, first made its debut as part of Microsoft Robotics Studio. Since then a number of companies have adopted CCR for a wide variety of non-robotics projects, including the website MySpace.

CCR uses message-passing semantics to isolate different parts of an application. At the lower levels, its carefully managed threads and work-stealing queues carry significantly less overhead than traditional threads and locks.
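
For readers who have not seen the CCR programming model, a minimal sketch of the port-and-receiver style might look like the following. It is not MySpace's code; the names are invented, and the constructor overloads are written from memory of the CCR and DSS Toolkit, so they should be checked against its documentation.

    using System;
    using Microsoft.Ccr.Core;

    class PortExample
    {
        static void Main()
        {
            // The Dispatcher owns the OS threads; the DispatcherQueue feeds work to it.
            using (var dispatcher = new Dispatcher(0, "example-pool"))
            using (var queue = new DispatcherQueue("example-queue", dispatcher))
            {
                var port = new Port<string>();

                // Persistent receiver: the handler runs on a dispatcher thread each
                // time a message arrives, so the consumer's state is only ever touched
                // from that handler and no locks are needed here.
                Arbiter.Activate(queue,
                    Arbiter.Receive(true, port, msg => Console.WriteLine("got: " + msg)));

                // Producers simply post messages to the port; they never share state
                // with the consumer.
                port.Post("hello");
                port.Post("world");

                Console.ReadLine();
            }
        }
    }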

MySpace is using CCR for their communication layer, which is a custom wire format on top of raw sockets. Each server has two queues: the first is for one-way messages, the second for synchronous messages that expect an immediate response. They can tune each server by altering the number of threads assigned to the synchronous message queue versus the one-way queue.
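
The interview does not show the MySpace code, but a hedged sketch of the two-queue idea, with independently tunable thread counts, might look roughly like this. The message types, names, and the choice of one Dispatcher per queue are assumptions made for illustration.

    using Microsoft.Ccr.Core;

    // Hypothetical message types standing in for frames of the custom wire format.
    class OneWayMessage { public byte[] Payload; }
    class RequestMessage { public byte[] Payload; public Port<byte[]> ResponsePort; }

    class ServerQueues
    {
        public readonly Port<OneWayMessage> OneWayPort = new Port<OneWayMessage>();
        public readonly Port<RequestMessage> RequestPort = new Port<RequestMessage>();

        public ServerQueues(int oneWayThreads, int syncThreads)
        {
            // Separate dispatchers so the number of threads behind each queue
            // can be tuned independently, as described in the interview.
            var oneWayQueue = new DispatcherQueue("one-way",
                new Dispatcher(oneWayThreads, "one-way-pool"));
            var syncQueue = new DispatcherQueue("synchronous",
                new Dispatcher(syncThreads, "synchronous-pool"));

            Arbiter.Activate(oneWayQueue, Arbiter.Receive(true, OneWayPort, HandleOneWay));
            Arbiter.Activate(syncQueue, Arbiter.Receive(true, RequestPort, HandleRequest));
        }

        void HandleOneWay(OneWayMessage message)
        {
            // Fire-and-forget work; no caller is waiting on a reply.
        }

        void HandleRequest(RequestMessage message)
        {
            // "Synchronous" in the protocol sense: the caller is waiting for
            // something to appear on its response port.
            message.ResponsePort.Post(Process(message.Payload));
        }

        byte[] Process(byte[] payload) { return payload; }
    }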

Beyond the communication layer are their caching clusters. Each contains a storage component and a collection of processing components. When a one-way message is received, it goes into a third CCR queue. From there it is passed along to the storage component and to any processing components that are subscribed to that message type. Synchronous messages are handled in a similar fashion, though they skip the aforementioned queue.
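
One way to picture that fan-out is a routing handler hung off the third queue which forwards each one-way message to the storage component and to every processing component subscribed to its message type. The sketch below only illustrates that routing idea, with invented type names; it assumes subscriptions are registered at start-up, before messages begin to flow.

    using System.Collections.Generic;
    using Microsoft.Ccr.Core;

    class CacheMessage { public string MessageType; public byte[] Body; }

    class CacheClusterRouter
    {
        // The "third queue" from the article: every one-way message lands here first.
        public readonly Port<CacheMessage> IncomingOneWay = new Port<CacheMessage>();

        readonly Port<CacheMessage> storagePort;
        readonly Dictionary<string, List<Port<CacheMessage>>> subscribers =
            new Dictionary<string, List<Port<CacheMessage>>>();

        public CacheClusterRouter(DispatcherQueue queue, Port<CacheMessage> storagePort)
        {
            this.storagePort = storagePort;
            Arbiter.Activate(queue, Arbiter.Receive(true, IncomingOneWay, Route));
        }

        // Called during start-up only, so Route never races with Subscribe.
        public void Subscribe(string messageType, Port<CacheMessage> processorPort)
        {
            List<Port<CacheMessage>> ports;
            if (!subscribers.TryGetValue(messageType, out ports))
                subscribers[messageType] = ports = new List<Port<CacheMessage>>();
            ports.Add(processorPort);
        }

        void Route(CacheMessage message)
        {
            // The storage component always sees the message...
            storagePort.Post(message);

            // ...and so does every processing component subscribed to this type.
            List<Port<CacheMessage>> ports;
            if (subscribers.TryGetValue(message.MessageType, out ports))
                foreach (var port in ports)
                    port.Post(message);
        }
    }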

MySpace is using CCR because it offers them significantly higher throughput than what they obtained from thread pools. One reason for this is that .NET thread pools incur a significant amount of overhead just deciding how many threads to keep alive. There is also the context switching between threads, which consumes valuable processing time. CCR avoids much of this by defaulting to a single thread per CPU. In order to get even more control over threads, MySpace has extended CCR so that they can dynamically change the number of threads assigned to a queue at run time.
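
As a point of reference, the thread count of a stock CCR Dispatcher is chosen when the pool is constructed, as the short sketch below illustrates; the run-time resizing MySpace mentions is their own extension and is not part of the shipped toolkit, so it is not shown here.

    using System;
    using Microsoft.Ccr.Core;

    class DispatcherSizing
    {
        static void Main()
        {
            // A thread count of 0 asks for the CCR default, which works out
            // to roughly one dispatcher thread per CPU.
            var perCpu = new Dispatcher(0, "per-cpu-pool");

            // The count can also be pinned explicitly when the pool is created.
            var pinned = new Dispatcher(Environment.ProcessorCount * 2, "pinned-pool");

            Console.WriteLine("CPUs: " + Environment.ProcessorCount);

            // In the stock toolkit the count is fixed for the life of the Dispatcher;
            // changing it on the fly is the extension MySpace describes adding.
            perCpu.Dispose();
            pinned.Dispose();
        }
    }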

In addition to the thousands of incoming messages a cluster receives each second, it also has to handle outgoing messages. For this they use CCR’s multiple-item receive and other coordination patterns to make decisions like “wait for 100 messages or 500 milliseconds”, at which point the outgoing messages are sent in a batch. According to MySpace, these patterns are essential for moving large amounts of data.
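
The CCR's built-in arbiters (multiple-item receives, joins, choices) are presumably what MySpace uses for this; rather than guess at their exact signatures, the sketch below hand-rolls the "send after 100 messages or 500 milliseconds, whichever comes first" decision using only a basic receive and a timeout port. It assumes the DispatcherQueue passed in is backed by a single-threaded Dispatcher, so the two handlers never run concurrently, and the TimeoutPort call is written from memory of the toolkit.

    using System;
    using System.Collections.Generic;
    using Microsoft.Ccr.Core;

    class OutgoingBatcher
    {
        public readonly Port<byte[]> Outgoing = new Port<byte[]>();

        readonly DispatcherQueue queue;
        readonly List<byte[]> batch = new List<byte[]>();
        readonly int maxBatchSize;   // e.g. 100 messages
        readonly TimeSpan maxDelay;  // e.g. 500 milliseconds

        public OutgoingBatcher(DispatcherQueue queue, int maxBatchSize, TimeSpan maxDelay)
        {
            this.queue = queue;
            this.maxBatchSize = maxBatchSize;
            this.maxDelay = maxDelay;

            // Persistent receiver: every outgoing message is appended to the batch.
            Arbiter.Activate(queue, Arbiter.Receive(true, Outgoing, OnMessage));
            ArmTimer();
        }

        void ArmTimer()
        {
            // The timeout port posts a single DateTime once the delay has elapsed.
            Arbiter.Activate(queue,
                Arbiter.Receive(false, queue.TimeoutPort(maxDelay), OnTimeout));
        }

        void OnMessage(byte[] message)
        {
            batch.Add(message);
            if (batch.Count >= maxBatchSize)
                Flush();                 // the size threshold was reached first
        }

        void OnTimeout(DateTime signalled)
        {
            if (batch.Count > 0)
                Flush();                 // the time threshold was reached first
            ArmTimer();                  // re-arm for the next window
        }

        void Flush()
        {
            var toSend = batch.ToArray();
            batch.Clear();
            SendBatch(toSend);           // hand the whole batch to the socket layer
        }

        void SendBatch(byte[][] messages)
        {
            Console.WriteLine("sending a batch of " + messages.Length + " messages");
        }
    }

With maxBatchSize set to 100 and maxDelay set to TimeSpan.FromMilliseconds(500), the batcher reproduces the "100 messages or 500 milliseconds" rule from the interview.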

Prior to adopting CCR, MySpace either used .NET thread pools or managed their threading manually. They say they were unaware of any other alternatives for thread management in .NET, and CCR happened to advertise support for the same use cases they had been trying to solve with their in-house implementations.

Currently MySpace is running CCR on 1,200 middle-tier servers and 3,000 web servers. It has become so popular at MySpace that they have incorporated it into their frameworks, and many developers there are not even aware they are using it.

You can watch the entire MySpace interview with Erik Nelson and Akash Patel on Channel 9. For more information on CCR, check out the CCR and DSS Toolkit 2008 R2.

That sounds expensive by Udi Dahan

I've written up a blog post on this presentation:

www.udidahan.com/2009/10/09/myspace-architectur...

In this case, they would probably have been better off had they gone with a more REST-oriented architecture as I wrote about here:

www.udidahan.com/2008/12/29/building-super-scal...

Re: That sounds expensive by Jonathan Allen

I don't see how switching from a custom wire format to HTML for cross-machine communication is supposed to help. Could you elaborate?

Re: That sounds expensive by Udi Dahan

Jonathan,

The main difference is that we leverage the caching behavior of the internet rather than caching the data on our own servers. It's not an issue of wire format. It makes it possible to leverage CDNs to distribute data and not just static content like images.

Does that answer your question?

Re: That sounds expensive by Jonathan Allen

Not really. There is no obvious reason to me why the techniques cannot be combined.

Re: That sounds expensive by Udi Dahan

Absolutely - the techniques can be combined.

The question is, after offloading all this data to the CDN, and as a result the number of requests to our servers drops substantially, how many servers will we need then? Probably quite a bit fewer. Yes, we can cache on those remaining servers as well, but then we're in an environment where the read-to-write ratio is much lower, decreasing the value of caching on our servers.

Hope that makes sense.

Re: That sounds expensive by Sonny Varona

I guess they use a CDN (Akamai and the like) to serve those static pages. But most of the traffic is people viewing profiles. For popular celebrity profiles I can see that REST would help, but for individual profiles I'm not sure how much of an edge it gives.
Profiles can also fall in and out of favour quickly... And caching profile info along the Internet? Public profiles yes, as long as they stay public; private profiles probably not...

So I'm pondering ......
