InfoQ

More on Parallel LINQ

Posted by Jonathan Allen on Sep 20, 2007

Sections: Development, Architecture & Design
Topics: .NET, Performance & Scalability
Tags: LINQ, PLINQ

LINQ, or Language Integrated Query, is the poster child of Visual Studio 2008. It allows a SQL-like syntax to be used for querying any kind of data. Out of the box it supports SQL Server, XML, and in-memory objects, and its extensible framework allows developers to support other providers, including MySQL, Amazon, and Google Desktop.

Queries are special in that they are inherently parallelizable. While a query can have side effects that prevent parallelization, such side effects are rare. This makes queries an ideal candidate for parallel processing libraries like PLINQ.

PLINQ, formally known as Parallel LINQ, operates on XML and in-memory objects. Queries that are executed on a remote server, such as those issued through LINQ to SQL, are not candidates.

Turning a LINQ query into a PLINQ query is surprisingly easy. Just append the call ".AsParallel()" to the data source specified in the From clause. The Where, OrderBy, and Select clauses are then automatically bound to their parallel versions, as are any joins.
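As a minimal sketch of that change, the two methods below run the same query sequentially and in parallel. It assumes the PLINQ API as it later shipped in .NET 4 (System.Linq.ParallelEnumerable), which may differ in detail from the 2007 preview; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

static class AsParallelSketch
{
    // Ordinary LINQ to Objects query, executed on the calling thread.
    public static int[] EvenSquaresSequential(int[] source) =>
        (from n in source
         where n % 2 == 0
         orderby n
         select n * n).ToArray();

    // Same query text. AsParallel() on the data source makes the compiler
    // bind where/orderby/select to the parallel operator implementations.
    public static int[] EvenSquaresParallel(int[] source) =>
        (from n in source.AsParallel()
         where n % 2 == 0
         orderby n
         select n * n).ToArray();

    static void Main()
    {
        int[] numbers = Enumerable.Range(1, 20).ToArray();
        Console.WriteLine(string.Join(", ", EvenSquaresParallel(numbers)));
    }
}
```

Because the query contains an orderby clause, both versions return their results in the same order; without it, the parallel version would make no ordering promise.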

According to MSDN Magazine, PLINQ can execute queries in one of three modes. In pipelined processing, one thread is used to read the data source and the other threads are used to process the query. The results can be consumed while the query is still running, though the single consumer thread may have difficulty keeping up with the multiple producers. When the workload is well balanced, however, this mode can significantly reduce memory consumption.
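A rough illustration of pipelined consumption, again assuming the API as shipped in .NET 4: a plain foreach over a parallel query acts as the single consumer while worker threads produce results. The WithMergeOptions call is an optional hint added here for illustration, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PipelinedSketch
{
    public static List<int> DoubleAll(IEnumerable<int> source)
    {
        // NotBuffered asks PLINQ to hand each result to the consumer
        // as soon as it is produced instead of batching the output.
        ParallelQuery<int> query = source
            .AsParallel()
            .WithMergeOptions(ParallelMergeOptions.NotBuffered)
            .Select(n => n * 2);

        var seen = new List<int>();
        // The foreach loop runs on one thread (the consumer) while
        // the query's worker threads keep producing.
        foreach (int value in query)
        {
            seen.Add(value);
        }
        return seen; // Arrival order is not guaranteed.
    }

    static void Main()
    {
        foreach (int value in DoubleAll(Enumerable.Range(1, 10)))
        {
            Console.WriteLine(value);
        }
    }
}
```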

The second mode is called "stop and go". It is used when the entire result set is needed at one time, such as when calling ToList or ToArray, or when sorting the output. This mode completes all the processing first and then turns over all the resources to the consumer thread. It can perform better than pipelined processing because less synchronization is needed between the query threads.
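The sketch below shows the kind of query that needs the whole result set before the consumer can proceed: a sort followed by ToList. It assumes the .NET 4 PLINQ API; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StopAndGoSketch
{
    public static List<int> SortedDescending(IEnumerable<int> source)
    {
        // Sorting cannot emit its first element until every input has
        // been seen, and ToList() needs the complete output, so the
        // consumer only resumes once all parallel work has finished.
        return source
            .AsParallel()
            .OrderByDescending(n => n)
            .ToList();
    }

    static void Main()
    {
        List<int> sorted = SortedDescending(new[] { 4, 1, 9, 7, 3 });
        Console.WriteLine(string.Join(", ", sorted));
    }
}
```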

The final mode is called "inverted enumeration". Instead of collecting all the output and processing it on a single thread, the final function is passed to each thread via the ForAll extension method. This is by far the fastest mode, but it requires that the function passed to ForAll be thread safe and preferably free of locks and side effects.
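A small sketch of inverted enumeration, assuming the ForAll method as shipped in .NET 4. The delegate runs on the worker threads, so the shared total is updated with Interlocked.Add rather than a plain addition. That update is still a side effect, chosen here only so the result can be observed; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

static class ForAllSketch
{
    public static long SumMultiplesOfThree(IEnumerable<int> source)
    {
        long total = 0;

        source
            .AsParallel()
            .Where(n => n % 3 == 0)
            // ForAll invokes the delegate on each worker thread directly;
            // results are never merged back onto a single consumer thread.
            .ForAll(n => Interlocked.Add(ref total, n));

        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumMultiplesOfThree(Enumerable.Range(1, 100)));
    }
}
```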

An exception in any PLINQ thread causes all the other threads to terminate. If multiple exceptions occur, they are all bundled into a single MultipleFailuresException with their original stack traces preserved.
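The following sketch shows how bundled exceptions might be handled. It assumes the PLINQ API as shipped in .NET 4, where the bundling type is System.AggregateException rather than the MultipleFailuresException named in the preview; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

static class PlinqExceptionSketch
{
    // Returns how many worker exceptions were bundled together
    // (0 if the query succeeded).
    public static int CountFailures(int[] source)
    {
        try
        {
            int[] unused = source
                .AsParallel()
                .Select(n =>
                {
                    if (n % 5 == 0)
                    {
                        throw new InvalidOperationException("bad value " + n);
                    }
                    return n;
                })
                .ToArray();
            return 0;
        }
        catch (AggregateException ex)
        {
            // Each inner exception keeps its own stack trace. How many
            // are collected before the query stops can vary per run.
            return ex.Flatten().InnerExceptions.Count;
        }
    }

    static void Main()
    {
        int failures = CountFailures(Enumerable.Range(1, 100).ToArray());
        Console.WriteLine("Bundled exceptions: " + failures);
    }
}
```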
