InfoQ

Digging into the Performance of the ADO.NET Entity Framework

Posted by Robert Bazinet on Feb 15, 2008

The ADO.NET Team recently discussed various performance aspects of the ADO.NET Entity Framework. The framework entered its third beta in December, and since then the team has published information about how to use it; now they are turning to its performance characteristics.

The article examines the performance of the ADO.NET Entity Framework by breaking down the stack, showing how to speed up a simple query, and explaining the performance characteristics of the framework.

It's important to point out that when a layer of abstraction such as an Entity Data Model (EDM) is used to transform the relational schema of a database, there is going to be some performance cost.

The Query and Results

The article uses the Northwind database for the model and creates a very simple query:

using (NorthwindEntities ne = new NorthwindEntities())
{
    foreach (Order o in ne.Orders)
    {
        int i = o.OrderID;
    }
}

The test was run for 10 iterations over a total of 848 rows for each query. The results are interesting: the first run took 4241 ms, while each subsequent run averaged around 13 ms. Much of the first-run time goes to creating the ObjectContext and to the expensive one-time operations that occur the first time any operation accesses the database.
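The measurement described above can be approximated with a small timing harness. The sketch below is not the ADO.NET team's test code; the QueryTimer class and TimeRuns method are hypothetical names introduced here, and it assumes System.Diagnostics.Stopwatch is precise enough for millisecond-level comparisons.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper, not part of the Entity Framework.
public static class QueryTimer
{
    // Runs `query` `iterations` times and returns the elapsed
    // milliseconds of each run, in execution order.
    public static List<long> TimeRuns(Action query, int iterations)
    {
        var timings = new List<long>(iterations);
        for (int run = 0; run < iterations; run++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            query();
            watch.Stop();
            timings.Add(watch.ElapsedMilliseconds);
        }
        return timings;
    }
}
```

Passing the query loop above as the `query` delegate with 10 iterations should make the pattern the article describes visible: one slow first run followed by much faster ones.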

Breaking down each operation by percentage gives us some insight:

  • Loading Metadata (11%)
  • Initializing Metadata (14%)
  • Opening Connection (8%)
  • View Generation (56%)
  • Load Assembly (2%)
  • Tracking (1%)
  • Materialization (7%)
  • Misc (1%)

By far the biggest share of the time is View Generation, at 56%. When View Generation is the primary cost, developers can run the EDM generator command-line tool (EdmGen.exe) in view generation mode (/mode:ViewGeneration); the output is a code file (C# or VB.NET) that can be included in the project. Having the views pre-generated reduces the startup time to 2933 ms, reported as about a 28% decrease in the overall time for the iteration. Generating the views ahead of time and distributing them with the application is a good way to gain performance. The downside is that the views are no longer dynamic: they need to be regenerated and kept synchronized whenever the model changes.

Query Performance

The article points out that a major design element for performance is the query cache. Once a query is executed, parts of it are maintained in a global cache. Because of this query and metadata caching, the second run always executes faster than the first. For example, consider this Entity SQL query:

using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
    ObjectQuery<Orders> orders = ne.CreateQuery<Orders>("Select value o from Orders as o");
    foreach (Orders o in orders)
    {
        int i = o.OrderID;
    }
}

The first run of this query takes 179 ms, but the next run takes only 15 ms. The difference between the initial execution and subsequent ones is the cost of building the command tree that gets passed down to the provider for execution.

LINQ queries are similar to Entity SQL queries in the way they execute. For example, consider the query below:

using (PerformanceArticleContext ne = new PerformanceArticleContext())
{
    var orders = from order in ne.Orders
                 select order;

    foreach (Orders o in orders)
    {
        int i = o.OrderID;
    }
}

The LINQ query takes 202 ms initially and 18 ms on subsequent executions, slightly slower than Entity SQL in both cases. Compiled LINQ queries can improve performance further. The advantage of compiling a LINQ query is that the expression tree is built when the query is compiled and doesn't need to be rebuilt on subsequent executions. The code for the compiled LINQ query looks like this:

public static Func

Notice that the compiled query is held in a Func delegate. The execution time for the compiled LINQ query is 305 ms on the first execution and 15 ms on subsequent ones. The results are not amazing, but it's interesting to note the 3 ms decrease in subsequent execution time compared with the regular LINQ query (18 ms). That is not important for a few queries, but it has value when performing thousands of queries.
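The listing above was cut off by the publishing software, and the original code is not reproduced here. As a rough sketch only, a compiled query built with CompiledQuery.Compile from System.Data.Objects might look like the following. It assumes the released Entity Framework API (the beta may differ) and the PerformanceArticleContext and Orders types generated from the article's model; the OrderQueries class, the AllOrders field, and the ReadAllOrderIds method are hypothetical names.

```csharp
using System;
using System.Data.Objects; // Entity Framework (System.Data.Entity.dll)
using System.Linq;

// Hypothetical holder class; PerformanceArticleContext and Orders are
// assumed to come from the model generated for the article.
public static class OrderQueries
{
    // Compiled once and stored in a static delegate so that it can be
    // reused across executions.
    public static readonly Func<PerformanceArticleContext, IQueryable<Orders>> AllOrders =
        CompiledQuery.Compile((PerformanceArticleContext ne) =>
            from order in ne.Orders
            select order);

    public static void ReadAllOrderIds()
    {
        using (PerformanceArticleContext ne = new PerformanceArticleContext())
        {
            // Invoke the compiled delegate with the current context.
            foreach (Orders o in AllOrders(ne))
            {
                int i = o.OrderID;
            }
        }
    }
}
```

Keeping the delegate in a static field is the point of the design: if it were compiled inside the method, each call would pay the compilation cost again.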

The ADO.NET team suggests being careful with the tracking and NoTracking options in your queries:

In the previous examples, all the queries result in the creation of an object that gets added to the ObjectStateManager so that we can track updates. When it is not important to track updates or deletes to objects, then executing queries using the NoTracking merge option may be a better option. For example, NoTracking may be a good option in an ASP.NET web application that queries for a specific category name but doesn’t make updates to the returned data. In a case like this, there is a performance benefit to using NoTracking queries.

Based on the numbers, the NoTracking option provides a big reduction in the amount of time, where most of this gain comes when we stop tracking changes and managing relationships. For a NoTracking query, the compiled LINQ query outperforms the standard LINQ query both in first execution and in subsequent executions. Note that the second execution of the compiled LINQ query is equivalent to the second execution of the Entity SQL query.
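As an illustration of the NoTracking option discussed above, the sketch below sets the merge option on the Entity SQL query used earlier. It assumes the released System.Data.Objects API (ObjectQuery<T>.MergeOption and MergeOption.NoTracking) and the PerformanceArticleContext and Orders types from the article's model; the ReadOnlyOrders class and ReadAllOrderIds method are hypothetical names.

```csharp
using System.Data.Objects; // Entity Framework (System.Data.Entity.dll)

// Hypothetical wrapper class for a read-only query.
public static class ReadOnlyOrders
{
    public static void ReadAllOrderIds()
    {
        using (PerformanceArticleContext ne = new PerformanceArticleContext())
        {
            ObjectQuery<Orders> orders =
                ne.CreateQuery<Orders>("Select value o from Orders as o");

            // Results are materialized but not added to the ObjectStateManager,
            // so later changes to them are not tracked.
            orders.MergeOption = MergeOption.NoTracking;

            foreach (Orders o in orders)
            {
                int i = o.OrderID;
            }
        }
    }
}
```

This fits read-only scenarios such as the ASP.NET example in the quote; objects that will be updated and saved should be queried with tracking left on.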

The ADO.NET team also suggests keeping a few things in mind when creating queries:

When optimizing query performance in the Entity Framework, you should consider what works best for your particular programming scenario. Here are a few key takeaways:

  • Initial creation of the ObjectContext includes the cost of loading and validating the metadata.
  • Initial execution of any query includes the costs of building up a query cache to enable faster execution of subsequent queries.
  • Compiled LINQ queries are faster than Non-compiled LINQ queries.
  • Queries executed with a NoTracking merge option work well for streaming large data objects or when changes and relationships do not need to be tracked.

For more information on ADO.NET and the Entity Framework, check out the ADO.NET Team Blog.

  1. Typo?

    Feb 15, 2008 1:50 PM by Christophe Vanfleteren

    In the part where you talk about the compiled LINQ query, there's only one line of code visible:


    The code for the compiled LINQ query looks like this:

    public static Func

    I guess there should be more visible? :)

  2. Re: Typo?

    Feb 15, 2008 2:20 PM by Robert Bazinet

    Thank you for pointing it out. No, it's not a typo, just happens to be an error in the publishing software.

    I will get it fixed.

  3. How to get the breakdown results?

    Jul 16, 2008 4:12 PM by Peter Cheung

    Hi Robert,

    I'm interested in getting the breakdown results like yours:

    Loading Metadata (11%)
    Initializing Metadata (14%)
    Opening Connection (8%)
    View Generation (56%)
    Load Assembly (2%)
    Tracking (1%)
    Materialization (7%)
    Misc (1%)

    How can I do that and what tools are needed?

    Thanks.
