SharePoint Object Model Performance Considerations

Posted by Andreas Grabner on Mar 04, 2009 |

The SharePoint Object Model allows external applications or hosted WebParts to query, modify and create content stored in the SharePoint Content Database. There are many blog entries, knowledge base articles and best practices about how to correctly use the Object Model for different use case scenarios.

The most common use case scenarios are around displaying and modifying SharePoint lists - unfortunately, that is also where we see many performance issues because the SharePoint Object Model is not always used in ways conducive to optimal performance.

Use Case 1: How many items are stored in a SharePoint list?

There are multiple ways to answer this question. One example that I’ve seen many times is the following:

int noOfItems = SPContext.Current.List.Items.Count;

This code gives us the number of items in our list, but in order to do so it has to retrieve ALL items of the list from the content database. The following screenshot shows what is executed within the Object Model when accessing the Count property as in the code above:

[Screenshot: SQL activity triggered inside the Object Model when accessing the Count property]

For small lists this might not be a problem as the query is rather fast. But it will become a problem when lists grow over time or when the custom code has never been tested with live data.

For this scenario Microsoft offers a different property on the SPList object itself called ItemCount. The correct code for this would be:

int noOfItems = SPContext.Current.List.ItemCount;

In this case, SharePoint only needs to query a single record from the Lists table in the content database. The number of items in the list is stored there redundantly in order to get this information without the need to query the entire AllUserData table where all SharePoint list items are stored.

Use Case 2: Displaying items in a list using SPList

There are multiple ways to iterate through the items of a SharePoint list by using the SharePoint Object Model. One approach - which I’ve seen in a live SharePoint Application - may work fine on the developer’s machine or on very tiny lists. But it is going to ruin your performance once executed on a list that exceeds a couple of hundred items. Let’s start by looking at a code snippet that can be used in a WebPart to access the first 100 items from the SharePoint list of the current context:

SPList activeList = SPContext.Current.List;
for (int i = 0; i < 100 && i < activeList.Items.Count; i++) {
    SPListItem listItem = activeList.Items[i];
    htmlWriter.Write(listItem["Title"]);
}

Assuming that there are at least 100 items in the list - how many roundtrips to the database does this code make in order to retrieve the Title field of the first 100 SharePoint list items? You might be surprised: it is a total of 200 database calls, as you can see from the database view when analyzing the transaction that executes the above code:

[Screenshot: database view showing 200 database calls for the analyzed transaction]

The reason is that every loop iteration requests a new SPListItemCollection object when accessing the Items property. The Items property is not cached and therefore requests all items from the database over and over again. Here is how the first loop iterations look:

[Screenshot: first loop iterations, each fetching all list items from the database]

The CORRECT way to do it

The correct way is of course to store the return value of the Items property in an SPListItemCollection variable. This way the database is only queried once, and we then iterate over the result set stored within the collection object. Here is the changed sample code:

SPListItemCollection items = SPContext.Current.List.Items;
for (int i = 0; i < 100 && i < items.Count; i++) {
    SPListItem listItem = items[i];
    htmlWriter.Write(listItem["Title"]);
}

You can also use a foreach loop, which compiles into similar code leveraging the IEnumerable interface of the items collection. Here is how the new loop is executed internally:

[Screenshot: internal execution of the corrected loop, with a single database query]
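For completeness, here is a sketch of the foreach variant; htmlWriter is assumed to be the HtmlTextWriter available in the WebPart's rendering code:

SPListItemCollection items = SPContext.Current.List.Items; // queried once
int rendered = 0;
foreach (SPListItem listItem in items)
{
    if (rendered++ >= 100) break; // stop after the first 100 items
    htmlWriter.Write(listItem["Title"]);
}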

Use Case 3: Use SPQuery and SPView to only request data that you really need

One of the main performance problems that we can witness in any type of application that deals with data from a database is that too much data is requested. Requesting more information than the current use case at hand actually needs results in additional

  • query overhead in the database to gather the requested information
  • communication overhead between the database and the application
  • memory overhead on both the database and the application

Looking back at the previous two use cases, you can see that the executed SQL statement always selected ALL items of the requested SharePoint list. You can tell by looking at the SELECT clause, which reads: SELECT TOP 2147483648 …

Limiting the number of returned rows

In case you only want a limited result set when accessing items in a SharePoint list you can make use of the SPQuery.RowLimit property.

Here is an example:

SPQuery query = new SPQuery();
query.RowLimit = 100;
SPListItemCollection items = SPContext.Current.List.GetItems(query);
for (int itemIx = 0; itemIx < items.Count; itemIx++) {
    SPListItem listItem = items[itemIx];
}

Using the SPQuery object with SPList.GetItems results in the following SELECT clause:

[Screenshot: SELECT clause generated by SPList.GetItems with RowLimit set]

In the previous example we already limited the number of items to retrieve. However, we still request ALL columns defined in the SharePoint list. This might be fine if we really need all columns to display to the end user, or if we need all of them to perform some calculation. In most cases, however, we only need a few of them.

Limiting the retrieved columns

There are two ways of limiting the columns to retrieve from the database:

  • Use a SharePoint View: SPView
  • Use the SPQuery.ViewFields property

The sample code from above can therefore be changed in the following two ways:

SPQuery query = new SPQuery(SPContext.Current.CurrentView.View);

or

SPQuery query = new SPQuery();
query.ViewFields = "<FieldRef Name='ID'/><FieldRef Name='Text Field'/><FieldRef Name='XYZ'/>";

In both scenarios the SELECT clause will only contain those fields that are defined in the SharePoint view or referenced in the ViewFields property, respectively. The following illustration shows the difference in the SELECT clause:

[Screenshot: difference in the SELECT clause when limiting the retrieved columns]

Use Case 4: Paging through SharePoint list items with SPQuery

SharePoint lists can contain thousands of items. We have all heard about the 2,000-item limit that should not be exceeded in order to maintain good list performance. There are definite performance impacts when exceeding this limit, and there are ways to mitigate it by using indexed columns and views.

Besides these considerations it is also important to be smart when accessing data in the list. As already explained in the previous use cases, accessing only the data that you need can take a lot of pressure off the SharePoint Content Database. The SharePoint Object Model also provides further features to enhance access to list items.

Paging data is a technique we all know well from rich client applications, or from web applications using, for example, data grids. Paging allows easy navigation for the end user and - if implemented correctly - reduces the load on the underlying database.

The SPQuery object provides the property ListItemCollectionPosition that allows you to specify the start position of your query page. RowLimit allows you to specify how many items to retrieve per page. Let’s have a look at some sample code:

SPQuery query = new SPQuery();
query.RowLimit = 10; // that is our page size

do
{
    SPListItemCollection items = SPContext.Current.List.GetItems(query);

    // do something with the page result

    // set the position cursor for the next iteration
    query.ListItemCollectionPosition = items.ListItemCollectionPosition;
} while (query.ListItemCollectionPosition != null);

SPList.GetItems takes the query and only returns 10 items each time GetItems is called. The SPListItemCollection offers the property ListItemCollectionPosition which acts like a cursor on the SharePoint list. This property can be used for any further page iteration to define the starting point of the next page. The following illustration shows the database activities:

[Screenshot: database activity while paging through the list]

A closer look at one of the SQL Statements shows us that the combination of the SELECT TOP and WHERE clause is used to retrieve the items of a certain page:

[Screenshot: SQL statement combining SELECT TOP and a WHERE clause to retrieve one page]

Use Case 5: Updating a large number of SharePoint list items

The previous use cases focused on read access to items stored in SharePoint lists. Now it's time to discuss how best to update or add new items. As the SharePoint Object Model provides a wide variety of interfaces, we can again choose between multiple approaches.

The first obvious way of adding or updating items in a SharePoint list is SPListItem.Update. You can obtain the list item either by querying an existing item or by adding a new one via SPListItemCollection.Add.

Let’s have a look at the following sample:

for (int itemIx = 0; itemIx < 100; itemIx++) {
    SPListItem newItem = items.Add();
    // fill all the individual fields with values
    newItem.Update();
}

Analyzing this code shows us that EVERY call to the Update method actually calls the internal method SPListItem.AddOrUpdateItem which in fact calls a stored procedure to perform the task:

[Screenshot: SPListItem.Update calling the internal SPListItem.AddOrUpdateItem, which executes a stored procedure per item]

We can see that adding 100 items to my list took a total of 4.5 seconds.

Using batch updates instead of individual updates

If you have to update a larger number of items, it is highly recommended not to call the Update method on every item. Instead, use the batch update function ProcessBatchData provided by SPWeb.

ProcessBatchData takes batch method definitions in XML format. There is a nice article explaining how to use batch update. Implementing the sample from above by making use of batch updates looks like this:

StringBuilder query = new StringBuilder();
for (int itemIx = 0; itemIx < 100; itemIx++) {
    query.AppendFormat("<Method ID=\"{0}\">" +
        "<SetList>{1}</SetList>" +
        "<SetVar Name=\"ID\">New</SetVar>" +
        "<SetVar Name=\"Cmd\">Save</SetVar>" +
        "<SetVar Name=\"{3}Title\">{2}</SetVar>" +
        "</Method>", itemIx, listGuid, someValue, "urn:schemas-microsoft-com:office:office#");
}

SPContext.Current.Web.ProcessBatchData(string.Format(
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
    "<ows:Batch OnError=\"Return\">{0}</ows:Batch>", query.ToString()));

Adding the same 100 items via ProcessBatchData and analyzing the internals shows us how much time is spent in the update:

[Screenshot: time spent in the batch update via ProcessBatchData]

Comparing both update approaches shows that we have a significant performance improvement with the batch update:

[Screenshot: performance comparison of individual updates vs. batch update]

Caution

Batch updates are highly recommended when doing larger updates. But please consider the overhead of creating the batch update XML:

  • Make sure you use a StringBuilder and not individual string objects that get concatenated.
  • Split up batch update calls to keep the generated XML small enough to avoid out-of-memory exceptions. I ran into an OOM exception when executing the above example with 50,000 batch updates.
  • As an alternative you can also use the UpdateListItems method of the Lists Web Service.
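Splitting the work into chunks, as suggested above, could be sketched as follows. BuildMethodXml and totalItems are hypothetical placeholders, and the batch size of 1,000 is an assumption chosen to keep the XML comfortably small, not a documented limit:

const int BatchSize = 1000; // assumption: tune per list and item size
StringBuilder batch = new StringBuilder();
int inBatch = 0;
for (int itemIx = 0; itemIx < totalItems; itemIx++)
{
    batch.Append(BuildMethodXml(itemIx)); // hypothetical helper building one <Method> element
    if (++inBatch == BatchSize || itemIx == totalItems - 1)
    {
        // flush the current chunk as its own batch
        SPContext.Current.Web.ProcessBatchData(string.Format(
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
            "<ows:Batch OnError=\"Return\">{0}</ows:Batch>", batch.ToString()));
        batch.Length = 0; // reset the builder for the next chunk
        inBatch = 0;
    }
}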

Use Case 6: Which are my slowest lists, how are they used and why are they slow?

We all know that SharePoint list performance can degrade as more items are stored in a list, and depending on how the lists are filtered when viewed. You will find many articles and blog entries talking about the 2,000-items-per-list limit. The 2,000 items, however, are not the real problem - you can in fact have many more items in a list. It all depends on how those lists are viewed. The question therefore is: how can we identify which lists have degraded in performance, and how are they commonly used?

In order to take the right corrective actions we first need to understand the current usage scenarios and analyze the resulting performance problems.

There are multiple ways to figure out current access statistics of your SharePoint application. You can analyze IIS log files or you can make use of the SharePoint Usage Reporting Feature.

The easiest way to monitor list performance is by analyzing the HTTP response times of the respective SharePoint list and view URLs. A SharePoint URL has the following format: http://servername/site/{LISTNAME}/{VIEWNAME}.aspx.

In order to analyze them we can group the requests based on those two indicators. I used dynaTrace's Business Transaction feature to group the captured PurePaths according to a regular expression. The following images show the usage scenario of my lists and views:

[Screenshot: usage and response times of lists and views, grouped via Business Transactions]

These results give us a good indication about which lists and views are used more frequently and how well they perform.
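The URL-based grouping described above essentially boils down to a regular expression over the request path. The following sketch assumes the URL layout shown earlier; it is an illustration, not the exact expression used in the tool:

using System.Text.RegularExpressions;

static class ListViewGrouping
{
    // Capture {LISTNAME} and {VIEWNAME} from URLs like /site/MyList/AllItems.aspx
    static readonly Regex ListViewPattern =
        new Regex(@"/([^/]+)/([^/]+)\.aspx$", RegexOptions.IgnoreCase);

    // Returns a grouping key such as "MyList/AllItems", or "other" if no match
    public static string GroupKey(string url)
    {
        Match m = ListViewPattern.Match(url);
        return m.Success ? m.Groups[1].Value + "/" + m.Groups[2].Value : "other";
    }
}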

In addition to analyzing the HTTP requests - which only provide accurate data for pages that display a specific list or view - we can analyze the list usage of custom web parts or custom pages that access more than a single list or view, or that access a list in a specially filtered way.

We can do that by analyzing the interaction with the SharePoint Object Model, such as SPRequest.RenderViewAsHtml - which is used to render lists and views - or access to SPList and SPView. The following illustration shows usage and performance metrics based on SPRequest method invocations:

[Screenshot: usage and performance metrics based on SPRequest method invocations, keyed by list GUID]

The illustration above shows the internal GUIDs of the lists. Each list and view is uniquely identified by a GUID. There are different ways to find out the actual list name: you can paste the GUID into the URL that edits the settings of the list or view. Here is an example:

http://servername/_layouts/listedit.aspx?List={GUID} (GUID must be URL-Encoded). 
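If you need to build such a URL in code, a small sketch could look like this; serverUrl is a placeholder for your site URL, and HttpUtility.UrlEncode from System.Web takes care of escaping the curly braces of the GUID:

using System;
using System.Web;

static class ListSettings
{
    // Build the list-settings URL for a given list GUID.
    // Guid.ToString("B") yields the "{...}" format SharePoint expects.
    public static string ListSettingsUrl(string serverUrl, Guid listId)
    {
        string encoded = HttpUtility.UrlEncode(listId.ToString("B"));
        return serverUrl + "/_layouts/listedit.aspx?List=" + encoded;
    }
}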

The other option would be to open your content database and query the AllLists table. This table contains both the GUID and the list name.

Why are certain lists slower?

Now that we know which lists and views are accessed frequently, we can focus on those that show signs of performance degradation. In order to improve the end-user experience we should focus on lists that are accessed most frequently rather than lists that are accessed only occasionally.

There can be multiple reasons why lists perform slowly:

  • Too many items are displayed in the list view
  • Too many items in the list without using filters and indexed columns
  • Inefficient data access by custom web parts

Conclusion

The SharePoint object model provides an easy and flexible way to extend SharePoint Applications. The framework offers different mechanisms to access and modify the data stored in SharePoint lists. Not every possible approach, however, is good for every use case scenario. Knowing the internals of the SharePoint Object Model allows creating SharePoint Applications that perform well and scale.

About the Author

Andreas Grabner works as a Technology Strategist for dynaTrace Software. In his role as a member of the R&D department, he influences the dynaTrace product strategy and works closely with key customers in implementing performance management solutions for the entire application lifecycle. Andreas Grabner has 10 years of experience as an architect and developer in the Java and .NET space.

About the Author’s employer: dynaTrace software

dynaTrace is the leader in continuous application performance management across the lifecycle for business critical Java and .NET applications. dynaTrace is the only solution that provides a common, integrated performance management platform for all key stakeholders – development, test and production. Industry leaders such as UBS, LinkedIn, EnerNOC, Fidelity, and Thomson Reuters use dynaTrace’s patent pending technology to gain full visibility into application performance, identify problems sooner and dramatically reduce the mean time to repair by as much as 90%. These and other leading companies rely on dynaTrace to proactively prevent performance problems from happening and quickly resolve those that do occur – saving time, money and resources.

Screenshots by Francois Ward

Nice article. Honestly though, most of these things would be obvious if the SharePoint object model followed the .NET Framework Design Guidelines... the fact that accessing a public property causes so many side effects and is so performance-intensive (like the .Items property) is a pitfall that is warned about over and over. People WILL use public properties in loops, and during a code review it isn't obvious.

If that was not available, and only GetItems was, it would be obvious to most seasoned developers that performance impacts may be involved and to not use it repeatedly.

Anyway...the performance screenshots displayed in the first part of the article... are they from dynaTrace too, or another tool?

Re: Screenshots by Andreas Grabner

All screenshots are from dynaTrace
You can also read more about my SharePoint findings on my blog: blog.dynatrace.com or blog.dynatrace.com/tag/sharepoint-net

Re: Screenshots by Stefan Wenig

If done right, SP would be built on top of an O/R mapper. It could then just cache those lists internally and return them the next time you'd access the property. But things being what they are, you're right, it should definitely be GetItems(), or rather LoadItems() to be really explicit about what happens.

Btw, I guess you can tell it's dynaTrace just from the fact that it lets you drill down into method parameters. This really gives you answers where other tools just provide hints for further research.

Re: Screenshots by Francois Ward

"If done right, SP would be built on top of an O/R mapper"

OR/M or not, they could still be cached internally, that's not an issue. It could still be done, but it would have to be optional... some lists can be far too large (I've seen sites with thousands of lists of hundreds of thousands of items...), so it couldn't be an "always" thing, and then things would get complicated.

Just following the .NET framework design guidelines, so that things in SP behave like they do in everything else would be more than sufficient, honestly. Not just for performance, but for development efficiency, and even for SP's developers themselves, it would be a decent bit easier.

Oh well! Until then, there's posts like these to enlighten us small folks :)

Re: Screenshots by Stefan Wenig

"they could still be cached internally"

yes, but since SharePoint has no concept of transactional behavior, that would mean that the cache might not be up to date with the information in the data store. One way to solve this is to leave all caching to the application developer, like they do now - only then it would have to be named GetItems() like you already said. This is pretty hard to get right on an app by app basis, but at least then it's not SP's fault ;-)

Another way would be to have a sensible concept of caching and cache invalidation, which is something that O/R mappers usually have. (Of course, these concepts could be implemented in SP without an O/R mapper, it just makes little sense to do that.)

useful recommendations by Dana Miller

There are some good recommendations in there. We followed a lot of those, but still ran into problems. We solved some of them after reading this: www.webperformanceinc.com/library/casestudies/s...

Re: useful recommendations by Andreas Grabner

Good Case Study - thanks for the link

Loop efficiency by Nick Tulett

In Case 2, wouldn't it be better to cache the List Items' count as a simple integer before the loop, saving yourself (in your example) half the time spent in the inner loop?

Re: Loop efficiency by Andreas Grabner

Caching the value would be one option. The best option, however, would be to use the ItemCount property, which doesn't require a roundtrip to the database.

InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.