Using LINQ to XML Instead of XSLT for Transformations

by Jonathan Allen on Jul 30, 2007

Transforming XML from one format to another is a common task for many developers. To do this, most of them leave the confines of their general-purpose language and make calls to an XSLT library. But what if they didn't have to?

With LINQ to XML, it is now much easier to manipulate XML using C# and VB. Eric White describes how to perform XSLT-style transformations using C# 3.0.

The key to Eric's method is the ability to annotate XML nodes with additional information. Instead of altering the tree piecemeal, each pending change is stored as an annotation on the XElement it replaces. Eric writes:

One advantage to taking this approach - as you formulate queries, you are always writing queries on the unmodified source tree. You need not worry about how modifications to the tree affect the queries that you are writing.

Once all of the pending changes have been generated, they are applied at one time. This is done via the XForm function, which creates a copy of the tree, making replacements where appropriate.
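
The two phases described above can be sketched in a few lines of C#. This is a simplified illustration, not Eric White's actual XForm function: it assumes each pending change is a complete replacement XElement, and the class name, the Demo method, and the sample Order/Item/Product document are all hypothetical. Only the LINQ to XML calls themselves (AddAnnotation, Annotation<T>, Nodes, Attributes, and the XElement constructors) are standard.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sketch of the annotate-then-copy idea; not the original XForm.
public static class AnnotationTransformSketch
{
    // Copies the tree. Wherever an element carries an XElement annotation,
    // a copy of that annotation is emitted instead of the element itself.
    public static XElement XForm(XElement source)
    {
        XElement replacement = source.Annotation<XElement>();
        if (replacement != null)
        {
            // Deep copy, so the result shares no nodes with the annotation.
            return new XElement(replacement);
        }

        // No pending change: copy this element and recurse into child elements.
        // Non-element nodes (text, comments, ...) are cloned by the constructor
        // because they already have a parent.
        return new XElement(
            source.Name,
            source.Attributes(),
            source.Nodes().Select(n =>
            {
                XElement child = n as XElement;
                return child != null ? XForm(child) : n;
            }));
    }

    // Illustrative data only: the element names are made up for this example.
    public static XElement Demo()
    {
        XElement root = XElement.Parse(
            "<Order><Item>Widget</Item><Note>rush</Note><Item>Gadget</Item></Order>");

        // Phase 1: query the unmodified tree and record each pending change.
        foreach (XElement item in root.Elements("Item"))
        {
            item.AddAnnotation(
                new XElement("Product", new XAttribute("name", item.Value)));
        }

        // Phase 2: apply every pending change in a single copying pass.
        return XForm(root);
    }

    public static void Main()
    {
        Console.WriteLine(Demo());
    }
}
```

In this sketch the source tree is never modified: the Item elements still exist in root after Demo runs, and only the returned copy contains Product elements in their place. The Note element, which has no annotation, is copied unchanged.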

You can learn more about this technique and get a copy of the XForm function from Eric White's blog. 

Why use objects by Frank Cohen

Interesting move for the XLinq team. Anything that makes it easier to work with XML in an object-oriented environment is a good thing. I could not tell from Eric's blog whether this approach has an impact on system performance at runtime. I investigated using XML and XQuery in an architecture I named FastSOA as a way to mitigate the performance problems I find in Java approaches to working with XML. My findings are in my book titled FastSOA from Morgan Kaufmann Publishers.

The basic problem I found was all the object instantiation needed to move from serialized XML to objects, to business logic operating on the data, and then back to XML or RDBMS data formats. There's just too much object instantiation going on. I recommend using a domain-specific language like XQuery because certain XQuery implementations compile XQuery to Java bytecode and avoid objects entirely. (I also never did get my head into the non-proceduralness of XSLT, oh well.)

-Frank Cohen
www.pushtotest.com

Re: Why use objects by Jonathan Allen

As far as performance is concerned, it is way too early to tell. Because they are implemented so differently, I can easily see minor tweaks to a transformation shifting the advantage back and forth between the various techniques.

I think the real gain from this technique is that you don't have to marshal all the data you might need from C# to XQuery or XSLT. If your transformation needs to do something expensive like a database call or hard calculation one time out of ten, you only pay the cost when you actually need it.

On the other hand, XSLT is still a lot cleaner if you think in terms of "This is what my results are supposed to look like" rather than "This is how I transform my source."

Needless to say, this is why I didn't conclude with "And this is why you should use X".

this seems to be a step back .... by Ke Jin

In the Java and C++ world, this has been a common and obvious practice for years. For simple transformations, we have the DOM and SAX APIs to parse XML documents. Based on the parsing result, one can use pure imperative Java and/or C++ code to generate DOM objects or simply output XML text streams without using an XSLT transformer API and certainly without XSLT style sheets.

However, for generic applications that use XSLT today, declarative XSLT style sheet code is much more cost-effective to develop and maintain than imperative Java or C++ code. Also, XSLT is much easier to learn and use for non-programmers, namely business domain experts who have no skills in Java or C++ programming, not to mention the DOM or SAX APIs.

Re: this seems to be a step back .... by Ke Jin

Also, declarative XSLT style sheets are much easier to generate and verify with UI tools than imperative Java, C++, or C# code.

Re: this seems to be a step back .... by Eric White

FWIW, I absolutely agree about the usage scenarios for XSLT. In the LINQ to XML documentation, I have at least 4 or 5 examples that show how to use XSLT to transform an XML tree. XSLT transforms create a new tree, so XSLT does not alleviate the problem of too many short-lived objects.

With respect to processor cost, I haven't done any measurements. However, when doing XSLT transforms using LINQ to XML, the XML tree has to be transformed into an XPathDocument internally, which is a big transform. Then XPath expressions have to be evaluated, and then the transform is effected. In contrast, the LINQ to XML queries that you use to add annotations are quite efficient (due to lazy evaluation of LINQ queries and the semantics of LINQ to XML axes), and adding annotations is cheap. I am going to bet that the pure LINQ to XML approach is more efficient. However, I'm not going to test this until my current deadlines are met :-)

One more note: this technique can be expanded significantly. Possible improvements are:
- add modes, à la XSLT. Annotations are marked with modes. ApplyTransforms takes a mode as an attribute. The XForm function can also take a mode.
- allow for annotations on other types of nodes: attributes, text nodes (and processing instructions and comments for completeness).

My only point about this post is that this is simply one approach to transforming XML trees when using LINQ to XML. It may be useful in some scenarios, but in other scenarios, XSLT may certainly be better.

InfoQ.com and all content copyright © 2006-2013 C4Media Inc.