A New Library and Tooling Package for Open XML
Office Open XML is an internationally recognized document standard based on a ZIP/XML representation of various Microsoft Office file formats. It competes with the Open Document Format (ODF), another internationally recognized standard, which is based on the native format of OpenOffice files. While it is possible to manipulate Open XML files using low-level APIs, the complexity of the format makes that a daunting challenge.
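To make the "ZIP/XML representation" concrete, here is a minimal Python sketch of what working at the low level looks like: a package is a ZIP archive whose entries are XML parts. The helper names (`build_package`, `list_parts`, `root_tag`) are made up for this illustration, and the two parts written here are deliberately tiny stand-ins, not a workbook that Excel would accept.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Stand-in parts for illustration only; a real package needs more than this.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '</Types>'
)
WORKBOOK = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"/>'
)


def build_package(parts):
    """Write a dict of {part name: XML text} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, xml_text in parts.items():
            archive.writestr(name, xml_text)
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the sorted names of all parts stored in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return sorted(archive.namelist())


def root_tag(package_bytes, part_name):
    """Parse one part as XML and return its namespace-qualified root tag."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return ET.fromstring(archive.read(part_name)).tag


if __name__ == "__main__":
    package = build_package({
        "[Content_Types].xml": CONTENT_TYPES,
        "xl/workbook.xml": WORKBOOK,
    })
    for name in list_parts(package):
        print(name, root_tag(package, name))
```

The same `zipfile` calls can be pointed at an existing .docx or .xlsx file to see its parts. Everything beyond that, such as which parts are required and how they relate to each other, is where the complexity mentioned above comes in.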
The first generation of the Open XML SDK provided a thin layer on top of the raw XML. While better than nothing, it still required an intimate knowledge of the underlying format. As such, it wasn’t of much interest, and most developers continued using the Office COM APIs. Unfortunately, the COM libraries are very problematic. They require the associated Office products to be installed and cannot be safely used from servers such as IIS. Even when they are accessed from standalone programs, developers need to take extreme care to avoid leaking instances of Word or Excel.
Open XML SDK 2.0 offers a higher-level API for manipulating Open XML documents. Unlike the previous version, it has specific APIs for each type of document. A deep understanding of the underlying file format is still required, but it is a stepping stone.
Also included in this release is the Open XML SDK v2.0 Productivity Tool. The primary purpose of this tool is to reverse engineer a Word, PowerPoint, or Excel document. It will then generate C# code that can recreate the document. This tool can also be used to validate documents.
Too hard to use for the common mortal, but there's a way
That being said, the SDK is too hard to use for the average programmer. You need a deep understanding of the structure of an Open XML document. The SDK lets you create invalid documents quite easily. For example, there is nothing stopping you from putting a string straight into a cell, even though text is normally stored in the shared string table and referenced from the cell by index. You need to either do that, or change the data type of the cell.
Excel WILL open the invalid document, and prompt you to fix the corrupt data, and then it will work, but that isn't very friendly to your users.
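For anyone wondering what the two valid options look like in the file itself, here is a rough Python sketch of the underlying SpreadsheetML markup. It is not SDK code: the function names are invented for the example, and it only builds the `<c>` cell elements, not a full worksheet. A shared-string cell carries `t="s"` and an index into the shared string list, while an inline-string cell carries `t="inlineStr"` and holds the text itself.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def shared_string_cell(ref, text, shared_strings):
    """Build a cell that refers to `text` by its index in `shared_strings`."""
    if text not in shared_strings:
        shared_strings.append(text)
    cell = ET.Element(f"{{{NS}}}c", {"r": ref, "t": "s"})
    # The <v> element holds the index of the string, not the string itself.
    ET.SubElement(cell, f"{{{NS}}}v").text = str(shared_strings.index(text))
    return cell


def inline_string_cell(ref, text):
    """Build a cell whose type is changed so that it stores the text inline."""
    cell = ET.Element(f"{{{NS}}}c", {"r": ref, "t": "inlineStr"})
    inline = ET.SubElement(cell, f"{{{NS}}}is")
    ET.SubElement(inline, f"{{{NS}}}t").text = text
    return cell


def cell_text(cell, shared_strings):
    """Resolve the text of either kind of cell."""
    kind = cell.get("t")
    if kind == "s":
        return shared_strings[int(cell.find(f"{{{NS}}}v").text)]
    if kind == "inlineStr":
        return cell.find(f"{{{NS}}}is/{{{NS}}}t").text
    raise ValueError(f"unsupported cell type: {kind!r}")


if __name__ == "__main__":
    strings = []
    a1 = shared_string_cell("A1", "Hello", strings)
    a2 = inline_string_cell("A2", "Hello")
    print(ET.tostring(a1, encoding="unicode"))
    print(ET.tostring(a2, encoding="unicode"))
    print(cell_text(a1, strings), cell_text(a2, strings))
```

The mistake described above is mixing the two: writing the text where an index is expected. Either form works as long as the cell's type matches what is stored in it.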
There's a solution I stumbled on while doing a project that required generating Excel 2007 files on the fly.
This is a very actively maintained wrapper around the SDK. It only works with Excel files right now, but it is extremely user friendly, and intuitive (the documentation is top notch, but even without it you can generally guess how to do 90% of things).
It doesn't do everything by any means, but 90% of common cases are covered. Give it a shot (disclaimer: I'm not associated with the project in any way, shape or form. I'm just a happy user).
too hard for small projects..
but, for small projects I continue using the old-school interop objects :)
Re: too hard for small projects..
Don't get me wrong, I've done it. It is just an order of magnitude harder than what you'd expect, and there's nothing stopping you from doing it wrong.
Interop objects work great, but they aren't thread safe, so it's a no-go for serious web site development (with a large number of concurrent users). My understanding and my testing so far show that the OOXML SDK works fine in those environments (since all it is is a glorified XML manipulation SDK). There's also Aspose, which I believe works fine too.