Posted by Jonathan Allen on Jun 17, 2011
Office Open XML is an internationally recognized standard for documents that is based on a ZIP/XML representation of various Microsoft Office file formats. It competes with the Open Document Format (ODF), another internationally recognized standard format based on the native format for Open Office files. While it is possible to manipulate Open XML files using low-level APIs, the complexity of the format makes that a daunting challenge.
The first generation of the Open XML SDK provided a thin layer on top of the raw XML. While better than nothing, it still required an intimate knowledge of the underlying format. As such, it wasn’t of much interest, and most developers continued using the Office COM APIs. Unfortunately, the COM libraries are very problematic. They require the associated Office products to be installed and cannot be safely used from servers such as IIS. Even when they are accessed via standalone programs, developers need to take extreme care to avoid leaking instances of Word or Excel.
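To make the "ZIP/XML representation" concrete, here is a minimal sketch in Python using only the standard library. It is not the Open XML SDK and not a complete document: it writes a single `word/document.xml` part into an in-memory ZIP and reads the text back. A real .docx also needs parts such as `[Content_Types].xml` and `_rels/.rels`, which are omitted here, so Word would not open this package. The helper names `build_package` and `read_paragraph_text` are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML, the Word flavour of Open XML.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

DOCUMENT_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f'<w:document xmlns:w="{W_NS}">'
    "<w:body><w:p><w:r><w:t>Hello, Open XML</w:t></w:r></w:p></w:body>"
    "</w:document>"
)


def build_package():
    """Write a stripped-down package into memory (illustrative, not a full .docx)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


def read_paragraph_text(package_bytes):
    """Return the text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more <w:t> elements.
        runs = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


if __name__ == "__main__":
    print(read_paragraph_text(build_package()))
```

Even this toy case requires knowing the part name, the namespace, and the paragraph/run/text nesting, which hints at why working at this level is daunting for real documents.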
Open XML SDK 2.0 offers a higher-level API for manipulating Open XML documents. Unlike the previous version, it has specific APIs for each type of document. A deep understanding of the underlying file format is still required, but it is a stepping stone.
Also included in this release is the Open XML SDK v2.0 Productivity Tool. The primary purpose of this tool is to reverse engineer a Word, PowerPoint, or Excel document. It will then generate C# code that can recreate the document. This tool can also be used to validate documents.
That's a few months old, so not that new :)
That being said, the SDK is too hard to use for the average programmer. You need a deep understanding of the structure of an Open XML document. The SDK lets you create invalid documents quite easily. For example, there is nothing stopping you from putting a string straight in a cell, even though string cells normally expect their value to be an index into a string dictionary (the shared string table). You need to either do that, or change the type of the cell.
Excel WILL open the invalid document, and prompt you to fix the corrupt data, and then it will work, but that isn't very friendly to your users.
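The cell-type point above can be illustrated with a small Python sketch over hand-written SpreadsheetML fragments (standard library only, not the SDK). The sample XML and the helper name `cell_values` are assumptions made for illustration. A cell with `t="s"` stores an index into the shared string table, a cell with `t="inlineStr"` carries its own text, and a cell with no `t` attribute is numeric.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace.
S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S_NS}

# Hand-written stand-ins for xl/sharedStrings.xml and a worksheet part.
SHARED_STRINGS = (
    f'<sst xmlns="{S_NS}"><si><t>Widget</t></si><si><t>Gadget</t></si></sst>'
)
SHEET = (
    f'<worksheet xmlns="{S_NS}"><sheetData><row r="1">'
    '<c r="A1" t="s"><v>1</v></c>'
    '<c r="B1" t="inlineStr"><is><t>Inline text</t></is></c>'
    '<c r="C1"><v>42</v></c>'
    "</row></sheetData></worksheet>"
)


def cell_values(sheet_xml, shared_xml):
    """Map cell references to values, resolving shared-string indexes."""
    shared = [
        "".join(t.text or "" for t in si.iter(f"{{{S_NS}}}t"))
        for si in ET.fromstring(shared_xml).findall("s:si", NS)
    ]
    values = {}
    for cell in ET.fromstring(sheet_xml).iter(f"{{{S_NS}}}c"):
        ref = cell.get("r")
        kind = cell.get("t", "n")  # a missing type attribute means numeric
        if kind == "s":
            # <v> holds an index into the shared string table, not the text.
            values[ref] = shared[int(cell.findtext("s:v", namespaces=NS))]
        elif kind == "inlineStr":
            values[ref] = cell.findtext("s:is/s:t", namespaces=NS)
        else:
            # Simplification: every other cell is treated as a number here.
            values[ref] = float(cell.findtext("s:v", namespaces=NS))
    return values


if __name__ == "__main__":
    print(cell_values(SHEET, SHARED_STRINGS))
```

Writing plain text into the `<v>` element of an untyped cell is the kind of mistake that produces the repair prompt described above, because Excel expects a number there.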
There's a solution I stumbled on while doing a project that required generating Excel 2007 files on the fly.
closedxml.codeplex.com/
This is a very actively maintained wrapper around the SDK. It only works with Excel files right now, but it is extremely user friendly, and intuitive (the documentation is top notch, but even without it you can generally guess how to do 90% of things).
It doesn't do everything by any means, but 90% of common cases are covered. Give it a shot (disclaimer: I'm not associated with the project in any way, shape or form. I'm just a happy user).
They provide C# class generation to create or clone a pre-existing document.
But for small projects I continue using the old-school interop objects :)
Even if you clone a pre-existing document, just adding something as simple as text in a cell is non-trivial, as I described above. You need to get "parts", find parts in those parts, the naming convention is non-intuitive, etc. If you understand OOXML at a low level, it all makes sense.
Don't get me wrong, I've done it. It is just an order of magnitude harder than what you'd expect, and there's nothing stopping you from doing it wrong.
Interop objects work great, but they aren't thread safe, so it's a no-go for serious web site development (with a large number of concurrent users). My understanding and my testing so far show that the OOXML SDK works fine in those environments (since all it is is a glorified XML manipulation SDK). There's Aspose, which I believe works fine too.
That certainly looks interesting; I'll have to get an interview with them.
And for the record, we don't mind self-promotion. If you have a project that you think is worth talking about, by all means let us know.