File System Transactions - still a problem area?

by Alexander Olaru on Jan 10, 2008
Historically, transaction-processing systems have relied primarily, if not solely, on databases to handle the ACID aspects of any IO activity that needed to be transactional. Support for transactional file system operations has been weak at every level: libraries and frameworks, languages, and the file systems themselves. Lately, this situation has started to show signs of improvement.

Certain file system operations (file rename, deletion, etc.) are atomic when considered individually but, to date, few solutions have emerged that provide a comprehensive set of APIs supporting the full range of file IO operations in a transactional context. Applications that perform file operations (e.g. creating, modifying, renaming, deleting files) which need to be executed sequentially as part of a transaction have often had to rely on custom-built solutions to reduce the likelihood of an inconsistent state in the event of system/application crashes or concurrent access.
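
One common custom-built approach relies on the atomicity of a single rename: write the new content to a temporary file, then rename it over the target. The following is a minimal sketch of that idea using the standard java.nio.file API. The class name AtomicFileWriter and its replace method are hypothetical and chosen only for illustration; the sketch does not force data to disk, so it protects readers from seeing a half-written file but is not a full durability guarantee.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: replaces a file's content via write-to-temp-then-rename. */
final class AtomicFileWriter {
    private AtomicFileWriter() {}

    static void replace(Path target, String content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Create the temp file next to the target so the rename stays on one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            // ATOMIC_MOVE throws AtomicMoveNotSupportedException instead of silently
            // falling back to copy-and-delete. Whether an existing target is replaced
            // is implementation specific according to the Files.move documentation.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the temp file after a failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

This covers only a single file. As soon as several files must change together, the rename trick alone is no longer enough, which is the gap the efforts described below try to close.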

A range of applications could benefit from more robust, full-featured transaction support for file systems:

  • "office" applications (word processors, spreadsheets, etc.) - where numerous read, write, delete file operations are performed
  • installers - which can require a restore to the original file system state in the event of a crash or an error
  • applications that do not require powerful "SQL-like" search capabilities - which might work faster if some of their data is stored in flat files vs. databases
  • document and content management systems - which make heavy use of file IO operations

Two of the most notable efforts in this area are Microsoft's Transactional NTFS (TxF) and the Apache Commons Transaction library.

In an MSDN Magazine article, Jason Olson, Technical Evangelist at Microsoft, explains the main features of TxF, a new feature that introduces the concept of transacting file operations into Windows Vista and the next version of Windows Server (‘Longhorn’). According to Jason, the main goals are:

Improved application stability:

Transactional NTFS enables better application stability by reducing or eliminating the amount of error-handling code that needs to be written and maintained for a given application. This ultimately reduces application complexity and makes the application easier to test … Without transactional file operations, it would be nearly impossible to account for every possible failure scenario, up to and including the operating system crashing at any imaginable point during the process.

Improved platform stability:

...this is achieved by some of the ways that Microsoft is using TxF in its own technologies. There are three core features in Windows Vista and Windows Server "Longhorn" that now make use of Transactional NTFS: Windows Update, System Restore, and Task Scheduler. All of these use TxF to write files to the file system within the scope of a transaction in order to handle rollback/commit in case of any exceptions, such as a system reboot due to a loss of power.

Increased innovation:

TxF drives innovation by providing a framework for using transactions outside of SQL calls. Ultimately, Transactional NTFS can fundamentally change the way developers write applications, allowing them to build more robust code. By incorporating transactions into your design, you can write code without having to account for every single possible failure that can occur.

Because TxF is built on top of the Kernel Transaction Manager (KTM), and the KTM can work directly with the Microsoft Distributed Transaction Coordinator, Jason explains that developers can enlist transacted file operations alongside other technologies that use XA transactions: SQL operations, Web Service calls via WS-AtomicTransaction, or MSMQ operations. This allows the file system to participate in XA transactions.

Concepts similar to TxF have also been implemented in the Transactional Registry of Vista and Longhorn. The article also addresses the potential performance impact: "TxF has a strictly pay-to-play model. If you aren’t using transacted file operations, there is no overhead." He adds that TxF is optimized for commit.

Apache Commons Transaction aims, among other things, to provide transactional access to file systems in a manner that is agnostic of the file system provider/implementation. This is achieved through a Java library whose API offers ACID transactions on the file system using a pessimistic locking scheme. A blog post on myjavatricks.com presents a few of the Commons Transaction concepts and examples of performing basic file operations in a transactional manner in Java.

A central component of Commons Transaction is the FileResourceManager, which starts transactions, coordinates the file operations (copy, create, delete, move, write) on the resources/files it manages, and prepares and commits the transactions. At initialization time the FileResourceManager is supplied with:

  • the directory where main data should go after commit
  • the directory where transactions store temporary data (working directory)
  • a boolean flag that indicates whether the path should be URL encoded
  • the logger to be used by the FileResourceManager
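
The split between a main data directory and a working directory can be illustrated with a small sketch. This is not the Commons Transaction API: the class SimpleFileTx and its methods are hypothetical, it uses only java.nio.file, it handles flat file names only, and it omits locking and logging. It is meant to show the general pattern in which changes accumulate in a per-transaction working directory and reach the main directory only on commit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a store-directory / working-directory transaction. */
final class SimpleFileTx {
    private final Path storeDir;                 // where data lives after commit
    private final Path txDir;                    // per-transaction working directory
    private final List<String> written = new ArrayList<>();

    SimpleFileTx(Path storeDir, Path workDir, String txId) throws IOException {
        this.storeDir = Files.createDirectories(storeDir);
        this.txDir = Files.createDirectories(workDir.resolve(txId));
    }

    /** Stages a write; the main directory is untouched until commit. */
    void write(String name, byte[] data) throws IOException {
        Files.write(txDir.resolve(name), data);
        if (!written.contains(name)) {
            written.add(name);
        }
    }

    /** Moves every staged file into the main directory, then removes the working directory. */
    void commit() throws IOException {
        for (String name : written) {
            Files.move(txDir.resolve(name), storeDir.resolve(name),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        Files.delete(txDir);
    }

    /** Discards all staged files; the main directory never saw them. */
    void rollback() throws IOException {
        for (String name : written) {
            Files.deleteIfExists(txDir.resolve(name));
        }
        Files.deleteIfExists(txDir);
    }
}
```

Note that commit in this sketch is a loop of individual moves, so a crash in the middle would leave some files committed and others not. Handling that case is exactly why a real implementation needs a recovery step at startup.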

Upon startup the FileResourceManager will attempt to roll back any incomplete transactions, unless a transaction was already in the process of committing when the system crashed or when it encountered an unrecoverable problem, in which case it will try to roll the transaction forward. If a transaction cannot be recovered - i.e. it can be neither rolled back nor rolled forward - the whole working directory is marked by the FileResourceManager as "dirty" and no more changes are allowed to it until the issue is resolved. The logger provided when the FileResourceManager is instantiated will typically supply the information that allows a manual recovery from a "dirty" state.
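
The recovery decision described above can be sketched as follows. The article does not say how the library records transaction state on disk, so everything concrete here is an assumption made for illustration: the class TxRecovery, the commit.marker file that signals a commit had begun, and the dirty.marker file that flags the working directory. Only java.nio.file is used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical startup recovery: roll back, roll forward, or mark dirty. */
final class TxRecovery {
    static final String COMMIT_MARKER = "commit.marker"; // assumed: present once commit has begun
    static final String DIRTY_MARKER = "dirty.marker";   // assumed: flags an unrecoverable state

    private TxRecovery() {}

    /** Returns true if the working directory is clean after recovery. */
    static boolean recover(Path storeDir, Path workDir) throws IOException {
        if (Files.exists(workDir.resolve(DIRTY_MARKER))) {
            return false; // refuse further changes until someone resolves the issue
        }
        for (Path txDir : list(workDir).stream().filter(Files::isDirectory)
                .collect(Collectors.toList())) {
            try {
                boolean committing = Files.exists(txDir.resolve(COMMIT_MARKER));
                for (Path f : list(txDir)) {
                    if (f.getFileName().toString().equals(COMMIT_MARKER)) {
                        continue;
                    }
                    if (committing) {
                        // Roll forward: finish moving staged files into the store.
                        Files.move(f, storeDir.resolve(f.getFileName()),
                                StandardCopyOption.REPLACE_EXISTING);
                    } else {
                        // Roll back: discard staged files that were never committed.
                        Files.delete(f);
                    }
                }
                Files.deleteIfExists(txDir.resolve(COMMIT_MARKER));
                Files.delete(txDir);
            } catch (IOException e) {
                // Neither direction succeeded: mark the whole working directory dirty.
                Files.createFile(workDir.resolve(DIRTY_MARKER));
                return false;
            }
        }
        return true;
    }

    private static List<Path> list(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            return s.collect(Collectors.toList());
        }
    }
}
```

A caller would run recover once at startup and refuse to open new transactions when it returns false, leaving the repair to an operator.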

Even though Commons Transaction will not turn a file system into an XA-compliant resource, the blog author's opinion is that, if your applications need file system transactions, the library "...is probably better then any custom mechanism you can come up with."

Although there is still a long way to go before transactions within file systems are supported as well as they are in database environments, some implementations have started to emerge, and they are poised to provide more viable solutions for developers in this still problematic area.

File /and/ Database Transaction by Geoffrey Wiseman

Commons Transaction will be more appealing to me when it can co-operate with a database transaction:
issues.apache.org/jira/browse/TRANSACTION-13

An Encouraging Article by nitin verma

Hi Alexander,
Thanks for the interesting article. I had come across this around 15 months back and it has played an inspiring role for a project called XADisk (xadisk.dev.java.net/). XADisk has made into its first release 1.0 (after a beta release) last month.

Thanks,
Nitin

Re: File /and/ Database Transaction by nitin verma

Hi Geoffrey,
I believe you are talking about XA transactions. The project I was mentioning in my last comment (XADisk, xadisk.dev.java.net/) allows for involvement in an XA transaction. So, for example, you can have your database updates and file-system operations (with XADisk) inside the same XA transaction.

I hope that is relevant,
Nitin

Re: An Encouraging Article by nitin verma

Please note that the new URL for XADisk is xadisk.java.net/.
