New Features and Performance Improvements for System.IO

Posted by Jonathan Allen on Oct 13, 2009

Sections: Development, Architecture & Design
Topics: .NET, Performance & Scalability, .NET Framework
Tags: File I/O, Memory-Mapped Files

Microsoft is planning some simple but very welcome performance improvements for the core System.IO functionality. These include convenience methods for reading and writing text-based files, significantly faster directory enumeration, and support for memory-mapped files.

The first improvement is a replacement for the convenience method File.ReadAllLines. For small files this is a perfectly acceptable function, but as the file size increases so do the problems. The fundamental flaw is that ReadAllLines does just what its name says: it pauses the program until the entire file has been read into an array of strings.

The replacement is File.ReadLines, which returns an enumerable sequence of strings (IEnumerable<string>). This will lazily read the file, just as if you used the lower-level stream objects. Also available are new overloads of File.WriteAllLines and File.AppendAllLines, both of which now accept an enumerable sequence instead of just an array.
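As a rough illustration of the lazy reading described above, the following C# sketch streams lines from one file to another without holding the whole file in memory. The class name LazyLinesSketch, the method name CopyMatchingLines, and its parameters are hypothetical and chosen only for this example; File.ReadLines and the File.WriteAllLines overload that takes IEnumerable<string> are the framework calls being shown.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyLinesSketch
{
    // Hypothetical helper: copies only the lines that contain `marker`.
    // File.ReadLines yields one line at a time, so the filter and the
    // writer pull lines through as they are read from disk.
    public static void CopyMatchingLines(string source, string target, string marker)
    {
        IEnumerable<string> matches =
            File.ReadLines(source).Where(line => line.Contains(marker));

        // This overload accepts any IEnumerable<string>, not just string[].
        File.WriteAllLines(target, matches);
    }
}
```

With File.ReadAllLines in place of File.ReadLines the result would be the same, but the entire source file would be loaded into an array before the first line was filtered.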

DirectoryInfo.GetFiles has the same array problem, but there lies an even more serious issue underneath. When retrieving a list of files, the Win32 API also returns basic information such as size and last modified date. Unfortunately this information is discarded by .NET instead of being passed to the FileInfo objects. So when the program starts to loop through the files, perhaps to determine the directory's overall size, it has to requery the file system for each file, one by one. What you end up with is a classic 1+N problem: one call to list the files, then N more to fetch their details. Both DirectoryInfo.GetFiles and the new DirectoryInfo.EnumerateFiles fix this problem.
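The directory-size scenario mentioned above might look something like the following sketch. The class DirectorySizeSketch and the method TotalBytes are hypothetical names used only for illustration; DirectoryInfo.EnumerateFiles and FileInfo.Length are the framework members involved.

```csharp
using System.IO;

public static class DirectorySizeSketch
{
    // Hypothetical helper: sums the sizes of the files directly inside
    // `directory` (subdirectories are not visited).
    public static long TotalBytes(string directory)
    {
        long total = 0;

        // EnumerateFiles yields FileInfo objects as the listing proceeds,
        // rather than building the complete array first.
        foreach (FileInfo file in new DirectoryInfo(directory).EnumerateFiles())
        {
            total += file.Length;
        }

        return total;
    }
}
```

Whether reading file.Length triggers a further file system query depends on the framework version in use; the article's point is that the improved implementation populates it from the data already returned by the directory listing.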

Another major performance boost for .NET is support for memory-mapped files. Memory-mapped files are an operating system feature that links a block of memory to a file. Once linked, you can read from and write to any part of the file as if it were nothing more than an array of unmanaged memory. The operating system handles important details like paging different parts of the file into and out of memory as needed. Memory-mapped files allow applications to work with incredibly large files, even in excess of a gigabyte, in a highly efficient manner.
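A minimal sketch of this kind of random access, assuming the System.IO.MemoryMappedFiles API in which this support ships, is shown below. The class MappedFileSketch and the method SwapByte are hypothetical; the file at `path` is assumed to exist already and to be longer than `offset` bytes.

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MappedFileSketch
{
    // Hypothetical helper: overwrites the byte at `offset` in an existing
    // file and returns the value that was there before.
    public static byte SwapByte(string path, long offset, byte newValue)
    {
        // Map the existing file; the mapping defaults to read/write access.
        using (var map = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        // Expose a one-byte window starting at `offset`.
        using (var view = map.CreateViewAccessor(offset, 1))
        {
            byte previous = view.ReadByte(0); // positions are relative to the view
            view.Write(0, newValue);
            return previous;
        }
    }
}
```

Only the pages backing the requested view need to be brought into memory, which is why this approach stays practical for very large files.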

In addition to raw file I/O, memory-mapped files provide a powerful means of communication between processes. If two applications open the same memory-mapped file, changes made by one application will be immediately visible to the other application.

Despite the name, memory-mapped files are not necessarily real files. They can also be purely in-memory objects with no backing store. While potentially useful within an application, these are particularly applicable to cross-process communication.
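The following sketch shows a memory-mapped file with no backing file, again assuming the System.IO.MemoryMappedFiles API. To stay self-contained it uses two views inside a single process rather than two processes; the class SharedMemorySketch and the method RoundTrip are hypothetical names.

```csharp
using System.IO.MemoryMappedFiles;

public static class SharedMemorySketch
{
    // Hypothetical helper: writes `value` through one view of an
    // in-memory map and reads it back through a second view.
    public static int RoundTrip(int value)
    {
        // A null map name creates a map that is not shared by name.
        // 1024 is an arbitrary capacity in bytes chosen for this example.
        using (var map = MemoryMappedFile.CreateNew(null, 1024))
        using (var writer = map.CreateViewAccessor())
        using (var reader = map.CreateViewAccessor())
        {
            writer.Write(0, value);      // store at byte position 0
            return reader.ReadInt32(0);  // the same bytes, seen through the other view
        }
    }
}
```

For communication between processes, one process would instead pass a map name to CreateNew and the other would call MemoryMappedFile.OpenExisting with the same name. Named maps are a Windows feature, so that variant is not portable to other platforms.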


Network I/O

by Alex Suvorov

I wish they had also improved network I/O; the current implementation consumes a lot of CPU and memory. The only alternative is the commercial XF.Server component (www.kodart.com), but I wish something like it were available in .NET for free.
