
A Detailed Look at The New File API in Java 7


Purpose of new file package

Java 7 introduced a number of useful features to the language, including a new I/O file package which offers finer-grained control over file system functionality than was possible with the older java.io package, particularly for POSIX-based systems. This article will first introduce the new API, and then explore it in more detail using the example of a web-based file manager project called WebFolder. The project provides a mechanism for managing file systems on remote computers, supporting operations such as navigating the file system and inspecting, renaming, copying or deleting files. Using the new I/O file API allows us to extend the project's capabilities to manipulate the content of ZIP archives and to watch for modifications. It can be freely downloaded from http://webfolder.sf.net.

Although the basic file manipulation API did get some updates between versions, for Java 7 the Java team decided to provide an alternative package with a fresh design, to cover file system operations in a new way.

The base file manipulation API resides in the package java.nio.file, with two sub-packages, java.nio.file.attribute and java.nio.file.spi. The new API separates file-related operations from the java.io package and also provides additional methods with the aim of making the management of file systems more straightforward. Conceptually, the new API is built as a set of entity interfaces covering the base objects of a file system, and operational classes covering operations on the file system itself. The concept was inherited from the java.util package, where classes such as Collections and Arrays provide many operations on basic aggregation data structures, such as collections and arrays respectively. The new package contains differently named base classes and interfaces to avoid any confusion, especially when the java.io and java.nio.file packages are used together.

The new package doesn't just reorganize the classes that support file and file system operations; it also extends the capabilities of the API, for example by providing a simpler means to copy and move files.

Comparison of the existing and new file-operation classes

The following overview maps the base classes and interfaces from the old packages (java.io and javax.swing.filechooser) to their Java 7 java.nio.file counterparts:

  • File → Path and Files. Whilst File provided both the file location and file system operations, the new API splits this in two. Path provides just a file location and supports additional path-related operations, whilst Files supports file manipulation with a number of new functions not available in File, like copying or reading the content of an entire file, or setting an owner.
  • FileSystemView → FileSystem. FileSystemView provides a view of the underlying file system, and is only used in the context of the Swing file chooser. FileSystem can represent different file systems defined locally or remotely, as well as file systems built on top of alternative storage mechanisms, such as ISO images or ZIP archives. FileSystem includes factories that provide concrete implementations of different interfaces, such as Path.
  • No analog → FileStore. Represents some attributes of file storage, for example total size. It can be retrieved from a particular Path or from a FileSystem.
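To make the File → Path/Files split concrete, here is a minimal sketch (the class name and file names are mine, not from the article): Path only describes a location, while all actual I/O goes through the Files utility class.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PathVsFile {
    public static void main(String[] args) throws IOException {
        // A Path only describes a location; it performs no I/O by itself
        Path tmp = Files.createTempDirectory("demo");
        Path file = tmp.resolve("notes.txt");

        // All file system operations live in the Files utility class
        Files.write(file, "hello".getBytes());
        System.out.println(Files.exists(file));   // true
        System.out.println(Files.size(file));     // 5

        // Path itself offers purely syntactic operations
        System.out.println(file.getFileName());   // notes.txt

        Files.delete(file);
        Files.delete(tmp);
    }
}
```

Note that nothing touches the disk until a Files method is called; resolving or relativizing paths is pure string manipulation.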

As well as reorganizing objects and operations, the new file system API is able to exploit relatively new Java language features, such as autoboxing, in most methods and constructors, and the new API is cleaner and easier to use as a result.

In the sections that follow we’ll look in more detail at particular improvements. 

File system traversal and group operations

The new file package introduces a new way of traversing a file system, which is intended to be more memory efficient than the older array and filter based version. In addition, the new approach makes it possible to traverse the file system in depth. The new implementation makes use of the Visitor design pattern. Whilst you could mimic the Visitor pattern using a filter with a regular File based traversal operation, it is much harder to provide a simple and memory efficient algorithm for a multi-level traversal.

The Visitor pattern is introduced via the FileVisitor interface. Since the interface is generic, you might expect to be able to use it for traversing the file system with a File based implementation; however, the new I/O file implementation requires it to be used with Path derived objects only. The interface declares four methods, and SimpleFileVisitor provides a default implementation you can inherit from, overriding just the methods you need for a given use case. The overview below summarizes the FileVisitor methods and their default behavior in SimpleFileVisitor:

  • visitFile - called for every traversed ordinary file, including symbolic links, unless filtering is defined. Any meaningful file-related operation can be performed here, for example taking a backup or searching for something in the file, and a decision to stop or continue traversing can be made. The method isn't called for directories. Default: returns CONTINUE.
  • preVisitDirectory - if the visited item is a directory rather than a file, this method is called instead of visitFile. It allows skipping the traversal of a particular directory, or creating a corresponding directory at a target location during copy operations. Default: returns CONTINUE.
  • postVisitDirectory - called when traversal of an entire directory has completed; a convenient place to finalize an operation on the directory. For example, if the purpose of the traversal was to delete all of its files, the directory itself can be deleted in this method. Default: returns CONTINUE.
  • visitFileFailed - called if an unhandled exception occurs during traversal of the file system. If the exception is re-thrown, all traversing stops and the exception propagates to the code that initiated the traversal via Files.walkFileTree. The exception can be analyzed here and a decision made to continue traversing. Default: re-throws the IOException.

As you can see it is a very powerful interface, supporting most conventional use cases on a file system, including archiving, searching, backing up or deleting files. The exception handling is also very flexible. However, if you simply need the content of a directory without traversing in depth, and you were comfortable with the old File.list() operation, a similar feature is available from the new Files class as well, though it returns an iterable DirectoryStream rather than a plain array.
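The visitor callbacks above can be exercised with a few lines of code. The following sketch (directory layout and class name are my own illustration) counts files and directories in a small temporary tree, using only two of the four methods:

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class VisitorDemo {
    public static void main(String[] args) throws IOException {
        // Build a tiny tree: root/a/b/leaf.txt
        Path root = Files.createTempDirectory("walk");
        Files.createDirectories(root.resolve("a/b"));
        Files.createFile(root.resolve("a/b/leaf.txt"));

        final int[] files = {0};
        final int[] dirs = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path f, BasicFileAttributes attrs) {
                files[0]++;                       // ordinary files only, never directories
                return FileVisitResult.CONTINUE;
            }
            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException e) {
                dirs[0]++;                        // called once per completed directory
                return FileVisitResult.CONTINUE;
            }
        });
        System.out.println(files[0] + " files, " + dirs[0] + " dirs");
        // prints: 1 files, 3 dirs  (root, a and b)
    }
}
```

Unimplemented methods fall back to SimpleFileVisitor's defaults, so only the callbacks relevant to the task need to be written.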

New features not found in java.io

Although the file system traversal and group operations offered by the new I/O file package are really useful, similar things can be achieved with standard java.io. In addition, however, the new package provides OS-specific capabilities that aren't supported by the older one. One important example of such functionality is working with hard and symbolic links, which can now be created and/or processed in any file system traversal operation. This works only for file systems that support them - in other cases an UnsupportedOperationException will be thrown. Another extension relates to managing file attributes such as owner and permissions. Again, if the underlying file system doesn't support them, either an IOException or an UnsupportedOperationException will be thrown. The overview below covers the link and extended file attribute operations; all of them can be requested from the Files class.

  • createLink - creates a hard link mapped to a certain file.
  • createSymbolicLink - creates a symbolic link mapped to a file or a directory.
  • getFileAttributeView - accesses file attributes via a file-system-specific implementation of FileAttributeView. Although this method gives the flexibility of a non-predefined set of attributes, it requires the use of an underlying implementation-specific class and, as a result, limits the portability of code.
  • getOwner - gets the owner of a file; works only on file systems supporting the owner attribute.
  • getPosixFilePermissions - gets file permissions; POSIX-system specific.
  • isSymbolicLink - indicates whether a given path is a symbolic link; file system specific.
  • readSymbolicLink - reads the target path of a symbolic link; file system specific.
  • readAttributes - reads file attributes; two variants of the method return the attributes in different forms.
  • setAttribute - sets a file attribute; the attribute name may include a FileAttributeView qualifier.

Refer to the new I/O file API documentation when you plan to use the operations listed above.
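Since several of these operations throw UnsupportedOperationException on file systems that lack the feature, portable code has to guard the calls. A small sketch of that pattern (class name is mine; the POSIX branch only succeeds where the file system supports it):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

public class AttrDemo {
    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("attr", ".txt");

        // Portable: basic attributes are supported on every file system
        BasicFileAttributes basic = Files.readAttributes(f, BasicFileAttributes.class);
        System.out.println(basic.isRegularFile());  // true

        // POSIX-specific: guard against unsupported file systems
        try {
            Set<PosixFilePermission> perms = Files.getPosixFilePermissions(f);
            System.out.println("permissions: " + perms);
        } catch (UnsupportedOperationException e) {
            System.out.println("POSIX permissions not supported here");
        }
        Files.delete(f);
    }
}
```

The readAttributes variant used here returns a typed view; the other variant returns a Map keyed by attribute name.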

Watches

The API also provides a watch mechanism, so the state of a particular file or directory can be monitored for events such as creation, modification or deletion. Unfortunately, there is no guaranteed push model for watch events, and in most cases a polling mechanism has to be used, which in my view makes an implementation less attractive. The watch service is also system dependent, so you can't build a truly portable application on top of it. Five interfaces cover the watching functionality; the overview below summarizes them and their usage.

  • Watchable - an object of this type can be registered with a watch service, yielding a watch key that can be used to monitor modification events. A concrete implementation of the interface has to be obtained to register interest in watch events for the object; note that Path is Watchable.
  • WatchService - a service, available from the file system, used to register Watchable objects and then obtain WatchKeys for monitoring modifications. A WatchService can be retrieved from a FileSystem object.
  • WatchKey - the registration receipt used for polling modification events. The object can be stored and then polled for events; it can also be taken directly from the WatchService when modification events are available for it.
  • WatchEvent - carries a watch event. A WatchEvent object is passed in the event notification call; the kind of event and the affected object's path can be retrieved from it.
  • WatchEvent.Kind - carries the watch event kind. Used to specify the particular event types of interest when registering a Watchable; it is also provided in each WatchEvent at notification.

I would emphasize two scenarios where you might use a watch service. The first is when you just need to monitor modifications to a particular object. In this case, a Watchable object is registered with a watch service to obtain a watch key, and the watch key is then used for polling modification events. Polling a watch key doesn't block, so even if no new events have occurred, an empty list of events can still be returned by a poll. You can introduce a delay between polls to reduce polling overhead, in exchange for losing some precision over when a notification event is observed.

The second scenario uses the watching mechanism of the watch service itself, and is suitable for polling modification events related to multiple watchable objects. As in the first scenario, you register all the watchables, but the returned watch keys can be ignored. Instead of polling a watch key, the service itself is polled to retrieve the watch keys for which modification events have fired, and the events are then processed by polling each returned watch key. In this case a watch key is guaranteed to have some events assigned, and a single thread can be used to manage all watch keys. The watch service polling mechanism is more flexible, since it supports blocking, non-blocking and blocking-with-timeout operations; as a result, it can also be more accurate. We will see an example of this second scenario later on, since the aforementioned WebFolder project uses it.

Utility operations

The next central feature of the new I/O file package is its set of utility methods, which make the package self-sufficient: for the majority of use cases, no additional functionality from the standard java.io package is required. Input streams, output streams, and byte channels can be obtained directly from methods of Files. Complete operations, such as copying or moving files, are supported through the API. In addition, the entire content of a file can now be read as a list of Strings, or an array of bytes. Note, however, that there is no parameter to limit how much is read, so a check of the file size should be added to avoid possible memory problems.
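A short sketch of the utility methods in action (file names and class name are mine). It shows a whole-file read and a one-call copy, with the in-memory caveat noted in a comment:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class UtilDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("util");
        Path src = dir.resolve("src.txt");
        Files.write(src, "line1\nline2\n".getBytes(StandardCharsets.UTF_8));

        // Whole-file reads are convenient, but readAllLines/readAllBytes
        // load the entire content into memory - check the size first for big files
        List<String> lines = Files.readAllLines(src, StandardCharsets.UTF_8);
        System.out.println(lines.size());        // 2

        // Complete copy operations, no manual stream loop needed
        Path dst = dir.resolve("dst.txt");
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        System.out.println(Files.size(dst));     // 12
    }
}
```

Streams and channels for larger files are available from the same class via Files.newInputStream, Files.newOutputStream and Files.newByteChannel.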

More about new I/O file organization

Finally, the file system and its storage are an essential part of the new I/O file package. As we've seen, a key element of the package is that a file location is represented by the Path interface. You obtain a concrete implementation of that interface from a FileSystem factory, which in turn has to be obtained from the FileSystems factory.

Storage information (FileStore) can be retrieved either from a particular file (Path) or from the FileSystem.
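Both routes to storage information can be sketched in a few lines (class name is mine):

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StoreDemo {
    public static void main(String[] args) throws IOException {
        Path here = Paths.get(".");

        // Route 1: the store backing a particular path
        FileStore store = Files.getFileStore(here);
        System.out.println(store.getTotalSpace() > 0);   // true

        // Route 2: enumerate all stores of the file system
        for (FileStore s : FileSystems.getDefault().getFileStores()) {
            System.out.println(s.name() + " (" + s.type() + ")");
            break; // just show the first one
        }
    }
}
```

FileStore also exposes usable and unallocated space, and whether a given attribute view is supported on that store.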

Working With File Systems

All file system implementations are backed by corresponding providers, whose base classes are defined in the package java.nio.file.spi. The service provider concept allows a developer to easily extend coverage to new file systems. Some interesting file system providers are bundled; for example, one exposes the content of a ZIP file as a file system, allowing most of the standard operations such as traversing content, creating, deleting and modifying files. We'll see an example of this later on.

Concurrency and atomic operations

Our overview of the new I/O file package would be incomplete without mentioning that the implementation is highly aware of concurrency, and therefore most operations are safe in a concurrent environment. Moving a file can also be performed atomically. Working with directory content can be secured by obtaining a concrete implementation of the SecureDirectoryStream interface; in this case, all directory-related operations remain consistent even if the directory is moved or modified by an external attacker, since only relative paths are used.
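The atomic move deserves a small illustration (file names and class name are mine). The common use is publishing a file so that readers never observe a half-written state, with a fallback for file systems that don't support the option:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicMoveDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mv");
        Path tmp = dir.resolve("data.tmp");
        Path live = dir.resolve("data.txt");
        Files.write(tmp, "payload".getBytes());

        // Write to a temp name, then rename into place: readers of 'live'
        // see either the old state or the complete new file, never a partial one
        try {
            Files.move(tmp, live, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback where the file system can't do it atomically
            Files.move(tmp, live, StandardCopyOption.REPLACE_EXISTING);
        }
        System.out.println(Files.exists(live) && !Files.exists(tmp));  // true
    }
}
```

Note that ATOMIC_MOVE typically only works within a single file store; across stores the exception above is thrown.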

Real world example

The best way to learn new stuff is real coding. The above-mentioned WebFolder web-based file manager was initially developed using java.io, so I decided to migrate the existing project to the new I/O file package. It helped me better understand the concepts, and also to evaluate the API for use in other, more serious projects. I've deliberately kept the example code small here, but the complete source code can be downloaded from the project web page.

1. To obtain the content of one directory

try (RequestTransalated rt = translateReq(getConfigValue("TOPFOLDER", File.separator),
                                          req.getPathInfo());
     DirectoryStream<Path> stream = Files.newDirectoryStream(rt.transPath)) {
    for (Path entry : stream) {
        result.add(new Webfile(entry, rt.reqPath)); // adding directory element info to the model
    }
} catch (Exception ioe) {
    log("", ioe);
} // no finally block is needed, since the API supports AutoCloseable and the new try-with-resources syntax

This example populates a directory model to be rendered by a web page view. Files.newDirectoryStream is used to obtain the directory content iterator.

2. Traverse in depth

Path ffrom = ….
Path fto = ….

Files.walkFileTree(ffrom, EnumSet.of(FileVisitOption.FOLLOW_LINKS), Integer.MAX_VALUE,
        new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                Path targetdir = fto.resolve(fto.getFileSystem()
                        .getPath(ffrom.relativize(dir).toString()));
                try {
                    Files.copy(dir, targetdir, StandardCopyOption.COPY_ATTRIBUTES);
                } catch (FileAlreadyExistsException e) {
                    if (!Files.isDirectory(targetdir))
                        throw e;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Path targetfile = fto.resolve(fto.getFileSystem()
                        .getPath(ffrom.relativize(file).toString()));
                Files.copy(file, targetfile, StandardCopyOption.COPY_ATTRIBUTES);
                return FileVisitResult.CONTINUE;
            }
        });

This code copies the content of a directory to another location on the file system.  preVisitDirectory takes care of copying the directory itself. Since the target can be another file system, the example is a convenient way to extract the entire content of the ZIP archive whilst preserving the directory structure, or to place a directory structure in a ZIP archive. The COPY_ATTRIBUTES option preserves all source file attributes, including timestamp, in the target file.

A similar implementation can be used for deleting the entire content of a directory, in this case the method postVisitDirectory has to be implemented instead of preVisitDirectory to delete the directory itself after deleting its content.

@Override
public FileVisitResult postVisitDirectory(Path dir, IOException e) throws IOException {
    if (e == null) {
        if (dir.getParent() != null) {
            Files.delete(dir);
            return FileVisitResult.CONTINUE;
        } else
            return FileVisitResult.TERMINATE;
    } else {
        // directory iteration failed
        throw e;
    }
}

This example checks to make sure the target directory isn’t at root level before deleting it. All possible exceptions are propagated up for processing by a caller.

3. File system from ZIP

FileSystem fs = FileSystems.newFileSystem(zipPath, null);

Path zipRootPath = fs.getPath(fs.getSeparator());

….

fs.close();

zipRootPath can be used to initiate traversal of the content of a ZIP file for any purpose. The obtained file system is fully functional, and most operations, including copy, move and delete, will work. However, the watch service isn't available for the ZIP file system. Note also that the file system has to be closed after use. If you open another file system on the same ZIP file, you may observe operations failing, so write your code with this possibility in mind. Closing the default file system isn't required, however; it appears that the new I/O file package maintains just one instance of it and takes care of concurrency.
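Since the ZIP file system has to be closed after use, try-with-resources is a natural fit. The sketch below (file names and class name are mine) creates a fresh archive via the provider's "create" property, writes an entry, and re-opens the archive to verify it:

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ZipFsDemo {
    public static void main(String[] args) throws IOException {
        Path zip = Files.createTempDirectory("zipdemo").resolve("demo.zip");

        // The "create" property asks the ZIP provider to build a fresh archive
        Map<String, String> env = new HashMap<String, String>();
        env.put("create", "true");
        URI uri = URI.create("jar:" + zip.toUri());

        // try-with-resources guarantees the archive is flushed and closed
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            Files.write(fs.getPath("/hello.txt"), "hi".getBytes());
        }

        // Re-open the existing archive and check the entry is there
        try (FileSystem fs = FileSystems.newFileSystem(zip, null)) {
            System.out.println(Files.exists(fs.getPath("/hello.txt")));  // true
        }
    }
}
```

Forgetting to close the ZIP file system can leave entries unwritten, since the provider may only flush the archive on close.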

4. Watch

There are several approaches to using the watch service, so here is an illustration of the two most common ones, already mentioned earlier.

WatchService ws = dir.getFileSystem().newWatchService();

WatchKey wk = dir.register(ws, StandardWatchEventKinds.ENTRY_CREATE,
        StandardWatchEventKinds.ENTRY_DELETE, StandardWatchEventKinds.ENTRY_MODIFY);

After obtaining a watch key, it can be passed to a watching thread to monitor the relevant events:

@Override
public void run() {
    for (;;) {
        if (watchKey != null) {
            for (WatchEvent<?> event : watchKey.pollEvents()) {
                updateScreen(event.kind(), event.context());
            }
            boolean valid = watchKey.reset();
            if (!valid) {
                break;
            }
        }
    }
}

If events can't be consumed fast enough, an event of kind OVERFLOW will be received. A watch key can be cancelled when there is no further interest in its events, and a watch service can be closed after use. Another approach uses a watch service method to poll modification events when multiple watchables are registered with it. This approach is more applicable to the WebFolder application.

public void run() {
    for (;;) {
        try {
            WatchKey watchKey = watchService.take(); // a timed poll can be used instead
            processWatchKey(watchKey);
        } catch (InterruptedException ie) {
            break;
        }
    }
}

One watch service is obtained for the default file system and then used in a single monitoring thread. The take operation is used, and since it blocks, there are no wasteful loops. The method processWatchKey has an implementation similar to the one shown above, polling the events associated with the watch key. However, there is no extra loop needed for this, since a key obtained from the watch service already has associated events.

Recap

The new I/O file package provides:

1. A powerful file system traversal mechanism that enables complex group operations.

2. Manipulation of system-specific file and file system objects and their attributes, such as links, owners and permissions.

3. Convenient utility methods for operating on entire file content, such as read, copy and move.

4. A watch service for monitoring file system modifications.

5. Atomic operations on the file system, providing synchronization of processes against the file system.

6. Custom file systems defined on top of a particular file organization, such as archives.

Migration

There are four reasons why you might consider migrating a system based on the older I/O package to the newer one:

  • You are observing memory problems with a complex file traversing implementation
  • You need to support file operations in ZIP archives
  • You need fine grained control over file attributes in POSIX systems
  • You need watch services.

As a rule of thumb, if two or more of these items apply to your project then migration may be worthwhile; otherwise I would recommend staying with the current implementation. One reason for not moving is that the new I/O file implementation won't make your code more compact or more readable. Another is that the new file traversal operation can be a little sluggish on first access in certain runtime implementations: the Oracle implementation for Windows appears to do a lot of caching, which takes a noticeable amount of time on first access, whereas the OpenJDK (IcedTea) implementation on Linux didn't show this problem, so the issue appears to be platform/implementation specific.

If you do decide to migrate, the tips below may help:

  • fileObj = new File(new File(pe1, pe2), pe3) → pathObj = fsObj.getPath(pe1, pe2, pe3). fsObj can be obtained from FileSystems.getDefault(); since the file system is preserved in the Path itself, it can also be retrieved from any existing Path obtained from the same file system.
  • fileObj.someOperation() → Files.someOperation(pathObj). In most cases the operation name is the same, although additional parameters related to links and attributes can be supplied.
  • fileObj.listFiles() → Files.newDirectoryStream(pathObj). Files.walkFileTree should be used for traversing in depth.
  • new FileInputStream(file) → Files.newInputStream(pathObj). Additional options can specify how the file gets opened.
  • new FileOutputStream(file) → Files.newOutputStream(pathObj). Additional options can specify how the file gets opened.
  • new FileWriter(file) → Files.newBufferedWriter(pathObj, charset). A charset has to be supplied; additional options can specify how the file gets opened.
  • new FileReader(file) → Files.newBufferedReader(pathObj, charset). A charset has to be supplied.
  • new RandomAccessFile(file) → Files.newByteChannel(pathObj). Opening options and file creation attributes can be specified.
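During a gradual migration, the two APIs can coexist by converting in both directions. A small sketch (file content and class name are mine):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BridgeDemo {
    public static void main(String[] args) throws IOException {
        // Legacy code hands us a File...
        File legacy = File.createTempFile("bridge", ".txt");

        // ...bridge it to the new API for one operation...
        Path p = legacy.toPath();
        Files.write(p, "migrated".getBytes());

        // ...and back again where old APIs are still required
        File roundTrip = p.toFile();
        System.out.println(roundTrip.length());  // 8
        Files.delete(p);
    }
}
```

This lets individual call sites be migrated one at a time, rather than forcing a whole-module rewrite.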

The class File and the interface Path provide two ways of converting between them - pathObj.toFile() and fileObj.toPath(). This can reduce the effort of migration, letting you focus only on the new functionality offered by the new I/O file package. As part of the migration process, replacing a custom file copying implementation with Files.copy can be considered. The Path interface itself provides many convenient methods which can reduce the coding previously required with File objects. Since the new code will run under Java 7 and up, it is also worth improving exception handling and resource releasing. The code below demonstrates the old and new mechanisms:

ClosableResource resource = null;

try {
    resource = new ClosableResource(…);
    // resource processing
} catch (Exception e) {
} finally {
    if (resource != null)
        try {
            resource.close();
        } catch (Exception e) {
        }
}

can be replaced by the more compact:

try (ClosableResource resource = new ClosableResource(…)) {
    // resource processing
} catch (Exception e) {
}

The resource has to implement the AutoCloseable interface; all standard resource classes shipped with the Java runtime are AutoCloseable.

About the Author

Dmitriy Rogatkin is the CTO of WikiOrgCharts, where he is responsible for the technological direction of the company. Previously he worked on technologies primarily targeting enterprise software: he was chief architect for over 10 years at MetricStream, a leading company in enterprise GRC software. He likes testing out different ideas through the creation of open source software, ranging from multimedia desktop applications to frameworks and application servers. Amongst his projects are TJWS, a tiny application server that is an alternative for when a full Java EE profile application server is too much overhead, and TravelsPal, which helps connect people whilst traveling and planning time away.
