Securing the Web with Decentralized Information Flow Control

The Computer Science Department of the University of Washington just published a talk by Max Krohn (MIT) on Securing the Web with Decentralized Information Flow Control.

In this talk, Max explains that he sees a computing shift happening right now, moving from desktop software to server-side software and cloud computing.

He notes, however, that:

Web software is buggy, attackers find and exploit these bugs. And as a result, data is stolen or corrupted.

Most people use dynamic languages, which do not lend themselves to static analysis, and they readily pull in third-party code and plugins. And let's face it, there is a lot of duct-taping going on because the web site needs to be up and running fast.

He defines an interesting metric to get a sense of how vulnerable a piece of software is likely to be: the number of lines of code (LOC) divided by the number of installs. The more widely a piece of software is installed (Linux, for example), the fewer vulnerabilities should be expected, since most of them would likely have been discovered and corrected already. He presents a couple of slides to illustrate his point, showing a web app measured in LOC and the same web app measured in LOC per install.
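As a rough illustration of the metric, here is a minimal Python sketch. The function name and the numbers are hypothetical and chosen only to show the arithmetic; they are not taken from the talk's slides.

```python
def loc_per_install(lines_of_code: int, installs: int) -> float:
    """Lines of code divided by number of installs.

    A higher value suggests more code that has seen little real-world
    exposure, and so, by the talk's argument, more undiscovered bugs.
    """
    if installs <= 0:
        raise ValueError("installs must be a positive number")
    return lines_of_code / installs


# Hypothetical figures, for illustration only.
widely_deployed = loc_per_install(lines_of_code=5_000_000, installs=10_000_000)
niche_web_app = loc_per_install(lines_of_code=50_000, installs=100)

print(widely_deployed)  # 0.5
print(niche_web_app)    # 500.0
```

Under these made-up numbers, the much smaller web app scores far worse than the large, widely installed system, which is the contrast the metric is meant to expose.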

The goal of his research is to define a security model for these new types of applications and architectures. The problem is becoming acute with applications such as Facebook, which allow developers to insert code into the platform and now even let third-party servers provide functionality within the Facebook platform.

To respond to these challenges, Max and his colleagues have developed Flume, an open source web application security infrastructure based on a Decentralized Information Flow Control (DIFC) model:

Decentralized Information Flow Control (DIFC) is an approach to security that allows application writers to control how data flows between the pieces of an application and the outside world.

When applied to privacy, DIFC allows untrusted software to compute with private data while trusted security code controls the release of that data. When applied to integrity, DIFC allows trusted code to protect untrusted software from unexpected malicious inputs.

They treat the server as a black box and track the data as the response to a request is being constructed. The security architecture is made up of a security gateway and an operating system library that tags data as it is used by the web application. The core concept is to centralize all security decisions in the gateway and prevent unwanted data access.

A typical Flume application consists of processes of two types. Untrusted processes do most of the computation. They are constrained by, but possibly unaware of, DIFC controls. Trusted processes, in contrast, are aware of DIFC and set up the privacy and integrity controls that constrain untrusted processes. Trusted processes also have the privilege to selectively violate classical information flow control—for instance, by declassifying private data (perhaps to export it from the system), or by endorsing data as high integrity.

The core of the system is based on a fairly simple set of rules to track data based on Tags and Labels.

A tag t carries no inherent meaning, but processes generally associate each tag with some category of secrecy or integrity. Tag b, for example, might label Bob’s private data. A label is a subset of the tag set.

A Flume process p can send data to process q if p's secrecy label is a subset of q's secrecy label (for integrity labels, the subset relation runs in the opposite direction). The Flume model assumes many processes running on the same machine and communicating via messages, or “flows”. The model’s goal is to track data flow by regulating both process communication and process label changes.

Fig 1. Communication Rule
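The communication rule and the idea of regulated label changes can be sketched in a few lines of Python. This is a toy model, not Flume's actual API: the `Process` class, the `can_add` and `can_drop` fields, and the function names are all hypothetical, and the integrity check and the label-change check reflect an assumption about how the rules are usually formulated in the Flume model rather than anything stated in the talk summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """Toy stand-in for a labeled process (hypothetical, not Flume's API)."""
    name: str
    secrecy: frozenset = frozenset()    # secrecy label: a set of tags
    integrity: frozenset = frozenset()  # integrity label: a set of tags
    can_add: frozenset = frozenset()    # tags this process may add (assumed)
    can_drop: frozenset = frozenset()   # tags this process may drop (assumed)


def can_send(p: Process, q: Process) -> bool:
    """p may send to q if q is at least as secret and no more trusted than p."""
    return p.secrecy <= q.secrecy and q.integrity <= p.integrity


def can_change_secrecy(p: Process, new_secrecy: frozenset) -> bool:
    """A label change is allowed only if p holds privileges for every tag
    it adds and every tag it drops (dropping a secrecy tag = declassifying)."""
    added = new_secrecy - p.secrecy
    dropped = p.secrecy - new_secrecy
    return added <= p.can_add and dropped <= p.can_drop


b = "b"  # tag associated with Bob's private data, as in the text
worker = Process("untrusted_worker", secrecy=frozenset({b}))
network = Process("network")  # empty labels: the outside world
declassifier = Process("declassifier", secrecy=frozenset({b}),
                       can_drop=frozenset({b}))

print(can_send(worker, network))       # False: Bob's data cannot leak out
print(can_send(network, worker))       # True: public data may flow in
print(can_send(worker, declassifier))  # True: both carry tag b
print(can_change_secrecy(worker, frozenset()))        # False: no privilege
print(can_change_secrecy(declassifier, frozenset()))  # True: may declassify
```

In this sketch the untrusted worker can compute on Bob's data but has no way to export it, while the trusted declassifier is the only process allowed to drop tag b and then talk to the network.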

Max points out that this concept is not new and has been around since the 80s.

The gateway is a key element of the security architecture. First, the web application does not need to know anything about the browser, since the gateway can enforce the policies. However, this central role also requires the introduction of a new abstraction: endpoints. Because the gateway needs to coordinate interactions with several systems (browser, authentication repository, web application...), it cannot expose a single set of labels to all of these processes. Endpoints help define specific combinations of labels dedicated to governing the communication between the gateway and one particular process.

The second part of the presentation focuses on a use case based on the MoinMoin wiki. Max shows that Flume tackles problems well beyond the known vulnerability types (buffer overruns, cross-site scripting and SQL injection). He demonstrated that MoinMoin had a bug in its calendar functionality that let all users see some calendar items intended to be restricted to a particular group. Flume was able to prevent that calendar content from being displayed simply by applying its standard policies.

Fig 2. System Call Delegation

Max concluded that there is still a lot of work to be done. The team wants to make the system flexible enough to work with third-party software uploaded into the web application. They are also working on enabling people to share data using the same principles, and they plan to extend the system's reach to the browser and bring JavaScript into the architecture. Max sees a large set of applications in the financial industry.

The development of connected systems will increasingly require end-to-end security solutions that prevent unwanted access to data by enforcing policies outside the comfort of an application's own code. What is your opinion? Have you been confronted with this kind of security issue yet? What solutions did you use?
