
How Quantifying Information Leakage Helps to Protect Systems


Information leakage happens when observable information can be correlated with a secret. According to Mireya Jurado, secrets such as passwords, medical diagnoses, locations, and financial data uphold a lot of our world, and many kinds of observable information, like error messages or electrical consumption patterns, can give hints about these secrets.

Quantitative Information Flow is an academic framework that can help us develop a better understanding of what information leakage is, how we can measure it, and how we can build systems with leakage in mind.

Mireya Jurado, a graduate researcher, spoke about quantifying information leakage at The Diana Initiative 2021.

Information is like water: it is hard to control, as Jurado explained:

Often, we hear about accidental information leakage since implementation mistakes can be subtle. Think about an app that alerts users if they tested positive for a disease. If the app only sends messages to users with a positive result, an attacker looking at the traffic could identify these users. But if the app sends dummy messages to all users, an eavesdropper cannot pinpoint who actually tested positive.
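The dummy-message mitigation described above can be sketched in a few lines. This is an illustration only: the function name, user IDs, and payload strings are made up, and a real system would encrypt the payloads so only the recipient can read them.

```python
# Sketch of the dummy-message idea: notify every user on every round,
# so an eavesdropper counting messages cannot tell who tested positive.
# All names here are illustrative, not from the talk.

def build_notifications(users, positive_ids):
    """Return one message per user; only the payload differs, never the traffic."""
    messages = []
    for user_id in users:
        if user_id in positive_ids:
            payload = "positive"   # real result; encrypted in a real system
        else:
            payload = "dummy"      # indistinguishable cover traffic
        messages.append((user_id, payload))
    return messages

msgs = build_notifications(["alice", "bob", "carol"], {"bob"})
# Every user receives exactly one message, so message counts leak nothing.
```

Because the number and timing of messages is identical for every user, the observable traffic pattern is decorrelated from the secret test result.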

Jurado mentioned that we also reveal information intentionally. We want clients to know about a patch even when the same patch could give an attacker knowledge about a vulnerability. We reveal sensitive information in exchange for services and efficiency. "I want the map to get me to where I want to go, even if it reveals my location," Jurado said.

In many practical situations, information leakage is unavoidable. If election results show that all votes were unanimous, then the tally reveals how everyone voted, Jurado mentioned. Even your password checker on your laptop will tell you whether your guess was right or not, which helps you narrow down your next guess.

Quantitative Information Flow (QIF) is a framework for modeling information flow and calculating leakage, by recognizing that some leaks are more damaging than others, and conversely, that some are acceptable. Jurado mentioned that QIF models a system as a channel that takes in a secret input and produces an observable output:

Modeling a system means providing a mathematical description of it, so we can reason about it in a formal and quantitative way, and prove security properties in a rigorous manner. For QIF, the properties are typically probabilistic in nature. For instance, QIF can give the probability that an attacker guesses a correct password.
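Using the standard QIF notion of Bayes vulnerability (the probability that an adversary's single best guess is correct), the channel model can be sketched concretely. The secrets, outputs, and probabilities below are invented for illustration; the two helper functions implement the textbook definitions, not code from the talk.

```python
# Toy QIF model: a channel maps each secret to an observable output
# with some probability. Prior Bayes vulnerability is the attacker's
# best single-guess success before observing the output; posterior
# vulnerability is their success after observing it.

prior = {"s1": 0.5, "s2": 0.25, "s3": 0.25}   # distribution on secrets

# Channel matrix: P(output | secret). Each row sums to 1.
channel = {
    "s1": {"fast": 1.0, "slow": 0.0},
    "s2": {"fast": 0.0, "slow": 1.0},
    "s3": {"fast": 0.0, "slow": 1.0},
}

def prior_vulnerability(prior):
    # Best blind guess: pick the most likely secret.
    return max(prior.values())

def posterior_vulnerability(prior, channel):
    # For each output, the attacker guesses the secret that maximizes
    # the joint probability; sum those best cases over all outputs.
    outputs = {y for row in channel.values() for y in row}
    return sum(max(prior[x] * channel[x][y] for x in prior) for y in outputs)

pv = prior_vulnerability(prior)               # 0.5
qv = posterior_vulnerability(prior, channel)  # 0.75
```

Here observing a "fast" output pins the secret down to s1, so the attacker's success probability rises from 0.5 to 0.75: that gap is the leakage.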

QIF can help to design better, i.e. more secure, systems, Jurado said. Once the source and the amount of leakage have been identified, the leakage can be stopped or mitigated. The ability to measure the leakage at each stage of the repair can guide the process, pointing in the right direction.

InfoQ interviewed Mireya Jurado about quantifying information leakage.

InfoQ: How does quantitative information flow work? Can you provide some examples?

Mireya Jurado: For example, we want to make informed policy decisions based on our statistical databases, but we do not want to harm people’s privacy. QIF can help us navigate these decisions.

With QIF, we can model an encryption program as a channel that takes in a classified message and outputs the amount of time it took to encrypt it. There is a correlation here between secret and output that could help an attacker guess the message.

Incorporating the adversary’s capabilities and goals, QIF then asks how valuable that observable information is to an adversary. Under QIF, we do not have to trust that our intuition about information leakage is correct. We may feel like a program taking longer to process one secret message versus another is bad somehow, but with QIF, we can get a precise understanding of how bad.

InfoQ: How can we measure and calculate information leakage?

Jurado: We can measure information leakage by isolating the effect of the channel, or system, that is revealing sensitive information. To measure leakage, we compare the prior vulnerability, when an adversary only knows the distribution on the secrets, to the posterior vulnerability, when they know the distribution on secrets and can observe the channel output. You can think about the prior vulnerability as how well an attacker could do when they are at their desk and planning their attack, while the posterior vulnerability is how well they could do when they launch their attack and can see what is going on. By comparing these values, we can judge whether the leakage is acceptable or not, before any leakage actually happens.

For example, if an attacker can correctly guess your passcode 2% of the time when they are sitting at their desk, but can guess correctly 20% of the time when they can see numbers worn off the keypad, the vulnerability increases by 18 percentage points, a 10-fold jump! This is the leakage: 18 percentage points is the additive leakage, representing the absolute difference, and 10 is the multiplicative leakage, representing the relative difference. Either way, a 1 in 5 chance of attacker success is pretty bad. Replace your old keypads.
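The keypad example reduces to two lines of arithmetic, shown here as a minimal sketch with the figures from the example:

```python
# Worn-keypad example: prior success 2%, posterior success 20%.
prior_vuln = 0.02      # attacker guessing blind from their desk
posterior_vuln = 0.20  # attacker who can see the worn keys

additive_leakage = posterior_vuln - prior_vuln        # 0.18
multiplicative_leakage = posterior_vuln / prior_vuln  # 10.0
```

The additive measure answers "how much more likely is a successful attack?"; the multiplicative measure answers "how many times more likely?". Which one matters depends on the stakes of the secret.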

To compute exact leakage, QIF uses a specific attacker’s knowledge, but it can also calculate bounds or limits on leakage regardless of the attacker, their abilities, or their knowledge. This means that QIF is powerful enough to make assessments of future leaks, even in unknown contexts.

InfoQ: What’s your advice to architects for addressing information leakage?

Jurado: The first and most important step is to identify the high value secrets that your system is protecting. Not all assets need the same degree of protection. The next step is to identify observable information that could be correlated to your secret. Try to be as comprehensive as possible, considering time, electrical output, cache states, and error messages.

Once you have identified what an attacker could observe, a good preventative measure is to disassociate this observable information from your sensitive information. For example, if you notice that a program processing some sensitive information takes longer with one input than another, you can take steps to standardize the processing time. You do not want to give an attacker any hints.
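One simple way to standardize processing time, sketched below under the assumption that a fixed deadline can be set longer than the slowest input, is to pad every call out to that deadline. This is an illustration of the idea, not Jurado's prescription, and it only blunts coarse timing observation rather than giving a full constant-time guarantee:

```python
import time

def process_with_fixed_time(func, arg, deadline_s=0.05):
    """Run func(arg), then sleep until a fixed deadline has elapsed,
    so the observable duration is (nearly) independent of the input."""
    start = time.monotonic()
    result = func(arg)
    remaining = deadline_s - (time.monotonic() - start)
    if remaining < 0:
        # The deadline itself would leak; it must cover the slowest input.
        raise RuntimeError("deadline too short; raise it above the worst case")
    time.sleep(remaining)
    return result
```

The trade-off is throughput: every call now costs as much as the worst case, which is the price of removing the timing hint.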

Next, I suggest threat modeling. Identify the goals, abilities, and rewards of possible attackers. Establishing what your adversary considers "success" could inform your system design. Finally, depending on your resources, you can approximate the distribution of your secrets. Assume that an attacker knows the distribution of whatever you are trying to protect, whether passwords or salaries or PII. Given that assumption, consider how that information could help an attacker. For example, if an attacker knows that some passwords are more common than the rest, take steps to avoid those passwords. If an attacker knows the most common last names in the United States, then maybe you should not rely on your user’s mother’s maiden name for authentication.
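The advice about common passwords can be sketched as a simple blocklist check. The list below is a tiny stand-in; a real deployment would screen against large breach corpora, and the length threshold here is an arbitrary illustrative choice:

```python
# Illustrative screening against passwords an attacker would guess first.
# COMMON_PASSWORDS is a placeholder; use a real breach corpus in practice.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "111111", "letmein"}

def is_acceptable(password: str) -> bool:
    """Reject blocklisted passwords and very short ones."""
    return password.lower() not in COMMON_PASSWORDS and len(password) >= 12
```

The underlying QIF intuition: if the attacker knows the distribution of passwords, forbidding its most probable entries directly lowers their prior vulnerability.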
