Google Open-Sources Python Fuzz Testing Tool Atheris

Google recently open-sourced Atheris, a new fuzz testing engine for Python. Atheris looks for bugs in Python code and in native extensions, and supports Python 2.7 and Python 3.3+. It can be combined with the Address Sanitizer and Undefined Behavior Sanitizer tools, which detect memory corruption bugs and undefined behavior (e.g., buffer overflows, misaligned or null pointers).

Google explained the value that Atheris seeks to add to the current space of fuzz testing engines:

Fuzz testing is a well-known technique for uncovering programming errors. Many of these detectable errors have serious security implications. Google has found thousands of security vulnerabilities and other bugs using this technique. Fuzzing is traditionally used on native languages such as C or C++, but last year, we built a new Python fuzzing engine.

Atheris can be used on Python code (Python 2.7 and Python 3.3+, with Python 3.8+ strongly recommended for better code coverage support) and native extensions written for CPython. When fuzzing native code, Atheris can be used in combination with Clang’s Address Sanitizer or Undefined Behavior Sanitizer to catch extra bugs.

An example of Python code fuzzing is as follows:

import atheris
import sys

def TestOneInput(data):
    if data == b"bad":
        raise RuntimeError("Badness!")

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

TestOneInput is the fuzz target, the entry point that exercises the code under test. Atheris calls it repeatedly with automatically generated inputs until a crash or an uncaught exception occurs.

The function takes a single argument, data, which is a bytes object. Atheris provides a FuzzedDataProvider that turns those bytes into other input shapes (e.g., strings, lists, integers, floats, values within a range). Atheris can be used with the property-based testing tool Hypothesis to write fuzz harnesses and to shrink a failing input to a smaller, reproducible failure case. Hypothesis additionally provides advanced input-generation strategies (e.g., emails, dictionaries, dates, regular expressions) that complement Atheris' FuzzedDataProvider.
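A harness built on FuzzedDataProvider might look like the following sketch. The function under test, format_badge, and the wrapper run_fuzzer are hypothetical names introduced for illustration. ConsumeIntInRange and ConsumeUnicodeNoSurrogates are FuzzedDataProvider methods. Atheris is imported inside run_fuzzer only so that the helper can be loaded on a machine without Atheris installed.

```python
import sys


def format_badge(name: str, age: int) -> str:
    """Hypothetical function under test: renders a one-line badge."""
    if not 0 <= age <= 150:
        raise ValueError("age out of range")
    return f"{name.strip() or 'anonymous'} ({age})"


def run_fuzzer() -> None:
    """Start fuzzing format_badge with structured inputs (requires Atheris)."""
    # Imported lazily so format_badge stays usable without Atheris installed.
    import atheris

    def TestOneInput(data: bytes) -> None:
        # Carve typed values out of the raw bytes chosen by the fuzzer.
        fdp = atheris.FuzzedDataProvider(data)
        age = fdp.ConsumeIntInRange(0, 150)        # integer within [0, 150]
        name = fdp.ConsumeUnicodeNoSurrogates(32)  # string of up to 32 characters
        format_badge(name, age)

    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling run_fuzzer() from a script starts the fuzzing loop. Because the age is drawn from the valid range, any exception that escapes format_badge would point to a genuine defect rather than to a rejected input.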

Atheris is a coverage-guided (grey-box) fuzzing engine. Atheris leverages libFuzzer, from the LLVM project, to collect coverage information about the code under test. It then mutates inputs to try to reach code that previous inputs did not cover.
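The feedback loop behind coverage-guided fuzzing can be illustrated with a toy fuzzer in plain Python. This is a sketch for intuition only, not how Atheris or libFuzzer are implemented. The names target, run_with_coverage, mutate, and fuzz are hypothetical, and coverage is approximated by the set of executed lines (collected with sys.settrace). The target splits the "bad" check into nested byte comparisons so that each correct byte yields new coverage.

```python
import random
import sys


def target(data: bytes) -> None:
    """Toy code under test: each matching byte reaches one more line."""
    if len(data) > 0 and data[0] == ord("b"):
        if len(data) > 1 and data[1] == ord("a"):
            if len(data) > 2 and data[2] == ord("d"):
                raise RuntimeError("Badness!")


def run_with_coverage(func, data):
    """Run func(data); return (set of executed lines, exception or None)."""
    covered = set()

    def tracer(frame, event, arg):
        if event == "line":
            covered.add((frame.f_code.co_name, frame.f_lineno))
        return tracer

    previous = sys.gettrace()
    error = None
    sys.settrace(tracer)
    try:
        func(data)
    except Exception as exc:
        error = exc
    finally:
        sys.settrace(previous)
    return covered, error


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random edit: insert, replace, or delete a byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(func, max_iterations=200_000, seed=0):
    """Return the first input that makes func raise, or None if none is found."""
    rng = random.Random(seed)
    corpus = [b""]
    seen = set()
    for _ in range(max_iterations):
        candidate = mutate(rng.choice(corpus), rng)
        covered, error = run_with_coverage(func, candidate)
        if error is not None:
            return candidate
        if not covered <= seen:
            # The candidate reached new lines, so keep it as a seed.
            seen |= covered
            corpus.append(candidate)
    return None
```

Calling fuzz(target) returns an input that starts with b"bad". Keeping only inputs that reach new lines is what lets the search advance one byte at a time instead of having to guess all three bytes at once.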

While Atheris takes care of test case generation and test execution, the onus is on the programmer to recognize erroneous behavior in the function under test. One way to do so is differential fuzzing, which uses a cross-referencing oracle. As with more general metamorphic testing methods, two implementations of the same specification are run on the same input, and any difference in their results is singled out for analysis.
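A differential harness can be sketched as follows. The pairing is an assumption chosen for illustration: reference_crc32 is a hypothetical hand-written, bit-by-bit CRC-32, and it is cross-checked against zlib.crc32 from the standard library. Any disagreement raises an exception, which a fuzzer would report as a finding.

```python
import zlib


def reference_crc32(data: bytes) -> int:
    """Hypothetical reference implementation: bitwise CRC-32 (reflected)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def TestOneInput(data: bytes) -> None:
    """Fuzz target: both implementations must agree on every input."""
    expected = reference_crc32(data)
    actual = zlib.crc32(data)
    if expected != actual:
        raise AssertionError(
            f"CRC mismatch for {data!r}: {expected:#010x} != {actual:#010x}"
        )

# To fuzz with Atheris, pass TestOneInput to atheris.Setup(sys.argv, TestOneInput)
# and then call atheris.Fuzz(), as in the article's example.
```

The harness needs no knowledge of the correct checksum for any given input. The second implementation serves as the oracle.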

If no test oracle or metamorphic property is available, programmers can still use fuzzing to detect malfunctions, namely when the function under test raises an unexpected exception or trips a fault-detection mechanism (e.g., an assertion or a sanitizer).

Google gives the following example of the usefulness of checking for unexpected exceptions:

As an example, the one YAML parsing library we tested Atheris on says that it will only raise YAMLErrors; however, yaml_fuzzer.py detects numerous other exceptions, such as ValueError from trying to interpret “-_” as an integer, or TypeError from trying to use a list as a key in a dict. (Bug report.) This indicates flaws in the parser.
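The same pattern can be sketched with the standard library's json module standing in for the YAML library (a substitution made here so the example needs no third-party parser). The harness swallows only json.JSONDecodeError, which it treats as the expected failure mode. Any other exception propagates to the fuzzer and is reported.

```python
import json


def TestOneInput(data: bytes) -> None:
    """Fuzz target: only the expected parse error is tolerated."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # Not valid UTF-8 text, so there is nothing to parse.
    try:
        json.loads(text)
    except json.JSONDecodeError:
        pass  # Expected failure mode for malformed documents.
    # Any other exception escapes this function and is flagged by the fuzzer.

# To fuzz with Atheris, pass TestOneInput to atheris.Setup(sys.argv, TestOneInput)
# and then call atheris.Fuzz(), as in the article's example.
```

Narrowing the except clause to the expected exception type is the key design choice: a broad `except Exception` would hide exactly the defects this kind of harness is meant to surface.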

Differential fuzzing and fuzz testing are powerful automated testing techniques that have found many bugs in existing software, including C compilers, Java decompilers, and antivirus software. Nick Fitzgerald recently explained in an InfoQ interview how generative testing can find bugs that are not easy to detect with other methods:

There’s a ton that we miss with basic unit testing, where we write out some fixed set of inputs and assert that our program produces the expected output. We overlook some code paths or we fail to exercise certain program states. […]

Testing pseudo-random inputs helps us avoid our own biases by feeding our system “unexpected” inputs. It helps us find integer overflow bugs or pathological inputs that allow (untrusted and potentially hostile) users to trigger out-of-memory bugs or timeouts that could be leveraged as part of a denial of service attack.

Fitzgerald reported finding bugs in the wasmparser crate’s validator. Google, for its part, recently said that the 50 interns who participated in its OSS internship initiative reported over 150 security vulnerabilities and 750 functional bugs.

Atheris is an open-source project under the Apache 2.0 license. Atheris supports Linux (32- and 64-bit) and Mac OS X. Contributions are welcome and must follow the project's contribution guidelines.
