Overview
In cybersecurity, zero-day vulnerabilities are undisclosed weaknesses in software, hardware, or firmware that attackers can exploit to compromise a system [1]. Hunting for them can be among the most fulfilling and most frustrating tasks that security personnel and developers face. Finding these flaws before attackers do is crucial to preventing data breaches and cybercrime.
Fuzzing is the process of identifying bugs and vulnerabilities by sending unexpected and malformed input to a target. For example, if a developer wrote a tool that converts every uppercase character in a body of text to lowercase, fuzzing it would involve sending numbers or special characters in an attempt to crash the program. These numbers and characters represent unexpected data that the developer may not have anticipated.
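To make the idea concrete, here is a minimal sketch of that lowercase example in Python. The `to_lowercase` tool and its lookup-table bug are hypothetical and were invented for illustration; an unhandled `KeyError` stands in for a crash.

```python
import random
import string

# Hypothetical tool under test: it only knows about ASCII letters and spaces,
# so any other character raises KeyError (our stand-in for a crash).
_LOWER = {c: c.lower() for c in string.ascii_letters + " "}


def to_lowercase(text):
    """Convert text to lowercase using the lookup table above."""
    return "".join(_LOWER[ch] for ch in text)


def fuzz(target, rounds=200, seed=0):
    """Feed random printable strings to target and collect the inputs that raise."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    crashes = []
    for _ in range(rounds):
        length = rng.randint(1, 16)
        case = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            target(case)
        except Exception as exc:
            crashes.append((case, type(exc).__name__))
    return crashes


crashes = fuzz(to_lowercase)
print(f"{len(crashes)} of 200 random inputs crashed the tool")
```

The developer only anticipated letters and spaces, so digits and punctuation are enough to break the tool. Real fuzzers apply the same principle with far smarter input generation.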
The fuzzing process described in the following sections was used to discover CVE-2022-41220, CVE-2022-36752, CVE-2022-34913, and CVE-2022-34556. This process is repeatable at a large scale and can be employed by software developers and security researchers to quickly discover hidden flaws in a system.
Prerequisites
Basic C Programming and Compilation
Basic Linux Command Line Tools
Basic Understanding of Buffer Overflows
Basic Understanding of the Stack and Heap
Disclosure and Disclaimer
The vulnerabilities discussed in this post were disclosed to the respective security teams. This post is intended for developers and security researchers who are interested in identifying vulnerabilities within applications and is for educational purposes only.
Fuzzing Process Overview
Fuzzing local programs differs from fuzzing remote programs. A local program does not receive input over a network connection, while a remote program does. The Linux ‘ls’ command is an example of a local program, and the ‘apache2’ HTTP server is an example of a remote program.
When fuzzing local programs, we can quickly provide input via stdin and send a large number of test cases without worrying about packet loss, rate limiting, or other remote connectivity issues. A local program can also have several entry points through which a user supplies the information needed to carry out a particular task.
Let’s take a look at a vulnerable C program that takes input from the command line.
Looking at our rudimentary C program, we can see that we have one program, four entry points (or ‘targets’), and an effectively unlimited number of inputs (or ‘test cases’) that we can provide to each target. As bug hunters, we need a repeatable methodology for discovering flaws in our software, such as the following process:
Target identification: Identify all entry points into the program.
Fuzzing: Send test cases to each target in an attempt to crash the program.
Triage: Run each test case that successfully crashed the program and determine whether it is a security vulnerability.
Given the endless array of possible test cases for each target, it would be nice to automate the fuzzing process with a tool that generates a large number of test cases and then modifies each one depending on how the program reacts to it. AFL++ is a popular open-source tool created for exactly this scenario.
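The idea of modifying test cases based on how the program reacts can be sketched in a few lines of Python. This is a toy model and not how AFL++ is implemented: the `target` function, its branch names, and the single-byte `mutate` strategy are all assumptions made for illustration.

```python
import random


def target(data: bytes) -> set:
    """Hypothetical target: returns the branch IDs it executed, raises on the bug."""
    covered = {"start"}
    if len(data) > 0 and data[0] == ord("B"):
        covered.add("B")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("BU")
            if len(data) > 2 and data[2] == ord("G"):
                raise RuntimeError("crash: reached the buggy branch")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte (assumes data is not empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input: bytes, max_execs=200_000, seed=1):
    """Keep every input that reaches new branches and mutate from that corpus."""
    rng = random.Random(seed)
    corpus = [seed_input]
    seen = set(target(seed_input))  # coverage of the initial test case
    for execs in range(1, max_execs + 1):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            covered = target(candidate)
        except RuntimeError:
            return candidate, execs  # crashing input found
        if not covered <= seen:  # new behavior: remember this input
            seen |= covered
            corpus.append(candidate)
    return None, max_execs


crash, execs = fuzz(b"AAAA")
print(f"crashing input {crash!r} found after {execs} executions")
```

Because inputs that reach a new branch are kept and mutated further, the loop discovers the three required bytes one at a time. Guessing all three blindly would succeed only once in 256^3 (about 16.7 million) attempts.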
AFL++
At its core, AFL++ is a fuzzer that generates input from an initial test case supplied by the user and feeds that input into a target program. As AFL++ learns more about the program through the feedback its instrumentation provides, it mutates the input to reach new behavior, with the goal of making the program crash. We highly recommend checking out the project’s GitHub repository for more details on how this works. The entire process, from compiling a target with instrumentation to triggering a crash, can be seen below:
AFL++ is the successor to AFL, which was originally developed by Michał Zalewski at Google. This quick overview greatly simplifies the tool’s full capabilities. The key steps for fuzzing a program with AFL++ are:
Compilation using instrumentation.
Creating inputs.
Fuzzing the program and triaging crashes.
If you are running Kali Linux, AFL++ can be installed using the APT package manager.
Once AFL++ is installed, the process of fuzzing a binary can be fairly simple. We only need to complete a few steps to get AFL++ started.
Discovering CVE-2022-34913 With AFL++
First, we download the md2roff tool (version 1.7) from GitHub and browse to the folder containing the source code and Makefile. The md2roff tool is written in C and can be compiled to produce an executable. AFL++ ships with a clang-based compiler, ‘afl-clang-fast’, that is used for instrumentation. Instrumentation is the process of adding code, variables, and symbols to the program so that AFL++ can track the program flow and steer its input toward a crash. Instrumentation is not limited to compile time: AFL++ also offers binary-only modes for targets whose source code is unavailable. Makefiles typically use the $(CC) variable to specify which compiler to use, so let’s point the ‘CC’ environment variable to the location of our ‘afl-clang-fast’ compiler. Once we have verified that this variable is set, we can run the ‘make’ command to compile the source code.
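For readers who prefer to script the build, the following Python sketch sets ‘CC’ and invokes ‘make’. The helper names and the "md2roff" directory in the example comment are hypothetical, and the sketch assumes the project builds with a plain ‘make’.

```python
import os
import shutil
import subprocess


def afl_build_env(compiler="afl-clang-fast"):
    """Return a copy of the current environment with CC set to the AFL++ compiler."""
    env = dict(os.environ)
    env["CC"] = compiler
    return env


def build_instrumented(source_dir, compiler="afl-clang-fast"):
    """Run make in source_dir with CC set; raises if the compiler is missing or the build fails."""
    if shutil.which(compiler) is None:
        raise FileNotFoundError(f"{compiler} not found on PATH; is AFL++ installed?")
    # CC is also passed on the make command line, because a Makefile that
    # assigns CC itself takes precedence over the environment value.
    subprocess.run(
        ["make", f"CC={compiler}"],
        cwd=source_dir,
        env=afl_build_env(compiler),
        check=True,
    )


# Example (not run here; the directory name is an assumption):
# build_instrumented("md2roff")
```

Whether the environment variable alone is enough depends on how the Makefile defines CC, which is why the sketch passes it both ways.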
Creating Input and Output Directories
AFL++ requires two folders before it can get started. The first folder will contain our sample input (test cases), and the second will be an output directory where AFL++ will write the fuzzing results.
Our input folder needs to contain a test case that AFL++ will use and modify. If we want to fuzz md2roff’s markdown processing functionality, our input directory must contain a sample markdown file with simple contents. This file serves as a ‘base case’ of what program input should resemble.
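A minimal Python sketch of this setup is shown below. The directory names ‘input’ and ‘output’ follow the command used later in this post, while the file name `sample.md` and its contents are assumptions chosen for illustration.

```python
from pathlib import Path

# Small, valid markdown document used as the base case (contents are an assumption).
SEED_MARKDOWN = "# Title\n\nSome *sample* text.\n\n- item one\n- item two\n"


def prepare_dirs(workdir="."):
    """Create the input and output folders and write one seed file into input."""
    base = Path(workdir)
    input_dir = base / "input"
    output_dir = base / "output"
    input_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    seed = input_dir / "sample.md"
    seed.write_text(SEED_MARKDOWN, encoding="utf-8")
    return input_dir, output_dir, seed


# Example (not run here): prepare_dirs(".")
```

Keeping the seed small and valid gives the fuzzer a fast starting point that already exercises the normal parsing path.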
Once we have verified our sample input we can start AFL++ by using the ‘afl-fuzz’ command:
afl-fuzz: The AFL++ command used to fuzz a binary.
-i input: The input directory containing our base case.
-o output: The output directory that AFL++ will write our results to.
./md2roff: The program we want to run, along with any applicable flags.
@@: A placeholder that AFL++ replaces with the path of each generated input file, so the target reads its input from a file instead of stdin.
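Put together, the pieces above form a single command line. The Python sketch below assembles it as an argument list; the `--` separator between the AFL++ options and the target is a common convention and is an assumption here, since the original command is not reproduced in the text.

```python
import shlex


def afl_fuzz_command(target="./md2roff", input_dir="input",
                     output_dir="output", target_args=("@@",)):
    """Build the afl-fuzz argument list from the parts described above."""
    return ["afl-fuzz", "-i", input_dir, "-o", output_dir, "--", target, *target_args]


cmd = afl_fuzz_command()
print(shlex.join(cmd))  # afl-fuzz -i input -o output -- ./md2roff @@

# To start fuzzing from Python (not run here):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems if a directory or flag contains spaces.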
AFL++ Fuzzing
Once AFL++ has initialized, it will continue fuzzing the program with mutated input until you decide to stop it.
The most important fields in the interface are ‘exec speed’ and ‘saved crashes’. ‘exec speed’ shows how many times per second AFL++ is able to generate new input and run the program. ‘saved crashes’ shows the number of unique crashes the fuzzer was able to produce.
It looks like AFL++ discovered a few crashes! Let’s investigate the input that produced them. The ‘output/default/crashes’ directory contains a file for each unique crash.
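As a rough triage aid, the following Python sketch lists the saved crash inputs and previews each one. It assumes the usual AFL++ naming scheme in which crash files start with `id:`, and it skips helper files such as a README; adjust the prefix if your build names files differently.

```python
from pathlib import Path


def list_crash_files(crash_dir="output/default/crashes"):
    """Return crash inputs sorted by name, skipping files that do not start with 'id:'."""
    directory = Path(crash_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.name.startswith("id:")
    )


def preview(path, limit=64):
    """Return the first bytes of a crash input together with its total size."""
    data = Path(path).read_bytes()
    return data[:limit], len(data)


for crash in list_crash_files():
    head, size = preview(crash)
    print(f"{crash.name}: {size} bytes, starts with {head!r}")
```

Reading the files as bytes matters because fuzzer output is often not valid text.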
There are plenty of crashes in the output folder to triage. Let’s take a look inside one of them:
It seems like one of the files that produced a crash was a massive buffer of 1’s.
Reproducing the Crash
We can generate a markdown document with input identical to the crash file seen in the ‘output/default/crashes’ directory using Python 3:
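A minimal sketch of that step is shown below. The file name `crash.md` and the length of 5000 characters are assumptions, since the exact size of the original crash file is not stated; use the size of your own crash file instead.

```python
from pathlib import Path


def write_crash_file(path="crash.md", length=5000):
    """Write a markdown file that consists of nothing but the character '1'."""
    target = Path(path)
    target.write_text("1" * length, encoding="ascii")
    return target


# Example (not run here): write_crash_file("crash.md", length=5000)
```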
To confirm the crash, execute the md2roff program with the markdown file as the input:
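One way to script this check is sketched below in Python. It assumes a POSIX system and that md2roff takes the markdown file as its first argument, which matches the `./md2roff @@` invocation used for fuzzing; the file name `crash.md` is hypothetical.

```python
import signal
import subprocess


def crashed_with_segfault(argv, timeout=10):
    """Run argv and report whether the process was killed by SIGSEGV."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    # On POSIX, a return code of -N means the child was terminated by signal N.
    return proc.returncode == -signal.SIGSEGV


# Example (not run here): crashed_with_segfault(["./md2roff", "crash.md"])
```

Checking the signal rather than just a non-zero exit code separates a genuine memory fault from an ordinary error exit.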
It looks like the program segfaults when trying to process our large buffer of 1’s. At a minimum, we have a denial of service condition. We can attach GDB to our program and run md2roff a second time to see if we have altered the control flow and overwritten the return address.
Success! The stack was successfully smashed by our buffer of 1’s. From this point forward, we could put together an exploit using a binary exploitation technique such as ret2libc or ROP chaining. This would allow an attacker to compromise a victim’s computer if a malicious file were opened with the md2roff tool.
There are many other fuzzers, such as honggfuzz, boofuzz, libFuzzer, syzkaller, and go-fuzz, that can help developers and researchers tailor their fuzzing process to the type of software being tested. Implementing fuzz testing early in the development cycle can greatly reduce an organization’s exposure to zero-day vulnerabilities and prevent cybercriminals from taking advantage of unintended software flaws.
Citations
[1] “Zero-day (computing).” Wikipedia, https://en.wikipedia.org/wiki/Zero-day_(computing).