🐛 Fuzzing Random Ubuntu Packages with Mayhem - Part 1

Photo by Erik Karits / Unsplash

About Mayhem

Mayhem is a cloud (or on-premises) fuzzing solution created by ForAllSecure. It has some great features that make fuzzing more approachable for software developers with little fuzzing experience. I almost think of it as a big red easy button for fuzzing. For experienced fuzz testers, it’s quite nice for a number of reasons.

  1. It takes care of the backend for me. I don’t have to worry about setting up infrastructure like ClusterFuzz or networking a bunch of virtual machines together.
  2. As a corollary of 1., I am not using my own resources. I still have all of the cores available on my Mac, and I’m not paying for CI/CD minutes or anything. Mayhem isn’t free, though, unless you’re working on open-source projects.
  3. It abstracts network calls. With tools like AFL, your fuzz harness needs to convert calls that read from the network (think recv to stdin). Mayhem takes care of this for you.
  4. It supports Docker containers or native applications.
  5. It supports running native code (with or without source) or cross-architecture targets running with QEMU.
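
To make point 3 concrete, here is a minimal sketch of the kind of shim a harness for a plain AFL-style fuzzer might carry so that network reads come from the fuzz input instead. The names harness_recv and recv_from_stdin are hypothetical, and this is not how Mayhem does it internally; it only shows the manual work being avoided.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for recv(): reads the next chunk of fuzz input
 * from input_fd. Returns the number of bytes read, 0 at end of input,
 * or -1 on error, which matches the return convention of recv().
 */
ssize_t harness_recv(int input_fd, void *buf, size_t len)
{
    assert(buf != NULL);
    return read(input_fd, buf, len);
}

/* Drop-in shape for code that used to call recv(sock, buf, len, flags). */
ssize_t recv_from_stdin(int sock, void *buf, size_t len, int flags)
{
    (void)sock;   /* there is no real socket inside the harness */
    (void)flags;  /* flags such as MSG_PEEK are not emulated in this sketch */
    return harness_recv(STDIN_FILENO, buf, len);
}
```

In a real harness you would point the target’s network reads at recv_from_stdin (for example with a macro or a link-time wrapper) and let the fuzzer feed test cases on stdin.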

Random Package #1

Enough about Mayhem. I just scrolled down the list of available Ubuntu packages for Jammy Jellyfish (22.04 LTS), looking for something easy to harness (it was a lazy weekend, after all). I came across the word “assembler” and knew I had something simple. crasm is a cross assembler for 6800/6801/6803/6502/65C02/Z80 and is hosted on GitHub. At the time of harnessing, it was last updated on February 15th, 2021.

Compilation

I cloned the repository from GitHub and compiled it with the AFL++ compiler wrappers: CC=afl-clang-lto CXX=afl-clang-lto++ make. This produced the compiled amd64 ELF in /src/crasm. Easy. I wrapped this up and threw it on Docker Hub (which will come in handy later) as a public image.

Test Cases

For fuzzing, you usually want a set of seed test cases to start from. Good seeds save the fuzzer a lot of time. Like, a lot. Fortunately for us, crasm includes its own test suite under /test. By default, Mayhem looks for these test cases in the folder testsuite. For reference, AFL documentation usually just calls this /input.

Usage

The next thing I always do before starting a fuzzing campaign is to use the program and see how it accepts input and reacts. We can invoke crasm just by calling the executable without any flags.

./crasm
No input!
Syntax:  crasm [-slx] [-o SCODEFILE] INPUTFILE
Crasm 1.10 known CPUs:
         6800 6801 6803
         6500 6502 65C02
         Z80

If we provide the program with one of the test cases described above, we get the output dumped to stdout. This looks really easy to fuzz.

./crasm /testsuite/copy.6800.asm 
Pass #1
Pass #2
Crasm 1.10:                                                   page  1

                         1  ;;; Author: Leon Bottou
                         2  ;;; Public Domain.
                         3  
                         4          cpu 6800
                         5  
  8000                   6          * = $8000
                         7  
  0040                   8          begin  = $40
...

Successful assembly...
 Last address     803b (32827)
 Code length        78 (120)

Crasm 1.10:                                                   page  2

 0040   Abs BEGIN                                            
^8013   Abs COPY                                             
 0042   Abs DEST                                             
 0044   Abs LEN

Harness and Mayhemfile

Since the crasm program just reads from a file passed in as an argument, we don’t really need a harness. To run a fuzzing job, Mayhem uses a .yml file called Mayhemfile. This file specifies how to run the program, environment variables (like LD_PRELOAD), how long to wait for the program to respond, etc. A full list of Mayhemfile options is provided for reference here. Mayhem provides a command line tool (aptly called mayhem) that can automatically generate a Mayhemfile for you with mayhem init. We can then modify it like so.

# Mayhem by <https://forallsecure.com>
# Mayhemfile: configuration file for testing your target with Mayhem
# Format: YAML 1.1

# Project name that the target belongs to
project: crasm

# Target name (should be unique within the project)
target: crasm

# Base image to run the binary in.
image: whatthefuzz/crasm-afl:1.0.0

# Turns on extra test case processing (completing a run will take longer)
advanced_triage: false

# List of commands used to test the target
cmds:

  # Command used to start the target, "@@" is the input file
  # (when "@@" is omitted Mayhem defaults to stdin inputs)
  - cmd: /crasm/src/crasm @@
    env: {}

    # Max size in bytes of the test size.
    max_length: 65536

So what are we looking at? ForAllSecure does a good job of providing descriptions for the keys. I’ll specifically point out the image key. It pulls down a public Docker image and runs the target inside it. This is super helpful since the host I usually use is an arm64 Mac. The cmds.cmd key has a value that is similar to what we used above in Usage. The only difference is the @@, which the fuzzer substitutes with the path of each test case as it runs. That fits a target that reads its input from a file; if the target read from stdin, we would omit it.
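
If the @@ convention is new to you, the idea is plain string substitution. The sketch below is only an illustration of that idea, not Mayhem’s or AFL’s actual code; expand_cmd and the example test case path are made up for this post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Replace the first "@@" in cmd_template with input_path and write the
 * result to out (capacity out_len). A template without "@@" is copied
 * unchanged, which corresponds to the stdin case.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int expand_cmd(const char *cmd_template, const char *input_path,
               char *out, size_t out_len)
{
    assert(cmd_template != NULL && input_path != NULL && out != NULL);

    const char *marker = strstr(cmd_template, "@@");
    int written;
    if (marker == NULL) {
        written = snprintf(out, out_len, "%s", cmd_template);
    } else {
        int prefix_len = (int)(marker - cmd_template);
        /* prefix before "@@", then the path, then whatever follows "@@" */
        written = snprintf(out, out_len, "%.*s%s%s",
                           prefix_len, cmd_template, input_path, marker + 2);
    }
    return (written >= 0 && (size_t)written < out_len) ? 0 : -1;
}
```

Calling expand_cmd("/crasm/src/crasm @@", "/tmp/case-0001", out, sizeof out) leaves "/crasm/src/crasm /tmp/case-0001" in out, which is the same shape as the command we ran by hand in Usage.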

Start Fuzzing! 👾

With all of this, we can just kick off the fuzzing job by running mayhem run . in the directory with our Mayhemfile. From experience, it sometimes takes a few attempts to get a job running. Mayhem provides an event log that helps you track down what went wrong (also notice that this was my seventh run 😬). Whether it’s a typo or you left something enabled that you shouldn’t have, keep trying. You’ll get it.

Screenshot 2022-12-16 at 11.56.19 AM.png

Bugs

I went to grab a drink while Mayhem did its thing. I came back to two bugs: a divide by zero and a NULL pointer dereference. The nice thing about Mayhem is that it comes with its secret-sauce symbolic execution engine, which helps it find interesting test cases more easily and more quickly. It’s also pretty great that Mayhem automatically assigns CWEs to the defects.

defects.png

Fixing the Bugs 🔨

This wouldn’t be a cool project if we didn’t at least fix the bugs we found. First, let’s compile the target without instrumentation (i.e. don’t use the AFL clang compiler) and add debugging symbols. In crasm/src we can modify the flags like so: CFLAGS = -O0 -Wall -g. -g gives us debugging symbols. -O0 disables optimizations, so variables are less likely to be optimized away when we inspect them in the debugger. Then we can compile like so:

$ clang --version

Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin22.2.0
Thread model: posix

$ CC=clang make

...

Then, we verify that we get a crash with the crashing test cases provided by Mayhem. If we use mayhem sync . in the directory with our Mayhemfile, we will download the corpus of test files Mayhem generated (crashing and non-crashing).

$ mayhem sync .
$ ls -al ./defects
4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62

# Make sure to use the non-instrumented crasm to test!
$ crasm ./defects/4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4 
Pass #1
[1]    58286 segmentation fault  ~/Developer/crasm/src/crasm

$ crasm ./defects/517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62 
Pass #1
[1]    58369 floating point exception  ~/Developer/crasm/src/crasm
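
With only two defects, running them by hand is fine. For a bigger pile of crashing inputs, a tiny runner that reports which signal killed the target can save some typing. This is a rough sketch using plain POSIX calls; run_and_get_signal is a name I made up, and nothing here is part of the Mayhem CLI.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with the given arguments and report how it ended.
 * Returns the terminating signal number (e.g. SIGSEGV or SIGFPE) if the
 * child was killed by a signal, 0 if it exited on its own, and -1 if
 * fork or waitpid failed.
 */
int run_and_get_signal(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);  /* only reached if execv itself failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

You would call it once per file in ./defects with argv set to the non-instrumented crasm binary and the test case path. A return value of SIGSEGV or SIGFPE matches the two crashes shown above.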

Success (or failure?)! We still have crashing test cases. Now we can triage. We can call the program under lldb (Linux folks can use gdb, just substitute the -- with --args) with the following:

$ lldb -- ./crasm ./4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
(lldb) run
Process 99307 launched: '/crasm/src/crasm' (x86_64)
Pass #1
Process 99307 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001000037b3 crasm`Xasc(modifier=0, label="msgb", mnemo="asc", oper=0x0000000000000000) at pseudos.c:221:15
   218    register char delimiter;
   219 
   220    s = oper;
-> 221    delimiter = *s;
   222 
   223    if (delimiter != '\'' && delimiter != '\"')
   224    {
Target 0: (crasm) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001000037b3 crasm`Xasc(modifier=0, label="msgb", mnemo="asc", oper=0x0000000000000000) at pseudos.c:221:15
    frame #1: 0x0000000100002d6c crasm`asmline(s="asc", status=3) at crasm.c:562:7
    frame #2: 0x00000001000027b1 crasm`pass(n=1) at crasm.c:274:9
    frame #3: 0x0000000100002490 crasm`crasm(flag=138) at crasm.c:180:3
    frame #4: 0x0000000100002292 crasm`main(argc=0, argv=0x00007ff7bfeff440) at crasm.c:147:5
    frame #5: 0x00007ff812381310 dyld`start + 2432

So this is our null pointer dereference. It’s easy to see why with the source.

int Xasc(int modifier, char* label, char* mnemo, char* oper)
{
  register char* s;
  register char r;
  register char delimiter;

  s = oper;
  delimiter = *s;
...

Whatever called this function (which we can see with the backtrace) passed in a NULL value for oper. Looking at line 562 of crasm.c, inside asmline, we can see the offending call:

if (status & 2)
    {
      (*labmnemo->ptr)(labmnemo->modifier, label, mnemo, oper);
    }
}

Lots of bugs come from calling through function pointers, as in this case, because the callee has to cope with whatever every call site might pass in. Anyway, we’re going to recommend a minimum viable patch to at least prevent the dereference. Here, we check that the character pointer oper is not NULL. The other arguments aren’t used but are likely needed because every handler called through the function pointer shares the same signature.

int Xasc(int modifier, char* label, char* mnemo, char* oper)
{

  if (oper == NULL)
  {
    error("Need an operand");
  }
...

After re-compilation, we verify that we don’t get a segmentation fault, just more errors. 🙂

$ crasm ./4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
< assembly output omitted>

ERRORS:       5
WARNINGS:     0

No code generated...
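
Stripped of the crasm specifics, the pattern looks like the sketch below: a handler invoked through a function pointer has to validate its own arguments, because the dispatcher cannot know which handlers need an operand. The names pseudo_handler, toy_asc and dispatch are made up for illustration and are not crasm’s.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type, mirroring the four-argument call in asmline. */
typedef int (*pseudo_handler)(int modifier, char *label, char *mnemo, char *oper);

/*
 * Toy handler: returns the character that opens the operand, or -1 if
 * no operand was supplied. The NULL check has to come before the first
 * dereference of oper.
 */
int toy_asc(int modifier, char *label, char *mnemo, char *oper)
{
    /* unused here, kept so every handler shares one signature */
    (void)modifier; (void)label; (void)mnemo;

    if (oper == NULL)
        return -1;  /* the real patch reports an assembler error instead */
    return (unsigned char)*oper;
}

/* Dispatch through the pointer, the way a mnemonic table lookup would. */
int dispatch(pseudo_handler handler, char *label, char *mnemo, char *oper)
{
    assert(handler != NULL);
    return handler(0, label, mnemo, oper);
}
```

A line such as msgb asc with no operand maps to dispatch(toy_asc, label, mnemo, NULL), which now returns -1 instead of crashing.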

After fixing this, I took a look at the pending (from 2019) pull requests on the author’s repository and saw that another individual spotted a similar bug in an adjacent function, Xdc.

Test Case #2 - Divide by Zero

Looking at the next test case:

$ lldb -- ./crasm ./517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62
(lldb) target create "./crasm"
Current executable set to '/crasm/src/crasm' (x86_64).
(lldb) settings set -- target.run-args  "./517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62"
(lldb) r
Process 2564 launched: '/crasm/src/crasm' (x86_64)
Pass #1
Process 2564 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_ARITHMETIC (code=EXC_I386_DIV, subcode=0x0)
    frame #0: 0x00000001000078f4 crasm`opdiv(presult=0x0000000100017468, parg=0x00007ff7bfefefa0) at operator.c:415:18
   412    presult->flags |= parg->flags;
   413    checktype(presult, L_ABSOLUTE);
   414    checktype(parg, L_ABSOLUTE);
-> 415    presult->value /= parg->value;
   416  }
   417 
   418  void oprlist(struct result* presult, struct result* parg)
Target 0: (crasm) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_ARITHMETIC (code=EXC_I386_DIV, subcode=0x0)
  * frame #0: 0x00000001000078f4 crasm`opdiv(presult=0x0000000100017468, parg=0x00007ff7bfefefa0) at operator.c:415:18
    frame #1: 0x0000000100006455 crasm`parse2(expr="ed/maica", presult=0x0000000100017468) at parse.c:152:7
    frame #2: 0x00000001000062b4 crasm`parse(expr="ed/maica") at parse.c:233:3
    frame #3: 0x0000000100009357 crasm`findmode(oper="aciam/de", pvalue=0x00007ff7bfeff068) at cpu6800.c:99:11
    frame #4: 0x0000000100009214 crasm`standard(code=202, label=0x0000000000000000, mnemo="orab", oper="aciam/de") at cpu6800.c:163:9
    frame #5: 0x0000000100002d3c crasm`asmline(s="orab aciam/de", status=3) at crasm.c:562:7
    frame #6: 0x0000000100002781 crasm`pass(n=1) at crasm.c:274:9
    frame #7: 0x0000000100002460 crasm`crasm(flag=138) at crasm.c:180:3
    frame #8: 0x0000000100002262 crasm`main(argc=0, argv=0x00007ff7bfeff440) at crasm.c:147:5
    frame #9: 0x00007ff812381310 dyld`start + 2432

Pretty easy to spot the bug there: nothing checks the divisor, parg->value, before the division. The fix is just as easy, checking the divisor before we divide (I’m not going for gold here, just a minimum viable patch), like so:

void opdiv(struct result* presult, struct result* parg)
{
  presult->flags |= parg->flags;
  checktype(presult, L_ABSOLUTE);
  checktype(parg, L_ABSOLUTE);
  if (parg->value != 0) {
    presult->value /= parg->value;
  }
}

Re-compiling and rerunning the test case shows that the issue is resolved. We still get errors, but no floating-point exceptions!

./crasm ./517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62
...
ERRORS:       4
WARNINGS:     0

No code generated...
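
As a standalone illustration of guarding an integer division, here is a small helper. It assumes the operands are plain long values, which is my assumption and not necessarily the type crasm uses, and checked_div is a hypothetical name. It also rejects LONG_MIN / -1, a second way signed division can overflow; on x86 that typically raises the same exception. That case goes beyond the minimum viable patch above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Divide *value by divisor in place.
 * Returns 0 on success. Returns -1 and leaves *value untouched when the
 * division would be undefined: a zero divisor, or LONG_MIN / -1, whose
 * result does not fit in a long.
 */
int checked_div(long *value, long divisor)
{
    assert(value != NULL);

    if (divisor == 0)
        return -1;
    if (*value == LONG_MIN && divisor == -1)
        return -1;

    *value /= divisor;
    return 0;
}
```

The caller decides what a failed division means. In an assembler, reporting an error on the offending source line is probably more useful than silently keeping the old value.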

I submitted a pull request for both bugs, and it was merged within twenty minutes. Props to the author. After the merge, I submitted bug reports to the Ubuntu package repository to alert them to the possible security issues in one of their packages.

What’s Next?

Mayhem limits OSS fuzz jobs to five minutes, so I continued with AFL++ (a super easy conversion). Maybe we’ll shake out more bugs?