Buffer Overflows & Fuzzing
Adapted, with permission, from the Data Security and Ethics lecture materials by Martin Lester (University of Reading).
Buffers
A buffer is a block of memory (or other storage) used for storing data temporarily.
Example 1
A calculator program allows the user to type sums.
As the user types, input is stored in a buffer.
When the user presses return, the program calculates the result.
A buffer is a block of memory (or other storage) used for storing data temporarily.
Example 2
A program counts the number of lines in a text file on disk.
It would be inefficient to load the whole file into memory...
...or to read it one byte at a time.
Instead, the program loads a block at a time into memory, then processes the file a block at a time.
A buffer is a block of memory (or other storage) used for storing data temporarily.
Example 3
A webserver logs names of pages accessed to disk.
The log message is a string that includes a timestamp and the name of the page accessed.
The program concatenates the timestamp string and page name string together in memory.
Then it prints the whole message with a single operating system call.
Buffer overflows
A buffer overflow, also called a buffer overrun, is when a program is meant to write to memory in a buffer, but instead writes to memory after the buffer.
Main cause: * Programmer allocated fixed size buffer and thought that would be OK.
Other causes: * Programmer incorrectly calculated length of variable size buffer. * Incorrect assumption about how much data a library call could write.
Exploiting a buffer overflow
What can happen if memory after a buffer is written to?
- Nothing!
- Program crashes.
- Variable value changed.
- Return address changed.
- Arbitrary code execution.
Causing a buffer overflow
Overflowing a buffer is sometimes very simple:
Just send the program more data than it is expecting (for example, over network connection).
But sometimes overflows only occur in very specific circumstances:
Maybe the data includes a field specifying the length of another field. What if the field is longer than the claimed length?
Or could rely on several rare events occurring simultaneously.
Exploiting a buffer overflow
In order to exploit a buffer overflow, you usually need to do more than just trigger it.
Most likely outcomes are nothing or random crash.
An attacker want to run code of his choosing: Arbitrary Code Exection.
He often wants to run shellcode: a small program that allows the attacker to make a network connection and run any command.
-
How do you get shellcode in memory on the target?
-
How do you run shellcode on the target?
Sending shellcode
Maybe you just send shellcode into the buffer. But:
- Buffer might be small.
- Shellcode needs to be small to fit in buffer.
- Optimise for size, or hand-code.
- Program reading into buffer might be expecting text or some other format.
- Some bytes of code (like 0) might be interpreted as string termination!
- Use alternate encodings of some instructions, or hand-code.
- Buffer might not always be at same address in memory.
- Absolute jumps might not work.
- Compile Position Independent Code (PIC).
Running shellcode
Now, how to make the shellcode run?
- Carefully examine memory layout of compiled program.
- Then write a value into the buffer that will jump to the right address.
- Can vary between builds of a program.
- Can vary between runs of a program!
- Heap-spraying: Fill memory with copies of shellcode and "ramps".
- Maximise probability that random address leads to shellcode.
- Usually needs another way of getting shellcode into memory.
- Example: JavaScript + buggy browser plugin.
Countermeasures
- Don't write buggy code.
- Hard! Can use testing, auditing...
- Use language with automated memory management.
- Fine for applications, not high-performance or systems code. Or language runtimes.
- Address Space Layout Randomisation (ASLR).
- Operating system/runtime pick different memory address every run.
- Hardware protection.
- Memory segmentation/paging and no-execute flag.
- Malware detection and Web application firewalls.
- Nozzle/Zozzle detect heap-spraying JavaScript.
- Install updates regularly.
- Make sure security patches applied.
Fuzz testing
Basic idea:
- Run program with random input.
- Does it crash?
- If yes, log the input for later analysis.
- If no, try again.
Can try billions of possible inputs!
May need to run the program in a special environment (for example, using Valgrind) to force a crash on memory management errors.
Problems with purely random fuzz testing
Many programs look something like this:
try {
f = read_input_file();
x = parse(f);
}
catch (ParseError e) {
println("Error: malformed input.");
}
// do something interesting with x
The interesting behaviour with x rarely gets considered.
Many programs look something like this:
if (complex_test(x) && complex_test(y)) {
// do something interesting with x and y
}
else {
// do something boring with x and y
}
The interesting behavoiur with x and y rarely gets considered.
Improving fuzz testing: generation vs mutation
A generation-based fuzzer creates a new random input each time.
A mutation-based fuzzer uses genetic algorithms to create new inputs: * Flip random bits or insert random bytes. * Concatenate, split or merge existing inputs.
Mutation can be better at creating interesting test inputs.
But often relies on good initial corpus of test inputs.
Imagine fuzz-testing a program that parses HTML fragments.
Initial corpus:
* <p>Hello.</p>
* <em>Yes.</em>
Example mutations:
* <p>Hello.</p><p>Hello.</p>
* <p>He<em>Yes.</em>
* <en>Yes.</em>
* s.</em>
Improving fuzz testing: unstructured vs structured
An unstructured/dumb fuzzer does not know expected input format.
A structured fuzzer has a grammar or other way of recognising/generating syntactically valid inputs.
Advantages: * More likely to test interesting program paths... * ...especially if input format has checksums. Disadvantages: * Misses bugs in parser. * Requires accurate grammar to pass to fuzzer. * Slower rate of test case generation.
Improving fuzz testing: black/grey/white box
A black box fuzzer has no knowledge of program structure. This is rare in practice.
A grey box fuzzer uses instrumentation (via special compiler or virtual machine) to track trace of execution.
If a test case generates a new trace, focus on it in later mutations.
A white box fuzzer uses symbolic execution and/or constraint solvers to generate test cases that specifically target new execution paths.
Trade-off between number and quality of test cases.
Example: American Fuzzy Lop
American Fuzzy Lop (AFL) is a mutation-based, unstructured, grey box fuzzer.
It is not the most technically advanced fuzzer, yet it has been used to detect many bugs in real applications.
Lessons learnt:
- Having sensible defaults and being easy to use is more important for adoption than flexibility.
- Reliability and speed are also important.
- In practice, volume of test cases is often more important than quality.
Ethical issues: buffer overflows
Buffer overflows raise mostly the same ethical issues as other software vulnerabilities:
- As a developer, what obligations do you have to test for or minimise the likelihood of buffer overflows?
- As a system administrator, what obligations do you have to minimise the likelihood of a buffer overflow being exploited?
- As a penetration tester, if you discover a buffer overflow, when and how should you disclose it?
- As a software developer, if someone informs you of a buffer overflow, what should you do and when?
- Is it ethical to introduce buffer overflows in software projects?
- Is it ethical to exploit buffer overflows?
Ethical issues: fuzz testing
Fuzz testing raises mostly the same ethical issues as other security tools with "offensive" uses. Is it ethical...
- ...to write a fuzz tester?
- ...to distribute a fuzz tester?
- ...to restrict distribution of fuzz testers by law?
- ...to use fuzz testers over the network to test someone else's system?