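This is the third year in a row that I have taken part in Guangdong province's Red Hat Cup (红帽杯). The quality of the challenges has clearly risen with each edition. Seeing this Kaleidoscope challenge was a pleasant surprise, and it also made me feel that the bar for CTF competitions in China keeps getting higher. Since the challenge is adapted from an interpreter, solving it with traditional reverse engineering is rather hard, so here I share how I used fuzzing to find the vulnerability, followed by the analysis and exploitation.
This challenge is from a CTF competition in Guangdong province, China. It is a pwn challenge based on the LLVM JIT engine. You can download this challenge at this link.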
Recon
At first you may not be able to run this binary directly because of the missing library libLLVM-6.0.so.1; use sudo apt-get install libllvm6.0 to resolve the dependency. It then gives us an interpreter interface like:
ready> a = 1
Drop the binary into IDA and we can see that it was written in C++ and that the author enabled some optimization settings when compiling, so the decompiled output is really hard to follow. The symbols tell us that this is a Kaleidoscope JIT interpreter, which the LLVM project uses as a tutorial to demonstrate how to implement a JIT interpreter. We can find the tutorial here: Building a JIT: Starting out with KaleidoscopeJIT, and the source code at llvm-kaleidoscope.
The main function is clear in the source code:
int main() {
but in IDA it looks really terrible:
LLVMInitializeX86TargetInfo(*(_QWORD *)&argc, argv, envp);
Comparing these two pieces of code, we can see that the challenge defines = as a binary operator in BinopPrecedence, while the original version did not. I tried to follow the code but soon decided to switch to another method.
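As a rough illustration of the difference noted above, the tutorial's operator table can be modeled as a dictionary. This is only a sketch: the four tutorial entries are the ones from the Kaleidoscope tutorial, while the precedence value given to = here is an assumption (the value in the challenge binary was not checked), and the helper names are hypothetical.

```python
# Precedence table modeled on the Kaleidoscope tutorial's BinopPrecedence map.
# Higher numbers bind tighter.
TUTORIAL_PRECEDENCE = {"<": 10, "+": 20, "-": 20, "*": 40}

# ASSUMPTION: the challenge registers '=' with the lowest precedence.
# The value 2 is illustrative; the binary's real value was not verified.
CHALLENGE_PRECEDENCE = {"=": 2, **TUTORIAL_PRECEDENCE}


def token_precedence(tok, table):
    """Return the precedence of a binary operator, or -1 if tok is not one."""
    return table.get(tok, -1)


def added_operators(original, modified):
    """List operators present in `modified` but missing from `original`."""
    return sorted(set(modified) - set(original))


if __name__ == "__main__":
    print(added_operators(TUTORIAL_PRECEDENCE, CHALLENGE_PRECEDENCE))
```

With such a table, a = 1 parses as a binary expression in the modified interpreter, whereas the original table would not treat = as an operator at all.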
Fuzzing
So I turned to fuzzing, hoping to find some bugs. I first tried running this binary under AFL in QEMU mode, but it got stuck during initialization. If you know how to run such a binary with AFL, please do let me know.
matthew@matthew-MS-7A37 /m/d/L/Fuzz> afl-fuzz -i in/ -o out1/ -Q -- ./wang
Then I tried honggfuzz, another popular fuzzer that supports binary instrumentation. At first I cloned the source code from GitHub, but it failed to compile. Then I found a Docker image on Docker Hub, but it does not support QEMU mode. I had to attach to the container and compile the QEMU mode myself, which unavoidably meant installing some dependencies. It took me more than 2 hours to set up this tool (the network connection is always a big problem when setting up tools like this in China).
The command to run this Docker image is:
docker run --rm -it -v (pwd):/work --privileged zjuchenyuan/honggfuzz:latest /bin/bash
You can find the honggfuzz usage documentation here: USAGE. The QEMU mode we need can be run with:
honggfuzz -f /work/in/ -s -- ./qemu_mode/honggfuzz-qemu/x86_64-linux-user/qemu-x86_64 /work/kaleidoscope
The seed corpus was put in /work/in; I simply chose the code snippet from https://llvm.org/docs/tutorial/OCamlLangImpl1.html#the-basic-language:
# Compute the x'th fibonacci number.
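The seed step can be sketched as a small script that writes the tutorial's Fibonacci example into the corpus directory. The seed text is the example from the linked tutorial page; the file name fib.k and the helper name write_seed are hypothetical choices for illustration.

```python
from pathlib import Path

# Seed program: the "basic language" example from the LLVM Kaleidoscope tutorial.
FIB_SEED = """\
# Compute the x'th fibonacci number.
def fib(x)
  if x < 3 then
    1
  else
    fib(x-1)+fib(x-2)

# This expression will compute the 40th number.
fib(40)
"""


def write_seed(corpus_dir, name="fib.k"):
    """Create corpus_dir if needed, write the seed file, and return its path.

    The file name is a hypothetical choice; honggfuzz reads every file in the
    directory given with -f.
    """
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)
    seed_path = corpus / name
    seed_path.write_text(FIB_SEED)
    return seed_path


if __name__ == "__main__":
    print(write_seed("in"))
```

A single valid program is a thin corpus, but for a small grammar like this one it gives the fuzzer the keywords def, if, then and else to mutate from.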
I ran this in a VMware Workstation VM, so it was rather slow, but it still gave us some crashes in less than ten minutes. I believe it would be much faster on a bare-metal Linux machine.
Iterations : 5810 [5.81k]
Crashes
The fuzzer gave us a crash in less than ten minutes. I reviewed the crashes; they looked like some kind of heap corruption issue, but the stack trace was hard to read.
─────────────────────────────────[ REGISTERS ]──────────────────────────────────
I then went to dinner, and some really interesting crashes were found before I came back. The fuzzer reported that these inputs led to crashes, but the binary did not crash at all in a dry run.
matthew@matthew-MS-7A37 ~/L/fuzz> cat c5
matthew@matthew-MS-7A37 ~/L/fuzz> cat c5 | ./kaleidoscope
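The dry run above can also be scripted so that every reported input is replayed outside the fuzzer and its exit status recorded. This is a minimal sketch, not part of the original workflow; the helper names replay and describe are hypothetical, and the target path and crash files are passed on the command line.

```python
import subprocess
import sys


def replay(cmd, data, timeout=10):
    """Feed `data` to `cmd` on stdin and return (returncode, stderr text)."""
    proc = subprocess.run(
        cmd,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr.decode(errors="replace")


def describe(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, a negative code means the process died from that signal.
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python replay.py ./kaleidoscope c5 [more crash files...]
    target, files = sys.argv[1], sys.argv[2:]
    for path in files:
        with open(path, "rb") as fh:
            code, err = replay([target], fh.read())
        print(path, describe(code), err.strip()[:80])
```

An input that ends with a normal exit status but prints an error on stderr, as happened here, is worth reading by hand rather than discarding as a false positive.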
The output was interesting though: it said the external function “fib” could not be resolved. If you compare the binary with the original source code, you can see that the source has an extern keyword. However, its handler was disabled in this challenge; the interpreter says “No extern function” if you try to use the extern keyword.
{
Then I searched for the string Program used external function 'fib' which could not be resolved! in this binary but found nothing. However, the message started with the tag LLVM ERROR; did that mean the string was located in the LLVM library?
matthew@matthew-MS-7A37 ~/L/fuzz> strings /usr/lib/x86_64-linux-gnu/libLLVM-6.0.so.1 | grep "Program used external"
YES!
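The same check can be done without strings and grep by searching the library file for the raw bytes. A minimal sketch, with a hypothetical helper name find_bytes; it reads the whole file into memory, which is acceptable for a library of this size.

```python
import sys
from pathlib import Path


def find_bytes(path, needle):
    """Return the offset of the first occurrence of `needle` in the file, or -1."""
    return Path(path).read_bytes().find(needle)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_bytes.py /usr/lib/x86_64-linux-gnu/libLLVM-6.0.so.1
    offset = find_bytes(sys.argv[1], b"Program used external")
    print("not found" if offset == -1 else "found at offset 0x%x" % offset)
```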
Analysis
Consider the logic behind this test case: the binary finished parsing and passed the function name to libLLVM, and libLLVM took the name and tried to resolve it from libc. If we search for this message in the LLVM source, we can see that it is emitted by RTDyldMemoryManager::getPointerToNamedFunction; see https://github.com/llvm-mirror/llvm/blob/8b8f8d0ad8a1f837071ccb39fb96e44898350070/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp#L290.
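The lookup described above behaves much like asking the dynamic loader for a symbol by name. The sketch below is an analogy in Python using ctypes on Linux, not LLVM's own code path: it checks whether a name resolves among the libraries already loaded in the current process. The helper name resolve_in_process is hypothetical.

```python
import ctypes


def resolve_in_process(name):
    """Return True if `name` resolves to a symbol loaded in this process."""
    # CDLL(None) is dlopen(NULL) on Linux: the process-wide symbol scope.
    process = ctypes.CDLL(None)
    try:
        getattr(process, name)
    except AttributeError:
        # ctypes raises AttributeError when the symbol lookup fails.
        return False
    return True


if __name__ == "__main__":
    for symbol in ("fib", "puts", "system"):
        state = "resolved" if resolve_in_process(symbol) else "could not be resolved"
        print(symbol, state)
```

A user-defined name such as fib is not exported by any loaded library, which matches the error message, while names like puts and system are found because libc is mapped into the process.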
Then I thought maybe we could call libc functions directly in the same manner. I changed fib to puts, loaded the binary into gdb and fed it the input. I also set a breakpoint at puts, and it did stop at the call.
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
We can see that the first argument, $rdi, is 0x28 = 40, which means we can control the argument too.
Exploit
With this handy ability to call arbitrary libc functions, it should be quite straightforward to get a shell. The binary is protected by PIE, so we don’t know any addresses initially. My solution is to use mmap to get a new region at a known address like 0x100000 and use read to load the /bin/sh string into that memory. Finally we can call system(0x100000) to get a shell.
payload = """
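The three calls described above can be written down as a plan with concrete integer arguments. This is a sketch of the arguments only, under stated assumptions; it is not the payload used in the challenge and says nothing about the Kaleidoscope syntax needed to issue the calls. The constants are the Linux x86-64 values; the one-page length, the use of MAP_FIXED, and passing 0 as the file descriptor are assumptions (Linux ignores the descriptor for anonymous mappings, although -1 is the portable value).

```python
# Linux x86-64 values for the mmap protection and flag bits.
PROT_READ, PROT_WRITE = 0x1, 0x2
MAP_PRIVATE, MAP_FIXED, MAP_ANONYMOUS = 0x02, 0x10, 0x20

ADDR = 0x100000             # region address chosen in the write-up
LENGTH = 0x1000             # ASSUMPTION: one page is enough for the string
COMMAND = b"/bin/sh\x00"    # bytes to send on stdin for read() to store


def call_plan():
    """Return the libc calls as (name, integer arguments) tuples."""
    prot = PROT_READ | PROT_WRITE
    # ASSUMPTION: MAP_FIXED pins the mapping at ADDR instead of treating it as a hint.
    flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED
    return [
        # ASSUMPTION: fd 0 and offset 0; fd is ignored for anonymous mappings on Linux.
        ("mmap", (ADDR, LENGTH, prot, flags, 0, 0)),
        ("read", (0, ADDR, len(COMMAND))),
        ("system", (ADDR,)),
    ]


if __name__ == "__main__":
    for name, args in call_plan():
        # Decimal literals, since the interpreter may not accept hex.
        print("%s(%s)" % (name, ", ".join(str(a) for a in args)))
```

Because the mapping address is chosen by us, PIE and ASLR do not matter for the string: the only addresses the exploit needs are resolved by name through the JIT.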
I spent some time tuning the if-else statements according to the number of arguments so that the interpreter would accept them, but later found that this was unnecessary: any function definition with an if-else statement will be regarded as an external function.
Wrapup
This is the first time I have used fuzzing to solve a challenge during a CTF. As you can see, it is a promising skill in competition and can save plenty of reverse-engineering time. As for interpreter pwn challenges, I have come across several, such as JavaScript, Lua (see another writeup of XNUCA2019), and this Kaleidoscope one; many of them were related to external function calls, that is, the foreign function interface (FFI). So this might be the thing to look at when you meet an interpreter-based pwn challenge.
References
- https://llvm.org/docs/tutorial/BuildingAJIT1.html
- https://github.com/ghaiklor/llvm-kaleidoscope
- https://hub.docker.com/r/zjuchenyuan/honggfuzz
- https://github.com/google/honggfuzz/blob/master/docs/USAGE.md
- https://llvm.org/docs/tutorial/OCamlLangImpl1.html#the-basic-language
- http://blog.leanote.com/post/xp0int/%5BPWN%5D-ls-cpt.shao%E3%80%81MF