Protostar 0x0E - Heap3 pt 1

Prev: 0x0D - Heap2
Next: 0x0F - Heap3 pt 2

The introduction for this level explains that the malloc implementation used in this VM is Doug Lea's malloc (called "dlmalloc"). This allocator will need to be exploited, specifically by modifying its metadata, in order to change program execution.


The program seems innocent enough: three variables have 32 bytes of memory allocated for them on the heap, then three arguments are placed into them respectively, and then they are freed in reverse order. Fortunately, we have the clue about "dlmalloc" from the instructions to guide our efforts, although we have to remember that different versions of this allocator likely exist.

The source code for this system's "dlmalloc" can be found here.

At this point, if you are not familiar with how "dlmalloc" works, you will first want to familiarize yourself with Doug Lea's own explanation of his allocator, specifically the "Algorithms" section. However, Doug Lea's explanation is generally pretty light, so I recommend Azeria's heap articles for a more holistic understanding: "malloc" and "free/bins."

From this point forward, it will be assumed that you have a general understanding of dlmalloc, particularly the internals. You will need to note that an allocated chunk looks like this:


    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if allocated            | |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                         |P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_space() bytes)                     .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk                                     |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+




While a freed chunk looks like this:


    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk                            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                         |P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


As you can see, this heap implementation makes used of doubly-linked lists, in which free chunks point to one another, allowing "malloc" to quickly scan for suitably-sized chunks. As you might imagine, these pointers are written, rewritten, and modified at various places, but the most infamous is within the "unlink" macro:


/* Take a chunk off a bin list */
#define unlink(P, BK, FD) {
  FD = P->fd;
  BK = P->bk;
  FD->bk = BK;
  BK->fd = FD;
}


If we understand "P" to mean "(free) chunk pointer" and "fd" and "bk" to stand for "forward" and "backward" pointers respectively to other free chunks on a linked list (as noted in the above chunk metadata), then we can clearly see that an arbitrary write is achievable if we are able to control these chunk metadata values via an overwrite.

If we look ahead in the source code, we can see where this "unlink" macro is used: in "free," which is the complement to "malloc" that allows a program to reuse previously allocated memory. More specifically, "unlink" is used within "free" to consolidate the chunk being freed with either of its nearby chunks if they are free as well. The chunk (whether consolidated with its neighbors or not) is then stored in an "unsorted bin," which is a special bin class for temporary storage of freed chunks which reduces the allocator's overhead.

Here is the relevant source code for the instance of the "unlink" macro in "free" which we will aim to exploit. This is near the beginning of "free:"


/*
  ------------------------------ free ------------------------------
*/
... ... ...
/*
      If eligible, place chunk on a fastbin so it can be found
      and used quickly in malloc.
    */
    if ((unsigned long)(size) <= (unsigned long)(av->max_fast)
... ... ...

    /*
       Consolidate other non-mmapped chunks as they arrive.
    */

    else if (!chunk_is_mmapped(p)) {
      nextchunk = chunk_at_offset(p, size);
      nextsize = chunksize(nextchunk);

      /* consolidate backward */
      if (!prev_inuse(p)) {
        prevsize = p->prev_size;
        size += prevsize;
        p = chunk_at_offset(p, -((long) prevsize));
        unlink(p, bck, fwd);
      }
So, putting all of the pieces together, by reading the above source code with the added context from the aforementioned articles, we can understand the workings of "free:"

  1. Check if the chunk is a fastbin by seeing whether the size is less than or equal to 80 bytes. If it is not a fastbin, continue with these steps. (red text above)
  2. Check if the chunk is mmapped via the 0x2 bit of “size,” which is the second doubleword of the chunk metadata. If it is not set, continue with these steps. (green text above)
  3. Check the prev_inuse 0x1 bit of “size” to determine whether the chunk preceding it in memory is also free. If it is, continue with these steps. (blue text above)
  4. The allocator will now try to coalesce the two chunks. First, it needs to reference the previous chunk. It does this by calculating the previous chunk’s pointer by subtracting “prev_size” (the first doubleword of the chunk metadata) from the current chunk’s pointer. 
  5. It will call the “unlink” macro (orange text above) on the previous chunk (using the calculated pointer from step 4) to attempt to remove it from its current linked list. As we know, the previous chunk is of course not actually free in this case - or may not even exist depending on what pointer we use - and thus it is not truly on a linked list as the allocator expects. However, this doesn't matter, because we will provide what the allocator expects to see.
  6. “unlink” will operate as listed in the source code above to generate our arbitrary write: the forward pointer plus an offset (0xC) will equal the backward pointer, while the backward pointer plus an offset (0x8) will equal the forward pointer. Thus, if we set the forward pointer to Address A minus 0xC, and set the backward pointer to Address B, the value at Address A will be overwritten with Address B.
None of this seems too difficult in theory, especially if we are already to overwrite all of this metadata with "strcpy!" However, that is not exactly the case here - "strcpy" ceases copying when it hits null bytes, which means we won't be able to overwrite the "prev_size" and "size" internals of the chunk metadata unless we use massive values! 

At this point, we are very close, and we just need to get a bit clever to work around this issue. However, first make sure you have a solid understanding of everything discussed up to this point, because it will get tricky.

Prev: 0x0D - Heap2
Next: 0x0F - Heap3 pt 2