Getting started with Binary reverse engineering: an example

Posted on Sat 13 January 2018 in binary reverse, IT, security

For a challenge in a university security class, I was given this file to crack: reverse1. I started with reverse0, which was considerably easier than the second one. In this post I will briefly explain how I tackled reverse1. I have provided the files so you can try on your own and then come back for hints if you get stuck! If you are fairly new to this business, as I am, I advise you to start with reverse0 and crack that first.

Hashes of reverse1 file: 
MD5 - c22c985acb7ca0f373b7279138213158
SHA256 - cd56541a75657630a2a0c23724e55f70e7f4f77300faf18e8228cd2cffe8248e
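
If you downloaded the file and want to check it against the hashes above, a minimal Python sketch could look like this. The function names and the ./reverse1 path in the usage note are my own choices for illustration; only hashlib from the standard library is used.

```python
import hashlib

# Hashes published above for reverse1.
EXPECTED_MD5 = "c22c985acb7ca0f373b7279138213158"
EXPECTED_SHA256 = "cd56541a75657630a2a0c23724e55f70e7f4f77300faf18e8228cd2cffe8248e"


def file_hashes(path, chunk_size=65536):
    """Return the (md5, sha256) hex digests of the file at `path`."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file never has to sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def matches_published_hashes(path):
    """True only if both digests equal the values listed in this post."""
    return file_hashes(path) == (EXPECTED_MD5, EXPECTED_SHA256)
```

Calling matches_published_hashes("./reverse1") should return True if your copy is the same file I worked on.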

Disassembling and hoping for the best

The first thing I did was to disassemble the file with Radare to have a look at the code.

# In a terminal: open the binary and run the auto-analysis (-A)
r2 -A ./reverse1
# In Radare: seek to main, then switch to visual mode
s main
Vp

[Screenshot: radare]

The assembly is quite jumbled, and difficult to analyse as a whole. A quick look tells us that trying to crack the file just by reversing the assembly from top to bottom is no easy task, and actually a silly idea to begin with. There's a loop after the password is read from standard input, then some other instructions, then another loop... it's difficult to see what is going on.

Instead, let's look for the section that prints Bad password, and see what has to happen for the code to jump there. If we are lucky, we may find a handful of final checks that send execution to the Bad password section. If we can find those, we can then study those bits of assembly to understand how to avoid ending up there.

Scroll down enough, and down at the bottom I can see the Bad password part, starting at 0x080484f0.

[Screenshot: Radare, the Bad password section at 0x080484f0]

Radare helpfully draws two different arrows going into this address. The related comparisons are the following:

[Screenshot: Radare, the two comparisons]

So at this point, we want to make sure that esi is zero and that the current byte at the address pointed to by edi is zero as well. If those conditions are not met, we'll see the Bad password message.

So we should ask: where is esi changed throughout the execution? What's in esi? What's in edi? What is important to realize is that answers to these questions are the only thing we care about at the moment. The program may be doing very weird things at the beginning, but, as of now, the only important things to crack it are what's in esi and in edi. Period. Don't overshoot!

So, it turns out the answer to these questions can be found in the lines just above the checks on edi and esi, starting at 0x08048483.

[Screenshot: Radare, the code starting at 0x08048483]

Here, esi is first set to zero, and edi contains the pointer to the user-entered password. Then there's a loop, in which edi (i.e. the entered password) is read byte by byte. Each iteration features a call to strtol (convert string to long integer) and strlen (get string length). We see that esi is incremented by the results of xors involving the function return values, and we assume each xor result should be zero, but who knows what the password is? You may try to debug the program to see what the inputs and outputs of those calls are, but it's really going to be a pain.
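
To make the loop easier to reason about, here is a rough Python model of what it appears to do. This is a sketch based on the description above, not a decompilation: the names accumulate_mismatch, expected and password are mine, and I assume each strtol result is xored with exactly one input byte.

```python
def accumulate_mismatch(expected, password):
    """Model of the loop: esi starts at 0 and grows by (expected ^ input) per byte.

    `expected` stands in for the values returned by strtol (unknown at this
    point); `password` is the user input as bytes.
    """
    esi = 0
    for want, got in zip(expected, password):
        # The XOR of two byte values is 0 when they are equal and positive
        # otherwise, so one wrong byte leaves esi non-zero for good.
        esi += want ^ got
    return esi
```

In this model, accumulate_mismatch([1, 2, 3], bytes([1, 2, 3])) is 0, while changing any single input byte gives a positive number. That is why the two final checks are enough to tell whether every byte matched.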

ltrace to the rescue

Instead, using ltrace can save your life:

[Screenshot: ltrace output]

First, try running ltrace several times with different input passwords. The output is always the same: the same calls are made over and over, regardless of the password we enter!

Now we can dive into understanding the assembly we saw earlier, the one featuring the calls to strtol and strlen.

[Screenshot: Radare, the calls to strtol and strlen]

From ltrace, we see that the first strtol call returns 97, which goes into eax (as return values usually do). That is then xored with local_1ch, which contains the first char of the input password. The result is used to increment esi. So here we go! This is vital! We want esi to be 0 at the end, so it should never be incremented. The result of the xor should thus be zero, which means that the two xor operands should be the same. Eureka! We know the output of strtol is 97, so we should provide that same value as the first char of the password! 97 is 'a' in ASCII, so the password starts with a.
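
The xor argument can be checked mechanically. The snippet below is only an illustration of the reasoning, with variable names of my own choosing: it tries every possible byte value and keeps the ones whose xor with 97 is zero.

```python
STRTOL_RESULT = 97  # value reported by ltrace for the first strtol call

# Try every possible byte value and keep those whose XOR with 97 is zero.
candidates = [byte for byte in range(256) if STRTOL_RESULT ^ byte == 0]

# Exactly one byte survives, and it is the strtol result itself.
first_char = chr(candidates[0])
```

Only one candidate is left, 97 itself, and chr(97) is 'a'.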

[Screenshot: radare]

The same reasoning can be applied to the subsequent strtol calls, using the details provided by ltrace, to find that the second char is s, then c, then i and finally i again.
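
Turning the return values into characters is a one-liner in Python. Note the assumption: only the 97 is quoted from ltrace in this post, the other four numbers are simply the ASCII codes of the characters named above.

```python
# 97 comes from the ltrace output quoted earlier; the other four values are
# the ASCII codes of s, c, i, i (assumed here, not copied from ltrace).
strtol_results = [97, 115, 99, 105, 105]

# x ^ y == 0 only when x == y, so each password byte equals the strtol result.
password = bytes(strtol_results).decode("ascii")
```

With these values, password is the five characters a, s, c, i, i joined together.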

What are the strlen calls for? They are used as the loop condition: as soon as a string longer than 3 chars is found, the loop ends and the two conditions on esi and edi are evaluated. The byte condition on edi basically checks that the entered password is exactly the expected one, and does not merely start with it. That's why it checks for a zero byte, meaning string termination.
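
Here is a small Python model of the two final checks, to show why the zero-byte test matters. It is a sketch under my own assumptions: passes_final_checks is a made-up name, the expected bytes are the characters found above, and the strlen-based loop condition is simplified to "compare one input byte per expected byte".

```python
def passes_final_checks(user_input, expected=b"ascii"):
    """Model of the two checks that guard the Bad password branch."""
    buffer = user_input + b"\x00"  # C strings end with a zero byte
    # esi: sum of XORs over the compared bytes. For an input that is too
    # short, the terminating zero byte is compared and makes esi non-zero.
    esi = sum(want ^ got for want, got in zip(expected, buffer))
    # edi: after the loop it points just past the compared bytes.
    next_byte = buffer[len(expected)] if len(buffer) > len(expected) else 0
    return esi == 0 and next_byte == 0
```

In this model the exact password passes, an input with extra characters appended fails on the zero-byte check, and an input that is too short fails on the esi check.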

And there you go: password cracked!

[Screenshot: terminal]