Fun with Assembly 3: Deeper into the Assembly Beast

Posted by Hywel Carver on February 12, 2015

This is the third in a series of articles about assembly language. You might want to read the earlier posts in the series before this one.

This post is based around the Sydney level on microcorruption.com. Like last time, we’re trying to find an input to open a lock without knowing the correct password, using knowledge of assembly language. I’m going to explain more about assembly language than last time, and show more of the debugging process.

Diving into the main function

Let’s get straight into the assembly. Here’s the first half of the main function.

4438:  3150 9cff      add #0xff9c, sp
443c:  3f40 b444      mov #0x44b4 "Enter the password to continue.", r15
4440:  b012 6645      call  #0x4566 <puts>
4444:  0f41           mov sp, r15
4446:  b012 8044      call  #0x4480 <get_password>
444a:  0f41           mov sp, r15
444c:  b012 8a44      call  #0x448a <check_password>
4450:  0f93           tst r15

Let’s break that down further, and read it line by line.

add #0xff9c, sp adds 0xff9c to the stack pointer. I’ve talked about integer overflow before, so you should remember that in a 16-bit register this wraps around and actually decreases the value of the stack pointer by 100 bytes (0x10000 - 0xff9c = 0x64, which is 100 in decimal). Stacks grow downwards, so this makes the stack 100 bytes larger for the main function to use.
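If you want to convince yourself of the wrap-around, a few lines of Python will do it. This is only a sketch: the helper name add16 and the starting stack pointer value are made up for illustration, and aren't taken from the debugger.

```python
def add16(a, b):
    """Add two values the way a 16-bit register would, discarding any carry."""
    return (a + b) & 0xFFFF

# Hypothetical starting value for the stack pointer, chosen only for the example.
sp = 0x4400
new_sp = add16(sp, 0xFF9C)

# 0xff9c is the 16-bit two's complement encoding of -100,
# so the stack pointer ends up 100 bytes lower than it started.
print(hex(new_sp), sp - new_sp)
```

Whatever starting value you pick (as long as it's at least 100), the difference comes out as 100.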

mov #0x44b4 "Enter the password to continue.", r15
call  #0x4566 <puts>

These two lines move the value 0x44b4 (which, the debugger helpfully tells us, points to a string saying “Enter the password to continue.”) into register 15, then call the puts function. This is the same as calling puts with the string as an argument, so that it outputs to the console.

mov  sp, r15
call  #0x4480 <get_password>

Move the stack pointer into register 15, then call the function at 0x4480 (again, we’re helpfully told that it’s called get_password). Remember, register 15 is the default place for the first argument for a function, so this is calling the function and passing the stack pointer as the argument.

mov sp, r15
call  #0x448a <check_password>
tst r15

This looks similar to what came before. We call check_password with the stack pointer as an argument. The assembly has to copy it into register 15 again because the value in r15 might have been changed by get_password since we last moved the stack pointer. The last line tests r15, which means it looks at the value in r15 and populates the status register accordingly. If the value is negative, the status register’s “negative” bit will be 1. If the value is zero, the status register’s “zero” bit will be 1. Why does it do that? The answer is in the other half of the main function.

4452:  0520           jnz #0x445e <main+0x26>
4454:  3f40 d444      mov #0x44d4 "Invalid password; try again.", r15
4458:  b012 6645      call  #0x4566 <puts>
445c:  093c           jmp #0x4470 <main+0x38>
445e:  3f40 f144      mov #0x44f1 "Access Granted!", r15
4462:  b012 6645      call  #0x4566 <puts>
4466:  3012 7f00      push  #0x7f
446a:  b012 0245      call  #0x4502 <INT>
446e:  2153           incd  sp
4470:  0f43           clr r15
4472:  3150 6400      add #0x64, sp

The first thing this does is jnz #0x445e <main+0x26>, which jumps to 0x445e if the last result was non-zero. Jumping to an address means that the current path of execution stops, and the next instruction to run is the one that’s jumped to. If the jump doesn’t happen (in this case, when the number is zero), the CPU just carries on with the next line of assembly. The previous line (tst r15) set the bits in the status register according to the value of r15, and jnz checks the zero bit, so the overall effect is to jump to 0x445e if r15 was non-zero after check_password returned.
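Here’s a rough Python model of how tst and jnz work together. It’s a sketch under my own assumptions: the function names tst and jnz_taken are invented, and I’m only modelling the zero and negative bits, not the whole status register.

```python
def tst(value):
    """Return the (zero, negative) flags that testing a 16-bit value would set."""
    value &= 0xFFFF
    zero = value == 0
    negative = bool(value & 0x8000)  # top bit set means negative in two's complement
    return zero, negative

def jnz_taken(zero_flag):
    """jnz jumps only when the zero flag is clear."""
    return not zero_flag

for r15 in (0, 1, 0xFFFF):
    zero, negative = tst(r15)
    print(hex(r15), "zero:", zero, "negative:", negative, "jump:", jnz_taken(zero))
```

Only the first value, 0, leaves the zero flag set, so it’s the only one where the jump is not taken.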

The next two lines output “Invalid password; try again.”, then skip ahead to the end of the function.

If the jump does happen, the CPU will run the lines from 0x445e. First of all, this outputs “Access Granted!”. Then it does this:

push  #0x7f
call  #0x4502 <INT>

This is an interrupt, which is necessary for all attempts to read input from another process, all attempts to send output to another process, or any other operation that interacts with the outside world. We generally call input or output ‘IO’, and operations that stop the program from continuing ‘blocking’ operations.

This interrupt has the argument 0x7f, which is the magic code for “open the lock” (there’s a manual which lists that kind of thing). This argument is passed on the stack, not in a register. For each CPU, rules called the “calling convention” determine whether arguments go on the stack or in registers, and it just so happens that this CPU’s interrupt takes arguments on the stack. The next line, incd sp, cleans up the 2 bytes of stack space that were used by pushing 0x7f (a push on this 16-bit CPU always writes a full 2-byte word, so even though 0x7f fits in 1 byte, it takes up 2 bytes on the stack). incd is short for “increment double”, so it adds 2 to the stack pointer, mirroring the 2 bytes used by the value we pushed.
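The push and incd pairing can be modelled in a few lines of Python. This is a toy sketch, not how the emulator works internally: the Stack class is my own invention, and the starting stack pointer is a made-up value.

```python
class Stack:
    """Toy model of a downward-growing stack of 16-bit words."""

    def __init__(self, sp):
        self.sp = sp
        self.words = {}  # address -> 16-bit word stored there

    def push(self, value):
        # The stack pointer moves down by 2 first, then the word is stored.
        self.sp = (self.sp - 2) & 0xFFFF
        self.words[self.sp] = value & 0xFFFF

    def incd(self):
        # "Increment double": add 2 to the stack pointer.
        self.sp = (self.sp + 2) & 0xFFFF

stack = Stack(sp=0x439C)  # hypothetical starting value
before = stack.sp
stack.push(0x7F)
after_push = stack.sp
stack.incd()
print(hex(before), hex(after_push), hex(stack.sp))
```

After the incd, the stack pointer is back where it was before the push, which is the whole point of the clean-up.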

After the jnz instruction branched code execution, the two branches join again at 0x4470, which does clr r15 and add #0x64, sp. This clears the value of r15, which means this function is returning 0, and then adds 100 on to the stack pointer, to clear up the extra stack space taken by the start of the function.

So everything here hinges on the return value of check_password. If it’s non-zero the lock is opened. If it’s zero, the lock stays shut.
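Pulling the whole function together, here’s roughly what main does, written as a Python sketch. The parameter names (read_password, check_password, unlock, output) are mine, not from the disassembly. They stand in for get_password, check_password, the 0x7f interrupt and puts.

```python
def main(read_password, check_password, unlock, output=print):
    """Sketch of main's control flow; the callables are stand-ins for the real routines."""
    output("Enter the password to continue.")
    password = read_password()
    if check_password(password) != 0:   # tst r15 / jnz
        output("Access Granted!")
        unlock()                        # stands in for the interrupt with argument 0x7f
    else:
        output("Invalid password; try again.")
    return 0                            # clr r15

# Example run with a checker that always says yes.
log = []
result = main(lambda: b"anything", lambda p: 1, lambda: log.append("unlock"), log.append)
print(result, log)
```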

Looking at get_password

The get_password function is simple.

4480:  3e40 6400      mov #0x64, r14
4484:  b012 5645      call  #0x4556 <getsn>
4488:  3041           ret

Put the value 100 in register 14, then call getsn. Remember that this function was called with the stack pointer in register 15? Well that will still be there. So getsn is called with the stack pointer as the first argument, and the value 100 as the second. This second argument instructs getsn to take at most 100 characters (including the null character) as input. The bad news (from our perspective) is that 100 is exactly how much stack space has been allocated for the input. I’m saying this is bad news because, if the stack had only allocated, say, 30 bytes, we’d be able to overwrite 70 bytes of memory which we weren’t meant to get access to.

This function then returns.

Looking at check_password

Here’s the assembly.

448a:  bf90 342b 0000 cmp #0x2b34, 0x0(r15)
4490:  0d20           jnz $+0x1c
4492:  bf90 3529 0200 cmp #0x2935, 0x2(r15)
4498:  0920           jnz $+0x14
449a:  bf90 5a34 0400 cmp #0x345a, 0x4(r15)
44a0:  0520           jne #0x44ac <check_password+0x22>
44a2:  1e43           mov #0x1, r14
44a4:  bf90 3c4c 0600 cmp #0x4c3c, 0x6(r15)
44aa:  0124           jeq #0x44ae <check_password+0x24>
44ac:  0e43           clr r14
44ae:  0f4e           mov r14, r15
44b0:  3041           ret

There’s definitely a pattern here: lots of lines comparing a literal value against something near where r15 points. Remember that r15 is set to the stack pointer before this function is called, and the stack is where the inputted password is. By the way, jnz and jne can be used interchangeably (as can jz and je): all four just look at the zero bit-flag in the status register.

So we compare the first pair of bytes of the input against 0x2b34, the second pair against 0x2935, and the third pair against 0x345a. Whenever a comparison finds the two values aren’t equal, we jump to the same line, 0x44ac. Sometimes the target is written as an offset: from 0x4490, we jump $+0x1c, which still takes the CPU to 0x44ac (0x4490 + 0x1c).
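The offset arithmetic is easy to check. In this small sketch the addresses and offsets are copied from the listing above, and the variable names are just mine.

```python
# Address of each relative jump in check_password, and the offset written after "$+".
relative_jumps = {0x4490: 0x1C, 0x4498: 0x14}

# "$" means the address of the jump instruction itself, so the target is address + offset.
targets = [address + offset for address, offset in relative_jumps.items()]

for address, target in zip(relative_jumps, targets):
    print(hex(address), "->", hex(target))
```

Both jumps land on 0x44ac, the same place the third jump names directly.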

The next lines are

mov #0x1, r14
cmp #0x4c3c, 0x6(r15)
jeq #0x44ae <check_password+0x24>

So we put the value 1 into register 14, then do another comparison, of the 4th pair of bytes against 0x4c3c. This time, we jump if they’re equal - and we only jump over one instruction, skipping the line 0x44ac (which the other jumps aim for).

That line, 0x44ac, resets r14 to 0. So if any of the comparisons are not equal, that line will be hit and r14 will be 0. Otherwise, r14 keeps the value 1. The next line, mov r14, r15, copies the value into register 15, which will be the return value. The last line performs the actual return.
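Here’s the same logic as a Python sketch. I’m assuming the function is handed the four 16-bit values that the CPU reads at offsets 0, 2, 4 and 6 from r15. The argument name words is my own, and this says nothing yet about how those values relate to the bytes you type.

```python
def check_password(words):
    """Mirror the assembly: words[i] is the 16-bit value read at offset 2*i from r15."""
    if words[0] != 0x2B34:
        return 0          # jump to 0x44ac: clr r14, then return it
    if words[1] != 0x2935:
        return 0
    if words[2] != 0x345A:
        return 0
    r14 = 1               # mov #0x1, r14
    if words[3] != 0x4C3C:
        r14 = 0           # the jeq was not taken, so clr r14 runs
    return r14            # mov r14, r15 / ret

print(check_password([0x2B34, 0x2935, 0x345A, 0x4C3C]))
print(check_password([0x2B34, 0x2935, 0x345A, 0x0000]))
```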

The password

We’ve seen that pairs of bytes in the password need to equal 0x2b34, 0x2935, 0x345a and 0x4c3c in turn. So can we just input 0x2b342935345a4c3c as the password, and open the lock?

No is the answer. The reason is endian-ness: the order in which bytes are stored in memory is not the same as the order in which they’re written out. On the CPU we’re using, bytes are stored little-endian within each 16-bit word. That means that a value written in the real world as 0x0102030405060708 would be stored in memory as 0x0201040306050807, with the bytes swapped in every pair.
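Python’s struct module can show the same thing. This is a small sketch: I’m treating the example value as four separate 16-bit words, and the variable names are mine.

```python
import struct

# The example value, split into 16-bit words as we'd write them on paper.
words = (0x0102, 0x0304, 0x0506, 0x0708)

# "<4H" packs four unsigned 16-bit values, each one little-endian.
in_memory = struct.pack("<4H", *words)

print(in_memory.hex())  # 0201040306050807
```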

There’s no great trick to this, it’s just something you have to be aware of whenever you’re converting from real-world values to values in memory. Although there is a bit of a hint here from comparing the assembled version of the program with the disassembled version. E.g.

448a:  bf90 342b 0000 cmp #0x2b34, 0x0(r15)

The assembled version on the left shows 342b, which is displayed as 0x2b34 on the right. That’s as much a hint as you’ll ever get.

To find the real password, we have to swap the bytes in every pair of the password we had above. 0x2b342935345a4c3c becomes 0x342b35295a343c4c.
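If you’d rather not swap bytes by hand, the conversion can be sketched in Python. The constant name EXPECTED_WORDS is mine, and the values are the four literals from the cmp instructions in check_password.

```python
import struct

# The four literals compared against offsets 0, 2, 4 and 6 of the input.
EXPECTED_WORDS = (0x2B34, 0x2935, 0x345A, 0x4C3C)

# Lay each word out in memory order (little-endian) to get the bytes to type in.
password = struct.pack("<4H", *EXPECTED_WORDS)
print(password.hex())  # 342b35295a343c4c

# Reading the bytes back as little-endian words gives the literals again,
# which is what the cmp instructions will see.
print([hex(word) for word in struct.unpack("<4H", password)])
```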

And with that, the lock opens.

Endnotes

This hack was similar to the last one. The lock had a master password which was hard-coded in the program. The only extra challenge was that it wasn’t hard-coded as a readable string, but in the instructions themselves.

Next time, we’ll look at using a buffer overflow to open a lock by writing to memory we weren’t meant to access.

Thanks to Tom Carver for reading a draft of this.