This is the third in a series of articles about assembly language. You might want to read the two earlier posts before this one.
This post is based around the Sydney level on microcorruption.com. Like last time, we’re trying to find an input to open a lock without knowing the correct password, using knowledge of assembly language. I’m going to explain more about assembly language than last time, and show more of the debugging process.
Diving into the main function
Let’s get straight into the assembly. Here’s the first half of the main function.
4438: 3150 9cff add #0xff9c, sp
443c: 3f40 b444 mov #0x44b4 "Enter the password to continue.", r15
4440: b012 6645 call #0x4566 <puts>
4444: 0f41 mov sp, r15
4446: b012 8044 call #0x4480 <get_password>
444a: 0f41 mov sp, r15
444c: b012 8a44 call #0x448a <check_password>
4450: 0f93 tst r15
Let’s break that down further, and read it line by line.
add #0xff9c, sp
adds to the stack pointer. I’ve talked about integer overflow before, so you should remember that, in a 16-bit register, this will actually decrease the value of the stack pointer by 100 bytes (0x10000 - 0xff9c = 0x64 = 100). Stacks grow downwards, so this is making the stack larger by 100 bytes for the main function to use.
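To see the wraparound for yourself, here is a minimal Python sketch of 16-bit addition. The starting stack pointer 0x4400 is a made-up value for illustration, not something read from the debugger, and add16 is just a helper name I've invented.

```python
MASK_16 = 0xFFFF  # registers on this CPU are 16 bits wide


def add16(a: int, b: int) -> int:
    """Add two values and keep only the low 16 bits, as a 16-bit register would."""
    return (a + b) & MASK_16


sp = 0x4400  # hypothetical starting stack pointer, chosen for illustration
new_sp = add16(sp, 0xFF9C)

print(hex(new_sp))   # 0x439c
print(sp - new_sp)   # 100
```

Adding 0xff9c and throwing away the carry gives the same result as subtracting 0x64.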
mov #0x44b4 "Enter the password to continue.", r15
call #0x4566 <puts>
will move the value 0x44b4 (which, the debugger helpfully tells us, points to a string saying “Enter the password to continue.”) into register 15, then call the puts function. This is the same as calling puts with the string as an argument, so that it outputs to the console.
mov sp, r15
call #0x4480 <get_password>
Move the stack pointer into register 15, then call the function at 0x4480 (again, we’re helpfully told that it’s called get_password). Remember, register 15 is the default place for the first argument for a function, so this is calling the function and passing the stack pointer as the argument.
mov sp, r15
call #0x448a <check_password>
tst r15
This looks similar to what came before. We call check_password with the stack pointer as an argument. The assembly has to copy it into register 15 again because the value in r15 might have been changed by get_password since we last moved the stack pointer. The last line tests r15, which means it will look at the value in r15 and populate the status register accordingly. If the value is negative, the status register’s “negative” bit will be 1. If the value is zero, the status register’s “zero” bit will be 1. Why does it do that? The answer is in the other half of the main function.
4452: 0520 jnz #0x445e <main+0x26>
4454: 3f40 d444 mov #0x44d4 "Invalid password; try again.", r15
4458: b012 6645 call #0x4566 <puts>
445c: 093c jmp #0x4470 <main+0x38>
445e: 3f40 f144 mov #0x44f1 "Access Granted!", r15
4462: b012 6645 call #0x4566 <puts>
4466: 3012 7f00 push #0x7f
446a: b012 0245 call #0x4502 <INT>
446e: 2153 incd sp
4470: 0f43 clr r15
4472: 3150 6400 add #0x64, sp
The first thing this does is jnz #0x445e <main+0x26>, which jumps to 0x445e if the last result was non-zero. Jumping to an address means that the current path of execution will stop, and the next instruction will be the one that’s jumped to. If the jump doesn’t happen (in this case, when the number is zero), it just carries on with the next line of assembly. The previous line (tst r15) tested register 15, which sets the bits in the status register depending on whether r15 is zero, negative, and so on. The line we’re currently looking at uses jnz to check the value of the zero bit in the status register, so the overall effect is to test whether the contents of register 15 are 0. In this situation, we will jump to 0x445e if r15 was non-zero after check_password returned.
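Here is a rough Python sketch of how tst and jnz work together. It only models the zero and negative bits, and the function names and the dictionary of flags are my own invention for illustration. The real status register has more bits than this.

```python
def tst(value: int) -> dict:
    """Return the zero and negative flags for a 16-bit value, as tst would set them."""
    value &= 0xFFFF
    return {"zero": value == 0, "negative": bool(value & 0x8000)}


def next_address_after_jnz(flags: dict, target: int, fallthrough: int) -> int:
    """jnz jumps to target when the zero flag is clear, otherwise it falls through."""
    return fallthrough if flags["zero"] else target


# 0x445e is the jump target and 0x4454 is the instruction after the jnz.
print(hex(next_address_after_jnz(tst(1), 0x445E, 0x4454)))  # 0x445e
print(hex(next_address_after_jnz(tst(0), 0x445E, 0x4454)))  # 0x4454
```

A non-zero r15 takes us to the “Access Granted!” branch, and a zero r15 carries on to the “Invalid password” lines.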
The next two lines output “Invalid password; try again.”, then skip ahead to the end of the function.
If the jump does happen, the CPU will run the lines from 0x445e. First of all, this outputs “Access Granted!”. Then it does this:
push #0x7f
call #0x4502 <INT>
This is a hardware interrupt, which is necessary for all attempts to read input from another process, all attempts to send output to another process, or any other operation that interacts with the outside world. We generally call input or output ‘IO’, and operations that stop the program from continuing ‘blocking’ operations.
This interrupt has the argument 0x7f, which is the magic code for “open the lock” (there’s a manual which lists that kind of thing). This argument is passed on the stack, not in a register (for each CPU, rules called the “calling convention” determine whether arguments are on the stack or in registers; it just so happens that this CPU’s interrupt takes arguments on the stack). The next line, incd sp, is to clear up the 2 bytes of stack space that were used by pushing the value 0x7f onto the stack (a push on this 16-bit CPU always takes a whole 2-byte word, so even though 0x7f fits in 1 byte, it takes up 2 bytes on the stack). incd is short for “increment double”, so it will add 2 to the stack pointer (mirroring the 2 bytes used by the value we pushed).
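The push and incd pair can be modelled in a few lines of Python. This is only a sketch: the memory array, the helper names and the starting stack pointer 0x4400 are all assumptions made for illustration.

```python
memory = bytearray(0x10000)  # a flat 64 KiB address space, all zeros


def push(sp: int, value: int) -> int:
    """Move the stack pointer down by 2 and store value as a 16-bit word there."""
    sp = (sp - 2) & 0xFFFF
    memory[sp] = value & 0xFF             # low byte goes at the lower address
    memory[sp + 1] = (value >> 8) & 0xFF  # high byte goes at the higher address
    return sp


def incd(sp: int) -> int:
    """Add 2 to the stack pointer, releasing one 16-bit word."""
    return (sp + 2) & 0xFFFF


sp = 0x4400                 # hypothetical stack pointer before the push
sp_after_push = push(sp, 0x7F)
print(hex(sp_after_push))   # 0x43fe
print(hex(incd(sp_after_push)))  # 0x4400
```

The stack pointer ends up exactly where it started, which is the whole point of the incd.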
After the jnz instruction branched code execution, the two branches join again at 0x4470, which does clr r15 and add #0x64, sp. This clears the value of r15, which means this function is returning 0, and then adds 100 on to the stack pointer, to clear up the extra stack space taken by the start of the function.
So everything here hinges on the return value of check_password. If it’s non-zero the lock is opened. If it’s zero, the lock stays shut.
Looking at get_password
The get_password function is simple.
4480: 3e40 6400 mov #0x64, r14
4484: b012 5645 call #0x4556 <getsn>
4488: 3041 ret
Put the value 100 in register 14, then call getsn. Remember that this function was called with the stack pointer in register 15? Well, that will still be there. So getsn is called with the stack pointer as the first argument, and the value 100 as the second. This second argument instructs getsn to take at most 100 characters (including the null character) as input. The bad news (from our perspective) is that 100 is exactly how much stack space has been allocated for the input. I’m saying this is bad news because, if main had only allocated, say, 30 bytes, we’d be able to overwrite 70 bytes of memory which we weren’t meant to get access to.
This function then returns.
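The arithmetic behind that “bad news” is simple enough to write down. The helper below is hypothetical (it is not part of the lock’s code); it just compares how much space was allocated with how much input is allowed.

```python
def bytes_past_buffer(allocated: int, read_limit: int) -> int:
    """Return how many bytes of input could land beyond the allocated space."""
    return max(0, read_limit - allocated)


print(bytes_past_buffer(allocated=100, read_limit=100))  # 0
print(bytes_past_buffer(allocated=30, read_limit=100))   # 70
```

With 100 bytes allocated and a limit of 100, there is nothing left over for us to overwrite.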
Looking at check_password
Here’s the assembly.
448a: bf90 342b 0000 cmp #0x2b34, 0x0(r15)
4490: 0d20 jnz $+0x1c
4492: bf90 3529 0200 cmp #0x2935, 0x2(r15)
4498: 0920 jnz $+0x14
449a: bf90 5a34 0400 cmp #0x345a, 0x4(r15)
44a0: 0520 jne #0x44ac <check_password+0x22>
44a2: 1e43 mov #0x1, r14
44a4: bf90 3c4c 0600 cmp #0x4c3c, 0x6(r15)
44aa: 0124 jeq #0x44ae <check_password+0x24>
44ac: 0e43 clr r14
44ae: 0f4e mov r14, r15
44b0: 3041 ret
There’s definitely a pattern here: lots of lines comparing a literal value against something near where r15 points. Remember that r15 is set to the stack pointer before this function is called, and the stack is where the inputted password is. By the way, jnz and jne can be used interchangeably (as can jz and je); all four look at the zero flag in the status register.
So we compare the first pair of bytes of the input against 0x2b34, the second pair with 0x2935, and the third pair with 0x345a. If ever the two things aren’t equal, we jump to the same line, 0x44ac. Sometimes the target is written as an offset: from 0x4490 we jump $+0x1c, which still takes the CPU to 0x44ac (0x4490 + 0x1c = 0x44ac).
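If you want to check the offsets, this small Python sketch adds each offset to the address of its jump instruction. The function name is made up, and it assumes that $ means the address of the jump itself, which matches the addresses in the listing.

```python
def jump_target(instruction_address: int, offset: int) -> int:
    """Resolve a $+offset jump relative to the address of the jump instruction."""
    return (instruction_address + offset) & 0xFFFF


print(hex(jump_target(0x4490, 0x1C)))  # 0x44ac
print(hex(jump_target(0x4498, 0x14)))  # 0x44ac
```

Both relative jumps land on the same line as the explicit jne #0x44ac.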
The next lines are
mov #0x1, r14
cmp #0x4c3c, 0x6(r15)
jeq #0x44ae <check_password+0x24>
So we put the value 1 into register 14, then do another comparison, of the 4th pair of bytes against 0x4c3c. This time, we jump if they’re equal, and we only jump over one instruction, skipping the line 0x44ac (which the other jumps aim for).
That line, 0x44ac, resets r14 to 0. So if any of the comparisons are not equal, that line will be hit and r14 will be 0. Otherwise, r14 stays at 1. The next line, mov r14, r15, copies the value into register 15, which will be the return value. The last line performs the actual return.
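Putting that together, here is roughly what check_password does, sketched in Python rather than assembly. It is my own reconstruction, not the lock’s source code, and it assumes each 16-bit word is read from memory low byte first, as this CPU does.

```python
import struct

# The four literal values from the cmp instructions, in order.
EXPECTED_WORDS = (0x2B34, 0x2935, 0x345A, 0x4C3C)


def check_password(buffer: bytes) -> int:
    """Return 1 if the first four 16-bit words of buffer match, otherwise 0."""
    for index, expected in enumerate(EXPECTED_WORDS):
        # "<H" reads an unsigned 16-bit word, low byte first.
        (word,) = struct.unpack_from("<H", buffer, index * 2)
        if word != expected:
            return 0  # the equivalent of jumping to clr r14
    return 1


print(check_password(bytes(100)))  # 0
```

An all-zero buffer fails on the very first comparison, just as the first jnz would jump straight to the clr r14 line.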
The password
We’ve seen that pairs of bytes in the password need to equal 0x2b34, 0x2935, 0x345a and 0x4c3c in turn. So can we just input 0x2b342935345a4c3c as the password, and open the lock?
No is the answer. The reason is to do with endianness: the order in which bytes are stored in memory is not the same as the order in which they’re written out. On the CPU we’re using, bytes are stored little-endian within each 16-bit word, so the low byte of each word comes first. That means that a sequence of 16-bit words written in the real world as 0x0102030405060708 would be stored in memory as 0x0201040306050807, with the bytes swapped in every pair.
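Python’s struct module can show the same thing. In this sketch the "<H" format means “unsigned 16-bit word, little-endian”; the variable names are just for illustration.

```python
import struct

# One 16-bit word: the low byte (0x02) is stored first.
print(struct.pack("<H", 0x0102).hex())  # 0201

# Four words in a row: each pair of bytes is swapped, but the words stay in order.
words = (0x0102, 0x0304, 0x0506, 0x0708)
print(struct.pack("<4H", *words).hex())  # 0201040306050807
```

Note that only the bytes inside each word are swapped. The order of the words themselves doesn’t change.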
There’s no great trick to this, it’s just something you have to be aware of whenever you’re converting from real-world values to values in memory. There is a bit of a hint here, though, from comparing the assembled version of the program with the disassembled version. E.g.
448a: bf90 342b 0000 cmp #0x2b34, 0x0(r15)
The assembled version on the left shows 342b, which is displayed as 0x2b34 on the right. That’s as much a hint as you’ll ever get.
To find the real password, we have to swap the bytes in every pair of the password we had above. 0x2b342935345a4c3c becomes 0x342b35295a343c4c.
And with that, the lock opens.
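If you don’t fancy swapping bytes by hand, a few lines of Python will do it. This is a sketch with a helper name I’ve made up; it takes a hex string without the 0x prefix and reverses each pair of bytes.

```python
def swap_byte_pairs(hex_string: str) -> str:
    """Swap the two bytes in every 16-bit pair of a hex string."""
    data = bytes.fromhex(hex_string)
    swapped = bytearray()
    for i in range(0, len(data), 2):
        swapped += data[i:i + 2][::-1]  # reverse each 2-byte pair
    return swapped.hex()


print(swap_byte_pairs("2b342935345a4c3c"))  # 342b35295a343c4c
```

Running it twice gets you back to where you started, since swapping a pair twice undoes the swap.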
Endnotes
This hack was similar to the last one. The lock had a master password which was hard-coded in the program. The only extra challenge was that it wasn’t hard-coded as a readable string, but hard-coded in the instructions.
Next time, we’ll look at using a buffer overflow to open a lock by writing to memory we weren’t meant to access.
Thanks to Tom Carver for reading a draft of this.