This is the third in a series of articles about assembly language. You might want to read the earlier posts in the series before this one.
This post is based around the Sydney level on microcorruption.com. Like last time, we’re trying to find an input to open a lock without knowing the correct password, using knowledge of assembly language. I’m going to explain more about assembly language than last time, and show more of the debugging process.
Diving into the main function
Let’s get straight into the assembly. Here’s the first half of the main function.
4438: 3150 9cff       add #0xff9c, sp
443c: 3f40 b444       mov #0x44b4 "Enter the password to continue.", r15
4440: b012 6645       call #0x4566 <puts>
4444: 0f41            mov sp, r15
4446: b012 8044       call #0x4480 <get_password>
444a: 0f41            mov sp, r15
444c: b012 8a44       call #0x448a <check_password>
4450: 0f93            tst r15
Let’s break that down further, and read it line by line.
add #0xff9c, sp adds 0xff9c to the stack pointer. I’ve talked about integer overflow before, so you should remember that, because registers are only 16 bits wide, this will actually decrease the value of the stack pointer by 100 bytes (0x10000 - 0xff9c = 0x64, which is 100 in decimal). Stacks grow downwards, so this is making the stack larger by 100 bytes for the main function to use.
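To make the wrap-around concrete, here is a minimal Python sketch of a 16-bit addition. The starting stack pointer value 0x4400 and the helper name add16 are assumptions made up for illustration; they are not taken from the level.

```python
MASK_16 = 0xFFFF  # registers on this CPU hold 16 bits


def add16(a, b):
    """Add two values the way a 16-bit register would, discarding any carry."""
    return (a + b) & MASK_16


sp = 0x4400  # hypothetical starting stack pointer, chosen only for illustration
new_sp = add16(sp, 0xFF9C)

print(hex(new_sp))  # 0x439c
print(sp - new_sp)  # 100
```

Adding 0x64 back with the same helper returns the original value, which is what the matching add #0x64, sp at the end of the function relies on.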
mov #0x44b4 "Enter the password to continue.", r15
call #0x4566 <puts>
These two lines move the value 0x44b4 (which, the debugger helpfully tells us, points to a string saying “Enter the password to continue.”) into register 15, then call the puts function. This is the same as calling puts with the string as an argument, so that it outputs to the console.
mov sp, r15
call #0x4480 <get_password>
Move the stack pointer into register 15, then call the function at 0x4480 (again, we’re helpfully told that it’s called get_password). Remember, register 15 is the default place for the first argument for a function, so this is calling the function and passing the stack pointer as the argument.
mov sp, r15
call #0x448a <check_password>
tst r15
This looks similar to what came before. We call check_password with the stack pointer as an argument. The assembly has to copy it into register 15 again because the value in r15 might have been changed by get_password since we last moved the stack pointer. The last line tests r15, which means it will look at the value in r15 and populate the status register accordingly. If the value is negative, the status register’s “negative” bit will be 1. If the value is zero, the status register’s “zero” bit will be 1. Why does it do that? The answer is in the other half of the main function.
4452: 0520            jnz #0x445e <main+0x26>
4454: 3f40 d444       mov #0x44d4 "Invalid password; try again.", r15
4458: b012 6645       call #0x4566 <puts>
445c: 093c            jmp #0x4470 <main+0x38>
445e: 3f40 f144       mov #0x44f1 "Access Granted!", r15
4462: b012 6645       call #0x4566 <puts>
4466: 3012 7f00       push #0x7f
446a: b012 0245       call #0x4502 <INT>
446e: 2153            incd sp
4470: 0f43            clr r15
4472: 3150 6400       add #0x64, sp
The first thing this does is jnz #0x445e <main+0x26>, which jumps to 0x445e if we have a non-zero number. Jumping to an address means that the current path of execution will stop, and the next instruction will be the one that’s jumped to. If the jump doesn’t happen (in this case, when the number is zero), it just carries on with the next line of assembly. The previous line (tst r15) tested register 15, which sets the bits in the status register depending on whether r15 is zero, negative, etc. The line we’re currently looking at uses jnz to check the value of the zero bit in the status register, so the overall effect is to test whether the contents of register 15 are 0. In this situation, we will jump to 0x445e if r15 was non-zero after check_password returned.
If the jump doesn’t happen, the next two lines output “Invalid password; try again.”, and the jmp after them skips ahead to the end of the function.
If the jump does happen, the CPU will run the lines from 0x445e. First of all, this outputs “Access Granted!”. Then it does this:
push #0x7f
call #0x4502 <INT>
This triggers an interrupt, which is necessary for all attempts to read input from another process, all attempts to send output to another process, or any other operation that interacts with the outside world. We generally call input or output ‘IO’, and operations that stop the program from continuing ‘blocking’ operations.
This interrupt has the argument 0x7f, which is the magic code for “open the lock” (there’s a manual which lists that kind of thing). This argument is passed on the stack, not in a register (for each CPU, rules called the “calling convention” determine whether arguments are on the stack or in registers - it just so happens that this CPU’s interrupt takes arguments on the stack). The next line, incd sp, clears up the 2 extra bytes of stack space that were used by pushing the value 0x7f onto the stack (a push always takes up a full 16-bit word on this CPU, so even though 0x7f fits in 1 byte, it takes up 2 bytes on the stack). incd is short for “increment double”, so it will add 2 to the stack pointer (mirroring the 2 bytes used by the value we pushed).
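The push and incd pair can be sketched in Python as a tiny stack model. The Stack class and the starting address 0x439c are hypothetical names and values chosen for illustration, and the model only tracks the stack pointer and the words written; it is not a full emulator.

```python
class Stack:
    """A toy model of a downward-growing stack of 16-bit words."""

    def __init__(self, sp):
        self.sp = sp      # stack pointer (assumed starting value)
        self.memory = {}  # address -> 16-bit word stored there

    def push(self, value):
        # A push first moves the stack pointer down by one word (2 bytes),
        # then stores the value at the new top of the stack.
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self.sp] = value & 0xFFFF

    def incd_sp(self):
        # incd sp adds 2, releasing the word that was pushed.
        self.sp = (self.sp + 2) & 0xFFFF


stack = Stack(0x439C)
stack.push(0x7F)
print(hex(stack.sp))  # 0x439a
stack.incd_sp()
print(hex(stack.sp))  # 0x439c
```

The value 0x7f is still sitting in memory after incd sp runs; only the stack pointer moves, so the space is simply treated as free again.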
After the jnz instruction branched code execution, the two branches join again at 0x4470, which does clr r15 and add #0x64, sp. This clears the value of r15, which means this function is returning 0, and then adds 100 to the stack pointer, to clear up the extra stack space taken at the start of the function.
So everything here hinges on the return value of check_password. If it’s non-zero, the lock is opened. If it’s zero, the lock stays shut.
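As a rough sketch of that decision, the tst r15 and jnz pair can be modelled in Python like this. The function names tst and main_after_check are made up for illustration, and the flag model is simplified to the zero and negative bits only.

```python
def tst(value):
    """Return the zero and negative flags a 16-bit test of `value` would set."""
    value &= 0xFFFF
    return {
        "zero": value == 0,
        "negative": bool(value & 0x8000),  # top bit set means negative
    }


def main_after_check(check_password_result):
    """Model what main does with the value check_password left in r15."""
    flags = tst(check_password_result)
    if not flags["zero"]:
        # jnz is taken: this is the branch starting at 0x445e
        return "Access Granted!"
    # jnz is not taken: fall through to the failure message
    return "Invalid password; try again."


print(main_after_check(1))  # Access Granted!
print(main_after_check(0))  # Invalid password; try again.
```

Any non-zero return value would open the lock in this model, not just 1, because jnz only looks at the zero flag.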
The get_password function is simple.
4480: 3e40 6400       mov #0x64, r14
4484: b012 5645       call #0x4556 <getsn>
4488: 3041            ret
Put the value 100 in register 14, then call getsn. Remember that this function was called with the stack pointer in register 15? Well, that will still be there. So getsn is called with the stack pointer as the first argument, and the value 100 as the second. This second argument instructs getsn to take at most 100 characters (including the null character) as input. The bad news (from our perspective) is that 100 is exactly how much stack space has been allocated for the input. I’m saying this is bad news because, if the stack had only allocated, say, 30 bytes, we’d be able to overwrite 70 bytes of memory which we weren’t meant to get access to.
This function then returns.
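To illustrate why the length limit matters, here is a small Python model of a bounded read. getsn_model is a hypothetical stand-in that only captures the behaviour described above (at most 100 characters, including the null character); how the real getsn handles over-long input is an assumption here.

```python
BUFFER_SIZE = 100  # stack space reserved by main for the input


def getsn_model(user_input, limit):
    """Keep at most limit - 1 input bytes and append a terminating null byte."""
    kept = user_input[: limit - 1]
    return kept + b"\x00"


stored = getsn_model(b"A" * 150, 100)
print(len(stored))                 # 100
print(len(stored) <= BUFFER_SIZE)  # True
```

With a limit larger than the buffer (for example 100 against a 30-byte buffer), the same model would return more bytes than the buffer can hold, which is the situation the article describes as an opportunity.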
Here’s the assembly for check_password.
448a: bf90 342b 0000  cmp #0x2b34, 0x0(r15)
4490: 0d20            jnz $+0x1c
4492: bf90 3529 0200  cmp #0x2935, 0x2(r15)
4498: 0920            jnz $+0x14
449a: bf90 5a34 0400  cmp #0x345a, 0x4(r15)
44a0: 0520            jne #0x44ac <check_password+0x22>
44a2: 1e43            mov #0x1, r14
44a4: bf90 3c4c 0600  cmp #0x4c3c, 0x6(r15)
44aa: 0124            jeq #0x44ae <check_password+0x24>
44ac: 0e43            clr r14
44ae: 0f4e            mov r14, r15
44b0: 3041            ret
There’s definitely a pattern here: lots of lines comparing a literal value against something near where r15 points to. Remember that r15 is set to the stack pointer before this function is called, and the stack is where the inputted password is. By the way, jnz and jne can be used interchangeably (as can jz and jeq) - all 4 look at the zero bit-flag on the status register.
So we compare the first pair of bytes of the input against 0x2b34, the second pair with 0x2935, and the third pair with 0x345a. If ever the two things aren’t equal, we jump to the same line (0x44ac - sometimes this is calculated as an offset. E.g. from 0x4490, we jump $+0x1c, which still takes the CPU to 0x44ac).
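The offset arithmetic is easy to check. In this short Python sketch, relative_target is a made-up helper that treats $ as the address of the jump instruction itself, which matches the addresses in the listing above.

```python
def relative_target(instruction_address, offset):
    """Resolve a `$+offset` jump, where `$` is the jump instruction's own address."""
    return (instruction_address + offset) & 0xFFFF


print(hex(relative_target(0x4490, 0x1C)))  # 0x44ac
print(hex(relative_target(0x4498, 0x14)))  # 0x44ac
```

Both relative jumps land on the same address that the third jump names directly.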
The next lines are
mov #0x1, r14
cmp #0x4c3c, 0x6(r15)
jeq #0x44ae <check_password+0x24>
So we put the value 1 into register 14, then do another comparison, of the 4th pair of bytes against 0x4c3c. This time, we jump if they’re equal - and we only jump over one instruction, skipping the line 0x44ac (which the other jumps aim for).
The line at 0x44ac resets r14 to 0. So if any of the comparisons are not equal, that line will be hit and r14 will be 0. Otherwise, r14 will still be set to 1. The next line, mov r14, r15, copies the value into register 15, which will be the return value. The last line performs the actual return.
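Put together, the control flow of check_password can be sketched in Python as follows. This is an illustration of the jump structure rather than a faithful translation: the function takes the four 16-bit values that the cmp instructions read from 0x0(r15) to 0x6(r15) as an already-decoded list, and the parameter name words is an assumption.

```python
def check_password(words):
    """Mirror the compare-and-jump structure; `words` holds four 16-bit values."""
    # The first three comparisons all bail out to the `clr r14` line on a mismatch.
    if words[0] != 0x2B34:
        return 0
    if words[1] != 0x2935:
        return 0
    if words[2] != 0x345A:
        return 0

    r14 = 1  # mov #0x1, r14
    if words[3] != 0x4C3C:
        r14 = 0  # not equal, so the jeq is not taken and clr r14 runs
    return r14  # mov r14, r15 then ret


print(check_password([0x2B34, 0x2935, 0x345A, 0x4C3C]))  # 1
print(check_password([0x2B34, 0x2935, 0x345A, 0x0000]))  # 0
```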
We’ve seen that pairs of bytes in the password need to equal 0x2b34, 0x2935, 0x345a and 0x4c3c in turn. So can we just input 0x2b342935345a4c3c as the password, and open the lock?
No is the answer. The reason is to do with endian-ness: the order in which bytes are stored in memory is not the same as the order in which they’re written out. On the CPU we’re using, bytes are stored little-endian within each 16-bit word. That means that a value in the real world of 0x0102030405060708 would be stored in memory as 0x0201040306050807, with the bytes swapped in every pair.
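Python’s struct module can show the same byte order. This is a minimal sketch that packs 16-bit values as little-endian words ("<H" means one little-endian unsigned 16-bit value); the variable names are just for illustration.

```python
import struct

# One 16-bit word: the low byte is stored first.
one_word = struct.pack("<H", 0x2B34)
print(one_word.hex())  # 342b

# The article's example, treated as four consecutive 16-bit words.
four_words = struct.pack("<4H", 0x0102, 0x0304, 0x0506, 0x0708)
print(four_words.hex())  # 0201040306050807
```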
There’s no great trick to this, it’s just something you have to be aware of whenever you’re converting from real-world values to values in memory. Although there is a bit of a hint here from comparing the assembled version of the program with the disassembled version. E.g.
448a: bf90 342b 0000  cmp #0x2b34, 0x0(r15)
The assembled version on the left shows 342b, which is displayed as 0x2b34 on the right. That’s as much a hint as you’ll ever get.
To find the real password, we have to swap the bytes in every pair of the password we had above.
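The swap can be done by hand, or with a few lines of Python. In this sketch, swap_byte_pairs is a made-up helper name; it takes the hex digits of the candidate password from earlier (without the 0x prefix) and reverses the two bytes in each 16-bit pair.

```python
def swap_byte_pairs(hex_string):
    """Swap the two bytes inside every 16-bit pair of a hex string."""
    raw = bytes.fromhex(hex_string)
    swapped = bytearray()
    for i in range(0, len(raw), 2):
        swapped += raw[i:i + 2][::-1]  # reverse each two-byte pair
    return swapped.hex()


print(swap_byte_pairs("2b342935345a4c3c"))  # 342b35295a343c4c
```

The result is a string of hex digits, so it needs to be entered as hex-encoded bytes rather than typed as ordinary text.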
And with that, the lock opens.
This hack was similar to the last one. The lock had a master password which was hard-coded in the program. The only extra challenge was that it wasn’t hard-coded as a readable string, but hard-coded in the instructions.
Next time, we’ll look at using a buffer overflow to open a lock by writing to memory we weren’t meant to access.
Thanks to Tom Carver for reading a draft of this.