Challenge RE #6
Introduction
The assembly code is as follows:
<f>:
0: push rbp
1: mov rbp,rsp
4: mov QWORD PTR [rbp-0x8],rdi
8: mov QWORD PTR [rbp-0x10],rsi
c: mov rax,QWORD PTR [rbp-0x8]
10: movzx eax,BYTE PTR [rax]
13: movsx dx,al
17: mov rax,QWORD PTR [rbp-0x10]
1b: mov WORD PTR [rax],dx
1e: mov rax,QWORD PTR [rbp-0x10]
22: movzx eax,WORD PTR [rax]
25: test ax,ax
28: jne 2c
2a: jmp 38
2c: add QWORD PTR [rbp-0x8],0x1
31: add QWORD PTR [rbp-0x10],0x2
36: jmp c
38: pop rbp
39: ret
Analysis
Function signature
The first thing to figure out is the signature of f. It takes two arguments of 8 bytes each: they arrive in rdi and rsi, the first two integer/pointer argument registers of the System V AMD64 calling convention, and are immediately saved to the stack slots at rbp-0x8 and rbp-0x10. This is easy to see in the first instructions:
<f>:
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-0x8],rdi
mov QWORD PTR [rbp-0x10],rsi
;; ...
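Now, the way these two arguments are used gives a clue about their types. Each one is loaded back into rax, for example with mov rax,QWORD PTR [rbp-0x8], and then dereferenced, so both are pointers. The first is read one byte at a time (movzx eax,BYTE PTR [rax]), which suggests a char *, a string in C. The second is written and read one WORD (2 bytes) at a time (mov WORD PTR [rax],dx), which suggests a pointer to a 16-bit type such as short *. Let's take that as a working assumption. The function signature is then:
void f(char *str1, short *str2)
I put void as the return type because we haven't analyzed the return value yet, so it is a placeholder for the moment.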
Main logic
The main logic starts at offset c, where the next three instructions
c: mov rax,QWORD PTR [rbp-0x8]
10: movzx eax,BYTE PTR [rax]
13: movsx dx,al
read one character from the first string and sign-extend it to 16 bits in dx. That 16-bit value is then stored through the second pointer, at its current position, and read back into eax. This can be seen in the instructions:
17: mov rax,QWORD PTR [rbp-0x10]
1b: mov WORD PTR [rax],dx
1e: mov rax,QWORD PTR [rbp-0x10]
22: movzx eax,WORD PTR [rax]
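The movsx dx,al at offset 13 is a sign extension, which only matters for bytes above 0x7f. As a small sketch of what that instruction computes (the name widen_char is my own, not part of the challenge), the byte is treated as signed and then widened to 16 bits:

```c
#include <stdint.h>

/* Hypothetical helper mirroring "movsx dx,al": treat the byte as signed
   and widen it to 16 bits, copying the sign bit into the high byte. */
int16_t widen_char(uint8_t byte)
{
    return (int16_t)(int8_t)byte;
}
```

With this, 0x41 ('A') becomes 0x0041 while 0xe9 becomes 0xffe9, so for plain ASCII input the high byte of every stored word is zero.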
What follows is a check of ax, which holds the 16-bit value just stored. If it is zero, meaning the \0 terminator was just copied, the function finishes with the jmp 38 instruction. Otherwise we first advance the pointer str1 by one byte and then the pointer str2 by two bytes, and jump back to the beginning of the loop at offset c.
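As a cross-check of the control flow, here is a label-for-label transcription of the listing. It is only a sketch and not yet idiomatic C: the name f_literal and the labels are my own, and the 16-bit destination type is an assumption based on the WORD-sized store at 1b.

```c
#include <stdint.h>

/* Literal transcription of the listing; labels are named after the offsets. */
void f_literal(const char *str1, int16_t *str2)
{
loop_c:
    /* c-1b: load one byte, sign-extend it, store it as a 16-bit word */
    *str2 = (int16_t)(signed char)*str1;
    /* 1e-28: reload the stored word and test it against zero */
    if (*str2 != 0)
        goto advance_2c;
    goto done_38;   /* 2a */
advance_2c:
    str1 += 1;      /* 2c: add 0x1 to the first pointer */
    str2 += 1;      /* 31: add 0x2 bytes, i.e. one 16-bit element */
    goto loop_c;    /* 36 */
done_38:
    return;         /* 38-39 */
}
```

Note that the store happens before the test, so the terminator itself is written to the destination as a 16-bit zero.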
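In short, this copies str1 into str2, terminator included, but each character now takes up two bytes. The second byte is not a gap left untouched: the WORD store at 1b overwrites it with the sign extension of the character, which is 0 for plain ASCII. On little-endian x86 the character itself lands in the first byte of each pair. For example, having:
str1 = ABCD\0
str2 = 1111111111 (10 bytes)
# After the call, the bytes of str2 are
str2 = A\0B\0C\0D\0\0\0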
Let’s write the code in C:
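void f(char *str1, short *str2)
{
    int i = 0;
    /* Each assignment stores a full 16-bit value, so str2[i] covers two bytes.
       The terminator is copied too, because the store happens before the test. */
    while ((str2[i] = str1[i]) != 0) {
        i++;
    }
}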
This can be written in a more compact way, as
void f(char *str1, short *str2)
{
    /* Copy and widen one character per iteration, stopping after the terminator. */
    while ((*str2++ = *str1++) != 0)
        ;
}
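To check this reading of the listing against actual memory contents, one option is to dump the destination bytes after a call. The sketch below assumes a little-endian machine and a 16-bit destination element; widen and hex_dump are hypothetical names of my own, with widen standing in for f.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for f: copy a NUL-terminated string into 16-bit elements,
   terminator included. */
void widen(const char *str1, int16_t *str2)
{
    while ((*str2++ = (int16_t)(signed char)*str1++) != 0)
        ;
}

/* Hypothetical helper: format the n bytes at p as "41 00 ..." into out.
   out must hold at least 3 * n bytes, and n must be at least 1. */
void hex_dump(const void *p, size_t n, char *out)
{
    const unsigned char *b = p;
    for (size_t i = 0; i < n; i++)
        out += sprintf(out, "%02x%s", b[i], i + 1 < n ? " " : "");
}
```

Calling widen("ABCD", buf) on a 10-byte buffer and dumping it gives 41 00 42 00 43 00 44 00 00 00 on x86: every character is followed by a zero byte, and the last two bytes are the widened terminator.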
Notes
This was quite easy compared to the other ones.