Challenge RE #22
In this article let’s dive into the #22 RE Challenge. As in previous challenges we have assembly code examples that we have to understand and infer what it’s doing. Sometimes this process it’s straightforward, if you understand assembly language, of course, other times it’s hard as fuck like in Challenge #19…I haven’t forgotten this baby, going back to you another day. Enough of dirty talk with programming challenges Gealber. Back to our goal here, the assembly code can be found in challenge.re, take a look there’s a fair amount of challenges there.
Note:
I know we programmers like a lot the Ctr+C and Ctr+V combination keys, but I don’t recommend you to do so with this code here. The main point of these challenges is to understand what this code does, but that doesn’t mean that all the code here runs smoothly.
Analysis
The first two things that you should identify in this code, we have two functions here, f1
and f2
. Given that f1
has calls to f2
, we can assume that this is our main entry point. Let’s start with this.
f1
;; SIGNATURE OF F1
;; ARGS: rdi, esi, edx
;; void f1(int *arr,int start,int end);
f1:
push r13
push r12
;; r12d
mov r12d, edx
push rbp
;; rbp
mov rbp, rdi
push rbx
;; ebx
mov ebx, esi
push rcx
.L12:
;; ebx >= r12d (two numbers, then)
;; start >= end
cmp ebx, r12d
jge .L10
;; copy back original parameters
;; to their original registers
;; to pass them as argument for f2
;; from this we know that rdi is an array of integers as well
;; f2(arr, start, end)
mov esi, ebx
mov edx, r12d
mov rdi, rbp
call f2
;; end arg
lea edx, [rax-1]
mov r13d, eax
;; start arg
mov esi, ebx
;; arr
mov rdi, rbp
lea ebx, [r13+1]
call f1
jmp .L12
.L10:
pop rax
pop rbx
pop rbp
pop r12
pop r13
ret
Looking at this code, you can notice first that f1
calls itself, so it’s a recursive function. Also, we have a loop which stops condition it’s that ebx >= r12d
. Putting this into C code, we have
void f1(int *arr, int start, int end)
{
while (start < end) {
end = f2(arr, start, end) - 1;
f1(arr, start, end);
}
return;
}
I renamed the parameters to something meaningful. This is one of the most painful things of assembly, you have just registers with weird names :).
f2
Now f2
, this is the tricky one here I won’t explain the whole code only the most tricky part. The signature we already know it from f1
, we will receive an array of ints with and two ints.
xor r11d, ebx
;; arr[i+1] = r11d;
mov DWORD PTR [rcx+4+r8], r11d
;; r11d = arr[j-1] ^ arr[i+1] ^ arr[j-1] = arr[i+1];
xor r11d, DWORD PTR [r9]
;; arr[j-1] = arr[i+1]
mov DWORD PTR [r9], r11d
;; arr[i+1] ^= arr[i+1];
xor DWORD PTR [rcx+4+r8], r11d
This was a real pain because if it was with normal variables you would know what it does. Look this is the equivalent of normal C code
a ^= b
b ^= a
a ^= b
😅 not so obvious either, but if you take into consideration that XOR it’s a commutative operation you can figure out that this is a swap of variables, but without using a temporary variable. Actually, it’s called XOR Swap, please don’t use this to swap your variables. First, this is not intuitive, and also when you provide equal variables, you will end up with values 0 on both variables. Even if you put an if statement to discard this case, will run slower than the normal temporary approach.
This is the most tricky part of this function and the two loops it has inside as well. I managed to get this into C code, would be something like this
int f2(int *arr, int start, int end)
{
int i = start, j = end + 1, esi = start;
while (1) {
esi++;
if ((esi > end) || (arr[i+1] > arr[start])) {
while (arr[j-1] > arr[start]) {
j--;
}
if (j <= esi)
break;
swap(arr, i+1, j);
}
i++;
}
swap(arr, start, j-1);
return j;
}
Note:
Let me put this here again. This code it’s my literal translation of the assembly code, so I might be wrong, please don’t copy this as an implementation of the algorithm I think it is.
When you combine these two functions we have what seems to be a recursive sorting algorithm. If you take a look at f2
you can notice that it’s quite similar to the partition process in quick sort algorithm. This is my guess.
Formal description
Performs quick sort on an array.
Conclusion
The point of reversing is not to reverse detail by detail each instruction. Instead, is to get a broad idea of what the program does.