Whenever a function is called, a new stack frame is created for the called function. Let's take the piece of code shared below & examine it.
Source code -
#include <stdio.h>
#include <unistd.h>
int callme()
{
    char buffer[100];
    int in;
    /* Deliberately unsafe: reads up to 500 bytes into a 100-byte buffer */
    in = read(0, buffer, 500);
    return 0;
}
int main(int argc, char *argv[])
{
    callme();
    return 0;
}
Disassemble code -
gdb-peda$ disass main
Dump of assembler code for function main:
0x00000000004005e8 <+0>: push rbp
0x00000000004005e9 <+1>: mov rbp,rsp
0x00000000004005ec <+4>: sub rsp,0x10
0x00000000004005f0 <+8>: mov DWORD PTR [rbp-0x4],edi
0x00000000004005f3 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x00000000004005f7 <+15>: mov eax,0x0
0x00000000004005fc <+20>: call 0x40059d <callme>
0x0000000000400601 <+25>: mov eax,0x0
0x0000000000400606 <+30>: leave
0x0000000000400607 <+31>: ret
End of assembler dump.
gdb-peda$ disass callme
Dump of assembler code for function callme:
0x000000000040059d <+0>: push rbp
0x000000000040059e <+1>: mov rbp,rsp
0x00000000004005a1 <+4>: add rsp,0xffffffffffffff80
0x00000000004005a5 <+8>: mov rax,QWORD PTR fs:0x28
0x00000000004005ae <+17>: mov QWORD PTR [rbp-0x8],rax
0x00000000004005b2 <+21>: xor eax,eax
0x00000000004005b4 <+23>: lea rax,[rbp-0x70]
0x00000000004005b8 <+27>: mov edx,0x1f4
0x00000000004005bd <+32>: mov rsi,rax
0x00000000004005c0 <+35>: mov edi,0x0
0x00000000004005c5 <+40>: call 0x400480 <read@plt>
0x00000000004005ca <+45>: mov DWORD PTR [rbp-0x74],eax
0x00000000004005cd <+48>: mov eax,0x0
0x00000000004005d2 <+53>: mov rcx,QWORD PTR [rbp-0x8]
0x00000000004005d6 <+57>: xor rcx,QWORD PTR fs:0x28
0x00000000004005df <+66>: je 0x4005e6 <callme+73>
0x00000000004005e1 <+68>: call 0x400470 <__stack_chk_fail@plt>
0x00000000004005e6 <+73>: leave
0x00000000004005e7 <+74>: ret
End of assembler dump.
The first two instructions (push rbp, mov rbp,rsp) and the last two instructions (leave, ret) of each function above are the function prologue & epilogue. Whenever a function is called, a stack frame is set up, specific to the called function. The stack frame consists of the following - function arguments, return address, saved base pointer, local variables etc. The image credited below gives a visual understanding of the same.
Credit - https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64
As soon as the called function has finished executing all its instructions, control returns to the caller function. The question arises, how does it go back? How does it know where to go back? Well, it's through the return address, but let's deep dive & understand how that actually happens, starting with the prologue.
0x00000000004005e8 <+0>: push rbp
0x00000000004005e9 <+1>: mov rbp,rsp
The first instruction pushes rbp, also known as the frame pointer, onto the stack, which saves the caller's frame pointer so that it can be restored later. The next instruction copies rsp into rbp, which marks the start of the new stack frame. The main purpose this serves is recording where the stack pointer was when the frame began, since rsp keeps changing as space for local variables is reserved on the stack. Using the frame pointer as a fixed checkpoint, the function can locate its local variables at constant offsets from rbp (for example, buffer sits at rbp-0x70 in callme).
After that, the body of the function executes, after which the function epilogue comes in.
0x00000000004005e6 <+73>: leave
0x00000000004005e7 <+74>: ret
The leave instruction is equivalent to the following two instructions -
mov rsp,rbp
pop rbp
The first one copies the value of rbp into rsp, which simply points the stack pointer back at the frame pointer set up in the prologue, and in turn deallocates the space that was reserved for the called function's local variables. The next one pops the saved rbp from the top of the stack, restoring the caller's frame pointer & moving rsp up by 8 bytes, so that rsp now points to the return address. Finally the last instruction, `ret`, pops the return address that rsp points to from the stack & jumps to it, thereby returning to the caller function.
Since we now have an idea of what a stack frame, function prologue & epilogue are, we can correlate this to understand how a stack overflow happens & allows attackers to hijack the control flow.
In a stack overflow, an attacker's end goal is normally to overwrite the return address, in order to make it point to code of the attacker's choosing. As we know, when a function completes, the epilogue clears the called function's stack frame, leaving rsp pointing to the return address, & `ret` then transfers control to whatever address is stored there. Hence an attacker works out the offset from the start of the overflowed buffer to the saved return address & places the desired address at that offset. We will try this out practically in the next blog.