Linux Stack frame - Function prologue & epilogue

Whenever a function is called, a new stack frame is created for the called function. Let's take the piece of code below and examine it.

#include <stdio.h>
#include <unistd.h>

int callme() 
{
    char buffer[100];
    int in;
    /* Reads up to 500 bytes into a 100-byte buffer, so this can overflow. */
    in = read(0, buffer, 500);
    return 0;
}

int main(int argc, char *argv[]) 
{
    callme();
    return 0;
}


Disassembled code -

gdb-peda$ disass main
Dump of assembler code for function main:
   0x00000000004005e8 <+0>: push   rbp
   0x00000000004005e9 <+1>: mov    rbp,rsp
   0x00000000004005ec <+4>: sub    rsp,0x10
   0x00000000004005f0 <+8>: mov    DWORD PTR [rbp-0x4],edi
   0x00000000004005f3 <+11>: mov    QWORD PTR [rbp-0x10],rsi
   0x00000000004005f7 <+15>: mov    eax,0x0
   0x00000000004005fc <+20>: call   0x40059d <callme>
   0x0000000000400601 <+25>: mov    eax,0x0
   0x0000000000400606 <+30>: leave  
   0x0000000000400607 <+31>: ret    
End of assembler dump.


gdb-peda$ disass callme 
Dump of assembler code for function callme:
   0x000000000040059d <+0>: push   rbp
   0x000000000040059e <+1>: mov    rbp,rsp
   0x00000000004005a1 <+4>: add    rsp,0xffffffffffffff80
   0x00000000004005a5 <+8>: mov    rax,QWORD PTR fs:0x28
   0x00000000004005ae <+17>: mov    QWORD PTR [rbp-0x8],rax
   0x00000000004005b2 <+21>: xor    eax,eax
   0x00000000004005b4 <+23>: lea    rax,[rbp-0x70]
   0x00000000004005b8 <+27>: mov    edx,0x1f4
   0x00000000004005bd <+32>: mov    rsi,rax
   0x00000000004005c0 <+35>: mov    edi,0x0
   0x00000000004005c5 <+40>: call   0x400480 <read@plt>
   0x00000000004005ca <+45>: mov    DWORD PTR [rbp-0x74],eax
   0x00000000004005cd <+48>: mov    eax,0x0
   0x00000000004005d2 <+53>: mov    rcx,QWORD PTR [rbp-0x8]
   0x00000000004005d6 <+57>: xor    rcx,QWORD PTR fs:0x28
   0x00000000004005df <+66>: je     0x4005e6 <callme+73>
   0x00000000004005e1 <+68>: call   0x400470 <__stack_chk_fail@plt>
   0x00000000004005e6 <+73>: leave  
   0x00000000004005e7 <+74>: ret    
End of assembler dump.


The push rbp / mov rbp,rsp pair at the start of each function above is the function prologue, and the leave / ret pair at the end is the function epilogue. Whenever a function is called, a stack frame is set up specifically for the called function. The stack frame consists of the following: function arguments (where they are passed on the stack rather than in registers), the return address, the saved base pointer, local variables, etc. The diagram at the link below gives a visual picture of this layout.



Credit - https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64




Once the called function has finished executing all its instructions, control returns to the caller. The question is: how does it go back, and how does it know where to go? The answer is the return address, which the call instruction pushes onto the stack before jumping (0x400601 for the call to callme in main above). Let's dive deeper and see how this actually happens.


   0x00000000004005e8 <+0>: push   rbp
   0x00000000004005e9 <+1>: mov    rbp,rsp

The first instruction pushes rbp, also known as the frame pointer (or base pointer), onto the stack, which saves the caller's frame pointer. The second instruction copies rsp into rbp, marking the start of the new stack frame. From here on rbp serves as a fixed reference point: rsp keeps changing as space for local variables is allocated, but rbp stays put for the lifetime of the function, so each local variable can be located at a fixed offset from rbp (for example, [rbp-0x70] for buffer in callme).
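To make the rbp-relative addressing concrete, here is a minimal sketch in C that models the slots of callme's frame using the offsets visible in the disassembly above. The struct, table, and function names are made up for illustration, and the slot labelled "stack canary" is an interpretation of the fs:0x28 value stored at rbp-0x8. Real addresses will differ from run to run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One slot in callme's stack frame, described relative to rbp.
 * Offsets come from the disassembly; the names are illustrative. */
struct frame_slot {
    const char *name;
    int64_t rbp_offset;   /* negative: below rbp (locals); positive: above */
};

static const struct frame_slot callme_frame[] = {
    { "return address",  0x8  },  /* pushed by call in main         */
    { "saved rbp",       0x0  },  /* pushed by push rbp             */
    { "stack canary",   -0x8  },  /* mov QWORD PTR [rbp-0x8],rax    */
    { "buffer",         -0x70 },  /* lea rax,[rbp-0x70]             */
    { "in",             -0x74 },  /* mov DWORD PTR [rbp-0x74],eax   */
};

/* Resolve a slot name to an absolute address for a given rbp value.
 * Returns 0 if the name is not in the table. */
uint64_t slot_address(uint64_t rbp, const char *name)
{
    assert(name != NULL);
    size_t n = sizeof(callme_frame) / sizeof(callme_frame[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(callme_frame[i].name, name) == 0)
            return rbp + (uint64_t)callme_frame[i].rbp_offset;
    }
    return 0;
}
```

With a hypothetical rbp of 0x7fffffffe000, slot_address(rbp, "buffer") gives 0x7fffffffdf90, which is the address the lea instruction would load into rax before the call to read.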


After that, the body of the function executes, and then the function epilogue runs.

   0x00000000004005e6 <+73>: leave  
   0x00000000004005e7 <+74>: ret   

The leave instruction is equivalent to the following two instructions -
mov    rsp,rbp
pop    rbp


The first of these copies the value of rbp into rsp, pointing the stack pointer back at the slot where the caller's rbp was saved in the prologue. This deallocates the space that was reserved for the called function's local variables. Next, pop rbp restores the caller's frame pointer from the top of the stack and moves rsp up by 8 bytes, so rsp now points to the return address. Finally, the last instruction, `ret`, pops the return address that rsp points to and jumps to it, returning control to the caller.
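The sequence of call, prologue, leave and ret can be traced on a toy model. The following is only a sketch under simplifying assumptions: the stack is a small array of 8-byte words, rsp and rbp hold array indexes rather than real addresses, and all struct and function names are invented for this example. The code addresses (0x4005fc, 0x40059d, 0x400601) are the ones from the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* A toy machine: three registers and a small stack of 8-byte words. */
struct machine {
    uint64_t mem[STACK_WORDS];
    uint64_t rsp;   /* index of the current top of stack */
    uint64_t rbp;   /* index of the current frame base   */
    uint64_t rip;   /* address of the next instruction   */
};

static void push(struct machine *m, uint64_t value)
{
    assert(m->rsp > 0);            /* the stack grows toward lower indexes */
    m->mem[--m->rsp] = value;
}

static uint64_t pop(struct machine *m)
{
    assert(m->rsp < STACK_WORDS);
    return m->mem[m->rsp++];
}

/* call: push the return address, then jump to the target. */
void do_call(struct machine *m, uint64_t target, uint64_t return_addr)
{
    push(m, return_addr);
    m->rip = target;
}

/* Prologue: push rbp; mov rbp,rsp; then reserve space for locals. */
void do_prologue(struct machine *m, uint64_t local_words)
{
    push(m, m->rbp);
    m->rbp = m->rsp;
    assert(m->rsp >= local_words);
    m->rsp -= local_words;
}

/* leave: mov rsp,rbp; pop rbp. */
void do_leave(struct machine *m)
{
    m->rsp = m->rbp;
    m->rbp = pop(m);
}

/* ret: pop the return address into rip. */
void do_ret(struct machine *m)
{
    m->rip = pop(m);
}

/* One call/return cycle: main calls callme and resumes at 0x400601. */
struct machine simulate_callme(void)
{
    struct machine m = { .rip = 0x4005fc };
    m.rbp = STACK_WORDS - 4;       /* pretend main already has a frame */
    m.rsp = STACK_WORDS - 6;
    do_call(&m, 0x40059d, 0x400601);
    do_prologue(&m, 16);           /* 0x80 bytes = 16 eight-byte words */
    do_leave(&m);
    do_ret(&m);
    return m;
}
```

After simulate_callme() runs, rip is 0x400601 and both rsp and rbp are back at the values main had before the call, which is the whole point of the epilogue. If the word holding the return address were changed between the prologue and do_ret, rip would end up at that changed value instead.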


Now that we have an idea of what a stack frame, a function prologue, and a function epilogue are, we can use this to understand how a stack overflow happens and how it lets attackers hijack the control flow.

In a stack overflow, an attacker's end goal is normally to overwrite the return address so that it points to code of the attacker's choosing. As we saw, in the function epilogue the called function's stack frame is torn down, rsp ends up pointing at the return address, and ret transfers control to whatever address is stored there. An attacker therefore works out the offset from the start of the overflowing buffer to the saved return address and writes past the end of the buffer until that slot is overwritten. In callme above, read() is asked for up to 500 bytes into a 100-byte buffer, which makes exactly this kind of overflow possible. We will try this out in practice in the next blog post.
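For callme, the offset can be read straight from the disassembly: buffer starts at rbp-0x70 and the return address sits at rbp+0x8. The sketch below does that arithmetic in C; the enum and function names are illustrative only. It also covers the value stored at rbp-0x8, which looks like a stack protector canary (loaded from fs:0x28 and checked before leave, with __stack_chk_fail called on mismatch). On this particular build, an overflow that changes that value would presumably be caught before ret runs.

```c
#include <assert.h>
#include <stdint.h>

/* rbp-relative offsets read from the callme disassembly. */
enum {
    BUFFER_OFF      = -0x70,  /* lea rax,[rbp-0x70]           */
    CANARY_OFF      = -0x8,   /* mov QWORD PTR [rbp-0x8],rax  */
    SAVED_RBP_OFF   = 0x0,    /* written by push rbp          */
    RETURN_ADDR_OFF = 0x8     /* written by call              */
};

/* Number of bytes between the start of the buffer and a frame slot. */
int64_t bytes_from_buffer(int64_t slot_off)
{
    assert(slot_off >= BUFFER_OFF);   /* slot must be at or above the buffer */
    return slot_off - BUFFER_OFF;
}

/* Does a write of len bytes starting at the buffer touch the slot? */
int write_reaches(int64_t len, int64_t slot_off)
{
    return len > bytes_from_buffer(slot_off);
}
```

Here bytes_from_buffer(RETURN_ADDR_OFF) is 120 (0x78) and bytes_from_buffer(CANARY_OFF) is 104, so a 500-byte read reaches both slots, while a write limited to the 100 bytes of the buffer reaches neither.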

