What is GOT & PLT | Linux x64 Binary exploitation

Generally, there are two types of binaries: dynamically linked and statically linked.

A statically linked binary is independent (standalone), as it carries all the functionality it depends on within itself. More precisely, when the source code is compiled, any library function the executable needs is packed into it during the linking phase. This also means the binary knows exactly where those functions are stored (address, offset).

A dynamically linked binary is the opposite: it doesn't pack the shared library code within itself, but is linked against it. When a dynamically linked binary is run, a dynamic linker/loader is also loaded into memory, which helps resolve the library functions the binary uses. This is necessary because the address of the shared library on the system is unknown & can differ due to various factors. One such factor is that if the library code is updated, the function addresses change. Another is a protection mechanism known as ASLR, which randomizes the addresses of the shared libraries, leaving the binary no clue of their location.

So the GOT & PLT concepts come into play when using a dynamically linked binary.

Tip: We can check whether a binary is dynamically or statically linked using the `ldd` command. It's quite a useful utility, as it also lists all the shared libraries a binary is linked against.

First let's look at the PLT section, which stands for Procedure Linkage Table. It contains stub code that jumps to the requested shared library function at runtime or, if the address is not yet known, hands over to the dynamic linker, which finds the address and records it in the GOT.

Too confusing? Let's walk through it again, starting from the PLT section. Whenever an external function is called inside a dynamically linked binary, the call goes to that function's PLT stub, which jumps through the function's entry in the GOT. On the first call that entry does not yet hold the real address, so the stub falls through to the dynamic linker/loader (ld.so), which looks up the function address & stores it in the GOT entry. From the second call onwards no resolution is needed, since the address was already stored in the GOT during the first call. This is called lazy binding. It helps performance because functions that are never called are never resolved.

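Now let's try to understand the GOT, which stands for Global Offset Table, a lookup table. It holds records (pointers) for all the external function addresses. With lazy binding, each function entry initially points back into its PLT stub until the function is resolved.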

Tip: The PLT is stored in the code segment, hence it is readable & executable, whereas the GOT is stored in the data segment, hence it is readable & writable.

Enough theory, let's dive deep & look at how things happen under the hood. For this demonstration, we will use the piece of code below, which calls a function from the libc library installed on the system.

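//gcc -fno-pie -g -o plt1 plt1.c
//(toolchains that default to PIE or immediate binding may also need -no-pie and -Wl,-z,lazy)
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  puts("Hello world");
  puts("Hello world");
  return 0;
}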

  • Open the binary in gdb & set a breakpoint at puts.
  • Now run the binary. Once the breakpoint is hit, step into the instruction, which belongs to puts@plt.

  • Once you step into puts@plt, you will notice a few instructions. The first is a jmp through a function pointer, which gets dereferenced. The pointer lives at `0x601018` (the GOT entry for puts, resolved & shown by the peda extension), and the address stored there points to the next instruction in the stub. It points to the next instruction because the address of puts is not known yet, as this is the first call. So the jump simply lands on the next instruction, which starts the function address resolution routine. On the next call to puts, the real address will already be in that entry & the resolution routine won't need to be called. For now the address is unknown, so let's move on to the next instruction.
  • The next instruction is `push 0x0`, which refers to the position of puts() in the relocation table (`.rela.plt` on x86-64), also referred to as `rel_offset`. We can use the readelf utility to check the position. The puts symbol is the first entry, so the value is 0. The rel_offset is also an argument required by the dynamic resolution function later.

  • Then another jmp is taken, which is responsible for invoking the dynamic linker to resolve the required function address.

  • Once we take the jump to 0x400440 (the first PLT entry, often called PLT0), we immediately encounter a push & a jmp instruction. The push places the second entry of the GOT (GOT+8) on the stack, which is the other argument required by the dynamic resolver. It is a pointer to a `link_map` structure (will discuss later), needed by the routine the next instruction jumps to. That jmp is an indirect jump through the third entry of the GOT (GOT+16, at 0x601010), which holds the address of the dynamic linker routine `_dl_runtime_resolve`. The `link_map` structure contains information about the current ELF object, required by the resolver.

We can quickly check that both addresses point inside ld-*.so: one into its .text section, which has executable permission, & the other into its .bss, which has writable permission.

info files output

  • Now the dynamic resolver `_dl_runtime_resolve` will look up the real address of puts() & store/patch it at `0x601018`. So the next time `puts()` is called, the address is already known & the dynamic resolver routine doesn't need to be called. To confirm, we can set another breakpoint at the next puts & examine the address. This time, it will pick up the address of puts directly from the GOT. This process is also called lazy binding.
