@kenballus
Last active April 9, 2024 01:32

Meditations on the Entry Point of a Dynamic ELF

Consider the following C program, which we'll call a.c:

#include <stdio.h>

int main(void) {
    puts("Hello world");
}

Suppose that we compile it with gcc a.c -o a.out.

What is the entry point of a.out?

Any reasonable answer to this question should satisfy both of the following properties:

  1. It should be the address of some code contained inside of a.out.
  2. When we exec a.out, that code should always execute first.

We'll now examine a few candidate answers to the question.

Definition #1: The beginning of main

C program execution begins with main. It seems reasonable, then, to define the "entry point" of a.out as the beginning of main. This definition satisfies property 1, but not property 2.

Definition #2: The beginning of the libc startup code

The previous definition doesn't satisfy property 2 because the toolchain links in startup code that runs before main. This code performs various tasks that a C programmer shouldn't have to think about, like pulling argc and argv off the initial stack and aligning the stack pointer. The address of the beginning of this startup code is recorded in the ELF header. You can see this for yourself with readelf:

$ readelf -h a.out
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  ...
  Entry point address:               0x1040
  ...
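
For readers who want to see where that number comes from, here is a minimal Python sketch that reads the same field directly from the file. It assumes a 64-bit little-endian ELF (as on x86-64 Linux), where the entry point is the 8-byte e_entry field at offset 24 of the ELF header. The function name elf64_entry_point is made up for this illustration.

```python
import struct


def elf64_entry_point(data: bytes) -> int:
    """Return e_entry from a 64-bit little-endian ELF image (sketch only)."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF64 file")
    # e_ident[4] is the class (2 = 64-bit); e_ident[5] is the byte order (1 = little-endian).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    # In the ELF64 header, e_entry is an 8-byte field at offset 24.
    (entry,) = struct.unpack_from("<Q", data, 24)
    return entry
```

Calling hex(elf64_entry_point(open("a.out", "rb").read())) should agree with the "Entry point address" line that readelf prints.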

We can inspect it with gdb:

$ gdb a.out -ex 'disas 0x1040' -ex 'exit'
Dump of assembler code for function _start:
   0x0000000000001040 <+0>: endbr64
   0x0000000000001044 <+4>: xor    %ebp,%ebp
   0x0000000000001046 <+6>: mov    %rdx,%r9
   0x0000000000001049 <+9>: pop    %rsi
   0x000000000000104a <+10>:    mov    %rsp,%rdx
   0x000000000000104d <+13>:    and    $0xfffffffffffffff0,%rsp
   0x0000000000001051 <+17>:    push   %rax
   0x0000000000001052 <+18>:    push   %rsp
   0x0000000000001053 <+19>:    xor    %r8d,%r8d
   0x0000000000001056 <+22>:    xor    %ecx,%ecx
   0x0000000000001058 <+24>:    lea    0xda(%rip),%rdi        # 0x1139 <main>
   0x000000000000105f <+31>:    call   *0x2f5b(%rip)        # 0x3fc0
   0x0000000000001065 <+37>:    hlt
End of assembler dump.

The entry point is the first instruction of a routine called _start, which passes the address of main (in %rdi) to another function, which presumably calls main.

Again, this definition clearly satisfies property 1, but, surprisingly, it still doesn't satisfy property 2.

Definition #3: The program interpreter's entry point

We can directly observe that the previous definition doesn't satisfy property 2 by opening a.out in gdb, and pausing execution just before the first instruction runs:

$ gdb a.out
(gdb) starti
Starting program: /home/bkallus/a.out

Program stopped.
0x00007ffff7fe3b60 in _start () from /lib64/ld-linux-x86-64.so.2

If we disassemble this _start routine, it's clear that it's not the one from a.out:

(gdb) disas
Dump of assembler code for function _start:
=> 0x00007ffff7fe3b60 <+0>: mov    %rsp,%rdi
   0x00007ffff7fe3b63 <+3>: call   0x7ffff7fe47e0 <_dl_start>
End of assembler dump.

What we're looking at is the entry point of the dynamic linker, which is needed to map dynamic libraries like libc into the process's address space.

If we define the entry point of a.out to be this new _start routine, then we sacrifice property 1, but we do get property 2! In a very real sense, this is the true entry point of the program, because it's the location of the first instruction to execute in the process after the exec syscall.

Definition #4: The address of the first instruction that executes from the program's text

In a way, this definition satisfies both properties by construction, but it comes with its own set of problems. The dynamic linker is just a program whose path is stored in the binary (in its PT_INTERP segment); we have no guarantees about its behavior. In particular, we have no guarantee that it ever hands over control to the text of a.out. We can prove this by patching a.out to produce a binary in which nothing from the program text ever executes. This patched binary therefore has no entry point by the above definition, even though we'll see that it runs just fine.

We can see by running strings on a.out that the path to the dynamic linker is baked into the binary:

$ strings a.out | head -n 1
/lib64/ld-linux-x86-64.so.2
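
The strings output works because the path happens to be the first printable string in the file. A more direct way to find it is to walk the program headers and look for the PT_INTERP entry, which is what the kernel consults at exec time. The following is a rough sketch under the same assumptions as before (64-bit little-endian ELF); elf64_interpreter is a hypothetical helper name, not part of any library.

```python
import struct

PT_INTERP = 3  # program header type that names the program interpreter


def elf64_interpreter(data: bytes):
    """Return the PT_INTERP path of an ELF64 image, or None if it has none."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    # ELF64 header: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with p_type (4 bytes), p_flags (4), p_offset (8).
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, off)
        # p_filesz sits 32 bytes into the entry, after p_vaddr and p_paddr.
        (p_filesz,) = struct.unpack_from("<Q", data, off + 32)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            # The stored path is NUL-terminated; drop the terminator.
            return raw.split(b"\x00", 1)[0].decode()
    return None
```

A statically linked binary has no PT_INTERP entry, so this sketch returns None for it.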

Let's patch that path to point at /bin/busybox:

$ cat a.out | python3 -c 'import sys; linker_path = b"/lib64/ld-linux-x86-64.so.2"; a_out = sys.stdin.buffer.read(); sys.stdout.buffer.write(a_out.replace(linker_path, b"/bin/busybox".ljust(len(linker_path), b"\x00")))' > awk
$ chmod +x awk
$ ./awk
BusyBox v1.36.1 () multi-call binary.

Usage: awk [OPTIONS] [AWK_PROGRAM] [FILE]...

   -v VAR=VAL  Set variable
   -F SEP      Use SEP as field separator
   -f FILE     Read program from FILE
   -e AWK_PROGRAM

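The pipeline above replaces the linker path in a.out with /bin/busybox, padded with NUL bytes so that nothing else in the file moves, and saves the result in a new binary called awk. The name matters: BusyBox decides which applet to run from the name it was invoked as.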
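
The one-liner is dense, so here is the same byte replacement spelled out as a small function. This is only a sketch of the idea; patch_interpreter is a made-up name, and the length check is an addition that the one-liner does not perform.

```python
def patch_interpreter(image: bytes, old: bytes, new: bytes) -> bytes:
    """Replace the interpreter path `old` with `new`, keeping the file size fixed."""
    if len(new) > len(old):
        # A longer path would overwrite whatever follows it in the file.
        raise ValueError("replacement path must not be longer than the original")
    if old not in image:
        raise ValueError("original interpreter path not found")
    # Pad with NUL bytes so every other offset in the file stays where it was.
    padded = new.ljust(len(old), b"\x00")
    return image.replace(old, padded)
```

Because the replacement is padded to the original length, every offset recorded in the headers remains valid, which is why the patched file still loads.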

When we run our awk binary, it behaves just like busybox awk, but if we disassemble it, we can see that its text matches a.out's exactly:

$ diff <(objdump -d a.out) <(objdump -d awk)
2c2
< a.out:     file format elf64-x86-64
---
> awk:     file format elf64-x86-64

It's easy to confirm in gdb that the code in the awk binary never executes; awk's execution begins and ends inside of /bin/busybox, its program interpreter.

In short, when we exec a dynamic ELF, nothing guarantees that the program interpreter actually runs any of the code in the binary. Thus, by the above definition of "entry point," our awk binary has no entry point, even though it works just like a fully functional awk.

Conclusion

In summary, any definition of the entry point of a dynamically linked binary can guarantee either that the entry point lies within the program text or that the entry point always executes first, but not both.

Personally, I think definition #3 is the most ideologically consistent, so I'm sticking with that one.
