@lava
Last active March 2, 2024 17:22
Hello, world: Deep analysis of a shallow program.

Hello, world!

Please explain in detail what will happen if the following program is executed:

#include <iostream>

int main() {
    std::cout << "Hello, world!" << std::endl;
}

The Novice

It will print out "Hello, world!".

The Apprentice

Assuming a unix system, the program will write the string "Hello, world!\n" to the standard output stream, which is connected to file descriptor 1. Afterwards, the stream is flushed.

The Pedant

The given text is not a program, but rather UTF-8 encoded C++ source code. After being turned into a program by a C++ compiler, it's impossible to tell what will happen after it is executed. One possibility would be that it receives a SIGABRT signal immediately after it started, in which case the effect would probably be the creation of a core dump in the current directory.

The Lawyer

Let's analyze the program according to the C++ Draft Standard in version N4296.

Since <iostream> is one of the 53 C++ standard library headers listed in §17.6.1.2/2 [headers], its contents will be made available to the translation unit. (§17.6.2.2/1 [using.headers])

Including this header causes the effects of defining an instance of std::ios_base::Init with static storage duration. (§27.4.1/2 [iostream.objects.overview]) During or before construction of this instance, the object std::cout of type std::ostream is constructed and associated with the object stdout declared in the <cstdio> header. (§27.4.2/1 [narrow.stream.objects])

Next, we see the definition of a global function called main returning an int and taking no arguments. The program thus fulfills the requirements of §3.6.1 [basic.start.main] and this function will be the designated start of the program.

The body of the main function consists of an expression statement as in §6.2/1 [stmt.expr]. The expression inside that statement refers to three distinct entities by name:

  1. The object std::cout of type std::ostream (§27.4.2 [narrow.stream.objects]),

  2. The string literal "Hello, world!" (a static null-terminated byte string according to §17.5.2.1.4.1/3 Footnote 170), of type const char[14], and

  3. The function template std::endl with the signature std::basic_ostream<C,T>&(std::basic_ostream<C,T>&)
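The claim about the literal's type can be checked by the compiler itself. The following is a minimal sketch; the two helper function names are invented for illustration and are not part of the original program.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to the array type.
static_assert(std::is_same<decltype("Hello, world!"), const char (&)[14]>::value,
              "the literal is an lvalue of type const char[14]");

// Hypothetical helper: bytes occupied by the array, including the terminating null.
std::size_t literal_storage_bytes() { return sizeof("Hello, world!"); }

// Hypothetical helper: characters before the terminating null.
std::size_t literal_text_length() { return std::strlen("Hello, world!"); }
```

The array has 14 elements because the terminating null is stored along with the 13 characters of text.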

These are joined by two instances of the binary left-shift operator, which groups left-to-right. Therefore, to determine what will happen, we first have to look at the sub-expression std::cout << "Hello, world!", which is the left shift operator with operands of type std::ostream and const char[14].

Since at least one operand has class or enumeration type, overload resolution is used to determine which operator-function or built-in operator is invoked. (§13.3.1.2/2 [over.match.oper])

The set of candidate functions is constructed according to the rules detailed in §13.3.1.2/3. They consist of the result of the qualified lookup of std::ostream::operator<< (§13.3.1.2/3.1), together with the result of the unqualified lookup of operator<< in the context of the expression. (§13.3.1.2/3.2) Since the operands can't be converted to a pair of promoted integral types, the requirement of clause §13.3.1.2/3.3.3 is not fulfilled and there are no built-in operator candidates.

The best match is the function template specialization

std::ostream& std::operator<< (std::ostream& out, char const*)

from §27.7.3.6.4 [ostream.inserters.character], which behaves like a formatted inserter of out. (§27.7.3.6.1 [ostream.formatted.reqmts])

Therefore, calling this function will begin by constructing an object of class std::ostream::sentry. (§27.7.3.4 [ostream.sentry]) If this object returns true when converted to bool, the function will proceed to create a character sequence seq of 13 characters (the length given by traits::length(), which excludes the terminating null), each widened using out.widen(), to insert seq into out, and to call width(0). (§27.7.3.6.4/3) Finally, the sentry object is destroyed before leaving the function, and it returns its first argument out.
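One observable consequence of the formatted-inserter rules is that a pending field width pads the inserted sequence and is then reset to 0. The sketch below illustrates this with a string stream instead of std::cout; the struct and function names are made up for this example.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical result type for this illustration.
struct InsertResult {
    std::string text;             // what ended up in the stream
    std::streamsize width_after;  // field width after the insertion
};

// Inserts the literal into a fresh string stream with the given field width.
InsertResult insert_with_width(std::streamsize width) {
    std::ostringstream out;
    out.width(width);
    out << "Hello, world!";  // formatted inserter: pads, inserts, calls width(0)
    return InsertResult{out.str(), out.width()};
}
```

With a width of 16 and the default fill and adjustment, the 13 characters are preceded by three spaces, and width() reports 0 afterwards.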

The same procedure is repeated for the next left shift operator, which has a left operand of type std::ostream& and a right operand that names a function template with two template parameters and the signature template<class C, class T> std::basic_ostream<C,T>&(std::basic_ostream<C,T>&). Here, the selected overload is

std::ostream::operator<<(std::ostream&(*f)(std::ostream& os))

from §27.7.3.6.3 [ostream.inserters]. This function returns f(*this), and calling std::endl has the effect of calling os.put(os.widen('\n')) followed by os.flush(). (§27.7.3.8/1 [ostream.manip])
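To make the two selected overloads visible, the expression can be written with explicit operator function calls. This is only a sketch: endl_sketch is a hypothetical stand-in for std::endl restricted to narrow streams, and a string stream replaces std::cout so the result can be inspected.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for std::endl: put a widened newline, then flush.
std::ostream& endl_sketch(std::ostream& os) {
    os.put(os.widen('\n'));
    os.flush();
    return os;
}

// Spells out `out << "Hello, world!" << endl_sketch` as explicit calls.
std::string hello_with_explicit_calls() {
    std::ostringstream out;
    // First operator: the non-member function template taking (ostream&, const char*).
    std::ostream& after_text = std::operator<<(out, "Hello, world!");
    // Second operator: the member function taking a pointer to a manipulator,
    // which returns f(*this).
    after_text.operator<<(endl_sketch);
    return out.str();
}
```

The first call resolves to a non-member function and the second to a member function, which is why the two are spelled differently.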

Finally, control reaches the end of main without encountering a return statement, which has the effect of destroying any objects with automatic storage duration and calling std::exit() with the argument 0. (§3.6.1/5)

The Idealist

It will introduce side-effects, so let's re-write it in a purely functional way.

The Ideologue

I can tell you what the program does, but not what it should do, because it's lacking unit tests and documentation.

The Engineer

From the lack of any platform-specific initialization code, we can infer that the program is intended to be run in a hosted as opposed to a freestanding environment.

Let's for simplicity assume we're on a standard GNU/Linux system on x86_64. This means our process began life when a previously running process called the execve() syscall.

This means it had to store the syscall number 59 in register $rax, the virtual memory addresses of the file name, the argument array, and the environment array in the registers $rdi, $rsi and $rdx, and execute the SYSCALL instruction.
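The register setup described above is what glibc's generic syscall() wrapper performs on our behalf. The following sketch assumes Linux with glibc; the function name is invented, and the fork/wait scaffolding is added only so that the calling process survives and can report what happened.

```cpp
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__x86_64__) && !defined(__ILP32__)
static_assert(SYS_execve == 59, "execve is syscall number 59 on x86_64");
#endif

// Hypothetical helper: forks, issues the raw execve syscall in the child,
// and returns the child's exit status (127 if the exec itself failed, -1 on error).
int run_via_raw_execve(const char* path, char* const argv[], char* const envp[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // On x86_64, syscall() loads the number into rax and the arguments
        // into rdi, rsi and rdx before executing the SYSCALL instruction.
        syscall(SYS_execve, path, argv, envp);
        _exit(127);  // only reached if execve failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the syscall succeeds, it never returns to the child's code, because the process image has been replaced.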

This crosses the border from user space to the kernel by setting the instruction pointer to the address stored in the IA32_LSTAR register, which was set up by the kernel to contain the address of entry_SYSCALL_64, the syscall entry function. (<linux>/arch/x86/kernel/cpu/common.c:syscall_init())

The kernel is now responsible for walking the file system to the given path, and opening the file that was the argument to execve() for reading. (<linux>/fs/exec.c:open_exec())

If the file exists, has the right permissions etc., the binary format of the executable needs to be determined. To do this, the first BINPRM_BUF_SIZE bytes are loaded into memory (<linux>/fs/exec.c:prepare_binprm()), and the list of registered binfmt-handlers is walked to see if one of them recognizes the format.

Probably the compiler will have transformed the program into an ELF file, which can be recognized by the magic bytes "\x7fELF" at the start of the file. In this case, the loading will be performed by <linux>/fs/binfmt_elf.c:load_elf_binary(), where the ELF header and the program header table are loaded into memory.
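The magic-byte check is easy to reproduce in user space. This is a rough sketch of the idea, not the kernel's code; the constant and function names are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <ios>
#include <string>

// The four magic bytes at the start of every ELF file: 0x7f 'E' 'L' 'F'.
constexpr unsigned char kElfMagic[4] = {0x7f, 'E', 'L', 'F'};

// Returns true if the given header bytes start with the ELF magic.
bool has_elf_magic(const std::string& header) {
    return header.size() >= sizeof(kElfMagic) &&
           std::memcmp(header.data(), kElfMagic, sizeof(kElfMagic)) == 0;
}

// Reads the first four bytes of a file and checks them for the ELF magic.
bool file_has_elf_magic(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    char buf[sizeof(kElfMagic)] = {};
    file.read(buf, sizeof(buf));
    return file.gcount() == static_cast<std::streamsize>(sizeof(buf)) &&
           has_elf_magic(std::string(buf, sizeof(buf)));
}
```

Note that writing the magic as the C++ literal "\x7fELF" would not compile as intended, because the E would be parsed as part of the hexadecimal escape; the array initializer above avoids that.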

The first thing that is done is to look for a PT_INTERP segment, which contains the name of the program interpreter, another ELF executable identified by a fixed path on the file system, in our example "/lib64/ld-linux-x86-64.so.2". If there is an interpreter, again the kernel needs to locate the correct file, check permissions, etc.

After all checks are done and passed, the page table of the old process is cleared, and a new mapping is set up. All PT_LOAD segments of the binary are mapped into their respective places, and a memory region for the stack is allocated at a random address. Then, the load segments of the interpreter, which is position-independent, are mapped into private, write-protected pages at some free part of the address space.

When the memory is set up, control is transferred back to user space, in particular to the entry point of the interpreter.

The interpreter reads the DT_NEEDED tags of the binary to determine the shared library dependencies, which will in our case consist of libstdc++.so.6, libc.so.6, libm.so.6, and libgcc_s.so.1. The interpreter tries to locate each of these libraries and map them into memory at a randomly chosen address. A list of library load addresses is maintained in the static global struct _r_debug. (/usr/include/link.h) However, unless the environment variable LD_BIND_NOW is set to a non-empty string, the function symbols will not be resolved right now but lazily on the first call to the respective function.
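A running process can ask the dynamic loader which objects it has mapped. The sketch below uses dl_iterate_phdr() from <link.h>, a different interface than the _r_debug structure mentioned above, and assumes Linux with glibc (or a compatible libc); the helper names are made up.

```cpp
#include <cstddef>
#include <link.h>
#include <string>
#include <vector>

namespace {
// Callback invoked once per loaded object; records the object's path.
int collect_object(struct dl_phdr_info* info, std::size_t /*size*/, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    names->emplace_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // zero means "continue iterating"
}
}  // namespace

// Returns the names of all objects currently mapped by the dynamic loader.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_object, &names);
    return names;
}
```

On a typical glibc system the first entry is the main program, reported with an empty name, followed by the shared libraries it depends on.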

After doing its thing, the dynamic loader passes control to the entry point of the actual binary, which is the symbol _start defined by glibc. (<glibc>/sysdeps/x86_64/start.S) This starting point will set up an initial stack frame, read argc, argv and the environment pointer from the initial process stack prepared by the kernel, and call the C runtime initialization function __libc_start_main. (<glibc>/csu/libc-start.c)

This will run static initialization functions, in particular constructors of all static objects, and install atexit-handlers for static destruction functions (again, in particular destructors of static objects).

Inside main(), the two functions

_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc

defined in the shared library libstdc++.so.6 are called for the first time, so when the program jumps to their PLT-slots, a symbol lookup will be triggered. (<glibc>/elf/dl-lookup.c)
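The two mangled names can be turned back into readable declarations with the demangler from the Itanium C++ ABI support library. This is a small sketch assuming GCC or Clang, which ship <cxxabi.h>; the wrapper function is hypothetical.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of a symbol, or the input unchanged on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result with malloc.
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) return mangled;
    std::string result(raw);
    std::free(raw);
    return result;
}
```

Passing the first symbol yields a declaration of std::endl for char streams, and the second yields the std::operator<< overload taking a char const*.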

What these functions do is more or less up to the standard library implementors, but ultimately the syscall write(1, p, 14) will be issued, where the arguments are the file descriptor 1, which is mapped to stdout, a pointer p containing the address of the string "Hello, world!\n", and the number of bytes that should be written.
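The final syscall can be imitated directly with the POSIX write() wrapper. In this sketch the file descriptor is a parameter so the call can be pointed at a pipe and its output read back; all names are invented for the example, and passing 1 as the descriptor would reproduce the call from the text.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <unistd.h>

constexpr char kMessage[] = "Hello, world!\n";
constexpr std::size_t kMessageLen = sizeof(kMessage) - 1;  // drop the terminating null
static_assert(kMessageLen == 14, "13 characters of text plus the newline");

// Issues write(fd, p, 14) with p pointing at the message.
ssize_t write_hello(int fd) {
    return ::write(fd, kMessage, kMessageLen);
}

// Hypothetical check: sends the message through a pipe and returns what arrives.
std::string hello_through_pipe() {
    int fds[2];
    if (::pipe(fds) != 0) return {};
    std::string received;
    if (write_hello(fds[1]) == static_cast<ssize_t>(kMessageLen)) {
        char buf[32];
        ssize_t n = ::read(fds[0], buf, sizeof(buf));
        if (n > 0) received.assign(buf, static_cast<std::size_t>(n));
    }
    ::close(fds[0]);
    ::close(fds[1]);
    return received;
}
```

The count of 14 is the 13 characters of the literal plus the newline added by std::endl; the terminating null is never written.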

Finally, the process signals to the operating system that it is finished and that all of its resources should be freed and cleaned up by executing the system call exit_group(), with the only argument being the value returned by main(), which is 0.

The Physicist

A program must run on a CPU, and a CPU is made of metal. Information is transmitted through metal by letting electrons flow along local gradients, increasing the entropy of the system. All of these electrons, together with the atoms of the CPU, form a huge quantum system which will evolve according to its wave function. Therefore, we can't know what the program does until we measure its outcome.

The Enlightened

It will print out "Hello, world!".

@scienclopodia

The program will print "Hello world" but it needs to be compiled with Clang, g++ etc. But g++ or Clang is very complicated and finally, g++ or Clang makes a .exe file with a name that contains the variables, the functions and the header files.