@jvns
Last active August 30, 2024 17:24
What happens when I run ./hello

(You can comment here.)

Here I'm trying to understand what happens when I run ./hello

#include <stdio.h>

int main() {
    printf("Hello!\n");
}

a simple "Hello World" program written in C, in Unix -- what I'd have to do if I wanted to write an OS that could execute it.

I'm going to assume that ./hello is statically linked, because that sounds simpler to deal with. It's worth noting that a statically linked hello is 868K on my machine. Eep.

I compiled it using

gcc -static hello.c -o hello

Any (nice!) comments or clarifications are appreciated.

Read from the filesystem

To run a program, I have to be able to find the program. So there would need to be some kind of filesystem and I would need to read the file from somewhere.
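As a rough user-space stand-in for that step, here is a sketch that pulls a whole file into memory. It leans on an existing OS's fopen/fread, which a from-scratch kernel would have to replace with its own filesystem code, and read_whole_file is a made-up helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a whole file into a freshly malloc'd buffer.
 * Returns NULL on failure; on success stores the byte count in *size_out.
 * read_whole_file is a hypothetical helper, not a standard function. */
unsigned char *read_whole_file(const char *path, size_t *size_out) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);

    /* Allocate at least one byte so an empty file still gets a valid buffer. */
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) { free(buf); return NULL; }

    *size_out = (size_t)size;
    return buf;
}
```

Calling read_whole_file("./hello", &size) would hand back the raw bytes of the executable, which the caller is expected to free.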

Copy the text into memory

In a Unix system, executables are in the ELF format.

So I would need to copy the "text" of the program somewhere.
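To find out what to copy, a loader first has to read the ELF header. The sketch below checks the magic bytes, pulls out the entry point, and counts the PT_LOAD program headers (the segments that are meant to be copied into memory). It assumes a 64-bit ELF file, a host with the same byte order as the file, and Linux's <elf.h>; elf_summary and summarize_elf are made-up names.

```c
#include <elf.h>      /* Linux header with the ELF struct definitions */
#include <stddef.h>
#include <string.h>

/* The first things a loader wants to know about an ELF file.
 * elf_summary and summarize_elf are hypothetical names for this sketch. */
struct elf_summary {
    Elf64_Addr entry;     /* address where execution should start */
    int load_segments;    /* number of PT_LOAD program headers */
};

/* Returns 0 on success, -1 if buf does not look like a usable 64-bit ELF file. */
int summarize_elf(const unsigned char *buf, size_t size, struct elf_summary *out) {
    Elf64_Ehdr eh;
    if (size < sizeof eh) return -1;
    memcpy(&eh, buf, sizeof eh);   /* memcpy avoids alignment trouble */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return -1;   /* "\x7fELF" */
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return -1;
    if (eh.e_phnum > 0 && eh.e_phentsize != sizeof(Elf64_Phdr)) return -1;
    if (eh.e_phoff > size) return -1;

    out->entry = eh.e_entry;
    out->load_segments = 0;
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;
        if (off > size || size - off < sizeof ph) return -1;   /* truncated file */
        memcpy(&ph, buf + off, sizeof ph);
        if (ph.p_type == PT_LOAD) out->load_segments++;
    }
    return 0;
}
```

The same information is what readelf -h and readelf -l print for a real binary.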

Copy the data into memory

There is a string in the program. It needs to go somewhere.
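In ELF terms, the string most likely lives in one of the loadable segments, so "putting it somewhere" comes down to copying segments. A minimal sketch, assuming Linux's <elf.h> and a made-up helper name copy_segment: copy the bytes that exist in the file, then zero the remainder that only exists in memory.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Copy one PT_LOAD segment from the file image into dest.
 * dest must have room for ph->p_memsz bytes. copy_segment is a hypothetical name.
 * Returns 0 on success, -1 if the header is inconsistent with the file. */
int copy_segment(unsigned char *dest, const unsigned char *file,
                 size_t file_size, const Elf64_Phdr *ph) {
    if (ph->p_filesz > ph->p_memsz) return -1;
    if (ph->p_offset > file_size || file_size - ph->p_offset < ph->p_filesz)
        return -1;

    /* Bytes stored in the file: code, string constants, initialized data. */
    memcpy(dest, file + ph->p_offset, ph->p_filesz);
    /* The remaining p_memsz - p_filesz bytes are zero-initialized data
     * that the file does not store. */
    memset(dest + ph->p_filesz, 0, ph->p_memsz - ph->p_filesz);
    return 0;
}
```

A real loader would place each segment at the virtual address given by p_vaddr rather than in an arbitrary buffer; the buffer here just keeps the sketch testable.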

Give the program a stack pointer

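This program doesn't call malloc itself, so perhaps it does not need a heap and it doesn't matter where the heap pointer is (though the C library may still allocate memory behind the scenes, for example for printf's output buffer). It does need a stack. (See the Stack Overflow question on how the stack works in assembly.)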
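A small sketch of what "giving the program a stack pointer" could mean: reserve a block of memory and point at its top, because the stack grows downward on x86-64. Here malloc stands in for whatever page allocator the OS would have, and alloc_stack is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Reserve size bytes for a stack and return the initial stack pointer.
 * The stack grows downward on x86-64, so the pointer starts at the top
 * (highest address) of the region. alloc_stack is a hypothetical name.
 * Returns NULL on failure; *base_out receives the lowest address. */
void *alloc_stack(size_t size, void **base_out) {
    void *base = malloc(size);   /* stand-in for a real page allocator */
    if (base == NULL) return NULL;
    *base_out = base;

    uintptr_t top = (uintptr_t)base + size;
    top &= ~(uintptr_t)0xF;      /* round down to 16-byte alignment */
    return (void *)top;
}
```

On Linux the kernel also places argc, argv, the environment and some auxiliary data on the new stack before the program starts, so the real starting stack pointer sits a little below the top of the region.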

Implement system calls

hello has some system calls in it. I found this out by running

objdump -d -M intel hello | grep 'syscall'

syscall is an assembly instruction for making a system call. That looks like

  401385:       b8 03 00 00 00          mov    eax,0x3
  40138a:       0f 05                   syscall 

The number stored in eax selects which system call is made. In this case it is 3, which is close on x86-64 Linux.

There are 119 instances of syscall, and it's using several different system calls. This is worrying.

(Explained more in this stackoverflow question)
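To see how little is strictly needed to print a string, here is a sketch that calls the write system call directly through the C library's syscall() wrapper, with no printf involved. It assumes Linux with glibc; raw_write is a made-up name. On x86-64 Linux, write is system call number 1.

```c
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Write len bytes to fd using the write system call directly,
 * skipping printf and stdio buffering. raw_write is a hypothetical name.
 * Returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const char *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```

With this, raw_write(1, "Hello!\n", 7) sends the greeting to standard output (file descriptor 1).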

Making sure a stack overflow doesn't happen

I have no idea how the OS would check up on the program. I guess it doesn't just let the program run, but takes away control periodically and makes sure the stack pointer hasn't moved too far. How would it take away control? Hmm.

When there is a stack overflow I guess it sends a signal to the program, which is a POSIX thing.

I do not understand this.

???

There are no mallocs in the program, so I would not need to allocate memory for it or anything.

What else?!??

Outstanding questions

  • How long would this take for a human (where human = me) to write from scratch?
  • Is there a way to write a smaller program with fewer system calls and less magic? There are like 50 system calls and what are they even doing?
  • Do I need a heap if I never use malloc?
  • Could I write my own printf in assembly that does less and is simpler? Just printing a string is pretty easy...
  • How do I kill a program?

Useful links

rythie commented Nov 29, 2013

On your query about interrupts...

Interrupts can be raised by any hardware that has an IRQ line. Interrupts are sent directly to one of the CPUs, which has a location in memory set up to jump to, so it can handle that interrupt quickly and get back to what it was doing.

Network interfaces, USB and disk controllers usually raise interrupts when they have something ready to send, i.e. a new packet has come in on the network interface (though sometimes they batch them, which is called interrupt mitigation). There is also a programmable timer chip on the motherboard, which in Linux is typically programmed to interrupt a set number of times a second (e.g. 1000 on my box), which calls the scheduler to run.

Processes run until they block (by doing a system call) or run out of their timeslice (~20ms). When a process blocks, the system handles the system call and typically that has some wait in it, e.g. it asks for something from disk that might take 1-10ms. The OS then does something else in that 1-10ms. When the data comes back from the disk an interrupt is raised and the system puts that into memory. The original process is now runnable again and will be run by the scheduler based on its algorithm. If the process runs over its timeslice (yours wouldn't, but one doing some CPU-intensive stuff would), something else can be scheduled for a bit before it gets a chance to run again.

lenary commented Dec 5, 2013

Here's another resource for ELF Files: http://i.imgur.com/GZ5a0sb.png

I think it's really neat; how helpful it is, though, remains to be seen.
