@kenballus
Last active September 18, 2024 17:51

Pointers

This is a primer on pointers for students learning C, and thus sacrifices some accuracy for clarity. Please do not treat it as absolute truth.

Before a program runs, its code and data are copied into memory. Thus, all code and data are associated with memory addresses during program execution. A pointer is just a variable that contains one such memory address. We will now address a few of the most common use cases for pointers in C.

In-out arguments

Consider the following C program:

#include <stdio.h>

void set_to_one(int b) {
    b = 1;
}

int main(void) {
    int a = 0;
    set_to_one(a);
    printf("%d\n", a);
    return 0;
}

If you compile and run this program, you'll see that it prints 0. This is a result of how argument passing works in C. The value of a is copied into b when set_to_one gets called. Then, b gets set to 1, and set_to_one returns. Since we assigned to a copy of a, and not to a itself, it makes sense that a is unchanged.

However, sometimes we want functions to change the values of their arguments. This is a common use case for pointers in C.

Consider this C program, a modified version of the program above:

#include <stdio.h>

void set_to_one_v2(int* b_ptr) {
    *b_ptr = 1;
}

int main(void) {
    int a = 0;
    set_to_one_v2(&a);
    printf("%d\n", a);
    return 0;
}

If you compile and run this program, you'll see that it prints 1. There are three changes between the first and second programs. The first change is in the parameter of set_to_one_v2. Instead of taking an int b, set_to_one_v2 takes an int* (read "int pointer") b_ptr.[1]

Since a is an int, and not an int*, it makes sense that you wouldn't be able to pass a directly into set_to_one_v2, since a's type does not match the argument type for set_to_one_v2. Thus, we need to use the & (read "address of") operator to get the address of a, and pass that to set_to_one_v2 instead. Since a has type int, its address has type int*, so the types work out. That's change #2.

Change #3 is probably the most confusing: instead of doing b_ptr = 1, we do *b_ptr = 1. Why the extra *? Well, remember what set_to_one_v2 does; it sets the variable pointed to by b_ptr to 1. If we did b_ptr = 1, we would be setting b_ptr (which is an address) to be the address 1. This is not what we want; in all likelihood, the thing at memory address 1 is not something we want to mess with. What we want to do is assign to the thing pointed to by b_ptr. That's what the * does; it dereferences the pointer, giving us access to the value that it points to (the value at the memory address stored in b_ptr). We then assign to the dereferenced pointer. This amounts to an assignment to a, because we passed the address of a into set_to_one_v2.

A big sticking point for students picking up pointers for the first time is that * has two different pointer-related meanings in C: it's used in type declarations to state that a variable is of pointer type, and it also acts as the dereference operator.

Array Decay

There's an important relationship between pointers and arrays in C. In C, an array is a pointer to its first element.[2] So, if you've been using arrays, you've already been using pointers. For instance, if I give you an array of ints, and ask you to write a program to print them in a loop, you might do something like this:

#include <stdio.h>

int main(void) {
    int my_array[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    for (int i = 0; i < 10; i++) {
        printf("%d ", my_array[i]);
    }
    printf("\n");
}

You might be surprised to learn that this is an equivalent program:

#include <stdio.h>

int main(void) {
    int my_array[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    for (int i = 0; i < 10; i++) {
        printf("%d ", *(my_array + i));
    }
    printf("\n");
}

In C, p[q] is just syntactic sugar for *(p + q).[3] What's going on here is that adding an integer i to a pointer increases the pointer's value (which is an address, in bytes) by i times the size (again, in bytes) of the type pointed to. So my_array + i is equal to the address of the first element of my_array, plus i * sizeof(int) bytes, and *(my_array + i) is equivalent to dereferencing a pointer to the element at index i.

Dodging Inefficient Copying

Perhaps the most common use of pointers is for efficiently passing large structs into functions. Passing such structs into functions the normal way (i.e. without pointers) is inefficient because they'll be copied during the function call, and copying large portions of memory is an expensive operation. However, if we pass them by pointer, the only thing that needs to be copied is the address of the struct, which is of fixed size. Here's an example:

#include <stdio.h>

struct person {
    char* name;
    int age;
    // Pretend that there are 1000 more fields here, so the struct would be inefficient to copy around.
};

void print_person_info(struct person* person_ptr) {
    printf("Hi! My name is %s, and I am %d years old.\n", person_ptr->name, person_ptr->age);
}

Note that in print_person_info, we can't directly access person_ptr.age, because person_ptr is a pointer type, and therefore not a struct with a field named age. In order to access a field in the struct that person_ptr points to, we should first dereference person_ptr, then access the field, like this: (*person_ptr).age. Because pointer dereferencing and field access are very commonly paired together, there's a useful C operator that does both at once: ->. person_ptr->age is shorthand for (*person_ptr).age.

Footnotes

  1. You'll sometimes see this written as int * b_ptr, or int *b_ptr. For this document, I put the * to the left to emphasize that it is a type modifier, and not part of the variable name, but putting it to the right is actually more common. The reason for this is that int* a, b; is equivalent to int* a; int b;, not int* a; int* b;, so the * really binds to the variable name rather than to the type.

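  2. This is a little bit untrue. Technically, an array is distinct from a pointer, but "decays" to a pointer to its first element in most expressions. The main exceptions are when the array is the operand of sizeof or of the & operator. For most purposes, you can treat arrays as if they were pointers.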

  3. Consequently, you can "index" any pointer with array indexing syntax ([]). Be careful, though: this is a common way to access memory out of bounds.
