Before a program runs, its code and data are copied into memory. Thus, every piece of code and data has a memory address during program execution. A pointer is a value that holds such a memory address. We will now look at a few of the most common uses for pointers in C.
Consider the following C program:
#include <stdio.h>

void set_to_one(int b) {
    b = 1;
}

int main(void) {
    int a = 0;
    set_to_one(a);
    printf("%d\n", a);
}
If you compile and run this program, you'll see that it prints 0. This is a result of how argument passing works in C. The value of a is copied into b when set_to_one gets called. Then, b gets set to 1, and set_to_one returns. Since we assigned to a copy of a, and not to a itself, it makes sense that a is unchanged.
However, sometimes we want functions to change the values of their arguments.
Consider this C program, a modified version of the program above:
#include <stdio.h>

void set_to_one(int* b_ptr) {
    *b_ptr = 1;
}

int main(void) {
    int a = 0;
    set_to_one(&a);
    printf("%d\n", a);
}
If you compile and run this program, you'll see that it prints 1. There are 3 changes between the first and second programs.
- The argument to set_to_one has a different type. Instead of set_to_one taking an integer b, it takes an int* (read "int pointer") b_ptr. [1]
- Since a is an int, and not an int*, you can't pass a directly into set_to_one: a's type does not match the argument type for set_to_one. Thus, we need to use the & (read "address of") operator to get the address of a, and pass that to set_to_one instead. Since a has type int, its address has type int*, so the types work out.
- Instead of doing b_ptr = 1, we do *b_ptr = 1. Why the extra *? Well, if we did b_ptr = 1, we would be setting b_ptr (which is an address) to be the address 1, and most compilers will at least warn about assigning an integer to a pointer. This is not what we want; in all likelihood, the thing at memory address 1 is not something we want to mess with. What we want to do is assign to the thing pointed to by b_ptr. That's what the * does; it dereferences the pointer, giving us access to the value that it points to (the value at the memory address stored in b_ptr). We then assign to the dereferenced pointer. This amounts to an assignment to a, because we passed the address of a (&a) into set_to_one.
A big sticking point for students picking up pointers for the first time is that * has two different pointer-related meanings in C: it's used in type declarations to state that a variable is of pointer type, and it also acts as the dereference operator.
There's an important relationship between pointers and arrays in C. In C, an array is a pointer to its first element. [2] So, if you've been using arrays, you've already been using pointers. For instance, if I give you an array of ints, and ask you to write a program to print them in a loop, you might do something like this:
#include <stdio.h>

int main(void) {
    int my_array[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    for (int i = 0; i < 10; i++) {
        printf("%d ", my_array[i]);
    }
    printf("\n");
}
You might be surprised to learn that this is an equivalent program:
#include <stdio.h>

int main(void) {
    int my_array[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    for (int i = 0; i < 10; i++) {
        printf("%d ", *(my_array + i));
    }
    printf("\n");
}
In C, p[q] is just syntactic sugar for *(p + q). [3] What's going on here is that adding an integer i to a pointer increases the pointer's value (which is an address, in bytes) by i times the size (again, in bytes) of the type pointed to. Thus, my_array + i is equal to the address of the first thing in my_array, plus i * sizeof(int) bytes. Thus, *(my_array + i) is equivalent to dereferencing a pointer to the ith thing in the array.
Perhaps the most common use of pointers is for efficiently passing large structs into functions. Passing such structs into functions the normal way (i.e. without pointers) is inefficient because they'll be copied during the function call, and copying large portions of memory is an expensive operation. However, if we pass them by pointer, the only thing that needs to be copied is the address of the struct, which is of fixed size. Here's an example:
#include <stdio.h>

struct person {
    char* name;
    int age;
    // Pretend that there are 1000 more fields here, so the struct would be inefficient to copy around.
};

void print_person_info(struct person *person_ptr) {
    printf("Hi! My name is %s, and I am %d years old.\n", person_ptr->name, person_ptr->age);
}
Note that in print_person_info, we can't directly access person_ptr.age, because person_ptr is a pointer type, and therefore not a struct with a field named age. In order to access a field in the struct that person_ptr points to, we should first dereference person_ptr, then access the field, like this: (*person_ptr).age. Because pointer dereferencing and field access are very commonly paired together, there's a useful C operator that does both at once: ->. person_ptr->age is shorthand for (*person_ptr).age.
Footnotes
[1] You'll sometimes see this written as int * b_ptr, or int *b_ptr. For this document, I put the * to the left to emphasize that it is a type modifier, and not part of the variable name, but putting it to the right is actually more common. The reason for this is that int* a, b; is equivalent to int* a; int b;, not int* a; int* b;, so the * really binds to the variable name that follows it rather than to the type on its left.
[2] This is a little bit untrue. Technically, an array is distinct from a pointer, but can "decay" to a pointer. For most everyday purposes you can treat arrays as if they are pointers. One visible difference is sizeof: applied to an array it gives the size of the whole array, while applied to a pointer it gives only the size of an address.
[3] Consequently, this means that you can "index" any pointer with array indexing syntax ([]). Be careful, though. This is a common way to access memory out of bounds.