@aw-junaid
Created February 28, 2026 01:05
Mastering C Programming – From Foundations to Systems-Level Mastery

A Comprehensive Guide to C Programming Language


PART I – Foundations of C Programming

Chapter 1 – Introduction to C Programming

1.1 History of C

The C programming language stands as one of the most influential and enduring languages in the history of computing. Its development in the early 1970s at Bell Laboratories by Dennis Ritchie marked a pivotal moment that would shape the future of systems programming for decades to come. To truly appreciate C, one must understand the context of computing in the late 1960s and early 1970s, a period of rapid innovation and experimentation in programming languages and operating systems.

Before C, systems programming was primarily conducted in assembly language, which was machine-specific, time-consuming to write, and difficult to maintain across different hardware platforms. Higher-level languages like Fortran and COBOL existed but were designed for scientific and business applications respectively, lacking the low-level access necessary for operating system development. The need for a language that could bridge the gap between high-level abstraction and low-level hardware control was becoming increasingly apparent.

1.2 From BCPL to C

The lineage of C can be traced through two predecessor languages: BCPL (Basic Combined Programming Language) and B. BCPL, developed by Martin Richards at the University of Cambridge in 1966, was designed for writing compilers and system software. It introduced many concepts that would later influence C, including a relatively simple syntax and the notion of using braces for block structuring.

In 1969, Ken Thompson, also working at Bell Labs, created the B language as a simplified version of BCPL for use on the early UNIX system being developed for the PDP-7 minicomputer. B was typeless, treating all data as machine words, and while suitable for early UNIX development, its limitations became apparent as the system grew more complex. B lacked the ability to handle different data types efficiently, making systems programming cumbersome and error-prone.

1.3 The Role of Dennis Ritchie

Dennis Ritchie, recognizing the limitations of B, began work on an evolved version that would address these shortcomings. The result, initially called "New B" and later simply C, introduced a type system that allowed variables to be declared as integers, characters, floating-point numbers, and more complex data structures. This typing system provided the compiler with crucial information about how to manipulate data, enabling more efficient code generation while maintaining the low-level access that systems programming demanded.

Ritchie's genius lay not just in creating a new language, but in designing one that was simultaneously high-level enough to be portable across different machines and low-level enough to replace assembly language for systems programming. C provided direct access to memory through pointers, allowed bit manipulation, and generated code that was nearly as efficient as hand-written assembly. This combination of power and portability was unprecedented.

By 1973, C had matured enough that Ritchie and Thompson rewrote the UNIX operating system kernel in C. This was a landmark achievement - previously, operating systems were considered too low-level to be written in anything but assembly language. The success of this endeavor demonstrated that C could indeed replace assembly for systems programming while making the code far more maintainable and portable.

1.4 C and UNIX

The relationship between C and UNIX is symbiotic and profound. UNIX provided the environment in which C was developed, and C became the language in which UNIX was written. This mutual reinforcement led to the explosive growth of both technologies. As UNIX was distributed to universities and research institutions throughout the 1970s, complete with its C source code, a generation of computer scientists learned systems programming by studying and modifying the UNIX kernel and utilities.

The portability that C afforded UNIX was revolutionary. In 1977, the UNIX operating system was ported to the Interdata 8/32, a machine completely different from the PDP-11 on which it was originally developed. This porting effort, accomplished in just a few months, demonstrated that an entire operating system could be moved to new hardware primarily by recompiling its source code, with only a small amount of machine-specific assembly code requiring modification.

1.5 Why C is Still Relevant

Despite being over five decades old, C remains one of the most important programming languages in use today. Its longevity can be attributed to several fundamental characteristics that have proven remarkably durable in the face of changing computing paradigms.

First and foremost, C provides a level of control over system resources that few languages can match. Programmers can manage memory explicitly, control hardware directly through memory-mapped I/O, and produce code that executes with minimal overhead. This makes C irreplaceable in domains where performance and resource constraints are paramount.

Second, C's simplicity and minimalism have contributed to its endurance. The language has a small set of features that can be learned relatively quickly, yet these features can be combined to express complex algorithms and data structures. Kernighan and Ritchie's "The C Programming Language" remains one of the most concise and effective programming books ever written, reflecting the language's elegant design.

Third, C serves as the lingua franca of systems programming. Operating systems, embedded systems, database engines, and programming language runtimes are predominantly written in C. Understanding C provides insight into how computers actually work at a fundamental level, making it invaluable for any serious programmer.

1.6 Applications of C

The versatility of C is demonstrated by its wide range of applications across virtually every domain of computing.

Operating Systems: Beyond UNIX, operating systems including Linux, Windows (kernel components), macOS (XNU kernel), and countless embedded operating systems are written primarily in C. The OS kernel must manage hardware resources, schedule processes, handle interrupts, and provide system call interfaces - tasks requiring the low-level control that C provides.

Embedded Systems: From microcontrollers in automotive systems to IoT devices, C dominates embedded programming. These resource-constrained environments demand efficient code and direct hardware access, making C the natural choice. Embedded C programmers work with memory-mapped registers, interrupt service routines, and real-time constraints.

Compilers and Interpreters: Many programming language implementations are written in C, including the original implementations of Python, Ruby, PHP, and numerous others. The efficiency of C makes it suitable for building language tools that must process source code quickly.

Database Engines: SQLite, one of the most widely deployed software libraries in existence, is written entirely in C. Database engines require efficient memory management, file I/O, and complex data structures - areas where C excels.

Security Tools: Network scanners, vulnerability assessment tools, exploit frameworks, and fuzzers are frequently written in C. The language's ability to interact directly with system interfaces and network protocols makes it ideal for security applications.

1.7 Overview of C Standards

The evolution of C has been carefully managed through formal standardization to ensure consistency across implementations while allowing for controlled language evolution.

C89/C90: The first ANSI standard (X3.159-1989) and subsequent ISO standard (ISO/IEC 9899:1990) formalized the language that had evolved since the 1970s. This standard, largely based on the language described in the first edition of Kernighan and Ritchie's book, established the core features that most C programmers recognize: function prototypes, the standard library, and consistent semantics across implementations.

C99 (ISO/IEC 9899:1999): This major revision added several significant features: inline functions, variable-length arrays, designated initializers, compound literals, the long long type, and single-line comments using //. C99 also introduced support for complex numbers and improved floating-point support.

C11 (ISO/IEC 9899:2011): Building on C99, C11 added multithreading support, atomic operations, bounds-checking interfaces, and anonymous structures and unions. The standard also made some features optional, acknowledging that implementations for embedded systems might not support the full language.

C17 (ISO/IEC 9899:2018): This was primarily a bug-fix release, addressing defects in the C11 standard without introducing new language features. It serves as the current baseline for many implementations.

C23 (ISO/IEC 9899:2024): The most recent standard, informally called C23 and published in 2024, introduces features including nullptr for null pointers, attributes similar to C++, binary integer literals, and the typeof operator. These additions modernize the language while largely maintaining backward compatibility.

1.8 Structure of a C Program

A C program follows a specific structure that, while flexible, maintains certain conventions that make code readable and maintainable. At its simplest, a C program consists of a collection of functions, one of which must be named main and serves as the program's entry point.

Consider the canonical "Hello, World!" program:

#include <stdio.h>

int main(void) {
    printf("Hello, World!\n");
    return 0;
}

This minimal program illustrates several essential elements:

The #include directive is a preprocessor command that inserts the contents of the standard input/output header file, which contains declarations for functions like printf, before compilation proper begins. This separation of declarations from implementations is fundamental to C's modular design.

The function main is defined to return an integer (int) and takes no arguments (void). The int return value communicates the program's exit status to the operating system, with 0 conventionally indicating successful execution.

The body of main contains a single statement calling printf with a string argument. The \n represents a newline character, and the statement ends with a semicolon, which terminates all C statements.

1.9 Compilation Pipeline

Understanding the compilation pipeline is crucial for effective C programming. The process of transforming human-readable source code into an executable program involves several distinct stages, each performing specific transformations.

Preprocessing: The first stage processes directives beginning with #. The preprocessor removes comments, expands macros, includes header files, and conditionally compiles code based on #ifdef and similar directives. The output is a translation unit ready for compilation.

Compilation: The compiler translates the preprocessed source code into assembly language specific to the target processor. This stage involves lexical analysis (breaking source into tokens), parsing (building an abstract syntax tree), semantic analysis (type checking), and code generation.

Assembly: The assembler converts assembly language into machine code, producing an object file containing relocatable machine code and metadata about symbols (functions and variables) defined and referenced in the file.

Linking: The linker combines one or more object files with library code to produce the final executable. It resolves symbol references, relocates code to specific memory addresses, and produces an executable file in the appropriate format for the target operating system.

This multi-stage process provides flexibility at each step. Object files can be compiled separately and linked later, enabling incremental compilation of large programs. Different optimization levels can be applied, and various debugging information can be included.

1.10 Installing Development Tools

Setting up a C development environment requires a compiler, linker, and usually additional tools like debuggers and build automation systems. The specific tools vary by platform.

On Linux: GCC (GNU Compiler Collection) is typically installed by default or easily added through package managers. For Debian-based systems: sudo apt-get install build-essential. For Red Hat-based systems: sudo yum groupinstall "Development Tools". Clang, an alternative compiler, can be installed similarly: sudo apt-get install clang.

On macOS: Apple's Xcode Command Line Tools provide the Clang compiler (the gcc command on macOS is typically an alias for Clang). Install with xcode-select --install in the terminal, which installs the compiler, linker, and associated tools.

On Windows: Microsoft Visual Studio provides the MSVC compiler and development environment. Alternatively, MinGW-w64 provides GCC for Windows, and Cygwin offers a Unix-like environment with development tools. Windows Subsystem for Linux (WSL) enables running Linux development tools directly on Windows.

1.11 Writing Your First Program

Beyond "Hello, World!", a first program should introduce fundamental concepts incrementally. Let's expand our example to demonstrate variables, input, and basic arithmetic:

#include <stdio.h>

int main(void) {
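    int first_number, second_number, sum;
    
    printf("Enter two integers: ");
    /* scanf returns the number of items successfully converted */
    if (scanf("%d %d", &first_number, &second_number) != 2) {
        fprintf(stderr, "Invalid input\n");
        return 1;
    }
    
    sum = first_number + second_number;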
    
    printf("The sum of %d and %d is %d\n", 
           first_number, second_number, sum);
    
    return 0;
}

This program introduces several new concepts: variable declaration (int first_number), input with scanf, the address-of operator & needed for input functions, and formatted output with multiple placeholders.

1.12 Common Beginner Mistakes

New C programmers frequently encounter certain pitfalls that, while initially frustrating, provide valuable learning opportunities about the language's semantics.

Missing semicolons: Every statement in C must end with a semicolon. The compiler's error messages for missing semicolons can be confusing because the error is often reported on the following line.

Forgetting to use & in scanf: The scanf function requires pointers to variables so it can modify them. Beginners often write scanf("%d", number) instead of scanf("%d", &number), leading to undefined behavior as scanf attempts to write to an invalid memory location.

Buffer overflows: Reading input without bounds checking is a classic C pitfall. Using gets (now removed from the standard) or scanf with %s without field width specifiers can overflow character arrays.

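Uninitialized variables: Local (automatic) variables in C are not automatically initialized. Reading one before assigning it a value leads to undefined behavior, often manifesting as seemingly random program behavior. Variables with static storage duration, by contrast, start out zero-initialized.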

Integer division confusion: When both operands of / are integers, C performs integer division, truncating toward zero. Beginners expecting 5/2 to yield 2.5 are surprised to get 2 unless at least one operand is a floating-point type.

Equality vs assignment confusion: The assignment operator = is easily confused with the equality operator == in conditional expressions. The common mistake if (x = 5) assigns 5 to x and always evaluates to true, rather than checking if x equals 5.

Off-by-one errors: C arrays are zero-indexed, meaning the first element is at index 0. Loops that iterate through arrays must account for this, typically using for (i = 0; i < n; i++) rather than starting at 1 or using <= conditions.
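A few of these pitfalls can be sketched in code. The helper names below (ratio, read_line, parse_int) are hypothetical and chosen only for illustration; this is one reasonable way to avoid the problems, not the only one.

```c
#include <stdio.h>
#include <string.h>

/* Integer division truncates, so convert one operand before dividing.
   Assumes b is not zero. */
double ratio(int a, int b) {
    return (double)a / b;
}

/* Reads one line into buf without overflowing it (at most size - 1
   characters are stored) and strips the trailing newline, if any.
   Returns 1 on success, 0 on end-of-file or error. */
int read_line(FILE *stream, char *buf, size_t size) {
    if (fgets(buf, (int)size, stream) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Parses an int from text. Note the pointer argument: sscanf, like
   scanf, needs the address of the variable it writes to.
   Returns 1 if a number was converted, 0 otherwise. */
int parse_int(const char *text, int *out) {
    return sscanf(text, "%d", out) == 1;
}
```

Reading a whole line with fgets and then parsing it keeps the bounds check and the conversion check separate, which tends to be easier to reason about than a bare scanf call.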


Chapter 2 – The C Compilation Process

2.1 Source Code to Executable

The journey from source code to executable involves multiple transformations, each building upon the previous stage's output. Understanding this process helps programmers diagnose compilation errors, optimize code, and work effectively with build systems.

When you invoke the C compiler with a command like gcc program.c -o program, the compiler driver orchestrates the entire compilation pipeline automatically. However, each stage can be examined independently to understand the transformations occurring at each step.

The preprocessed output can be examined with gcc -E program.c, showing the source after macro expansion and header inclusion. The assembly output appears with gcc -S program.c, producing a .s file containing human-readable assembly instructions. Object files are generated with gcc -c program.c, producing .o files that can be linked later.

2.2 Preprocessor Directives

The C preprocessor is a powerful text-processing tool that operates before the compiler proper. While technically separate from the language, preprocessor directives are an integral part of C programming.

Directives begin with #, which must be the first non-whitespace character on its line. The most common directives include:

#include for file inclusion: This directive inserts the contents of the specified file at the current location. System headers are enclosed in angle brackets (#include <stdio.h>), telling the preprocessor to search system include paths. User headers use quotes (#include "myheader.h"), searching the current directory first.

#define for macro definition: Simple macros define constants: #define BUFFER_SIZE 1024. More complex macros can take arguments: #define MAX(a,b) ((a) > (b) ? (a) : (b)). However, function-like macros are error-prone and often better replaced with inline functions or enum constants in modern C.

Conditional compilation directives (#if, #ifdef, #ifndef, #else, #elif, #endif) allow selective compilation of code sections based on conditions evaluated during preprocessing. This is essential for writing portable code that adapts to different platforms and for including debugging code only in development builds.

#ifdef DEBUG
    printf("Variable x has value: %d\n", x);
#endif

#if defined(__linux__)
    // Linux-specific code
#elif defined(__APPLE__)
    // macOS-specific code
#endif
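The warning about function-like macros can be made concrete with a small sketch. The two counting functions below are hypothetical helpers that report how many times an argument with a side effect gets evaluated.

```c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* An inline function evaluates each argument exactly once. */
static inline int max_int(int a, int b) {
    return a > b ? a : b;
}

/* Returns how many times i was incremented by MAX(i++, limit).
   The macro pastes i++ into the expression twice, so when the first
   comparison is true the increment runs a second time. */
int macro_increment_count(int start, int limit) {
    int i = start;
    int m = MAX(i++, limit);
    (void)m;
    return i - start;
}

/* Same experiment with the inline function: always one increment. */
int function_increment_count(int start, int limit) {
    int i = start;
    int m = max_int(i++, limit);
    (void)m;
    return i - start;
}
```

With start greater than limit, the macro version increments i twice while the function version increments it once, which is the kind of surprise that makes inline functions the safer default.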

2.3 Object Files and Linking

Object files contain machine code that is not yet executable because addresses of external symbols (functions and variables defined in other files) are not resolved. The object file format typically includes several sections:

The text section contains executable instructions. The data section holds initialized global and static variables. The bss section (block started by symbol) reserves space for uninitialized global and static variables, which are initialized to zero at program startup. The symbol table lists symbols defined in this object file and symbols referenced but not defined.

Linking combines multiple object files, resolves external references, and produces an executable. The linker matches each symbol reference with its definition, relocates code by adjusting addresses, and combines sections from all input files into the final executable.
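As a rough illustration of how definitions map onto these sections, consider the hypothetical translation unit below. The section named in each comment is where typical toolchains place the object; the exact placement is an implementation detail and may differ.

```c
/* Initialized global: usually placed in the data section. */
int initialized_counter = 42;

/* Global without an initializer: usually placed in bss and
   guaranteed to be zero at program startup. */
int zeroed_counter;

/* static gives internal linkage: the symbol is not visible to
   other object files, so the linker cannot resolve references
   to it from elsewhere. */
static int file_local_total;

/* Read-only data: often placed in a separate read-only section. */
const char banner[] = "demo";

/* Function bodies are executable instructions in the text section. */
int add_to_total(int amount) {
    file_local_total += amount;
    return file_local_total;
}
```

Compiling this file with gcc -c and listing the resulting object file with a symbol tool such as nm is one way to see which symbols are exported and which are local.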

2.4 Static vs Dynamic Linking

Linking determines how external library code is incorporated into the executable, with important implications for program size, memory usage, and maintenance.

Static linking copies library code directly into the executable. This produces larger executables but ensures that the program always uses the exact library version it was linked with. Static libraries typically have .a (archive) extensions on Unix-like systems and .lib on Windows. Static linking is specified with compiler flags like -static on GCC.

Dynamic linking (also called shared linking) defers resolution of some symbols until program load time or runtime. The executable contains references to shared libraries (.so on Linux, .dylib on macOS, .dll on Windows) that are loaded when the program starts. This reduces executable size and allows multiple programs to share a single copy of the library in memory. However, it introduces dependencies on specific library versions being present on the system.

The choice between static and dynamic linking involves trade-offs. Static linking provides independence and predictability but wastes memory and disk space. Dynamic linking conserves resources but can lead to "DLL hell" when incompatible library versions cause programs to fail.

2.5 Executable File Formats

Different operating systems use different executable file formats, each designed to support the operating system's process loading and memory management model.

ELF (Executable and Linkable Format) is the standard on Linux and many Unix-like systems. ELF files contain headers describing the file structure, program headers specifying how to create process memory images, and section headers for linking and debugging.

PE (Portable Executable) format is used on Windows, derived from the earlier COFF format. PE files include DOS and NT headers, section tables, and import/export tables for dynamic linking.

Mach-O is the format on macOS and iOS, supporting multiple architecture slices (universal binaries) and sophisticated dynamic linking features.

2.6 Using Makefiles

As programs grow beyond single source files, manual compilation becomes impractical. Makefiles provide a declarative way to specify build dependencies and commands.

A simple Makefile:

CC = gcc
CFLAGS = -Wall -O2
TARGET = myprogram
OBJS = main.o utils.o fileio.o

# Note: each recipe line must begin with a tab character, not spaces.
$(TARGET): $(OBJS)
	$(CC) $(OBJS) -o $(TARGET)

main.o: main.c utils.h fileio.h
	$(CC) $(CFLAGS) -c main.c

utils.o: utils.c utils.h
	$(CC) $(CFLAGS) -c utils.c

fileio.o: fileio.c fileio.h
	$(CC) $(CFLAGS) -c fileio.c

clean:
	rm -f $(OBJS) $(TARGET)

.PHONY: clean

Make examines file timestamps to rebuild only those files whose dependencies have changed. This incremental compilation saves significant time in large projects.

2.7 Build Systems (CMake, Ninja)

Modern build systems address limitations of traditional Make, particularly cross-platform portability and complex configuration.

CMake generates native build files (Makefiles, Ninja files, Visual Studio solutions) from high-level descriptions. A CMakeLists.txt file describes the project:

cmake_minimum_required(VERSION 3.10)
project(MyProject)

set(CMAKE_C_STANDARD 11)

add_executable(myprogram main.c utils.c fileio.c)

target_include_directories(myprogram PRIVATE include)

find_library(MATH_LIB m)
target_link_libraries(myprogram ${MATH_LIB})

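Ninja is a small build system focused on speed, often used with CMake. Its build files are rarely written by hand; they are usually generated by a tool such as CMake (for example with cmake -G Ninja), and Ninja then executes them in parallel with minimal overhead.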

2.8 Cross Compilation

Cross compilation builds executable code for a platform different from the one running the compiler. This is essential for embedded systems development, where the target device lacks the resources to run a compiler.

A cross-compilation toolchain includes a compiler that generates code for the target architecture (e.g., ARM), assembler, linker, and libraries for that target. When invoking the compiler, additional flags specify the target architecture, system root, and library paths:

arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb \
    -nostdlib -T linker_script.ld \
    main.c -o firmware.elf

Cross compilation introduces complexities around header files, library compatibility, and endianness. Build systems like CMake support cross compilation through toolchain files that define the target environment.


Chapter 3 – Data Types and Variables

3.1 Basic Data Types

C provides a set of fundamental data types that map directly to hardware capabilities, offering precise control over data representation and operations.

The integer types store whole numbers. The basic integer type is int, which is 32 bits on most modern desktop and server platforms, although the standard only guarantees at least 16 bits. char stores a single byte, short provides a smaller integer (usually 16 bits), and long provides a larger integer (at least 32 bits, often 64 bits on 64-bit Unix-like systems). C99 added long long, which is at least 64 bits.

Floating-point types approximate real numbers with decimal points. float provides single-precision (32-bit IEEE 754), double provides double-precision (64-bit), and long double offers extended precision (implementation-dependent, often 80 or 128 bits).

The void type represents the absence of a value, used primarily for functions that return nothing and for generic pointers.

The _Bool type (accessible as bool after including <stdbool.h>) stores true/false values, requiring only one bit of storage but typically occupying one byte.

3.2 Type Modifiers

Type modifiers adjust the range and interpretation of basic types. signed and unsigned apply to integer types, controlling whether the most significant bit represents the sign or an additional magnitude bit. short and long modify the size of integer types.

The complete hierarchy of integer types includes:

  • signed char, unsigned char
  • short, unsigned short
  • int, unsigned int
  • long, unsigned long
  • long long, unsigned long long

Each type has a minimum range guaranteed by the standard, but actual ranges are implementation-defined and available in <limits.h>.
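The actual limits on a given platform can be inspected directly. The function below is a small sketch (the name print_integer_ranges is an illustrative choice) that prints a few of the macros defined in <limits.h>.

```c
#include <limits.h>
#include <stdio.h>

/* Prints the ranges of several integer types on this implementation.
   The values differ between platforms; only the minimum ranges are
   guaranteed by the standard. */
void print_integer_ranges(FILE *out) {
    fprintf(out, "bits per char : %d\n", CHAR_BIT);
    fprintf(out, "short         : %d to %d\n", SHRT_MIN, SHRT_MAX);
    fprintf(out, "int           : %d to %d\n", INT_MIN, INT_MAX);
    fprintf(out, "unsigned int  : 0 to %u\n", UINT_MAX);
    fprintf(out, "long          : %ld to %ld\n", LONG_MIN, LONG_MAX);
    fprintf(out, "long long     : %lld to %lld\n", LLONG_MIN, LLONG_MAX);
}
```

Calling print_integer_ranges(stdout) on two different platforms, for example 64-bit Linux and 64-bit Windows, is a quick way to see that long in particular does not have a fixed size.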

3.3 Signed vs Unsigned

The choice between signed and unsigned types has profound implications for program behavior, particularly in comparisons and arithmetic.

Signed types use the most significant bit as a sign indicator, representing both positive and negative numbers. The range for signed 32-bit integers is typically -2,147,483,648 to 2,147,483,647.

Unsigned types use all bits for magnitude, representing only non-negative values but with twice the maximum positive value of signed types. A 32-bit unsigned integer ranges from 0 to 4,294,967,295.

The critical difference emerges in overflow behavior. Signed integer overflow is undefined behavior in C, while unsigned overflow is well-defined to wrap around modulo 2^n. This has security implications - compilers may optimize based on the assumption that signed overflow never occurs, potentially eliminating checks that programmers thought were present.
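The difference can be sketched as follows. The names wrap_increment and checked_add are hypothetical; the important point is that the signed check happens before the addition, because testing the result afterwards would already rely on undefined behavior.

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps modulo 2^n, so UINT_MAX + 1 is 0. */
unsigned int wrap_increment(unsigned int value) {
    return value + 1u;
}

/* Adds a and b only if the result fits in an int.
   Returns false, leaving *result untouched, if it would overflow. */
bool checked_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *result = a + b;
    return true;
}
```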

3.4 Integer Representation (Two's Complement)

Almost all modern systems represent signed integers using two's complement. In this system, the most significant bit has a negative weight. For an n-bit number, the value is:

value = -2^(n-1) * b_(n-1) + sum(2^i * b_i) for i=0 to n-2

This representation has several elegant properties: addition and subtraction work identically for signed and unsigned numbers at the bit level, there's a single representation for zero, and the range is symmetric except for one extra negative value.
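To make the formula concrete, the sketch below applies it to an 8-bit pattern (n = 8). The function name is an illustrative choice.

```c
#include <stdint.h>

/* Computes the two's complement value of an 8-bit pattern using the
   formula above: bits 0..6 carry positive weights 2^i, and bit 7
   carries the negative weight -2^7 = -128. */
int twos_complement_value(uint8_t bits) {
    int value = 0;
    for (int i = 0; i <= 6; i++) {
        if (bits & (1u << i)) {
            value += 1 << i;
        }
    }
    if (bits & 0x80u) {
        value -= 128;
    }
    return value;
}
```

For example, the pattern 0xFF has all low bits set (127) plus the sign bit (-128), giving -1, and 0x80 alone gives -128, the one extra negative value.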

3.5 Floating Point Representation (IEEE 754)

The IEEE 754 standard specifies floating-point representation that has been almost universally adopted. A floating-point number consists of three fields: sign (1 bit), exponent (8 bits for float, 11 for double), and significand (23 bits for float, 52 for double).

The value is computed as:

value = (-1)^sign * (1 + significand) * 2^(exponent - bias)

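The bias is 127 for float and 1023 for double. The formula above applies to normalized numbers, with the significand field interpreted as a binary fraction in the range [0, 1). Special bit patterns represent positive infinity, negative infinity, and NaN (Not a Number). Subnormal numbers, which have no implicit leading 1, fill the underflow gap between zero and the smallest normalized number.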

Understanding floating-point representation is crucial for numerical programming. Floating-point arithmetic is inexact due to finite precision, and operations can produce unexpected results due to rounding.
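The three fields can be extracted from a float with a few shifts and masks. This sketch assumes that float is the 32-bit IEEE 754 binary32 format, which the static assertion only partly checks; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

struct float_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, still biased by 127 */
    uint32_t fraction;  /* 23 bits of significand */
};

/* Splits a float into its sign, biased exponent, and fraction.
   memcpy copies the object representation without violating
   aliasing rules. */
struct float_fields decode_float(float x) {
    uint32_t bits;
    struct float_fields f;

    memcpy(&bits, &x, sizeof bits);
    f.sign = (unsigned)(bits >> 31);
    f.exponent = (unsigned)((bits >> 23) & 0xFFu);
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For 0.15625f, which equals 1.25 * 2^-3, the biased exponent is 124 and the fraction field holds 0.25 scaled by 2^23, that is 0x200000.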

3.6 Type Qualifiers

Type qualifiers modify how objects can be accessed and used, providing information to both programmers and compilers.

const declares that an object's value should not be modified after initialization. This enables compiler optimizations and documents intent:

const double PI = 3.14159;
const int DAYS_IN_WEEK = 7;

volatile tells the compiler that an object may change in ways not predictable by the compiler, such as memory-mapped I/O registers or variables modified by signal handlers. The compiler must generate code to read the object's value each time it's used, preventing optimizations that would cache the value in registers.

restrict (C99) is a promise to the compiler that, for the lifetime of the pointer, only the pointer itself or a value derived from it will be used to access the object it points to. This enables optimizations that would otherwise be impossible due to potential aliasing.
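A brief sketch of const and restrict working together follows. The function scale_into is a hypothetical example: the caller promises that out and in do not overlap, and the const qualifier documents that the input is only read.

```c
#include <stddef.h>

/* Writes in[i] * factor to out[i] for i in [0, n).
   restrict: the caller guarantees the two arrays do not overlap,
   so the compiler is free to reorder or vectorize the loop.
   const: the function does not modify the input array. */
void scale_into(size_t n, float *restrict out,
                const float *restrict in, float factor) {
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * factor;
    }
}
```

Passing overlapping arrays to such a function breaks the promise and results in undefined behavior, so restrict belongs only on interfaces where the non-overlap requirement is documented.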

3.7 Storage Classes

Storage class specifiers control where variables are stored, their lifetime, and their visibility across translation units.

auto is the default for local variables, specifying automatic storage duration (variables exist only within their enclosing block).

register suggests that the variable be stored in a processor register for fast access. Modern compilers generally ignore this hint in favor of their own optimization decisions.

static has different meanings depending on context. For local variables, it extends lifetime to the entire program run while limiting scope to the function. For global variables and functions, it limits visibility to the current translation unit.

extern declares a variable or function defined in another translation unit, enabling access across files.

_Thread_local (C11) gives each thread its own copy of a variable.
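The effect of static on a local variable is easiest to see next to an ordinary automatic variable. Both function names below are illustrative.

```c
/* last_id has static storage duration: it is initialized once and
   keeps its value between calls, but is visible only in next_id. */
int next_id(void) {
    static int last_id = 0;
    return ++last_id;
}

/* count has automatic storage duration: it is created and
   initialized afresh on every call. */
int automatic_counter(void) {
    int count = 0;
    return ++count;
}
```

Successive calls to next_id return 1, 2, 3, and so on, while automatic_counter returns 1 every time.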

3.8 Scope and Lifetime

Scope determines where in the source code an identifier can be accessed. Block scope applies to identifiers declared inside a block (delimited by {}), visible from declaration to the end of the block. File scope applies to identifiers declared outside any function, visible from declaration to the end of the file. Function scope applies only to labels (used with goto).

Lifetime determines when storage for an object is allocated and deallocated. Static storage duration (global and static variables) lasts for the entire program execution. Automatic storage duration (local non-static variables) lasts from entry to exit of the enclosing block. Allocated storage duration (from malloc) lasts from allocation to deallocation with free.
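The following sketch shows block scope and allocated storage duration side by side. The helper names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Block scope: the inner declaration hides the outer one until the
   closing brace, after which the outer variable is visible again. */
int shadow_demo(void) {
    int value = 1;
    {
        int value = 2;  /* a different object with the same name */
        (void)value;
    }
    return value;       /* refers to the outer object */
}

/* Allocated storage: the copy outlives this function call and
   exists until the caller passes it to free.
   Returns NULL if the allocation fails. */
char *duplicate_text(const char *text) {
    size_t size = strlen(text) + 1;  /* include the terminator */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}
```

The caller of duplicate_text owns the returned memory and is responsible for calling free on it exactly once.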

3.9 Type Casting

Type casting converts a value from one type to another. Implicit conversions occur automatically in many contexts, such as when assigning a smaller integer type to a larger one or when mixing types in expressions. Explicit casts are written with the target type in parentheses:

int i = 10;
double d = (double)i / 3;  /* Explicit cast avoids integer division */

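Conversions can change values significantly. Converting from floating-point to integer truncates toward zero, and is undefined behavior if the truncated value does not fit in the target type. Converting to an unsigned type reduces the value modulo 2^n: for a larger unsigned source this simply discards the high-order bits, and a negative value becomes a large positive one. Converting an out-of-range value to a signed integer type gives an implementation-defined result.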
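These conversion rules can be checked with a few one-line helpers, whose names are chosen here purely for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Floating-point to integer: the fractional part is discarded.
   Undefined if the truncated value does not fit in an int. */
int truncate_toward_zero(double x) {
    return (int)x;
}

/* Narrowing to an unsigned type keeps the value modulo 2^8,
   that is, only the low-order byte. */
uint8_t low_byte(uint32_t x) {
    return (uint8_t)x;
}

/* Signed to unsigned: the value is reduced modulo UINT_MAX + 1,
   so -1 becomes UINT_MAX. */
unsigned int to_unsigned(int x) {
    return (unsigned int)x;
}
```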

3.10 Format Specifiers

Format specifiers in printf and scanf control the interpretation of arguments and must match the actual types being passed:

  • %d, %i: int (signed decimal)
  • %u: unsigned int
  • %x, %X: unsigned int in hexadecimal
  • %o: unsigned int in octal
  • %f: double (float is promoted to double in variadic functions)
  • %lf: double (required for scanf; in printf it is accepted since C99 and behaves like %f)
  • %c: char (converted to int in printf)
  • %s: null-terminated string
  • %p: pointer (void*)
  • %zu: size_t (C99 and later)

Mismatching format specifiers and argument types leads to undefined behavior, often producing garbage output or program crashes.
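As a sketch of matching specifiers to argument types, the hypothetical function below formats one value of each kind into a caller-supplied buffer using snprintf.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats sample values into buf, pairing each specifier with an
   argument of the matching type. Returns the number of characters
   that the full output requires, as snprintf does. */
int format_sample(char *buf, size_t size) {
    int count = -7;
    unsigned int mask = 255u;
    double ratio = 2.5;
    size_t length = 3;

    return snprintf(buf, size, "%d %u %x %o %.2f %c %s %zu",
                    count, mask, mask, mask, ratio, 'A', "ok", length);
}
```

With a sufficiently large buffer, the result is the string "-7 255 ff 377 2.50 A ok 3". Compiling with warnings enabled (for example -Wall on GCC or Clang) usually reports mismatches between specifiers and arguments.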


Chapter 4 – Operators and Expressions

4.1 Arithmetic Operators

C provides the usual arithmetic operators: addition (+), subtraction (-), multiplication (*), division (/), and modulus (%). The behavior of these operators depends on operand types and can have subtle aspects.

Division of integers truncates toward zero in C99 and later (implementation-defined before C99). This means 5/2 yields 2, while -5/2 yields -2. Floating-point division, performed when at least one operand has floating-point type, yields a floating-point result.

The modulus operator % requires integer operands and yields the remainder after division. The result satisfies the property that (a/b)*b + a%b == a. The sign of the result is implementation-defined before C99 but defined to have the same sign as a in C99 and later.
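Because % follows the sign of the left operand in C99 and later, code that needs a non-negative remainder (for wrapping an index, say) has to adjust the result itself. A minimal sketch, using a hypothetical helper name and assuming b is positive:

```c
/* Remainder in the range 0..b-1 for b > 0.
   Plain a % b would be negative whenever a is negative and not a multiple of b. */
int mod_nonneg(int a, int b) {
    int r = a % b;
    return (r < 0) ? r + b : r;
}
```

Here -5 % 3 is -2, while mod_nonneg(-5, 3) is 1.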

4.2 Relational and Logical Operators

Relational operators compare values: > (greater than), >= (greater than or equal), < (less than), <= (less than or equal), == (equal), != (not equal). Each yields an int result: 1 for true, 0 for false.

Logical operators combine relational expressions: && (logical AND), || (logical OR), ! (logical NOT). These operators short-circuit: for &&, if the left operand is false, the right operand is not evaluated; for ||, if the left operand is true, the right operand is not evaluated.

This short-circuit behavior is often used for safe pointer dereferencing:

if (ptr != NULL && ptr->value > 0) {
    /* Safe to dereference ptr */
}

4.3 Bitwise Operators

Bitwise operators manipulate individual bits within integer types, essential for systems programming and hardware interaction.

& performs bitwise AND, | performs bitwise OR, ^ performs bitwise XOR, ~ performs bitwise complement (one's complement). << shifts bits left, >> shifts bits right.

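Shifting by a negative amount, or by an amount greater than or equal to the width of the promoted left operand, is undefined behavior. Left-shifting a negative signed value is also undefined, as is a left shift of a signed value whose result cannot be represented in the result type. Right-shifting a negative signed value is implementation-defined: it may be arithmetic (sign-extending) or logical (zero-filling). For these reasons, bit manipulation is best done on unsigned types.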

Bitwise operations are used extensively for flag manipulation, hardware register access, and efficient arithmetic:

/* Set bit 3 */
flags |= (1 << 3);

/* Clear bit 3 */
flags &= ~(1 << 3);

/* Toggle bit 3 */
flags ^= (1 << 3);

/* Test bit 3 */
if (flags & (1 << 3)) {
    /* Bit is set */
}
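The same four operations can be wrapped in small functions that work on a fixed-width unsigned type, which sidesteps the signed-shift pitfalls. This is only a sketch; the helper names are invented here, and pos is assumed to be in the range 0 to 31:

```c
#include <stdint.h>

/* Mask with only bit `pos` set; pos must be in the range 0..31. */
static uint32_t bit_mask(unsigned pos) {
    return (uint32_t)1 << pos;
}

uint32_t set_bit(uint32_t flags, unsigned pos)    { return flags | bit_mask(pos); }
uint32_t clear_bit(uint32_t flags, unsigned pos)  { return flags & ~bit_mask(pos); }
uint32_t toggle_bit(uint32_t flags, unsigned pos) { return flags ^ bit_mask(pos); }
int      test_bit(uint32_t flags, unsigned pos)   { return (flags & bit_mask(pos)) != 0; }
```

Using an unsigned 1 as the shifted value means that even bit 31 can be set without overflowing a signed int.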

4.4 Assignment Operators

Simple assignment (=) stores the value of the right operand into the left operand. Compound assignment operators combine assignment with another operation: +=, -=, *=, /=, %=, <<=, >>=, &=, |=, ^=.

The expression x += 5 is equivalent to x = x + 5, but x is evaluated only once, which matters when the left operand has side effects (e.g., *p++ += 5).

4.5 Increment & Decrement

The increment (++) and decrement (--) operators add or subtract 1 from their operand. Prefix form (++x) increments before the value is used in the containing expression; postfix form (x++) uses the current value then increments.

These operators are often used in loops but require care when combined with other operations in the same expression:

int x = 5;
int y = x++;  /* y = 5, x = 6 */
int z = ++x;  /* x = 7, z = 7 */

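Modifying the same variable more than once in a single expression without an intervening sequence point, or modifying it and separately reading it in the same expression, is undefined behavior. Expressions such as i = i++ or a[i] = i++ fall into this category, because the side effect of the increment is unsequenced relative to the other accesses to i.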

4.6 Operator Precedence

Operator precedence determines how expressions are grouped when different operators appear together. For example, multiplication has higher precedence than addition, so a + b * c is evaluated as a + (b * c).

When precedence is unclear, parentheses should be used for readability. Memorizing the entire precedence table is less important than understanding that some operators have surprising precedence (e.g., shift operators have lower precedence than arithmetic operators, so i << 2 + 1 means i << (2 + 1)).

Associativity determines grouping when operators have the same precedence. Most binary operators associate left-to-right, but assignment operators associate right-to-left, making a = b = c equivalent to a = (b = c).

4.7 Type Conversion Rules

When operands of different types appear in an expression, C applies implicit conversions according to the usual arithmetic conversions, designed to preserve precision while producing a common type.

First, integer promotions are applied: operands whose type has lower rank than int (char, short, and their unsigned variants) are converted to int if int can represent all values of the original type, and to unsigned int otherwise. Then, if either operand has floating-point type, the other operand is converted to that type, with long double taking priority over double and double over float.

If both operands are integers after promotion, the integer conversion rank decides the common type. When both have the same signedness, the operand with the lower rank is converted to the type of the operand with the higher rank. When one is signed and the other unsigned, there are three cases: if the unsigned type has rank greater than or equal to that of the signed type, the signed operand is converted to the unsigned type; otherwise, if the signed type can represent all values of the unsigned type, the unsigned operand is converted to the signed type; otherwise, both operands are converted to the unsigned type corresponding to the signed operand's type.
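A classic consequence of these rules is that comparing a negative int with an unsigned int gives a surprising answer. The sketch below (function names invented for illustration) shows the problem and one possible workaround, which assumes long long can represent every unsigned int value, as it can on mainstream platforms:

```c
/* -1 is converted to unsigned int (becoming UINT_MAX) before the comparison,
   so the result is 0 even though -1 is mathematically less than 1. */
int minus_one_less_than_one_unsigned(void) {
    int s = -1;
    unsigned int u = 1u;
    return s < u;
}

/* Widening both operands to a signed type that can hold all their values
   restores the mathematical meaning; the result is 1. */
int minus_one_less_than_one_widened(void) {
    int s = -1;
    unsigned int u = 1u;
    return (long long)s < (long long)u;
}
```

Many compilers can warn about the first form when sign-comparison warnings are enabled.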

4.8 Undefined Behavior

Undefined behavior is perhaps the most important concept in C for understanding program correctness and security. The C standard specifies that certain operations result in undefined behavior, meaning the program can do anything: produce the expected result, crash, corrupt data, or even appear to work correctly until a critical moment.

Common sources of undefined behavior include:

  • Array index out of bounds
  • Dereferencing null or invalid pointers
  • Signed integer overflow
  • Using uninitialized variables
  • Violating type rules (e.g., accessing an object through an incompatible pointer type)
  • Modifying a string literal
  • Data races in multithreaded programs
  • Shifting by negative amounts or by amounts >= width of type

Undefined behavior is particularly insidious because compilers may assume it never happens and perform optimizations that break code attempting to check for such conditions:

int check_overflow(int x) {
    if (x + 100 < x) {  /* Checking for overflow? */
        /* Handle overflow */
    }
    return x + 100;
}

Since signed overflow is undefined, the compiler may optimize away the comparison, assuming it's always false. Understanding undefined behavior is essential for writing correct and secure C code.
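One way to detect overflow without triggering it is to test the operands against INT_MAX and INT_MIN before adding. The following is a minimal sketch with a hypothetical function name, not the only possible approach (some compilers also offer overflow-checking builtins):

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *out when the sum fits in int;
   returns false (leaving *out untouched) when it would overflow. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;   /* cannot overflow: the range was checked above */
    return true;
}
```

The comparisons themselves never overflow, because INT_MAX - b is evaluated only for positive b and INT_MIN - b only for negative b.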


Chapter 5 – Control Flow

5.1 if / else

The if statement allows conditional execution of code blocks. The basic form tests an expression; if it evaluates to non-zero (true), the following statement or block executes:

if (temperature > 100) {
    printf("Water boils at this temperature\n");
}

The else clause provides an alternative path when the condition is false:

if (grade >= 60) {
    printf("Passing\n");
} else {
    printf("Failing\n");
}

Multiple conditions can be chained with else if:

if (score >= 90) {
    letter = 'A';
} else if (score >= 80) {
    letter = 'B';
} else if (score >= 70) {
    letter = 'C';
} else if (score >= 60) {
    letter = 'D';
} else {
    letter = 'F';
}

5.2 switch

The switch statement provides multi-way branching based on an integer expression. It's often more efficient and readable than long if-else if chains:

switch (command) {
    case 'a':
        do_add();
        break;
    case 'd':
        do_delete();
        break;
    case 'q':
        return 0;
    default:
        printf("Unknown command\n");
        break;
}

The break statement is crucial; without it, execution "falls through" to subsequent cases. While fall-through can be useful in some situations (e.g., multiple cases handling the same code), it should be clearly commented to indicate intentional design.

Case labels must be integer constant expressions, and no two case labels in the same switch can have the same value. The default case handles any value not covered by explicit cases.

5.3 Loops (for, while, do-while)

C provides three loop constructs, each suited to different scenarios.

The while loop tests the condition before executing the body, making it ideal for situations where the loop may need to execute zero times:

while (bytes_remaining > 0) {
    process_next_block();
    bytes_remaining -= BLOCK_SIZE;
}

The do-while loop tests the condition after executing the body, guaranteeing at least one execution:

do {
    response = get_user_input();
    process_response(response);
} while (response != QUIT);

The for loop collects initialization, condition, and iteration expressions in one place, most commonly used for counting loops:

for (int i = 0; i < array_size; i++) {
    array[i] = i * i;
}

C99 allows declaring the loop variable within the for statement, limiting its scope to the loop.

5.4 break and continue

Within loops, break terminates the loop immediately, transferring control to the statement after the loop. continue skips the remainder of the current iteration and proceeds to the next iteration.

These statements are useful for handling exceptional conditions:

while (fgets(line, sizeof(line), file)) {
    line[strcspn(line, "\n")] = '\0';  /* fgets keeps the newline; strip it so "END" can match */
    if (line[0] == '#') {
        continue;  /* Skip comment lines */
    }
    if (strcmp(line, "END") == 0) {
        break;  /* Stop processing */
    }
    process_line(line);
}

5.5 goto and Labels

The goto statement transfers control unconditionally to a labeled statement within the same function. While often discouraged in high-level programming, goto has legitimate uses in C, particularly for error handling and cleanup in functions with multiple resources:

int process_data(void) {
    FILE *in = fopen("input.txt", "r");
    if (!in) goto error;
    
    FILE *out = fopen("output.txt", "w");
    if (!out) goto close_in;
    
    /* Process data */
    
    fclose(out);
    fclose(in);
    return 0;

close_in:
    fclose(in);
error:
    return -1;
}

This pattern ensures proper cleanup without deeply nested conditionals.

5.6 Nested Control Structures

Control structures can be nested arbitrarily, though deep nesting often indicates a need for refactoring. Common patterns include nested loops for multidimensional array processing:

for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        matrix[i][j] = i * cols + j;
    }
}

Conditionals within loops provide fine-grained control:

for (int i = 0; i < size; i++) {
    if (array[i] == target) {
        found = i;
        break;
    }
}

5.7 Best Practices

Clear control flow contributes significantly to program readability and maintainability. Several practices help achieve this:

Keep loop bodies focused and concise. If a loop body becomes lengthy, consider moving it to a separate function. This improves readability and facilitates testing.

Prefer for loops for simple counting iterations and while loops when the number of iterations isn't known in advance. Use do-while only when the loop must execute at least once.

Avoid complex conditions by using well-named boolean variables or functions:

int is_valid_user = (user != NULL && user->active && !user->locked);
if (is_valid_user) {
    grant_access(user);
}

Document intentional fall-through in switch statements with comments:

switch (c) {
    case 'a':
    case 'A':  /* Intentional fall-through */
        handle_A();
        break;
}

PART II – Functions and Program Structure

Chapter 6 – Functions

6.1 Function Declaration & Definition

Functions in C must be declared before they are used. A function declaration (prototype) specifies the function's return type, name, and parameter types:

double calculate_average(int count, double values[]);

The function definition provides the implementation:

double calculate_average(int count, double values[]) {
    double sum = 0.0;
    for (int i = 0; i < count; i++) {
        sum += values[i];
    }
    return count > 0 ? sum / count : 0.0;
}

Functions that don't return a value use void return type. Functions that take no parameters should explicitly specify void in the parameter list: int get_random(void);.

6.2 Parameter Passing

C uses call-by-value parameter passing: the called function receives copies of the argument values, not the originals. Modifying parameters inside the function does not affect the caller's variables:

void try_to_modify(int x) {
    x = 10;  /* Only modifies local copy */
}

int main(void) {
    int y = 5;
    try_to_modify(y);
    printf("%d\n", y);  /* Still prints 5 */
}

To modify caller variables, functions must receive pointers:

void modify(int *x) {
    *x = 10;  /* Modifies original through pointer */
}

int main(void) {
    int y = 5;
    modify(&y);
    printf("%d\n", y);  /* Prints 10 */
}

6.3 Call by Value

Understanding call-by-value is crucial for correct C programming. When passing structures to functions, the entire structure is copied, which can be expensive for large structures. Passing a pointer to the structure avoids this copying and allows the function to modify the original:

struct large_struct {
    /* many fields */
};

void process_copy(struct large_struct s) { /* Copy made here */ }
void process_ptr(struct large_struct *s) { /* No copy, can modify */ }

6.4 Recursion

Recursive functions call themselves, directly or indirectly. While iteration is often more efficient in C (avoiding function call overhead), recursion provides elegant solutions for certain problems:

unsigned long factorial(unsigned int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

Each recursive call creates a new stack frame, so deep recursion can cause stack overflow. Problems with natural recursive structure (tree traversal, divide-and-conquer algorithms) benefit from recursion when the depth is bounded.

6.5 Inline Functions

The inline keyword suggests to the compiler that the function's code should be inserted directly at the call site, avoiding function call overhead. This is a hint, not a command; compilers may ignore it for large functions or when optimization is disabled:

inline int max(int a, int b) {
    return a > b ? a : b;
}

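Inline functions are often defined in header files, because the definition must be visible in every translation unit that calls them. In C99 and later, a plain inline definition (without static or extern) does not by itself provide an external definition, so a call that the compiler chooses not to inline can fail to link; declaring header-defined helpers as static inline avoids this.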

6.6 Function Pointers

Pointers to functions enable callbacks, dynamic dispatch, and flexible program design. A function pointer declaration specifies the return type and parameter types:

int add(int a, int b) { return a + b; }
int subtract(int a, int b) { return a - b; }

int (*operation)(int, int);  /* Pointer to function taking two ints, returning int */

operation = add;
int result = operation(5, 3);  /* Calls add */
operation = subtract;
result = operation(5, 3);  /* Calls subtract */

Function pointers are essential for implementing generic algorithms, such as the sorting routine qsort which takes a comparison function:

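int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);  /* Avoids the signed overflow that x - y can cause */
}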

qsort(array, size, sizeof(int), compare_ints);
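The same mechanism lets a caller choose the ordering at run time. In the sketch below, struct employee and all function names are hypothetical and exist only for illustration; only qsort and strcmp come from the standard library:

```c
#include <stdlib.h>
#include <string.h>

struct employee {          /* hypothetical record type for illustration */
    const char *name;
    int age;
};

static int by_age(const void *a, const void *b) {
    const struct employee *x = a;
    const struct employee *y = b;
    return (x->age > y->age) - (x->age < y->age);
}

static int by_name(const void *a, const void *b) {
    const struct employee *x = a;
    const struct employee *y = b;
    return strcmp(x->name, y->name);
}

/* The caller picks the ordering by passing a different comparison function. */
void sort_employees(struct employee *list, size_t count,
                    int (*compare)(const void *, const void *)) {
    qsort(list, count, sizeof list[0], compare);
}
```

Calling sort_employees(staff, n, by_age) and sort_employees(staff, n, by_name) reuses the same sorting code with two different orderings.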

6.7 Variadic Functions

Variadic functions accept a variable number of arguments. The most common example is printf. Implementing variadic functions requires the <stdarg.h> macros:

#include <stdarg.h>

double average(int count, ...) {
    va_list args;
    double sum = 0.0;
    
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        sum += va_arg(args, double);
    }
    va_end(args);
    
    return sum / count;
}

/* Called as: average(3, 1.0, 2.0, 3.0) */

Variadic functions have no type safety; the caller must ensure that the types match what the function expects. This is a common source of bugs, as seen with mismatched format specifiers in printf.
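To make the type-matching requirement concrete, here is a sketch of a variadic function that reads int arguments (the name sum_ints is invented for this example). Arguments narrower than int, such as char and short, are promoted to int by the default argument promotions, so they are safe to pass; a double or a long would not be:

```c
#include <stdarg.h>

/* Sums `count` int arguments. Passing anything that does not arrive as int
   (for example 1.0 or 1L) would be undefined behavior. */
long sum_ints(int count, ...) {
    va_list args;
    long total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

By the same reasoning, calling the average function shown earlier as average(3, 1, 2, 3) would be an error, because it reads each argument as a double.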


Chapter 7 – Scope, Storage & Linkage

7.1 Local vs Global

Variables declared inside functions have local (block) scope and automatic storage duration. They are created when the block is entered and destroyed when it exits:

void function(void) {
    int local_var = 10;  /* Local to function */
    for (int i = 0; i < 5; i++) {  /* i local to loop (C99) */
        int loop_var = i * 2;  /* Local to loop body */
    }
}

Global variables are declared outside any function, have file scope, and static storage duration (exist for entire program lifetime). While convenient, global variables introduce hidden dependencies and complicate program understanding:

int global_counter = 0;  /* Global variable */

void increment_counter(void) {
    global_counter++;
}

7.2 static Keyword

The static keyword has different meanings based on context:

For local variables inside functions, static extends their lifetime to the entire program while limiting scope to the function. The variable retains its value between function calls:

int next_id(void) {
    static int id = 0;  /* Initialized only once */
    return id++;
}

For global variables and functions, static limits their visibility to the current translation unit (file). This provides encapsulation and prevents name conflicts:

static int internal_counter;  /* Not visible outside this file */
static void helper_function(void) { /* Only callable within this file */ }

7.3 extern Keyword

The extern keyword declares a variable or function defined elsewhere, typically in another translation unit. It tells the compiler "this name refers to something defined elsewhere":

/* File1.c */
int global_value = 100;  /* Definition */

/* File2.c */
extern int global_value;  /* Declaration, not definition */
void function(void) {
    printf("%d\n", global_value);  /* Accesses File1's variable */
}

Functions have external linkage by default, so extern is optional for function declarations.

7.4 Internal vs External Linkage

Linkage determines whether identifiers in different scopes refer to the same object. External linkage means the identifier can be accessed from other translation units. Internal linkage restricts access to the current translation unit.

Global variables and functions have external linkage by default. Adding static gives them internal linkage:

/* External linkage */
int external_var;
void external_func(void);

/* Internal linkage */
static int internal_var;
static void internal_func(void);

7.5 Header Files Design

Proper header file design is crucial for modular C programs. Headers should contain declarations but not definitions (with some exceptions like inline functions and static constants). Include guards prevent multiple inclusion:

#ifndef UTILS_H
#define UTILS_H

/* Function declarations */
int calculate_sum(int a, int b);
double compute_average(double *values, int count);

/* Type definitions */
typedef struct {
    int x;
    int y;
} Point;

/* Constants */
#define MAX_SIZE 1024
extern const double PI;  /* Declaration - defined in .c file */

/* Inline function - safe in header */
static inline int max(int a, int b) {
    return a > b ? a : b;
}

#endif /* UTILS_H */

Each header should be self-contained, including any headers it depends on. This allows source files to include the header without worrying about ordering dependencies.


PART III – Memory and Pointers (Core of C)

Chapter 8 – Pointers

8.1 Introduction to Pointers

Pointers are variables that store memory addresses. They are fundamental to C, enabling dynamic memory allocation, efficient array processing, and direct hardware access. A pointer is declared with an asterisk (*) after the type:

int *int_ptr;        /* Pointer to integer */
char *char_ptr;      /* Pointer to character */
double *double_ptr;  /* Pointer to double */

The address-of operator (&) obtains a variable's address. The dereference operator (*) accesses the value at a pointer's address:

int x = 42;
int *ptr = &x;       /* ptr holds address of x */
printf("%d\n", *ptr); /* Prints 42 */
*ptr = 100;          /* Changes x to 100 */

8.2 Pointer Arithmetic

Pointer arithmetic adjusts addresses based on the size of the pointed-to type. Adding an integer n to a pointer advances it by n × sizeof(type) bytes:

int array[5] = {10, 20, 30, 40, 50};
int *ptr = array;     /* Points to array[0] */

ptr++;                /* Now points to array[1] */
printf("%d\n", *ptr); /* Prints 20 */

ptr += 2;             /* Now points to array[3] */
printf("%d\n", *ptr); /* Prints 40 */

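Subtracting two pointers yields the number of elements between them, as a value of type ptrdiff_t (from <stddef.h>). Subtraction and the relational operators (<, <=, >, >=) are only defined for pointers into the same array object, or one past its last element; applying them to unrelated pointers is undefined behavior.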
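Pointer subtraction is a natural way to turn a position found by walking an array back into an index. A minimal sketch, with a made-up function name, in which end points one past the last element:

```c
#include <stddef.h>

/* Returns the index of the first element equal to target, or -1 if absent.
   The index is obtained by subtracting two pointers into the same array. */
ptrdiff_t index_of(const int *begin, const int *end, int target) {
    for (const int *p = begin; p != end; p++) {
        if (*p == target) {
            return p - begin;
        }
    }
    return -1;
}
```

For an array int a[5] = {10, 20, 30, 40, 50}, the call index_of(a, a + 5, 40) returns 3.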

8.3 Pointers and Arrays

Arrays and pointers are closely related in C. An array name in most contexts converts to a pointer to its first element. This relationship enables array-style indexing with pointers:

int array[5] = {1, 2, 3, 4, 5};
int *ptr = array;

/* These are equivalent */
printf("%d\n", array[2]);
printf("%d\n", *(array + 2));
printf("%d\n", ptr[2]);
printf("%d\n", *(ptr + 2));

However, arrays and pointers are not identical. sizeof(array) gives the total array size, while sizeof(ptr) gives the pointer size. An array name cannot be reassigned to point elsewhere.
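The sizeof difference matters most at function boundaries, where an array argument arrives as a pointer. The sketch below uses an invented function name and a commonly seen ARRAY_LEN macro (not part of the standard library) to show both sides:

```c
#include <stddef.h>

/* Inside a function, an array argument has already become a pointer,
   so sizeof reports the pointer size, not the array size. */
size_t size_seen_by_callee(const int *values) {
    return sizeof values;
}

/* Element count of a true array; only valid where the array type is visible. */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])
```

This is why functions that take arrays usually take the element count as a separate parameter.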

8.4 Pointers to Pointers

Multiple levels of indirection allow handling of multidimensional data structures and modifying pointer arguments:

int x = 42;
int *ptr = &x;
int **pptr = &ptr;  /* Pointer to pointer to int */

printf("%d\n", **pptr);  /* Prints 42 */

Pointers to pointers are essential for functions that need to allocate memory and return the pointer through a parameter:

void allocate_array(int **arr_ptr, int size) {
    *arr_ptr = malloc(size * sizeof(int));
    /* Caller can now use *arr_ptr as allocated array */
}

int *array;
allocate_array(&array, 10);

8.5 Function Pointers

As covered in Chapter 6, function pointers store addresses of executable code. They enable callbacks, dynamic dispatch, and implementing virtual function tables:

typedef int (*operation)(int, int);

struct calculator {
    operation add;
    operation subtract;
    operation multiply;
};

int add(int a, int b) { return a + b; }
int subtract(int a, int b) { return a - b; }
int multiply(int a, int b) { return a * b; }

struct calculator calc = {add, subtract, multiply};
int result = calc.add(5, 3);  /* Calls add */

8.6 Void Pointers

void* is a generic pointer type that can hold any address without regard to the pointed-to type. It must be cast to a specific type before dereferencing:

void *generic_ptr;
int x = 42;
double y = 3.14;

generic_ptr = &x;
printf("%d\n", *(int*)generic_ptr);  /* Cast to int* before dereferencing */

generic_ptr = &y;
printf("%f\n", *(double*)generic_ptr);  /* Cast to double* */

Void pointers are used in generic functions like memcpy, malloc, and qsort that must work with any data type.
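A small generic routine shows the usual pattern: accept void* plus a size, then work on the bytes through unsigned char*. This is a sketch with a hypothetical function name, assuming the two objects do not partially overlap:

```c
#include <stddef.h>

/* Swaps two objects of `size` bytes each, one byte at a time.
   a and b should refer to distinct, non-overlapping objects. */
void swap_generic(void *a, void *b, size_t size) {
    unsigned char *pa = a;      /* void* converts implicitly in C */
    unsigned char *pb = b;
    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

The same function swaps two ints with swap_generic(&x, &y, sizeof x) or two doubles with swap_generic(&d1, &d2, sizeof d1); the caller is responsible for passing the correct size.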

8.7 Null Pointers

A null pointer points to no valid memory location. It's represented by the null pointer constant, typically NULL (defined in <stddef.h>) or 0. Dereferencing a null pointer causes undefined behavior, usually a program crash:

int *ptr = NULL;
if (ptr != NULL) {
    *ptr = 42;  /* Safe only if ptr is not NULL */
}

Checking for null pointers before dereferencing is a fundamental safety practice.

8.8 Common Pointer Bugs

Pointer misuse is among the most common sources of bugs in C programs. Understanding these pitfalls helps prevent them:

Dangling pointers occur when memory is freed but pointers to it remain:

int *ptr = malloc(sizeof(int));
*ptr = 42;
free(ptr);
/* ptr now dangling - using it is undefined behavior */
*ptr = 100;  /* DANGER! */

Memory leaks happen when allocated memory is not freed:

void leak(void) {
    int *ptr = malloc(1000 * sizeof(int));
    /* Forgot to free ptr - memory leak */
}

Buffer overflows occur when writing beyond allocated boundaries:

int array[5];
for (int i = 0; i <= 5; i++) {  /* Off-by-one - writes past end */
    array[i] = i;
}

Uninitialized pointers contain indeterminate values:

int *ptr;  /* Uninitialized - contains garbage */
*ptr = 42;  /* Undefined behavior - writing to random address */

Type confusion happens when an object is accessed through a pointer to an incompatible type:

float f = 3.14f;
int *ptr = (int*)&f;  /* The cast compiles, but... */
printf("%d\n", *ptr);  /* ...reading a float through an int* breaks the aliasing rules: undefined behavior */
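When the goal really is to inspect the bits of a float, copying the bytes with memcpy is a well-defined alternative to casting pointers. The sketch below uses an invented function name and assumes that float and uint32_t have the same size, which holds on mainstream platforms but is not guaranteed by the standard:

```c
#include <stdint.h>
#include <string.h>

/* Returns the object representation of a float as a 32-bit integer.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f) {
    uint32_t bits = 0;
    memcpy(&bits, &f, sizeof bits);   /* copying bytes does not violate aliasing rules */
    return bits;
}
```

On a platform that uses IEEE 754 single precision, float_bits(1.0f) is 0x3F800000.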

Chapter 9 – Dynamic Memory Management

9.1 Stack vs Heap

C programs use two primary memory regions for variable storage: the stack and the heap.

The stack stores local variables and function call information. Stack allocation is automatic: variables are pushed when functions are called and popped when they return. Stack allocation is fast but limited in size and, apart from variable-length arrays, requires allocation sizes to be known at compile time.

The heap provides dynamic memory allocation at runtime. Programs request memory of specific sizes and must explicitly free it when done. Heap allocation is more flexible but slower and requires careful management to avoid leaks and other errors.

9.2 malloc

malloc (memory allocation) allocates a block of uninitialized memory of the specified size:

int *arr = malloc(10 * sizeof(int));
if (arr == NULL) {
    /* Handle allocation failure */
    fprintf(stderr, "Memory allocation failed\n");
    exit(1);
}

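malloc returns a void* to the allocated memory. In C this converts implicitly to any object pointer type, so no cast is needed (a cast is only required when the same code must also compile as C++). It returns NULL if allocation fails, so always check the return value before using the allocated memory.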
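When the element count comes from outside the program, the multiplication inside the malloc argument can itself wrap around and request far too little memory. A minimal sketch of a guarded allocation, using a hypothetical helper name:

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates an uninitialized array of `count` ints.
   Returns NULL if count * sizeof(int) would overflow or if malloc fails. */
int *alloc_int_array(size_t count) {
    if (count > SIZE_MAX / sizeof(int)) {
        return NULL;                 /* the multiplication would wrap around */
    }
    return malloc(count * sizeof(int));
}
```

The caller still checks for NULL and eventually passes the pointer to free, exactly as with a direct malloc call.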

9.3 calloc

calloc (contiguous allocation) allocates memory for an array of elements, initializing all bytes to zero:

int *arr = calloc(10, sizeof(int));
if (arr == NULL) {
    /* Handle error */
}

calloc takes two arguments: number of elements and size of each element. It's safer than malloc for certain uses because it zeroes memory, but has a small performance cost.

9.4 realloc

realloc resizes previously allocated memory, preserving existing content up to the minimum of old and new sizes:

int *arr = malloc(10 * sizeof(int));
/* ... use arr ... */
int *new_arr = realloc(arr, 20 * sizeof(int));
if (new_arr == NULL) {
    /* realloc failed - original arr is still valid */
    /* Handle error appropriately */
} else {
    arr = new_arr;
}

realloc may move the allocation to a new address, copying the old data. Always assign the result to a temporary variable first, as realloc returns NULL on failure while leaving the original allocation intact.
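The temporary-variable rule is easiest to see in a growable array. The sketch below is one possible design, not a standard facility: IntVec and intvec_push are invented names, the doubling growth policy is an arbitrary choice, and overflow checks on the capacity are omitted for brevity:

```c
#include <stdlib.h>

typedef struct {
    int *data;
    size_t len;
    size_t cap;
} IntVec;   /* hypothetical growable array; start from {NULL, 0, 0} */

/* Appends value, doubling capacity when full. Returns 0 on success, -1 on
   failure; on failure the vector is unchanged and still owns its old buffer. */
int intvec_push(IntVec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL) {
            return -1;
        }
        v->data = tmp;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

Because realloc(NULL, n) behaves like malloc(n), the first push on an empty vector needs no special case. The owner releases the buffer with free(v.data) when done.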

9.5 free

free deallocates memory previously allocated by malloc, calloc, or realloc:

int *arr = malloc(100 * sizeof(int));
/* ... use arr ... */
free(arr);  /* Return memory to system */

After freeing, the pointer becomes dangling and should not be used. Setting the pointer to NULL after freeing helps catch accidental use:

free(arr);
arr = NULL;

9.6 Memory Leaks

Memory leaks occur when allocated memory is no longer reachable but not freed. Over time, leaks consume available memory, eventually causing program failure:

void leak_example(void) {
    int *ptr = malloc(1000 * sizeof(int));
    /* Forgot to free ptr - leak */
}

int main(void) {
    for (int i = 0; i < 1000; i++) {
        leak_example();  /* Each call leaks 1000 * sizeof(int) bytes: about 4 MB in total with a 4-byte int */
    }
}

Tools like Valgrind detect memory leaks by tracking allocations and frees during program execution.

9.7 Dangling Pointers

Dangling pointers point to memory that has been freed. Using them causes undefined behavior, often manifesting as corruption or crashes:

int *ptr = malloc(sizeof(int));
*ptr = 42;
free(ptr);

/* Later... */
*ptr = 100;  /* Undefined behavior - memory may be reused */

The freed memory might be reallocated elsewhere, causing seemingly unrelated variables to change unexpectedly.

9.8 Memory Alignment

Modern processors require or perform better when data is aligned on specific boundaries. The malloc family returns memory suitably aligned for any standard type. However, custom alignment needs (e.g., for SIMD instructions) require special handling:

/* C11 aligned_alloc */
void *ptr = aligned_alloc(64, 1024);  /* 64-byte alignment, 1024 bytes */

Alignment requirements vary by architecture and data type. Misaligned access can cause performance penalties or hardware exceptions.
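Whether a given address meets an alignment requirement can be checked by looking at its low bits. This sketch uses an invented helper name and relies on converting a pointer to uintptr_t, which is implementation-defined but yields the numeric address on common flat-memory platforms:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of `alignment`, which must be a power of two. */
int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A check such as is_aligned(ptr, 64) can serve as a debugging assertion before handing a buffer to code that requires 64-byte alignment.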

9.9 Custom Allocators

Advanced C programs sometimes implement custom memory allocators for improved performance, reduced fragmentation, or specialized allocation patterns:

typedef struct {
    void *memory;
    size_t size;
    size_t used;
} PoolAllocator;

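PoolAllocator* pool_create(size_t size) {
    PoolAllocator *pool = malloc(sizeof(PoolAllocator));
    if (pool == NULL) {
        return NULL;
    }
    pool->memory = malloc(size);
    if (pool->memory == NULL) {
        free(pool);
        return NULL;
    }
    pool->size = size;
    pool->used = 0;
    return pool;
}

void* pool_alloc(PoolAllocator *pool, size_t size) {
    if (size > pool->size - pool->used) {  /* Written this way to avoid overflow in used + size */
        return NULL;  /* Out of memory */
    }
    void *ptr = (char*)pool->memory + pool->used;  /* Note: no alignment padding is applied */
    pool->used += size;
    return ptr;
}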

void pool_destroy(PoolAllocator *pool) {
    free(pool->memory);
    free(pool);
}

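Pool allocators of this kind (often called arena or bump allocators) never free individual blocks; everything is released at once by pool_destroy. They are common in real-time and embedded systems where allocation speed and predictability are critical.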
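The pool_alloc function above hands out addresses at whatever byte offset comes next, so a request following an odd-sized allocation may be misaligned for its type. A common remedy is to round the offset up before each allocation. The helper below is a sketch with an invented name, assuming the alignment is a power of two:

```c
#include <stddef.h>

/* Rounds n up to the next multiple of alignment (a power of two).
   Does not guard against overflow when n is within `alignment` of SIZE_MAX. */
size_t align_up(size_t n, size_t alignment) {
    return (n + alignment - 1) & ~(alignment - 1);
}
```

A pool could apply it as used = align_up(used, 16) before computing the returned pointer, trading a few padding bytes for correctly aligned blocks (given that the pool's base address is itself suitably aligned, as memory from malloc is).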


Chapter 10 – Memory Layout & Internals

10.1 Program Memory Layout

A running C program's memory is organized into several segments:

Text segment contains executable instructions. It's typically read-only to prevent accidental modification.

Data segment holds initialized global and static variables. This includes both read-only data (like string literals) and read-write data.

BSS segment (Block Started by Symbol) contains uninitialized global and static variables. The program loader initializes this segment to zero before program start.

Heap grows upward (toward higher addresses) as memory is dynamically allocated.

Stack grows downward (toward lower addresses) as functions are called and local variables allocated.

High addresses
+------------------+
|      Stack       |  (grows downward)
|        ↓         |
|                  |
|        ↑         |
|      Heap        |  (grows upward)
+------------------+
|      BSS         |  (uninitialized data)
+------------------+
|      Data        |  (initialized data)
+------------------+
|      Text        |  (program code)
+------------------+
Low addresses

10.2 Stack Frames

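Each function call creates a stack frame containing local variables, function arguments, return address, and saved registers. The frame pointer (ebp on 32-bit x86, rbp on x86-64) marks the frame boundary, while the stack pointer (esp or rsp) points to the top of the stack. The details depend on the calling convention and optimization level; on x86-64, for example, the first few arguments are normally passed in registers rather than on the stack. Consider: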

int add(int a, int b) {
    int result = a + b;  /* result stored on stack */
    return result;
}

int main(void) {
    int x = add(5, 3);   /* Creates stack frame for add */
    return 0;
}

Stack layout (traditional 32-bit x86 convention, without optimization):

+-----------------+
|  argument b     |  (higher addresses)
+-----------------+
|  argument a     |
+-----------------+
|  return address |
+-----------------+
|  saved frame    |  <-- frame pointer
+-----------------+
|  local result   |
+-----------------+  <-- stack pointer

10.3 Heap Internals

Heap implementations manage free and allocated blocks, typically using data structures embedded in the heap memory itself. A simple heap might maintain a linked list of free blocks:

struct block {
    size_t size;
    int free;
    struct block *next;
};

When malloc is called, it traverses the free list looking for a block large enough. When free is called, it marks the block free and may coalesce adjacent free blocks to prevent fragmentation.
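The traversal described here is usually called a first-fit search. The following sketch repeats the block header from above so that it is self-contained; the function name is invented, and a real allocator would also split oversized blocks and handle alignment:

```c
#include <stddef.h>

struct block {              /* same shape as the header shown above */
    size_t size;
    int free;
    struct block *next;
};

/* First-fit search: returns the first free block of at least `size` bytes,
   or NULL if no block in the list is large enough. */
struct block *find_first_fit(struct block *head, size_t size) {
    for (struct block *b = head; b != NULL; b = b->next) {
        if (b->free && b->size >= size) {
            return b;
        }
    }
    return NULL;
}
```

First fit is simple and fast for short lists, but it tends to leave small fragments near the front of the list, which is one reason production allocators use more elaborate schemes.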

Modern allocators use sophisticated algorithms (segregated fits, buddy allocation) to balance speed, fragmentation, and scalability.

10.4 Buffer Overflows

Buffer overflows occur when writing beyond the bounds of allocated memory. On the stack, this can overwrite the return address, leading to control-flow hijacking:

void vulnerable(char *input) {
    char buffer[100];
    strcpy(buffer, input);  /* No bounds check! */
    /* If strlen(input) >= 100, the copy (including the terminating '\0') overflows buffer */
}

An attacker can craft input that overwrites the return address with a pointer to malicious code, causing the program to execute arbitrary code when the function returns.
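A bounded copy avoids the overflow by never writing more than the destination can hold. One way to sketch this uses snprintf, which always null-terminates when the size is non-zero and reports how much space the full string would have needed. The function name and the 0 / -1 return convention are choices made for this example:

```c
#include <stdio.h>
#include <stddef.h>

/* Copies src into dst, never writing more than dstsize bytes.
   Returns 0 if the whole string fit, -1 if it was truncated or on error. */
int copy_bounded(char *dst, size_t dstsize, const char *src) {
    int needed = snprintf(dst, dstsize, "%s", src);
    if (needed < 0 || (size_t)needed >= dstsize) {
        return -1;
    }
    return 0;
}
```

Reporting truncation to the caller matters: silently cutting input short can itself cause bugs, so the caller should decide whether to reject the input or proceed with the shortened copy.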

10.5 Address Space Layout

Modern operating systems use address space layout randomization (ASLR) to make exploitation harder by randomizing where code, stack, and heap are loaded. This prevents attackers from knowing exact addresses of useful functions or shellcode.

Position-independent executables (PIE) allow the entire program to be loaded at random addresses, further strengthening ASLR.

10.6 Debugging Memory Errors

Memory errors can be subtle and difficult to reproduce. Tools and techniques for debugging include:

Address sanitizer (ASan) instruments code to detect memory errors at runtime:

gcc -fsanitize=address -g program.c -o program

Valgrind runs programs in a simulated environment, detecting memory leaks, invalid accesses, and other errors:

valgrind --leak-check=full ./program

Electric Fence uses virtual memory protection to catch buffer overflows immediately by placing inaccessible pages after allocations.

10.7 Using Valgrind

Valgrind is an indispensable tool for C programmers. It detects various memory errors:

#include <stdlib.h>

int main(void) {
    int *arr = malloc(10 * sizeof(int));
    arr[10] = 42;  /* Invalid write - past end of array */
    free(arr);
    free(arr);     /* Double free */
    return 0;
}

Running with Valgrind:

==12345== Invalid write of size 4
==12345==    at 0x40053E: main (example.c:5)
==12345==  Address 0x51f9068 is 0 bytes after a block of size 40 alloc'd
==12345==    at 0x4C2FB0F: malloc (vg_replace_malloc.c:299)
==12345==    by 0x40052E: main (example.c:4)

==12345== Invalid free() / delete / delete[] / realloc()
==12345==    at 0x4C30D3B: free (vg_replace_malloc.c:530)
==12345==    by 0x40055E: main (example.c:7)
==12345==  Address 0x51f9040 is 0 bytes inside a block of size 40 free'd
==12345==    at 0x4C30D3B: free (vg_replace_malloc.c:530)
==12345==    by 0x400552: main (example.c:6)

PART IV – Arrays, Strings, and Structures

Chapter 11 – Arrays

11.1 Single-Dimensional Arrays

Arrays in C are contiguous sequences of elements of the same type. They're declared with a size, which must be a constant expression for regular arrays:

int scores[5];                 /* Array of 5 integers, uninitialized */
int primes[] = {2, 3, 5, 7, 11}; /* Size inferred from initializer */
char name[20] = "John";        /* Character array with string */

Array elements are accessed using the subscript operator []. The first element is at index 0:

scores[0] = 85;
scores[1] = 92;
scores[2] = 78;

for (int i = 0; i < 5; i++) {
    printf("scores[%d] = %d\n", i, scores[i]);
}

C performs no bounds checking on array accesses. Writing beyond the array bounds leads to undefined behavior, potentially corrupting other variables or causing security vulnerabilities.

11.2 Multi-Dimensional Arrays

Multi-dimensional arrays are arrays of arrays, stored in row-major order (all elements of first row, then second row, etc.):

int matrix[3][4];  /* 3 rows, 4 columns */

/* Initialize with nested loops */
for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 4; j++) {
        matrix[i][j] = i * 4 + j;
    }
}

/* Initializer lists can specify rows */
int identity[3][3] = {
    {1, 0, 0},
    {0, 1, 0},
    {0, 0, 1}
};
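
Row-major order can be checked directly by comparing where the compiler places an element with where the formula (i * columns + j) predicts it. The function names and the ROWS and COLS constants in this sketch are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* Byte offset of element (i, j) predicted by row-major order. */
size_t predicted_offset(size_t i, size_t j) {
    return (i * COLS + j) * sizeof(int);
}

/* Byte offset of element (i, j) as the compiler actually laid it out. */
size_t actual_offset(int m[ROWS][COLS], size_t i, size_t j) {
    return (size_t)((const char *)&m[i][j] - (const char *)m);
}
```

For every valid (i, j) the two functions should agree, which is what lets code treat a 2D array as one contiguous run of rows.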

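When passing multi-dimensional arrays to functions, all dimensions except the first must be specified. The version below uses a C99 variably modified parameter, so cols must be declared before matrix in the parameter list: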

void print_matrix(int rows, int cols, int matrix[][cols]) {
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            printf("%d ", matrix[i][j]);
        }
        printf("\n");
    }
}

11.3 Variable Length Arrays

C99 introduced variable-length arrays (VLAs), where the size can be determined at runtime:

void process_array(int n) {
    int vla[n];  /* Size determined by n at runtime */
    for (int i = 0; i < n; i++) {
        vla[i] = i * i;
    }
}

VLAs cannot be declared at file scope, cannot have static storage duration, and cannot be initialized in their declaration. They're allocated on the stack, so large VLAs can cause stack overflow. C11 made VLAs optional; implementations may not support them.
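
When the size may be large or the compiler may lack VLA support, the same computation can use the heap instead. The following is a minimal sketch; make_squares is a hypothetical name, and the element type is widened so that the squares do not overflow for large indices.

```c
#include <stdlib.h>

/* Heap-based counterpart of process_array. The caller owns the result
   and must free() it. Returns NULL if n is 0 or allocation fails. */
unsigned long long *make_squares(size_t n) {
    if (n == 0) return NULL;
    /* calloc checks the n * size multiplication for overflow. */
    unsigned long long *sq = calloc(n, sizeof *sq);
    if (!sq) return NULL;
    for (size_t i = 0; i < n; i++) {
        sq[i] = (unsigned long long)i * i;
    }
    return sq;
}
```

Unlike a VLA, an allocation failure here is reported through the NULL return value, so the caller can recover instead of crashing on stack exhaustion.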

11.4 Array vs Pointer Differences

Despite the close relationship, arrays and pointers are distinct:

int arr[5];
int *ptr = arr;

printf("%zu\n", sizeof(arr));  /* Prints 20 when int is 4 bytes (5 * sizeof(int)) */
printf("%zu\n", sizeof(ptr));  /* Prints 8 (pointer size on 64-bit) */

/* arr cannot be reassigned */
arr = ptr;  /* Compilation error */

/* ptr can be reassigned */
ptr = arr + 2;  /* OK */

Array names convert to pointers in most expressions, but they are not pointer variables. The conversion doesn't happen when the array is the operand of sizeof or unary &, or when it's a string literal used to initialize a character array.
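
A practical consequence is that a function receiving an array only sees a pointer, so the element count has to travel separately. The sketch below uses a hypothetical ARRAY_LEN macro, which is only valid where the operand is still a true array, and a hypothetical sum function.

```c
#include <stddef.h>

/* Element count of a true array. Applying this to a pointer compiles
   but gives a meaningless result, so use it only where the array is
   declared. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* `values` is a pointer here, whatever the caller passed, so the
   length is an explicit parameter. */
long sum(const int *values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

A typical call is sum(data, ARRAY_LEN(data)) made in the scope where data is declared as an array.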


Chapter 12 – Strings

12.1 Character Arrays

Strings in C are represented as arrays of characters terminated by a null character ('\0'). This null terminator distinguishes strings from ordinary character arrays:

char str1[] = "hello";        /* Array of 6 chars: 'h','e','l','l','o','\0' */
char str2[6] = "hello";       /* Explicit size; needs room for the terminator */
char str3[] = {'h','e','l','l','o','\0'}; /* Character-by-character */

String literals like "hello" have static storage duration and are typically placed in read-only memory. Modifying them leads to undefined behavior:

char *ptr = "hello";
ptr[0] = 'j';  /* Undefined behavior - modifying string literal */

char arr[] = "hello";  /* Copy of literal in modifiable array */
arr[0] = 'j';         /* OK - modifies copy */

12.2 String Library Functions

The <string.h> header provides functions for string manipulation:

Copying: strcpy(dest, src) copies from source to destination, including the null terminator. It doesn't check buffer sizes, so destination must be large enough:

char dest[10];
strcpy(dest, "hello");  /* OK */
strcpy(dest, "this is too long");  /* Buffer overflow! */

Concatenation: strcat(dest, src) appends src to dest, overwriting dest's null terminator:

char message[50] = "Hello, ";
strcat(message, "world!");  /* message now "Hello, world!" */

Comparison: strcmp(s1, s2) compares strings lexicographically, returning negative, zero, or positive based on ordering:

if (strcmp(user_input, "password") == 0) {
    /* Strings equal */
}

Length: strlen(s) returns the number of characters before the null terminator:

size_t len = strlen("hello");  /* len = 5 */

12.3 Safe String Handling

Traditional string functions are dangerous due to lack of bounds checking. Safer alternatives exist:

strncpy limits copy length, but doesn't guarantee null termination:

char dest[10];
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest) - 1] = '\0';  /* Ensure termination */

strncat limits characters appended:

strncat(dest, src, sizeof(dest) - strlen(dest) - 1);

snprintf provides formatted output with bounds checking:

char buffer[100];
snprintf(buffer, sizeof(buffer), "Value: %d", x);

C11 introduced optional bounds-checking functions (strcpy_s, strcat_s, etc.) that return error codes and guarantee null termination, but adoption varies.

12.4 Unicode in C

C's char type (typically 8 bits) cannot represent all Unicode characters directly. Wide characters (wchar_t) and multibyte encodings provide Unicode support:

#include <wchar.h>
#include <locale.h>

setlocale(LC_ALL, "en_US.UTF-8");
wchar_t wide_str[] = L"Hello 世界";
wprintf(L"%ls\n", wide_str);

UTF-8, the most common Unicode encoding, uses variable-length sequences of char. The <uchar.h> header (C11) provides char16_t and char32_t for UTF-16 and UTF-32.
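
Because UTF-8 is variable-length, strlen reports bytes rather than characters. A minimal sketch of counting code points follows; utf8_length is a hypothetical name, and the input is assumed to be valid UTF-8 (no validation is performed).

```c
#include <stddef.h>

/* Counts code points in a NUL-terminated string assumed to be valid
   UTF-8. Continuation bytes have the bit pattern 10xxxxxx, so every
   other byte starts a new code point. */
size_t utf8_length(const char *s) {
    size_t count = 0;
    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For the UTF-8 encoding of "Hello 世界", strlen sees 12 bytes while this function counts 8 code points, since each of the two CJK characters occupies three bytes.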

12.5 Buffer Overflow Prevention

Buffer overflows in string operations are a primary security vulnerability. Prevention strategies include:

Use bounds-checked functions: Prefer snprintf, or strncpy and strncat with explicit termination, over strcpy, strcat, and sprintf.

Validate input lengths: Check that input won't overflow buffers before copying:

if (strlen(user_input) >= sizeof(buffer)) {
    /* Input too long, handle error */
    return -1;
}
strcpy(buffer, user_input);

Use dynamic allocation for unknown sizes: When input size isn't known in advance, allocate dynamically:

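int len = snprintf(NULL, 0, "%s", user_input);  /* Length without terminator */
if (len >= 0) {                                 /* Negative means an error */
    size_t needed = (size_t)len + 1;
    char *buffer = malloc(needed);
    if (buffer) {
        snprintf(buffer, needed, "%s", user_input);
        /* ... use buffer, then free(buffer) ... */
    }
}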

Employ static analysis tools: Tools like Coverity, clang-analyzer, and cppcheck detect many buffer overflow vulnerabilities.


Chapter 13 – Structures and Unions

13.1 Defining Structures

Structures (struct) group related data items of possibly different types into a single unit:

struct Student {
    int id;
    char name[50];
    double gpa;
    int enrolled;
};

/* Declare variables */
struct Student s1;
struct Student s2 = {12345, "Alice Smith", 3.8, 1};

/* Access members with dot operator */
s1.id = 54321;
strcpy(s1.name, "Bob Jones");
s1.gpa = 3.5;
s1.enrolled = 1;

Structures can be passed to functions, returned from functions, and assigned. Structure assignment copies all members:

struct Student s3 = s1;  /* s3 gets copy of s1's members */

13.2 Nested Structures

Structures can contain other structures as members:

struct Date {
    int year;
    int month;
    int day;
};

struct Employee {
    int id;
    char name[100];
    struct Date hire_date;
    struct Date birth_date;
    double salary;
};

struct Employee e;
e.hire_date.year = 2020;
e.hire_date.month = 6;
e.hire_date.day = 15;

13.3 Structures and Pointers

Pointers to structures are common, especially for passing to functions to avoid copying. The arrow operator (->) accesses members through a pointer:

void print_student(const struct Student *s) {
    printf("ID: %d\n", s->id);        /* Equivalent to (*s).id */
    printf("Name: %s\n", s->name);
    printf("GPA: %.2f\n", s->gpa);
}

struct Student s = {123, "John", 3.5, 1};
print_student(&s);

13.4 Bit Fields

Bit fields allow packing multiple small values into a single integer, useful for hardware registers and memory-constrained applications:

struct Status {
    unsigned int error : 1;    /* 1 bit */
    unsigned int ready : 1;     /* 1 bit */
    unsigned int mode : 2;      /* 2 bits */
    unsigned int count : 4;     /* 4 bits */
};

struct Status stat = {0};
stat.error = 1;
stat.mode = 3;  /* Can store 0-3 */

Bit field layout (order, alignment, whether they can straddle byte boundaries) is implementation-defined, limiting portability.
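
Where the exact bit positions matter, for example in a file format or a hardware register, explicit masks and shifts give a layout that does not depend on the compiler. The sketch below assumes a hypothetical layout (bit 0 error, bit 1 ready, bits 2-3 mode, bits 4-7 count) and hypothetical function names.

```c
#include <stdint.h>

/* Packs the four fields into one byte using a fixed, documented layout:
   bit 0 = error, bit 1 = ready, bits 2-3 = mode, bits 4-7 = count.
   Values wider than their field are masked down. */
uint8_t status_pack(unsigned error, unsigned ready,
                    unsigned mode, unsigned count) {
    return (uint8_t)(((error & 0x1u) << 0) |
                     ((ready & 0x1u) << 1) |
                     ((mode  & 0x3u) << 2) |
                     ((count & 0xFu) << 4));
}

/* Field extractors: shift the field down, then mask off its width. */
unsigned status_error(uint8_t s) { return (s >> 0) & 0x1u; }
unsigned status_ready(uint8_t s) { return (s >> 1) & 0x1u; }
unsigned status_mode(uint8_t s)  { return (s >> 2) & 0x3u; }
unsigned status_count(uint8_t s) { return (s >> 4) & 0xFu; }
```

The trade-off is more verbose access code in exchange for a byte whose contents are the same on every conforming compiler.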

13.5 Memory Padding

Compilers may insert padding between structure members to satisfy alignment requirements:

struct Example {
    char c;      /* 1 byte */
    /* 3 bytes padding */
    int i;       /* 4 bytes */
    short s;     /* 2 bytes */
    /* 2 bytes padding */
};  /* Total size: 12 bytes, not 7 */

printf("%zu\n", sizeof(struct Example));  /* Likely prints 12 */

Reordering members to group same-sized types can reduce padding:

struct Packed {
    int i;       /* 4 bytes */
    short s;     /* 2 bytes */
    char c;      /* 1 byte */
    /* 1 byte padding */
};  /* Total: 8 bytes */

The _Alignas specifier (C11) and #pragma pack can control alignment, though the latter is non-standard.
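
The sizes quoted in this section are typical rather than guaranteed, so it can help to ask the compiler directly. The sketch below uses the standard offsetof macro; print_layout and padding_bytes are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>

struct Example {
    char c;
    int i;
    short s;
};

/* Prints where each member lands for the current compiler and target. */
void print_layout(void) {
    printf("c at %zu, i at %zu, s at %zu, total %zu\n",
           offsetof(struct Example, c),
           offsetof(struct Example, i),
           offsetof(struct Example, s),
           sizeof(struct Example));
}

/* Bytes of the structure that belong to no member. */
size_t padding_bytes(void) {
    return sizeof(struct Example)
         - (sizeof(char) + sizeof(int) + sizeof(short));
}
```

Running print_layout on the target platform confirms whether the padding assumptions in the comments above actually hold there.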

13.6 Unions

Unions store different data types in the same memory location, so a union is at least as large as its largest member (trailing padding may be added for alignment):

union Value {
    int i;
    double d;
    char str[20];
};

union Value v;
v.i = 42;
printf("%d\n", v.i);  /* OK */
printf("%f\n", v.d);  /* Not meaningful - v.i was stored last, so d's bytes are only partly set */

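Unions are useful for variant types, for saving memory when only one member is needed at a time, and for type punning. In C, reading a member other than the one last written reinterprets the stored bytes (in C++ it is undefined behavior), so the result depends on the object representation and may not be a valid value.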

A tagged union pattern uses a separate member to track the active type:

enum Type { INT, DOUBLE, STRING };

struct Variant {
    enum Type type;
    union {
        int i;
        double d;
        char str[100];
    } value;
};

void print_variant(const struct Variant *v) {
    switch (v->type) {
        case INT: printf("%d\n", v->value.i); break;
        case DOUBLE: printf("%f\n", v->value.d); break;
        case STRING: printf("%s\n", v->value.str); break;
    }
}
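
The tagged union stays consistent only if the tag and the value are always set together. One way to enforce that is to build variants exclusively through small constructor functions. The names variant_int, variant_double, and variant_string below are hypothetical; the type definitions repeat the ones above so the sketch is self-contained.

```c
#include <stdio.h>

enum Type { INT, DOUBLE, STRING };

struct Variant {
    enum Type type;
    union {
        int i;
        double d;
        char str[100];
    } value;
};

/* Each constructor sets the tag and the matching member in one place. */
struct Variant variant_int(int i) {
    struct Variant v = { .type = INT };
    v.value.i = i;
    return v;
}

struct Variant variant_double(double d) {
    struct Variant v = { .type = DOUBLE };
    v.value.d = d;
    return v;
}

struct Variant variant_string(const char *s) {
    struct Variant v = { .type = STRING };
    /* snprintf truncates safely if s is longer than the member. */
    snprintf(v.value.str, sizeof v.value.str, "%s", s);
    return v;
}
```

Code that only creates values through these functions cannot end up with a STRING tag on top of an int payload.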

13.7 Anonymous Structs

C11 allows anonymous structures and unions within structures, useful for nested access without intermediate names:

struct Employee {
    int id;
    struct {
        int year;
        int month;
        int day;
    };  /* Anonymous structure */
    char name[100];
};

struct Employee e;
e.year = 2020;  /* Access directly, not e.hire_date.year */
e.month = 6;
e.day = 15;

Chapter 14 – Enumerations and typedef

14.1 enum Basics

Enumerations (enum) define a set of named integer constants, improving code readability:

enum Weekday {
    MONDAY,      /* 0 */
    TUESDAY,     /* 1 */
    WEDNESDAY,   /* 2 */
    THURSDAY,    /* 3 */
    FRIDAY,      /* 4 */
    SATURDAY,    /* 5 */
    SUNDAY       /* 6 */
};

enum Weekday today = WEDNESDAY;
if (today == SATURDAY || today == SUNDAY) {
    printf("Weekend!\n");
}

Enum constants are int values. Without explicit assignment, they start at 0 and increase by 1. Explicit values can be assigned:

enum ErrorCode {
    SUCCESS = 0,
    FILE_NOT_FOUND = 1,
    PERMISSION_DENIED = 2,
    OUT_OF_MEMORY = 3
};
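
Enumerations pair naturally with a switch that maps each constant to a message. The sketch below assumes a hypothetical error_name function and hypothetical message strings; the enum repeats the definition above so the block compiles on its own.

```c
enum ErrorCode {
    SUCCESS = 0,
    FILE_NOT_FOUND = 1,
    PERMISSION_DENIED = 2,
    OUT_OF_MEMORY = 3
};

/* Maps an error code to a human-readable string. Values outside the
   enumeration fall through to a generic message. */
const char *error_name(enum ErrorCode code) {
    switch (code) {
        case SUCCESS:           return "success";
        case FILE_NOT_FOUND:    return "file not found";
        case PERMISSION_DENIED: return "permission denied";
        case OUT_OF_MEMORY:     return "out of memory";
    }
    return "unknown error";
}
```

Leaving out a default label lets many compilers warn when a new enumerator is added but not handled in the switch.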

14.2 typedef Usage

typedef creates aliases for existing types, simplifying complex declarations:

typedef unsigned int uint;
uint counter;  /* Same as unsigned int counter */

typedef struct {
    int x;
    int y;
} Point;

Point p1 = {10, 20};  /* No need for "struct" keyword */

Common typedef patterns:

/* Function pointer type */
typedef int (*Comparator)(const void*, const void*);

/* Array type */
typedef int IntArray[10];

/* Pointer to function returning pointer to pointer to int */
typedef int** (*ComplexFunc)(void);

14.3 Type Abstraction

Using typedef with incomplete types enables information hiding and abstract data types:

/* In header file */
typedef struct Stack Stack;  /* Incomplete type */

Stack* stack_create(void);
void stack_push(Stack *s, int value);
int stack_pop(Stack *s);
void stack_destroy(Stack *s);

/* In implementation file */
struct Stack {
    int *data;
    size_t capacity;
    size_t top;
};

Users include the header but cannot access structure members directly, enforcing encapsulation.


PART V – Preprocessor & Modular Programming

Chapter 15 – The C Preprocessor

15.1 Macros

The #define directive creates macros, which the preprocessor replaces with their definitions before compilation:

#define PI 3.14159
#define BUFFER_SIZE 1024
#define DEBUG 1

double area = PI * radius * radius;

Object-like macros (without parameters) are typically used for constants, though enum or const are often better alternatives.

Function-like macros take parameters but require careful parenthesization to avoid operator precedence issues:

#define SQUARE_UNSAFE(x) x * x    /* Dangerous - SQUARE_UNSAFE(1+2) expands to 1+2*1+2 = 5 */
#define SQUARE(x) ((x) * (x))     /* Safe - expands with parentheses */

#define MAX(a, b) ((a) > (b) ? (a) : (b))

15.2 Conditional Compilation

Conditional directives control which code gets compiled, essential for portability and debugging:

#ifdef DEBUG
    printf("x = %d, y = %d\n", x, y);
#endif

#if defined(__linux__)
    /* Linux-specific code */
#elif defined(_WIN32)
    /* Windows-specific code */
#else
    #error "Unsupported platform"
#endif

#ifndef HEADER_H
#define HEADER_H
/* Header content */
#endif

15.3 Include Guards

Include guards prevent multiple inclusion of header files, which would cause duplicate definition errors:

#ifndef MYHEADER_H
#define MYHEADER_H

/* Header content */

#endif /* MYHEADER_H */

Modern compilers also support #pragma once as an alternative, though it's non-standard.

15.4 Token Pasting

The ## operator concatenates tokens during macro expansion:

#define CONCAT(a, b) a##b

int CONCAT(my, var) = 42;  /* Expands to int myvar = 42; */

#define MAKE_FUNC(name, type) \
    type name##_func(type arg) { \
        return arg; \
    }

MAKE_FUNC(identity, int)  /* Creates int identity_func(int arg) */

15.5 Macro Pitfalls

Macros can be treacherous due to their text-based nature:

Multiple evaluation: Arguments may be evaluated multiple times, causing unexpected side effects:

#define MAX(a, b) ((a) > (b) ? (a) : (b))

int x = 5;
int y = MAX(x++, 3);  /* x++ evaluated twice: once in the test, once as the result */
/* x becomes 7, not 6, and y is 6, not 5 */

Missing parentheses: Without proper parentheses, precedence can bite:

#define CUBE(x) x * x * x
int result = CUBE(2 + 3);  /* 2 + 3 * 2 + 3 * 2 + 3 = 2 + 6 + 6 + 3 = 17, not 125 */

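Semicolon swallowing: A macro that already ends in a semicolon breaks when the caller adds the usual one:

#define CALL_FUNC(f) f();  /* Semicolon included */

if (condition)
    CALL_FUNC(func);  /* Expands to func();; and the second ; is an empty statement */
else                  /* Compile error: this else no longer has a matching if */
    other_func();

Leaving the semicolon out of the macro definition, or wrapping multi-statement bodies in do { ... } while (0), avoids the problem.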

Modern practice favors enum constants for integers, const variables for other types, and inline functions or _Generic for type-generic operations over function-like macros.
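
Two of these remedies can be sketched briefly. The names max_int, SWAP_INT, and demo are hypothetical: an inline function evaluates each argument exactly once, and the do { ... } while (0) wrapper makes a multi-statement macro behave like a single statement.

```c
/* Arguments are evaluated once, before the call, unlike with MAX. */
static inline int max_int(int a, int b) {
    return a > b ? a : b;
}

/* The do/while(0) wrapper turns the body into one statement that
   requires a trailing semicolon, so it is safe before an else. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Returns 2 when flag is nonzero (x and y swapped), otherwise 0. */
int demo(int flag) {
    int x = 1, y = 2;
    if (flag)
        SWAP_INT(x, y);
    else
        x = 0;
    return x;
}
```

With max_int(x++, 3) and x starting at 5, x ends at 6 and the result is 5, which is the behavior most readers expect from the macro version.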


Chapter 16 – Modular Programming

16.1 Designing APIs

Well-designed APIs in C separate interface from implementation, hide internal details, and provide consistent error handling:

/* queue.h - Public interface */
#ifndef QUEUE_H
#define QUEUE_H

#include <stddef.h>
#include <stdbool.h>

/* Opaque handle - users only see pointer */
typedef struct Queue Queue;

/* Creation/destruction */
Queue* queue_create(size_t capacity);
void queue_destroy(Queue *q);

/* Operations */
bool queue_enqueue(Queue *q, int value);
bool queue_dequeue(Queue *q, int *value);
bool queue_peek(const Queue *q, int *value);
bool queue_is_empty(const Queue *q);
bool queue_is_full(const Queue *q);
size_t queue_size(const Queue *q);

#endif /* QUEUE_H */

16.2 Header File Organization

Headers should be minimal, self-contained, and idempotent:

/* utils.h */
#ifndef UTILS_H
#define UTILS_H

#include <math.h>    /* sqrt(), needed by vector_length below */

/* Public types */
typedef struct {
    double x;
    double y;
} Vector;

/* Public functions */
Vector vector_add(Vector a, Vector b);
double vector_dot(Vector a, Vector b);
void vector_print(Vector v);

/* Inline functions - safe in header */
static inline double vector_length(Vector v) {
    return sqrt(v.x * v.x + v.y * v.y);
}

#endif /* UTILS_H */

Implementation files include their own header first to verify declarations match definitions:

/* utils.c */
#include "utils.h"
#include <math.h>   /* Implementation needs math.h */
#include <stdio.h>  /* vector_print needs printf */

Vector vector_add(Vector a, Vector b) {
    return (Vector){a.x + b.x, a.y + b.y};
}

/* ... other implementations ... */

16.3 Static Libraries

Static libraries (.a on Unix, .lib on Windows) are archives of object files linked directly into executables:

Creating a static library:

# Compile source files to objects
gcc -c queue.c -o queue.o
gcc -c utils.c -o utils.o

# Create archive
ar rcs libmylib.a queue.o utils.o

# Use library
gcc main.c -L. -lmylib -o program

16.4 Shared Libraries

Shared libraries (.so on Linux, .dylib on macOS, .dll on Windows) are loaded at program start or runtime:

Creating a shared library:

# Compile with position-independent code
gcc -fPIC -c queue.c -o queue.o
gcc -fPIC -c utils.c -o utils.o

# Create shared library
gcc -shared queue.o utils.o -o libmylib.so

# Use library
gcc main.c -L. -lmylib -o program
# May need to set LD_LIBRARY_PATH or install library

Dynamic loading with dlopen (Unix) or LoadLibrary (Windows) allows runtime library selection:

#include <dlfcn.h>

void* handle = dlopen("./libplugin.so", RTLD_LAZY);
if (!handle) {
    fprintf(stderr, "Error: %s\n", dlerror());
    return -1;
}

typedef int (*plugin_func)(int);
plugin_func func = (plugin_func)dlsym(handle, "plugin_function");
if (func) {
    int result = func(42);
}

dlclose(handle);

16.5 Versioning

Library versioning prevents compatibility problems when libraries change. Semantic versioning (major.minor.patch) indicates compatibility:

  • Major version: Incompatible API changes
  • Minor version: Backward-compatible additions
  • Patch version: Backward-compatible bug fixes

On Linux, shared libraries use soname:

gcc -shared -Wl,-soname,libmylib.so.1 -o libmylib.so.1.0.2 queue.o utils.o
ln -s libmylib.so.1.0.2 libmylib.so.1
ln -s libmylib.so.1 libmylib.so

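At link time the linker finds libmylib.so and records the soname (libmylib.so.1) in the executable. At run time the loader looks up that soname, which resolves through the symlink to the installed file with the matching major version.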


PART VI – Standard Library Deep Dive

Chapter 17 – stdio.h

File Handling

The standard I/O library provides buffered file operations:

FILE *fp = fopen("data.txt", "r");
if (!fp) {
    perror("Failed to open file");
    return -1;
}

char buffer[256];
while (fgets(buffer, sizeof(buffer), fp)) {
    printf("%s", buffer);
}

fclose(fp);

File modes:

  • "r" - read (file must exist)
  • "w" - write (creates/truncates)
  • "a" - append (creates if needed)
  • "r+" - read/write (file must exist)
  • "w+" - read/write (creates/truncates)
  • "a+" - read/append (creates if needed)

Binary mode adds "b" (e.g., "rb"), preventing newline translation.

Buffering

Stdio buffers I/O for efficiency. Buffering modes:

  • Fully buffered: data written when buffer fills
  • Line buffered: flushed on newline (typical for terminals)
  • Unbuffered: written immediately

/* Choose one, before any other operation on the stream */
setbuf(fp, NULL);  /* Disable buffering */
setvbuf(fp, buffer, _IOFBF, sizeof(buffer));  /* Or: full buffering with a caller-supplied buffer */

Formatted I/O

printf family formats output:

printf("Integer: %d, Hex: %x, Float: %.2f\n", 42, 42, 3.14159);

int written = fprintf(stderr, "Error: %s\n", msg);

char buffer[100];
sprintf(buffer, "Value: %d", x);  /* Unsafe - no bounds check */
snprintf(buffer, sizeof(buffer), "Value: %d", x);  /* Safe */

scanf family parses input:

int i;
float f;
char str[50];

scanf("%d %f %s", &i, &f, str);  /* Dangerous with %s */
scanf("%49s", str);  /* Limit string length */
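
For numeric input, a common alternative to scanf is to read a whole line with fgets and convert it with strtol, which reports both malformed text and out-of-range values. The sketch below assumes a hypothetical helper named parse_int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses an entire string as a base-10 int. Trailing spaces and the
   newline left by fgets are accepted. Returns 0 on success, or -1 if
   the text has no digits, has trailing junk, or is out of range. */
int parse_int(const char *text, int *out) {
    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text) return -1;                  /* No digits at all */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r') {
        end++;
    }
    if (*end != '\0') return -1;                 /* Trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1;
    *out = (int)v;
    return 0;
}
```

A caller would typically fill a line buffer with fgets(line, sizeof line, stdin) and then pass line to parse_int, re-prompting when it returns -1.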

Chapter 18 – stdlib.h

Memory Management

stdlib.h provides memory allocation functions covered in Chapter 9:

void *malloc(size_t size);
void *calloc(size_t count, size_t size);
void *realloc(void *ptr, size_t new_size);
void free(void *ptr);

Process Control

Functions for program termination and environment interaction:

exit(EXIT_SUCCESS);     /* Normal termination */
exit(EXIT_FAILURE);     /* Error termination */
atexit(cleanup_func);   /* Register function to call on exit */
system("ls -l");        /* Execute shell command */

Environment Variables

Access to environment variables:

char *path = getenv("PATH");
if (path) {
    printf("PATH: %s\n", path);
}

/* POSIX extensions provide putenv, setenv, unsetenv */

Chapter 19 – string.h

String Utilities

string.h provides essential string manipulation functions (detailed in Chapter 12):

size_t strlen(const char *s);
char *strcpy(char *dest, const char *src);
char *strncpy(char *dest, const char *src, size_t n);
char *strcat(char *dest, const char *src);
char *strncat(char *dest, const char *src, size_t n);
int strcmp(const char *s1, const char *s2);
int strncmp(const char *s1, const char *s2, size_t n);
char *strchr(const char *s, int c);
char *strstr(const char *haystack, const char *needle);

Memory functions work with arbitrary data:

void *memcpy(void *dest, const void *src, size_t n);
void *memmove(void *dest, const void *src, size_t n);
int memcmp(const void *s1, const void *s2, size_t n);
void *memset(void *s, int c, size_t n);

memmove handles overlapping regions correctly, unlike memcpy.


Chapter 20 – math.h

Mathematical Functions

math.h declares standard math functions (link with -lm):

#include <math.h>

double power = pow(2.0, 10.0);   /* 1024.0 */
double root = sqrt(25.0);        /* 5.0 */
double sine = sin(radians);
double cosine = cos(radians);
double angle = atan2(y, x);      /* Arc tangent */
double ceil_val = ceil(3.14);     /* 4.0 */
double floor_val = floor(3.14);   /* 3.0 */
double rounded = round(3.5);      /* 4.0 */

Special values and error handling:

#include <math.h>
#include <errno.h>

errno = 0;
double result = sqrt(-1.0);
if (errno == EDOM) {
    /* Domain error - sqrt of negative (set only if math_errhandling & MATH_ERRNO) */
}
if (isnan(result)) {
    /* Not a number */
}

Chapter 21 – time.h

Time Handling

time.h provides functions for working with time:

#include <time.h>

time_t now = time(NULL);  /* Current calendar time */
char *time_str = ctime(&now);  /* Convert to string */

struct tm *local = localtime(&now);
printf("Year: %d, Month: %d, Day: %d\n",
       local->tm_year + 1900,  /* Years since 1900 */
       local->tm_mon + 1,       /* Months since Jan (0-11) */
       local->tm_mday);

/* Formatted time */
char buffer[100];
strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", local);

/* Timing code (clock() measures processor time, not wall-clock time) */
clock_t start = clock();
/* ... code to time ... */
clock_t end = clock();
double seconds = (double)(end - start) / CLOCKS_PER_SEC;

Chapter 22 – assert.h & errno.h

Assertions

assert.h provides runtime assertions for debugging:

#include <assert.h>

int divide(int a, int b) {
    assert(b != 0);  /* Aborts if b == 0 */
    return a / b;
}

Assertions can be disabled by defining NDEBUG before including assert.h.

Error Handling

errno.h declares errno, a modifiable lvalue (thread-local in C11 and later) that library functions set to report errors:

#include <errno.h>
#include <stdio.h>

FILE *fp = fopen("nonexistent.txt", "r");
if (!fp) {
    int saved = errno;       /* Save first: later library calls may change errno */
    perror("fopen failed");  /* Prints error message */
    printf("Error code: %d\n", saved);
}

Common errno values:

  • EDOM - Domain error (math function argument outside the function's domain)
  • ERANGE - Range error (result out of range)
  • ENOMEM - Out of memory
  • EINVAL - Invalid argument

PART VII – Advanced C Programming

Chapter 23 – Bit Manipulation

Bit Masking

Bit masks select or modify specific bits within integers:

/* Define masks for each bit */
#define FLAG_READ   (1 << 0)  /* 0x01 */
#define FLAG_WRITE  (1 << 1)  /* 0x02 */
#define FLAG_EXEC   (1 << 2)  /* 0x04 */
#define FLAG_USER   (1 << 3)  /* 0x08 */

uint8_t permissions = 0;

/* Set flags */
permissions |= FLAG_READ | FLAG_WRITE;

/* Clear flags */
permissions &= ~FLAG_WRITE;

/* Toggle flags */
permissions ^= FLAG_EXEC;

/* Test flags */
if (permissions & FLAG_READ) {
    /* Read permission granted */
}

Shifting

Bit shifts multiply or divide by powers of two efficiently:

uint32_t x = 42;
uint32_t left = x << 3;   /* Multiply by 8: 42 * 8 = 336 */
uint32_t right = x >> 1;  /* Divide by 2: 42 / 2 = 21 */

/* Extract bit fields */
uint32_t value = 0xABCD1234;
uint8_t low_byte = value & 0xFF;           /* 0x34 */
uint8_t high_byte = (value >> 24) & 0xFF;  /* 0xAB */

Low-Level Register Programming

Embedded systems manipulate hardware registers via bit operations:

/* Addresses follow the STM32F1 memory map: RCC base 0x40021000, APB2ENR at offset 0x18 */
#define RCC_APB2ENR ((volatile uint32_t*)0x40021018)

/* Enable GPIOA clock (bit 2) */
*RCC_APB2ENR |= (1 << 2);

/* GPIO configuration */
#define GPIOA_CRL ((volatile uint32_t*)0x40010800)
/* Configure PA0 as output (mode = 01, CNF = 00 for bits 0-3) */
*GPIOA_CRL = (*GPIOA_CRL & ~0x0F) | 0x01;

/* Set PA0 high */
#define GPIOA_ODR ((volatile uint32_t*)0x4001080C)
*GPIOA_ODR |= (1 << 0);

Chapter 24 – Advanced Pointers

Pointer to Functions

Function pointers enable callbacks, state machines, and dynamic dispatch:

typedef int (*operation)(int, int);

int add(int a, int b) { return a + b; }
int subtract(int a, int b) { return a - b; }
int multiply(int a, int b) { return a * b; }

int calculate(operation op, int x, int y) {
    return op(x, y);
}

/* Array of function pointers */
operation ops[] = {add, subtract, multiply};
int result = ops[0](5, 3);  /* Calls add */

Callback Mechanisms

Callbacks allow library code to call user code:

/* Library header */
typedef void (*event_handler)(int event_code, void *user_data);

void register_handler(event_handler handler, void *user_data);
void trigger_event(int event_code);

/* User code */
void my_handler(int code, void *data) {
    printf("Event %d, user data: %s\n", code, (char*)data);
}

int main(void) {
    char *context = "My context";
    register_handler(my_handler, context);
    /* Library will call my_handler when events occur */
}
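
The library side of this interface is not shown above. A minimal sketch, assuming a single handler slot and hypothetical static variables current_handler and current_data, could look like this. The sum_events handler is also hypothetical and shows user_data carrying caller-owned state.

```c
#include <stddef.h>

typedef void (*event_handler)(int event_code, void *user_data);

/* Hypothetical single-slot registry: one handler plus its context. */
static event_handler current_handler = NULL;
static void *current_data = NULL;

void register_handler(event_handler handler, void *user_data) {
    current_handler = handler;
    current_data = user_data;
}

/* Invokes the registered handler, passing back the stored context. */
void trigger_event(int event_code) {
    if (current_handler) {
        current_handler(event_code, current_data);
    }
}

/* Example handler: accumulates event codes into an int owned by the
   caller, reached through the opaque user_data pointer. */
void sum_events(int code, void *data) {
    *(int *)data += code;
}
```

The void pointer is what lets the library stay ignorant of the caller's types: it stores the pointer and hands it back unchanged.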

Chapter 25 – Data Structures in C

Linked Lists

Singly linked list implementation:

typedef struct Node {
    int data;
    struct Node *next;
} Node;

typedef struct {
    Node *head;
    size_t size;
} List;

List* list_create(void) {
    List *list = malloc(sizeof(List));
    if (list) {
        list->head = NULL;
        list->size = 0;
    }
    return list;
}

void list_push_front(List *list, int value) {
    Node *node = malloc(sizeof(Node));
    if (node) {
        node->data = value;
        node->next = list->head;
        list->head = node;
        list->size++;
    }
}

int list_pop_front(List *list, int *value) {
    if (!list->head) return -1;
    
    Node *temp = list->head;
    *value = temp->data;
    list->head = temp->next;
    free(temp);
    list->size--;
    return 0;
}

void list_destroy(List *list) {
    Node *current = list->head;
    while (current) {
        Node *next = current->next;
        free(current);
        current = next;
    }
    free(list);
}

Stacks

Array-based stack:

typedef struct {
    int *data;
    size_t capacity;
    size_t top;
} Stack;

Stack* stack_create(size_t capacity) {
    Stack *s = malloc(sizeof(Stack));
    if (!s) return NULL;
    s->data = malloc(capacity * sizeof(int));
    if (!s->data) {          /* Avoid returning a stack with no storage */
        free(s);
        return NULL;
    }
    s->capacity = capacity;
    s->top = 0;
    return s;
}

bool stack_push(Stack *s, int value) {
    if (s->top >= s->capacity) return false;
    s->data[s->top++] = value;
    return true;
}

bool stack_pop(Stack *s, int *value) {
    if (s->top == 0) return false;
    *value = s->data[--s->top];
    return true;
}

void stack_destroy(Stack *s) {
    free(s->data);
    free(s);
}

Queues

Circular buffer queue:

typedef struct {
    int *data;
    size_t capacity;
    size_t head;
    size_t tail;
    size_t count;
} Queue;

Queue* queue_create(size_t capacity) {
    Queue *q = malloc(sizeof(Queue));
    if (!q) return NULL;
    q->data = malloc(capacity * sizeof(int));
    if (!q->data) {          /* Avoid returning a queue with no storage */
        free(q);
        return NULL;
    }
    q->capacity = capacity;
    q->head = q->tail = q->count = 0;
    return q;
}

bool queue_enqueue(Queue *q, int value) {
    if (q->count >= q->capacity) return false;
    q->data[q->tail] = value;
    q->tail = (q->tail + 1) % q->capacity;
    q->count++;
    return true;
}

bool queue_dequeue(Queue *q, int *value) {
    if (q->count == 0) return false;
    *value = q->data[q->head];
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return true;
}

Trees

Binary search tree:

typedef struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

TreeNode* tree_insert(TreeNode *root, int value) {
    if (!root) {
        TreeNode *node = malloc(sizeof(TreeNode));
        if (!node) return NULL;  /* Allocation failed: subtree stays empty */
        node->data = value;
        node->left = node->right = NULL;
        return node;
    }
    
    if (value < root->data) {
        root->left = tree_insert(root->left, value);
    } else if (value > root->data) {
        root->right = tree_insert(root->right, value);
    }
    return root;
}

bool tree_search(TreeNode *root, int value) {
    if (!root) return false;
    if (value == root->data) return true;
    if (value < root->data) return tree_search(root->left, value);
    return tree_search(root->right, value);
}

void tree_inorder(TreeNode *root, void (*visit)(int)) {
    if (root) {
        tree_inorder(root->left, visit);
        visit(root->data);
        tree_inorder(root->right, visit);
    }
}

Hash Tables

Simple hash table with chaining:

#define TABLE_SIZE 100

typedef struct HashNode {
    char *key;
    int value;
    struct HashNode *next;
} HashNode;

typedef struct {
    HashNode *buckets[TABLE_SIZE];
} HashTable;

unsigned int hash(const char *key) {
    unsigned int h = 0;
    while (*key) {
        h = h * 31 + (unsigned char)*key++;  /* Cast avoids negative values where char is signed */
    }
    return h % TABLE_SIZE;
}

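/* Note: an existing entry with the same key is shadowed, not replaced */
void hash_insert(HashTable *ht, const char *key, int value) {
    unsigned int index = hash(key);
    HashNode *node = malloc(sizeof(HashNode));
    if (!node) return;               /* Allocation failed: nothing inserted */
    node->key = strdup(key);         /* strdup is POSIX (standard C since C23) */
    if (!node->key) {
        free(node);
        return;
    }
    node->value = value;
    node->next = ht->buckets[index];
    ht->buckets[index] = node;
}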

int* hash_lookup(HashTable *ht, const char *key) {
    unsigned int index = hash(key);
    HashNode *node = ht->buckets[index];
    while (node) {
        if (strcmp(node->key, key) == 0) {
            return &node->value;
        }
        node = node->next;
    }
    return NULL;
}

Chapter 26 – Algorithms in C

Sorting Algorithms

Quicksort implementation:

void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;
    
    for (int j = low; j < high; j++) {
        if (arr[j] <= pivot) {
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);
    return i + 1;
}

void quicksort(int arr[], int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high);
        quicksort(arr, low, pi - 1);
        quicksort(arr, pi + 1, high);
    }
}
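
For everyday use, the standard library's `qsort` is usually the first thing to reach for. The sketch below shows the comparator it expects; `compare_ints` and `sort_ints` are hypothetical names. Note that the C standard does not require `qsort` to be a quicksort, nor to be stable.

```c
#include <stdlib.h>

/* Comparator for qsort: negative, zero, or positive, like strcmp.
   (a > b) - (a < b) avoids the overflow that "a - b" can cause. */
static int compare_ints(const void *pa, const void *pb) {
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}

/* Sort n ints in ascending order using the standard library */
void sort_ints(int *arr, size_t n) {
    qsort(arr, n, sizeof arr[0], compare_ints);
}
```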

Mergesort for stable sorting:

void merge(int arr[], int left, int mid, int right) {
    int n1 = mid - left + 1;
    int n2 = right - mid;
    
    int L[n1], R[n2];  /* VLAs: fine for small inputs, but large arrays can overflow the stack */
    
    for (int i = 0; i < n1; i++) L[i] = arr[left + i];
    for (int j = 0; j < n2; j++) R[j] = arr[mid + 1 + j];
    
    int i = 0, j = 0, k = left;
    while (i < n1 && j < n2) {
        if (L[i] <= R[j]) arr[k++] = L[i++];
        else arr[k++] = R[j++];
    }
    
    while (i < n1) arr[k++] = L[i++];
    while (j < n2) arr[k++] = R[j++];
}

void mergesort(int arr[], int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        mergesort(arr, left, mid);
        mergesort(arr, mid + 1, right);
        merge(arr, left, mid, right);
    }
}
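
The merge step above keeps its temporary halves in variable-length arrays on the stack. As a hedged alternative for larger inputs, the sketch below allocates one scratch buffer on the heap and reuses it for every merge. The names `merge_with_buffer`, `sort_range`, and `mergesort_heap` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Merge sorted halves arr[left..mid] and arr[mid+1..right] through tmp */
static void merge_with_buffer(int arr[], int tmp[], int left, int mid, int right) {
    int i = left, j = mid + 1, k = left;
    while (i <= mid && j <= right) {
        /* <= keeps equal elements in their original order (stability) */
        if (arr[i] <= arr[j]) tmp[k++] = arr[i++];
        else tmp[k++] = arr[j++];
    }
    while (i <= mid) tmp[k++] = arr[i++];
    while (j <= right) tmp[k++] = arr[j++];
    memcpy(&arr[left], &tmp[left], (size_t)(right - left + 1) * sizeof arr[0]);
}

static void sort_range(int arr[], int tmp[], int left, int right) {
    if (left >= right) return;
    int mid = left + (right - left) / 2;
    sort_range(arr, tmp, left, mid);
    sort_range(arr, tmp, mid + 1, right);
    merge_with_buffer(arr, tmp, left, mid, right);
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated */
int mergesort_heap(int arr[], int n) {
    if (n < 2) return 0;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp) return -1;
    sort_range(arr, tmp, 0, n - 1);
    free(tmp);
    return 0;
}
```

A single allocation also lets the caller see an out-of-memory condition as a return value instead of a crash.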

Searching Algorithms

Binary search on sorted arrays:

int binary_search(int arr[], int size, int target) {
    int left = 0, right = size - 1;
    
    while (left <= right) {
        int mid = left + (right - left) / 2;
        
        if (arr[mid] == target) return mid;
        if (arr[mid] < target) left = mid + 1;
        else right = mid - 1;
    }
    return -1;  /* Not found */
}
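
A common variation returns the position where a value belongs, not just whether it is present, which is useful when the array contains duplicates or when inserting in sorted order. The sketch below is one way to write it; `lower_bound` is a hypothetical name borrowed from the C++ standard library.

```c
#include <stddef.h>

/* Returns the index of the first element >= target, or size if there is none.
   The half-open range [lo, hi) never needs a negative index. */
size_t lower_bound(const int arr[], size_t size, int target) {
    size_t lo = 0, hi = size;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (arr[mid] < target) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}
```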

Depth-first search for graphs:

typedef struct GraphNode {
    int id;
    struct GraphNode **neighbors;
    int neighbor_count;
    int visited;
} GraphNode;

void dfs(GraphNode *node) {
    if (!node || node->visited) return;
    
    node->visited = 1;
    printf("Visiting node %d\n", node->id);
    
    for (int i = 0; i < node->neighbor_count; i++) {
        dfs(node->neighbors[i]);
    }
}

PART VIII – Systems Programming

Chapter 27 – File Systems & Low-Level I/O

System Calls

Unix system calls provide direct access to the operating system kernel:

#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

int fd = open("file.txt", O_RDONLY);
if (fd == -1) {
    perror("open");
    return -1;
}

char buffer[1024];
ssize_t bytes_read = read(fd, buffer, sizeof(buffer) - 1);
if (bytes_read > 0) {
    buffer[bytes_read] = '\0';
    printf("Read: %s\n", buffer);
}

lseek(fd, 0, SEEK_SET);  /* Rewind to beginning */

close(fd);
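
Both `read` and `write` may transfer fewer bytes than requested, and either can be interrupted by a signal. A minimal sketch of a retry loop for writing follows; `write_all` is a hypothetical helper name, not a POSIX function.

```c
#define _POSIX_C_SOURCE 200809L  /* Request POSIX declarations under strict -std modes */
#include <errno.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR.
   Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;  /* Interrupted before writing: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```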

File Descriptors

File descriptors are integers representing open files, pipes, sockets, etc.:

/* Standard descriptors (already defined in <unistd.h>; shown here for reference) */
#define STDIN_FILENO  0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2

/* Redirect stdout to file */
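int fd = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
if (fd == -1) {
    perror("open");
    return -1;
}
fflush(stdout);  /* Flush output buffered for the old stdout first */
dup2(fd, STDOUT_FILENO);
close(fd);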

printf("This goes to output.txt\n");

Directory Operations

#include <dirent.h>

DIR *dir = opendir("/path/to/directory");
if (dir) {
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] != '.') {  /* Skip hidden files */
            printf("%s\n", entry->d_name);
        }
    }
    closedir(dir);
}

Chapter 28 – Process Management

fork

The fork system call creates a new process by duplicating the calling process:

#include <unistd.h>
#include <sys/wait.h>

pid_t pid = fork();

if (pid == -1) {
    perror("fork");
    exit(1);
} else if (pid == 0) {
    /* Child process */
    printf("Child: PID = %d, Parent PID = %d\n", getpid(), getppid());
    exit(42);  /* Child exits with status 42 */
} else {
    /* Parent process */
    printf("Parent: Child PID = %d\n", pid);
    
    int status;
    wait(&status);  /* Wait for child */
    
    if (WIFEXITED(status)) {
        printf("Child exited with status %d\n", WEXITSTATUS(status));
    }
}

exec

The exec family replaces the current process image with a new program:

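pid_t pid = fork();

if (pid == -1) {
    perror("fork");
    exit(1);
} else if (pid == 0) {
    /* Child becomes new program; the argument list ends with a null pointer */
    execlp("ls", "ls", "-l", "/home", (char *)NULL);

    /* If we get here, exec failed */
    perror("exec");
    _exit(1);  /* _exit avoids flushing stdio buffers inherited from the parent */
} else {
    wait(NULL);  /* Parent waits for child */
}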
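
The fork, exec, and wait steps are often wrapped into one helper that reports how the program ended. The sketch below is one such wrapper; `run_command` is a hypothetical name, and the exit code 127 for a failed exec follows shell convention and is not a POSIX requirement for C programs.

```c
#define _POSIX_C_SOURCE 200809L  /* Request POSIX declarations under strict -std modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the NULL-terminated argument vector argv and wait for it.
   Returns the child's exit status (0-255), or -1 if fork or waitpid failed
   or the child was terminated by a signal. */
int run_command(char *const argv[]) {
    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);  /* Reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with `{"sh", "-c", "exit 7", NULL}`, the function should return 7 on a system where `sh` is on the PATH.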

Signals

Signals provide asynchronous notification of events:

#include <signal.h>

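static volatile sig_atomic_t got_sigint = 0;

void signal_handler(int sig) {
    /* Only async-signal-safe work belongs here; printf and exit are not safe */
    if (sig == SIGINT) {
        got_sigint = 1;
    }
}

int main(void) {
    /* Install signal handler */
    signal(SIGINT, signal_handler);

    while (!got_sigint) {
        /* Main program loop */
        sleep(1);
    }

    printf("\nCaught Ctrl+C, cleaning up...\n");
    /* Clean up resources */
    return 0;
}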

More robust signal handling with sigaction:

struct sigaction sa;
sa.sa_handler = signal_handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;

if (sigaction(SIGINT, &sa, NULL) == -1) {
    perror("sigaction");
    exit(1);
}
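
Because a handler can interrupt the program at almost any point, a common pattern is to have it record the event in a `volatile sig_atomic_t` flag and let ordinary code react later. The sketch below combines that pattern with `sigaction`; `on_signal`, `install_flag_handler`, and `consume_signal` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L  /* Request POSIX declarations under strict -std modes */
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t signal_seen = 0;

/* The handler only records the event; real work happens in normal code */
static void on_signal(int sig) {
    (void)sig;
    signal_seen = 1;
}

/* Install on_signal for signo; returns 0 on success, -1 on failure */
int install_flag_handler(int signo) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* Restart interrupted system calls where supported */
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if a signal arrived since the last call, clearing the flag */
int consume_signal(void) {
    if (!signal_seen) return 0;
    signal_seen = 0;
    return 1;
}
```

A main loop would call `consume_signal()` once per iteration and run its cleanup when it returns 1.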

Chapter 29 – Threads and Concurrency

POSIX Threads

Pthreads provide threading on Unix-like systems:

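#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void* thread_function(void *arg) {
    int *num = (int*)arg;
    printf("Thread received: %d\n", *num);
    /* Small integers can travel in the return pointer via intptr_t */
    return (void*)(intptr_t)(*num * 2);
}

int main(void) {
    pthread_t thread;
    int value = 42;
    void *result;

    /* Create thread; pthread functions return an error number and do not set errno */
    int err = pthread_create(&thread, NULL, thread_function, &value);
    if (err != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        exit(1);
    }

    /* Wait for thread */
    pthread_join(thread, &result);
    printf("Thread returned: %ld\n", (long)(intptr_t)result);

    return 0;
}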
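
When a thread needs more than one input or must hand back a result wider than a pointer, a small struct passed by address is a common approach. The sketch below sums an array with two threads; `SumJob`, `sum_worker`, and `parallel_sum` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread job: an input range plus a slot for the result */
typedef struct {
    const int *data;
    size_t count;
    long sum;  /* Written by the worker, read only after pthread_join */
} SumJob;

static void *sum_worker(void *arg) {
    SumJob *job = arg;
    long total = 0;
    for (size_t i = 0; i < job->count; i++) total += job->data[i];
    job->sum = total;
    return NULL;
}

/* Sum an array using two threads; returns 0 on success or an error number */
int parallel_sum(const int *data, size_t count, long *out) {
    size_t half = count / 2;
    SumJob jobs[2] = {
        { data, half, 0 },
        { data + half, count - half, 0 },
    };
    pthread_t threads[2];

    int err = pthread_create(&threads[0], NULL, sum_worker, &jobs[0]);
    if (err != 0) return err;
    err = pthread_create(&threads[1], NULL, sum_worker, &jobs[1]);
    if (err != 0) {
        pthread_join(threads[0], NULL);
        return err;
    }

    pthread_join(threads[0], NULL);
    pthread_join(threads[1], NULL);
    *out = jobs[0].sum + jobs[1].sum;
    return 0;
}
```

Each worker writes only to its own `SumJob`, so no mutex is needed; `pthread_join` makes the results visible to the caller.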

Mutexes

Mutexes protect shared data from concurrent access:

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int shared_counter = 0;

void* increment(void *arg) {
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&mutex);
        shared_counter++;  /* Critical section */
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    
    printf("Counter: %d (expected 2000000)\n", shared_counter);
}

Semaphores

Semaphores control access to resources with limited capacity:

#include <semaphore.h>

sem_t semaphore;
#define MAX_RESOURCES 3

void* worker(void *arg) {
    int id = *(int*)arg;
    
    sem_wait(&semaphore);  /* Acquire resource */
    printf("Worker %d acquired resource\n", id);
    sleep(rand() % 3);  /* Use resource */
    printf("Worker %d releasing resource\n", id);
    sem_post(&semaphore);  /* Release resource */
    
    return NULL;
}

int main(void) {
    sem_init(&semaphore, 0, MAX_RESOURCES);
    
    pthread_t threads[10];
    int ids[10];
    
    for (int i = 0; i < 10; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    
    for (int i = 0; i < 10; i++) {
        pthread_join(threads[i], NULL);
    }
    
    sem_destroy(&semaphore);
}

Race Conditions

Race conditions occur when thread interleaving produces incorrect results:

/* Without proper synchronization */
int balance = 1000;

void* withdraw(void *arg) {
    int amount = *(int*)arg;
    
    if (balance >= amount) {  /* Check */
        /* Thread might be preempted here */
        balance -= amount;    /* Use */
    }
    
    return NULL;
}
/* Multiple threads can both pass the check before either subtracts */

Proper synchronization eliminates races:

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* A mutex must be initialized before use */

void* safe_withdraw(void *arg) {
    int amount = *(int*)arg;
    
    pthread_mutex_lock(&lock);
    if (balance >= amount) {
        balance -= amount;
    }
    pthread_mutex_unlock(&lock);
    
    return NULL;
}

Chapter 30 – Networking in C

Sockets

Socket programming enables network communication:

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create socket */
int server_fd = socket(AF_INET, SOCK_STREAM, 0);
if (server_fd == -1) {
    perror("socket");
    exit(1);
}

/* Allow address reuse */
int opt = 1;
setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

/* Bind to address and port */
struct sockaddr_in address;
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons(8080);

if (bind(server_fd, (struct sockaddr*)&address, sizeof(address)) < 0) {
    perror("bind");
    exit(1);
}

/* Listen for connections */
if (listen(server_fd, 5) < 0) {
    perror("listen");
    exit(1);
}

/* Accept a connection */
struct sockaddr_in client_addr;
socklen_t client_len = sizeof(client_addr);
int client_fd = accept(server_fd, (struct sockaddr*)&client_addr, &client_len);

char *client_ip = inet_ntoa(client_addr.sin_addr);
printf("Connection from %s\n", client_ip);

TCP/IP

TCP provides reliable, stream-oriented connections:

Client example:

int sock = socket(AF_INET, SOCK_STREAM, 0);

struct sockaddr_in server;
server.sin_family = AF_INET;
server.sin_port = htons(8080);
inet_pton(AF_INET, "127.0.0.1", &server.sin_addr);

if (connect(sock, (struct sockaddr*)&server, sizeof(server)) < 0) {
    perror("connect");
    exit(1);
}

char *message = "Hello, server!";
send(sock, message, strlen(message), 0);

char buffer[1024] = {0};
recv(sock, buffer, sizeof(buffer) - 1, 0);
printf("Server replied: %s\n", buffer);

close(sock);
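
Because TCP is a byte stream, a single `send` or `recv` call may move only part of a message. A minimal sketch of loops that keep going until the full length has been transferred follows; `send_all` and `recv_exact` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L  /* Request POSIX declarations under strict -std modes */
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send exactly len bytes; returns 0 on success, -1 on error */
int send_all(int sock, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive exactly len bytes; returns 0 on success,
   -1 on error or if the peer closed the connection early */
int recv_exact(int sock, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);
        if (n == 0) return -1;  /* Peer closed before len bytes arrived */
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Protocols built on these helpers typically send a fixed-size length prefix first, so the receiver knows how many bytes to ask `recv_exact` for.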

Server handling multiple clients:

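void* handle_client(void *arg) {
    int client_fd = *(int*)arg;
    free(arg);  /* Allocated per connection by the accept loop */
    char buffer[1024];

    while (1) {
        ssize_t bytes = recv(client_fd, buffer, sizeof(buffer) - 1, 0);
        if (bytes <= 0) break;

        buffer[bytes] = '\0';
        printf("Received: %s", buffer);

        /* Echo back */
        send(client_fd, buffer, bytes, 0);
    }

    close(client_fd);
    return NULL;
}

int main(void) {
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    /* bind and listen as before */

    while (1) {
        struct sockaddr_in client_addr;
        socklen_t client_len = sizeof(client_addr);
        int client_fd = accept(server_fd, (struct sockaddr*)&client_addr, &client_len);
        if (client_fd < 0) continue;

        /* Give each thread its own copy of the descriptor: passing
           &client_fd would race with the next loop iteration */
        int *fd_ptr = malloc(sizeof(*fd_ptr));
        if (!fd_ptr) {
            close(client_fd);
            continue;
        }
        *fd_ptr = client_fd;

        pthread_t thread;
        if (pthread_create(&thread, NULL, handle_client, fd_ptr) != 0) {
            close(client_fd);
            free(fd_ptr);
            continue;
        }
        pthread_detach(thread);  /* Auto-clean when done */
    }
}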

PART IX – Secure and Defensive C Programming

Chapter 31 – Secure Coding Standards

Common Vulnerabilities

Buffer overflows remain among the most common vulnerabilities in C code:

void vulnerable(char *input) {
    char buffer[100];
    strcpy(buffer, input);  /* No length check */
}

Integer overflows can lead to unexpected behavior:

void vulnerable(size_t len) {
    char *buffer = malloc(len + 1);  /* If len == SIZE_MAX, len + 1 wraps to 0 */
    if (!buffer) return;
    /* ... writing len bytes would now overflow the undersized allocation ... */
    free(buffer);
}
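
One way to guard against this wraparound is to test the arithmetic before calling `malloc`. The sketch below shows checks for both an addition and a multiplication; `alloc_string` and `alloc_array` are hypothetical helper names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for len characters plus a terminating NUL.
   Returns NULL if len + 1 would overflow or malloc fails. */
char *alloc_string(size_t len) {
    if (len == SIZE_MAX) return NULL;  /* len + 1 would wrap to 0 */
    return malloc(len + 1);
}

/* Allocate count elements of elem_size bytes each,
   rejecting requests whose total size overflows size_t */
void *alloc_array(size_t count, size_t elem_size) {
    if (elem_size != 0 && count > SIZE_MAX / elem_size) return NULL;
    return malloc(count * elem_size);
}
```

The standard `calloc(count, elem_size)` is generally expected to perform the multiplication check itself, and it also zeroes the memory.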

Format string vulnerabilities allow information disclosure and, through %n, writes to memory:

printf(user_input);  /* Dangerous - user can supply format specifiers */
printf("%s", user_input);  /* Safe */

Input Validation

Always validate input before use:

/* Requires <errno.h>, <limits.h>, <stdbool.h>, and <stdlib.h> */
bool validate_number(const char *input, int *result) {
    char *endptr;
    errno = 0;
    long val = strtol(input, &endptr, 10);

    /* Check for conversion errors */
    if (endptr == input || *endptr != '\0') {
        return false;  /* Not a valid number */
    }

    /* Check range; ERANGE covers platforms where long is no wider than int */
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX) {
        return false;  /* Out of range */
    }

    *result = (int)val;
    return true;
}

Secure String Handling

Use bounds-checked functions:

char *safe_strcpy(char *dest, const char *src, size_t dest_size) {
    if (dest_size == 0) return dest;
    
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';
    return dest;
}
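
The `strncpy` wrapper above truncates silently, so the caller cannot tell whether data was lost. As a hedged alternative, the sketch below uses `snprintf`, whose return value reveals truncation; `copy_string` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdio.h>

/* Copy src into dest (capacity dest_size), always NUL-terminating.
   Returns true if the whole string fit, false if it was truncated. */
bool copy_string(char *dest, size_t dest_size, const char *src) {
    if (dest_size == 0) return false;
    /* snprintf returns the length the full output would have had */
    int needed = snprintf(dest, dest_size, "%s", src);
    return needed >= 0 && (size_t)needed < dest_size;
}
```

Treating truncation as an error is often safer than continuing with a shortened path, command, or identifier.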

Chapter 32 – Exploitation Basics

Stack Overflow

Understanding stack overflows helps in preventing them:

void exploit_me(char *input) {
    char buffer[64];
    /* If input is 64 bytes or longer, strcpy writes past the end of buffer */
    /* The overflow can overwrite the saved return address on the stack */
    strcpy(buffer, input);
}

Heap Overflow

Heap overflows corrupt dynamic memory management structures:

void heap_exploit(char *input) {
    char *buffer = malloc(64);
    /* If input is 64 bytes or longer, strcpy writes past the allocation and can corrupt heap metadata */
    strcpy(buffer, input);
}

Format String Attacks

Format string vulnerabilities can read and write memory:

printf(user_input);  /* %x leaks stack values */
printf(user_input);  /* %n writes to memory */

Chapter 33 – Hardening Techniques

ASLR

Address Space Layout Randomization randomizes memory addresses:

# Compile as a position-independent executable so the program's own code can be randomized
gcc -fPIE -pie program.c -o program

Stack Canaries

Compiler-inserted guards detect stack overflows:

# Compile with stack protector
gcc -fstack-protector-strong program.c -o program

DEP/NX

Data Execution Prevention marks memory as non-executable:

/* Use mprotect to control memory permissions */
#include <sys/mman.h>

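void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (mem == MAP_FAILED) {  /* mmap reports failure with MAP_FAILED, not NULL */
    perror("mmap");
    exit(1);
}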
/* Later, if code needs to execute from this region */
mprotect(mem, size, PROT_READ | PROT_EXEC);

PART X – Embedded and Low-Level C

Chapter 34 – Embedded Systems Programming

Microcontrollers

Embedded C interacts directly with hardware:

/* STM32F4 example */
#include <stdint.h>
#define RCC_BASE    0x40023800
#define GPIOA_BASE  0x40020000

#define RCC_AHB1ENR (*(volatile uint32_t *)(RCC_BASE + 0x30))
#define GPIOA_MODER (*(volatile uint32_t *)(GPIOA_BASE + 0x00))
#define GPIOA_ODR   (*(volatile uint32_t *)(GPIOA_BASE + 0x14))

void delay(volatile uint32_t count) {
    while (count--) {
        __asm__("nop");
    }
}

int main(void) {
    /* Enable GPIOA clock */
    RCC_AHB1ENR |= (1 << 0);
    
    /* Configure PA5 as output (bits 10-11: 01) */
    GPIOA_MODER = (GPIOA_MODER & ~(3 << 10)) | (1 << 10);
    
    while (1) {
        GPIOA_ODR ^= (1 << 5);  /* Toggle LED */
        delay(1000000);
    }
}

Memory-Mapped I/O

Hardware registers accessed through pointers:

typedef struct {
    volatile uint32_t CR;     /* Control register */
    volatile uint32_t SR;     /* Status register */
    volatile uint32_t DR;     /* Data register */
} UART_Type;

/* Base address and register layout are illustrative; check the device's reference manual */
#define UART1 ((UART_Type*)0x40011000)

void uart_send(char c) {
    /* Wait until transmit buffer empty */
    while (!(UART1->SR & (1 << 7)));
    UART1->DR = c;
}
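
Configuring a peripheral usually means changing a few bits of a register while leaving the rest alone, as the GPIOA_MODER line in the previous section does. The sketch below wraps that read-modify-write pattern in helpers; `reg_modify`, `reg_write_field`, and `reg_read_field` are hypothetical names and do not come from any vendor library.

```c
#include <stdint.h>

/* Replace the bits selected by mask with value (already shifted into place) */
static inline void reg_modify(volatile uint32_t *reg, uint32_t mask, uint32_t value) {
    uint32_t tmp = *reg;                   /* One read of the register */
    tmp = (tmp & ~mask) | (value & mask);
    *reg = tmp;                            /* One write of the register */
}

/* Build a mask of `width` bits starting at bit `shift` */
static inline uint32_t field_mask(unsigned shift, unsigned width) {
    uint32_t ones = (width >= 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return ones << shift;
}

/* Write `field` into the bit field at [shift, shift + width) */
static inline void reg_write_field(volatile uint32_t *reg, unsigned shift,
                                   unsigned width, uint32_t field) {
    reg_modify(reg, field_mask(shift, width), field << shift);
}

/* Read the bit field at [shift, shift + width) */
static inline uint32_t reg_read_field(const volatile uint32_t *reg, unsigned shift,
                                      unsigned width) {
    return (*reg & field_mask(shift, width)) >> shift;
}
```

The sequence is not atomic: if an interrupt handler modifies the same register between the read and the write, its change is lost, so shared registers may need interrupts masked around the update.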

Chapter 35 – C for OS Development

Bootloaders

Simple bootloader in C with inline assembly:

__attribute__((naked))
void boot_entry(void) {
    __asm__ volatile (
        "mov $0x0003, %ax\n"  /* AH=0x00: set video mode; AL=0x03: 80x25 text */
        "int $0x10\n"
        "ljmp $0x0, $kernel_main\n"
    );
}

void kernel_main(void) {
    /* Clear screen */
    volatile char *video = (volatile char*)0xB8000;  /* VGA text buffer; volatile keeps the writes from being optimized away */
    for (int i = 0; i < 80 * 25 * 2; i += 2) {
        video[i] = ' ';
        video[i + 1] = 0x07;  /* White on black */
    }
    
    /* Print message */
    char *msg = "Hello from kernel!";
    int pos = 0;
    while (*msg) {
        video[pos] = *msg++;
        video[pos + 1] = 0x07;
        pos += 2;
    }
    
    /* Halt */
    while (1) {
        __asm__("hlt");
    }
}

PART XI – Compiler and Internals

Chapter 36 – How C Compilers Work

Lexical Analysis

The lexer converts source code to tokens:

/* Input: int x = 42; */
/* Tokens: KEYWORD_INT, IDENTIFIER("x"), OPERATOR_ASSIGN, INTEGER(42), SEMICOLON */
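
A minimal sketch of a lexer that produces the token stream above is shown below. It recognizes only what the example needs (the `int` keyword, identifiers, integer literals, `=`, and `;`). The names `TokenType`, `Token`, and `next_token` are hypothetical, and a real C lexer handles many more cases.

```c
#include <ctype.h>
#include <string.h>

typedef enum {
    TOK_KEYWORD_INT, TOK_IDENTIFIER, TOK_ASSIGN,
    TOK_INTEGER, TOK_SEMICOLON, TOK_END, TOK_ERROR
} TokenType;

typedef struct {
    TokenType type;
    char text[32];  /* Lexeme, truncated if longer than 31 characters */
} Token;

/* Read one token starting at *src and advance *src past it */
Token next_token(const char **src) {
    Token tok = { TOK_END, "" };
    const char *p = *src;

    while (isspace((unsigned char)*p)) p++;  /* Skip whitespace */

    const char *start = p;
    if (*p == '\0') {
        tok.type = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        tok.type = TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) p++;
        tok.type = TOK_INTEGER;
    } else if (*p == '=') {
        p++;
        tok.type = TOK_ASSIGN;
    } else if (*p == ';') {
        p++;
        tok.type = TOK_SEMICOLON;
    } else {
        p++;
        tok.type = TOK_ERROR;  /* Character this sketch does not understand */
    }

    size_t len = (size_t)(p - start);
    if (len >= sizeof tok.text) len = sizeof tok.text - 1;
    memcpy(tok.text, start, len);
    tok.text[len] = '\0';

    /* Keywords are identifiers with reserved spellings */
    if (tok.type == TOK_IDENTIFIER && strcmp(tok.text, "int") == 0) {
        tok.type = TOK_KEYWORD_INT;
    }

    *src = p;
    return tok;
}
```

Calling `next_token` repeatedly on a pointer to `"int x = 42;"` yields the five tokens listed above, followed by `TOK_END`.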

Parsing

The parser builds an abstract syntax tree:

// Representation of "if (x > 0) y = x;"
struct ast_node {
    enum { AST_IF, AST_BINARY, AST_ASSIGN, AST_VAR } type;
    union {
        struct { struct ast_node *cond, *then, *else_; } if_stmt;
        struct { struct ast_node *left, *right; int op; } binary;
        struct { char *name; struct ast_node *value; } assign;
    } data;
};

Code Generation

The backend generates assembly or machine code:

// Pseudo-code generation
void gen_binary(struct ast_node *node) {
    gen_expr(node->data.binary.left);
    gen_expr(node->data.binary.right);
    switch (node->data.binary.op) {
        case '+': emit("add"); break;
        case '-': emit("sub"); break;
    }
}

Chapter 37 – Writing a Mini C Compiler

A simplified compiler structure:

typedef struct {
    char **tokens;
    int pos;
} Parser;

typedef struct {
    // Symbol table, etc.
} Compiler;

ASTNode* parse_expression(Parser *p) {
    // Parse primary expression
    ASTNode *left = parse_primary(p);
    
    // Parse binary operators
    while (is_binary_op(current_token(p))) {
        int op = get_binary_op(current_token(p));
        next_token(p);
        ASTNode *right = parse_primary(p);
        left = create_binary_node(op, left, right);
    }
    
    return left;
}

void generate_code(ASTNode *node) {
    // Generate assembly
    switch (node->type) {
        case AST_INTEGER:
            printf("mov $%d, %%rax\n", node->value);
            break;
        case AST_BINARY:
            generate_code(node->left);
            push_register("%rax");
            generate_code(node->right);
            pop_register("%rbx");
            emit_operation(node->op);
            break;
    }
}

PART XII – Testing, Debugging & Optimization

Chapter 38 – Debugging with GDB

Essential GDB commands:

# Compile with debugging symbols
gcc -g program.c -o program

# Start GDB
gdb ./program

# Common commands
break main           # Set breakpoint at main
break file.c:42      # Break at line 42
run                  # Start program
next                 # Step over
step                 # Step into
print variable       # Print variable value
backtrace           # Show call stack
info locals         # Show local variables
continue            # Continue execution
quit                # Exit GDB

Chapter 39 – Profiling & Optimization

Profiling with gprof:

gcc -pg program.c -o program
./program
gprof program gmon.out > analysis.txt

Compiler Optimizations:

-O0  # No optimization (default)
-O1  # Basic optimization
-O2  # More aggressive
-O3  # Highest level
-Os  # Optimize for size
-Ofast # O3 with fast-math (may break standards)

Chapter 40 – Static & Dynamic Analysis Tools

Static Analysis with clang-tidy:

clang-tidy program.c -- -Iinclude

Dynamic Analysis with Valgrind:

valgrind --tool=memcheck ./program
valgrind --tool=helgrind ./program  # Detect race conditions
valgrind --tool=cachegrind ./program # Cache profiling

Address Sanitizer:

gcc -fsanitize=address -g program.c -o program
./program

PART XIII – Modern C (C11, C17, C23)

Chapter 41 – Multithreading (C11 Threads)

C11 standard threads (optional):

#include <threads.h>

int thread_func(void *arg) {
    int *num = arg;
    printf("Thread %d\n", *num);
    return 0;
}

int main(void) {
    thrd_t thread;
    int id = 42;
    
    if (thrd_create(&thread, thread_func, &id) == thrd_success) {
        thrd_join(thread, NULL);
    }
    
    return 0;
}

Chapter 42 – Atomics

Atomic operations (C11):

#include <stdatomic.h>
#include <stdio.h>

atomic_int counter = 0;  /* Plain initialization; ATOMIC_VAR_INIT is obsolescent in C17 and removed in C23 */

void increment(void) {
    atomic_fetch_add(&counter, 1);
}

int main(void) {
    // Thread-safe without locks
    increment();
    printf("%d\n", atomic_load(&counter));
}
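
`atomic_fetch_add` covers simple counters, but an update that depends on the current value usually needs a compare-and-swap loop. The sketch below keeps a running maximum that way; `atomic_store_max` is a hypothetical name.

```c
#include <stdatomic.h>

/* Raise *target to value if value is larger; safe to call from many threads */
void atomic_store_max(atomic_int *target, int value) {
    int current = atomic_load(target);
    /* On failure, compare_exchange stores the latest value into current,
       so the loop re-checks the condition against fresh data */
    while (current < value &&
           !atomic_compare_exchange_weak(target, &current, value)) {
        /* Retry */
    }
}
```

The weak variant may fail spuriously, which is why it is used inside a loop.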

Chapter 43 – Generic Selections

Type-generic expressions (C11):

#define cbrt(x) _Generic((x), \
    long double: cbrtl, \
    default: cbrt, \
    float: cbrtf \
)(x)

double d = cbrt(27.0);     // Calls cbrt
float f = cbrt(27.0f);     // Calls cbrtf
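
`_Generic` can also select a value instead of a function. The sketch below maps an expression to a string naming its type, which can help with debugging macros; `TYPE_NAME` is a hypothetical macro, and the list of types is deliberately incomplete.

```c
/* Evaluates to a string literal naming the type of x.
   The controlling expression is not evaluated, only its type is used. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    unsigned int: "unsigned int",  \
    long: "long",                  \
    float: "float",                \
    double: "double",              \
    char *: "char *",              \
    const char *: "const char *",  \
    default: "other")
```

Note that `TYPE_NAME('a')` selects `"int"`, since character constants have type `int` in C.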

Chapter 44 – Latest C23 Features

C23 introduces several improvements:

// Binary literals
int b = 0b101010;  // 42

// nullptr constant (a keyword in C23; it must not be defined as a macro)
int *p = nullptr;

// Attributes
[[deprecated("Use new_function instead")]]
void old_function(void);

// typeof operator
typeof(x) y = 42;  // y has same type as x

// Enhanced enumerations
enum Color : unsigned char { RED, GREEN, BLUE };

PART XIV – Real-World Projects

Project 1 – Building a Shell

A simple shell implementation:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#define MAX_INPUT 1024
#define MAX_ARGS 64

void parse_command(char *input, char **args) {
    int i = 0;
    args[i] = strtok(input, " \t\n");
    while (args[i] && i < MAX_ARGS - 1) {
        args[++i] = strtok(NULL, " \t\n");
    }
    args[i] = NULL;
}

int execute_command(char **args) {
    if (!args[0]) return 1;  // Empty command
    
    // Built-in commands
    if (strcmp(args[0], "cd") == 0) {
        const char *dir = args[1] ? args[1] : getenv("HOME");
        if (!dir) fprintf(stderr, "cd: HOME not set\n");
        else if (chdir(dir) != 0) perror("cd");
        return 1;
    }
    if (strcmp(args[0], "exit") == 0) return 0;
    
    pid_t pid = fork();
    if (pid == 0) {
        // Child process
        execvp(args[0], args);
        perror("execvp failed");
        exit(1);
    } else if (pid > 0) {
        // Parent process
        int status;
        waitpid(pid, &status, 0);
    } else {
        perror("fork failed");
    }
    return 1;
}

int main(void) {
    char input[MAX_INPUT];
    char *args[MAX_ARGS];
    
    printf("Welcome to MiniShell\n");
    
    while (1) {
        printf("shell> ");
        fflush(stdout);
        
        if (!fgets(input, sizeof(input), stdin)) break;
        
        // Remove trailing newline
        input[strcspn(input, "\n")] = 0;
        
        parse_command(input, args);
        if (!execute_command(args)) break;
    }
    
    printf("Goodbye!\n");
    return 0;
}

Project 2 – Writing a Memory Allocator

A simple malloc implementation:

typedef struct block {
    size_t size;
    int free;
    struct block *next;
} block_t;

#define BLOCK_SIZE sizeof(block_t)
void *heap_start = NULL;

block_t *find_free_block(block_t **last, size_t size) {
    block_t *current = heap_start;
    while (current && !(current->free && current->size >= size)) {
        *last = current;
        current = current->next;
    }
    return current;
}

block_t *request_space(block_t *last, size_t size) {
    block_t *block = sbrk(0);  // Current program break
    void *request = sbrk(size + BLOCK_SIZE);
    
    if (request == (void*)-1) return NULL;  // sbrk failed
    
    if (last) last->next = block;
    
    block->size = size;
    block->free = 0;
    block->next = NULL;
    return block;
}

void *my_malloc(size_t size) {
    if (size == 0) return NULL;  /* size_t is unsigned, so it can never be negative */
    
    if (!heap_start) {
        // First call
        block_t *block = request_space(NULL, size);
        if (!block) return NULL;
        heap_start = block;
        return block + 1;  // Return memory after block header
    }
    
    block_t *last = heap_start;
    block_t *block = find_free_block(&last, size);
    
    if (!block) {
        block = request_space(last, size);
        if (!block) return NULL;
    } else {
        block->free = 0;
    }
    
    return block + 1;
}

void my_free(void *ptr) {
    if (!ptr) return;
    
    block_t *block = (block_t*)ptr - 1;
    block->free = 1;
    
    // Coalesce with next block if free
    if (block->next && block->next->free) {
        block->size += BLOCK_SIZE + block->next->size;
        block->next = block->next->next;
    }
}
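
The allocator above passes the requested size straight to `sbrk`, so odd sizes leave later block headers misaligned. One common remedy, sketched below, is to round every request up to a multiple of the platform's strictest alignment. `ALLOC_ALIGN` and `align_up` are hypothetical names, and the header size would also need to be a multiple of the alignment for the returned pointers to benefit.

```c
#include <stddef.h>
#include <stdint.h>

/* Alignment suitable for any standard object type (C11) */
#define ALLOC_ALIGN _Alignof(max_align_t)

/* Round size up to the next multiple of ALLOC_ALIGN, which is a power of two.
   Returns 0 if the rounded value would not fit in size_t. */
static inline size_t align_up(size_t size) {
    if (size > SIZE_MAX - (ALLOC_ALIGN - 1)) return 0;
    return (size + ALLOC_ALIGN - 1) & ~(size_t)(ALLOC_ALIGN - 1);
}
```

A caller would write `size = align_up(size);` at the top of the allocation function and treat a zero result for a non-zero request as failure.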

Project 3 – TCP Chat Server

A multi-threaded chat server:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define MAX_CLIENTS 10
#define BUFFER_SIZE 1024

typedef struct {
    int socket;
    char name[32];
    int active;
} client_t;

client_t clients[MAX_CLIENTS];
pthread_mutex_t clients_mutex = PTHREAD_MUTEX_INITIALIZER;

void broadcast_message(char *message, int sender_id) {
    pthread_mutex_lock(&clients_mutex);
    
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (clients[i].active && i != sender_id) {
            send(clients[i].socket, message, strlen(message), 0);
        }
    }
    
    pthread_mutex_unlock(&clients_mutex);
}

void *handle_client(void *arg) {
    int client_id = *(int*)arg;
    char buffer[BUFFER_SIZE];
    
    // Get client name
    memset(buffer, 0, sizeof(buffer));
    recv(clients[client_id].socket, buffer, sizeof(buffer) - 1, 0);
    buffer[strcspn(buffer, "\n")] = 0;
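    /* buffer can hold far more than name[32], so copy with truncation instead of strcpy */
    snprintf(clients[client_id].name, sizeof(clients[client_id].name), "%s", buffer);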
    
    char welcome[BUFFER_SIZE];
    snprintf(welcome, sizeof(welcome), "%s joined the chat\n", 
             clients[client_id].name);
    broadcast_message(welcome, client_id);
    
    // Message loop
    while (1) {
        memset(buffer, 0, sizeof(buffer));
        int bytes = recv(clients[client_id].socket, buffer, 
                         sizeof(buffer) - 1, 0);
        
        if (bytes <= 0) break;
        
        buffer[strcspn(buffer, "\n")] = 0;
        
        char message[BUFFER_SIZE + 50];
        snprintf(message, sizeof(message), "%s: %s\n", 
                 clients[client_id].name, buffer);
        broadcast_message(message, client_id);
    }
    
    // Client disconnected
    pthread_mutex_lock(&clients_mutex);
    clients[client_id].active = 0;
    close(clients[client_id].socket);
    pthread_mutex_unlock(&clients_mutex);
    
    char farewell[BUFFER_SIZE];
    snprintf(farewell, sizeof(farewell), "%s left the chat\n", 
             clients[client_id].name);
    broadcast_message(farewell, -1);
    
    return NULL;
}

int main(void) {
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    
    int opt = 1;
    setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
    
    struct sockaddr_in address;
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(8888);
    
    if (bind(server_fd, (struct sockaddr*)&address, sizeof(address)) < 0 ||
        listen(server_fd, MAX_CLIENTS) < 0) {
        perror("bind/listen");
        return 1;
    }
    
    printf("Chat server running on port 8888\n");
    
    while (1) {
        struct sockaddr_in client_addr;
        socklen_t client_len = sizeof(client_addr);
        int client_fd = accept(server_fd, 
                               (struct sockaddr*)&client_addr, 
                               &client_len);
        if (client_fd < 0) continue;  /* accept failed; try again */
        
        pthread_mutex_lock(&clients_mutex);
        
        int client_id = -1;
        for (int i = 0; i < MAX_CLIENTS; i++) {
            if (!clients[i].active) {
                clients[i].socket = client_fd;
                clients[i].active = 1;
                client_id = i;
                break;
            }
        }
        
        pthread_mutex_unlock(&clients_mutex);
        
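        if (client_id != -1) {
            /* One stable int per slot: passing &client_id would race with
               the next loop iteration overwriting the local variable */
            static int client_ids[MAX_CLIENTS];
            client_ids[client_id] = client_id;

            pthread_t thread;
            pthread_create(&thread, NULL, handle_client, &client_ids[client_id]);
            pthread_detach(thread);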
        } else {
            char *msg = "Server full\n";
            send(client_fd, msg, strlen(msg), 0);
            close(client_fd);
        }
    }
    
    close(server_fd);
    return 0;
}

Appendices

Appendix A – Complete Operator Precedence Table

Precedence  Operator                         Description                                                   Associativity
1           () [] -> . ++ --                 Function call, array subscript, member access, postfix ++/--  Left-to-right
2           ++ -- ! ~ + - * & (type) sizeof  Unary operators (prefix ++ and --)                            Right-to-left
3           * / %                            Multiplication, division, remainder                           Left-to-right
4           + -                              Addition, subtraction                                         Left-to-right
5           << >>                            Bitwise shift                                                 Left-to-right
6           < <= > >=                        Relational                                                    Left-to-right
7           == !=                            Equality                                                      Left-to-right
8           &                                Bitwise AND                                                   Left-to-right
9           ^                                Bitwise XOR                                                   Left-to-right
10          |                                Bitwise OR                                                    Left-to-right
11          &&                               Logical AND                                                   Left-to-right
12          ||                               Logical OR                                                    Left-to-right
13          ?:                               Ternary conditional                                           Right-to-left
14          = += -= etc.                     Assignment                                                    Right-to-left
15          ,                                Comma                                                         Left-to-right

Appendix G – Interview Questions (Advanced Level)

  1. Explain the difference between int* p and int *p. (Syntactically equivalent, both declare pointer to int)

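  2. What's the output of printf("%zu", sizeof('a'));? (In C, character constants have type int, so this prints sizeof(int), typically 4. Note that sizeof yields a size_t, which requires %zu; printing it with %d is undefined behavior.)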

  3. How does const int *p differ from int *const p? (First: pointer to constant int; Second: constant pointer to int)

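  4. Explain strict aliasing rule and when it matters. (An object may only be accessed through an lvalue of a compatible type, its signed or unsigned counterpart, or a character type; violations are undefined behavior, and the rule lets the optimizer assume that pointers to unrelated types do not alias)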

  5. What's the difference between ++i and i++ in terms of generated code? (Modern compilers generate same code when value unused, but semantics differ)

  6. How would you implement a thread-safe reference counting system? (Use atomic operations for increments/decrements)

  7. Explain the memory layout of a C program and what each segment contains. (Text, data, BSS, heap, stack)

  8. What happens when you call free() twice on the same pointer? (Undefined behavior - heap corruption)

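  9. How does volatile affect compiler optimizations? (Every access must actually be performed, so the compiler cannot cache the value in a register or remove reads and writes; it does not make accesses atomic or order them between threads)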

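  10. Explain the concept of "sequence points" and why they matter. (Points where all earlier side effects are complete; modifying an object more than once between two sequence points is undefined behavior, and the order in which most operands are evaluated is unspecified)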

