C/C++ Fundamentals

Introduction

C is the cornerstone of systems programming, and C++ extends it with object-oriented programming, generic programming, and modern memory management. Understanding C/C++ is essential for gaining deep insight into computer systems (operating systems, compilers, embedded systems).


1. C Language Core

1.1 Pointers

A pointer is a variable that stores a memory address, and it is the central concept in C.

int x = 42;
int *p = &x;    // p stores the address of x

printf("%d\n", *p);            // 42 (dereference)
printf("%p\n", (void *)p);     // address of x, e.g. 0x7ffd5e8a3c (%p expects a void *)
printf("%p\n", (void *)&x);    // same address as above

Relationship between pointers and arrays:

int arr[5] = {10, 20, 30, 40, 50};
int *p = arr;       // array name decays to a pointer

printf("%d\n", *p);       // 10
printf("%d\n", *(p+2));   // 30
printf("%d\n", p[3]);     // 40 (equivalent to *(p+3))
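
One consequence of decay is that the pointer no longer carries the array's length. The sketch below, written in C++ with hypothetical helper names (`last_element`, `length_of`), shows the two usual ways of coping: pass the count next to the pointer, or (in C++ only) take the array by reference so the size stays in the type.

```cpp
#include <cassert>
#include <cstddef>

// Once an array has decayed to a pointer its length is lost, so the element
// count has to be passed alongside the pointer.
int last_element(const int *p, std::size_t count) {
    assert(p != nullptr && count > 0);
    return *(p + (count - 1));  // same as p[count - 1]
}

// Taking the array by reference keeps its size in the type (C++ only).
template<std::size_t N>
std::size_t length_of(const int (&)[N]) {
    return N;
}
```

For `int arr[5] = {10, 20, 30, 40, 50};`, `last_element(arr, 5)` yields 50 and `length_of(arr)` yields 5.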

Multi-level pointers:

int x = 10;
int *p = &x;
int **pp = &p;

printf("%d\n", **pp);  // 10

Function pointers:

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

int (*op)(int, int);  // function pointer declaration
op = add;
printf("%d\n", op(3, 4));  // 7
op = sub;
printf("%d\n", op(3, 4));  // -1
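
Function pointers are often collected in a lookup table so that an operation can be chosen at run time. The following is a minimal sketch in C++; the `Op` struct, the `apply` helper, and the choice to return 0 for an unknown symbol are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

// Hypothetical table entry: an operator symbol paired with its handler.
struct Op {
    char symbol;
    int (*fn)(int, int);
};

// Look up `symbol` in a small table and call the matching function.
// Returns 0 if the symbol is unknown (an arbitrary choice for this sketch).
int apply(char symbol, int a, int b) {
    static const Op table[] = {{'+', add}, {'-', sub}};
    for (std::size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (table[i].symbol == symbol) {
            assert(table[i].fn != nullptr);
            return table[i].fn(a, b);  // call through the pointer
        }
    }
    return 0;
}
```

Adding a new operation then only requires a new table row, not a change to the lookup logic.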

1.2 Pointer Arithmetic

int arr[5] = {10, 20, 30, 40, 50};
int *p = arr;

p++;      // advances by sizeof(int) bytes (typically 4)
// p now points to arr[1]

// ptrdiff_t is declared in <stddef.h>
ptrdiff_t diff = &arr[4] - &arr[0];  // 4 (difference in elements, not bytes)
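
Because pointer arithmetic is scaled by the element size, a pair of pointers can describe a half-open range `[first, last)`. The helpers below (`sum_range` and `element_count`, both hypothetical names) are a small C++ sketch of that idiom.

```cpp
#include <cassert>
#include <cstddef>

// Sum the elements in the half-open range [first, last).
// Both pointers must point into (or one past the end of) the same array.
int sum_range(const int *first, const int *last) {
    assert(first <= last);
    int total = 0;
    for (const int *p = first; p != last; ++p) {
        total += *p;  // each ++p advances by one int, not one byte
    }
    return total;
}

// Distance between two pointers, measured in elements.
std::ptrdiff_t element_count(const int *first, const int *last) {
    return last - first;
}
```

A one-past-the-end pointer such as `arr + 5` may be formed and compared, but it must not be dereferenced.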

1.3 Dynamic Memory Management: malloc/free

#include <stdlib.h>

// Allocation
int *arr = malloc(100 * sizeof *arr);  // no cast needed in C (C++ would require one)
if (arr == NULL) {
    // handle allocation failure
    return -1;
}

// Usage
for (int i = 0; i < 100; i++) {
    arr[i] = i * i;
}

// Deallocation
free(arr);
arr = NULL;  // avoid dangling pointer

Common pitfalls:

| Problem | Cause | Consequence |
|---|---|---|
| Memory leak | malloc without free | Memory usage grows continuously |
| Dangling pointer | Using a pointer after free | Undefined behavior |
| Double free | Freeing the same address twice | Heap corruption/crash |
| Buffer overflow | Out-of-bounds write | Security vulnerability |
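
Two of these pitfalls can be reduced with small defensive helpers. The sketch below is written in C++ (so `malloc` results would need a cast at the call site); `free_and_null` and `checked_store` are hypothetical names, not standard library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: free the block and reset the caller's pointer, so a
// second call on the same variable is a harmless no-op instead of a double free.
void free_and_null(int **pp) {
    assert(pp != nullptr);
    std::free(*pp);   // free(NULL) is defined to do nothing
    *pp = nullptr;
}

// Hypothetical helper: bounds-checked write. Returns false instead of
// writing past the end of the buffer.
bool checked_store(int *buf, std::size_t len, std::size_t index, int value) {
    if (buf == nullptr || index >= len) {
        return false;
    }
    buf[index] = value;
    return true;
}
```

Resetting the pointer only protects that one variable; other copies of the same address still dangle after the free.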

1.4 Stack vs Heap

High addr ┌──────────────────┐
          │     Stack        │ ← local variables, auto-managed
          │  (grows downward)│    function call frames
          ├──────────────────┤
          │        ↓         │
          │  (unused space)  │
          │        ↑         │
          ├──────────────────┤
          │      Heap        │ ← malloc/free, manually managed
          │  (grows upward)  │    dynamic allocation
          ├──────────────────┤
          │    BSS segment   │ ← uninitialized global variables
          ├──────────────────┤
          │    Data segment  │ ← initialized global variables
          ├──────────────────┤
Low addr  │    Text segment  │ ← code (read-only)
          └──────────────────┘

| Property | Stack | Heap |
|---|---|---|
| Management | Automatic (compiler) | Manual (malloc/free) |
| Allocation speed | Very fast (move stack pointer) | Slower (search free blocks) |
| Size limit | Small (typically 1-8MB) | Large (limited by memory) |
| Lifetime | Freed on function return | Until explicit free |
| Fragmentation | None | Possible |
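
The lifetime row is the one that most often matters in practice. As a rough C++ illustration (function names are hypothetical), a stack value has to be returned by copy, while a heap block survives the call and becomes the caller's responsibility.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stack: `local` lives only until the function returns, so the function
// returns a copy of the value, never the address of `local`.
int square_on_stack(int n) {
    int local = n * n;
    return local;
}

// Heap: the block outlives the call; the caller owns it and must free() it.
int *squares_on_heap(std::size_t count) {
    assert(count > 0);
    int *out = static_cast<int *>(std::malloc(count * sizeof(int)));
    if (out == nullptr) {
        return nullptr;  // allocation failed
    }
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<int>(i * i);
    }
    return out;
}
```

A caller would use `int *sq = squares_on_heap(5);`, check for `nullptr`, and finish with `std::free(sq);`.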

2. Modern C++ Features

2.1 RAII (Resource Acquisition Is Initialization)

RAII binds a resource to an object's lifetime: the constructor acquires the resource and the destructor releases it, so cleanup runs automatically when the object goes out of scope, including when an exception propagates.

#include <cstdio>     // FILE, fopen, fclose
#include <stdexcept>  // std::runtime_error

class FileHandle {
    FILE* file_;
public:
    FileHandle(const char* path, const char* mode) 
        : file_(fopen(path, mode)) {
        if (!file_) throw std::runtime_error("Cannot open file");
    }

    ~FileHandle() {
        if (file_) fclose(file_);  // automatically closed on destruction
    }

    // Disable copying
    FileHandle(const FileHandle&) = delete;
    FileHandle& operator=(const FileHandle&) = delete;

    FILE* get() const { return file_; }
};

void process() {
    FileHandle fh("data.txt", "r");
    // use fh.get()
}  // leaving scope automatically calls destructor to close the file
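
The same pattern works for anything that needs a matching "undo" step, not only files. The sketch below uses a hypothetical `DepthGuard` class to show that the destructor also runs when the scope is left through an exception.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical RAII guard: increments a counter on entry and decrements it
// on exit, so the count is restored on every path out of the scope.
class DepthGuard {
    int& depth_;
public:
    explicit DepthGuard(int& depth) : depth_(depth) { ++depth_; }
    ~DepthGuard() { --depth_; }

    DepthGuard(const DepthGuard&) = delete;
    DepthGuard& operator=(const DepthGuard&) = delete;
};

// Returns the depth observed while the guard was alive. If `fail` is true it
// throws, and the guard's destructor still runs during stack unwinding.
int guarded_work(int& depth, bool fail) {
    DepthGuard guard(depth);
    assert(depth > 0);  // the guard is active here
    if (fail) {
        throw std::runtime_error("simulated failure");
    }
    return depth;
}
```

After either a normal return or a caught exception, the caller's counter is back at its starting value without any explicit cleanup code.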

2.2 Smart Pointers

C++11 introduced smart pointers for automatic heap memory management (std::make_unique followed in C++14).

std::unique_ptr: exclusive ownership

#include <memory>

auto ptr = std::make_unique<int>(42);
// *ptr == 42

// Ownership transfer
auto ptr2 = std::move(ptr);
// ptr is now nullptr

// Custom deleter (needs <cstdio>); fclose is only called if fopen returned non-null
auto file_ptr = std::unique_ptr<FILE, decltype(&fclose)>(
    fopen("data.txt", "r"), &fclose
);
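
Exclusive ownership shows up most clearly at function boundaries. In this sketch, `Widget`, `make_widget`, and `consume` are hypothetical names: a factory hands ownership to the caller, and a "sink" function takes it away again.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type used only for this illustration.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Factory: the caller receives sole ownership of the new Widget.
std::unique_ptr<Widget> make_widget(int id) {
    return std::make_unique<Widget>(id);
}

// Sink: taking the unique_ptr by value means the caller must std::move it in.
// The Widget is destroyed when `w` goes out of scope at the end of the call.
int consume(std::unique_ptr<Widget> w) {
    assert(w != nullptr);
    return w->id;
}
```

A call looks like `consume(std::move(w));`, after which `w` compares equal to `nullptr`.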

std::shared_ptr: shared ownership (reference counting)

auto sp1 = std::make_shared<std::vector<int>>(100);
auto sp2 = sp1;  // reference count = 2

sp1.reset();     // reference count = 1
sp2.reset();     // reference count = 0, object is destroyed
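
The reference count can be observed with `use_count()`, which is mainly useful for demonstrations and debugging. A minimal sketch, with `owners_during_copy` as a hypothetical helper name:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Returns the reference count seen while one extra owner exists.
long owners_during_copy(const std::shared_ptr<std::vector<int>>& sp) {
    assert(sp != nullptr);
    std::shared_ptr<std::vector<int>> extra = sp;  // copying adds one owner
    return extra.use_count();
}   // `extra` is destroyed here, which removes that owner again
```

Passing the `shared_ptr` by const reference, as done here, does not itself change the count; only the copy inside the function does.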

std::weak_ptr: breaks circular references

struct Node {
    std::shared_ptr<Node> next;
    std::weak_ptr<Node> prev;  // use weak_ptr to break the cycle
};
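
To read through a `weak_ptr`, it first has to be converted with `lock()`, which yields an empty `shared_ptr` if the object is already gone. The sketch below reuses the `Node` layout from above; `link` and `previous` are hypothetical helper names.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> next;
    std::weak_ptr<Node> prev;  // non-owning back link
};

// Link two nodes in both directions: a owns b, b only observes a.
void link(const std::shared_ptr<Node>& a, const std::shared_ptr<Node>& b) {
    assert(a && b);
    a->next = b;
    b->prev = a;
}

// Follow the back link if the previous node is still alive.
// lock() returns an empty shared_ptr once that node has been destroyed.
std::shared_ptr<Node> previous(const Node& n) {
    return n.prev.lock();
}
```

If `prev` were a `shared_ptr` instead, the two nodes would keep each other alive and neither would ever be destroyed.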

2.3 Templates

// Function template
template<typename T>
T max_val(T a, T b) {
    return (a > b) ? a : b;
}

// Class template
template<typename T, size_t N>
class Array {
    T data_[N];
public:
    T& operator[](size_t i) { return data_[i]; }
    constexpr size_t size() const { return N; }
};

// Usage
Array<int, 10> arr;
arr[0] = 42;

Template specialization:

// General version
template<typename T>
std::string to_string(T val) {
    return std::to_string(val);
}

// Specialization for bool
template<>
std::string to_string<bool>(bool val) {
    return val ? "true" : "false";
}
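
Which version the compiler picks depends only on the template argument. The following sketch uses hypothetical names (`type_name`, `describe`, `check_type_names`) to show the primary template acting as a fallback while full specializations handle specific types.

```cpp
#include <cassert>
#include <string>

// Primary template: used for any type without a more specific version.
template<typename T>
std::string type_name() {
    return "unknown";
}

// Full specializations for two concrete types.
template<>
std::string type_name<int>() {
    return "int";
}

template<>
std::string type_name<double>() {
    return "double";
}

// A function template whose argument type is deduced at the call site.
template<typename T>
std::string describe(T) {
    return type_name<T>();
}

// Quick check of which version is selected for each argument type.
void check_type_names() {
    assert(describe(42) == "int");        // T deduced as int
    assert(describe(3.14) == "double");   // T deduced as double
    assert(describe('c') == "unknown");   // char falls back to the primary template
}
```

Specializations must be declared before the first use that would instantiate the primary template for the same type.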

2.4 STL Containers

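| Container | Underlying Structure | Random Access | Insert/Delete | Lookup |
|---|---|---|---|---|
| vector | Dynamic array | \(O(1)\) | Tail amortized \(O(1)\), middle \(O(n)\) | \(O(n)\) |
| list | Doubly linked list | \(O(n)\) | \(O(1)\) given an iterator | \(O(n)\) |
| deque | Segmented array | \(O(1)\) | Front/back \(O(1)\), middle \(O(n)\) | \(O(n)\) |
| map | Balanced tree (typically red-black) | - | \(O(\log n)\) | \(O(\log n)\) |
| unordered_map | Hash table | - | Avg \(O(1)\), worst \(O(n)\) | Avg \(O(1)\), worst \(O(n)\) |
| set | Balanced tree (typically red-black) | - | \(O(\log n)\) | \(O(\log n)\) |

#include <iostream>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>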

// vector
std::vector<int> v = {1, 2, 3};
v.push_back(4);
v.emplace_back(5);  // in-place construction, avoids copying

// map (ordered)
std::map<std::string, int> scores;
scores["Alice"] = 95;
scores["Bob"] = 87;

// unordered_map (hash-based: average O(1) lookup, but no key ordering)
std::unordered_map<std::string, int> cache;
cache.insert({"key1", 100});
auto it = cache.find("key1");
if (it != cache.end()) {
    std::cout << it->second << std::endl;
}
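
The containers are often combined: a hash table for fast updates, then an ordered container when sorted output is needed. A small sketch with hypothetical function names (`count_words`, `sorted_keys`):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Count occurrences with a hash table: average O(1) per update.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];  // operator[] value-initializes a missing entry to 0
    }
    return counts;
}

// Copy into a std::map to get the keys back in sorted order.
std::vector<std::string> sorted_keys(const std::unordered_map<std::string, int>& counts) {
    std::map<std::string, int> ordered(counts.begin(), counts.end());
    std::vector<std::string> keys;
    keys.reserve(ordered.size());
    for (const auto& entry : ordered) {
        keys.push_back(entry.first);
    }
    assert(keys.size() == counts.size());
    return keys;
}
```

If sorted iteration is needed all the time, using `std::map` from the start avoids the extra copy at the cost of \(O(\log n)\) updates.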

2.5 Move Semantics

C++11 introduced rvalue references and move semantics to avoid unnecessary deep copies.

#include <algorithm>  // std::copy
#include <cstddef>    // size_t

class Buffer {
    int* data_;
    size_t size_;
public:
    // Constructor
    Buffer(size_t n) : data_(new int[n]), size_(n) {}

    // Copy constructor (deep copy, expensive)
    Buffer(const Buffer& other) : data_(new int[other.size_]), size_(other.size_) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Move constructor (steals resources, cheap)
    Buffer(Buffer&& other) noexcept 
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;  // nullify the source
        other.size_ = 0;
    }

    ~Buffer() { delete[] data_; }
};

Buffer create_buffer() {
    Buffer buf(1000000);
    return buf;  // NRVO usually elides the copy; otherwise the move constructor is used
}

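The essence of std::move: it is an unconditional cast of its argument to an rvalue reference. It moves nothing by itself; the actual transfer happens in the move constructor or move assignment operator that the cast selects.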

std::string s1 = "Hello, World!";
std::string s2 = std::move(s1);  // s1's contents are "stolen"
// s1 is now in a valid but unspecified state (typically empty)
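
For a class you write yourself, the moved-from state is whatever your move operations leave behind. The sketch below is a self-contained variant of the `Buffer` class with a move assignment operator added; the accessors `size()`, `data()`, and `operator[]` are additions for illustration, and copy assignment is deliberately left out (it is implicitly deleted once a move operation is declared).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {
    int* data_;
    std::size_t size_;
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}

    Buffer(const Buffer& other)
        : data_(new int[other.size_]), size_(other.size_) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    Buffer(Buffer&& other) noexcept
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Move assignment: release our own block, then take over the source's.
    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }
    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
};
```

Here the moved-from object is explicitly left empty (`nullptr`, size 0), so destroying it or assigning to it afterwards is safe.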

2.6 Lambda Expressions

// Basic syntax: [capture list](parameters) -> return type { body }
auto add = [](int a, int b) -> int { return a + b; };

// Capture modes
int x = 10;
auto by_value = [x]() { return x; };        // capture by value
auto by_ref = [&x]() { x++; return x; };    // capture by reference
auto all_val = [=]() { return x; };          // capture all by value
auto all_ref = [&]() { x++; };              // capture all by reference

// With STL algorithms
std::vector<int> v = {5, 3, 1, 4, 2};
std::sort(v.begin(), v.end(), [](int a, int b) {
    return a > b;  // descending order
});

// Generic lambda (C++14)
auto generic_add = [](auto a, auto b) { return a + b; };
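
The difference between the capture modes becomes visible when the captured variable changes after the lambda is created. The sketch below uses hypothetical names (`make_counter`, `capture_demo`) and also shows `mutable`, which lets a lambda modify its own by-value copy.

```cpp
#include <cassert>
#include <functional>

// Returns a counter closure. `count` is captured by value, and `mutable`
// lets the closure modify its own private copy between calls.
std::function<int()> make_counter(int start) {
    int count = start;
    return [count]() mutable { return count++; };
}

// A by-value capture is a snapshot, while a by-reference capture
// sees later changes to the original variable.
void capture_demo() {
    int x = 10;
    auto by_value = [x]() { return x; };
    auto by_ref = [&x]() { return x; };
    x = 20;
    assert(by_value() == 10);  // copy taken when the lambda was created
    assert(by_ref() == 20);    // reads the current value of x
}
```

Returning a lambda that captures a local by reference would leave a dangling reference, which is why `make_counter` captures by value.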

3. Memory Layout

3.1 C++ Object Memory Layout

class Base {
    int x;          // 4 bytes
    virtual void f(); // vtable pointer: 8 bytes
};
// sizeof(Base) = 16 on a typical 64-bit platform (8 vtable ptr + 4 int + 4 padding)

class Derived : public Base {
    int y;          // 4 bytes
    void f() override;
};
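// sizeof(Derived) = 16 with GCC/Clang on x86-64 (8 vptr + 4 x + 4 y; y reuses Base's tail padding).
// Layout is implementation-defined; other ABIs such as MSVC may report a larger size.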

Virtual function table (vtable) mechanism:

Derived object layout:
┌──────────────────┐
│ vptr → vtable    │ 8 bytes (points to virtual function table)
├──────────────────┤
│ Base::x          │ 4 bytes
├──────────────────┤
│ Derived::y       │ 4 bytes
└──────────────────┘

vtable (Derived):
┌──────────────────┐
│ &Derived::f      │  → Derived's overridden version
└──────────────────┘
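
The observable effect of the vtable is dynamic dispatch: a call through a base reference or pointer runs the override that belongs to the object's actual type. A minimal sketch with hypothetical types (`Shape`, `Triangle`, `Square`):

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;  // virtual destructor for safe deletion via a base pointer
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

struct Square : Shape {
    int sides() const override { return 4; }
};

// The call is resolved at run time, so the result depends on the dynamic
// type of the object, not on the static type `const Shape&`.
int count_sides(const Shape& s) {
    return s.sides();
}

void dispatch_demo() {
    Triangle t;
    const Shape& as_base = t;  // static type Shape, dynamic type Triangle
    assert(count_sides(as_base) == 3);
}
```

The standard does not require a vtable; it is simply how mainstream compilers implement virtual calls.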

3.2 Memory Alignment

struct Bad {
    char a;    // 1 byte + 3 padding
    int b;     // 4 bytes
    char c;    // 1 byte + 3 padding
};  // sizeof = 12 (assuming a 4-byte int aligned to 4 bytes)

struct Good {
    int b;     // 4 bytes
    char a;    // 1 byte
    char c;    // 1 byte + 2 padding
};  // sizeof = 8
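
The padding figures can be checked rather than memorized. The sketch below repeats the two structs and adds a hypothetical `padding_for` helper; the last `static_assert` reflects what mainstream ABIs do and is stated here as an assumption rather than a language guarantee.

```cpp
#include <cassert>
#include <cstddef>

struct Bad  { char a; int b; char c; };
struct Good { int b; char a; char c; };

// Padding needed to round `offset` up to a multiple of `alignment`.
std::size_t padding_for(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0);
    return (alignment - offset % alignment) % alignment;
}

// Each member sits at an offset that is a multiple of its alignment,
// and the struct size is a multiple of the struct's own alignment.
static_assert(offsetof(Bad, b) % alignof(int) == 0, "int member must be aligned");
static_assert(sizeof(Bad) % alignof(Bad) == 0, "size is a multiple of alignment");

// Assumption: expected on mainstream ABIs, where reordering saves padding.
static_assert(sizeof(Good) <= sizeof(Bad), "reordering should not grow the struct");
```

With a 4-byte alignment, `padding_for(1, 4)` gives the 3 padding bytes after `char a` in `Bad`, and `padding_for(6, 4)` gives the 2 trailing bytes in `Good`.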

4. Compilation Model

The C/C++ compilation process consists of four stages:

Source code (.c/.cpp)
    │
    ▼  Preprocessing
Expanded source code
    │  - #include file inclusion
    │  - #define macro substitution
    │  - #ifdef conditional compilation
    ▼  Compilation
Assembly code (.s)
    │  - Syntax analysis
    │  - Semantic analysis
    │  - Optimization
    ▼  Assembly
Object file (.o/.obj)
    │  - Machine code
    │  - Symbol table
    ▼  Linking
Executable (a.out/.exe)
    │  - Symbol resolution
    │  - Address relocation
    │  - Static linking vs dynamic linking (.so/.dll)

# GCC commands for each stage
gcc -E main.c -o main.i   # preprocessing
gcc -S main.i -o main.s   # compilation
gcc -c main.s -o main.o   # assembly
gcc main.o -o main         # linking

# All in one step
gcc -O2 main.c -o main

4.1 Static Linking vs Dynamic Linking

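| Property | Static Linking (.a/.lib) | Dynamic Linking (.so/.dll) |
|---|---|---|
| Link time | Build time (by the linker) | Load time or run time |
| File size | Larger (includes library code) | Smaller |
| Deployment | Self-contained, no external library dependencies | Requires shared libraries on the target system |
| Updates | Relinking needed | Replace the shared library |
| Memory | One copy of the library code per executable | Library code shared across processes |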

4.2 Header Files and Compilation Units

// math_utils.h (header file: declarations)
#pragma once  // prevent multiple inclusion (non-standard, but widely supported)
int add(int a, int b);
int multiply(int a, int b);

// math_utils.cpp (compilation unit: definitions)
#include "math_utils.h"
int add(int a, int b) { return a + b; }
int multiply(int a, int b) { return a * b; }

// main.cpp
#include "math_utils.h"
int main() {
    return add(3, 4);
}

# Compile separately, then link
g++ -c math_utils.cpp -o math_utils.o
g++ -c main.cpp -o main.o
g++ math_utils.o main.o -o main

5. Modern C++ Best Practices

| Principle | Old Approach | Modern Approach |
|---|---|---|
| Memory management | new/delete | make_unique/make_shared |
| Arrays | C arrays | std::vector / std::array |
| Strings | char* | std::string / std::string_view |
| Iteration | Index-based loops | Range-for / iterators |
| Callbacks | Function pointers | std::function / Lambda |
| Constants | #define | constexpr / const |
| Threading | pthread | std::thread / std::async |
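
A few of these rows can be combined in one short sketch. The names below (`kSampleCount`, `total`, `has_prefix`, `modern_idioms_demo`) are hypothetical, and `std::string` is used instead of `std::string_view` so that the example does not depend on C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t kSampleCount = 4;  // typed constant instead of a #define

// std::array carries its size in the type, so no separate length parameter.
int total(const std::array<int, kSampleCount>& samples) {
    int sum = 0;
    for (int s : samples) {  // range-for instead of an index-based loop
        sum += s;
    }
    return sum;
}

// std::string instead of char*: length and storage are managed for us.
bool has_prefix(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

void modern_idioms_demo() {
    const std::array<int, kSampleCount> samples = {1, 2, 3, 4};
    assert(total(samples) == 10);
    assert(has_prefix("unique_ptr", "unique"));
}
```

None of these replacements changes what the program computes; they mainly move size and lifetime bookkeeping from the programmer to the type system.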

