
Operating System Fundamentals

Overview

The operating system (OS) is the foundation layer of the robot software stack: it manages hardware resources, schedules tasks, and provides programming interfaces. In robotics, Linux is the most common OS for the main control computer, and real-time extensions such as PREEMPT_RT are what let it meet control-loop timing requirements.

Processes and Threads

Process

A process is the basic unit of resource allocation in the OS:

| Attribute | Description |
|---|---|
| Address Space | Independent virtual address space |
| Resources | Owns independent file descriptors, memory mappings, etc. |
| Isolation | Processes are isolated; a crash in one does not directly bring down the others |
| Overhead | Higher overhead for creation and context switching |
| Communication | Requires IPC (pipes, shared memory, sockets, etc.) |

Thread

A thread is the basic unit of CPU scheduling:

| Attribute | Description |
|---|---|
| Address Space | Shares the address space of its parent process |
| Resources | Shares process resources (files, memory) |
| Independence | Has its own stack, registers, and program counter |
| Overhead | Lower overhead for creation and context switching |
| Communication | Can directly read/write shared variables (requires synchronization) |
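
The difference between the two tables can be observed directly. The following is a minimal sketch, assuming a Linux host where the `fork` start method is available; the names `counter`, `bump`, `bump_in_thread`, and `bump_in_process` are illustrative and not part of any robot codebase. A thread modifies the parent's memory, while a child process only modifies its own copy.

```python
import multiprocessing as mp
import threading

# Module-level state: shared by threads, copied (not shared) into child processes.
counter = {"value": 0}


def bump():
    """Increment the counter in whatever address space this function runs in."""
    counter["value"] += 1


def bump_in_thread():
    """Run bump() in a thread of this process and return the parent's view."""
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return counter["value"]


def bump_in_process():
    """Run bump() in a forked child process and return the parent's view."""
    ctx = mp.get_context("fork")  # assumption: Linux/Unix host
    child = ctx.Process(target=bump)
    child.start()
    child.join()
    return counter["value"]
```

Calling `bump_in_thread()` changes the value seen by the caller, whereas `bump_in_process()` leaves it unchanged because the increment happened in the child's private address space. Getting the result back from the child would require one of the IPC mechanisms listed in the Process table.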

In the Context of ROS2

ROS2 Node Model:
+-- A node runs in its own process (default) or shares a process with other nodes (component mode)
+-- Process mode: Each node is an independent process, good isolation
|   $ ros2 run perception camera_node
|   $ ros2 run navigation planner_node
+-- Component mode: Multiple nodes are loaded into one container process, lower communication overhead
    $ ros2 run rclcpp_components component_container

ROS2 Best Practices

  • Use component mode to reduce inter-node communication overhead
  • Nodes within the same process can use zero-copy intra-process communication (it must be enabled per node, e.g. use_intra_process_comms in the rclcpp NodeOptions)
  • Place compute-intensive nodes in separate processes so that a stall or crash does not affect other nodes

Scheduling Algorithms

Linux Scheduler

Linux has long used CFS (Completely Fair Scheduler) as the default scheduler for normal tasks (kernels 6.6 and later replace it with EEVDF). Real-time and deadline policies take precedence over it:

| Scheduling Class | Policy | Priority | Use |
|---|---|---|---|
| SCHED_DEADLINE | EDF (Earliest Deadline First) | Highest | Hard real-time tasks |
| SCHED_FIFO | First-in-first-out real-time | High (1-99) | Real-time tasks |
| SCHED_RR | Round-robin real-time | High (1-99) | Real-time tasks |
| SCHED_OTHER | CFS fair scheduling | Normal (nice value) | Normal tasks |
| SCHED_IDLE | Runs only when idle | Lowest | Background tasks |
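
To see which row of the table a running task falls into, the policy can be queried from Python's standard `os` module. This is a small sketch for Linux only; the helper names `describe_scheduling` and `realtime_priority_range` are illustrative. `SCHED_DEADLINE` has no constant in the `os` module, so it would show up as an unknown policy number.

```python
import os

# Map the numeric policy returned by the kernel to a readable name.
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
}


def describe_scheduling(pid=0):
    """Return (policy_name, priority) for a task; pid 0 means the caller."""
    policy = os.sched_getscheduler(pid)
    # The kernel may OR in the reset-on-fork flag; strip it before the lookup.
    policy &= ~os.SCHED_RESET_ON_FORK
    priority = os.sched_getparam(pid).sched_priority
    return POLICY_NAMES.get(policy, f"unknown({policy})"), priority


def realtime_priority_range(policy=os.SCHED_FIFO):
    """Return the (min, max) static priority the kernel accepts for a policy."""
    return os.sched_get_priority_min(policy), os.sched_get_priority_max(policy)
```

A normal shell process is expected to report `SCHED_OTHER` with priority 0, while a node started under `chrt -f 80` would report `SCHED_FIFO` with priority 80.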

Real-Time Scheduling Configuration

# View process scheduling policy and priority
chrt -p <pid>

# Set process to SCHED_FIFO with priority 80
sudo chrt -f -p 80 <pid>

# Specify scheduling policy at launch
sudo chrt -f 80 ./motor_control_node

# Set CPU affinity (bind to specific core)
taskset -c 3 ./motor_control_node

# Setting the scheduling policy in Python
import os

# Request SCHED_FIFO at priority 80 for the caller (pid 0 = calling thread).
# Requires root or CAP_SYS_NICE; otherwise PermissionError is raised.
param = os.sched_param(80)
try:
    os.sched_setscheduler(0, os.SCHED_FIFO, param)
except PermissionError:
    print("SCHED_FIFO requires root or CAP_SYS_NICE")

Scheduling Latency

\[ t_{\text{response}} = t_{\text{interrupt}} + t_{\text{schedule}} + t_{\text{context\_switch}} + t_{\text{execution}} \]

| System | Typical Scheduling Latency | Worst Case |
|---|---|---|
| Standard Linux | ~100us | ~10ms |
| PREEMPT_RT Linux | ~10us | ~100us |
| FreeRTOS | <1us | ~10us |
| Bare-metal | <100ns | ~1us |
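
Figures like those in the table come from measuring how late a periodic task wakes up relative to its deadline. The sketch below imitates that idea in pure Python; it is not a substitute for `cyclictest`, since interpreter overhead inflates the numbers, and the function name `measure_wakeup_latency` is illustrative.

```python
import time


def measure_wakeup_latency(interval_s=0.001, loops=200):
    """Return (min, avg, max) wake-up lateness in microseconds.

    The loop sleeps until an absolute deadline, then records how far past
    the deadline it actually resumed. Deadlines advance by a fixed interval
    so that one late wake-up does not shift all later ones.
    """
    interval_ns = int(interval_s * 1e9)
    latencies_us = []
    next_deadline = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining_ns = next_deadline - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        # Lateness: time between the deadline and the moment we resumed.
        latencies_us.append((time.monotonic_ns() - next_deadline) / 1000.0)
        next_deadline += interval_ns
    average = sum(latencies_us) / len(latencies_us)
    return min(latencies_us), average, max(latencies_us)
```

Running it once as a normal task and once under `chrt -f 80` gives a rough feel for how the scheduling policy changes the maximum value, which is the number that matters for control loops.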

Linux Kernel Architecture

Kernel Subsystems

graph TB
    subgraph User Space
        APP[Applications<br>ROS2 Nodes]
        LIB[C Library glibc]
    end

    subgraph Kernel Space
        SCI[System Call Interface]

        subgraph Core Subsystems
            PM[Process Management]
            MM[Memory Management]
            VFS[Virtual File System<br>VFS]
            NET[Network Subsystem<br>Network Stack]
            IPC_K[Inter-Process Communication<br>IPC]
        end

        DD[Device Drivers]
        HAL_K[Hardware Abstraction Layer<br>HAL]
    end

    APP --> LIB
    LIB --> SCI
    SCI --> PM
    SCI --> MM
    SCI --> VFS
    SCI --> NET
    SCI --> IPC_K
    PM --> DD
    MM --> DD
    VFS --> DD
    DD --> HAL_K

Kernel Modules

The Linux kernel is modular, and drivers can be dynamically loaded:

# View loaded kernel modules
lsmod

# Load modules
sudo modprobe v4l2_common   # V4L2 common helper module (used by video drivers)
sudo modprobe can_raw        # CAN bus raw socket

# Unload a module
sudo rmmod can_raw

# View module information
modinfo nvidia   # NVIDIA GPU driver info
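
The information shown by `lsmod` comes from `/proc/modules`, so a startup script can check for a required driver without spawning a subprocess. The following is a sketch assuming the standard `/proc/modules` layout (name, size, reference count, dependent modules, state, address); `parse_proc_modules` and `is_module_loaded` are illustrative helper names.

```python
from pathlib import Path


def parse_proc_modules(text):
    """Parse /proc/modules content into {name: {"size": int, "used_by": [...]}}."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        name, size, _refcount, deps = fields[:4]
        # The fourth field is "-" or a comma-terminated list such as "can_raw,".
        used_by = [] if deps == "-" else [d for d in deps.split(",") if d]
        modules[name] = {"size": int(size), "used_by": used_by}
    return modules


def is_module_loaded(name, proc_modules="/proc/modules"):
    """Return True if the named module is listed (names use underscores)."""
    path = Path(proc_modules)
    if not path.exists():
        return False
    return name in parse_proc_modules(path.read_text())
```

For example, `is_module_loaded("can_raw")` could gate the start of a CAN-based motor driver node.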

Device Drivers

Device Types

| Type | Interface | Examples | Access Method |
|---|---|---|---|
| Character Device | Stream read/write | Serial port, IMU, GPIO | /dev/ttyUSB0 |
| Block Device | Random read/write | SSD, eMMC | /dev/sda |
| Network Device | Socket | Ethernet, WiFi | eth0, wlan0 |

/dev Device Files

# Common robot device files
/dev/ttyUSB0     # USB serial (MCU communication)
/dev/ttyTHS0     # Jetson native serial port
/dev/video0      # USB camera (V4L2)
/dev/i2c-1       # I2C bus
/dev/spidev0.0   # SPI device
can0             # CAN bus (SocketCAN network interface, not a /dev node; inspect with "ip link show can0")
/dev/input/js0   # Game controller
/dev/nvme0n1     # NVMe SSD
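
Whether a given `/dev` entry is a character or block device can be checked from its file mode. This is a minimal sketch using only the standard library; `device_kind` and `device_numbers` are illustrative names.

```python
import os
import stat


def device_kind(path):
    """Classify a path as a character device, block device, or neither."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character"
    if stat.S_ISBLK(mode):
        return "block"
    return "not a device"


def device_numbers(path):
    """Return the (major, minor) numbers the kernel uses to pick the driver."""
    rdev = os.stat(path).st_rdev
    return os.major(rdev), os.minor(rdev)
```

On a robot, `device_kind("/dev/ttyUSB0")` is expected to return `"character"` and `device_kind("/dev/nvme0n1")` to return `"block"`, matching the device type table above.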

sysfs File System

sysfs provides a user-space interface for kernel objects:

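# GPIO control (legacy sysfs interface, deprecated since kernel 4.8 in favor of the /dev/gpiochipN character device; requires root)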
echo 17 > /sys/class/gpio/export
echo out > /sys/class/gpio/gpio17/direction
echo 1 > /sys/class/gpio/gpio17/value

# View CPU temperature
cat /sys/class/thermal/thermal_zone0/temp
# Output: 45000 (millidegrees Celsius, i.e. 45.0 °C)

# View CPU frequency
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
# Output: 2400000 (kHz, i.e. 2.4 GHz)

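# Set Jetson power mode (mode IDs differ between Jetson modules; check the current mode with "sudo nvpmodel -q")
sudo nvpmodel -m 0  # Maximum performance on many modules
sudo nvpmodel -m 1  # 15W mode on some modules

# View GPU utilization (Jetson; the sysfs path varies by module and L4T release)
cat /sys/devices/gpu.0/load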
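
The same sysfs files can be read from a monitoring node. The sketch below assumes the units used by the thermal and cpufreq files (millidegrees Celsius and kHz); the helper names are illustrative, and a missing file is reported as `None` because not every board exposes every path.

```python
from pathlib import Path


def read_sysfs_int(path):
    """Read a single integer from a sysfs file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def millidegrees_to_celsius(raw):
    """thermal_zone*/temp reports millidegrees Celsius."""
    return raw / 1000.0


def khz_to_ghz(raw):
    """cpufreq/scaling_cur_freq reports kHz."""
    return raw / 1_000_000.0


def cpu_temperature_celsius(zone=0):
    """Return the temperature of a thermal zone in Celsius, or None."""
    raw = read_sysfs_int(f"/sys/class/thermal/thermal_zone{zone}/temp")
    return None if raw is None else millidegrees_to_celsius(raw)
```

A watchdog could poll `cpu_temperature_celsius()` once per second and throttle perception workloads when the value crosses a board-specific threshold.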

V4L2 Camera Driver

# List video devices
v4l2-ctl --list-devices

# View supported formats
v4l2-ctl -d /dev/video0 --list-formats-ext

# Set camera parameters
v4l2-ctl -d /dev/video0 \
    --set-fmt-video=width=1920,height=1080,pixelformat=MJPG \
    --set-parm=30  # 30fps
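
# Reading a V4L2 camera in Python (requires opencv-python)
import cv2

def process_frame(frame):
    """Placeholder: replace with real perception code."""
    pass

cap = cv2.VideoCapture(0, cv2.CAP_V4L2)  # index 0 = /dev/video0
if not cap.isOpened():
    raise RuntimeError("Cannot open /dev/video0")

# Request MJPG at 1920x1080, 30 fps (the driver may pick the closest supported mode)
cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
cap.set(cv2.CAP_PROP_FPS, 30)

try:
    while True:
        ret, frame = cap.read()
        if not ret:
            break  # camera disconnected or read error
        process_frame(frame)
finally:
    cap.release()  # always free the device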

Real-Time Patch (PREEMPT_RT)

Why Real-Time Linux Is Needed

Issues with the standard Linux kernel:

  • Many kernel code paths are non-preemptible, for example while a spinlock is held for a long time
  • Interrupt handlers run in hard interrupt context, where even high-priority tasks cannot preempt them
  • Scheduling latency is non-deterministic (occasional ms-level latency spikes)

Key PREEMPT_RT Improvements

| Improvement | Description | Effect |
|---|---|---|
| Fully preemptible kernel | Nearly all kernel code can be preempted | Reduced scheduling latency |
| Threaded interrupts | Hard interrupts converted to kernel threads | Can be preempted by higher-priority tasks |
| Spinlocks to mutexes | Changed to sleepable locks | Reduced non-preemptible time |
| High-resolution timers | hrtimer replaces jiffies | Nanosecond-level timing precision |

Installation and Configuration

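# Install an RT kernel
# Debian (x86_64): packaged PREEMPT_RT kernel
sudo apt install linux-image-rt-amd64
# Ubuntu 22.04+: the real-time kernel is delivered through Ubuntu Pro (machine must be attached first)
sudo pro enable realtime-kernel
# Jetson: compile a custom RT kernel
# Download NVIDIA's L4T kernel source and apply the PREEMPT_RT patch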

# Verify RT kernel
uname -a
# Output should include "PREEMPT_RT"

# Test real-time performance
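sudo cyclictest -p 80 -t 4 -n -m -l 1000000
# -p 80: priority 80
# -t 4: 4 threads
# -n: use clock_nanosleep (already the default in recent rt-tests releases, where the flag can be omitted)
# -m: lock memory (mlockall)
# -l 1000000: number of measurement loops
# Result: Min/Avg/Max latency (us); the Max column is the figure to watch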
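
A control node can also refuse to start on a non-RT kernel. The sketch below checks two indicators: the `/sys/kernel/realtime` flag exposed by PREEMPT_RT kernels, and the kernel version string. Treat both as heuristics; the function names are illustrative.

```python
import platform
from pathlib import Path


def version_indicates_rt(version_string):
    """Return True if a kernel version string advertises PREEMPT_RT.

    Newer kernels print "PREEMPT_RT"; some older RT kernels print "PREEMPT RT".
    """
    return "PREEMPT_RT" in version_string or "PREEMPT RT" in version_string


def is_rt_kernel():
    """Best-effort check that the running kernel is a PREEMPT_RT build."""
    try:
        if Path("/sys/kernel/realtime").read_text().strip() == "1":
            return True
    except OSError:
        pass  # file absent on non-RT kernels
    # platform.version() returns the same build string as "uname -v".
    return version_indicates_rt(platform.version())
```

Note that a string such as `PREEMPT_DYNAMIC` does not match: it denotes the standard preemption model, not the real-time patch.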

Real-Time Performance Tuning

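# 1. Isolate CPU cores for real-time tasks
# Append to the kernel command line. The location depends on the bootloader:
# GRUB_CMDLINE_LINUX in /etc/default/grub, /boot/extlinux/extlinux.conf on Jetson,
# /boot/cmdline.txt on Raspberry Pi
isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3

# 2. Pin the CPU frequency governor to "performance" (no frequency scaling)
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# 3. Lock memory (prevent page faults caused by swapping)
# In your C/C++ program, early in main():
mlockall(MCL_CURRENT | MCL_FUTURE);

# 4. Set real-time thread priority
# In ROS2 (the caller needs root or a suitable rtprio limit):
# ros2 run --prefix "chrt -f 80" my_package my_node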
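
Steps 1 and 3 can also be applied from inside a Python node. This is a sketch under stated assumptions: the `MCL_CURRENT` and `MCL_FUTURE` values are those used on x86 and ARM Linux (other architectures may differ), `mlockall` is called through `ctypes` because the standard library has no wrapper for it, and the function names are illustrative.

```python
import ctypes
import os

# Assumption: flag values from <sys/mman.h> on x86/ARM Linux.
MCL_CURRENT = 1
MCL_FUTURE = 2


def pin_to_cpu(cpu, pid=0):
    """Restrict a task (pid 0 = caller) to one CPU and return the new mask."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


def lock_all_memory():
    """Call mlockall(); return (ok, error_message).

    Fails without root or a sufficient RLIMIT_MEMLOCK, in which case the
    process continues with pageable memory.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        return False, os.strerror(ctypes.get_errno())
    return True, ""
```

Pinning to a core listed in `isolcpus` keeps the node away from general-purpose tasks, since the scheduler does not place other work on isolated cores by default.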

Boot Process

Linux Boot Flow

graph LR
    A[Power On] --> B[Bootloader<br>U-Boot/UEFI]
    B --> C[Kernel Loading<br>vmlinuz]
    C --> D[initramfs<br>Temporary Root FS]
    D --> E[init/systemd<br>PID 1]
    E --> F[System Services Start]
    F --> G[User Space Ready]

Robot Auto-Start Configuration

# Create a boot-time auto-start service using systemd
sudo tee /etc/systemd/system/robot.service << 'EOF'
[Unit]
Description=Robot ROS2 Launch
After=network.target

[Service]
Type=simple
User=robot
WorkingDirectory=/home/robot/ros2_ws
ExecStart=/bin/bash -c "source install/setup.bash && ros2 launch robot_bringup robot.launch.py"
Restart=always
RestartSec=5

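# Real-time scheduling (inherited by every process the launch file starts)
# Nice only affects non-real-time policies and is ignored under fifo
Nice=-10
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=50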

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable robot.service
sudo systemctl start robot.service

Boot Time Optimization

| Optimization Method | Time Saved | Description |
|---|---|---|
| Streamline initramfs | 2-5s | Keep only necessary drivers |
| Parallel service startup | 3-10s | systemd parallelization |
| Disable unnecessary services | 2-5s | Turn off Bluetooth, printing, etc. |
| Compress kernel | 1-2s | Use LZ4 compression |
| Use lightweight distro | 5-10s | e.g., Ubuntu Core |

IPC (Inter-Process Communication)

Linux IPC Mechanisms

| Mechanism | Latency | Throughput | Suitable Scenario |
|---|---|---|---|
| Pipe | Medium | Medium | Parent-child process communication |
| Message Queue | Medium | Medium | Structured message passing |
| Shared Memory | Low | High | Large data sharing |
| Signal | Low | Low | Simple notifications |
| Unix Socket | Medium | High | Local process communication |
| TCP/UDP Socket | High | High | Cross-machine communication |
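
As an illustration of the shared-memory row, the sketch below lets a child process write into a POSIX shared memory segment that the parent then reads without any copy through a pipe or socket. It assumes Linux with the `fork` start method and Python 3.8 or later; `share_bytes` and `_writer` are illustrative names, and a real system would add synchronization (for example a semaphore) around the buffer.

```python
import multiprocessing as mp
from multiprocessing import shared_memory


def _writer(name, payload):
    """Child process: attach to the segment by name and write the payload."""
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[:len(payload)] = payload
    shm.close()


def share_bytes(payload):
    """Create a segment, let a child fill it, and return what the parent sees."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        ctx = mp.get_context("fork")  # assumption: Linux/Unix host
        child = ctx.Process(target=_writer, args=(shm.name, payload))
        child.start()
        child.join()
        return bytes(shm.buf[:len(payload)])
    finally:
        shm.close()
        shm.unlink()  # remove the segment from /dev/shm
```

Only the segment name crosses the process boundary; the payload itself is written once into memory that both processes map, which is why this mechanism suits large data such as images or point clouds.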

ROS2 DDS Communication

ROS2 uses DDS (Data Distribution Service) as its communication middleware:

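ROS2 Communication Paths:
+-- Within the same process: Zero-copy (pointer passing via rclcpp intra-process communication, bypassing DDS)
+-- Same host: Shared memory (e.g., Fast-DDS SHM transport)
+-- Cross-host: UDP (multicast for discovery, unicast for data by default)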

See: Real-Time Systems for more on real-time deployment.

Common Linux Commands

System Monitoring

# CPU and memory usage
htop                    # Interactive process monitor
top -H -p <pid>         # View threads of a process

# System resources
free -h                 # Memory usage
df -h                   # Disk usage
nvidia-smi              # GPU usage (use jtop for Jetson)
sudo jtop               # Jetson system monitor

# Real-time analysis
trace-cmd record -p function_graph -l schedule
kernelshark             # Graphical kernel trace analysis

Performance Profiling

# CPU performance profiling
sudo perf record -g ./robot_node
sudo perf report

# System call tracing
strace -f -e trace=read,write ./robot_node

# I/O analysis
iostat -x 1             # I/O statistics
iotop                   # I/O process monitor
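
Context-switch counts, which relate to the process and thread overhead discussed earlier, are also available from inside a program through `getrusage`. The following is a small sketch using the standard `resource` module; `context_switches` and `switches_during` are illustrative names.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context switches of this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def switches_during(func):
    """Run func() and return the context switches it added, as a pair."""
    voluntary_before, involuntary_before = context_switches()
    func()
    voluntary_after, involuntary_after = context_switches()
    return (voluntary_after - voluntary_before,
            involuntary_after - involuntary_before)


if __name__ == "__main__":
    # Sleeping gives up the CPU, which normally counts as a voluntary switch.
    print(switches_during(lambda: time.sleep(0.01)))
```

A high involuntary count for a control loop suggests it is being preempted by other tasks, which is a hint to revisit its scheduling policy or CPU affinity.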

Summary

  1. Processes provide isolation, threads provide lightweight shared-memory concurrency; ROS2 component mode reduces communication overhead when process isolation is not required
  2. Linux CFS is suitable for general tasks; SCHED_FIFO is for real-time control
  3. Device drivers provide user-space interfaces through /dev and sysfs
  4. PREEMPT_RT reduces Linux worst-case latency from ~10ms to ~100us
  5. systemd manages auto-start of robot services
  6. Proper use of CPU isolation and memory locking improves real-time performance
