Unmanned Aerial Vehicles (UAVs)

Overview

Unmanned Aerial Vehicles (UAVs) are an important carrier of embodied intelligence in three-dimensional space. From agricultural crop spraying and logistics delivery to aerial cinematography and military reconnaissance, drones are transforming multiple industries. Autonomous drones tightly integrate perception, planning, and control: they must make decisions in real time while moving at high speed in three dimensions.


UAV Classification

Rotorcraft, Fixed-Wing, and VTOL

| Type | Representative | Advantages | Disadvantages | Typical Applications |
| --- | --- | --- | --- | --- |
| Multirotor | DJI Mavic, Crazyflie | Hover, low-speed maneuvering, mechanically simple | Short endurance, low efficiency | Aerial photography, inspection |
| Fixed-wing | senseFly eBee | Long endurance, high speed, efficient | Cannot hover, needs runway/catapult | Mapping, long-range inspection |
| VTOL | Wingtra | Combines hover and cruise | Mechanically complex, difficult transition | Logistics, long-range inspection |
| Single-rotor | Traditional helicopter | Heavy payload, long endurance | Mechanically complex, high vibration | Agricultural spraying |

Quadrotor as Research Platform

Quadrotors are the most commonly used research platform because:

  • Simple mechanical structure (4 motors + propellers)
  • Analytically derivable dynamics model
  • Easy miniaturization (indoor flight)
  • Rich open-source ecosystem

Quadrotor Dynamics

Coordinate System Definitions

  • World frame \(\{W\}\): NED (North-East-Down) or ENU (East-North-Up). The equations below use the z-up convention, i.e. ENU.
  • Body frame \(\{B\}\): Origin at center of mass, \(x\) forward, \(z\) upward

Rotation matrix \(R \in SO(3)\) transforms from body frame to world frame.

Newton-Euler Equations

Translational equation:

\[ m\ddot{\mathbf{p}} = \begin{bmatrix} 0 \\ 0 \\ -mg \end{bmatrix} + R \begin{bmatrix} 0 \\ 0 \\ T \end{bmatrix} \]

where \(\mathbf{p} = [x, y, z]^T\) is the position in world frame, \(T = \sum_{i=1}^{4} f_i\) is total thrust.

Rotational equation:

\[ J\dot{\boldsymbol{\omega}} = -\boldsymbol{\omega} \times J\boldsymbol{\omega} + \boldsymbol{\tau} \]

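where \(J\) is the inertia tensor, \(\boldsymbol{\omega}\) is the angular velocity expressed in the body frame, and \(\boldsymbol{\tau} = [\tau_\phi, \tau_\theta, \tau_\psi]^T\) is the total torque about the body axes.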
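The two equations above can be evaluated directly as a state derivative. The following is a minimal sketch, assuming a z-up world frame and NumPy; the function names and the numerical values of `m` and `J` are illustrative, not taken from any particular vehicle. The attitude kinematics \(\dot{R} = R\,\hat{\boldsymbol{\omega}}\) is added so the state is complete.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def quadrotor_derivatives(v, R, omega, T, tau, m, J, g=9.81):
    """Return (p_dot, v_dot, R_dot, omega_dot) from the Newton-Euler equations."""
    e3 = np.array([0.0, 0.0, 1.0])
    p_dot = v
    # Translational equation divided by m: gravity plus rotated body thrust.
    v_dot = -g * e3 + (T / m) * (R @ e3)
    # Attitude kinematics for a body-frame angular velocity.
    R_dot = R @ hat(omega)
    # Rotational equation solved for the angular acceleration.
    omega_dot = np.linalg.solve(J, tau - np.cross(omega, J @ omega))
    return p_dot, v_dot, R_dot, omega_dot

# Illustrative (hypothetical) parameters.
m = 1.0
J = np.diag([0.01, 0.01, 0.02])

# Hover check: level attitude, thrust equal to weight, zero torque.
_, v_dot, _, omega_dot = quadrotor_derivatives(
    v=np.zeros(3), R=np.eye(3), omega=np.zeros(3),
    T=m * 9.81, tau=np.zeros(3), m=m, J=J)
```

In hover both `v_dot` and `omega_dot` come out as zero vectors, which is a quick sanity check on the sign conventions.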

Thrust and Torque Allocation

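Each of the four motors produces a thrust \(f_i\) and a reaction torque \(\tau_i\), both proportional to the square of its rotational speed \(\omega_i\) (a scalar rotor speed, not to be confused with the body angular velocity \(\boldsymbol{\omega}\)):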

\[ f_i = k_f \omega_i^2, \quad \tau_i = k_m \omega_i^2 \]

Total thrust and torques:

\[ \begin{bmatrix} T \\ \tau_\phi \\ \tau_\theta \\ \tau_\psi \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & -L & 0 & L \\ L & 0 & -L & 0 \\ -k_m/k_f & k_m/k_f & -k_m/k_f & k_m/k_f \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \end{bmatrix} \]

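where \(L\) is the distance from each motor to the center of mass. This allocation matrix is invertible (for \(L \neq 0\) and \(k_m \neq 0\)), so the individual thrusts \(f_i\), and from them the motor speeds \(\omega_i = \sqrt{f_i / k_f}\), can be solved from the desired thrust and torques.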
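As a rough illustration of inverting the allocation, the sketch below builds the matrix above and solves for the motor speeds. The values of `L_arm`, `k_f` and `k_m` are hypothetical, and clipping negative thrusts to zero is a simplification added here, not part of the original derivation.

```python
import numpy as np

def allocation_matrix(L, k_f, k_m):
    """Map per-motor thrusts [f1..f4] to [T, tau_phi, tau_theta, tau_psi]."""
    c = k_m / k_f
    return np.array([[1.0, 1.0, 1.0, 1.0],
                     [0.0, -L, 0.0, L],
                     [L, 0.0, -L, 0.0],
                     [-c, c, -c, c]])

def motor_speeds(T, tau, L, k_f, k_m):
    """Solve for motor speeds (rad/s) from desired total thrust and torques."""
    A = allocation_matrix(L, k_f, k_m)
    f = np.linalg.solve(A, np.array([T, tau[0], tau[1], tau[2]]))
    f = np.clip(f, 0.0, None)   # assumption: rotors cannot produce negative thrust
    return np.sqrt(f / k_f)     # invert f_i = k_f * omega_i^2

# Hypothetical parameters for illustration only.
L_arm, k_f, k_m = 0.1, 1e-6, 1e-8
speeds = motor_speeds(T=4.0, tau=np.zeros(3), L=L_arm, k_f=k_f, k_m=k_m)
```

With zero desired torque the four thrusts are equal (1 N each here), so all four motors spin at the same speed.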

Differential Flatness: The quadrotor system is differentially flat: all states and inputs can be expressed as functions of the flat outputs \([x, y, z, \psi]^T\) (position + yaw angle) and their time derivatives. This means that planning a sufficiently smooth trajectory for position and yaw alone is enough to derive the complete state and the control inputs.
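To make the flatness property concrete, the sketch below recovers the attitude \(R\) and total thrust \(T\) from a desired acceleration and yaw, using a common construction (thrust direction gives the body \(z\) axis, yaw fixes the heading). It assumes a z-up world frame; the function name is illustrative, and body rates and torques (which need jerk and snap) are left out.

```python
import numpy as np

def flat_to_attitude_thrust(acc, yaw, m, g=9.81):
    """Recover (R, T) from desired world acceleration and yaw (z-up frame)."""
    e3 = np.array([0.0, 0.0, 1.0])
    t = np.asarray(acc, dtype=float) + g * e3   # acceleration the thrust must supply
    T = m * np.linalg.norm(t)                   # total thrust magnitude
    z_b = t / np.linalg.norm(t)                 # body z axis points along the thrust
    x_c = np.array([np.cos(yaw), np.sin(yaw), 0.0])  # heading direction from yaw
    y_b = np.cross(z_b, x_c)
    y_b /= np.linalg.norm(y_b)                  # singular only if z_b is parallel to x_c
    x_b = np.cross(y_b, z_b)
    R = np.column_stack([x_b, y_b, z_b])        # body-to-world rotation
    return R, T

R_hover, T_hover = flat_to_attitude_thrust(acc=[0.0, 0.0, 0.0], yaw=0.0, m=1.0)
R_fwd, T_fwd = flat_to_attitude_thrust(acc=[2.0, 0.0, 0.0], yaw=0.3, m=1.0)
```

For hover at zero yaw this returns the identity rotation and a thrust equal to the weight; for a forward acceleration the body \(z\) axis tilts toward the direction of motion.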


Control Architecture

PX4 / ArduPilot Flight Stack

graph TB
    subgraph Ground_Station["Ground Station"]
        GCS[QGroundControl / Mission Planner]
    end

    subgraph Companion_Computer["Companion Computer"]
        COMP[Jetson / RPI / NUC] --> VIO[Visual Odometry]
        COMP --> DET[Object Detection]
        COMP --> PLAN[Path Planning]
    end

    subgraph Flight_Controller["Flight Controller PX4/ArduPilot"]
        POS[Position Controller<br/>PID] --> ATT[Attitude Controller<br/>PID / Quaternion]
        ATT --> RATE[Rate Controller<br/>PID]
        RATE --> MIX[Motor Mixer]
        MIX --> ESC[ESC]

        EKF[Extended Kalman Filter<br/>State Estimation] --> POS

        IMU[IMU] --> EKF
        BARO[Barometer] --> EKF
        GPS[GPS] --> EKF
        MAG[Magnetometer] --> EKF
    end

    GCS -->|MAVLink| COMP
    COMP -->|MAVLink| POS
    VIO -->|Pose| EKF
    ESC --> M1[Motors 1-4]

Cascaded PID Controller

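PX4's default multicopter controller is a cascade of P/PID loops, summarized here as three layers (PX4 itself further splits the outer layer into a position loop and a velocity loop):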

Outer loop — Position control:

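\[ \mathbf{a}_{cmd} = K_p^{pos}(\mathbf{p}_{ref} - \mathbf{p}) + K_d^{pos}(\mathbf{v}_{ref} - \mathbf{v}) + K_i^{pos} \int (\mathbf{p}_{ref} - \mathbf{p}) dt + g\,\mathbf{e}_3 \]

where \(\mathbf{e}_3 = [0, 0, 1]^T\) and the term \(g\,\mathbf{e}_3\) compensates gravity in the z-up world frame, so that the desired thrust vector is \(m\,\mathbf{a}_{cmd}\).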

Middle loop — Attitude control: Extract desired attitude from desired acceleration, then compute desired angular velocity.

Inner loop — Rate control:

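\[ \boldsymbol{\tau}_{cmd} = K_p^{rate}(\boldsymbol{\omega}_{ref} - \boldsymbol{\omega}) - K_d^{rate}\dot{\boldsymbol{\omega}} + K_i^{rate}\int(\boldsymbol{\omega}_{ref} - \boldsymbol{\omega})dt \]

The derivative term acts on the measured angular acceleration with a negative sign, so it damps the motion without amplifying steps in the rate setpoint.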
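The inner loop can be sketched in a few lines. This is an illustrative implementation, not PX4's code: the class name, gains, and inertia values are hypothetical, the derivative is taken on the measured rate by finite differences and applied with a negative sign (damping), and there is no filtering, saturation, or anti-windup.

```python
import numpy as np

class RatePID:
    """PID on body rates; the D term uses the measured rate (damping)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = np.zeros(3)
        self.prev_omega = None

    def update(self, omega_ref, omega):
        err = omega_ref - omega
        self.integral += err * self.dt
        if self.prev_omega is None:
            omega_dot = np.zeros(3)          # no derivative on the first call
        else:
            omega_dot = (omega - self.prev_omega) / self.dt
        self.prev_omega = omega.copy()
        return self.kp * err + self.ki * self.integral - self.kd * omega_dot

# Closed-loop check against the rotational dynamics (hypothetical values).
J = np.diag([0.01, 0.01, 0.02])
dt = 0.001
pid = RatePID(kp=0.15, ki=0.2, kd=0.0005, dt=dt)
omega = np.zeros(3)
omega_ref = np.array([1.0, -0.5, 0.2])
for _ in range(5000):                        # 5 s of simulated time
    tau = pid.update(omega_ref, omega)
    omega_dot = np.linalg.solve(J, tau - np.cross(omega, J @ omega))
    omega = omega + omega_dot * dt           # explicit Euler integration
```

After the simulated five seconds the body rates settle close to `omega_ref`; the integral term absorbs the constant gyroscopic coupling torque.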

Advanced Control Methods

Model Predictive Control (MPC): Optimizes control sequences within a finite time horizon, handling constraints:

\[ \min_{u_{0:N-1}} \sum_{k=0}^{N} \|\mathbf{x}_k - \mathbf{x}_{ref}\|_Q^2 + \sum_{k=0}^{N-1} \|\mathbf{u}_k\|_R^2 \]
\[ \text{s.t.} \quad \mathbf{x}_{k+1} = f(\mathbf{x}_k, \mathbf{u}_k), \quad \mathbf{u}_{min} \leq \mathbf{u}_k \leq \mathbf{u}_{max} \]
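A small numerical instance may help. The sketch below applies the formulation to a 1D double integrator (for example an altitude axis) with a box constraint on the input, and solves each horizon with projected gradient descent so that only NumPy is needed. The model, weights, bounds, and solver choice are assumptions made for illustration; practical UAV MPC typically uses a nonlinear model and a dedicated QP or NLP solver. The state cost is summed over \(k = 1..N\) because the \(k = 0\) term does not depend on the inputs.

```python
import numpy as np

def build_prediction(A, B, N):
    """Stack x_1..x_N = Sx @ x0 + Su @ [u_0..u_{N-1}] for linear dynamics."""
    n, m = B.shape
    Sx = np.zeros((N * n, n))
    Su = np.zeros((N * n, N * m))
    Ak = np.eye(n)
    for k in range(N):
        Ak = A @ Ak                                   # A^(k+1)
        Sx[k * n:(k + 1) * n] = Ak
        for j in range(k + 1):
            Su[k * n:(k + 1) * n, j * m:(j + 1) * m] = (
                np.linalg.matrix_power(A, k - j) @ B)
    return Sx, Su

def mpc_step(x0, x_ref, A, B, Q, R, N, u_min, u_max, iters=300):
    """Return the first input of the box-constrained finite-horizon problem."""
    n, m = B.shape
    Sx, Su = build_prediction(A, B, N)
    Qbar = np.kron(np.eye(N), Q)
    Rbar = np.kron(np.eye(N), R)
    H = 2.0 * (Su.T @ Qbar @ Su + Rbar)               # Hessian of the cost
    c = 2.0 * Su.T @ Qbar @ (Sx @ x0 - np.tile(x_ref, N))
    step = 1.0 / np.linalg.eigvalsh(H).max()          # safe gradient step size
    U = np.zeros(N * m)
    for _ in range(iters):
        # Gradient step followed by projection onto the input bounds.
        U = np.clip(U - step * (H @ U + c), u_min, u_max)
    return U[:m]

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])                 # state: [position, velocity]
B = np.array([[0.5 * dt**2], [dt]])                   # input: acceleration
Q, R = np.diag([10.0, 1.0]), np.array([[1.0]])
x, x_ref = np.array([0.0, 0.0]), np.array([1.0, 0.0])
applied = []
for _ in range(60):                                   # receding horizon, 6 s
    u = mpc_step(x, x_ref, A, B, Q, R, N=20, u_min=-2.0, u_max=2.0)
    applied.append(u[0])
    x = A @ x + B @ u
```

Only the first input of each optimized sequence is applied before re-solving from the new state, which is the receding-horizon principle; every applied input respects the bounds by construction.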

SE(3) Geometric Control: Designs the controller directly on the \(SE(3)\) Lie group using rotation matrices, which avoids the singularities of Euler angles (gimbal lock) and makes it suitable for large-angle maneuvers.


Autonomous Navigation

In indoor or tunnel environments where GPS is unavailable, drones must rely on onboard sensors for localization:

Visual-Inertial Odometry (VIO): Fuses camera and IMU data for pose estimation.

  • VINS-Mono / VINS-Fusion: Open-source VIO systems from HKUST
  • MSCKF: Multi-State Constraint Kalman Filter
  • ORB-SLAM3: Supports a visual-inertial mode

LiDAR SLAM:

  • LOAM / LIO-SAM: LiDAR odometry and mapping; LIO-SAM additionally couples the IMU tightly
  • High accuracy, but the sensors are heavy, so it suits larger drones

Motion Planning

UAV path planning must consider:

  • Dynamic constraints (max velocity, acceleration, angular rate)
  • Collision avoidance (static + dynamic obstacles)
  • Energy optimization (minimize total thrust variation)

Common methods:

  • Minimum snap trajectory: Minimizes fourth-order derivative (snap), ensuring smooth trajectories:
\[ \min \int_0^T \left\|\frac{d^4 \mathbf{p}}{dt^4}\right\|^2 dt \]
  • EGO-Planner: ESDF-free gradient-based local planner (it avoids building a Euclidean Signed Distance Field), good real-time performance
  • Fast-Planner: Open-source fast motion planning system from the HKUST Aerial Robotics Group
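For the minimum snap objective listed above, the simplest case has a closed form: on a single segment with position, velocity, acceleration, and jerk fixed at both ends, the minimizer is the unique 7th-degree polynomial meeting those eight boundary conditions. The sketch below solves that case for one axis; the function names are illustrative, and multi-segment trajectories with waypoints require the full quadratic program instead.

```python
import numpy as np
from math import factorial

def deriv_row(t, order, degree=7):
    """Row r such that r @ coeffs is the order-th derivative of sum c_i t^i at t."""
    row = np.zeros(degree + 1)
    for i in range(order, degree + 1):
        row[i] = factorial(i) / factorial(i - order) * t ** (i - order)
    return row

def min_snap_segment(start, end, T):
    """start/end: (pos, vel, acc, jerk) at t=0 and t=T; returns c_0..c_7."""
    M = np.array([deriv_row(0.0, k) for k in range(4)] +
                 [deriv_row(T, k) for k in range(4)])
    b = np.array(list(start) + list(end), dtype=float)
    return np.linalg.solve(M, b)

# Rest-to-rest move of 1 m in 2 s along one axis.
T_seg = 2.0
coeffs = min_snap_segment(start=(0, 0, 0, 0), end=(1, 0, 0, 0), T=T_seg)
p_start = deriv_row(0.0, 0) @ coeffs
p_mid = deriv_row(T_seg / 2, 0) @ coeffs
p_end = deriv_row(T_seg, 0) @ coeffs
v_end = deriv_row(T_seg, 1) @ coeffs
```

Because the boundary conditions are symmetric, the trajectory passes through the halfway point at mid-time, which gives a simple check on the solution.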

Learning-based Agile Flight

UZH RPG Group's Work

The Robotics and Perception Group (RPG) at the University of Zurich achieved breakthrough results in learning-based agile flight:

Swift (2023): RL-trained drones beat human champion pilots in racing

  • Trained with PPO in simulation
  • Visual perception via RGB camera + gate detection
  • Achieved speeds and accelerations beyond human limits

Agile Autonomy (2021): End-to-end learned high-speed obstacle avoidance

  • Depth image input -> trajectory point output
  • Flying at 10 m/s through dense forests

Sim-to-Real for UAVs

UAV RL Sim-to-Real must consider:

  • Aerodynamic effects (rotor wake, ground effect)
  • Motor response delay
  • IMU noise and bias
  • Communication latency
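Two of the factors above, motor response delay and IMU noise and bias, are often modeled in simulation and randomized per training episode. The sketch below is one hypothetical way to do so: a first-order motor lag, a gyro with white noise plus a random-walk bias, and a sampler for per-episode parameters. All class names and parameter ranges are assumptions for illustration, not values from any published system.

```python
import numpy as np

class MotorLag:
    """First-order lag: d(speed)/dt = (cmd - speed) / tau_m, discretized exactly."""

    def __init__(self, tau_m, dt):
        self.alpha = 1.0 - np.exp(-dt / tau_m)
        self.speed = 0.0

    def step(self, cmd):
        self.speed += self.alpha * (cmd - self.speed)
        return self.speed

class NoisyGyro:
    """Gyro model: true rate + slowly drifting bias + white noise."""

    def __init__(self, rng, noise_std, bias_walk_std):
        self.rng = rng
        self.noise_std = noise_std
        self.bias_walk_std = bias_walk_std
        self.bias = np.zeros(3)

    def measure(self, omega_true):
        self.bias += self.rng.normal(0.0, self.bias_walk_std, 3)
        return omega_true + self.bias + self.rng.normal(0.0, self.noise_std, 3)

def sample_episode_params(rng):
    """Draw one set of randomized parameters (hypothetical ranges)."""
    return {"tau_m": rng.uniform(0.02, 0.06),        # motor time constant, s
            "noise_std": rng.uniform(0.005, 0.02),   # gyro noise, rad/s
            "bias_walk_std": rng.uniform(1e-5, 1e-4)}

rng = np.random.default_rng(0)
params = sample_episode_params(rng)
gyro = NoisyGyro(rng, params["noise_std"], params["bias_walk_std"])
reading = gyro.measure(np.zeros(3))

motor = MotorLag(tau_m=0.05, dt=0.001)
for _ in range(50):                                   # simulate one time constant
    response = motor.step(1.0)
```

After one time constant the motor has reached about 63% of a unit step command, the expected behavior of a first-order lag.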


Aerial Manipulation

UAV Grasping

Aerial manipulation combines drones with robot arms for airborne grasping and manipulation:

  • Challenge: Grasping-induced external forces/torques severely affect flight stability
  • Solutions: Over-actuated platforms or adaptive control
  • Applications: High-altitude inspection, hazardous material handling, construction

Multi-UAV Cooperative Transport

Multiple drones cooperatively transport large objects via cables or rigid connections:

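\[ \sum_{i=1}^{N} \mathbf{f}_i + m_L \mathbf{g} = m_L \ddot{\mathbf{p}}_L \]

where \(\mathbf{f}_i\) is the force that drone \(i\) applies to the load, \(m_L\) is the load mass, \(\mathbf{p}_L\) is the load position, and \(\mathbf{g} = [0, 0, -g]^T\) is the gravity vector in the z-up world frame.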

Requires distributed control and communication.
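As a worked instance of the load equation, the sketch below computes the force each drone must apply for a desired load acceleration, under the simplifying assumption that all drones share the load equally. The function name and the numbers are illustrative; with cables, each force is also constrained to act along its cable direction, which this sketch ignores.

```python
import numpy as np

def equal_share_forces(m_load, acc_load, n_drones, g=9.81):
    """Per-drone forces so that sum(f_i) + m_L * g_vec = m_L * acc_load."""
    g_vec = np.array([0.0, 0.0, -g])                  # gravity vector, z-up frame
    total = m_load * (np.asarray(acc_load, dtype=float) - g_vec)
    return np.tile(total / n_drones, (n_drones, 1))   # one row per drone

m_load = 2.0
acc_load = np.array([0.5, 0.0, 0.0])
forces = equal_share_forces(m_load, acc_load, n_drones=4)
residual = forces.sum(axis=0) + m_load * np.array([0.0, 0.0, -9.81]) - m_load * acc_load
```

Each of the four drones carries a quarter of the weight plus a quarter of the horizontal force, and `residual` is zero, confirming the force balance.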


Swarm Intelligence

Multi-Agent Coordination

Core problems in UAV swarms:

  • Formation control: Maintain predetermined geometric configurations
  • Collision avoidance: Prevent intra-swarm collisions
  • Task allocation: Multi-robot division of labor

Reynolds Rules (inspired by bird flocks):

  1. Separation: Avoid getting too close to neighbors
  2. Alignment: Match velocity direction with neighbors
  3. Cohesion: Move toward the center of neighbors

Mathematical formulation:

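\[ \mathbf{u}_i = c_s \sum_{j \in \mathcal{N}_i} \frac{\mathbf{p}_i - \mathbf{p}_j}{\|\mathbf{p}_i - \mathbf{p}_j\|^2} + c_a \left(\frac{1}{|\mathcal{N}_i|}\sum_{j \in \mathcal{N}_i} \mathbf{v}_j - \mathbf{v}_i\right) + c_c \left(\frac{1}{|\mathcal{N}_i|}\sum_{j \in \mathcal{N}_i} \mathbf{p}_j - \mathbf{p}_i\right) \]

where \(\mathcal{N}_i\) is the set of neighbors of drone \(i\), and \(c_s\), \(c_a\), \(c_c\) weight separation, alignment, and cohesion. The alignment term is written relative to the drone's own velocity \(\mathbf{v}_i\), so it vanishes once the drone matches its neighbors.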
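The three rules translate into a short per-drone update. The sketch below is a minimal version assuming a fixed sensing radius defines the neighbor set and that the alignment term is the mean neighbor velocity minus the drone's own velocity (a common variant, chosen here as an assumption); the function name and gains are illustrative.

```python
import numpy as np

def reynolds_command(i, pos, vel, radius, c_s, c_a, c_c):
    """Command for drone i from separation, alignment, and cohesion terms."""
    diff = pos[i] - pos                               # p_i - p_j for every j
    dist = np.linalg.norm(diff, axis=1)
    nbr = (dist > 0.0) & (dist < radius)              # excludes drone i itself
    if not nbr.any():
        return np.zeros(3)
    sep = (diff[nbr] / dist[nbr, None] ** 2).sum(axis=0)   # push away from neighbors
    ali = vel[nbr].mean(axis=0) - vel[i]                   # match mean velocity
    coh = pos[nbr].mean(axis=0) - pos[i]                   # pull toward centroid
    return c_s * sep + c_a * ali + c_c * coh

# Two drones 1 m apart, both at rest.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
vel = np.zeros((2, 3))
u_sep = reynolds_command(0, pos, vel, radius=5.0, c_s=1.0, c_a=0.0, c_c=0.0)
u_coh = reynolds_command(0, pos, vel, radius=5.0, c_s=0.0, c_a=0.0, c_c=1.0)
u_both = reynolds_command(0, pos, vel, radius=5.0, c_s=1.0, c_a=0.0, c_c=1.0)
```

Separation alone pushes drone 0 away from its neighbor and cohesion alone pulls it toward it; with equal gains the two cancel at a spacing of 1 m, which illustrates how the gain ratio sets the equilibrium distance.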

Communication and Decentralization

  • Centralized: All drones report to a central node which issues commands. Simple but single point of failure.
  • Decentralized: Each drone communicates only with neighbors. Robust but coordination is difficult.
  • Hierarchical: Leader-follower structure, a compromise.

Representative Swarm Systems

  • Crazyswarm2: Swarm research platform based on Crazyflie 2.1 micro quadrotors
  • ZJU/HKUST Gao Fei Team: Large-scale swarm flying through dense forests
  • EHang: Passenger-carrying autonomous aerial vehicles for AAM (Advanced Air Mobility), and drone formation performances

Open-Source R&D Platforms

| Platform | Mass | Features | Suitable Scenarios |
| --- | --- | --- | --- |
| Crazyflie 2.1 | 27 g | Ultra-light micro, Python/ROS, swarm-friendly | Indoor research/teaching |
| PX4 + QAV250 | ~400 g | Standard racing frame + PX4 flight controller | Outdoor autonomous flight |
| DJI RoboMaster TT | 87 g | Tello EDU upgrade, programming interface | Education/entry-level |
| Agilicious (UZH) | ~850 g | Designed for agile flight research | High-speed/racing research |
| Flightmare (simulation) | - | UZH RPG open-source simulator, Unity rendering | RL training |
| AirSim / Colosseum | - | Microsoft open-source simulation, Unreal rendering | Visual navigation research |

References

  • Mellinger & Kumar, "Minimum Snap Trajectory Generation and Control for Quadrotors", ICRA, 2011
  • Lee et al., "Geometric Tracking Control of a Quadrotor UAV on SE(3)", CDC, 2010
  • Song et al., "Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning", Science Robotics, 2023
  • Loquercio et al., "Learning High-Speed Flight in the Wild", Science Robotics, 2021
