Robot Arms and Mobile Manipulation

Overview

Robot arms (manipulators) are the core form of industrial robots, while mobile manipulation combines mobile bases with robot arms, granting robots the ability to grasp and manipulate objects in open environments. This is the central "hand" problem in embodied intelligence.


Robot Arm Fundamentals

Degrees of Freedom and Joint Types

  • Revolute joint: Rotates about a fixed axis, most common
  • Prismatic joint: Translates along a straight line
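  • Degrees of Freedom (DOF): A rigid-body pose in 3D space has 6 DOF (3 translation + 3 rotation), so 6 joints are the minimum needed to reach an arbitrary end-effector pose. Away from singularities, a 6-DOF arm has a finite number of joint solutions for a reachable pose, while a 7-DOF arm is kinematically redundant: it has a continuous family of solutions, which can be used for obstacle or singularity avoidance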

Kinematics

Forward kinematics: Compute end-effector pose \(\mathbf{T}\) from joint angles \(\mathbf{q}\) via chained homogeneous transformation matrices:

\[ \mathbf{T}_{0}^{n} = \prod_{i=1}^{n} \mathbf{T}_{i-1}^{i}(q_i) \]

Each \(\mathbf{T}_{i-1}^{i}\) is determined by DH (Denavit-Hartenberg) parameters or the Product of Exponentials (PoE) method.
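The chained product can be sketched in a few lines of plain Python. This is a minimal illustration that assumes the standard (distal) DH convention and revolute joints only; the helper names and the 2-link planar arm `PLANAR_2R` are hypothetical and not taken from any particular library.

```python
import math

def dh_transform(a, alpha, d, theta):
    """Homogeneous transform from frame i-1 to frame i (standard DH convention)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ]

def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(dh_rows, q):
    """Chain the per-joint transforms: T_0^n = T_0^1(q_1) ... T_{n-1}^n(q_n).

    dh_rows holds (a, alpha, d) for each revolute joint; q holds the joint angles.
    """
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for (a, alpha, d), theta in zip(dh_rows, q):
        T = matmul4(T, dh_transform(a, alpha, d, theta))
    return T

# Hypothetical planar 2-link arm with unit link lengths (alpha = d = 0).
PLANAR_2R = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
T = forward_kinematics(PLANAR_2R, [math.pi / 2, -math.pi / 2])
position = (T[0][3], T[1][3], T[2][3])  # translation part of the pose
```

The last column of the resulting matrix is the end-effector position and the upper-left 3x3 block is its orientation. A real arm would only need a different `dh_rows` table.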

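Inverse kinematics: Given a desired end-effector pose \(\mathbf{T}_{desired}\), solve for joint angles \(\mathbf{q}\). Closed-form (analytical) solutions exist only for particular kinematic structures, such as 6-DOF arms whose last three joint axes intersect at a point (a spherical wrist). General methods iterate numerically, mapping the pose error \(\Delta \mathbf{x}\) to a joint update: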

\[ \Delta \mathbf{q} = J^{\dagger}(\mathbf{q}) \cdot \Delta \mathbf{x} \]

where \(J^{\dagger}\) is the Moore-Penrose pseudoinverse of the Jacobian. When \(J\) is near singular, use Damped Least Squares:

\[ \Delta \mathbf{q} = J^T(JJ^T + \lambda^2 I)^{-1} \Delta \mathbf{x} \]
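As a rough sketch of how the damped least squares update is used in practice, the loop below solves position-only IK for a hypothetical planar 2-link arm. The link lengths, the damping value `lam`, and the function names are illustrative assumptions; the 2x2 linear solve is written out by hand to keep the example dependency-free.

```python
import math

L1, L2 = 1.0, 1.0  # assumed link lengths of a planar 2-link arm

def fk(q):
    """End-effector position (x, y) for joint angles q = [q1, q2]."""
    return (L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1]),
            L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1]))

def jacobian(q):
    """2x2 position Jacobian d(x, y) / d(q1, q2)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [ L1 * c1 + L2 * c12,  L2 * c12]]

def dls_step(J, dx, lam):
    """dq = J^T (J J^T + lam^2 I)^{-1} dx for a 2x2 Jacobian."""
    a = J[0][0] ** 2 + J[0][1] ** 2 + lam ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + lam ** 2
    det = a * d - b * b  # strictly positive whenever lam > 0
    y0 = (d * dx[0] - b * dx[1]) / det
    y1 = (-b * dx[0] + a * dx[1]) / det
    return (J[0][0] * y0 + J[1][0] * y1,
            J[0][1] * y0 + J[1][1] * y1)

def solve_ik(target, q_init, lam=0.05, iters=200, tol=1e-6):
    """Iterate DLS updates until the position error drops below tol."""
    q = list(q_init)
    for _ in range(iters):
        x = fk(q)
        dx = (target[0] - x[0], target[1] - x[1])
        if math.hypot(dx[0], dx[1]) < tol:
            break
        dq = dls_step(jacobian(q), dx, lam)
        q[0] += dq[0]
        q[1] += dq[1]
    return q

target = (1.2, 0.8)
q_sol = solve_ik(target, [0.2, 1.0])
reached = fk(q_sol)
```

A larger `lam` keeps the step bounded near singular configurations at the cost of slower convergence; setting it to zero recovers the plain pseudoinverse step for a square, full-rank Jacobian.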

Dynamics

Robot arm dynamics, derived from the Euler-Lagrange equations, take the standard form:

\[ M(\mathbf{q})\ddot{\mathbf{q}} + C(\mathbf{q}, \dot{\mathbf{q}})\dot{\mathbf{q}} + G(\mathbf{q}) = \boldsymbol{\tau} \]
  • \(M(\mathbf{q})\): mass matrix (symmetric positive definite)
  • \(C(\mathbf{q}, \dot{\mathbf{q}})\): Coriolis and centrifugal force matrix
  • \(G(\mathbf{q})\): gravity term
  • \(\boldsymbol{\tau}\): joint torques

Computed Torque Control:

\[ \boldsymbol{\tau} = M(\mathbf{q})(\ddot{\mathbf{q}}_d + K_d \dot{\mathbf{e}} + K_p \mathbf{e}) + C(\mathbf{q}, \dot{\mathbf{q}})\dot{\mathbf{q}} + G(\mathbf{q}) \]

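where \(\mathbf{e} = \mathbf{q}_d - \mathbf{q}\) is the tracking error. With an exact model, substituting this torque into the dynamics cancels the nonlinear terms and leaves the linear error dynamics \(\ddot{\mathbf{e}} + K_d \dot{\mathbf{e}} + K_p \mathbf{e} = 0\), which converge to zero for positive definite gains \(K_p, K_d\).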
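The control law can be tried out on a single-joint system. The sketch below assumes a hypothetical 1-DOF pendulum (point mass on a massless rod, angle measured from the downward vertical) and a perfectly known model; the gains and the simple Euler integration are illustrative choices, not tuned values.

```python
import math

# Hypothetical 1-DOF pendulum parameters: mass, rod length, gravity.
m, l, g = 1.0, 0.5, 9.81

def M(q):
    """Inertia (a scalar 'mass matrix' for one joint)."""
    return m * l * l

def C(q, qd):
    """A single joint has no Coriolis or centrifugal coupling."""
    return 0.0

def G(q):
    """Gravity torque, with q measured from the downward vertical."""
    return m * g * l * math.sin(q)

def computed_torque(q, qd, q_des, qd_des, qdd_des, kp, kd):
    """tau = M (qdd_des + Kd e_dot + Kp e) + C qd + G, with e = q_des - q."""
    e = q_des - q
    e_dot = qd_des - qd
    return M(q) * (qdd_des + kd * e_dot + kp * e) + C(q, qd) * qd + G(q)

def simulate(q_des, kp=100.0, kd=20.0, dt=1e-3, steps=3000):
    """Regulate the pendulum to a constant setpoint and return the final angle."""
    q, qd = 0.0, 0.0
    for _ in range(steps):
        tau = computed_torque(q, qd, q_des, 0.0, 0.0, kp, kd)
        qdd = (tau - C(q, qd) * qd - G(q)) / M(q)  # forward dynamics
        qd += qdd * dt
        q += qd * dt
    return q

final_q = simulate(q_des=1.0)
```

Because the model terms cancel exactly here, the closed loop behaves like a critically damped linear system (`kd = 2 * sqrt(kp)`). On real hardware, model errors leave a residual disturbance that the feedback gains must absorb.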

Workspace and Singularities

  • Reachable workspace: Set of all positions the end-effector can reach
  • Dexterous workspace: Subset of positions reachable with arbitrary orientation
  • Singular configurations: Configurations where the Jacobian loses rank, preventing motion in certain directions

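Manipulability (Yoshikawa's measure) quantifies the dexterity of the arm at its current configuration. It is proportional to the volume of the velocity manipulability ellipsoid and drops to zero at singular configurations: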

\[ w(\mathbf{q}) = \sqrt{\det(J(\mathbf{q})J(\mathbf{q})^T)} \]
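For a planar 2-link arm the measure reduces to \(l_1 l_2 |\sin q_2|\), which makes a convenient sanity check. The following sketch evaluates the general formula on that hypothetical arm; the function names and unit link lengths are assumptions made for illustration.

```python
import math

def planar_jacobian(q1, q2, l1=1.0, l2=1.0):
    """2x2 position Jacobian of a planar 2-link arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [ l1 * c1 + l2 * c12,  l2 * c12]]

def manipulability(J):
    """w = sqrt(det(J J^T)) for a 2xN Jacobian given as nested lists."""
    a = sum(x * x for x in J[0])
    b = sum(x * y for x, y in zip(J[0], J[1]))
    d = sum(y * y for y in J[1])
    # Clamp tiny negative values caused by floating-point round-off.
    return math.sqrt(max(a * d - b * b, 0.0))

w_bent = manipulability(planar_jacobian(0.3, math.pi / 2))   # elbow at 90 degrees
w_straight = manipulability(planar_jacobian(0.3, 0.0))       # fully stretched arm
```

The bent elbow gives the maximum value for this arm, while the stretched configuration is singular and yields a value that is zero up to round-off.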

Major Platforms

Research-Grade Robot Arms

| Platform | DOF | Payload | Features | Price Range |
|---|---|---|---|---|
| Franka Emika Panda | 7 | 3 kg | Torque sensors in all joints, impedance control | ~$30K |
| Kinova Gen3 | 7 | 4 kg | Lightweight, ROS2 support, force feedback | ~$25K |
| UR5e/UR10e | 6 | 5/12.5 kg | Collaborative robot pioneer, 6-axis F/T sensor | ~$35-50K |
| xArm 7 | 7 | 3.5 kg | Chinese-made, high value, open-source SDK | ~$8-10K |
| UFACTORY Lite 6 | 6 | 2 kg | Ultra-low-price research arm | ~$2K |
| Koch v1.1 | 6 | - | Open-source low-cost, LeRobot community | ~$300 |

Franka Emika Panda Details

The Franka Emika Panda is one of the most widely used platforms in robot manipulation research:

  • Joint torque sensors: All 7 joints have built-in high-precision torque sensors
  • Impedance control: Supports Cartesian and joint-space impedance control
  • libfranka: C++ library exposing a 1 kHz real-time control interface
  • franka_ros2: Official ROS2 integration
  • Applications: Widely used in grasping, manipulation, and contact-rich task research

Mobile Manipulation

Why Mobile Manipulation Is Needed

Fixed-base robot arms have limited workspace, yet many real tasks require robots to move through environments while manipulating objects:

  • Household tidying (retrieving and placing items from different rooms)
  • Warehouse logistics (moving to shelves for picking)
  • Inspection and maintenance (moving to equipment for operations)

System Architecture

graph TB
    subgraph Perception_Layer["Perception Layer"]
        CAM[RGB-D Camera] --> DET[Object Detection/Segmentation]
        LID[LiDAR] --> MAP[Mapping/Localization]
        FT[Force/Torque Sensor] --> CONT[Contact Detection]
    end

    subgraph Planning_Layer["Planning Layer"]
        DET --> GRASP[Grasp Planning]
        MAP --> NAV[Navigation Planning]
        GRASP --> WBC[Whole-Body Planning]
        NAV --> WBC
    end

    subgraph Control_Layer["Control Layer"]
        WBC --> BASE[Base Control]
        WBC --> ARM[Arm Control]
        CONT --> ARM
        BASE --> MOT_B[Base Motors]
        ARM --> MOT_A[Arm Joint Motors]
    end

    subgraph Hardware
        MOT_B --> ROBOT[Mobile Manipulation Robot]
        MOT_A --> ROBOT
        ROBOT --> CAM
        ROBOT --> LID
        ROBOT --> FT
    end

Whole-Body Planning and Control

The core challenge of mobile manipulation is coordinating base motion with arm motion.

Approach 1: Hierarchical planning

  1. Plan the base motion to reach the manipulation position first
  2. After the base settles, plan the arm motion

This approach is simple but inefficient, and it is not suitable for dynamic tasks.

Approach 2: Whole-body motion planning

Unify base DOF (\(x, y, \theta\)) with arm DOF (\(q_1, ..., q_n\)) into a high-dimensional configuration space:

\[ \mathbf{q}_{full} = [x, y, \theta, q_1, q_2, ..., q_n]^T \]

Use sampling-based planners such as RRT or PRM in this space to plan base and arm motion jointly.
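To make the unified configuration vector concrete, the sketch below maps \(\mathbf{q}_{full}\) to an end-effector pose for a simplified, hypothetical platform: a planar arm mounted at the origin of a planar base. A sampling-based planner would call a map like this for goal tests and collision checks on each sampled configuration; the function name and geometry are assumptions for illustration.

```python
import math

def whole_body_fk(q_full, link_lengths):
    """End-effector (x, y, heading) of a planar arm mounted on a mobile base.

    q_full = [x, y, theta, q_1, ..., q_n]: base pose followed by arm joint angles.
    """
    x, y, heading = q_full[0], q_full[1], q_full[2]
    for q_i, length in zip(q_full[3:], link_lengths):
        heading += q_i                   # joint angles accumulate along the chain
        x += length * math.cos(heading)  # advance along the current link
        y += length * math.sin(heading)
    return x, y, heading

# Base at (1, 2) facing +y, arm fully stretched with two unit links.
ee = whole_body_fk([1.0, 2.0, math.pi / 2, 0.0, 0.0], [1.0, 1.0])
```

Note that the same end-effector position can be reached by moving either the base or the arm, which is exactly the redundancy that whole-body planners exploit.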

Approach 3: Optimization methods

Use trajectory optimization (e.g., TrajOpt, CHOMP) to simultaneously optimize base and arm motion:

\[ \min_{\mathbf{q}_{0:T}} \sum_{t=0}^{T} \left[ c_{task}(\mathbf{q}_t) + c_{collision}(\mathbf{q}_t) \right] + \sum_{t=1}^{T} c_{smooth}(\mathbf{q}_t, \mathbf{q}_{t-1}) \]
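The objective can be evaluated for a candidate trajectory as sketched below. The three cost terms are illustrative stand-ins chosen for this example (squared distance to a goal configuration, squared waypoint differences, and a hinge penalty on base clearance from one circular obstacle); they are not the cost definitions used by TrajOpt or CHOMP. The smoothness term is only accumulated for \(t \geq 1\), since the first waypoint has no predecessor.

```python
import math

def trajectory_cost(traj, goal, obstacle_xy, obstacle_radius,
                    w_task=1.0, w_smooth=10.0, w_collision=100.0):
    """Sum task, smoothness, and collision costs over waypoints [x, y, theta, q_1, ...]."""
    total = 0.0
    for t, q in enumerate(traj):
        # Task cost: squared distance to the goal configuration.
        total += w_task * sum((a - b) ** 2 for a, b in zip(q, goal))
        # Smoothness cost: squared step between consecutive waypoints.
        if t > 0:
            total += w_smooth * sum((a - b) ** 2 for a, b in zip(q, traj[t - 1]))
        # Collision cost: hinge penalty when the base enters the obstacle disc.
        dist = math.hypot(q[0] - obstacle_xy[0], q[1] - obstacle_xy[1])
        total += w_collision * max(0.0, obstacle_radius - dist) ** 2
    return total

goal = [1.0, 0.0, 0.0, 0.0]
straight = [[0.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
jerky = [[0.0, 0.0, 0.0, 0.0], [0.5, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]

cost_free = trajectory_cost(straight, goal, (5.0, 5.0), 0.3)     # obstacle far away
cost_jerky = trajectory_cost(jerky, goal, (5.0, 5.0), 0.3)
cost_blocked = trajectory_cost(straight, goal, (0.5, 0.0), 0.3)  # obstacle on the path
```

An optimizer would adjust the intermediate waypoints to reduce this scalar, trading path length and smoothness against obstacle clearance through the weights.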

Representative Mobile Manipulation Platforms

| Platform | Composition | Features | Application |
|---|---|---|---|
| Hello Robot Stretch | Differential base + telescoping arm | Lightweight, ~$25K, clean design | Home assistance research |
| Fetch Mobile Manipulator | Differential base + 7-DOF arm | Classic research platform | Discontinued, extensive prior work |
| Mobile ALOHA | AgileX base + dual ViperX arms | Low-cost dual-arm teleop, open-source | Imitation learning, household |
| Google Everyday Robots | Mobile base + 7-DOF arm | Internal R&D, RT-1/RT-2 | Office cleaning |
| TIAGo (PAL Robotics) | Differential base + 7-DOF arm | Commercial research platform, ROS integration | Service/research |
| PR2 (Willow Garage) | Omnidirectional base + dual 7-DOF arms | Historical classic, ROS origin platform | Discontinued |

Grasping

Grasping Problem Classification

graph TD
    A[Robot Grasping] --> B[Analytical Methods]
    A --> C[Learning-based Methods]

    B --> B1[Force Closure Analysis]
    B --> B2[Form Closure Analysis]
    B --> B3[Grasp Quality Metrics]

    C --> C1[Image-based<br/>GG-CNN, GraspNet]
    C --> C2[Point Cloud-based<br/>Contact-GraspNet, AnyGrasp]
    C --> C3[Diffusion Model-based<br/>Diffusion Policy]
    C --> C4[Language-guided<br/>VLM + Grasping]

    B1 --> D[Known Object Model]
    C1 --> E[Unknown Object Generalization]
    C2 --> E

Force Closure and Grasp Quality

Force closure: A grasp has force closure if contact forces within the friction cones at the contact points can be combined to resist any external disturbance wrench (force and torque) on the object.

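Given contact force \(\mathbf{f}_i\) at contact point \(i\), expressed in a local contact frame whose \(z\)-axis is the inward surface normal, and friction coefficient \(\mu\), the Coulomb friction cone constraint is: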

\[ \sqrt{f_{ix}^2 + f_{iy}^2} \leq \mu f_{iz}, \quad f_{iz} \geq 0 \]

Mapping contact forces to the object frame's wrench space:

\[ \mathbf{w}_i = G_i \mathbf{f}_i, \quad G_i = \begin{bmatrix} I \\ [p_i]_\times \end{bmatrix} \]

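where \([p_i]_\times\) is the skew-symmetric (cross-product) matrix of the contact point position vector \(\mathbf{p}_i\), so that \([p_i]_\times \mathbf{f}_i = \mathbf{p}_i \times \mathbf{f}_i\) is the torque about the object origin. In this mapping \(\mathbf{f}_i\) is expressed in the object frame; a force given in the local contact frame must first be rotated into the object frame.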
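Both the friction cone test and the wrench mapping are short enough to write out directly. In this minimal sketch the function names are illustrative; `in_friction_cone` assumes the force is given in the local contact frame (normal along \(z\)), while `contact_wrench` assumes the position and force are both given in the object frame.

```python
import math

def in_friction_cone(f, mu):
    """Check sqrt(fx^2 + fy^2) <= mu * fz and fz >= 0 (contact-frame force)."""
    fx, fy, fz = f
    return fz >= 0.0 and math.hypot(fx, fy) <= mu * fz

def cross(a, b):
    """3D cross product a x b, equal to [a]_x applied to b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def contact_wrench(p, f):
    """w = G f = [f; p x f]: force stacked on the torque about the object origin."""
    return tuple(f) + cross(p, f)

# A unit force along +z applied at p = (1, 0, 0) produces a torque about -y.
w = contact_wrench((1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
inside = in_friction_cone((0.3, 0.0, 1.0), mu=0.5)   # tangential/normal ratio 0.3
outside = in_friction_cone((0.6, 0.0, 1.0), mu=0.5)  # ratio 0.6 exceeds mu, so it slips
```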

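Grasp quality metric: Contact forces of bounded magnitude (e.g., \(\|\mathbf{f}_i\| \leq 1\)) inside their friction cones generate contact wrenches whose convex hull forms the feasible wrench set \(\mathcal{W}\). The bound matters: unrestricted positive linear combinations would form an unbounded cone with no meaningful boundary distance. The quality is:

\[ Q = \min_{\mathbf{w} \in \partial \mathcal{W}} \|\mathbf{w}\| \]

i.e., the minimum distance from the origin to the boundary of the feasible wrench set, or equivalently the radius of the largest origin-centered ball contained in \(\mathcal{W}\) (the Ferrari-Canny metric). If the origin lies in the interior of \(\mathcal{W}\), then \(Q > 0\) and the grasp has force closure; larger \(Q\) means more robust grasps.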

Learning-based Grasping

GraspNet / AnyGrasp:

  • Input: Single-frame or multi-frame point clouds
  • Output: A large number of candidate grasp poses (in \(SE(3)\)) with quality scores
  • Training data: Large-scale synthetic data + analytical grasp annotations
  • Feature: Strong generalization to unseen objects

Contact-GraspNet:

  • Direct contact grasp prediction on point clouds
  • 6-DOF grasp pose generation
  • Fast, suitable for real-time applications

Grasping Pipeline

Typical robot grasping workflow:

  1. Perception: RGB-D to obtain scene point cloud
  2. Segmentation: Instance segmentation to isolate target object
  3. Grasp detection: Generate candidate grasp poses
  4. Motion planning: Plan collision-free path to grasp pose
  5. Execution: Execute grasp and verify
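Steps 3 and 4 of this workflow are usually coupled: the highest-scoring grasp is only useful if the planner can reach it. The sketch below shows one plausible way to combine them. `GraspCandidate`, `select_grasp`, and the `plan_path` callback are hypothetical placeholders, not the API of any grasp detector or motion planner.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

@dataclass
class GraspCandidate:
    pose: tuple    # placeholder for an SE(3) grasp pose
    score: float   # quality score from the grasp detector

def select_grasp(
    candidates: Sequence[GraspCandidate],
    plan_path: Callable[[tuple], Optional[List[tuple]]],
) -> Optional[Tuple[GraspCandidate, List[tuple]]]:
    """Return the best-scoring grasp that has a collision-free path, if any."""
    for grasp in sorted(candidates, key=lambda c: c.score, reverse=True):
        path = plan_path(grasp.pose)  # the planner returns None when no path exists
        if path is not None:
            return grasp, path
    return None

# Toy stand-in planner: poses with negative x are treated as unreachable.
def toy_planner(pose):
    return [(0.0, 0.0, 0.0), pose] if pose[0] >= 0.0 else None

candidates = [
    GraspCandidate(pose=(-0.2, 0.1, 0.3), score=0.95),  # best score, unreachable
    GraspCandidate(pose=(0.4, 0.0, 0.2), score=0.80),
    GraspCandidate(pose=(0.3, 0.1, 0.2), score=0.60),
]
chosen = select_grasp(candidates, toy_planner)
```

Trying candidates in descending score order keeps planner calls to a minimum, since planning is typically the most expensive step in the loop.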

Impedance Control and Force Control

When robot arms interact with the environment, pure position control can produce excessive contact forces. Impedance control instead makes the end-effector respond to external forces like a mass-spring-damper system:

\[ M_d \ddot{\mathbf{e}} + D_d \dot{\mathbf{e}} + K_d \mathbf{e} = \mathbf{f}_{ext} \]
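  • \(M_d, D_d, K_d\): desired inertia, damping, and stiffness matrices (the subscript \(d\) means "desired"; \(K_d\) here is a stiffness, not the derivative gain used in computed torque control)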
  • \(\mathbf{e} = \mathbf{x} - \mathbf{x}_d\): position error
  • \(\mathbf{f}_{ext}\): external force

Advantage: Behavior can be switched from rigid (high \(K_d\)) to compliant (low \(K_d\)) by adjusting stiffness.
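The effect of the stiffness setting can be seen by simulating the target impedance in one dimension under a constant external force. In steady state the displacement is \(e = f_{ext} / K_d\), so a stiffer setting yields a smaller deflection. The parameter values, the critical-damping choice, and the Euler integration below are assumptions made for this sketch.

```python
import math

def steady_state_displacement(stiffness, f_ext=10.0, mass=1.0, dt=1e-3, steps=5000):
    """Integrate M e'' + D e' + K e = f_ext from rest and return the final e."""
    damping = 2.0 * math.sqrt(stiffness * mass)  # critical damping, no overshoot
    e, e_dot = 0.0, 0.0
    for _ in range(steps):
        e_ddot = (f_ext - damping * e_dot - stiffness * e) / mass
        e_dot += e_ddot * dt
        e += e_dot * dt
    return e

e_compliant = steady_state_displacement(stiffness=100.0)   # yields about 0.1 m to a 10 N push
e_rigid = steady_state_displacement(stiffness=2000.0)      # yields about 0.005 m
```

The same controller therefore covers both regimes: a low stiffness lets the arm give way during contact, while a high stiffness approximates position control.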

Application scenarios: Wiping surfaces, connector insertion/extraction, collaborative carrying, and other force-controlled tasks.


Frontier Directions

Foundation Model-Driven Manipulation

  • RT-1 / RT-2 (Google): Robot manipulation policies trained on large-scale data
  • Octo (UC Berkeley): Open-source general manipulation foundation model
  • OpenVLA: Open-source vision-language-action model, generating robot actions directly from camera images and language instructions
  • Diffusion Policy: Diffusion models for action generation, handling multi-modal action distributions

Teleoperation and Data Collection

  • ALOHA / Mobile ALOHA: Low-cost dual-arm teleoperation systems in which the operator moves leader arms and the follower arms mirror the motion
  • UMI (Universal Manipulation Interface): Hand-held gripper for data collection, no robot needed for demonstrations
  • Open-TeleVision: VR headset teleoperation, supporting dexterous hands

References

  • Siciliano et al., Robotics: Modelling, Planning and Control, Springer
  • Lynch & Park, Modern Robotics: Mechanics, Planning, and Control
  • Fang et al., "AnyGrasp: Robust and Efficient Grasp Perception in Spatial and Temporal Domains", T-RO, 2023
  • Chi et al., "Diffusion Policy: Visuomotor Policy Learning via Action Diffusion", RSS, 2023
