
Depth Cameras

Overview

Depth cameras (RGB-D cameras) simultaneously provide color images and per-pixel depth information, making them core sensors for 3D perception in robotics. Compared to pure RGB cameras, depth cameras directly output three-dimensional data, greatly simplifying tasks such as obstacle avoidance, grasping, and SLAM.
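
To illustrate why per-pixel depth simplifies obstacle avoidance, here is a minimal sketch (pure NumPy, operating on an assumed synthetic depth image in meters, with 0 marking invalid pixels):

```python
import numpy as np

def nearest_obstacle(depth_m, stop_distance=0.5):
    """Return the nearest valid depth reading and whether it
    violates the stop distance. depth_m: HxW array in meters,
    with 0 marking invalid pixels."""
    valid = depth_m[depth_m > 0]
    if valid.size == 0:
        return None, False
    nearest = float(valid.min())
    return nearest, nearest < stop_distance

# Synthetic 480x640 scene: background at 3 m, obstacle patch at 0.4 m
depth = np.full((480, 640), 3.0, dtype=np.float32)
depth[200:280, 300:340] = 0.4
nearest, stop = nearest_obstacle(depth)
print(nearest, stop)  # nearest ≈ 0.4 m, stop = True
```

With an RGB-only camera, the same decision would require monocular depth estimation or multi-view geometry first.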

Depth Sensing Principles

Major Technical Approaches

| Technology | Principle | Accuracy | Range | Outdoor Performance | Representative Product |
|---|---|---|---|---|---|
| Active stereo | IR projection + dual IR cameras | High | 0.3-10m | Medium | RealSense D435i |
| Passive stereo | Dual RGB cameras + disparity matching | Medium | 0.5-20m | Good | ZED 2i |
| Structured light | Coded pattern projection + decoding | Highest | 0.2-4m | Poor | Azure Kinect |
| Time-of-Flight (ToF) | Measures light round-trip time | Medium | 0.1-5m | Medium | PMD Flexx2 |
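
As a worked example of the ToF row: an indirect-ToF sensor recovers distance from the phase shift \(\Delta\phi\) of an amplitude-modulated IR signal, \( d = \frac{c\,\Delta\phi}{4\pi f_{mod}} \). A sketch (the 20 MHz modulation frequency and phase value are illustrative, not from a specific sensor):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def itof_distance(phase_rad, f_mod_hz):
    """Distance from the measured phase shift of a modulated signal."""
    return C * phase_rad / (4 * math.pi * f_mod_hz)

def unambiguous_range(f_mod_hz):
    """Maximum distance before the phase wraps (range aliasing)."""
    return C / (2 * f_mod_hz)

f_mod = 20e6  # 20 MHz modulation (illustrative)
print(itof_distance(math.pi, f_mod))  # ≈ 3.75 m (half the max range)
print(unambiguous_range(f_mod))       # ≈ 7.49 m
```

The unambiguous-range formula also explains the short maximum range of ToF cameras in the table: higher modulation frequencies improve precision but wrap sooner.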

Depth Accuracy Model

Depth error is typically proportional to the square of the distance:

\[ \sigma_Z \propto \frac{Z^2}{fB} \]

Where \(Z\) is the distance, \(f\) is the focal length, and \(B\) is the baseline length.

Intel RealSense Series

RealSense D435i

| Parameter | Value |
|---|---|
| Depth Technology | Active infrared stereo vision |
| Depth Resolution | 1280 x 720 @90fps |
| Depth Range | 0.3 - 10m |
| RGB Resolution | 1920 x 1080 @30fps |
| FOV (Depth) | 87 x 58 degrees |
| FOV (RGB) | 69 x 42 degrees |
| Baseline | 50mm |
| IMU | BMI055 (accelerometer + gyroscope) |
| Interface | USB 3.0 Type-C |
| Size | 90 x 25 x 25mm |
| Power | ~2.5W |
| Price | ~$250 |

Suitable scenarios: a general-purpose choice for SLAM, obstacle avoidance, and robotic arm manipulation.
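
The pixel focal length implied by the specs above can be estimated from the horizontal FOV via \( f = \frac{W}{2\tan(\mathrm{HFOV}/2)} \); this value feeds the depth error model discussed later. A quick sketch (it assumes the 87-degree HFOV spans the full 1280 px width and ignores lens distortion):

```python
import math

def focal_px(width_px, hfov_deg):
    """Approximate pixel focal length from image width and horizontal FOV."""
    return width_px / (2 * math.tan(math.radians(hfov_deg) / 2))

f = focal_px(1280, 87)  # D435i depth stream
print(f"f ≈ {f:.0f} px")  # ≈ 674 px
```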

RealSense D455

| Parameter | D435i | D455 |
|---|---|---|
| Baseline | 50mm | 95mm |
| Depth Range | 0.3-10m | 0.4-20m |
| IMU | BMI055 | BMI085 (more accurate) |
| RGB FOV | 69x42 degrees | 90x65 degrees |
| Depth Accuracy @4m | ~2% | <1% |
| Size | 90x25x25mm | 124x26x29mm |
| Price | ~$250 | ~$350 |

D435i vs. D455 Selection

  • D435i: Compact size, suitable for space-constrained scenarios (e.g., small robots, end-effectors)
  • D455: Longer baseline = better long-range depth accuracy, suitable for mobile robot navigation

RealSense D405

| Parameter | Value |
|---|---|
| Depth Technology | Active infrared stereo vision |
| Optimal Depth Range | 0.07 - 0.7m |
| Depth Resolution | 1280 x 720 @90fps |
| Baseline | 18mm |
| Size | 42 x 42 x 23mm |
| Price | ~$200 |

Suitable scenarios: Close-range fine manipulation (e.g., eye-in-hand camera on robotic arms).

RealSense Programming

import pyrealsense2 as rs
import numpy as np
import cv2

# Configure pipeline
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Start
profile = pipeline.start(config)

# Get depth sensor intrinsics
depth_sensor = profile.get_device().first_depth_sensor()
depth_scale = depth_sensor.get_depth_scale()  # ~0.001 (1mm)

# Align depth to color
align = rs.align(rs.stream.color)

# Post-processing filters
decimation = rs.decimation_filter()
spatial = rs.spatial_filter()
temporal = rs.temporal_filter()
hole_filling = rs.hole_filling_filter()

try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frames = align.process(frames)

        depth_frame = aligned_frames.get_depth_frame()
        color_frame = aligned_frames.get_color_frame()
        if not depth_frame or not color_frame:
            continue

        # Apply post-processing. Decimation is skipped here: it halves
        # the resolution and would break pixel alignment with the color image
        depth_frame = spatial.process(depth_frame)
        depth_frame = temporal.process(depth_frame)
        depth_frame = hole_filling.process(depth_frame)
        # Filters return a generic rs.frame; recover the depth interface
        depth_frame = depth_frame.as_depth_frame()

        # Convert to numpy arrays
        depth_image = np.asanyarray(depth_frame.get_data())
        color_image = np.asanyarray(color_frame.get_data())

        # Depth to meters
        depth_meters = depth_image * depth_scale

        # Get 3D coordinates of the image-center pixel
        depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
        pixel = [depth_intrin.width // 2, depth_intrin.height // 2]
        depth = depth_frame.get_distance(pixel[0], pixel[1])
        point_3d = rs.rs2_deproject_pixel_to_point(
            depth_intrin, pixel, depth
        )
        print(f"3D Point: {point_3d}")

finally:
    pipeline.stop()

RealSense + ROS2

# Install ROS2 wrapper
sudo apt install ros-humble-realsense2-camera

# Launch camera node
ros2 launch realsense2_camera rs_launch.py \
    depth_module.profile:=640x480x30 \
    rgb_camera.profile:=640x480x30 \
    enable_gyro:=true \
    enable_accel:=true \
    align_depth.enable:=true \
    pointcloud.enable:=true
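
With `align_depth.enable:=true`, the wrapper publishes aligned depth images, typically encoded as `16UC1` in millimeters for RealSense. Below is a sketch of decoding such a raw `sensor_msgs/Image` buffer into meters without `cv_bridge`; the encoding and scale should be checked against your wrapper version, and the byte buffer here is synthetic:

```python
import numpy as np

def depth_msg_to_meters(data, width, height, depth_scale=0.001):
    """Interpret a sensor_msgs/Image '16UC1' data buffer as depth in meters.
    Assumes little-endian uint16 millimeter values (RealSense default)."""
    depth_mm = np.frombuffer(data, dtype=np.uint16).reshape(height, width)
    return depth_mm.astype(np.float32) * depth_scale

# Synthetic 2x2 "message": depths of 500, 1000, 1500, and 0 (invalid) mm
raw = np.array([500, 1000, 1500, 0], dtype=np.uint16).tobytes()
print(depth_msg_to_meters(raw, 2, 2))
```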

OAK-D Series (Luxonis)

OAK-D Lite

| Parameter | Value |
|---|---|
| Depth Technology | Passive stereo (dual mono cameras) |
| Depth Resolution | 640 x 400 @60fps |
| Depth Range | 0.35 - 10m |
| RGB | 4K @30fps (IMX214) |
| AI Acceleration | Intel Myriad X VPU (4 TOPS) |
| Interface | USB 3.0 Type-C |
| Highlight | On-chip neural network inference |
| Price | ~$150 |

Core advantage: a built-in AI accelerator runs object detection and other models on the camera itself, offloading the host processor.

import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Configure RGB camera
cam_rgb = pipeline.create(dai.node.ColorCamera)
cam_rgb.setPreviewSize(300, 300)
cam_rgb.setInterleaved(False)

# Configure neural network (MobileNet-SSD)
nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
nn.setBlobPath("mobilenet-ssd.blob")
nn.setConfidenceThreshold(0.5)

# Connect
cam_rgb.preview.link(nn.input)

# Output
xout_nn = pipeline.create(dai.node.XLinkOut)
xout_nn.setStreamName("nn")
nn.out.link(xout_nn.input)

with dai.Device(pipeline) as device:
    q_nn = device.getOutputQueue("nn", maxSize=4, blocking=False)
    while True:
        detections = q_nn.get().detections
        for det in detections:
            # det.label is an integer class index into the model's label map
            print(f"Label: {det.label}, Confidence: {det.confidence:.2f}")

ZED 2i (Stereolabs)

Specifications

| Parameter | Value |
|---|---|
| Depth Technology | Passive stereo |
| Resolution | 2x 2208 x 1242 (4.2MP each) |
| Depth Range | 0.3 - 20m |
| Baseline | 120mm |
| FOV | 110 (H) x 70 (V) degrees |
| IMU | Accelerometer + gyroscope + magnetometer |
| Interface | USB 3.0 |
| Highlights | Built-in SLAM, object detection, body tracking |
| Waterproof | IP66 |
| Price | ~$450 |

Advantages:

  • Large baseline (120mm) provides better long-range depth accuracy
  • IP66 waterproof and dustproof, suitable for outdoor robots
  • SDK includes built-in visual SLAM and AI features
  • Passive stereo works normally outdoors in sunlight

import pyzed.sl as sl

zed = sl.Camera()

# Configure
init_params = sl.InitParameters()
init_params.camera_resolution = sl.RESOLUTION.HD720
init_params.camera_fps = 60
init_params.depth_mode = sl.DEPTH_MODE.ULTRA
init_params.coordinate_units = sl.UNIT.METER

status = zed.open(init_params)
if status != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError(f"Failed to open ZED camera: {status}")

# Enable positional tracking
tracking_params = sl.PositionalTrackingParameters()
zed.enable_positional_tracking(tracking_params)

image = sl.Mat()
depth = sl.Mat()
pose = sl.Pose()

while True:
    if zed.grab() == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_image(image, sl.VIEW.LEFT)
        zed.retrieve_measure(depth, sl.MEASURE.DEPTH)

        # Get pose (SLAM)
        state = zed.get_position(pose)
        translation = pose.get_translation()
        print(f"Position: {translation.get()}")

Orbbec Series

| Model | Depth Technology | Range | Resolution | Price | Highlight |
|---|---|---|---|---|---|
| Femto Bolt | iToF | 0.25-5.5m | 640x576 @30fps | ~$300 | Azure Kinect alternative |
| Femto Mega | iToF | 0.25-5.5m | 640x576 @30fps | ~$450 | Onboard compute |
| Astra 2 | Structured light | 0.3-6m | 640x400 @30fps | ~$100 | Low cost |
| Gemini 2 | Stereo | 0.15-10m | 1280x800 @30fps | ~$200 | Suitable for robotic arms |

Depth Camera Comparison

| Feature | RealSense D435i | RealSense D455 | OAK-D Lite | ZED 2i | Orbbec Femto |
|---|---|---|---|---|---|
| Technology | Active stereo | Active stereo | Passive stereo | Passive stereo | iToF |
| Depth Range | 0.3-10m | 0.4-20m | 0.35-10m | 0.3-20m | 0.25-5.5m |
| Accuracy @1m | ~2mm | ~1.5mm | ~5mm | ~3mm | ~2mm |
| Outdoor Performance | Medium | Medium | Medium | Good | Medium |
| IMU | Yes | Yes (better) | Yes | Yes | Yes |
| AI Acceleration | No | No | 4 TOPS | No | No |
| ROS2 Support | Official | Official | Official | Official | Official |
| Waterproof | No | No | No | IP66 | No |
| Price | ~$250 | ~$350 | ~$150 | ~$450 | ~$300 |

Depth Accuracy vs. Distance

Stereo Vision Depth Error

\[ \sigma_Z = \frac{Z^2}{fB} \sigma_d \]

Where \(\sigma_d\) is the disparity matching error (typically 0.5-1 pixel).

| Distance | D435i (B=50mm) | D455 (B=95mm) | ZED 2i (B=120mm) |
|---|---|---|---|
| 1m | ~2mm | ~1mm | ~0.8mm |
| 3m | ~18mm | ~10mm | ~7mm |
| 5m | ~50mm | ~26mm | ~21mm |
| 10m | ~200mm | ~105mm | ~83mm |
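
Values of this order can be reproduced with the error model above. The sketch below assumes f ≈ 650 px and σ_d ≈ 0.1 px; both are assumed values, and σ_d well below the 0.5-1 px whole-pixel figure implies subpixel disparity refinement. Exact numbers depend on resolution, optics, and the matcher:

```python
def stereo_depth_error(z_m, baseline_m, f_px=650.0, sigma_d_px=0.1):
    """sigma_Z = Z^2 / (f * B) * sigma_d, in meters."""
    return z_m ** 2 / (f_px * baseline_m) * sigma_d_px

# Error in mm at 1/3/5/10 m for each camera's baseline
for name, baseline in [("D435i", 0.050), ("D455", 0.095), ("ZED 2i", 0.120)]:
    errs = [stereo_depth_error(z, baseline) * 1000 for z in (1, 3, 5, 10)]
    print(f"{name:7s} " + "  ".join(f"{e:6.1f}mm" for e in errs))
```

Note the two dependencies the model makes explicit: doubling the distance quadruples the error, while doubling the baseline halves it.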

How to Improve Long-Range Depth Accuracy

  1. Increase baseline (but increases near-range blind zone)
  2. Increase resolution (reduces \(\sigma_d\))
  3. Fuse with other sensors (e.g., LiDAR)

See: Multi-Sensor Fusion for depth camera fusion with other sensors.

Depth Map Post-Processing

Common Filters

import cv2
import numpy as np

def postprocess_depth(depth_image):
    """Depth map post-processing pipeline.

    Expects a depth image where 0 marks invalid pixels (e.g. raw
    uint16 depth). Returns float32 depth with holes filled.
    """
    # Work in float32 and keep 0 as the invalid marker: NaN would
    # propagate through the OpenCV filters and corrupt neighbors
    depth = depth_image.astype(np.float32)

    # 1. Median filter (remove speckle outliers)
    depth_filtered = cv2.medianBlur(depth, 5)

    # 2. Bilateral filter (edge-preserving smoothing)
    depth_smooth = cv2.bilateralFilter(
        depth_filtered, d=9, sigmaColor=75, sigmaSpace=75
    )

    # 3. Hole filling: replace each invalid pixel with the value of
    #    its nearest valid neighbor
    mask = depth_smooth == 0
    if mask.any():
        from scipy.ndimage import distance_transform_edt
        indices = distance_transform_edt(
            mask, return_distances=False, return_indices=True
        )
        depth_smooth = depth_smooth[tuple(indices)]

    return depth_smooth

Depth Map to Point Cloud

import numpy as np
import open3d as o3d

def depth_to_pointcloud(depth_image, color_image, intrinsics):
    """Convert depth map to point cloud"""
    fx, fy, cx, cy = intrinsics

    h, w = depth_image.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    z = depth_image
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = color_image.reshape(-1, 3) / 255.0

    # Filter invalid points
    valid = z.reshape(-1) > 0
    points = points[valid]
    colors = colors[valid]

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors[:, ::-1])  # BGR->RGB

    return pcd
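
The pinhole math in `depth_to_pointcloud` can be sanity-checked with a round trip: project a known 3D point into the image, then back-project it with the same intrinsics. The intrinsic values here are illustrative:

```python
import numpy as np

fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0  # assumed pinhole intrinsics

def project(p):
    """3D camera-frame point -> pixel (u, v)."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)

def backproject(u, v, z):
    """Pixel + depth -> 3D camera-frame point (same model as above)."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

p = np.array([0.2, -0.1, 1.5])
u, v = project(p)
p_back = backproject(u, v, p[2])
print(np.allclose(p, p_back))  # True
```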

Summary

  1. Active stereo (RealSense) offers the best overall indoor performance
  2. Passive stereo (ZED) performs better outdoors and at long range
  3. Depth error grows with the square of the distance; larger baselines improve long-range accuracy
  4. D435i is the most versatile choice; D455 is for better long-range accuracy
  5. OAK-D's on-chip AI acceleration is a unique advantage
  6. Post-processing filters can significantly improve depth map quality
