
Depth Cameras

Overview

Depth cameras (RGB-D cameras) simultaneously provide color images and per-pixel depth information, making them core sensors for 3D perception in robotics. Compared to pure RGB cameras, depth cameras directly output three-dimensional data, greatly simplifying tasks such as obstacle avoidance, grasping, and SLAM.
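
To illustrate why per-pixel depth simplifies obstacle avoidance, here is a minimal sketch (pure NumPy, operating on an assumed synthetic depth image in meters, with 0 marking invalid pixels):

```python
import numpy as np

def nearest_obstacle(depth_m, stop_distance=0.5):
    """Return the nearest valid depth reading and whether it
    violates the stop distance. depth_m: HxW array in meters,
    with 0 marking invalid pixels."""
    valid = depth_m[depth_m > 0]
    if valid.size == 0:
        return None, False
    nearest = float(valid.min())
    return nearest, nearest < stop_distance

# Synthetic 480x640 scene: background at 3 m, obstacle patch at 0.4 m
depth = np.full((480, 640), 3.0, dtype=np.float32)
depth[200:280, 300:340] = 0.4
nearest, stop = nearest_obstacle(depth)
print(nearest, stop)  # nearest ≈ 0.4 m, stop = True
```

With an RGB-only camera, the same decision would require monocular depth estimation or multi-view geometry first.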

Depth Sensing Principles

Major Technical Approaches

| Technology | Principle | Accuracy | Range | Outdoor Performance | Representative Product |
|---|---|---|---|---|---|
| Active stereo | IR projection + dual IR cameras | High | 0.3-10m | Medium | RealSense D435i |
| Passive stereo | Dual RGB cameras + disparity matching | Medium | 0.5-20m | Good | ZED 2i |
| Structured light | Coded pattern projection + decoding | Highest | 0.2-4m | Poor | Azure Kinect |
| Time-of-Flight (ToF) | Measures light round-trip time | Medium | 0.1-5m | Medium | PMD Flexx2 |
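
As a worked example of the ToF row: an indirect-ToF sensor recovers distance from the phase shift \(\Delta\phi\) of an amplitude-modulated IR signal, \( d = \frac{c\,\Delta\phi}{4\pi f_{mod}} \). A sketch (the 20 MHz modulation frequency and phase value are illustrative, not from a specific sensor):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def itof_distance(phase_rad, f_mod_hz):
    """Distance from the measured phase shift of a modulated signal."""
    return C * phase_rad / (4 * math.pi * f_mod_hz)

def unambiguous_range(f_mod_hz):
    """Maximum distance before the phase wraps (range aliasing)."""
    return C / (2 * f_mod_hz)

f_mod = 20e6  # 20 MHz modulation (illustrative)
print(itof_distance(math.pi, f_mod))  # ≈ 3.75 m (half the max range)
print(unambiguous_range(f_mod))       # ≈ 7.49 m
```

The unambiguous-range formula also explains the short maximum range of ToF cameras in the table: higher modulation frequencies improve precision but wrap sooner.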

Depth Accuracy Model

Depth error is typically proportional to the square of the distance:

\[ \sigma_Z \propto \frac{Z^2}{fB} \]

Where \(Z\) is the distance, \(f\) is the focal length, and \(B\) is the baseline length.

Intel RealSense Series

RealSense D435i

| Parameter | Value |
|---|---|
| Depth Technology | Active infrared stereo vision |
| Depth Resolution | 1280 x 720 @90fps |
| Depth Range | 0.3 - 10m |
| RGB Resolution | 1920 x 1080 @30fps |
| FOV (Depth) | 87 x 58 degrees |
| FOV (RGB) | 69 x 42 degrees |
| Baseline | 50mm |
| IMU | BMI055 (accelerometer + gyroscope) |
| Interface | USB 3.0 Type-C |
| Size | 90 x 25 x 25mm |
| Power | ~2.5W |
| Price | ~$250 |

Suitable scenarios: a general-purpose choice for SLAM, obstacle avoidance, and robotic arm manipulation.
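
The pixel focal length implied by the specs above can be estimated from the horizontal FOV via \( f = \frac{W}{2\tan(\mathrm{HFOV}/2)} \); this value feeds the depth error model discussed later. A quick sketch (it assumes the 87-degree HFOV spans the full 1280 px width and ignores lens distortion):

```python
import math

def focal_px(width_px, hfov_deg):
    """Approximate pixel focal length from image width and horizontal FOV."""
    return width_px / (2 * math.tan(math.radians(hfov_deg) / 2))

f = focal_px(1280, 87)  # D435i depth stream
print(f"f ≈ {f:.0f} px")  # ≈ 674 px
```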

RealSense D455

| Parameter | D435i | D455 |
|---|---|---|
| Baseline | 50mm | 95mm |
| Depth Range | 0.3-10m | 0.4-20m |
| IMU | BMI055 | BMI085 (more accurate) |
| RGB FOV | 69x42 degrees | 90x65 degrees |
| Depth Accuracy @4m | ~2% | <1% |
| Size | 90x25x25mm | 124x26x29mm |
| Price | ~$250 | ~$350 |

D435i vs. D455 Selection

  • D435i: Compact size, suitable for space-constrained scenarios (e.g., small robots, end-effectors)
  • D455: Longer baseline = better long-range depth accuracy, suitable for mobile robot navigation

RealSense D405

| Parameter | Value |
|---|---|
| Depth Technology | Active infrared stereo vision |
| Optimal Depth Range | 0.07 - 0.7m |
| Depth Resolution | 1280 x 720 @90fps |
| Baseline | 18mm |
| Size | 42 x 42 x 23mm |
| Price | ~$200 |

Suitable scenarios: Close-range fine manipulation (e.g., eye-in-hand camera on robotic arms).

RealSense Programming

import pyrealsense2 as rs
import numpy as np
import cv2

# Configure pipeline
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Start
profile = pipeline.start(config)

# Get depth sensor intrinsics
depth_sensor = profile.get_device().first_depth_sensor()
depth_scale = depth_sensor.get_depth_scale()  # ~0.001 (1mm)

# Align depth to color
align = rs.align(rs.stream.color)

# Post-processing filters
decimation = rs.decimation_filter()
spatial = rs.spatial_filter()
temporal = rs.temporal_filter()
hole_filling = rs.hole_filling_filter()

try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frames = align.process(frames)

        depth_frame = aligned_frames.get_depth_frame()
        color_frame = aligned_frames.get_color_frame()
        if not depth_frame or not color_frame:
            continue

        # Apply post-processing. Decimation is skipped here: it halves
        # the resolution and would break pixel alignment with the color image
        depth_frame = spatial.process(depth_frame)
        depth_frame = temporal.process(depth_frame)
        depth_frame = hole_filling.process(depth_frame)
        # Filters return a generic rs.frame; recover the depth interface
        depth_frame = depth_frame.as_depth_frame()

        # Convert to numpy arrays
        depth_image = np.asanyarray(depth_frame.get_data())
        color_image = np.asanyarray(color_frame.get_data())

        # Depth to meters
        depth_meters = depth_image * depth_scale

        # Get 3D coordinates of the image-center pixel
        depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
        pixel = [depth_intrin.width // 2, depth_intrin.height // 2]
        depth = depth_frame.get_distance(pixel[0], pixel[1])
        point_3d = rs.rs2_deproject_pixel_to_point(
            depth_intrin, pixel, depth
        )
        print(f"3D Point: {point_3d}")

finally:
    pipeline.stop()

RealSense + ROS2

# Install ROS2 wrapper
sudo apt install ros-humble-realsense2-camera

# Launch camera node
ros2 launch realsense2_camera rs_launch.py \
    depth_module.profile:=640x480x30 \
    rgb_camera.profile:=640x480x30 \
    enable_gyro:=true \
    enable_accel:=true \
    align_depth.enable:=true \
    pointcloud.enable:=true
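
With `align_depth.enable:=true`, the wrapper publishes aligned depth images, typically encoded as `16UC1` in millimeters for RealSense. Below is a sketch of decoding such a raw `sensor_msgs/Image` buffer into meters without `cv_bridge`; the encoding and scale should be checked against your wrapper version, and the byte buffer here is synthetic:

```python
import numpy as np

def depth_msg_to_meters(data, width, height, depth_scale=0.001):
    """Interpret a sensor_msgs/Image '16UC1' data buffer as depth in meters.
    Assumes little-endian uint16 millimeter values (RealSense default)."""
    depth_mm = np.frombuffer(data, dtype=np.uint16).reshape(height, width)
    return depth_mm.astype(np.float32) * depth_scale

# Synthetic 2x2 "message": depths of 500, 1000, 1500, and 0 (invalid) mm
raw = np.array([500, 1000, 1500, 0], dtype=np.uint16).tobytes()
print(depth_msg_to_meters(raw, 2, 2))
```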

OAK-D Series (Luxonis)

OAK-D Lite

| Parameter | Value |
|---|---|
| Depth Technology | Passive stereo (dual mono cameras) |
| Depth Resolution | 640 x 400 @60fps |
| Depth Range | 0.35 - 10m |
| RGB | 4K @30fps (IMX214) |
| AI Acceleration | Intel Myriad X VPU (4 TOPS) |
| Interface | USB 3.0 Type-C |
| Highlight | On-chip neural network inference |
| Price | ~$150 |

Core advantage: a built-in AI accelerator runs object detection and other models on the camera itself, offloading the host processor.

import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Configure RGB camera
cam_rgb = pipeline.create(dai.node.ColorCamera)
cam_rgb.setPreviewSize(300, 300)
cam_rgb.setInterleaved(False)

# Configure neural network (MobileNet-SSD)
nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
nn.setBlobPath("mobilenet-ssd.blob")
nn.setConfidenceThreshold(0.5)

# Connect
cam_rgb.preview.link(nn.input)

# Output
xout_nn = pipeline.create(dai.node.XLinkOut)
xout_nn.setStreamName("nn")
nn.out.link(xout_nn.input)

with dai.Device(pipeline) as device:
    q_nn = device.getOutputQueue("nn", maxSize=4, blocking=False)
    while True:
        detections = q_nn.get().detections
        for det in detections:
            # det.label is an integer class index into the model's label map
            print(f"Label: {det.label}, Confidence: {det.confidence:.2f}")

ZED 2i (Stereolabs)

Specifications

| Parameter | Value |
|---|---|
| Depth Technology | Passive stereo |
| Resolution | 2x 2208 x 1242 (4.2MP each) |
| Depth Range | 0.3 - 20m |
| Baseline | 120mm |
| FOV | 110 (H) x 70 (V) degrees |
| IMU | Accelerometer + gyroscope + magnetometer |
| Interface | USB 3.0 |
| Highlights | Built-in SLAM, object detection, body tracking |
| Waterproof | IP66 |
| Price | ~$450 |

Advantages:

  • Large baseline (120mm) provides better long-range depth accuracy
  • IP66 waterproof and dustproof, suitable for outdoor robots
  • SDK includes built-in visual SLAM and AI features
  • Passive stereo works normally outdoors in sunlight

import pyzed.sl as sl

zed = sl.Camera()

# Configure
init_params = sl.InitParameters()
init_params.camera_resolution = sl.RESOLUTION.HD720
init_params.camera_fps = 60
init_params.depth_mode = sl.DEPTH_MODE.ULTRA
init_params.coordinate_units = sl.UNIT.METER

status = zed.open(init_params)
if status != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError(f"Failed to open ZED camera: {status}")

# Enable positional tracking
tracking_params = sl.PositionalTrackingParameters()
zed.enable_positional_tracking(tracking_params)

image = sl.Mat()
depth = sl.Mat()
pose = sl.Pose()

while True:
    if zed.grab() == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_image(image, sl.VIEW.LEFT)
        zed.retrieve_measure(depth, sl.MEASURE.DEPTH)

        # Get pose (SLAM)
        state = zed.get_position(pose)
        translation = pose.get_translation()
        print(f"Position: {translation.get()}")

Orbbec Series

| Model | Depth Technology | Range | Resolution | Price | Highlight |
|---|---|---|---|---|---|
| Femto Bolt | iToF | 0.25-5.5m | 640x576 @30fps | ~$300 | Azure Kinect alternative |
| Femto Mega | iToF | 0.25-5.5m | 640x576 @30fps | ~$450 | Onboard compute |
| Astra 2 | Structured light | 0.3-6m | 640x400 @30fps | ~$100 | Low cost |
| Gemini 2 | Stereo | 0.15-10m | 1280x800 @30fps | ~$200 | Suitable for robotic arms |

Depth Camera Comparison

| Feature | RealSense D435i | RealSense D455 | OAK-D Lite | ZED 2i | Orbbec Femto |
|---|---|---|---|---|---|
| Technology | Active stereo | Active stereo | Passive stereo | Passive stereo | iToF |
| Depth Range | 0.3-10m | 0.4-20m | 0.35-10m | 0.3-20m | 0.25-5.5m |
| Accuracy @1m | ~2mm | ~1.5mm | ~5mm | ~3mm | ~2mm |
| Outdoor Performance | Medium | Medium | Medium | Good | Medium |
| IMU | Yes | Yes (better) | Yes | Yes | Yes |
| AI Acceleration | No | No | 4 TOPS | No | No |
| ROS2 Support | Official | Official | Official | Official | Official |
| Waterproof | No | No | No | IP66 | No |
| Price | ~$250 | ~$350 | ~$150 | ~$450 | ~$300 |

Depth Accuracy vs. Distance

Stereo Vision Depth Error

\[ \sigma_Z = \frac{Z^2}{fB} \sigma_d \]

Where \(\sigma_d\) is the disparity matching error (typically 0.5-1 pixel).

| Distance | D435i (B=50mm) | D455 (B=95mm) | ZED 2i (B=120mm) |
|---|---|---|---|
| 1m | ~2mm | ~1mm | ~0.8mm |
| 3m | ~18mm | ~10mm | ~7mm |
| 5m | ~50mm | ~26mm | ~21mm |
| 10m | ~200mm | ~105mm | ~83mm |
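
Values of this order can be reproduced with the error model above. The sketch below assumes f ≈ 650 px and σ_d ≈ 0.1 px; both are assumed values, and σ_d well below the 0.5-1 px whole-pixel figure implies subpixel disparity refinement. Exact numbers depend on resolution, optics, and the matcher:

```python
def stereo_depth_error(z_m, baseline_m, f_px=650.0, sigma_d_px=0.1):
    """sigma_Z = Z^2 / (f * B) * sigma_d, in meters."""
    return z_m ** 2 / (f_px * baseline_m) * sigma_d_px

# Error in mm at 1/3/5/10 m for each camera's baseline
for name, baseline in [("D435i", 0.050), ("D455", 0.095), ("ZED 2i", 0.120)]:
    errs = [stereo_depth_error(z, baseline) * 1000 for z in (1, 3, 5, 10)]
    print(f"{name:7s} " + "  ".join(f"{e:6.1f}mm" for e in errs))
```

Note the two dependencies the model makes explicit: doubling the distance quadruples the error, while doubling the baseline halves it.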

How to Improve Long-Range Depth Accuracy

  1. Increase baseline (but increases near-range blind zone)
  2. Increase resolution (reduces \(\sigma_d\))
  3. Fuse with other sensors (e.g., LiDAR)

See: Multi-Sensor Fusion for depth camera fusion with other sensors.

Depth Map Post-Processing

Common Filters

import cv2
import numpy as np

def postprocess_depth(depth_image):
    """Depth map post-processing pipeline.

    Expects a depth image where 0 marks invalid pixels (e.g. raw
    uint16 depth). Returns float32 depth with holes filled.
    """
    # Work in float32 and keep 0 as the invalid marker: NaN would
    # propagate through the OpenCV filters and corrupt neighbors
    depth = depth_image.astype(np.float32)

    # 1. Median filter (remove speckle outliers)
    depth_filtered = cv2.medianBlur(depth, 5)

    # 2. Bilateral filter (edge-preserving smoothing)
    depth_smooth = cv2.bilateralFilter(
        depth_filtered, d=9, sigmaColor=75, sigmaSpace=75
    )

    # 3. Hole filling: replace each invalid pixel with the value of
    #    its nearest valid neighbor
    mask = depth_smooth == 0
    if mask.any():
        from scipy.ndimage import distance_transform_edt
        indices = distance_transform_edt(
            mask, return_distances=False, return_indices=True
        )
        depth_smooth = depth_smooth[tuple(indices)]

    return depth_smooth

Depth Map to Point Cloud

import numpy as np
import open3d as o3d

def depth_to_pointcloud(depth_image, color_image, intrinsics):
    """Convert depth map to point cloud"""
    fx, fy, cx, cy = intrinsics

    h, w = depth_image.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    z = depth_image
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = color_image.reshape(-1, 3) / 255.0

    # Filter invalid points
    valid = z.reshape(-1) > 0
    points = points[valid]
    colors = colors[valid]

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors[:, ::-1])  # BGR->RGB

    return pcd
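
The pinhole math in `depth_to_pointcloud` can be sanity-checked with a round trip: project a known 3D point into the image, then back-project it with the same intrinsics. The intrinsic values here are illustrative:

```python
import numpy as np

fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0  # assumed pinhole intrinsics

def project(p):
    """3D camera-frame point -> pixel (u, v)."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)

def backproject(u, v, z):
    """Pixel + depth -> 3D camera-frame point (same model as above)."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

p = np.array([0.2, -0.1, 1.5])
u, v = project(p)
p_back = backproject(u, v, p[2])
print(np.allclose(p, p_back))  # True
```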

Summary

  1. Active stereo (RealSense) offers the best overall indoor performance
  2. Passive stereo (ZED) performs better outdoors and at long range
  3. Depth error grows with the square of the distance; larger baselines improve long-range accuracy
  4. D435i is the most versatile choice; D455 is for better long-range accuracy
  5. OAK-D's on-chip AI acceleration is a unique advantage
  6. Post-processing filters can significantly improve depth map quality
