Depth Cameras
Overview
Depth cameras (RGB-D cameras) simultaneously provide color images and per-pixel depth information, making them core sensors for 3D perception in robotics. Compared to pure RGB cameras, depth cameras directly output three-dimensional data, greatly simplifying tasks such as obstacle avoidance, grasping, and SLAM.
Depth Sensing Principles
Major Technical Approaches
| Technology | Principle | Accuracy | Range | Outdoor Performance | Representative Product |
|---|---|---|---|---|---|
| Active Stereo | IR projection + dual IR cameras | High | 0.3-10m | Medium | RealSense D435i |
| Passive Stereo | Dual RGB cameras + disparity matching | Medium | 0.5-20m | Good | ZED 2i |
| Structured Light | Coded pattern projection + decoding | Highest | 0.2-4m | Poor | Orbbec Astra 2 |
| Time-of-Flight (ToF) | Measure light flight time | Medium | 0.1-5m | Medium | PMD Flexx2 |
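To make the stereo rows in the table concrete, the sketch below computes depth from a rectified image pair with OpenCV's block matcher. The file names and calibration values (`fx`, `baseline_m`) are placeholders for illustration, not values from any particular camera.

import cv2
import numpy as np

# Hypothetical rectified stereo pair and calibration (placeholder values)
left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)
fx = 640.0          # focal length in pixels (assumed)
baseline_m = 0.05   # baseline in meters (assumed)

# Block-matching disparity (passive stereo)
matcher = cv2.StereoBM_create(numDisparities=96, blockSize=15)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

# Depth from disparity: Z = f * B / d
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = fx * baseline_m / disparity[valid]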
Depth Accuracy Model
For stereo-based depth cameras, the depth error typically grows with the square of the distance:

\[
\sigma_Z = \frac{Z^2}{f \cdot B}\,\sigma_d
\]

Where \(Z\) is the distance, \(f\) is the focal length (in pixels), \(B\) is the baseline length, and \(\sigma_d\) is the disparity matching error.
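For example, assuming \(B = 50\) mm and \(f \approx 640\) px (roughly a D435i at 1280x720) with a sub-pixel disparity error of \(\sigma_d \approx 0.08\) px, the model gives \(\sigma_Z \approx \frac{1^2 \times 0.08}{640 \times 0.05} \approx 2.5\) mm at 1 m and about four times that (\(\approx 10\) mm) at 2 m. These numbers are illustrative assumptions, not datasheet values.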
Intel RealSense Series
RealSense D435i
| Parameter | Value |
|---|---|
| Depth Technology | Active infrared stereo vision |
| Depth Resolution | 1280 x 720 @90fps |
| Depth Range | 0.3 - 10m |
| RGB Resolution | 1920 x 1080 @30fps |
| FOV (Depth) | 87 x 58 degrees |
| FOV (RGB) | 69 x 42 degrees |
| Baseline | 50mm |
| IMU | BMI055 (accelerometer + gyroscope) |
| Interface | USB 3.0 Type-C |
| Size | 90 x 25 x 25mm |
| Power | ~2.5W |
| Price | ~$250 |
Suitable scenarios: General choice for SLAM, obstacle avoidance, and robotic arm manipulation.
RealSense D455
| Parameter | D435i | D455 |
|---|---|---|
| Baseline | 50mm | 95mm |
| Depth Range | 0.3-10m | 0.4-20m |
| IMU | BMI055 | BMI085 (more accurate) |
| RGB FOV | 69x42 degrees | 90x65 degrees |
| Depth Accuracy @4m | ~2% | <1% |
| Size | 90x25x25mm | 124x26x29mm |
| Price | ~$250 | ~$350 |
D435i vs. D455 Selection
- D435i: Compact size, suitable for space-constrained scenarios (e.g., small robots, end-effectors)
- D455: Longer baseline = better long-range depth accuracy, suitable for mobile robot navigation
RealSense D405
| Parameter | Value |
|---|---|
| Depth Technology | Active infrared stereo vision |
| Optimal Depth Range | 0.07 - 0.7m |
| Depth Resolution | 1280 x 720 @90fps |
| Baseline | 18mm |
| Size | 42 x 42 x 23mm |
| Price | ~$200 |
Suitable scenarios: Close-range fine manipulation (e.g., eye-in-hand camera on robotic arms).
RealSense Programming
import pyrealsense2 as rs
import numpy as np
import cv2

# Configure pipeline
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Start
profile = pipeline.start(config)

# Get depth scale (raw units -> meters)
depth_sensor = profile.get_device().first_depth_sensor()
depth_scale = depth_sensor.get_depth_scale()  # ~0.001 (1mm)

# Align depth to color
align = rs.align(rs.stream.color)

# Post-processing filters
decimation = rs.decimation_filter()
spatial = rs.spatial_filter()
temporal = rs.temporal_filter()
hole_filling = rs.hole_filling_filter()

try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frames = align.process(frames)
        depth_frame = aligned_frames.get_depth_frame()
        color_frame = aligned_frames.get_color_frame()
        if not depth_frame or not color_frame:
            continue

        # Apply post-processing (note: decimation halves the resolution, so the
        # filtered depth no longer matches the color image pixel-for-pixel)
        depth_frame = decimation.process(depth_frame)
        depth_frame = spatial.process(depth_frame)
        depth_frame = temporal.process(depth_frame)
        depth_frame = hole_filling.process(depth_frame).as_depth_frame()

        # Convert to numpy arrays
        depth_image = np.asanyarray(depth_frame.get_data())
        color_image = np.asanyarray(color_frame.get_data())

        # Depth to meters
        depth_meters = depth_image * depth_scale

        # Get 3D coordinates of the center pixel (computed from the
        # post-processed frame's intrinsics, since its resolution changed)
        depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
        pixel = [depth_intrin.width // 2, depth_intrin.height // 2]
        depth = depth_frame.get_distance(pixel[0], pixel[1])
        point_3d = rs.rs2_deproject_pixel_to_point(depth_intrin, pixel, depth)
        print(f"3D Point: {point_3d}")
finally:
    pipeline.stop()
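For quick visual inspection of the stream above, the 16-bit depth image can be mapped to a color image. The `alpha=0.03` below is just an arbitrary display gain, not a RealSense parameter.

import cv2

def colorize_depth(depth_image, alpha=0.03):
    """Map a 16-bit depth image to an 8-bit JET colormap for display."""
    return cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=alpha), cv2.COLORMAP_JET)

# Inside the capture loop above:
#   cv2.imshow("depth", colorize_depth(depth_image))
#   cv2.waitKey(1)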
RealSense + ROS2
# Install ROS2 wrapper
sudo apt install ros-humble-realsense2-camera
# Launch camera node
ros2 launch realsense2_camera rs_launch.py \
depth_module.profile:=640x480x30 \
rgb_camera.profile:=640x480x30 \
enable_gyro:=true \
enable_accel:=true \
align_depth.enable:=true \
pointcloud.enable:=true
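A minimal rclpy node for consuming the aligned depth stream might look like the sketch below. The topic name is an assumption based on a typical realsense2_camera setup; it varies between wrapper versions, so check `ros2 topic list` first.

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

class DepthListener(Node):
    def __init__(self):
        super().__init__("depth_listener")
        self.bridge = CvBridge()
        # Topic name assumed; verify with `ros2 topic list`
        self.sub = self.create_subscription(
            Image, "/camera/aligned_depth_to_color/image_raw", self.callback, 10
        )

    def callback(self, msg):
        # 16UC1 depth image in millimeters
        depth = self.bridge.imgmsg_to_cv2(msg, desired_encoding="16UC1")
        center_mm = depth[depth.shape[0] // 2, depth.shape[1] // 2]
        self.get_logger().info(f"Center depth: {center_mm} mm")

def main():
    rclpy.init()
    rclpy.spin(DepthListener())
    rclpy.shutdown()

if __name__ == "__main__":
    main()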
OAK-D Series (Luxonis)
OAK-D Lite
| Parameter | Value |
|---|---|
| Depth Technology | Passive stereo (dual mono cameras) |
| Depth Resolution | 640 x 400 @60fps |
| Depth Range | 0.35 - 10m |
| RGB | 4K @30fps (IMX214) |
| AI Acceleration | Intel Myriad X VPU (4 TOPS) |
| Interface | USB 3.0 Type-C |
| Highlight | On-chip neural network inference |
| Price | ~$150 |
Core advantage: a built-in AI accelerator runs object detection and other neural networks on the camera itself, reducing the load on the host computer.
import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Configure RGB camera (MobileNet-SSD expects 300x300 planar input)
cam_rgb = pipeline.create(dai.node.ColorCamera)
cam_rgb.setPreviewSize(300, 300)
cam_rgb.setInterleaved(False)

# Configure neural network (MobileNet-SSD)
nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
nn.setBlobPath("mobilenet-ssd.blob")  # path to a compiled model blob
nn.setConfidenceThreshold(0.5)

# Connect camera preview to the network input
cam_rgb.preview.link(nn.input)

# Output detections to the host
xout_nn = pipeline.create(dai.node.XLinkOut)
xout_nn.setStreamName("nn")
nn.out.link(xout_nn.input)

with dai.Device(pipeline) as device:
    q_nn = device.getOutputQueue("nn", maxSize=4, blocking=False)
    while True:
        detections = q_nn.get().detections
        for det in detections:
            print(f"Label: {det.label}, Confidence: {det.confidence:.2f}")
ZED 2i (Stereolabs)
Specifications
| Parameter | Value |
|---|---|
| Depth Technology | Passive stereo |
| Resolution | 2x 2208x1242 (4.2MP each) |
| Depth Range | 0.3 - 20m |
| Baseline | 120mm |
| FOV | 110 (H) x 70 (V) degrees |
| IMU | Accelerometer + gyroscope + magnetometer |
| Interface | USB 3.0 |
| Highlights | Built-in SLAM, object detection, body tracking |
| Waterproof | IP66 |
| Price | ~$450 |
Advantages:
- Large baseline (120mm) provides better long-range depth accuracy
- IP66 waterproof and dustproof, suitable for outdoor robots
- SDK includes built-in visual SLAM and AI features
- Passive stereo works normally outdoors in sunlight
import pyzed.sl as sl

zed = sl.Camera()

# Configure
init_params = sl.InitParameters()
init_params.camera_resolution = sl.RESOLUTION.HD720
init_params.camera_fps = 60
init_params.depth_mode = sl.DEPTH_MODE.ULTRA
init_params.coordinate_units = sl.UNIT.METER

if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("Failed to open ZED camera")

# Enable positional tracking
tracking_params = sl.PositionalTrackingParameters()
zed.enable_positional_tracking(tracking_params)

image = sl.Mat()
depth = sl.Mat()
pose = sl.Pose()

while True:
    if zed.grab() == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_image(image, sl.VIEW.LEFT)
        zed.retrieve_measure(depth, sl.MEASURE.DEPTH)

        # Get pose (SLAM)
        state = zed.get_position(pose)
        translation = pose.get_translation()
        print(f"Position: {translation.get()}")
Orbbec Series
| Model | Depth Technology | Range | Resolution | Price | Highlight |
|---|---|---|---|---|---|
| Femto Bolt | iToF | 0.25-5.5m | 640x576@30fps | ~$300 | Azure Kinect alternative |
| Femto Mega | iToF | 0.25-5.5m | 640x576@30fps | ~$450 | Onboard compute |
| Astra 2 | Structured light | 0.3-6m | 640x400@30fps | ~$100 | Low cost |
| Gemini 2 | Stereo | 0.15-10m | 1280x800@30fps | ~$200 | Suitable for robotic arms |
Depth Camera Comparison
| Feature | RealSense D435i | RealSense D455 | OAK-D Lite | ZED 2i | Orbbec Femto |
|---|---|---|---|---|---|
| Technology | Active stereo | Active stereo | Passive stereo | Passive stereo | iToF |
| Depth Range | 0.3-10m | 0.4-20m | 0.35-10m | 0.3-20m | 0.25-5.5m |
| Accuracy @1m | ~2mm | ~1.5mm | ~5mm | ~3mm | ~2mm |
| Outdoor Performance | Medium | Medium | Medium | Good | Medium |
| IMU | Yes | Yes (better) | Yes | Yes | Yes |
| AI Acceleration | No | No | 4 TOPS | No | No |
| ROS2 Support | Official | Official | Official | Official | Official |
| Waterproof | No | No | No | IP66 | No |
| Price | $250 | $350 | $150 | $450 | $300 |
Depth Accuracy vs. Distance
Stereo Vision Depth Error

\[
\sigma_Z = \frac{Z^2}{f \cdot B}\,\sigma_d
\]

Where \(\sigma_d\) is the disparity matching error (typically 0.5-1 pixel for raw block matching, and well below 1 pixel with sub-pixel interpolation).
| Distance | D435i (B=50mm) | D455 (B=95mm) | ZED 2i (B=120mm) |
|---|---|---|---|
| 1m | ~2mm | ~1mm | ~0.8mm |
| 3m | ~18mm | ~10mm | ~7mm |
| 5m | ~50mm | ~26mm | ~21mm |
| 10m | ~200mm | ~105mm | ~83mm |
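The scaling in this table follows directly from the error model above. The sketch below reproduces the trend with assumed parameters: the focal length (~640 px) and sub-pixel disparity error (~0.08 px) are illustrative choices, not datasheet figures, and each camera actually has its own focal length and matcher accuracy.

def stereo_depth_error(z_m, focal_px, baseline_m, disparity_err_px):
    """sigma_Z = Z^2 * sigma_d / (f * B): error grows quadratically with distance."""
    return (z_m ** 2) * disparity_err_px / (focal_px * baseline_m)

# Illustrative parameters (assumed): f ~ 640 px, sub-pixel disparity error ~0.08 px
for name, baseline_m in [("D435i", 0.050), ("D455", 0.095), ("ZED 2i", 0.120)]:
    errors_mm = [1000 * stereo_depth_error(z, 640.0, baseline_m, 0.08) for z in (1, 3, 5, 10)]
    print(name, [f"{e:.0f} mm" for e in errors_mm])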
How to Improve Long-Range Depth Accuracy
- Increase baseline (but increases near-range blind zone)
- Increase image resolution (a larger focal length \(f\) in pixels shrinks the error for a given \(\sigma_d\))
- Fuse with other sensors (e.g., LiDAR)
See: Multi-Sensor Fusion for depth camera fusion with other sensors.
Depth Map Post-Processing
Common Filters
import cv2
import numpy as np

def postprocess_depth(depth_image):
    """Depth map post-processing pipeline"""
    # Work in float32 so invalid pixels can be marked as NaN
    depth = depth_image.astype(np.float32)

    # 1. Remove invalid values (zero means "no measurement")
    depth[depth == 0] = np.nan

    # 2. Median filter (remove outliers)
    depth_filtered = cv2.medianBlur(depth, 5)

    # 3. Bilateral filter (edge-preserving smoothing)
    depth_smooth = cv2.bilateralFilter(
        depth_filtered, d=9, sigmaColor=75, sigmaSpace=75
    )

    # 4. Hole filling (nearest-neighbor interpolation)
    mask = np.isnan(depth_smooth)
    if mask.any():
        from scipy.ndimage import distance_transform_edt
        indices = distance_transform_edt(
            mask, return_distances=False, return_indices=True
        )
        depth_smooth = depth_smooth[tuple(indices)]

    return depth_smooth
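As a usage sketch, applied to the raw 16-bit RealSense depth image from earlier (hypothetical glue code, assuming `depth_frame` and `depth_scale` from that example):

# Filter the raw 16-bit depth frame, then convert to meters
depth_raw = np.asanyarray(depth_frame.get_data())
depth_clean_m = postprocess_depth(depth_raw) * depth_scale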
Depth Map to Point Cloud
import numpy as np
import open3d as o3d

def depth_to_pointcloud(depth_image, color_image, intrinsics):
    """Convert a depth map (in meters) and BGR image to a colored point cloud.

    intrinsics: (fx, fy, cx, cy) in pixels.
    """
    fx, fy, cx, cy = intrinsics
    h, w = depth_image.shape

    # Back-project every pixel with the pinhole model
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_image
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = color_image.reshape(-1, 3) / 255.0

    # Filter invalid points
    valid = z.reshape(-1) > 0
    points = points[valid]
    colors = colors[valid]

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors[:, ::-1])  # BGR -> RGB
    return pcd
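Usage with the RealSense intrinsics obtained earlier might look like this, assuming the depth and color images share the same resolution (e.g., with the decimation filter disabled); the attribute names follow pyrealsense2's intrinsics object.

# Hypothetical glue: depth converted to meters plus the matching color image
intr = (depth_intrin.fx, depth_intrin.fy, depth_intrin.ppx, depth_intrin.ppy)
pcd = depth_to_pointcloud(depth_image * depth_scale, color_image, intr)
o3d.visualization.draw_geometries([pcd])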
Summary
- Active stereo (RealSense) offers the best overall indoor performance
- Passive stereo (ZED) performs better outdoors and at long range
- Depth error grows with the square of the distance (accuracy degrades quadratically); larger baselines improve long-range accuracy
- D435i is the most versatile choice; D455 is for better long-range accuracy
- OAK-D's on-chip AI acceleration is a unique advantage
- Post-processing filters can significantly improve depth map quality
References
- Intel RealSense SDK: https://github.com/IntelRealSense/librealsense
- Stereolabs ZED SDK: https://www.stereolabs.com/developers
- Luxonis DepthAI: https://docs.luxonis.com/
- Orbbec SDK: https://developer.orbbec.com.cn/