Panoramic and Event Cameras

Overview

Panoramic cameras and event cameras are two kinds of non-conventional visual sensor. Panoramic cameras provide a 360-degree omnidirectional field of view, while event cameras asynchronously report per-pixel brightness changes. Both have unique application value in robot perception.

Panoramic Cameras

The Value of 360-Degree Vision

Panoramic vision provides robots with blind-spot-free environmental perception:

  • Omnidirectional obstacle detection: No need to stitch multiple cameras
  • Visual SLAM: More feature points, more robust tracking
  • Scene understanding: Capture complete environment information in a single shot
  • Teleoperation: Operators can freely look in any direction

Panoramic Camera Types

| Type | Principle | FOV | Resolution Loss | Representative |
|---|---|---|---|---|
| Dual fisheye stitching | Two back-to-back fisheye lenses | 360 x 180 degrees (full sphere) | Medium | Ricoh Theta, Insta360 |
| Multi-camera array | Multiple cameras in a ring | 360 x ~120 degrees | Low | Ladybug, autonomous driving rigs |
| Catadioptric | Curved mirror + conventional camera | 360 x ~60 degrees | High | Mostly academic use |
| Single fisheye | Ultra-wide-angle lens | ~190 degrees | Medium | Fisheye obstacle-avoidance cameras |

Common Panoramic Camera Products

Ricoh Theta Z1

| Parameter | Value |
|---|---|
| Sensor | 2x 1" CMOS |
| Photo resolution | 6720 x 3360 (23 MP) |
| Video resolution | 3840 x 1920 @ 30 fps |
| Interface | USB Type-C / WiFi |
| Highlights | RAW capture, plugin system |
| Price | ~$1000 |

Insta360 X4

| Parameter | Value |
|---|---|
| Sensor | 2x 1/2" CMOS |
| Photo resolution | 11520 x 5760 (72 MP) |
| Video resolution | 5.7K @ 60 fps / 8K @ 30 fps |
| Waterproof | 10 m |
| Highlights | FlowState stabilization, AI editing |
| Price | ~$500 |

Fisheye Lenses

Fisheye lenses use strongly non-rectilinear projections to achieve ultra-wide-angle imaging, with a FOV that typically reaches or exceeds 180 degrees.

Fisheye Projection Models

| Projection Model | Formula | Features |
|---|---|---|
| Equidistant | \(r = f\theta\) | Linear angle mapping |
| Equisolid angle | \(r = 2f\sin(\theta/2)\) | Area-preserving |
| Stereographic | \(r = 2f\tan(\theta/2)\) | Conformal (angle-preserving) |
| Orthographic | \(r = f\sin\theta\) | FOV physically limited to 180 degrees |

Where \(r\) is the pixel distance to the center, \(\theta\) is the angle of incidence, and \(f\) is the focal length.
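
To make the differences concrete, here is a minimal sketch evaluating the image radius \(r\) each model predicts for the same incidence angles (the focal length and angles are illustrative):

import numpy as np

f = 300.0                          # focal length in pixels (illustrative)
theta = np.deg2rad([30, 60, 85])   # incidence angles

models = {
    "equidistant":   f * theta,
    "equisolid":     2 * f * np.sin(theta / 2),
    "stereographic": 2 * f * np.tan(theta / 2),
    "orthographic":  f * np.sin(theta),
}
for name, r in models.items():
    print(f"{name:14s} r = {np.round(r, 1)} px")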

Fisheye Dewarping

import cv2
import numpy as np

# Fisheye calibration parameters (illustrative values; for a real camera,
# use the output of cv2.fisheye.calibrate)
fx, fy, cx, cy = 285.0, 285.0, 640.0, 400.0     # intrinsics, in pixels
k1, k2, k3, k4 = -0.01, 0.03, -0.02, 0.003      # fisheye distortion coefficients
K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]], dtype=np.float64)
D = np.array([k1, k2, k3, k4], dtype=np.float64)

fisheye_img = cv2.imread("fisheye.jpg")  # input image (path illustrative)
h, w = fisheye_img.shape[:2]

# Method 1: Undistort to pinhole image
new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
    K, D, (w, h), np.eye(3), balance=0.0  # 0=crop, 1=keep all
)
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), new_K, (w, h), cv2.CV_16SC2
)
undistorted = cv2.remap(fisheye_img, map1, map2, cv2.INTER_LINEAR)

# Method 2: Equirectangular projection (for panoramic stitching)
def fisheye_to_equirectangular(fisheye_img, K, D, output_size=(1920, 960)):
    out_w, out_h = output_size
    # Generate equirectangular coordinates
    lon = np.linspace(-np.pi, np.pi, out_w)
    lat = np.linspace(-np.pi/2, np.pi/2, out_h)
    lon_grid, lat_grid = np.meshgrid(lon, lat)

    # Spherical to Cartesian
    x = np.cos(lat_grid) * np.sin(lon_grid)
    y = np.sin(lat_grid)
    z = np.cos(lat_grid) * np.cos(lon_grid)

    # Project to the fisheye image with the equidistant model r = f*theta
    # (lens distortion D is ignored here for simplicity)
    theta = np.arctan2(np.sqrt(x**2 + y**2), z)
    phi = np.arctan2(y, x)
    r = K[0, 0] * theta

    u = r * np.cos(phi) + K[0,2]
    v = r * np.sin(phi) + K[1,2]

    map_x = u.astype(np.float32)
    map_y = v.astype(np.float32)

    return cv2.remap(fisheye_img, map_x, map_y, cv2.INTER_LINEAR)
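
Note that sphere directions outside the lens's actual FOV project outside the source image; cv2.remap fills such pixels with black (zero) by default, so a single fisheye can only populate part of the equirectangular canvas.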

Panoramic Vision Applications in Robotics

Panoramic SLAM

The wide FOV of panoramic cameras provides more feature points and greater rotational tolerance:

  • ORB-SLAM3: Supports fisheye camera models
  • OpenVSLAM: Supports equirectangular projection
  • Advantage: Abundant feature point tracking even during rotation

Surround-View System (Autonomous Driving)

            Front camera (fisheye)
                     |
           +------------------+
Left       |     Vehicle      |       Right
camera  ---+  (bird's-eye     +---   camera
(fisheye)  |   view fusion)   |    (fisheye)
           +------------------+
                     |
            Rear camera (fisheye)

4 fisheye cameras stitched -> 360-degree BEV
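
The bird's-eye view itself is typically obtained by inverse perspective mapping (IPM). Below is a minimal single-camera sketch, assuming the fisheye image has already been dewarped and four ground-plane correspondences are known; the pixel coordinates and file name are illustrative:

import cv2
import numpy as np

front_img = cv2.imread("front_dewarped.jpg")  # dewarped front camera (illustrative)

# Four image points and their known positions on the ground plane, expressed
# in bird's-eye-view pixels (illustrative values, not a real calibration)
src = np.float32([[420, 560], [860, 560], [1100, 710], [180, 710]])
dst = np.float32([[300, 100], [500, 100], [500, 300], [300, 300]])

H = cv2.getPerspectiveTransform(src, dst)        # ground-plane homography
bev_front = cv2.warpPerspective(front_img, H, (800, 400))
# Repeat per camera, then blend the four BEV patches into the 360-degree map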

Event Cameras

Working Principle

Traditional cameras output complete images at a fixed frame rate. In an event camera (also called a Dynamic Vision Sensor, DVS), each pixel independently and asynchronously responds to brightness changes:

When the logarithmic brightness change of pixel \((x, y)\) exceeds threshold \(C\), an event is generated:

\[ e = (x, y, t, p) \quad \text{when} \quad |\log I(x,y,t) - \log I(x,y,t-\delta t)| \geq C \]

Where:

  • \((x, y)\): Pixel coordinates
  • \(t\): Timestamp (microsecond precision)
  • \(p \in \{+1, -1\}\): Polarity (brightness increase/decrease)

Traditional camera (frames):
Time --> [Frame1][Frame2][Frame3][Frame4] ... Fixed 30/60fps
         All pixels output synchronously

Event camera (event stream):
Time --> .  ..  . ...  .. . ... .  . ... Asynchronous, sparse
         Only pixels with brightness changes generate events
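
To build intuition for the generation model, here is a toy simulator that compares two frames and emits events wherever the log-brightness change crosses the threshold. This is a simplification of a real DVS (whose pixels fire asynchronously against a per-pixel reference level, possibly several times per interval), and the threshold value is illustrative:

import numpy as np

def simulate_events(frame_prev, frame_curr, t, C=0.2, eps=1e-6):
    """Emit (x, y, t, p) events where |log I_curr - log I_prev| >= C."""
    dlog = np.log(frame_curr + eps) - np.log(frame_prev + eps)
    ys, xs = np.nonzero(np.abs(dlog) >= C)        # pixels crossing the threshold
    pol = np.sign(dlog[ys, xs]).astype(np.int64)  # +1 brighter, -1 darker
    ts = np.full(xs.shape, t, dtype=np.int64)
    return np.stack([xs, ys, ts, pol], axis=1)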

Core Advantages

| Feature | Traditional Camera | Event Camera |
|---|---|---|
| Temporal resolution | 10-30 ms (inter-frame) | ~1 us (per event) |
| Dynamic range | 60-70 dB | >120 dB |
| Motion blur | Present (intra-frame integration) | None (asynchronous events) |
| Data volume | Large (full frames) | Small (sparse events) |
| Power consumption | Higher | Low (output on demand) |
| Redundant data | Large static areas | Almost none |

Representative Products

iniVation DAVIS346

| Parameter | Value |
|---|---|
| Resolution | 346 x 260 |
| Event bandwidth | 12 M events/s |
| Dynamic range | >120 dB |
| Latency | <1 us |
| Highlight | Synchronized event + frame output (DAVIS = DVS + APS) |
| Interface | USB 3.0 |
| Price | ~$5000 |

Prophesee EVK4

| Parameter | Value |
|---|---|
| Resolution | 1280 x 720 (HD) |
| Event bandwidth | >1 G events/s |
| Pixel size | 4.86 um |
| Dynamic range | >120 dB |
| Latency | ~100 ns |
| Price | ~$3000 |

Comparison

| Feature | DAVIS346 | Prophesee EVK4 |
|---|---|---|
| Resolution | 346 x 260 | 1280 x 720 |
| Frame output | Yes (APS) | No |
| Event bandwidth | 12 M events/s | >1 G events/s |
| Ecosystem | DV software | Metavision SDK |
| Suitable for | Research prototypes | High-performance applications |

Event Data Representation

Event streams need to be converted to formats suitable for processing:

import numpy as np

# Raw event data
# events: Nx4 array [x, y, timestamp, polarity]

def events_to_frame(events, height, width, time_window=None):
    """Accumulate events into a frame (2-channel polarity histogram).

    If time_window is given (same units as the timestamps), only events
    within the last time_window are accumulated.
    """
    if time_window is not None:
        events = events[events[:, 2] >= events[:, 2].max() - time_window]

    frame = np.zeros((height, width, 2))  # channel 0: positive, 1: negative

    for x, y, t, p in events:
        if p > 0:
            frame[int(y), int(x), 0] += 1
        else:
            frame[int(y), int(x), 1] += 1

    return frame

def events_to_voxel_grid(events, height, width, num_bins=5):
    """Event voxel grid (temporal discretization)"""
    voxel = np.zeros((num_bins, height, width))

    t_min, t_max = events[:, 2].min(), events[:, 2].max()
    t_norm = (events[:, 2] - t_min) / (t_max - t_min + 1e-6)

    for x, y, t_n, p in zip(events[:, 0], events[:, 1], t_norm, events[:, 3]):
        bin_idx = min(int(t_n * num_bins), num_bins - 1)
        voxel[bin_idx, int(y), int(x)] += p

    return voxel

def events_to_surface(events, height, width, tau=30000):
    """Time surface: exponentially decayed per-pixel last-event timestamps.

    tau is the decay constant in timestamp units (30000 us = 30 ms here).
    """
    surface = np.zeros((2, height, width))

    for x, y, t, p in events:
        pol = 0 if p > 0 else 1
        surface[pol, int(y), int(x)] = t

    # Normalize (exponential decay)
    t_now = events[-1, 2]
    surface = np.exp(-(t_now - surface) / tau)
    surface[surface < np.exp(-1)] = 0  # Truncate

    return surface
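
A quick usage sketch of the three representations on synthetic DAVIS346-sized data (all values random and purely illustrative):

rng = np.random.default_rng(0)
n = 10000
events = np.column_stack([
    rng.integers(0, 346, n),            # x
    rng.integers(0, 260, n),            # y
    np.sort(rng.uniform(0, 1e5, n)),    # t, microseconds
    rng.choice([-1.0, 1.0], n),         # polarity
]).astype(np.float64)

frame   = events_to_frame(events, 260, 346)
voxel   = events_to_voxel_grid(events, 260, 346, num_bins=5)
surface = events_to_surface(events, 260, 346, tau=30000)
print(frame.shape, voxel.shape, surface.shape)  # (260, 346, 2) (5, 260, 346) (2, 260, 346)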

Event Camera Applications in Robotics

High-Speed Obstacle Avoidance

\[ \text{Reaction Time} = t_{\text{sensor}} + t_{\text{process}} + t_{\text{actuator}} \]

Event cameras reduce \(t_{\text{sensor}}\) from ~33 ms (at 30 fps) to ~1 us, which is critical for high-speed drones: at 10 m/s, 33 ms of sensing latency corresponds to roughly 33 cm of travel before the data even arrives, versus about 10 um at 1 us.

Visual Odometry

Event-driven VIO (Visual-Inertial Odometry):

  • EVO: Event-based Visual Odometry
  • ESVO: Event-based Stereo Visual Odometry
  • Advantage: No loss of tracking during fast motion (no motion blur)

Optical Flow Estimation

Events naturally encode motion information, making them suitable for optical flow estimation:

\[ \frac{\partial \log I}{\partial t} = -\nabla(\log I) \cdot \mathbf{v} \]

Where \(\mathbf{v} = (v_x, v_y)\) is the optical flow. This is the brightness-constancy constraint written in log intensity: events measure the temporal term on the left, so combined with the spatial gradient they constrain \(\mathbf{v}\) (only the component along the gradient, so the aperture problem still applies).
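
A classic event-based way to exploit this relation is local plane fitting: over a small window, the map of most-recent event timestamps is approximately planar, and the fitted plane's slopes yield the normal flow. A minimal sketch (the patch size, units, and NaN convention are assumptions, not any specific library's API):

import numpy as np

def plane_fit_normal_flow(ts_patch):
    """Fit t = a*x + b*y + c to a patch of last-event timestamps (e.g. 5x5,
    NaN where no event was seen); return normal flow in pixels per time unit."""
    h, w = ts_patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    mask = ~np.isnan(ts_patch)          # needs >= 3 events for a well-posed fit
    A = np.column_stack([xs[mask], ys[mask], np.ones(mask.sum())])
    (a, b, _), *_ = np.linalg.lstsq(A, ts_patch[mask], rcond=None)
    grad2 = a * a + b * b               # |gradient of t|^2
    if grad2 < 1e-12:
        return np.zeros(2)              # timestamps flat: no measurable motion
    return np.array([a, b]) / grad2     # velocity along the timestamp gradient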

HDR Scene Perception

  • Factory scenes with alternating bright and dark areas
  • Tunnel entrances and exits
  • Welding/cutting and other high-intensity light scenarios

Challenges of Event Cameras

| Challenge | Description | Research Direction |
|---|---|---|
| Immature algorithm ecosystem | Traditional CV algorithms do not apply directly | Event-driven algorithm research |
| Low resolution | Mainstream sensors top out around 720p | New sensor designs |
| High cost | >$3000 | Expected to fall with mass production |
| No output for static scenes | Events require motion | Fusion with frame cameras (DAVIS) |
| Noise | Dark current generates spurious events | Spatiotemporal filtering (see the sketch below) |
| Large data volume at high speed | Millions of events per second | Hardware-accelerated processing |

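As a concrete instance of the spatiotemporal filtering mentioned in the table, here is a minimal background-activity filter: an event is kept only if some pixel in its 3x3 neighborhood fired within the last dt microseconds. The window size and dt are illustrative; production filters usually run in hardware:

import numpy as np

def background_activity_filter(events, height, width, dt=5000):
    """Keep events supported by a neighboring event within dt microseconds."""
    last_ts = np.full((height + 2, width + 2), -np.inf)  # padded last-event map
    keep = np.zeros(len(events), dtype=bool)
    for i, (x, y, t, p) in enumerate(events):
        xi, yi = int(x) + 1, int(y) + 1
        # Most recent event in the 3x3 neighborhood (the pixel's own previous
        # event also counts as support here; stricter filters exclude it)
        recent = last_ts[yi - 1:yi + 2, xi - 1:xi + 2].max()
        keep[i] = (t - recent) <= dt
        last_ts[yi, xi] = t
    return events[keep]
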
Combined Approaches

Fisheye + Event

Fisheye event camera:
- Wide FOV (>180 degrees)
- Microsecond temporal resolution
- Use: High-speed drone omnidirectional perception

Panoramic + Depth

Panoramic RGB-D:
- 4x RealSense D435i in a ring arrangement
- 360-degree depth perception
- Use: Service robot omnidirectional navigation
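
A hedged sketch of the fusion step for such a ring, assuming each camera's depth output has already been converted to a point cloud and the camera-to-robot extrinsics are calibrated (the function and argument names are assumptions, not a specific SDK API):

import numpy as np

def fuse_ring_clouds(clouds, extrinsics):
    """Transform per-camera point clouds into the robot frame and concatenate.

    clouds:     list of (N_i, 3) arrays in each camera's optical frame
    extrinsics: list of (4, 4) camera-to-robot homogeneous transforms
    """
    fused = []
    for pts, T in zip(clouds, extrinsics):
        pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
        fused.append((pts_h @ T.T)[:, :3])                # into robot frame
    return np.vstack(fused)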

DAVIS + Traditional Camera

Event + frame fusion:
- DAVIS346 outputs synchronized events and frames
- Frames for texture-rich scenes
- Events for high-speed motion and HDR scenes
- Use: Robust visual SLAM

Summary

  1. Panoramic cameras provide 360-degree perception, eliminating blind spots
  2. Fisheye lenses require specialized projection models and dewarping
  3. Event cameras break through traditional camera limits with microsecond temporal resolution and >120dB dynamic range
  4. Event data requires special representation methods (voxel grids, time surfaces, etc.)
  5. Event cameras have clear advantages in high-speed scenarios and extreme lighting
  6. Both technologies are rapidly developing, with costs and ecosystems steadily improving
