Camera Calibration and Selection

Overview

Camera calibration is the process of determining a camera's intrinsic parameters and lens distortion coefficients. It is a prerequisite for all accurate visual measurements. This section covers the pinhole camera model, calibration methods, distortion correction, and guidelines for selecting the right camera for your application.

Pinhole Camera Model

Projection Equation

The pinhole model projects 3D world points onto a 2D image plane:

\[ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K [R|t] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \]

Where:

  • \((u, v)\): Pixel coordinates
  • \((X, Y, Z)\): World coordinates
  • \(K\): Camera intrinsic matrix
  • \([R|t]\): Extrinsic matrix (rotation + translation)
  • \(s\): Projective scale factor, equal to the point's depth in the camera frame

Intrinsic Matrix

\[ K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \]

| Parameter | Meaning | Unit |
|---|---|---|
| \(f_x, f_y\) | Focal length in x/y direction | pixels |
| \(c_x, c_y\) | Principal point (intersection of optical axis and image plane) | pixels |

Converting focal length from mm to pixels:

\[ f_x = \frac{f_{\text{mm}} \times \text{width}_{\text{pixel}}}{\text{sensor\_width}_{\text{mm}}} \]

IMX219 Intrinsic Estimation

Focal length 3.04mm, sensor width 3.68mm, image width 1920 pixels:

\[f_x = \frac{3.04 \times 1920}{3.68} \approx 1586.1 \text{ pixels}\]
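
A one-line check of this conversion (the helper name is mine; the numbers are the IMX219 values quoted above, and a real calibration will differ somewhat, e.g. because cropped video modes change the effective sensor width):

def focal_mm_to_pixels(f_mm, sensor_width_mm, image_width_px):
    """Convert focal length from millimeters to pixels."""
    return f_mm * image_width_px / sensor_width_mm

print(focal_mm_to_pixels(3.04, 3.68, 1920))  # ~1586.1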

Extrinsic Matrix

\[ [R|t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \]
  • \(R\) (3x3 rotation matrix): Camera orientation relative to the world coordinate system
  • \(t\) (3x1 translation vector): Camera position in the world coordinate system

Complete Projection Process

\[ \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = [R|t] \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \quad \rightarrow \quad \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} f_x \frac{x'}{z'} + c_x \\ f_y \frac{y'}{z'} + c_y \end{bmatrix} \]
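
A minimal NumPy sketch of this two-step projection (extrinsic transform, then intrinsic projection); the K, R, t values below are illustrative placeholders, not calibration output:

import numpy as np

def project_point(K, R, t, X_w):
    """Project a 3D world point into pixel coordinates via the pinhole model."""
    X_c = R @ X_w + t                            # world frame -> camera frame
    x, y = X_c[0] / X_c[2], X_c[1] / X_c[2]      # perspective division
    u = K[0, 0] * x + K[0, 2]
    v = K[1, 1] * y + K[1, 2]
    return u, v

K = np.array([[1586.1, 0, 960], [0, 1586.1, 540], [0, 0, 1]])  # illustrative intrinsics
R, t = np.eye(3), np.array([0.0, 0.0, 0.0])                    # camera at world origin
print(project_point(K, R, t, np.array([0.1, 0.0, 1.0])))       # point 1m ahead, 0.1m right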

Lens Distortion

Distortion Types

Real lenses introduce distortion, mainly of two types:

Radial Distortion:

\[ \begin{aligned} x_{\text{distorted}} &= x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\ y_{\text{distorted}} &= y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \end{aligned} \]

Where \(r^2 = x^2 + y^2\), and \((x, y)\) are normalized image coordinates.

Tangential Distortion:

\[ \begin{aligned} x_{\text{distorted}} &= x + 2p_1 xy + p_2(r^2 + 2x^2) \\ y_{\text{distorted}} &= y + p_1(r^2 + 2y^2) + 2p_2 xy \end{aligned} \]
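
In practice both models are applied together on normalized coordinates. A small sketch of the combined forward model (the function name and coefficient values are illustrative):

def distort_normalized(x, y, k1, k2, k3, p1, p2):
    """Apply radial + tangential distortion to normalized image coordinates."""
    r2 = x**2 + y**2
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x**2)
    y_d = y * radial + p1 * (r2 + 2 * y**2) + 2 * p2 * x * y
    return x_d, y_d

# Illustrative coefficients: mild barrel distortion
print(distort_normalized(0.5, 0.3, k1=-0.1, k2=0.01, k3=0.0, p1=0.001, p2=0.001))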

Distortion Coefficients

OpenCV uses 5 (or more) distortion coefficients:

\[ D = [k_1, k_2, p_1, p_2, k_3] \]

| Coefficient | Type | Effect |
|---|---|---|
| \(k_1\) | Radial | Primary distortion term (most important) |
| \(k_2\) | Radial | Secondary correction |
| \(k_3\) | Radial | Higher-order correction (needed for wide-angle lenses) |
| \(p_1\) | Tangential | Lens not parallel to sensor |
| \(p_2\) | Tangential | Lens not parallel to sensor |

Distortion Visualization

Barrel distortion (\(k_1 < 0\)) bows straight lines outward from the image center, so a square grid bulges like a barrel; it is common in wide-angle lenses. Pincushion distortion (\(k_1 > 0\)) bows straight lines inward toward the center, pinching the grid at its edges; it appears occasionally in telephoto lenses.

Zhang's Calibration Method

Principle

Zhang's method uses a planar calibration board (e.g., checkerboard) to estimate camera intrinsics and distortion coefficients from multiple photos taken at different angles.

Basic idea:

  1. Corner world coordinates on the checkerboard are known (\(Z=0\) plane)
  2. Detect corner pixel coordinates in images
  3. Each image provides a homography matrix \(H\)
  4. Solve for intrinsic \(K\) and distortion coefficients from multiple \(H\) matrices

Calibration Steps

import cv2
import numpy as np
import glob

# 1. Prepare checkerboard parameters
CHECKERBOARD = (9, 6)  # Number of inner corners
SQUARE_SIZE = 0.025    # Square size (meters)

# 3D world coordinates
objp = np.zeros((CHECKERBOARD[0]*CHECKERBOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:CHECKERBOARD[0], 0:CHECKERBOARD[1]].T.reshape(-1, 2)
objp *= SQUARE_SIZE

obj_points = []  # 3D points
img_points = []  # 2D points

# 2. Detect corners
images = glob.glob('calibration_images/*.jpg')
for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    ret, corners = cv2.findChessboardCorners(
        gray, CHECKERBOARD,
        flags=cv2.CALIB_CB_ADAPTIVE_THRESH | cv2.CALIB_CB_NORMALIZE_IMAGE
    )

    if ret:
        # Sub-pixel refinement
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
        )
        obj_points.append(objp)
        img_points.append(corners)

# 3. Calibrate (image size is (width, height); gray still holds the last image read)
ret, K, D, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None
)

print(f"Intrinsic matrix K:\n{K}")
print(f"Distortion coefficients D: {D.ravel()}")
print(f"Reprojection error: {ret:.4f} pixels")

# 4. Save calibration results
np.savez('calibration.npz', K=K, D=D)

Calibration Quality Assessment

Reprojection Error:

\[ \text{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \|\mathbf{p}_i - \hat{\mathbf{p}}_i\|^2} \]

| Reprojection Error | Quality |
|---|---|
| < 0.3 pixel | Excellent |
| 0.3 - 0.5 pixel | Good |
| 0.5 - 1.0 pixel | Acceptable |
| > 1.0 pixel | Needs recalibration |
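
The RMS value returned by cv2.calibrateCamera aggregates all views. A per-image breakdown helps spot individual bad captures; this sketch reuses the variables from the calibration script above:

total_err = 0
for i in range(len(obj_points)):
    # Re-project the board corners using the estimated intrinsics and per-view pose
    proj, _ = cv2.projectPoints(obj_points[i], rvecs[i], tvecs[i], K, D)
    err = cv2.norm(img_points[i], proj, cv2.NORM_L2) / np.sqrt(len(proj))  # per-point RMS
    print(f"Image {i}: {err:.4f} px")
    total_err += err
print(f"Mean per-view RMS error: {total_err / len(obj_points):.4f} px")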

Calibration Best Practices

Tips for Improving Calibration Accuracy

  1. At least 20 images, taken from different angles and distances
  2. The checkerboard should cover all areas of the image (including edges and corners)
  3. Include multiple tilt angles (not just facing the camera directly)
  4. Use a flat calibration board (print and attach to a flat surface)
  5. Ensure images are sharp (no motion blur)
  6. The calibration board should occupy 20-80% of the image area

Undistortion

Image Undistortion

# Method 1: Direct undistortion (simple, but the full computation runs per frame)
undistorted = cv2.undistort(img, K, D)

# Method 2: Pre-computed mapping tables (recommended; computed once)
h, w = img.shape[:2]
new_K, roi = cv2.getOptimalNewCameraMatrix(K, D, (w, h), alpha=0)  # alpha=0: keep only valid pixels
map1, map2 = cv2.initUndistortRectifyMap(K, D, None, new_K, (w, h), cv2.CV_16SC2)

# Apply the maps to each frame (cheap per-frame remap)
undistorted = cv2.remap(img, map1, map2, cv2.INTER_LINEAR)

Point Undistortion

# Convert distorted image points to undistorted coordinates
points_distorted = np.array([[100, 200], [300, 400]], dtype=np.float32)
points_distorted = points_distorted.reshape(-1, 1, 2)

points_undistorted = cv2.undistortPoints(
    points_distorted, K, D, P=new_K
)  # With P=new_K the output is in pixel coordinates; omit P for normalized coordinates

Field of View (FOV) Calculation

Computing FOV from Intrinsics

\[ \text{FOV}_H = 2 \arctan\left(\frac{w}{2f_x}\right), \quad \text{FOV}_V = 2 \arctan\left(\frac{h}{2f_y}\right) \]
import numpy as np

def compute_fov(K, image_size):
    fx, fy = K[0, 0], K[1, 1]
    w, h = image_size

    fov_h = 2 * np.degrees(np.arctan(w / (2 * fx)))
    fov_v = 2 * np.degrees(np.arctan(h / (2 * fy)))
    # Diagonal FOV: approximation using the geometric mean of fx and fy
    fov_d = 2 * np.degrees(np.arctan(
        np.sqrt(w**2 + h**2) / (2 * np.sqrt(fx * fy))
    ))

    return fov_h, fov_v, fov_d

fov_h, fov_v, fov_d = compute_fov(K, (1920, 1080))
print(f"Horizontal FOV: {fov_h:.1f} deg, Vertical FOV: {fov_v:.1f} deg, Diagonal FOV: {fov_d:.1f} deg")

FOV and Coverage Range

Coverage at distance \(d\):

\[ W = 2d \tan\left(\frac{\text{FOV}_H}{2}\right), \quad H = 2d \tan\left(\frac{\text{FOV}_V}{2}\right) \]

| FOV | Coverage at 1 m | Coverage at 3 m | Coverage at 5 m |
|---|---|---|---|
| 60 degrees | 1.15 m | 3.46 m | 5.77 m |
| 90 degrees | 2.00 m | 6.00 m | 10.0 m |
| 120 degrees | 3.46 m | 10.4 m | 17.3 m |
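
A small helper for the coverage formula (the function name is mine):

import numpy as np

def coverage_at_distance(fov_h_deg, fov_v_deg, d):
    """Width and height of the imaged area at distance d (same units as d)."""
    W = 2 * d * np.tan(np.radians(fov_h_deg) / 2)
    H = 2 * d * np.tan(np.radians(fov_v_deg) / 2)
    return W, H

print(coverage_at_distance(90, 60, 3.0))  # ~(6.00, 3.46) meters at 3 m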

Stereo Calibration

# Stereo calibration (img_points_left / img_points_right are per-camera corner
# detections of the same board views; each camera's K/D comes from prior mono calibration)
ret, K_left, D_left, K_right, D_right, R, T, E, F = cv2.stereoCalibrate(
    obj_points, img_points_left, img_points_right,
    K_left, D_left, K_right, D_right,
    gray.shape[::-1],
    flags=cv2.CALIB_FIX_INTRINSIC  # Fix already-calibrated intrinsics; solve only R, T, E, F
)

print(f"Baseline length: {np.linalg.norm(T):.4f}m")
print(f"Stereo calibration error: {ret:.4f} pixels")

Hand-Eye Calibration

Robot hand-eye calibration solves for the transform \(T_{CE}\) between the camera and the robot end-effector:

\[ A_i X = X B_i \]

Where \(A_i\) is the relative motion of the robot end-effector, \(B_i\) is the relative motion observed by the camera, and \(X = T_{CE}\).

# OpenCV hand-eye calibration (eye-in-hand configuration: camera on the end-effector)
R_target2cam = rvecs    # Board-to-camera rotations, one per view (from cv2.calibrateCamera)
t_target2cam = tvecs    # Board-to-camera translations
R_gripper2base = [...]  # End-effector-to-base rotations (from the robot controller)
t_gripper2base = [...]  # End-effector-to-base translations

R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
    R_gripper2base, t_gripper2base,
    R_target2cam, t_target2cam,
    method=cv2.CALIB_HAND_EYE_TSAI
)

Selection Guide

Selection by Application Scenario

| Scenario | Key Requirement | Recommended Camera | Reason |
|---|---|---|---|
| Indoor SLAM | Depth + IMU | RealSense D435i | Active stereo + IMU |
| Outdoor SLAM | Waterproof, outdoor depth | ZED 2i | IP66 + passive stereo |
| Robotic Arm Grasping | Close-range accuracy | RealSense D405 | 7 cm minimum range, high accuracy |
| High-Speed VIO | Global shutter + IMU | OV9281 + IMU | Global shutter, no motion distortion |
| Object Detection | High-resolution RGB | IMX477 + lens | 12.3 MP, interchangeable lenses |
| Night Vision | Low-light performance | IMX462 | Starlight-grade CMOS |
| Omnidirectional Perception | 360-degree coverage | 4x fisheye solution | No blind spots |
| High-Speed Scenarios | Microsecond temporal resolution | DAVIS346 | Event camera |
| Education Prototypes | Low cost | RPi Camera V2 | $25 entry |

Selection Decision Flow

graph TD
    A[Vision Task] --> B{Budget?}
    B -->|<$50| C[RPi Camera V2/V3]
    B -->|$50-200| D{Need depth?}
    B -->|$200-500| E{Special requirements?}
    B -->|>$500| F[Industrial camera / Event camera]

    D -->|No| G[IMX477 + lens]
    D -->|Yes| H[OAK-D Lite]

    E -->|Outdoor waterproof| I[ZED 2i]
    E -->|Close-range accuracy| J[RealSense D405]
    E -->|General purpose| K[RealSense D435i]
    E -->|Long range| L[RealSense D455]

Key Parameter Checklist

Parameters to confirm during selection:

| Parameter | Question |
|---|---|
| Resolution | How fine a detection granularity is needed? |
| Frame Rate | Robot speed and control-frequency requirements? |
| FOV | How large an area needs to be covered? |
| Depth Range | Minimum and maximum working distances? |
| Depth Accuracy | Positioning/grasping accuracy requirements? |
| Interface | CSI (low latency) or USB (flexible)? |
| Shutter | Is a global shutter needed for high-speed motion? |
| Environment | Indoor/outdoor/waterproof/temperature range? |
| Power | Power budget, especially if battery-powered? |
| Size & Weight | Installation space constraints? |
| Software Ecosystem | ROS2 drivers, SDK maturity? |

See: Calibration and System Integration for more on calibration workflows in actual deployment.

Online and Self-Calibration

Why Online Calibration Is Needed

  • Camera parameters may change with temperature
  • Physical impacts may alter mounting pose
  • Lenses and structures may loosen over long-term operation

Self-Calibration Methods

| Method | Principle | Accuracy | Suitable For |
|---|---|---|---|
| Motion self-calibration | Recover intrinsics from image sequences | Medium | No calibration board available |
| Vanishing-point calibration | Use parallel lines in the scene | Medium | Structured environments |
| Marker-based calibration | AprilTag/ArUco markers | High | When markers can be placed |
| Continuous optimization | Joint intrinsic optimization in SLAM | High | Long-term operation |

# ArUco marker detection and pose estimation (OpenCV >= 4.7 ArucoDetector API)
import cv2

marker_size = 0.05  # Marker side length in meters (set to your printed size)

aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
parameters = cv2.aruco.DetectorParameters()
detector = cv2.aruco.ArucoDetector(aruco_dict, parameters)

corners, ids, rejected = detector.detectMarkers(image)  # image: input camera frame

if ids is not None:
    # Estimate marker pose
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, marker_size, K, D
    )
    for rvec, tvec in zip(rvecs, tvecs):
        cv2.drawFrameAxes(image, K, D, rvec, tvec, marker_size/2)

Summary

  1. The pinhole model is the basic mathematical framework for camera projection
  2. Zhang's calibration method is the most commonly used, requiring 20+ multi-angle images
  3. Reprojection error <0.5px is the benchmark for good calibration
  4. Pre-computed mapping tables are best practice for real-time undistortion
  5. Camera selection should weigh resolution, frame rate, FOV, depth range and accuracy, interface, and operating environment together
  6. Hand-eye calibration is a necessary step for robotic arm vision

References

  • Zhang, Z. "A Flexible New Technique for Camera Calibration." IEEE TPAMI, 2000.
  • Hartley, R., & Zisserman, A. Multiple View Geometry in Computer Vision. Cambridge University Press, 2004.
  • OpenCV Camera Calibration Tutorial: https://docs.opencv.org/
  • Tsai, R. Y., & Lenz, R. K. "A New Technique for Fully Autonomous and Efficient 3D Robotics Hand/Eye Calibration." IEEE Transactions on Robotics and Automation, 1989.
