Safety and Robustness

Overview

Safety is the main barrier to moving robots from the lab to the real world. It is not only a technical problem: it also involves standards and certification, system architecture, and human-robot collaboration design. This article covers the safety standards system, collision detection algorithms, human-robot interaction safety zones, emergency stop mechanisms, robustness design, and safety for learning-based policies.

Safety-First Principle

Any robot system must undergo a complete safety assessment and certification process before deployment. Failure of safety functions can lead to serious personal injury.

I. Safety Standards System

1.1 ISO 10218: Industrial Robot Safety

ISO 10218 is the foundational standard for industrial robot safety, consisting of two parts:

Standard	Scope	Core Requirements
ISO 10218-1	Robot body	Joint limits, speed limits, torque monitoring, emergency stop
ISO 10218-2	Robot system integration	Safety protection design, risk assessment, safety distance calculation

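Four collaborative operation modes (ISO 10218-1:2011, clause 5.10):

Safety-rated monitored stop: Robot stops, with drives still powered, while a human is in the collaborative workspace
Hand guiding: Operator directly guides robot motion through a hand-operated device
Speed and separation monitoring: Robot speed is adjusted dynamically to keep a minimum separation distance from the human
Power and force limiting: Contact forces and pressures are limited to values below injury thresholds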

1.2 ISO/TS 15066: Collaborative Robot Safety

ISO/TS 15066 is a technical specification that supplements ISO 10218 for human-robot collaboration scenarios. It defines force and pressure thresholds for individual body parts:

Body Part	Quasi-static Force Limit (N)	Transient Force Limit (N)	Quasi-static Pressure Limit (N/cm^2)	Transient Pressure Limit (N/cm^2)
Skull/forehead	130	195	25	50
Face	65	97.5	11	16.5
Neck (side)	150	225	21	31.5
Chest	140	210	12	18
Abdomen	110	165	11	16.5
Back of hand	190	285	19	28.5
Fingers	140	210	30	45

Threshold Meaning

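Quasi-static contact is sustained contact in which a body part is clamped between the robot and another surface. Transient contact is a brief impact (under 0.5 s) from which the body part can recoil. In this table, transient limits are typically 1.5x the quasi-static values.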
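The threshold check described above can be sketched as a lookup plus two comparisons. This is a minimal sketch, not a certified implementation: the limit values are copied from the table in this article, while the names `ContactLimit`, `LIMITS`, and `contact_within_limits` and the contact-area input are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContactLimit:
    force_qs: float     # quasi-static force limit, N
    force_tr: float     # transient force limit, N
    pressure_qs: float  # quasi-static pressure limit, N/cm^2
    pressure_tr: float  # transient pressure limit, N/cm^2


# Values copied from the table in this article (hypothetical key names).
LIMITS = {
    "skull_forehead": ContactLimit(130, 195, 25, 50),
    "face": ContactLimit(65, 97.5, 11, 16.5),
    "neck_side": ContactLimit(150, 225, 21, 31.5),
    "chest": ContactLimit(140, 210, 12, 18),
    "abdomen": ContactLimit(110, 165, 11, 16.5),
    "back_of_hand": ContactLimit(190, 285, 19, 28.5),
    "fingers": ContactLimit(140, 210, 30, 45),
}

TRANSIENT_MAX_S = 0.5  # contacts shorter than this count as transient


def contact_within_limits(body_part, force_n, area_cm2, duration_s):
    """Return True if both force and pressure stay within the table limits."""
    if area_cm2 <= 0:
        raise ValueError("contact area must be positive")
    limit = LIMITS[body_part]
    transient = duration_s < TRANSIENT_MAX_S
    f_max = limit.force_tr if transient else limit.force_qs
    p_max = limit.pressure_tr if transient else limit.pressure_qs
    pressure = force_n / area_cm2
    return force_n <= f_max and pressure <= p_max
```

For example, a 150 N contact over 20 cm^2 on the chest passes as a 0.1 s impact but fails as a 1 s clamp, because the quasi-static force limit is lower.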

1.3 Safety Integrity Level (SIL)

Per IEC 61508, safety functions operating in high-demand or continuous mode are graded by their average frequency of dangerous failure per hour (PFH):

SIL Level	Dangerous Failure Probability per Hour (PFH)	Typical Application
SIL 1	\(10^{-6}\) ~ \(10^{-5}\)	General industrial equipment protection
SIL 2	\(10^{-7}\) ~ \(10^{-6}\)	Collaborative robot safety functions
SIL 3	\(10^{-8}\) ~ \(10^{-7}\)	Autonomous driving critical safety
SIL 4	\(10^{-9}\) ~ \(10^{-8}\)	Nuclear plant control systems

Most collaborative robots need to achieve SIL 2 or, under ISO 13849-1, Performance Level d (PL d).
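As a small illustration of how the PFH bands in the table map to a level, the following sketch classifies a failure rate. The function name `sil_from_pfh` is an assumption, and treating each band as closed at its lower bound and open at its upper bound is a convention chosen here.

```python
def sil_from_pfh(pfh):
    """Return the SIL (1-4) whose band contains pfh, or None if outside all bands."""
    bands = [
        (4, 1e-9, 1e-8),
        (3, 1e-8, 1e-7),
        (2, 1e-7, 1e-6),
        (1, 1e-6, 1e-5),
    ]
    for level, low, high in bands:
        if low <= pfh < high:
            return level
    return None  # worse than SIL 1 or below the tabulated range
```

A safety function with a PFH of 5e-7 per hour falls in the SIL 2 band, the level the text names for collaborative robot safety functions.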

II. Collision Detection

2.1 Momentum Observer-based Collision Detection

Collision detection is the core of the safety system. The momentum observer method needs no external force sensor: it estimates the external joint torque from joint torque and joint state information.

Robot dynamics equation:

\[ M(q)\ddot{q} + C(q, \dot{q})\dot{q} + g(q) = \tau_{cmd} + \tau_{ext} \]

Generalized momentum definition:

\[ p = M(q)\dot{q} \]

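Momentum observer:

\[ \hat{\tau}_{ext}(t) = K_I \left[ p(t) - p(0) - \int_0^t \left( \tau_{cmd} + C^T(q, \dot{q})\dot{q} - g(q) + \hat{\tau}_{ext} \right) ds \right] \]

This follows from differentiating \(p\) and substituting the dynamics equation. With \(C\) chosen so that \(\dot{M} - 2C\) is skew-symmetric, \(\dot{M} = C + C^T\) and therefore \(\dot{p} = \tau_{cmd} + \tau_{ext} + C^T(q, \dot{q})\dot{q} - g(q)\). The estimate then satisfies \(\dot{\hat{\tau}}_{ext} = K_I(\tau_{ext} - \hat{\tau}_{ext})\): it is a first-order low-pass filter of the true external torque and needs no joint acceleration measurement. \(K_I\) is the positive diagonal observer gain matrix. Larger gains respond faster but pass more noise and model error into the estimate.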
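A discrete-time version of a momentum observer can be sketched for a single joint. This is a minimal sketch under stated assumptions: a one-link pendulum with constant inertia (so the Coriolis term vanishes), gravity-compensation control, explicit Euler integration, and the common formulation that integrates the momentum dynamics instead of measuring acceleration. The names `MomentumObserver` and `simulate_contact` and all numeric values are hypothetical.

```python
import math


class MomentumObserver:
    """Single-joint momentum observer; constant inertia, so C(q, qd) = 0."""

    def __init__(self, inertia, gain, dt):
        self.inertia = inertia
        self.gain = gain
        self.dt = dt
        self.integral = 0.0
        self.p0 = None
        self.residual = 0.0

    def update(self, qd, tau_cmd, gravity_torque):
        p = self.inertia * qd  # generalized momentum p = M * qd
        if self.p0 is None:
            self.p0 = p
        # Integrate the modeled momentum rate plus the previous residual.
        self.integral += (tau_cmd - gravity_torque + self.residual) * self.dt
        self.residual = self.gain * (p - self.p0 - self.integral)
        return self.residual


def simulate_contact(tau_ext_value=2.0, contact_start=0.5, t_end=1.0, dt=1e-3):
    """Simulate a pendulum that is pushed by an external torque after contact_start."""
    mass, length, g0 = 1.0, 0.5, 9.81
    inertia = mass * length * length
    observer = MomentumObserver(inertia, gain=50.0, dt=dt)
    q, qd = 0.3, 0.0
    history = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        gravity = mass * g0 * length * math.sin(q)
        tau_cmd = gravity  # gravity compensation only
        tau_ext = tau_ext_value if t >= contact_start else 0.0
        history.append((t, observer.update(qd, tau_cmd, gravity)))
        qdd = (tau_cmd + tau_ext - gravity) / inertia  # true plant dynamics
        qd += qdd * dt
        q += qd * dt
    return history
```

In this setup the residual stays at zero before contact and then rises toward the applied external torque with a time constant of roughly `1 / gain` seconds, which is the trade-off between sensitivity and response speed that the gain controls.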

2.2 Collision Detection Pipeline

Joint torque readout -> Dynamics model computation -> Residual estimation -> Threshold check -> Reaction policy

Reaction policy levels:

Level 0 - Monitor: Log collision events, take no action
Level 1 - Decelerate: Reduce speed to safe range
Level 2 - Stop: Safety-rated monitored stop (Category 2 stop)
Level 3 - Retract: Move away from the contact along the estimated collision direction to reduce contact force
Level 4 - E-stop: Immediately cut power (Category 0 stop)
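The threshold check and reaction levels above could be wired together as follows. This is only a sketch: mapping larger residuals to higher levels is an assumption, and the threshold values in `THRESHOLDS` are hypothetical placeholders that would have to come from a risk assessment.

```python
from enum import IntEnum


class Reaction(IntEnum):
    MONITOR = 0
    DECELERATE = 1
    STOP = 2
    RETRACT = 3
    E_STOP = 4


# Hypothetical residual thresholds in N*m, checked from most to least severe.
THRESHOLDS = [
    (20.0, Reaction.E_STOP),
    (10.0, Reaction.RETRACT),
    (5.0, Reaction.STOP),
    (2.0, Reaction.DECELERATE),
    (0.5, Reaction.MONITOR),
]


def select_reaction(residuals):
    """Pick a reaction level from per-joint residuals; None means no collision."""
    peak = max(abs(r) for r in residuals)
    for threshold, reaction in THRESHOLDS:
        if peak >= threshold:
            return reaction
    return None
```

The largest absolute residual over all joints decides the level, so a single joint exceeding a threshold is enough to trigger the reaction.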

2.3 Sensor Fusion-based Collision Detection

Method	Sensor	Response Time	Advantage	Disadvantage
Joint torque	Joint torque sensor	<5ms	No extra hardware	Limited by model accuracy
Electronic skin	Tactile sensor array	<1ms	Direct contact measurement	High cost, limited coverage
Visual prediction	Depth camera	50-100ms	Can predict collisions	High latency, occlusion
Current observation	Motor current sensor	<2ms	Low cost	Low sensitivity

III. Human-Robot Interaction Safety Zones

3.1 Safety Zone Division

Per ISO 13855 and ISO 10218-2, robot workspace is divided into:

+---------------------------------------------+
|             Forbidden Zone (Zone 0)          |
|        Within robot motion envelope          |
|    +-------------------------------+        |
|    |        Work Zone (Zone 1)     |        |
|    |     Normal robot operation    |        |
|    +-------------------------------+        |
+---------------------------------------------+
|            Warning Zone (Zone 2)             |
|     Robot decelerates, issues warning        |
+---------------------------------------------+
|             Safe Zone (Zone 3)               |
|          Normal human activity area          |
+---------------------------------------------+

3.2 Safety Distance Calculation

Per ISO 13855, minimum safety distance \(S\):

\[ S = K \times T + C \]

where:
\(K\): Human approach speed (2.0 m/s for hand and arm approach at short range, 1.6 m/s for walking)
\(T\): Total system response time (sensor + control + braking)
\(C\): Additional distance covering intrusion before detection (depends on detection device type)
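A worked example of the formula, as a minimal sketch. The function name and the example response times and margin are hypothetical, chosen only to show the arithmetic.

```python
def min_safety_distance(response_time_s, margin_m, approach_speed_m_s=1.6):
    """Compute S = K * T + C in metres."""
    if response_time_s < 0 or margin_m < 0 or approach_speed_m_s <= 0:
        raise ValueError("inputs must be non-negative and speed positive")
    return approach_speed_m_s * response_time_s + margin_m


# Hypothetical system: 20 ms sensor + 30 ms control + 250 ms braking.
total_response_s = 0.02 + 0.03 + 0.25
distance_m = min_safety_distance(total_response_s, margin_m=0.85)
```

With these numbers, S = 1.6 x 0.3 + 0.85 = 1.33 m. Braking time dominates the response time, so shortening it reduces the required distance the most.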

3.3 Dynamic Safety Zone (Speed and Separation Monitoring)

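Modern collaborative robots compute the protective separation distance dynamically, following the speed and separation monitoring formula of ISO/TS 15066 (shown here for constant speeds):

\[ S_{min}(t) = v_H \times (T_r + T_s) + v_R \times T_r + S_s + C + Z_d + Z_r \]

where:
\(v_H\): Human approach speed
\(v_R\): Robot speed toward the human
\(T_r\): Robot system reaction time
\(T_s\): Robot stopping time
\(S_s\): Robot stopping distance
\(C\): Intrusion distance (how far a body part can enter the sensing field before it is detected)
\(Z_d\), \(Z_r\): Position uncertainty of the human (sensing) and of the robot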
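The dynamic distance can be evaluated each control cycle and compared against the measured separation. A minimal sketch, reading the symbols as human speed, robot speed, reaction time, stopping time, stopping distance, intrusion distance, and the two position uncertainties; the function names and example values are hypothetical.

```python
def protective_separation(v_h, v_r, t_r, t_s, s_s, c, z_d, z_r):
    """Evaluate S_min = v_H (T_r + T_s) + v_R T_r + S_s + C + Z_d + Z_r in metres."""
    return v_h * (t_r + t_s) + v_r * t_r + s_s + c + z_d + z_r


def separation_ok(measured_m, **params):
    """True if the measured human-robot distance is at least S_min."""
    return measured_m >= protective_separation(**params)


# Hypothetical cell parameters (SI units).
cell = dict(v_h=1.6, v_r=0.5, t_r=0.1, t_s=0.3, s_s=0.1, c=0.1, z_d=0.05, z_r=0.02)
required_m = protective_separation(**cell)
```

Here the required distance is 0.64 + 0.05 + 0.1 + 0.1 + 0.05 + 0.02 = 0.96 m. If the measured separation drops below it, the robot has to slow down or stop.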

IV. Emergency Stop System Design

4.1 Stop Categories (IEC 60204-1)

Category	Description	Implementation	Application Scenario
Category 0	Uncontrolled stop by immediate power removal	Cut motor power directly	Highest emergency
Category 1	Controlled stop, then power removal	Decelerate under power first, then cut power	General emergency
Category 2	Controlled stop, power maintained	Decelerate to standstill, keep power on	Safety-rated monitored stop

Dual-Channel Redundancy

Emergency stop circuits must use a dual-channel design: a stop is triggered if either channel opens or fails. Safety PLCs must have self-diagnostic capability.
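The dual-channel logic can be illustrated in software, although real emergency stop circuits are implemented in safety-rated hardware. In this sketch, motion is permitted only while both channels are closed, and a disagreement that lasts longer than a tolerance latches a fault. The class name and the 50 ms discrepancy tolerance are assumptions.

```python
class DualChannelEStop:
    """Evaluate two redundant E-stop channels with a discrepancy check."""

    def __init__(self, max_discrepancy_s=0.05):
        self.max_discrepancy_s = max_discrepancy_s
        self.fault_latched = False
        self._disagree_since = None

    def update(self, ch_a_closed, ch_b_closed, now_s):
        """Return True only if motion is permitted at time now_s."""
        if ch_a_closed != ch_b_closed:
            if self._disagree_since is None:
                self._disagree_since = now_s
            elif now_s - self._disagree_since > self.max_discrepancy_s:
                # Channels disagree for too long: treat as a wiring or contact fault.
                self.fault_latched = True
        else:
            self._disagree_since = None
        return ch_a_closed and ch_b_closed and not self.fault_latched
```

Because the fault is latched, the system stays stopped even after both channels read closed again, until someone inspects the fault and resets it (the reset is not shown here).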

V. Robustness Design

5.1 Failure Mode Analysis

Failure Type	Example	Detection Method	Response Strategy
Sensor failure	Encoder disconnection	Signal monitoring/redundancy comparison	Switch to backup sensor
Communication interrupt	Bus fault	Heartbeat timeout detection	Switch to safe state
Software exception	Control algorithm crash	Watchdog timer	Safe stop
Power failure	Sudden power loss	Undervoltage detection	Brake lock
Mechanical wear	Increased reducer backlash	Precision monitoring/vibration analysis	Preventive maintenance
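The heartbeat timeout row in the table above can be sketched as a latching watchdog. Time is passed in explicitly so the logic is easy to test. The class name and the timeout value in the usage note are hypothetical.

```python
class HeartbeatWatchdog:
    """Trip into a safe state when no heartbeat arrives within timeout_s."""

    def __init__(self, timeout_s, now_s=0.0):
        self.timeout_s = timeout_s
        self.last_beat_s = now_s
        self.tripped = False

    def beat(self, now_s):
        """Record a heartbeat; ignored once the watchdog has tripped."""
        if not self.tripped:
            self.last_beat_s = now_s

    def check(self, now_s):
        """Return True while the link is healthy; latch False after a timeout."""
        if now_s - self.last_beat_s > self.timeout_s:
            self.tripped = True
        return not self.tripped
```

With a 10 ms timeout, a heartbeat at 5 ms keeps `check(0.012)` healthy, while `check(0.020)` trips the watchdog. Later heartbeats do not clear it, which matches the "switch to safe state" response in the table.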

5.2 Redundancy Design Principles

Sensor redundancy: Critical axes use dual encoders
Communication redundancy: EtherCAT ring topology with automatic switchover
Compute redundancy: Safety PLC independent from main controller
Power redundancy: UPS ensures safety system power
Braking redundancy: Fail-safe brakes + active braking

VI. Safety for Learning-based Policies

6.1 Safety Constraints for Neural Network Policies

When learned policies (e.g., vision-language-action (VLA) models) control a robot, additional safety layers are essential:

Learned policy output -> Safety filter -> Joint limit check -> Velocity limit -> Torque saturation -> Execute

Safety filter design:

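\[ u_{safe} = \arg\min_{u \in \mathcal{U}} \|u - u_{learned}\|^2 \quad \text{s.t.} \quad \dot{h}(x, u) + \alpha(h(x)) \geq 0 \]

where \(h(x) \geq 0\) defines the safe set, \(\alpha\) is an extended class-\(\mathcal{K}\) function, and the inequality is the Control Barrier Function (CBF) constraint. The filter changes the learned action as little as possible while keeping the state inside the safe set.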
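For a one-dimensional system the safety filter has a closed-form solution, which makes the idea easy to see. This sketch assumes a single integrator \(\dot{x} = u\), a position limit encoded as \(h(x) = x_{max} - x\), and the usual CBF condition \(\dot{h} + \alpha h \geq 0\) with a linear \(\alpha\), which reduces to \(u \leq \alpha (x_{max} - x)\). The function names and parameter values are hypothetical.

```python
def cbf_filter(u_learned, x, x_max, alpha=5.0, u_max=1.0):
    """Project u_learned onto the set allowed by the CBF and actuator limits."""
    u_cbf = alpha * (x_max - x)  # upper bound on u from the CBF condition
    # If the CBF bound is below -u_max the problem is infeasible; brake as hard as allowed.
    upper = max(min(u_max, u_cbf), -u_max)
    return min(max(u_learned, -u_max), upper)


def rollout(x0=0.0, x_max=1.0, u_learned=1.0, dt=0.01, steps=1000):
    """Apply a constant learned command through the filter and return the final x."""
    x = x0
    for _ in range(steps):
        x += cbf_filter(u_learned, x, x_max) * dt
    return x
```

Far from the limit the learned command passes through unchanged. Near the limit the filter scales it down, so the state approaches `x_max` without crossing it. For multi-dimensional systems the same problem is usually solved numerically as a quadratic program.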

6.2 Safety Considerations for Sim2Real Deployment

Action space clipping: Constrain output range within physical safety limits
Progressive deployment: Verify at low speed first, then gradually increase
Human supervision: Initial deployment must have human-in-the-loop monitoring
Fallback strategy: Switch to safe controller when policy output is abnormal
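Action clipping and the fallback rule from the list above can be combined in one wrapper around the policy output. A minimal sketch: treating non-finite values as the "abnormal" condition is an assumption, and the function name and limits are hypothetical.

```python
import math


def safe_action(policy_action, lower, upper, fallback_action):
    """Clip a policy action to limits, or return the fallback if it is abnormal.

    Returns (action, used_fallback).
    """
    if not (len(policy_action) == len(lower) == len(upper) == len(fallback_action)):
        raise ValueError("all inputs must have the same length")
    # NaN or infinite outputs are treated as a policy failure.
    if any(not math.isfinite(a) for a in policy_action):
        return list(fallback_action), True
    clipped = [min(max(a, lo), hi) for a, lo, hi in zip(policy_action, lower, upper)]
    return clipped, False
```

For progressive deployment, the same wrapper can be called with tighter `lower` and `upper` bounds at first, widening them as low-speed trials succeed.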