Evaluating Motion Plan Safety with Trajectory Predictors
The Safety-Soundness Dilemma
Autonomous systems need safety monitors that are both **complete** (catch all true dangers) and **sound** (don't flag safe situations as dangerous). In practice the two goals pull against each other; the figure below, taken from the paper, illustrates this trade-off.

The FORCE-OPT Framework
Our solution, FORCE-OPT, is a principled framework that builds a safety monitor from four components, each described below.


Trajectory Predictor
Treats modern trajectory predictors as data-driven estimators of Forward Reachable Sets (FRS), grounding predictions in realistic, learned behavior.

Calibration
Calibrates the FRS with conformal prediction to correct for model errors, so that the true future path is covered with a high, user-specified probability.
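
One common way to obtain a coverage guarantee of this kind is split conformal prediction. The sketch below is illustrative only and is not taken from the paper: the function name `conformal_radius`, the use of displacement error as the score, and the synthetic data are all assumptions. It picks a radius from held-out prediction errors so that a fresh error falls inside it with probability at least 1 - alpha, provided calibration and test errors are exchangeable.

```python
import numpy as np

def conformal_radius(cal_errors, alpha):
    """Radius covering a fresh error with probability >= 1 - alpha.

    Assumes calibration and test errors are exchangeable (split conformal).
    """
    scores = np.sort(np.asarray(cal_errors, dtype=float))
    n = scores.size
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return float("inf")  # too few calibration samples for this alpha
    return float(scores[k - 1])

# Synthetic displacement errors (metres) standing in for a held-out calibration set.
rng = np.random.default_rng(0)
cal_errors = np.linalg.norm(rng.normal(0.0, 0.5, size=(500, 2)), axis=1)
radius = conformal_radius(cal_errors, alpha=0.1)

# Empirical check on fresh synthetic errors drawn from the same distribution.
test_errors = np.linalg.norm(rng.normal(0.0, 0.5, size=(2000, 2)), axis=1)
coverage = float(np.mean(test_errors <= radius))
print(f"radius={radius:.2f} m, empirical coverage={coverage:.3f}")
```

Here `alpha` plays the role of the user-specified miscoverage level: a smaller `alpha` yields a larger radius and a more conservative set.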

Convex Optimization
Uses an efficient optimization to find the smallest possible area that captures the most likely future positions, creating tight, accurate sets without slow sampling.
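
The paper's convex program is not reproduced here. To illustrate only the underlying goal (the smallest area that holds a target share of the predicted probability), the sketch below swaps in a simpler, discretized stand-in: a highest-density region on a grid. With equal-area cells, keeping the most likely cells first yields the fewest cells for a given mass. The function name `smallest_mass_region` and the two-mode toy density are assumptions made for this example.

```python
import numpy as np

def smallest_mass_region(prob, mass):
    """Boolean mask of the fewest equal-area cells whose probabilities sum to >= mass."""
    prob = np.asarray(prob, dtype=float)
    flat = prob.ravel()
    order = np.argsort(flat)[::-1]            # cells from most to least likely
    cum = np.cumsum(flat[order])
    # Index of the first prefix reaching the target (small tolerance for rounding).
    k = int(np.searchsorted(cum, mass - 1e-12)) + 1
    k = min(k, flat.size)
    mask = np.zeros(flat.size, dtype=bool)
    mask[order[:k]] = True
    return mask.reshape(prob.shape)

# Toy two-mode occupancy distribution on a 101 x 101 grid (hypothetical data).
xs, ys = np.meshgrid(np.linspace(-5, 5, 101), np.linspace(-5, 5, 101))

def bump(cx, cy, s):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * s ** 2))

density = 0.7 * bump(-2.0, 0.0, 0.6) + 0.3 * bump(2.0, 1.0, 0.6)
prob = density / density.sum()

region = smallest_mass_region(prob, mass=0.9)
print("cells kept:", int(region.sum()), "of", prob.size)
print("mass inside:", float(prob[region].sum()))
```

Because the region follows the density, it can cover both modes with two separate blobs rather than one large set spanning the empty space between them.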

Bayesian Update
Dynamically adjusts the FRS conservativeness based on the predictor's real-time performance, adding a layer of safety for unexpected scenarios.

Performance
Experimental results on the nuScenes dataset, showing how different methods perform on in-distribution scenarios.
- False Positive Rate (FPR): lower is better
- False Negative Rate (FNR): lower is better
- Coverage (Cov): higher is better
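
As a reference for how these three metrics are typically computed, here is a minimal sketch. The definitions are assumptions, since the exact ones used in the paper are not stated here: a false positive is a safe plan that the monitor flags, a false negative is an unsafe plan that it misses, and coverage is the fraction of ground-truth future positions that fall inside the predicted FRS. The function names are hypothetical.

```python
import numpy as np

def monitor_error_rates(flagged, unsafe):
    """Return (FPR, FNR) for a safety monitor.

    flagged[i]: the monitor raised an alarm for plan i.
    unsafe[i]:  plan i was actually unsafe (ground truth).
    """
    flagged = np.asarray(flagged, dtype=bool)
    unsafe = np.asarray(unsafe, dtype=bool)
    n_safe = int(np.sum(~unsafe))
    n_unsafe = int(np.sum(unsafe))
    false_pos = int(np.sum(flagged & ~unsafe))   # safe plans flagged as dangerous
    false_neg = int(np.sum(~flagged & unsafe))   # dangerous plans that were missed
    fpr = false_pos / n_safe if n_safe else float("nan")
    fnr = false_neg / n_unsafe if n_unsafe else float("nan")
    return fpr, fnr

def frs_coverage(inside):
    """Fraction of ground-truth future positions that fell inside the FRS."""
    return float(np.mean(np.asarray(inside, dtype=bool)))

fpr, fnr = monitor_error_rates(
    flagged=[True, True, False, False, True],
    unsafe=[True, False, False, False, False],
)
cov = frs_coverage([True, True, True, False])
print(fpr, fnr, cov)
```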


Adapting to Uncertainty with Bayesian Update
When the system detects that the predictor is unreliable, the belief-based version of FORCE-OPT makes the reachable set more conservative, which prevents a failure that the unadjusted set would have missed.
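
The idea can be sketched with a two-hypothesis Bayes filter. This is an illustration under stated assumptions, not the paper's update rule: the belief is the probability that the predictor is currently reliable, the evidence at each step is whether the last FRS covered the observed position, and the coverage likelihoods, the linear inflation rule, and the names `update_belief` and `inflated_radius` are all hypothetical.

```python
def update_belief(belief, covered, p_cover_reliable=0.9, p_cover_unreliable=0.5):
    """One Bayes step on P(predictor is reliable).

    covered: whether the previous FRS contained the observed position.
    The two coverage probabilities are assumed values for illustration.
    """
    like_reliable = p_cover_reliable if covered else 1.0 - p_cover_reliable
    like_unreliable = p_cover_unreliable if covered else 1.0 - p_cover_unreliable
    numerator = like_reliable * belief
    return numerator / (numerator + like_unreliable * (1.0 - belief))

def inflated_radius(base_radius, belief, max_scale=3.0):
    """Grow the set as trust drops: belief=1 keeps the base radius,
    belief=0 scales it by max_scale (a hypothetical linear rule)."""
    return base_radius * (1.0 + (max_scale - 1.0) * (1.0 - belief))

belief = 0.9
history = []
for covered in [True, False, False, False]:   # three consecutive misses
    belief = update_belief(belief, covered)
    history.append(belief)
    print(f"covered={covered}  belief={belief:.3f}  "
          f"radius={inflated_radius(2.0, belief):.2f} m")
```

After repeated misses the belief falls and the radius grows, which is the qualitative behavior described above; a run of covered observations moves it back toward the base set.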


Performance Comparison (Balanced Error Rate)
Impact of Multi-Modality on Performance
As shown in the heatmaps from the paper, using more prediction modes (moving from left to right on the x-axis) generally improves the performance of FORCE-OPT and its variants by reducing the Balanced Error Rate (BER).
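
For reference, the Balanced Error Rate is conventionally the mean of the two error rates, so a monitor cannot score well by trading one for the other. Assuming the paper follows this standard definition:

```latex
\mathrm{BER} = \frac{\mathrm{FPR} + \mathrm{FNR}}{2}
```

For example, a monitor with FPR = 0.10 and FNR = 0.30 has BER = 0.20.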

Conclusion & Future Directions
Key Takeaways
FORCE-OPT offers a robust, efficient, and principled framework for safety monitoring in learned autonomy stacks. By integrating convex optimization, conformal prediction, and Bayesian update, it significantly outperforms existing methods in balancing safety (low false negatives) and practicality (low false positives), even in challenging out-of-distribution scenarios.
Future Work
- Joint Multi-Agent Reachability: Extend the framework to compute FRS for multiple agents simultaneously, capturing complex interactions in dense traffic.
- Direct FRS Generation: Train a neural network to output Forward Reachable Sets directly, potentially improving efficiency and accuracy.