HABIT: Human Action Benchmark for Interactive Traffic in CARLA
Mohan Ramesh · Mark Azer · Fabian Flohr
Abstract
Current autonomous driving (AD) simulations are critically limited by their inadequate representation of realistic and diverse human behavior, which is essential for ensuring safety and reliability. Existing benchmarks often simplify pedestrian interactions, failing to capture the complex, dynamic intentions and varied responses that are critical for robust system deployment. To overcome this, we introduce HABIT (Human Action Benchmark for Interactive Traffic), a high-fidelity simulation benchmark. HABIT integrates real-world human motion, sourced from mocap and videos, into CARLA via a modular, extensible, and physically consistent motion retargeting pipeline. From an initial pool of approximately 30,000 retargeted motions, we curate 4,730 traffic-compatible pedestrian motions, standardized in SMPL format for physically consistent trajectories. HABIT seamlessly integrates with CARLA's Leaderboard, enabling automated scenario generation and rigorous agent evaluation. Our safety metrics, including Abbreviated Injury Scale (AIS) and False Positive Braking Rate (FPBR), reveal critical failure modes in state-of-the-art AD agents missed by prior evaluations. Evaluating zero-shot performance on pose estimation, segmentation, and tracking underscores the visual realism inherent in our benchmark. While modern end-to-end planning methods like Interfuser achieve zero collisions per kilometer on the CARLA Leaderboard, they perform notably worse on HABIT, with 5.24 collisions/km and a 10.96% AIS 3+ injury risk. In scenes with idle pedestrians, they brake unnecessarily in 63.9% of cases. All components are publicly released to support reproducible, pedestrian-aware AI research.
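The two rate metrics quoted above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: collisions/km is taken as total collisions divided by total distance driven, and FPBR as the fraction of idle-pedestrian scenarios in which the agent braked. The `ScenarioResult` record and its field names are hypothetical and are not part of HABIT or CARLA.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ScenarioResult:
    """Hypothetical per-scenario log record (not a HABIT or CARLA type)."""
    distance_km: float     # distance driven in the scenario
    collisions: int        # number of pedestrian collisions
    pedestrian_idle: bool  # True if the pedestrian never entered the ego path
    agent_braked: bool     # True if the agent braked for the pedestrian


def collisions_per_km(results: Sequence[ScenarioResult]) -> float:
    """Assumed definition: total collisions over total distance driven."""
    total_km = sum(r.distance_km for r in results)
    if total_km <= 0:
        raise ValueError("total distance must be positive")
    return sum(r.collisions for r in results) / total_km


def false_positive_braking_rate(results: Sequence[ScenarioResult]) -> float:
    """Assumed definition: share of idle-pedestrian scenarios with braking."""
    idle = [r for r in results if r.pedestrian_idle]
    if not idle:
        raise ValueError("no idle-pedestrian scenarios to evaluate")
    return sum(r.agent_braked for r in idle) / len(idle)


if __name__ == "__main__":
    # Made-up illustrative data, not results from the benchmark.
    demo = [
        ScenarioResult(0.5, 1, False, True),
        ScenarioResult(0.5, 0, True, True),
        ScenarioResult(1.0, 0, True, False),
    ]
    print(f"collisions/km: {collisions_per_km(demo):.2f}")
    print(f"FPBR: {false_positive_braking_rate(demo):.1%}")
```

Under these assumed definitions, an FPBR of 63.9% would mean the agent braked in roughly 64 of every 100 scenarios where the pedestrian posed no conflict.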