|
Autonomous driving (AD) technology is shifting from traditional rule-based algorithms to data-driven end-to-end (E2E) algorithms to overcome the limitations of rule-based methods in complex urban environments. However, E2E algorithms are highly sensitive to the training dataset and often suffer from causal confusion (learning spurious correlations) and from data imbalance, both of which degrade generalization and safety. Moreover, because E2E algorithms typically rely on synchronized sequential inputs to process temporally continuous observations, they often require a predefined inference interval. This constraint limits control frequency and makes the algorithms vulnerable to asynchronous sensor inputs in real-time applications. To address causal confusion and data imbalance, we propose a purpose-driven dataset construction strategy. We further develop an adaptive sequential inference framework that removes the fixed timing constraints of sequential networks and enables real-time operation with asynchronous sensor inputs. Experiments across diverse scenarios, including downtown traffic and school zones, show robust edge-case performance, an improvement of over 70% in driving score compared to the baseline, and a second-place finish in the 2025 Hyundai Motor Group (HMG) AD Challenge, conducted in real time on the MORAI digital-twin simulator. These results demonstrate that data-centric design combined with simulation-based evaluation can substantially enhance the real-world deployability of E2E AD systems. |