World models in autonomous driving: A review and outlook
Authors:

YIN Hongbo, TIAN Daxin

Affiliation:

(School of Transportation Science and Engineering, Beihang University, Beijing 100074, China)

About the authors:

YIN Hongbo (born 2000), male, Ph.D. candidate; TIAN Daxin (born 1980), male, professor and doctoral supervisor

Corresponding author:

TIAN Daxin, dtian@buaa.edu.cn

CLC number:

U463.6; TP18

Fund project:

National Natural Science Foundation of China (4,1, 62432002); Beijing-Tianjin-Hebei Basic Research Cooperation Special Project (F2024201070)

    Abstract:

    As autonomous driving systems evolve toward general intelligence, the world model, a cognitive engine that internally models, reasons about, and predicts the environment, is becoming a key technical pathway for breaking through the bottlenecks of traditional perception-decision paradigms and handling long-tail scenarios. To systematically organize the research progress and key issues of world models in autonomous driving, and to explore technical routes toward deploying general intelligent driving, this paper presents a systematic review of the research status and development trends of world models for autonomous driving. First, the basic concept of world models and their core functions in autonomous driving are clarified, the mainstream technical architectures are summarized, and the strengths and weaknesses of the various paradigms are compared. Second, the latest progress in three key application directions is summarized: future scene generation and understanding, end-to-end driving policy learning, and data-driven closed-loop simulation systems. This reveals the practical value of world models in improving a system's foresight and interaction understanding. Finally, the evaluation metrics for world models and the applicable scope of public datasets are organized, laying a foundation for the subsequent analysis of technical challenges. The results show that although world models have achieved initial breakthroughs in multi-scale spatiotemporal representation and complex scene generation, significant challenges remain in adherence to physical laws, safe and trustworthy reasoning, long-horizon temporal stability, and lightweight deployment. Accordingly, future research should focus on efficient computing architectures, long-horizon generation consistency, uncertainty modeling, and self-supervised representations that incorporate physical knowledge, so that world models can work effectively across diverse traffic scenarios.

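Cite this article

YIN Hongbo, TIAN Daxin. World models in autonomous driving: A review and outlook[J]. Journal of Harbin Institute of Technology, 2025, 57(12): 165. DOI:10.11918/202509069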

History
  • Received: 2025-09-16
  • Published online: 2026-01-09