Local wind information-inspired AVW-PPO indoor odor source localization algorithm

Affiliation:

(1. School of Electrical Engineering, Xinjiang University, Urumqi 830017, China; 2. School of Intelligence Science and Technology, Xinjiang University, Urumqi 830017, China)

About the authors:

LI Shiyu (born 1998), male, master's student; YUAN Jie (born 1975), male, professor, doctoral supervisor

Corresponding authors:

LI Shiyu, lsy534066742@163.com; YUAN Jie, yuanjie@xju.edu.cn

CLC number:

TP242.6

Funding:

National Natural Science Foundation of China (62263031); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C53)
    Abstract:

    Odor source localization (OSL) in complex, dynamic indoor plume environments suffers from low efficiency and insufficient success rates, particularly under turbulent conditions, where robots struggle to perceive the environment accurately and navigate effectively. To address this, an auxiliary value and wind-guided proximal policy optimization (AVW-PPO) algorithm based on deep reinforcement learning is proposed. First, an auxiliary value network is added to the original PPO algorithm to reduce the estimation bias of a single value network, which improves both the stability of policy updates and prediction accuracy. Second, a wind-guided strategy incorporates local wind field information into the state space and reward function of the reinforcement learning framework, so that the robot perceives dynamic changes in the plume environment more sensitively and optimizes its decision path, effectively improving localization efficiency. Finally, a gas diffusion model is built in a two-dimensional environment, and the proposed algorithm is tested under three different turbulence conditions. The results show that, under identical environmental conditions, AVW-PPO outperforms other comparable algorithms in both average search steps and success rate, with a localization success rate above 99%. The wind-guided strategy is particularly effective at improving search efficiency, helping to reduce the time the robot needs to complete the task. This study offers a new approach to odor source localization in complex turbulent indoor environments.
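The abstract names three ingredients: a PPO policy update, an auxiliary value network, and a reward that uses local wind information. The Python sketch below illustrates how such pieces could fit together; it is not the authors' implementation. The abstract does not give the reward formula or say how the two value estimates are combined, so the upwind-alignment bonus, its `weight`, and the plain average in `blended_value` are all assumptions made for illustration, and every function name is hypothetical. Only `ppo_clip_term` follows the standard PPO clipped surrogate.

```python
import math


def upwind_alignment(move_dir, wind_dir):
    """Cosine between the robot's move direction and the upwind direction.

    Both arguments are 2-D vectors (dx, dy). wind_dir points where the wind
    blows TO, so the upwind direction is its negation.
    """
    mx, my = move_dir
    wx, wy = wind_dir
    norm = math.hypot(mx, my) * math.hypot(wx, wy)
    if norm == 0.0:
        return 0.0  # no motion or calm air: no wind guidance available
    return -(mx * wx + my * wy) / norm


def wind_guided_reward(base_reward, move_dir, wind_dir, weight=0.1):
    """ASSUMED shaping: base reward plus a bonus for moving upwind."""
    return base_reward + weight * upwind_alignment(move_dir, wind_dir)


def blended_value(v_main, v_aux):
    """ASSUMED combination of main and auxiliary critics: a plain average."""
    return 0.5 * (v_main + v_aux)


def ppo_clip_term(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for a single sample."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    wind = (1.0, 0.0)   # wind blows toward +x
    move = (-1.0, 0.0)  # robot steps toward -x, i.e. straight upwind
    print(wind_guided_reward(-0.01, move, wind))
    print(ppo_clip_term(1.5, 2.0))
```

In this sketch a step straight upwind earns the full bonus, a crosswind step earns none, and a downwind step is penalized; the clipped term caps how much a single favorable sample can move the policy.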

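Cite this article

LI Shiyu, YUAN Jie, XIE Linwei, GUO Xu, ZHANG Ningning. Local wind information-inspired AVW-PPO indoor odor source localization algorithm[J]. Journal of Harbin Institute of Technology, 2025, 57(8): 57. DOI:10.11918/202410030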

History
  • Received: 2024-10-14
  • Published online: 2025-08-11
  • Publication date: 2025-08-10