Agent-guided video re-localization network
Affiliation:

(School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)

CLC Number:

TP391.4


    Abstract:

    Video re-localization aims to localize, within an untrimmed reference video, the moment that semantically corresponds to a given query video. The task matches how users actually browse video and is useful in a range of application scenarios. Because videos carry richer information than other data forms such as images and text, accurately identifying the target moment in a long video and determining its temporal boundaries is challenging. This paper treats video re-localization as a sequential decision-making process and applies reinforcement learning to achieve efficient and accurate localization. Specifically, it proposes an agent-guided localization network (AGLN), which trains an agent to progressively refine the temporal boundaries of the localized moment according to a learned policy, thereby finding the moment most relevant to the query video. In addition, AGLN combines reinforcement learning with supervised learning in a multi-task learning framework, which helps the agent explore the environment more effectively and learn the optimal policy. Experimental results on the ActivityNet-VRL dataset show that AGLN outperforms existing methods on the video re-localization task. AGLN reaches an average retrieval accuracy of 25.9%, which is 0.2 percentage points higher than the best existing method.
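    The abstract does not spell out the agent's action set or reward, so the following is only a minimal sketch of what "progressively refining temporal boundaries" could look like as a sequential decision process. Everything in it is an assumption made for illustration: the five discrete actions, the fixed step size, the temporal IoU used as the quality signal, and the greedy oracle that stands in for a learned policy. It is not AGLN's implementation. In particular, the oracle reads the ground-truth segment, which a trained agent would not have at inference time; it is used here only to show the kind of IoU gain a reward could be built from.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) segments, in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical action set: (direction of start move, direction of end move).
ACTIONS = {
    "shift_left": (-1, -1),
    "shift_right": (1, 1),
    "expand": (-1, 1),
    "shrink": (1, -1),
    "stop": (0, 0),
}


def apply_action(segment, action, step, duration):
    """Move the boundaries by `step` seconds, clamped to [0, duration]."""
    ds, de = ACTIONS[action]
    start = min(max(segment[0] + ds * step, 0.0), duration)
    end = min(max(segment[1] + de * step, 0.0), duration)
    if end <= start:
        # Reject moves that would collapse the window.
        return segment
    return (start, end)


def greedy_refine(segment, target, step, duration, max_steps=20):
    """Oracle stand-in for a learned policy (assumption, not the paper's agent).

    At each step it takes the action with the largest IoU gain against
    `target` and stops when no action improves the IoU.
    """
    for _ in range(max_steps):
        best_seg, best_iou = segment, temporal_iou(segment, target)
        for name in ACTIONS:
            cand = apply_action(segment, name, step, duration)
            iou = temporal_iou(cand, target)
            if iou > best_iou:
                best_seg, best_iou = cand, iou
        if best_seg == segment:
            break  # equivalent to choosing "stop"
        segment = best_seg
    return segment


if __name__ == "__main__":
    refined = greedy_refine((10.0, 40.0), (20.0, 30.0), step=5.0, duration=100.0)
    print(refined, temporal_iou(refined, (20.0, 30.0)))
```

    Starting from the window (10.0, 40.0) with a 5-second step, the oracle shrinks the window twice and then stops at (20.0, 30.0). If the initial window does not overlap the target at all, no single action raises the IoU and this greedy stand-in stops immediately, which illustrates why exploration matters for this kind of agent.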

History
  • Received: August 17, 2023
  • Online: March 31, 2026