[EACL 2024] PyTorch code of NavHint: Vision and Language Navigation Agent with a Hint Generator

HLR/NavHint

NavHint: Vision and Language Navigation Agent with a Hint Generator

This repo provides the official implementation of NavHint (EACL 2024 Findings).

Abstract: Existing work on vision and language navigation mainly relies on navigation-related losses to establish the connection between vision and language modalities, neglecting aspects of helping the navigation agent build a deep understanding of the visual environment. In our work, we provide indirect supervision to the navigation agent through a hint generator that provides detailed visual descriptions. The hint generator assists the navigation agent in developing a global understanding of the visual environment. It directs the agent's attention toward related navigation details, including the relevant sub-instruction, potential challenges in recognition and ambiguities in grounding, and the targeted viewpoint description. To train the hint generator, we construct a synthetic dataset based on landmarks in the instructions and visible and distinctive objects in the visual environment. We evaluate our method on the R2R and R4R datasets and achieve state-of-the-art on several metrics. The experimental results demonstrate that generating hints not only enhances the navigation performance but also helps improve the interpretability of the agent's actions.

Framework

Hints Preparation

  1. Processed sub-instructions (original sub-instructions: FG-R2R).
  2. Stored candidate viewpoints for each view.
  3. CLIP objects for each viewpoint.

The generated hints can be obtained by running python reason_gen.py, or taken directly from the hints dataset.
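
As a rough illustration of how the three inputs listed above might be combined into a hint, the sketch below builds a short text from a sub-instruction, its landmarks, and the objects visible in each candidate viewpoint. It is not the logic of reason_gen.py: the function name build_hint, the input layout, and the output wording are all assumptions made for this example. A landmark is treated as ambiguous when it is visible from more than one candidate viewpoint, and an object as distinctive when it appears only in the target viewpoint.

```python
from typing import Dict, List


def build_hint(sub_instruction: str,
               landmarks: List[str],
               candidate_objects: Dict[str, List[str]],
               target_id: str) -> str:
    """Compose a textual hint for one navigation step (hypothetical helper)."""
    target_objects = candidate_objects[target_id]

    # Objects visible in any candidate viewpoint other than the target.
    other_objects = {
        obj
        for view_id, objs in candidate_objects.items()
        if view_id != target_id
        for obj in objs
    }

    # A landmark is ambiguous when more than one candidate viewpoint shows it.
    ambiguous = [
        lm for lm in landmarks
        if sum(lm in objs for objs in candidate_objects.values()) > 1
    ]

    # Distinctive objects appear in the target viewpoint only.
    distinctive = [obj for obj in target_objects if obj not in other_objects]

    parts = [f"Sub-instruction: {sub_instruction}."]
    if ambiguous:
        parts.append("Ambiguous landmarks: " + ", ".join(ambiguous) + ".")
    if distinctive:
        parts.append("Target viewpoint contains: " + ", ".join(distinctive) + ".")
    return " ".join(parts)


# Toy input; the viewpoint ids and object names are made up for illustration.
hint = build_hint(
    sub_instruction="walk past the table",
    landmarks=["table"],
    candidate_objects={
        "view_a": ["table", "lamp"],
        "view_b": ["table", "sofa"],
    },
    target_id="view_b",
)
print(hint)
```

In this toy case the table is visible from both candidates, so it is flagged as ambiguous, while the sofa is what separates the target viewpoint from the alternative.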

Navigator

  1. Installing Environment and Downloading Dataset
  2. Initial weights for NavHint
  3. Trained NavHint weights (place the trained weights under the snap folder)
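
Before launching the training or test scripts, it can help to confirm that the downloaded weights are in the snap folder. The helper below is a minimal sketch and not part of this repo: the name list_weight_files is hypothetical, and the default path assumes it is run from the repo root. Because the file names inside snap are not specified here, the check only lists whatever files are present.

```python
from pathlib import Path
from typing import List


def list_weight_files(snap_dir: str = "snap") -> List[Path]:
    """Return all files found under snap_dir, or an empty list if it is missing."""
    root = Path(snap_dir)
    if not root.is_dir():
        return []
    # Search recursively, since weights may sit in a sub-folder.
    return sorted(p for p in root.rglob("*") if p.is_file())


files = list_weight_files()
if files:
    print(f"Found {len(files)} file(s) under snap:")
    for path in files:
        print(f"  {path}")
else:
    print("No files found under snap; download the weights first.")
```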

Train Navigator

bash run/train_reasoner.bash

Test Navigator

bash run/test_agent.bash

Citation

If you find our work useful in your research, please consider citing:

@article{zhang2024navhint,
 title={NavHint: Vision and language navigation agent with a hint generator},
 author={Zhang, Yue and Guo, Quan and Kordjamshidi, Parisa},
 journal={arXiv preprint arXiv:2402.02559},
 year={2024}
}
