Basically a robot agent: it breaks down a complex task into smaller, easier-to-execute tasks for the robot.
The main contribution of our paper is a hierarchical interactive robot learning system (Hi Robot), a novel framework that uses VLMs for both high-level reasoning and low-level task execution.
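
To make the two-level split concrete, here is a minimal sketch of a hierarchical control loop. It is not the paper's implementation: the class `HierarchicalAgent`, the callables `toy_high_level` and `toy_low_level`, and the sandwich plan are all hypothetical stand-ins. In the real system both levels would be VLMs operating on camera images, whereas here observations are plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class HierarchicalAgent:
    # Hypothetical high-level policy: (observation, user prompt, history)
    # -> next short command, or None when the task is finished.
    high_level: Callable[[str, str, List[str]], Optional[str]]
    # Hypothetical low-level policy: (observation, command) -> new observation.
    low_level: Callable[[str, str], str]
    history: List[str] = field(default_factory=list)

    def run(self, prompt: str, observation: str, max_steps: int = 10) -> List[str]:
        """Alternate between high-level reasoning and low-level execution."""
        for _ in range(max_steps):
            command = self.high_level(observation, prompt, self.history)
            if command is None:  # high level decides the task is complete
                break
            observation = self.low_level(observation, command)
            self.history.append(command)
        return self.history


# Toy stand-ins (assumptions for illustration only, not real models).
TOY_PLAN = ["pick up bread", "pick up cheese", "place bread on top"]


def toy_high_level(observation: str, prompt: str, history: List[str]) -> Optional[str]:
    # Emit the next step of a fixed plan based on how many steps are done.
    return TOY_PLAN[len(history)] if len(history) < len(TOY_PLAN) else None


def toy_low_level(observation: str, command: str) -> str:
    # Pretend to execute the command and report the resulting state.
    return f"{observation}; done: {command}"


agent = HierarchicalAgent(high_level=toy_high_level, low_level=toy_low_level)
steps = agent.run(prompt="make a cheese sandwich", observation="empty table")
```

The point of the design is that the high level is re-queried after every low-level step, so it can react to a new observation (or, in an interactive setting, new user feedback) instead of committing to a full plan up front.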

The paper claims that, with fine-tuning, the robot's high-level model outperforms GPT-4o with prompt engineering.