My analysis of the code for the paper "SINGER: An Onboard Generalist Vision-Language Navigation Policy for Drones"
SINGER builds on top of SousVide, which builds on top of FiGS, which builds on top of acados,
so fun times.
The code is also very interesting: the README seems to be outdated or to reference the wrong repository.
Semantic Scene Generation
SINGER claims that each scene has a semantic representation, so it's searchable with a user query. The purpose of this is to run RRT* from the target object (described by the query) and generate trajectories to the free space around the target, in order to synthetically generate data.
How they gave each scene a semantic representation is not conveyed very well in the paper, and the codebase runs you in circles until it references a method that's not in the code.
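The semantic lookup is the part I couldn't trace, but the RRT* half of the idea is standard, so here is a minimal sketch of what "RRT* rooted at the target, branches out to nearby free space" could look like. This is not SINGER's code. It assumes a 2D world with circular obstacles, a target position that has already been found by the query, and a distance ring (2 to 3 units) standing in for "free space around the target". All names (segment_is_free, rrt_star, branch) and all numbers are made up for illustration.

```python
import math
import random

def segment_is_free(p, q, obstacles, resolution=0.05):
    """Return True if the straight segment p -> q misses every circular obstacle."""
    steps = max(1, int(math.dist(p, q) / resolution))
    for k in range(steps + 1):
        t = k / steps
        x, y = p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True

def rrt_star(root, obstacles, bounds, n_iter=400, step=0.5, radius=1.0, seed=0):
    """Grow an RRT* tree rooted at `root` (assumed here to be the target position)."""
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    nodes, parent, cost, children = [root], {0: None}, {0: 0.0}, {0: []}
    for _ in range(n_iter):
        sample = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        nearest = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        d = math.dist(nodes[nearest], sample)
        if d == 0.0:
            continue
        s = min(step, d) / d  # steer: move at most `step` toward the sample
        new = (nodes[nearest][0] + s * (sample[0] - nodes[nearest][0]),
               nodes[nearest][1] + s * (sample[1] - nodes[nearest][1]))
        if not segment_is_free(nodes[nearest], new, obstacles):
            continue
        # Neighbours that can reach `new` without hitting an obstacle.
        near = [i for i in range(len(nodes))
                if math.dist(nodes[i], new) <= radius
                and segment_is_free(nodes[i], new, obstacles)]
        # Choose the parent that gives the cheapest path from the root.
        best = min(near, key=lambda i: cost[i] + math.dist(nodes[i], new))
        idx = len(nodes)
        nodes.append(new)
        parent[idx], children[idx] = best, []
        cost[idx] = cost[best] + math.dist(nodes[best], new)
        children[best].append(idx)
        # Rewire: reroute neighbours through `new` when that is cheaper.
        for i in near:
            through_new = cost[idx] + math.dist(new, nodes[i])
            if through_new < cost[i]:
                children[parent[i]].remove(i)
                parent[i] = idx
                children[idx].append(i)
                delta = cost[i] - through_new
                stack = [i]
                while stack:  # push the saving down to all descendants
                    j = stack.pop()
                    cost[j] -= delta
                    stack.extend(children[j])
    return nodes, parent, cost

def branch(parent, nodes, idx):
    """Waypoints from the root (target) out to node `idx`."""
    path = []
    while idx is not None:
        path.append(nodes[idx])
        idx = parent[idx]
    return path[::-1]

# Hypothetical scene: target at the centre, two circular obstacles nearby.
target = (5.0, 5.0)
obstacles = [(3.0, 5.0, 0.8), (6.5, 6.5, 0.7)]
nodes, parent, cost = rrt_star(target, obstacles, bounds=((0.0, 10.0), (0.0, 10.0)))

# Assumed stand-in for "free space around the target": nodes 2 to 3 units away.
ring = [i for i, p in enumerate(nodes) if 2.0 <= math.dist(p, target) <= 3.0]
trajectories = [branch(parent, nodes, i) for i in ring]
```

Each entry in trajectories is a list of waypoints between the target and a free point near it. Whether SINGER flies these outward or reversed (toward the target) is something I haven't confirmed from the code.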