Paper Link: https://arxiv.org/html/2405.10142v1
This paper proposes:
- 3D Reconstruction System using 3DGS with online evaluation
- Evaluation metrics for reconstruction completeness and quality
- Safety constraints, combined with the evaluation metrics into a trajectory optimization framework
- Simulation experiments to validate effectiveness
Reconstruction
NeRF gained popularity for its photorealistic rendering capabilities. NeRF-based reconstruction methods can be divided into 3 main types:
- MLP-based methods
  - Scale better, but face challenges with catastrophic forgetting in larger scenes
- Hybrid representation methods
  - Combine the advantages of MLPs and structure features, enhancing scene scalability
- Explicit methods
  - Store map features directly in voxels, without any MLPs, enabling faster optimization
NeRF methods excel at photorealistic reconstruction, but are computationally expensive.
Active View Planning
3DGS Map Representation
They use the existing SplaTAM SLAM method for real-time 3DGS reconstruction. The scene is represented as a set of isotropic 3D Gaussians.
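How much influence does one Gaussian have at location $x$?
$$f(x) = o \exp\left(-\frac{\lVert x - \mu \rVert^2}{2r^2}\right)$$
- $\mu$: center of the Gaussian
- $r$: its radius / spread
- $o$: maximum opacity
- $x$: some 3D point where we want to evaluate the Gaussian
- $\lVert x - \mu \rVert^2$: squared distance from $x$ to the Gaussian center
If $x$ is very close to the center, the exponential term is close to 1, so $f(x) \approx o$.
If $x$ is far away, the exponential term gets very small, so the Gaussian contributes almost nothing.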
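A minimal NumPy sketch of this influence function, assuming the isotropic form $o \exp(-\lVert x - \mu \rVert^2 / (2r^2))$. The function name `gaussian_influence` and the example values are made up for illustration and are not taken from the paper or from SplaTAM's code.

```python
import numpy as np

def gaussian_influence(x, mu, r, o):
    """Influence of one isotropic Gaussian at 3D point x (assumed form):
    o * exp(-||x - mu||^2 / (2 * r^2))."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sq_dist = np.sum((x - mu) ** 2)  # squared distance from x to the center
    return float(o * np.exp(-sq_dist / (2.0 * r ** 2)))

# Hypothetical Gaussian at the origin with radius 0.5 and maximum opacity 0.8
mu = [0.0, 0.0, 0.0]
near = gaussian_influence([0.0, 0.0, 0.0], mu, r=0.5, o=0.8)  # at the center
far = gaussian_influence([5.0, 0.0, 0.0], mu, r=0.5, o=0.8)   # 10 radii away
```

At the center the value equals the maximum opacity `o`; ten radii away it is numerically negligible, which matches the two cases described above.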
Rendered pixel color
Computes the final color of one pixel $p$ by combining all Gaussians that project onto that pixel:
$$C(p) = \sum_{i=1}^{n} c_i \, f_i(p) \prod_{j=1}^{i-1} \left(1 - f_j(p)\right)$$
- $n$: the number of Gaussians that overlap that pixel
- $c_i$: color of the $i$-th Gaussian
- $f_i(p)$: opacity contribution of the $i$-th Gaussian at that pixel
- Gaussians are ordered by depth, from front to back
Each Gaussian contributes:
- color × opacity × how much light still gets through the earlier Gaussians
- the deeper the Gaussian is in that pixel, the less it contributes
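The blending rule above can be sketched as front-to-back alpha compositing for a single pixel. This is an illustrative sketch, not the SplaTAM rasterizer: it assumes the per-Gaussian opacities at the pixel have already been evaluated and sorted by depth, and the names `composite_color`, `colors`, and `alphas` are hypothetical.

```python
import numpy as np

def composite_color(colors, alphas):
    """Front-to-back compositing for one pixel.

    colors: (n, 3) RGB of each Gaussian, sorted front to back
    alphas: (n,) opacity contribution of each Gaussian at this pixel
    Returns the blended RGB and the per-Gaussian weights.
    """
    colors = np.asarray(colors, dtype=float).reshape(-1, 3)
    alphas = np.asarray(alphas, dtype=float)
    # Light remaining before Gaussian i: prod_{j<i} (1 - alpha_j); 1 for the first
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas)))[:-1]
    weights = alphas * transmittance
    return weights @ colors, weights

# Three hypothetical Gaussians (red, green, blue), each with opacity 0.5
colors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
alphas = [0.5, 0.5, 0.5]
pixel, weights = composite_color(colors, alphas)
# weights are [0.5, 0.25, 0.125]: each deeper Gaussian contributes half as much
```

With equal opacities the weights halve at each step, which is the "deeper contributes less" behaviour in the bullets above.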
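Same idea as the color equation, but blending depth values instead of colors, with $f_i(p)$ the opacity contribution of Gaussian $i$ at pixel $p$:
$$D(p) = \sum_{i=1}^{n} d_i \, f_i(p) \prod_{j=1}^{i-1} \left(1 - f_j(p)\right)$$
- $d_i$: depth of the center of Gaussian $i$ in camera coordinates
- $D(p)$: final depth value at that pixel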
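A short sketch of the depth blend, assuming the same front-to-back weights as for color (opacity times remaining transmittance). The name `composite_depth` and the numbers are illustrative only.

```python
import numpy as np

def composite_depth(depths, alphas):
    """Blend per-Gaussian center depths for one pixel, front to back.

    depths: (n,) depth of each Gaussian center in camera coordinates
    alphas: (n,) opacity contribution of each Gaussian at this pixel
    """
    depths = np.asarray(depths, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Light remaining before Gaussian i: prod_{j<i} (1 - alpha_j)
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas)))[:-1]
    return float(np.sum(depths * alphas * transmittance))

# Hypothetical ray: two semi-transparent Gaussians, then a fully opaque one
depth = composite_depth([1.0, 2.0, 4.0], [0.5, 0.5, 1.0])
# weights are [0.5, 0.25, 0.25], so depth = 0.5*1 + 0.25*2 + 0.25*4 = 2.0
```

Because the last Gaussian here is fully opaque, the weights sum to 1 and the result is a weighted average of the depths. When the opacities along the ray do not saturate, the weights sum to less than 1 and the blended value is pulled toward zero.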
Completeness Evaluation
How much “new scene” can you reveal from this viewpoint?
- pick a candidate camera viewpoint
- for each pixel sort the gaussians by depth front to back
- look for gaps between adjacent gaussians
- approximate each gap as a small 3d volume
- discount each gap's volume by the opacity of the Gaussians in front of it, because an occluded gap is less visible
- sum over gaps to get pixel’s completeness gain then sum over pixels to score the whole viewpoint
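Per-pixel completeness gain (symbol names here are reconstructed from the definitions below and may differ from the paper's notation):
$$G(p) = \sum_{i=1}^{n} v_i \prod_{j=1}^{m_i} \left(1 - f_j(p)\right)$$
- $n$: number of separate unseen chunks along one pixel's ray
- $v_i$: size of the $i$-th unseen chunk
- $m_i$: how many Gaussians lie in front of that unseen chunk
- $f_j(p)$: the opacity of each of those earlier Gaussians at the pixel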
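The steps above can be sketched for one pixel as follows. This is a rough illustration under stated assumptions, not the paper's implementation: it assumes each unseen chunk's volume is discounted by the product of (1 − opacity) over the Gaussians in front of it, and that gap volumes and counts are already known. All names (`pixel_completeness_gain`, `gap_volumes`, `gaussians_in_front`) are hypothetical.

```python
import numpy as np

def pixel_completeness_gain(gap_volumes, gaussians_in_front, alphas):
    """Completeness gain along one pixel's ray (assumed discount model).

    gap_volumes: (n,) size of each unseen chunk along the ray
    gaussians_in_front: (n,) number of Gaussians in front of each chunk
    alphas: (k,) opacity of each Gaussian at this pixel, sorted front to back
    """
    gap_volumes = np.asarray(gap_volumes, dtype=float)
    m = np.asarray(gaussians_in_front, dtype=int)
    alphas = np.asarray(alphas, dtype=float)
    # visibility[k] = prod_{j<k} (1 - alpha_j): light left after the first k Gaussians
    visibility = np.cumprod(np.concatenate(([1.0], 1.0 - alphas)))
    return float(np.sum(gap_volumes * visibility[m]))

def view_completeness_gain(per_pixel_gains):
    """Score a candidate viewpoint by summing the gains of all its pixels."""
    return float(np.sum(per_pixel_gains))

# Hypothetical ray: two Gaussians with opacities 0.5 and 0.8,
# one gap of size 2.0 behind the first and one of size 3.0 behind both
gain = pixel_completeness_gain([2.0, 3.0], [1, 2], [0.5, 0.8])
# gain = 2.0 * 0.5 + 3.0 * (0.5 * 0.2) = 1.3
view_score = view_completeness_gain([gain, 0.0, 0.7])
```

A gap with no Gaussians in front of it keeps its full volume, while a gap behind a fully opaque Gaussian contributes nothing, so viewpoints looking into unoccluded unseen space score highest.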