Cross-scale 3D Scene Generation
Approach
WonderZoom begins with a single input image and progressively constructs a hierarchy of 3D scenes covering ever‑finer spatial scales. Two key innovations make this possible:
• Multi-Scale Gaussian Surfels – a dynamically updatable representation that lets new fine‑scale surfels be inserted without re‑optimising the entire scene, while still rendering in real time.
• Progressive Detail Synthesizer – an autoregressive module that leverages image, depth, and large language models to hallucinate and register novel 3D content whenever the user zooms into a region or issues a text prompt.
These components work together to enable interactive exploration from macro panoramas to micro textures, outperforming prior single‑scale generators in both perceptual quality and scale consistency.
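To make the idea of inserting fine-scale content without re-optimising the scene more concrete, here is a minimal Python sketch. It is not WonderZoom's implementation. The names `Surfel`, `ScaleLevel`, `MultiScaleScene`, `zoom_in` and `placeholder_synthesizer` are hypothetical, and so is the `native_scale` field and the rule for choosing levels at render time. The synthesizer is a stand-in for the image, depth and language models mentioned above. The sketch only illustrates two properties: levels are append-only, so earlier levels are never touched, and each new level is generated from the previous one, which is what makes the process autoregressive.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Surfel:
    """A flat disc in 3D (hypothetical minimal attribute set)."""
    position: Vec3
    normal: Vec3
    radius: float
    color: Vec3


@dataclass
class ScaleLevel:
    # Assumed convention: world-space size of one pixel when the level was made.
    native_scale: float
    surfels: List[Surfel] = field(default_factory=list)


class MultiScaleScene:
    """Append-only stack of levels, ordered from coarse to fine."""

    def __init__(self) -> None:
        self.levels: List[ScaleLevel] = []

    def insert_level(self, native_scale: float, surfels: List[Surfel]) -> None:
        # Existing levels are left untouched; only a new level is appended.
        if self.levels and native_scale >= self.levels[-1].native_scale:
            raise ValueError("a new level must be finer than the previous one")
        self.levels.append(ScaleLevel(native_scale, list(surfels)))

    def visible_surfels(self, view_scale: float) -> List[Surfel]:
        # Assumed selection rule: always draw the coarsest level, and add a
        # finer level only once the view is zoomed in enough to resolve it.
        return [
            s
            for i, level in enumerate(self.levels)
            if i == 0 or level.native_scale >= view_scale
            for s in level.surfels
        ]


Synthesizer = Callable[[ScaleLevel, Vec3, float], List[Surfel]]


def zoom_in(scene: MultiScaleScene, center: Vec3, zoom_factor: float,
            synthesize: Synthesizer) -> None:
    """Generate one finer level conditioned on the current finest level."""
    parent = scene.levels[-1]
    new_scale = parent.native_scale / zoom_factor
    scene.insert_level(new_scale, synthesize(parent, center, new_scale))


def placeholder_synthesizer(parent: ScaleLevel, center: Vec3,
                            scale: float) -> List[Surfel]:
    # Stand-in for the generative models: two small surfels near the centre.
    x, y, z = center
    up = (0.0, 0.0, 1.0)
    grey = (0.5, 0.5, 0.5)
    return [
        Surfel((x - scale, y, z), up, scale, grey),
        Surfel((x + scale, y, z), up, scale, grey),
    ]
```

For example, a scene could be seeded with one coarse level via `scene.insert_level(1.0, [...])` and then refined with `zoom_in(scene, (0.0, 0.0, 0.0), 4.0, placeholder_synthesizer)`. Calling `scene.visible_surfels(view_scale)` afterwards returns only the coarse surfels for a wide view and both levels for a close-up view.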