TY - GEN
T1 - PanoDreamer
T2 - 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025
AU - Xiong, Zhexiao
AU - Chen, Zhang
AU - Li, Zhong
AU - Xu, Yi
AU - Jacobs, Nathan
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Automatically generating a complete 3D scene from a text description, a reference image, or both has significant applications in fields like virtual reality and gaming. However, current methods often generate low-quality textures and inconsistent 3D structures. This is especially true when extrapolating significantly beyond the field of view of the reference image. To address these challenges, we propose PanoDreamer, a novel framework for consistent, 3D scene generation with flexible text and image control. Our approach employs a large language model and a warp-refine pipeline, first generating an initial set of images and then compositing them into a 360-degree panorama. This panorama is then lifted into 3D to form an initial point cloud. We then use several approaches to generate additional images, from different viewpoints, that are consistent with the initial point cloud and expand/refine the initial point cloud. Given the resulting set of images, we utilize 3D Gaussian Splatting to create the final 3D scene, which can then be rendered from different viewpoints. Experiments demonstrate the effectiveness of PanoDreamer in generating high-quality, geometrically consistent 3D scenes.
AB - Automatically generating a complete 3D scene from a text description, a reference image, or both has significant applications in fields like virtual reality and gaming. However, current methods often generate low-quality textures and inconsistent 3D structures. This is especially true when extrapolating significantly beyond the field of view of the reference image. To address these challenges, we propose PanoDreamer, a novel framework for consistent, 3D scene generation with flexible text and image control. Our approach employs a large language model and a warp-refine pipeline, first generating an initial set of images and then compositing them into a 360-degree panorama. This panorama is then lifted into 3D to form an initial point cloud. We then use several approaches to generate additional images, from different viewpoints, that are consistent with the initial point cloud and expand/refine the initial point cloud. Given the resulting set of images, we utilize 3D Gaussian Splatting to create the final 3D scene, which can then be rendered from different viewpoints. Experiments demonstrate the effectiveness of PanoDreamer in generating high-quality, geometrically consistent 3D scenes.
KW - customized scene generation
KW - panorama synthesis
KW - scene generation
KW - text-driven 3d generation
UR - https://www.scopus.com/pages/publications/105017853118
U2 - 10.1109/CVPRW67362.2025.00034
DO - 10.1109/CVPRW67362.2025.00034
M3 - Conference contribution
AN - SCOPUS:105017853118
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 295
EP - 304
BT - Proceedings - 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025
PB - IEEE Computer Society
Y2 - 11 June 2025 through 12 June 2025
ER -