Recent advances in diffusion models have significantly improved image generation and editing, but extending these capabilities to 3D assets remains challenging, especially for fine-grained edits that require multi-view consistency. Existing methods typically restrict editing to predetermined viewing angles, severely limiting their flexibility and practical applications. We introduce Edit360, a tuning-free framework that extends 2D modifications to multi-view consistent 3D editing. Built upon video diffusion models, Edit360 enables user-specific editing from arbitrary viewpoints while ensuring structural coherence across all views. The framework selects anchor views for 2D modifications and propagates edits across the entire 360-degree range. To achieve this, Edit360 introduces a novel Anchor-View Editing Propagation mechanism, which effectively aligns and merges multi-view information within the latent and attention spaces of diffusion models. The resulting edited multi-view sequences facilitate the reconstruction of high-quality 3D assets, enabling customizable 3D content creation.
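The abstract describes merging multi-view information within the attention space of the diffusion model. The snippet below is a minimal, hypothetical sketch of what such attention-space merging could look like for a single layer, assuming standard scaled-dot-product self-attention inside a diffusion U-Net; the function name, tensor shapes, and blending weight are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch: blend a target view's self-attention with cross-attention
# to features taken from the anchor view's forward pass. All names are assumed.
import torch
import torch.nn.functional as F


def merge_anchor_attention(q_tgt, k_tgt, v_tgt, k_anchor, v_anchor, weight=0.5):
    """Blend self-attention of a target view with attention to the anchor view.

    q_tgt, k_tgt, v_tgt: (B, N, C) query/key/value of the view being propagated to.
    k_anchor, v_anchor:  (B, N, C) key/value cached from the anchor view's pass.
    weight:              blending factor between self- and anchor-attention.
    """
    self_out = F.scaled_dot_product_attention(q_tgt, k_tgt, v_tgt)
    anchor_out = F.scaled_dot_product_attention(q_tgt, k_anchor, v_anchor)
    return (1.0 - weight) * self_out + weight * anchor_out


# Toy usage with random features standing in for U-Net attention activations.
B, N, C = 1, 1024, 64
q, k, v = (torch.randn(B, N, C) for _ in range(3))
k_a, v_a = torch.randn(B, N, C), torch.randn(B, N, C)
out = merge_anchor_attention(q, k, v, k_a, v_a, weight=0.5)
print(out.shape)  # torch.Size([1, 1024, 64])
```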
Overview of Edit360, from the input instruction and object (text, image, or 3D model) to the edited 3D asset. With vp (the back view in this case) selected as the anchor view for editing and v0 (the front view) serving to preserve identity information, our Dual-Stream Diffusion Network progressively generates and fuses multi-view sequences through Spatial Progressive Fusion (SPF) and Cross-View Alignment (CVA) at each sampling step, ensuring consistent view generation for reconstructing the edited 3D asset.
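The caption outlines a per-step sampling loop with two streams fused by SPF and CVA. Below is a heavily hedged pseudocode sketch of such a loop, assuming one identity-preserving stream anchored at v0 and one edited stream anchored at vp; `denoise_step`, `spatial_progressive_fusion`, and `cross_view_alignment` are placeholder stand-ins, not the paper's actual modules.

```python
# Hypothetical sketch of a dual-stream sampling loop over multi-view latents.
# Shapes, schedules, and fusion rules are assumptions for illustration only.
import torch


def denoise_step(latents, t):
    # Placeholder for one video-diffusion denoising step over a multi-view sequence.
    return latents - 0.01 * torch.randn_like(latents)


def spatial_progressive_fusion(edited, original, t, num_steps):
    # Placeholder SPF: gradually favor the edited stream as sampling progresses.
    alpha = 1.0 - t / num_steps
    return alpha * edited + (1.0 - alpha) * original


def cross_view_alignment(latents):
    # Placeholder CVA: softly share information across the view dimension.
    return 0.9 * latents + 0.1 * latents.mean(dim=0, keepdim=True)


num_views, num_steps = 16, 50
original = torch.randn(num_views, 4, 32, 32)   # identity stream, anchored at v0
edited = original.clone()
edited[num_views // 2] += 0.5                  # 2D edit applied at the anchor view vp

for t in range(num_steps, 0, -1):
    original = denoise_step(original, t)
    edited = denoise_step(edited, t)
    edited = spatial_progressive_fusion(edited, original, t, num_steps)
    edited = cross_view_alignment(edited)

print(edited.shape)  # torch.Size([16, 4, 32, 32]) -> multi-view sequence for 3D reconstruction
```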
@inproceedings{huang2025edit360,
  title={Edit360: 2D Image Edits to 3D Assets from Any Angle},
  author={Huang, Junchao and Hu, Xinting and Shi, Shaoshuai and Tian, Zhuotao and Jiang, Li},
  booktitle={ICCV},
  year={2025}
}