3D Stylization
Demos for the LLFF dataset
Demos for the TnT dataset
Demos for the DL3DV dataset
More demos
Multi-Style Blending and Stylization Control
Methodology
Stylos introduces a single-forward 3D Gaussian framework for geometry-aware, view-consistent 3D stylization.
(1) Camera pose-free 3D stylization: Unlike previous 3D stylization methods that require per-scene optimization or known camera poses, Stylos performs instant stylization from unposed content images and a single style reference.
(2) 3D style loss: To enforce cross-view coherence and geometry-aware stylization, we introduce a voxel-based 3D style loss that aligns aggregated scene features with style statistics.
(3) Scalability and generalization: The proposed pipeline scales from a single view to hundreds of views with a single style image, and achieves zero-shot generalization to unseen categories, scenes, and styles.
The framework uses a pretrained VGGT model as its backbone. Given a style reference image and one or more content views (also referred to as context views),
Stylos predicts a set of stylized 3D Gaussian primitives together with camera parameters,
enabling faithful reconstruction of the observed scene while transferring the desired style. A key component is the 3D style loss, which matches voxelized 3D features with 2D style statistics.
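To make the idea of a voxel-based 3D style loss more concrete, the following is a minimal NumPy sketch, not the Stylos implementation. It assumes that scene features are attached to 3D points, that features are averaged within each occupied voxel, and that the "style statistics" are channel-wise mean and standard deviation (an AdaIN-style choice). The function names `voxel_pool`, `channel_stats`, and `style_loss_3d`, as well as the `voxel_size` parameter, are hypothetical and introduced only for illustration.

```python
import numpy as np


def voxel_pool(points, feats, voxel_size):
    """Average per-point features inside each occupied voxel.

    points: (N, 3) array of 3D positions.
    feats:  (N, C) array of per-point features.
    Returns an (M, C) array, one row per occupied voxel.
    """
    # Integer voxel coordinates for every 3D point.
    coords = np.floor(points / voxel_size).astype(np.int64)
    # Map each point to the index of its occupied voxel.
    _, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_voxels = int(inverse.max()) + 1
    # Sum the features per voxel, then divide by the point count.
    sums = np.zeros((n_voxels, feats.shape[1]))
    np.add.at(sums, inverse, feats)
    counts = np.bincount(inverse, minlength=n_voxels)[:, None]
    return sums / counts


def channel_stats(feats, eps=1e-5):
    """Channel-wise mean and standard deviation of an (M, C) array."""
    mean = feats.mean(axis=0)
    std = np.sqrt(feats.var(axis=0) + eps)
    return mean, std


def style_loss_3d(points, point_feats, style_feats, voxel_size=0.1):
    """Assumed form of the loss: match mean/std of voxel-pooled scene
    features to mean/std of 2D style features of shape (K, C)."""
    voxel_feats = voxel_pool(points, point_feats, voxel_size)
    mu_c, sd_c = channel_stats(voxel_feats)
    mu_s, sd_s = channel_stats(style_feats)
    return float(np.mean((mu_c - mu_s) ** 2) + np.mean((sd_c - sd_s) ** 2))


# Toy usage with random data standing in for real features.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(1000, 3))
point_feats = rng.normal(size=(1000, 8))
style_feats = rng.normal(size=(256, 8))
loss = style_loss_3d(points, point_feats, style_feats)
```

Because the features are pooled in 3D before the statistics are computed, every view that observes the same region contributes to one shared voxel feature, which is one plausible way such a loss could encourage cross-view consistency.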
BibTeX
If you find our work useful for your research, please consider citing our paper:
@article{liu2025stylos,
title={Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting},
author={Liu, Hanzhou and Huang, Jia and Lu, Mi and Saripalli, Srikanth and Jiang, Peng},
journal={arXiv preprint arXiv:2509.26455},
year={2025}
}