4D head capture aims to generate dynamic meshes with consistent topology, together with corresponding texture maps, from videos. It is widely used in movies and games because it can reproduce facial muscle movements and recover dynamic texture detail such as pore squeezing. The industry typically relies on a pipeline of multi-view stereo followed by non-rigid alignment. However, this approach is error-prone and depends heavily on time-consuming manual processing by artists. To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images. Specifically, we first represent the time-series faces as a set of dynamic 3D Gaussians with fixed topology, in which the Gaussian centers are bound to the mesh vertices. We then alternate between geometry and texture optimization frame by frame, learning high-quality geometry and texture while keeping the topology stable over time. Finally, we extract dynamic facial meshes with regular wiring and high-fidelity textures with pore-level details from the learned Gaussians. Extensive experiments show that our method outperforms current state-of-the-art face reconstruction methods in both mesh and texture quality.
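The idea of binding Gaussian centers to mesh vertices under a fixed topology can be illustrated with a toy sketch. This is not the Topo4D implementation: the class `BoundGaussians`, the function `track_sequence`, and the update rule are hypothetical, and the real method optimizes against calibrated multi-view images through differentiable rendering, whereas this sketch substitutes a simple pull toward known target positions. It only shows that, when each frame is initialized from the previous one and the face list is never modified, every extracted mesh shares the same vertex count and connectivity.

```python
import numpy as np


class BoundGaussians:
    """Toy Gaussians whose centers are tied one-to-one to mesh vertices."""

    def __init__(self, vertices, faces):
        # Gaussian centers double as the mesh vertex positions, shape (V, 3).
        self.centers = np.asarray(vertices, dtype=np.float64).copy()
        # Face indices define the topology and are never modified, shape (F, 3).
        self.faces = np.asarray(faces, dtype=np.int64).copy()
        # Placeholder appearance attributes (assumed, unused by the toy update).
        self.colors = np.full((len(self.centers), 3), 0.5)

    def step_geometry(self, target, lr=0.5):
        # Stand-in for a gradient step on a rendering loss:
        # move each center a fraction of the way toward its target.
        self.centers += lr * (np.asarray(target, dtype=np.float64) - self.centers)

    def extract_mesh(self):
        # Reading the mesh back is just the current centers plus the fixed faces.
        return self.centers.copy(), self.faces


def track_sequence(template_vertices, faces, per_frame_targets, iters=20):
    """Fit frames in order; each frame starts from the previous frame's result."""
    gaussians = BoundGaussians(template_vertices, faces)
    meshes = []
    for target in per_frame_targets:
        for _ in range(iters):
            gaussians.step_geometry(target)
        meshes.append(gaussians.extract_mesh())
    return meshes


# A tetrahedron that grows slightly over three frames.
template = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
targets = [template * s for s in (1.0, 1.1, 1.2)]
meshes = track_sequence(template, faces, targets)
```

Because only the centers move, the meshes in `meshes` are in dense vertex-to-vertex correspondence across frames, which is the property the abstract describes as temporal topology stability.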
Topo4D can generate temporally consistent dynamic meshes and corresponding 8K texture maps with pore-level details from calibrated multi-view videos.
Topo4D can capture subtle facial changes and various extreme expressions, representing muscle tremors and dynamic wrinkles.
@article{li2024topo4d,
title={Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture},
author={Li, Xuanchen and Cheng, Yuhao and Ren, Xingyu and Jia, Haozhe and Xu, Di and Zhu, Wenhan and Yan, Yichao},
journal={arXiv preprint arXiv:2406.00440},
year={2024}
}