Entry in the University Bibliography of TU Chemnitz

Full text available at
URN: urn:nbn:de:bsz:ch1-qucosa2-954101


Zeitvogel, Samuel
Hirtz, Gangolf (Prof. Dr.-Ing.) ; Laubenheimer, Astrid (Prof. Dr.-Ing.) (reviewers)

Joint Optimization for Multi-Person Shape Models from Markerless 3D-Scans


Abstract in English

Advancements in 3D sensing technology are driving numerous contemporary applications in augmented reality, virtual try-on, and markerless motion capture. Since the data derived from such technologies frequently suffers from noise, occlusion, and resolution restrictions, we propose a novel approach to overcome these limitations by exploiting high-quality multi-view 3D data to train a statistical 3D human shape model, focusing on body shape analysis. Our research addresses the intricacy of training parametric 3D shape models from unstructured training data. Conventional methods rely on a complex multi-stage training pipeline, involving a registration step and a model parameter estimation step. While these methods have proven valuable, they tend to pose challenges in acquiring high-quality registrations in an unsupervised setting. The objective of this work is thus to simplify, streamline, and unify the various stages of the pipeline while improving the shape model formulation and the training process. The resulting approach can be used to train articulated shape models end-to-end without human supervision. Using noisy 3D data collected via low-cost sensors, the statistical 3D human shape model training pipeline helps to infer the most probable body proportions and postures to generate high-quality human shape models that can represent arbitrary human shapes in different body proportions and postures. To address these challenges, our work seeks to amalgamate the strengths of expressive human body shape models within a holistic training pipeline, whereby the feasibility of a simplified, end-to-end training framework capable of handling 3D scans corrupted by noise is explored. We also investigate the expressiveness of the models produced through this pipeline, comparing it with current state-of-the-art methods. 
Furthermore, we analyze the possibility of a significant reduction in the parameter count required by these methods without compromising their expressiveness or performance. Building upon established practices for differentiable shape model formulation, objective formulation, and joint optimization, the research yields three key contributions. First, we introduce a differentiable multi-person articulated human shape model that can be trained using joint optimization without any 3D supervision. Our model significantly reduces the parameter count, facilitates realistic 3D avatar generation with a low-poly base mesh, and is compatible with 3D modeling software. To this end, we enhance the prevalent mesh-based morphable models with subdivision surfaces, enabling joint optimization methods and reducing the number of model parameters. Second, we put forward a single objective for model training that can be minimized with slight modifications to off-the-shelf nonlinear least squares solvers. This objective function comprises non-Euclidean manifolds, robust cost functions, and data-to-model correspondences. Regularization methods and best practices from the literature are aggregated to avoid overfitting and degenerate solutions to the optimization problem. Additionally, we incorporate existing 2D pose estimators to improve the convergence behavior of the proposed objective. The objective function is minimized by enhancing an existing solver implementation to cope with non-Euclidean parameter spaces. Numerical optimization is performed on graphics processing units (GPUs) to reduce the training time of the proposed model. Finally, our model and the proposed optimization procedure are applied to approximately 1,000 markerless, low-resolution point clouds. We demonstrate the capability of large-scale joint optimization for multi-person shape model training, which was previously only considered when using alternating optimization methods.
We evaluate the reconstruction quality of our approach and benchmark its competitive generalization capabilities on a challenging shape correspondence benchmark. Additionally, we conduct an extensive qualitative and quantitative evaluation showcasing ablation studies, failure cases, and comparisons to existing methods. Our approach yields competitive results on the shape correspondence benchmark FAUST and outperforms other related unsupervised methods. Moreover, the qualitative evaluations demonstrate lifelike avatar generation capabilities exhibiting realistic body proportions and movement. This work enhances the current capabilities in the realm of end-to-end 3D shape modeling and training, providing a more efficient and unified method for producing high-quality human shape models from noisy sensor data. Our methods and results open up new avenues for future research and improvements in this field.
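
The abstract describes an objective that combines non-Euclidean manifolds and robust cost functions and is minimized with a nonlinear least squares solver. The following is only a minimal sketch of that general idea, not the dissertation's model or solver: it fits a single rotation on SO(3) to point pairs with known correspondences, using Gauss-Newton steps taken in the tangent space and a Huber loss applied through iteratively reweighted least squares. All function names, the choice of the Huber loss, the threshold `k`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def hat(v):
    """Skew-symmetric matrix such that hat(v) @ x == np.cross(v, x)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def so3_exp(delta):
    """Exponential map from a tangent vector to a rotation matrix (Rodrigues)."""
    theta = np.linalg.norm(delta)
    if theta < 1e-12:
        return np.eye(3) + hat(delta)  # first-order approximation near zero
    K = hat(delta / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def huber_weights(norms, k):
    """IRLS weights for the Huber loss: 1 inside the threshold k, k/|r| outside."""
    return np.where(norms <= k, 1.0, k / np.maximum(norms, 1e-12))

def fit_rotation(X, Y, k=0.1, iters=30):
    """Robust Gauss-Newton on SO(3): minimize sum_i huber(|R x_i - y_i|)."""
    R = np.eye(3)
    for _ in range(iters):
        r = X @ R.T - Y                      # residuals R x_i - y_i, shape (N, 3)
        w = huber_weights(np.linalg.norm(r, axis=1), k)
        H = np.zeros((3, 3))
        g = np.zeros(3)
        for x_i, r_i, w_i in zip(X, r, w):
            J = -R @ hat(x_i)                # derivative of R Exp(d) x_i at d = 0
            H += w_i * J.T @ J
            g += w_i * J.T @ r_i
        delta = np.linalg.solve(H, -g)       # weighted normal equations
        R = R @ so3_exp(delta)               # retraction keeps R a rotation
        if np.linalg.norm(delta) < 1e-10:
            break
    return R

def rotation_angle(Ra, Rb):
    """Angle in radians of the relative rotation between Ra and Rb."""
    c = (np.trace(Ra.T @ Rb) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

# Synthetic example (hypothetical data): 50 point pairs, 5 of them gross outliers.
rng = np.random.default_rng(0)
R_true = so3_exp(np.array([0.3, -0.2, 0.5]))
X = rng.normal(size=(50, 3))
Y = X @ R_true.T + 0.01 * rng.normal(size=(50, 3))
Y[:5] += rng.normal(scale=5.0, size=(5, 3))
R_est = fit_rotation(X, Y)
print("angular error (rad):", rotation_angle(R_est, R_true))
```

The update `R @ so3_exp(delta)` is what distinguishes this from a plain Euclidean solver: the step is computed in a three-dimensional tangent space and mapped back onto the manifold, so the estimate remains a valid rotation without re-orthogonalization. A full shape model would stack many such pose parameters together with shape coefficients and correspondences in one problem, which this sketch does not attempt.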

University: Technische Universität Chemnitz
Institute: Professur Digital- und Schaltungstechnik
Faculty: Fakultät für Elektrotechnik und Informationstechnik
Document type: Dissertation
Supervisor: Hirtz, Gangolf (Prof. Dr.-Ing.)
DOI: doi:10.60687/2025-0011
SWD subject headings: Model, 3D scanner, 3D sensor
Free keywords (English): body shape, statistical shape models, morphable models, numerical optimization, joint optimization, end-to-end training, skinning, human keypoints, subdivision surfaces, linear blend-shapes
DDC subject group: Technology, medicine, applied sciences
Language: English
Date of oral examination: 23.01.2025
OA license: CC BY-SA 4.0

 
