A Hybrid Generator Architecture for Controllable Face Synthesis
Access rights
openAccess
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Other link related to publication
Date
2023-07-23
Language
en
Pages
10
1-10
Series
Proceedings - SIGGRAPH 2023 Conference Papers
Abstract
Modern data-driven image generation models often surpass traditional graphics techniques in quality. However, while traditional modeling and animation tools allow precise control over the image generation process in terms of interpretable quantities - e.g., shapes and reflectances - endowing learned models with such controls is generally difficult. In the context of human faces, we seek a data-driven generator architecture that simultaneously retains the photorealistic quality of modern generative adversarial networks (GANs) and allows explicit, disentangled control over head shape, expression, identity, background, and illumination. While our high-level goal is shared by a large body of previous work, we approach the problem with a different philosophy: we treat it as an unconditional synthesis task, and engineer interpretable inductive biases into the model that make it easy for the desired behavior to emerge. Concretely, our generator is a combination of learned neural networks and fixed-function blocks, such as a 3D morphable head model and a texture-mapping rasterizer, and we leave it up to the training process to figure out how they should be used together. This greatly simplifies the training problem by removing the need for labeled training data; we learn the distributions of the independent variables that drive the model instead of requiring that their values are known for each training image. Furthermore, we need no contrastive or imitation learning for correct behavior. We show that our design successfully encourages the generative model to make use of the internal, interpretable representations in a semantically meaningful manner. This allows sampling of different aspects of the image independently, as well as precise control of the results by manipulating the internal state of the interpretable blocks within the generator. This enables, for instance, facial animation using traditional animation tools.
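The hybrid design summarized above - learned mappings driving an interpretable, fixed-function block - can be caricatured in a few lines. The sketch below is purely illustrative: the dimensions, the random "learned" weights, and the linear morphable model are assumptions for exposition, not the paper's actual architecture or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the paper's actual sizes are not given here.
Z_DIM, N_SHAPE, N_EXPR, N_VERTS = 16, 8, 4, 32

# "Learned" mapping networks, stubbed here as fixed random linear maps.
W_shape = rng.standard_normal((N_SHAPE, Z_DIM)) * 0.1
W_expr = rng.standard_normal((N_EXPR, Z_DIM)) * 0.1

# Fixed-function block: a linear 3D morphable model (mean + offset bases).
mean_verts = rng.standard_normal((N_VERTS, 3))
shape_basis = rng.standard_normal((N_SHAPE, N_VERTS, 3)) * 0.05
expr_basis = rng.standard_normal((N_EXPR, N_VERTS, 3)) * 0.05


def morphable_model(alpha, beta):
    """Fixed-function 3DMM: mean geometry plus shape/expression offsets."""
    return (mean_verts
            + np.tensordot(alpha, shape_basis, axes=1)
            + np.tensordot(beta, expr_basis, axes=1))


def generate(z_shape, z_expr):
    """Independent latents drive the interpretable internal state."""
    alpha = W_shape @ z_shape  # head-shape coefficients
    beta = W_expr @ z_expr     # expression coefficients
    return morphable_model(alpha, beta)


# Because the latents are independent, resampling the expression latent
# changes only beta, leaving the head shape untouched - the kind of
# disentangled control the abstract describes.
verts_a = generate(np.zeros(Z_DIM), rng.standard_normal(Z_DIM))
verts_b = generate(np.zeros(Z_DIM), rng.standard_normal(Z_DIM))
```

In the real system a differentiable rasterizer would turn these vertices into an image, and the mapping networks would be trained adversarially; here they are stubbed so the structural idea stands alone.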
Funding Information: We thank Tero Karras, Pauli Kemppinen, Tuomas Kynkäänniemi and Erik Härkönen for discussions and feedback. This work was partially supported by the European Research Council (ERC Consolidator Grant 866435), and made use of computational resources provided by the Aalto Science-IT project and the Finnish IT Center for Science (CSC). Publisher Copyright: © 2023 Owner/Author. | openaire: EC/H2020/866435/EU//PIPE
Keywords
differentiable rendering, face modeling, generative adversarial networks
Other note
Citation
Mensah, D, Kim, N H, Aittala, M, Laine, S & Lehtinen, J 2023, 'A Hybrid Generator Architecture for Controllable Face Synthesis', in S N Spencer (ed.), Proceedings - SIGGRAPH 2023 Conference Papers, 69, ACM, pp. 1-10, ACM International Conference and Exhibition on Computer Graphics and Interactive Techniques, Los Angeles, California, United States, 06/08/2023. https://doi.org/10.1145/3588432.3591563