[CVPR 2023] Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars


이성훈 Ethan 2024. 2. 2. 15:29


- Introduction

Some 2D generative models incorporate a 3DMM to perform image animation.

However, the lack of geometry constraints leads to shape distortion.

There have therefore been attempts to combine 3D GANs with a 3DMM, but they ran into problems caused by topological changes and under-constrained deformation fields.

► The key challenge of this task is modeling deformation in a 3D generative setting so that both animation accuracy and topological flexibility are achieved.

Next3D splits the head into a dynamic part and a static part and models each separately.

It proposes Generative Texture-Rasterized Tri-planes, which learn facial deformation through Generative Neural Textures.

+ It additionally proposes a module that synthesizes the mouth interior.

This module is UNet-based and synthesizes teeth well.

Main Contribution

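Presents an animatable 3D-aware GAN framework.
Proposes the Generative Texture-Rasterized Tri-plane, an efficient deformable 3D representation that inherits both the fine-grained expression control of mesh-guided explicit deformation and the flexibility of implicit volumetric representations.
Demonstrates 3D-aware one-shot facial avatars and 3D stylization.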

- Method

Generative Texture-Rasterized Tri-planes

The tri-plane features are borrowed from EG3D, but because they are not suited to animation on their own, neural textures are used.

The mesh is generated with FLAME.

Why Neural Textures (NT) are used

Compared with other explicit deformation methods, which depend on the underlying geometry, NT compensates for imperfect geometry.
Unlike implicit deformation, NT relaxes the need for elaborate imitation learning and generalizes better.

Even so, NT does not generalize well to 3D points far from the surface.

► Generative Texture-Rasterized Tri-planes make it possible to adapt the surface deformation to a continuous volume.

Two birds with one stone

Accurate mesh-guided facial deformation of Neural Textures
Continuity and topological flexibility of volumetric representations
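
The tri-plane lookup borrowed from EG3D can be sketched roughly as follows. This is only a minimal NumPy illustration, not the authors' code: it assumes coordinates normalized to [-1, 1], nearest-neighbour sampling instead of bilinear interpolation, and aggregation of the three planes by summation. The names `sample_plane` and `query_triplane` are hypothetical.

```python
import numpy as np

def sample_plane(plane, uv):
    """Nearest-neighbour feature lookup on one plane (assumption: no interpolation).

    plane: (H, W, C) feature map; uv: (N, 2) coordinates in [-1, 1].
    """
    h, w, _ = plane.shape
    # Map [-1, 1] to pixel indices and clamp to the valid range.
    cols = np.clip(np.rint((uv[:, 0] + 1) * 0.5 * (w - 1)).astype(int), 0, w - 1)
    rows = np.clip(np.rint((uv[:, 1] + 1) * 0.5 * (h - 1)).astype(int), 0, h - 1)
    return plane[rows, cols]

def query_triplane(planes, points):
    """Project 3D points onto the xy, xz and yz planes and sum the features.

    planes: dict with keys 'xy', 'xz', 'yz', each (H, W, C); points: (N, 3).
    """
    f_xy = sample_plane(planes["xy"], points[:, [0, 1]])
    f_xz = sample_plane(planes["xz"], points[:, [0, 2]])
    f_yz = sample_plane(planes["yz"], points[:, [1, 2]])
    return f_xy + f_xz + f_yz

rng = np.random.default_rng(0)
planes = {k: rng.normal(size=(8, 8, 4)) for k in ("xy", "xz", "yz")}
points = rng.uniform(-1.0, 1.0, size=(5, 3))
features = query_triplane(planes, points)  # one 4-channel feature per point
```

Because every 3D point, including points off the mesh surface, receives a feature this way, the representation stays continuous in the volume while the plane contents themselves can be driven by the mesh-rasterized texture.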

Mouth Synthesis Module

Because FLAME does not model teeth, a separate teeth synthesis step is needed.

The region where the teeth belong is cropped and processed by $G_{\text{teeth}}$, which uses a style-modulated UNet.

Neural blending then produces $T_{uv}^{\prime}$.

Modeling Static Components

Diverse haircuts, the background, and the upper body are not covered by the FLAME template, so they are hard to generate from it.

A tri-plane $T_{\text{static}}$ is therefore generated for the static part and blended with $T_{uv}^{\prime}$.

This has the advantage of enforcing consistency during facial animation.
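
One way to picture the blend of the static and dynamic planes is a per-pixel alpha blend. The sketch below is an assumption for illustration only: it supposes a soft mask that is 1 where the FLAME mesh was rasterized onto the plane and 0 elsewhere, and the name `blend_planes` is hypothetical. The exact blending used by the authors is not spelled out in these notes.

```python
import numpy as np

def blend_planes(t_uv, t_static, mask):
    """Alpha-blend a dynamic plane with a static plane (illustrative assumption).

    t_uv, t_static: (H, W, C) feature planes; mask: (H, W) values in [0, 1],
    assumed to be 1 where the mesh covers the plane.
    """
    alpha = mask[..., None]  # broadcast the mask over the channel axis
    return alpha * t_uv + (1.0 - alpha) * t_static

t_uv = np.full((4, 4, 2), 2.0)       # stand-in for the dynamic (face) plane
t_static = np.full((4, 4, 2), -1.0)  # stand-in for the static plane
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                 # mesh covers only the center
blended = blend_planes(t_uv, t_static, mask)
```

Under this assumption, only the masked region changes with expression, which is consistent with the note that the static part stays fixed during animation.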

Neural Rendering

Neural rendering is performed with an MLP.

A super-resolution module is used, as in EG3D.

$I_f$: 64x64

$I_{RGB}$: 512x512
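
The rendering step can be illustrated with standard volume-rendering compositing along a single ray. This is a minimal sketch assuming EG3D-style volume rendering, where an MLP has already decoded each sample into a density and a feature vector. The function name `composite_ray` and the toy values are hypothetical.

```python
import numpy as np

def composite_ray(densities, features, deltas):
    """Composite per-sample features along one ray.

    densities: (S,) non-negative densities; features: (S, C);
    deltas: (S,) distances between consecutive samples.
    """
    # Opacity of each ray segment.
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = transmittance * alphas
    return weights @ features, weights

densities = np.array([0.0, 1e6, 1.0])  # empty space, then an opaque surface
features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
deltas = np.full(3, 0.1)
pixel, weights = composite_ray(densities, features, deltas)
```

Repeating this for every pixel yields the low-resolution feature image, which the super-resolution module then upsamples to the final output.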

Deformation-aware Discriminator

A discriminator is used as in EG3D, but on its own it does not check whether the output matches the expected deformation.

Therefore the synthetic rendering $I_{\text{synthetic}}$ is fed into the dual discriminator together with the image pair, so that the discriminator becomes aware of expression and shape.
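
Feeding the synthetic rendering together with the image pair can be pictured as a channel-wise concatenation. The sketch below rests on several assumptions: a 3-channel low-resolution image, a 3-channel super-resolved image, a 3-channel synthetic rendering, and nearest-neighbour upsampling. The helper names are hypothetical.

```python
import numpy as np

def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling of a (C, H, W) image by an integer factor."""
    return img.repeat(factor, axis=1).repeat(factor, axis=2)

def build_discriminator_input(i_raw, i_rgb, i_synthetic):
    """Stack the image pair and the synthetic rendering along the channel axis.

    i_raw: (3, h, w) low-resolution image (assumed RGB);
    i_rgb: (3, H, W) super-resolved image;
    i_synthetic: (3, H, W) rendering of the driving mesh (assumed 3 channels).
    """
    factor = i_rgb.shape[1] // i_raw.shape[1]
    i_raw_up = upsample_nearest(i_raw, factor)
    return np.concatenate([i_rgb, i_raw_up, i_synthetic], axis=0)

i_raw = np.zeros((3, 64, 64))
i_rgb = np.ones((3, 512, 512))
i_synthetic = np.full((3, 512, 512), 0.5)
d_input = build_discriminator_input(i_raw, i_rgb, i_synthetic)
```

With the mesh rendering stacked next to the generated images, the discriminator can penalize outputs whose expression or shape disagrees with the driving mesh.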

Training Objectives

$\begin{aligned} \mathcal{L}_{D_{\text{dual}}, G} = {} & \mathbb{E}_{z \sim p_z,\, \epsilon \sim p_\epsilon}\left[f\left(D_{\text{dual}}(G(z, \epsilon))\right)\right] \\ & + \mathbb{E}_{I^r \sim p_{I^r}}\left[f\left(-D_{\text{dual}}\left(I^r\right)\right) + \lambda\left\|\nabla D_{\text{dual}}\left(I^r\right)\right\|^2\right] \end{aligned}$

$\mathcal{L}_{\text {density }}=\sum_{x_s \in \mathcal{S}}\left\|d\left(x_s\right)-d\left(x_s+\epsilon\right)\right\|_2$

$\mathcal{L}_{\text {total }}=\mathcal{L}_{D_{\text {dual }}, G}+\lambda_{\text {density }} \mathcal{L}_{\text {density }}$
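
The density regularization and the total loss can be sketched in a few lines. This is a minimal NumPy illustration, assuming a scalar density per point (so the norm reduces to an absolute value) and Gaussian perturbations $\epsilon$. The names `density_regularization`, `total_loss`, and the toy density functions are hypothetical.

```python
import numpy as np

def density_regularization(density_fn, points, noise_scale, rng):
    """Sum over sampled points of |d(x_s) - d(x_s + eps)|."""
    eps = rng.normal(scale=noise_scale, size=points.shape)
    diff = density_fn(points) - density_fn(points + eps)
    return float(np.sum(np.abs(diff)))

def total_loss(adversarial_loss, density_loss, lambda_density):
    """L_total = L_adv + lambda_density * L_density."""
    return adversarial_loss + lambda_density * density_loss

def smooth_density(x):
    # Constant field: perturbing the input never changes the density.
    return np.ones(len(x))

def bumpy_density(x):
    # Rapidly varying field: small perturbations change the density a lot.
    return np.sin(50.0 * x).sum(axis=1)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(16, 3))
loss_smooth = density_regularization(smooth_density, points, 0.01, rng)
loss_bumpy = density_regularization(bumpy_density, points, 0.01, rng)
```

The term is zero for a locally constant density and grows when the density changes sharply between nearby points, so minimizing it encourages a smoother density field.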

- Experiment

Datasets

FFHQ
DECA to estimate FLAME parameters

Baselines

3DFaceShop
AniFaceGAN
DiscoFaceGAN

Inversion experiment (PTI)

- Discussion

Identity (ID) is measured the same way as in our own paper, using ArcFace.

I think a metric like this is sufficient to represent identity.
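
An identity score of this kind is commonly computed as the cosine similarity between face embeddings. The sketch below assumes the embeddings have already been extracted by a face-recognition network such as ArcFace. The vectors are toy values and the function name `identity_similarity` is hypothetical.

```python
import numpy as np

def identity_similarity(emb_a, emb_b):
    """Cosine similarity between two face embeddings (higher = more similar)."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(a @ b)

source_embedding = np.array([1.0, 2.0, 3.0])    # toy stand-in for a real embedding
animated_embedding = np.array([2.0, 4.0, 6.0])  # same direction, different scale
score = identity_similarity(source_embedding, animated_embedding)
```

Because the score depends only on the direction of the embedding, it stays unchanged when an embedding is rescaled.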

- Reference

[1] Sun, Jingxiang, et al. "Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars." CVPR 2023.
