
Awesome Talking Face

This is a repository for organizing papers, code, and other resources related to talking face/head generation. Most papers are linked to the PDF provided by arXiv or OpenAccess. However, some papers require an academic license to browse, e.g. IEEE, Springer, and Elsevier journals.

🔆 This project is still ongoing; pull requests are welcome!

If you have any suggestions (missing papers, new papers, key researchers, or typos), please feel free to edit and open a pull request. Even just letting me know the titles of papers is a big contribution. You can do this by opening an issue or contacting me directly via email.

⭐ If you find this repo useful, please star it!

2022.09 Update!

Thanks for the PRs from everybody! From now on, I'll occasionally include papers about video-driven talking face generation, since the community has been folding video-driven methods into the talking face generation scope, although the task was originally termed Face Reenactment.

So, if you are looking for video-driven talking face generation, I would suggest starring this repo and also searching for Face Reenactment; you'll find more :)

One more thing: please correct me if you find any paper noted as an arXiv paper that has since been accepted to a conference or journal.

2021.11 Update!

I updated a batch of papers that appeared in the past few months. In this repo, I originally intended to cover audio-driven talking face generation works. However, I found several text-based research works that are also very interesting, so I included them here as well. Enjoy!

TO DO LIST

  • Main paper list
  • Add paper links
  • Add code links where available
  • Add project pages where available
  • Datasets and surveys

Papers

2D Video - Person independent

2024

  • DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation [arXiv 2024] Paper Code ProjectPage
  • Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization [arXiv 2024] Paper
  • MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting [arXiv 2024] Paper Code
  • 3D-Aware Text-driven Talking Avatar Generation [ECCV 2024] Paper
  • LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details [arXiv 2024] Paper
  • TalkinNeRF: Animatable Neural Fields for Full-Body Talking Humans [ECCVW 2024] Paper
  • JoyHallo: Digital human model for Mandarin [arXiv 2024] Paper Code
  • JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation [BMVC 2024] Paper
  • StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads [TPAMI 2024] Paper
  • DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures [CVPRW 2024] Paper
  • EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion [arXiv 2024] Paper
  • SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model [arXiv 2024] Paper
  • SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing [arXiv 2024] Paper
  • Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency [arXiv 2024] Paper ProjectPage
  • PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation [arXiv 2024] Paper
  • CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention [arXiv 2024] Paper ProjectPage
  • TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation [arXiv 2024] Paper
  • S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis [arXiv 2024] Paper
  • FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model [arXiv 2024] Paper
  • LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control [arXiv 2024] Paper ProjectPage Code
  • High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model [arXiv 2024] Paper
  • Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation [arXiv 2024] Paper
  • LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement [arXiv 2024] Paper ProjectPage Code
  • Learning Online Scale Transformation for Talking Head Video Generation [arXiv 2024] Paper
  • EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions [arXiv 2024] Paper ProjectPage GitHub
  • Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation [arXiv 2024] Paper ProjectPage GitHub
  • RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network [arXiv 2024] Paper
  • Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation [arXiv 2024] Paper
  • Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation [arXiv 2024] Paper ProjectPage
  • Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement [arXiv 2024] Paper ProjectPage
  • Controllable Talking Face Generation by Implicit Facial Keypoints Editing [arXiv 2024] Paper
  • InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation [arXiv 2024] Paper ProjectPage
  • Faces that Speak: Jointly Synthesising Talking Face and Speech from Text [arXiv 2024] Paper ProjectPage
  • Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation [arXiv 2024] Paper
  • SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space [arXiv 2024] Paper ProjectPage
  • AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding [arXiv 2024] Paper Code ProjectPage
  • NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior [CVPR 2024 Workshop] Paper Code ProjectPage
  • Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation [CVPR 2024 Workshop] Paper
  • EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars [arXiv 2024] Paper ProjectPage
  • GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting [arXiv 2024] Paper
  • VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time [arXiv 2024] Paper ProjectPage
  • THQA: A Perceptual Quality Assessment Database for Talking Heads [arXiv 2024] Paper Code
  • Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior [arXiv 2024] Paper Code ProjectPage
  • EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis [arXiv 2024] Paper Code ProjectPage
  • AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations [arXiv 2024] Paper Code
  • MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation [arXiv 2024] Paper ProjectPage
  • Superior and Pragmatic Talking Face Generation with Teacher-Student Framework [arXiv 2024] Paper ProjectPage
  • X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention [arXiv 2024] Paper
  • Adaptive Super Resolution For One-Shot Talking-Head Generation [arXiv 2024] Paper
  • Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style [arXiv 2024] Paper
  • FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization [arXiv 2024] Paper
  • FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio [arXiv 2024] Paper Code
  • Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis [CVPR 2024] Paper Code
  • EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions [arXiv 2024] Paper ProjectPage Code
  • G4G: A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment [arXiv 2024] Paper
  • Context-aware Talking Face Video Generation [arXiv 2024] Paper
  • EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation [arXiv 2024] Paper ProjectPage Code
  • GPAvatar: Generalizable and Precise Head Avatar from Image(s) [ICLR 2024] Paper Code
  • Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis [ICLR 2024] Paper
  • EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model [ICASSP 2024] Paper
  • CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer [WACV 2024] Paper Code

2023

  • VectorTalker: SVG Talking Face Generation with Progressive Vectorisation [arXiv 2023] Paper
  • DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models [arXiv 2023] Paper ProjectPage
  • GMTalker: Gaussian Mixture based Emotional talking video Portraits [arXiv 2023] Paper ProjectPage
  • DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers [arXiv 2023] Paper
  • R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning [arXiv 2023] Paper
  • FT2TF: First-Person Statement Text-To-Talking Face Generation [arXiv 2023] Paper
  • VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior [arXiv 2023] Paper Code ProjectPage
  • SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis [arXiv 2023] Paper Code ProjectPage
  • GAIA: Zero-shot Talking Avatar Generation [arXiv 2023] Paper
  • Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis [ICCV 2023] Paper ProjectPage Code
  • Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation [ICCV 2023] Paper ProjectPage Code
  • MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions [ICCV 2023] Paper ProjectPage
  • ToonTalker: Cross-Domain Face Reenactment [ICCV 2023] Paper
  • Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation [ICCV 2023] Paper ProjectPage Code
  • EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation [ICCV 2023] Paper
  • Emotional Listener Portrait: Realistic Listener Motion Simulation in Conversation [ICCV 2023] Paper
  • Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions [arXiv 2023] Paper
  • Plug the Leaks: Advancing Audio-driven Talking Face Generation by Preventing Unintended Information Flow [arXiv 2023] Paper
  • Reprogramming Audio-driven Talking Face Synthesis into Text-driven [arXiv 2023] Paper
  • Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis [TCSVT 2023] Paper
  • Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks [arXiv 2023] Paper
  • Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis [arXiv 2023] Paper
  • SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation [CVPR 2023] Paper Code
  • MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation [CVPR 2023] Paper ProjectPage Code
  • Implicit Neural Head Synthesis via Controllable Local Deformation Fields [CVPR 2023] Paper
  • LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook [CVPR 2023] Paper
  • GANHead: Towards Generative Animatable Neural Head Avatars [CVPR 2023] Paper ProjectPage Code
  • Parametric Implicit Face Representation for Audio-Driven Facial Reenactment [CVPR 2023] Paper
  • Identity-Preserving Talking Face Generation with Landmark and Appearance Priors [CVPR 2023] Paper Code
  • StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator [CVPR 2023] Paper ProjectPage Code
  • Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos [arXiv 2023] Paper ProjectPage
  • Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model [arXiv 2023] Paper
  • High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning [CVPR 2023] Paper
  • StyleLipSync: Style-based Personalized Lip-sync Video Generation [arXiv 2023] Paper ProjectPage Code
  • GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation [arXiv 2023] Paper ProjectPage
  • High-Fidelity and Freely Controllable Talking Head Video Generation [CVPR 2023] Paper ProjectPage
  • One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field [CVPR 2023] Paper ProjectPage
  • Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert [CVPR 2023] Paper Code
  • Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations [arXiv 2023] Paper
  • That's What I Said: Fully-Controllable Talking Face Generation [arXiv 2023] Paper ProjectPage
  • Emotionally Enhanced Talking Face Generation [arXiv 2023] Paper Code ProjectPage
  • A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation [MLSys Workshop 2023] Paper
  • TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles [arXiv 2023] Paper
  • FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions [ICME 2023] Paper
  • DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder [arXiv 2023] Paper ProjectPage
  • OPT: One-shot Pose-controllable Talking Head Generation [ICASSP 2023] Paper
  • DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions [ICASSP 2023] Paper Code ProjectPage
  • GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis [ICLR 2023] Paper Code ProjectPage
  • OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [CVPR 2023] Paper Code
  • Style Transfer for 2D Talking Head Animation [arXiv 2023] Paper
  • READ Avatars: Realistic Emotion-controllable Audio Driven Avatars [arXiv 2023] Paper
  • On the Audio-visual Synchronization for Lip-to-Speech Synthesis [arXiv 2023] Paper
  • DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis [CVPR 2023] Paper
  • Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation [arXiv 2023] Paper ProjectPage
  • StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles [AAAI 2023] Paper Code
  • Audio-Visual Face Reenactment [WACV 2023] Paper ProjectPage Code

2022

  • Memories are One-to-Many Mapping Alleviators in Talking Face Generation [arXiv 2022] Paper ProjectPage
  • Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers [SIGGRAPH Asia 2022] Paper
  • Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors [arXiv 2022] Paper ProjectPage
  • Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis [CVPR 2022] Paper ProjectPage
  • SPACE: Speech-driven Portrait Animation with Controllable Expression [arXiv 2022] Paper ProjectPage
  • Compressing Video Calls using Synthetic Talking Heads [BMVC 2022] Paper ProjectPage
  • Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement [arXiv 2022] Paper
  • StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation [arXiv 2022] Paper
  • Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [arXiv 2022] Paper
  • EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model [SIGGRAPH 2022] Paper
  • Talking Head from Speech Audio using a Pre-trained Image Generator [ACM MM 2022] Paper
  • Latent Image Animator: Learning to Animate Images via Latent Space Navigation [ICLR 2022] Paper ProjectPage (note: this page has auto-play music) Code
  • Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition [arXiv 2022] Paper ProjectPage Code
  • Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis [ECCV 2022] Paper ProjectPage Code
  • Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation [ECCV 2022] Paper ProjectPage Code
  • Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary [ICASSP 2022] Paper ProjectPage Code
  • StableFace: Analyzing and Improving Motion Stability for Talking Face Generation [arXiv 2022] Paper ProjectPage
  • Emotion-Controllable Generalized Talking Face Generation [IJCAI 2022] Paper
  • StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN [arXiv 2022] Paper Code ProjectPage
  • DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering [arXiv 2022] Paper
  • Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions [arXiv 2022] Paper
  • Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels [TMM 2022] Paper
  • Depth-Aware Generative Adversarial Network for Talking Head Video Generation [CVPR 2022] Paper ProjectPage Code
  • Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning [CVPR 2022] Paper Code ProjectPage
  • Expressive Talking Head Generation with Granular Audio-Visual Control [CVPR 2022] Paper
  • Talking Face Generation with Multilingual TTS [CVPR 2022 Demo] Paper DemoPage
  • SyncTalkFace: Talking Face Generation with Precise Lip-syncing via Audio-Lip Memory [AAAI 2022] Paper

2021

  • Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation [SIGGRAPH Asia 2021] Paper Code
  • Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis [ACMMM 2021] Paper Code
  • AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis [ICCV 2021] Paper Code
  • FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning [ICCV 2021] Paper Code
  • Learned Spatial Representations for Few-shot Talking-Head Synthesis [ICCV 2021] Paper
  • Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation [CVPR 2021] Paper Code ProjectPage
  • One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing [CVPR 2021] Paper
  • Audio-Driven Emotional Video Portraits [CVPR 2021] Paper Code
  • AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person [arXiv 2021] Paper
  • Talking Head Generation with Audio and Speech Related Facial Action Units [BMVC 2021] Paper
  • Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion [IJCAI 2021] Paper
  • Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation [AAAI 2021] Paper
  • Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary [arXiv 2021] Paper Code

2020

  • Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose [arXiv 2020] Paper Code
  • A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild [ACMMM 2020] Paper Code
  • Talking Face Generation with Expression-Tailored Generative Adversarial Network [ACMMM 2020] Paper
  • Speech Driven Talking Face Generation from a Single Image and an Emotion Condition [arXiv 2020] Paper Code
  • A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors [ICPR 2020] Paper
  • Everybody's Talkin': Let Me Talk as You Want [arXiv 2020] Paper
  • HeadGAN: Video-and-Audio-Driven Talking Head Synthesis [arXiv 2020] Paper
  • Talking-head Generation with Rhythmic Head Motion [ECCV 2020] Paper
  • Neural Voice Puppetry: Audio-driven Facial Reenactment [ECCV 2020] Paper ProjectPage Code
  • Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis [CVPR 2020] Paper
  • Robust One Shot Audio to Video Generation [CVPRW 2020] Paper
  • MakeItTalk: Speaker-Aware Talking Head Animation [SIGGRAPH Asia 2020] Paper Code
  • FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis [AAAI 2020] Paper
  • Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose [AAAI 2020] Paper
  • Photorealistic Lip Sync with Adversarial Temporal Convolutional [arXiv 2020] Paper
  • Speech-driven Facial Animation Using Polynomial Fusion of Features [arXiv 2020] Paper
  • Animating Face using Disentangled Audio Representations [WACV 2020] Paper

Before 2020

  • Realistic Speech-Driven Facial Animation with GANs [IJCV 2019] Paper ProjectPage
  • Few-Shot Adversarial Learning of Realistic Neural Talking Head Models [ICCV 2019] Paper Code
  • Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss [CVPR 2019] Paper Code
  • Talking Face Generation by Adversarially Disentangled Audio-Visual Representation [AAAI 2019] Paper Code ProjectPage
  • Lip Movements Generation at a Glance [ECCV 2018] Paper
  • X2Face: A network for controlling face generation using images, audio, and pose codes [ECCV 2018] Paper Code ProjectPage
  • Talking Face Generation by Conditional Recurrent Adversarial Network [IJCAI 2019] Paper Code
  • Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks [arXiv 2018] Paper
  • High-Resolution Talking Face Generation via Mutual Information Approximation [arXiv 2018] Paper
  • Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network [arXiv 2018] Paper
  • You said that? [BMVC 2017] Paper

2D Video - Person dependent

  • Continuously Controllable Facial Expression Editing in Talking Face Videos [TAFFC 2023] Paper ProjectPage
  • Synthesizing Obama: Learning Lip Sync from Audio [SIGGRAPH 2017] Paper ProjectPage
  • Photorealistic Adaptation and Interpolation of Facial Expressions Using HMMs and AAMs for Audio-Visual Speech Synthesis [ICIP 2017] Paper
  • HMM-Based Photo-Realistic Talking Face Synthesis Using Facial Expression Parameter Mapping with Deep Neural Networks [Journal of Computer and Communications 2017] Paper
  • ObamaNet: Photo-realistic lip-sync from text [arXiv 2017] Paper
  • A deep bidirectional LSTM approach for video-realistic talking head [Multimedia Tools Appl 2015] Paper
  • Photo-Realistic Expressive Text to Talking Head Synthesis [Interspeech 2013] Paper
  • Photo-Real Talking Head with Deep Bidirectional LSTM [ICASSP 2015] Paper
  • Expressive Speech-Driven Facial Animation [TOG 2005] Paper

3D Animation

  • MimicTalk: Mimicking a personalized and expressive 3D talking face in few minutes [NeurIPS 2024] Paper Code ProjectPage
  • ScanTalk: 3D Talking Heads from Unregistered Scans [ECCV 2024] Paper Code
  • Audio-Driven Emotional 3D Talking-Head Generation [arXiv 2024] Paper
  • Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads [arXiv 2024] Paper
  • 3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy [arXiv 2024] Paper
  • ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE [arXiv 2024] Paper Code
  • KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding [ECCV 2024] Paper Code
  • EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention [arXiv 2024] Paper
  • DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation [arXiv 2024] Paper
  • JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model [arXiv 2024] Paper
  • GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer [arXiv 2024] Paper
  • UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model [arXiv 2024] Paper
  • EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head [arXiv 2024] Paper
  • EmoFace: Audio-driven Emotional 3D Face Animation [arXiv 2024] Paper Code
  • MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset [Interspeech 2024] Paper ProjectPage
  • 3D Gaussian Blendshapes for Head Avatar Animation [SIGGRAPH 2024] Paper
  • CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation [arXiv 2024] Paper
  • GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting [arXiv 2024] Paper
  • Learn2Talk: 3D Talking Face Learns from 2D Talking Face [arXiv 2024] Paper ProjectPage
  • Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication [arXiv 2024] Paper
  • AnimateMe: 4D Facial Expressions via Diffusion Models [arXiv 2024] Paper
  • EmoVOCA: Speech-Driven Emotional 3D Talking Heads [arXiv 2024] Paper
  • FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models [CVPR 2024] Paper Code ProjectPage
  • AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation [arXiv 2024] Paper
  • DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer [arXiv 2024] Paper Code
  • Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance [arXiv 2024] Paper ProjectPage
  • EMOTE: Emotional Speech-Driven Animation with Content-Emotion Disentanglement [SIGGRAPH Asia 2023] Paper ProjectPage
  • PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features [arXiv] Paper
  • 3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing [arXiv 2023] Paper Code ProjectPage
  • Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications [arXiv 2023] Paper
  • DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser [arXiv 2023] Paper
  • DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models [arXiv 2023] Paper ProjectPage Code
  • Imitator: Personalized Speech-driven 3D Facial Animation [ICCV 2023] Paper ProjectPage Code
  • Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation [ICCV 2023] Paper
  • Semi-supervised Speech-driven 3D Facial Animation via Cross-modal Encoding [ICCV 2023] Paper
  • Audio-Driven 3D Facial Animation from In-the-Wild Videos [arXiv 2023] Paper ProjectPage
  • EmoTalk: Speech-driven emotional disentanglement for 3D face animation [ICCV 2023] Paper ProjectPage
  • FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning [arXiv 2023] Paper Code ProjectPage
  • Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertices Attention [arXiv 2023] Paper
  • Learning Audio-Driven Viseme Dynamics for 3D Face Animation [arXiv 2023] Paper ProjectPage
  • CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior [CVPR 2023] Paper ProjectPage
  • Expressive Speech-driven Facial Animation with controllable emotions [arXiv 2023] Paper
  • Imitator: Personalized Speech-driven 3D Facial Animation [arXiv 2022] Paper ProjectPage
  • PV3D: A 3D Generative Model for Portrait Video Generation [arXiv 2022] Paper ProjectPage
  • Neural Emotion Director: Speech-preserving semantic control of facial expressions in “in-the-wild” videos [CVPR 2022] Paper Code
  • FaceFormer: Speech-Driven 3D Facial Animation with Transformers [CVPR 2022] Paper Code ProjectPage
  • LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization [CVPR 2021] Paper
  • MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement [ICCV 2021] Paper
  • AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis [ICCV 2021] Paper Code
  • 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head [arXiv 2021] Paper
  • Modality Dropout for Improved Performance-driven Talking Faces [ICMI 2020] Paper
  • Audio- and Gaze-driven Facial Animation of Codec Avatars [arXiv 2020] Paper
  • Capture, Learning, and Synthesis of 3D Speaking Styles [CVPR 2019] Paper
  • VisemeNet: Audio-Driven Animator-Centric Speech Animation [TOG 2018] Paper
  • Speech-Driven Expressive Talking Lips with Conditional Sequential Generative Adversarial Networks [TAC 2018] Paper
  • End-to-end Learning for 3D Facial Animation from Speech [ICMI 2018] Paper
  • Visual Speech Emotion Conversion using Deep Learning for 3D Talking Head [MMAC 2018]
  • A Deep Learning Approach for Generalized Speech Animation [SIGGRAPH 2017] Paper
  • Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion [TOG 2017] Paper
  • Speech-driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach [CVPR 2017]
  • Expressive Speech Driven Talking Avatar Synthesis with DBLSTM using Limited Amount of Emotional Bimodal Data [Interspeech 2016] Paper
  • Real-Time Speech-Driven Face Animation With Expressions Using Neural Networks [TONN 2012] Paper
  • Facial Expression Synthesis Based on Emotion Dimensions for Affective Talking Avatar [SIST 2010] Paper

Datasets & Benchmark

Survey

  • A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing [arXiv 2024] Paper
  • From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications [arXiv 2023] Paper
  • Deep Learning for Visual Speech Analysis: A Survey [arXiv 2022] Paper
  • What comprises a good talking-head video generation?: A Survey and Benchmark [arXiv 2020] Paper

Colabs