This is a repository for organizing papers, code, and other resources related to talking face/head generation. Most papers are linked to the PDF address provided by "arXiv" or "OpenAccess". However, some papers require an academic license to browse, for example those published in IEEE, Springer, and Elsevier journals.
If you have any suggestions (missing papers, new papers, key researchers, or typos), please feel free to edit and open a pull request. Just letting me know the title of a paper is also a big contribution. You can do this by opening an issue or contacting me directly via email.
Thanks to everybody for the PRs! From now on, I'll occasionally include some papers about video-driven talking face generation, because I found that the community is starting to include video-driven methods in the scope of talking face generation, though that task was originally termed Face Reenactment.
So, if you are looking for video-driven talking face generation, I suggest you star this repo and then search for Face Reenactment; you'll find more :)
One more thing: please correct me if you find that any paper noted here as an arXiv paper has since been accepted to a conference or journal.
I updated a batch of papers that appeared in the past few months. In this repo, I originally intended to cover audio-driven talking face generation works. However, I found that several text-based research works are also very interesting, so I included them here. Enjoy!
- Main paper list
- Add paper link
- Add code if available
- Add project page if available
- Datasets and survey
- DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation [arXiv 2024] Paper Code ProjectPage
- Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization [arXiv 2024] Paper
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting [arXiv 2024] Paper Code
- 3D-Aware Text-driven Talking Avatar Generation [ECCV 2024] Paper
- LaDTalk: Latent Denoising for Synthesizing Talking Head Videos with High Frequency Details [arXiv 2024] Paper
- TalkinNeRF: Animatable Neural Fields for Full-Body Talking Humans [ECCVW 2024] Paper
- JoyHallo: Digital human model for Mandarin [arXiv 2024] Paper Code
- JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation [BMVC 2024] Paper
- StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads [TPAMI 2024] Paper
- DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures [CVPRW 2024] Paper
- EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion [arXiv 2024] Paper
- SVP: Style-Enhanced Vivid Portrait Talking Head Diffusion Model [arXiv 2024] Paper
- SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing [arXiv 2024] Paper
- Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency [arXiv 2024] Paper ProjectPage
- PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation [arXiv 2024] Paper
- CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention [arXiv 2024] Paper ProjectPage
- TalkLoRA: Low-Rank Adaptation for Speech-Driven Animation [arXiv 2024] Paper
- S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis [arXiv 2024] Paper
- FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model [arXiv 2024] Paper
- LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control [arXiv 2024] Paper ProjectPage Code
- High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model [arXiv 2024] Paper
- Landmark-guided Diffusion Model for High-fidelity and Temporally Coherent Talking Head Generation [arXiv 2024] Paper
- LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement [arXiv 2024] Paper ProjectPage Code
- Learning Online Scale Transformation for Talking Head Video Generation [arXiv 2024] Paper
- EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions [arXiv 2024] Paper ProjectPage GitHub
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation [arXiv 2024] Paper ProjectPage GitHub
- RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network [arXiv 2024] Paper
- Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation [arXiv 2024] Paper
- Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation [arXiv 2024] Paper ProjectPage
- Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement [arXiv 2024] Paper ProjectPage
- Controllable Talking Face Generation by Implicit Facial Keypoints Editing [arXiv 2024] Paper
- InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation [arXiv 2024] Paper ProjectPage
- Faces that Speak: Jointly Synthesising Talking Face and Speech from Text [arXiv 2024] Paper ProjectPage
- Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation [arXiv 2024] Paper
- SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space [arXiv 2024] Paper ProjectPage
- AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding [arXiv 2024] Paper Code ProjectPage
- NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior [CVPR 2024 Workshop] Paper Code ProjectPage
- Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation [CVPR 2024 Workshop] Paper
- EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars [arXiv 2024] Paper ProjectPage
- GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting [arXiv 2024] Paper
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time [arXiv 2024] Paper ProjectPage
- THQA: A Perceptual Quality Assessment Database for Talking Heads [arXiv 2024] Paper Code
- Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior [arXiv 2024] Paper Code ProjectPage
- EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis [arXiv 2024] Paper Code ProjectPage
- AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations [arXiv 2024] Paper Code
- MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation [arXiv 2024] Paper ProjectPage
- Superior and Pragmatic Talking Face Generation with Teacher-Student Framework [arXiv 2024] Paper ProjectPage
- X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention [arXiv 2024] Paper
- Adaptive Super Resolution For One-Shot Talking-Head Generation [arXiv 2024] Paper
- Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style [arXiv 2024] Paper
- FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization [arXiv 2024] Paper
- FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio [arXiv 2024] Paper Code
- Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis [CVPR 2024] Paper Code
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions [arXiv 2024] Paper ProjectPage Code
- G4G: A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment [arXiv 2024] Paper
- Context-aware Talking Face Video Generation [arXiv 2024] Paper
- EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation [arXiv 2024] Paper ProjectPage Code
- GPAvatar: Generalizable and Precise Head Avatar from Image(s) [ICLR 2024] Paper Code
- Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis [ICLR 2024] Paper
- EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model [ICASSP 2024] Paper
- CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer [WACV 2024] Paper Code
- VectorTalker: SVG Talking Face Generation with Progressive Vectorisation [arXiv 2023] Paper
- DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models [arXiv 2023] Paper ProjectPage
- GMTalker: Gaussian Mixture based Emotional talking video Portraits [arXiv 2023] Paper ProjectPage
- DiT-Head: High-Resolution Talking Head Synthesis using Diffusion Transformers [arXiv 2023] Paper
- R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning [arXiv 2023] Paper
- FT2TF: First-Person Statement Text-To-Talking Face Generation [arXiv 2023] Paper
- VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior [arXiv 2023] Paper Code ProjectPage
- SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis [arXiv 2023] Paper Code ProjectPage
- GAIA: Zero-shot Talking Avatar Generation [arXiv 2023] Paper
- Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis [ICCV 2023] Paper ProjectPage Code
- Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation [ICCV 2023] Paper ProjectPage Code
- MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions [ICCV 2023] Paper ProjectPage
- ToonTalker: Cross-Domain Face Reenactment [ICCV 2023] Paper
- Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation [ICCV 2023] Paper ProjectPage Code
- EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation [ICCV 2023] Paper
- Emotional Listener Portrait: Realistic Listener Motion Simulation in Conversation [ICCV 2023] Paper
- Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions [arXiv 2023] Paper
- Plug the Leaks: Advancing Audio-driven Talking Face Generation by Preventing Unintended Information Flow [arXiv 2023] Paper
- Reprogramming Audio-driven Talking Face Synthesis into Text-driven [arXiv 2023] Paper
- Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis [TCSVT 2023] Paper
- Emotional Talking Head Generation based on Memory-Sharing and Attention-Augmented Networks [arXiv 2023] Paper
- Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis [arXiv 2023] Paper
- SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation [CVPR 2023] Paper Code
- MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation [CVPR 2023] Paper ProjectPage Code
- Implicit Neural Head Synthesis via Controllable Local Deformation Fields [CVPR 2023] Paper
- LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook [CVPR 2023] Paper
- GANHead: Towards Generative Animatable Neural Head Avatars [CVPR 2023] Paper ProjectPage Code
- Parametric Implicit Face Representation for Audio-Driven Facial Reenactment [CVPR 2023] Paper
- Identity-Preserving Talking Face Generation with Landmark and Appearance Priors [CVPR 2023] Paper Code
- StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator [CVPR 2023] Paper ProjectPage Code
- Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos [arXiv 2023] Paper ProjectPage
- Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model [arXiv 2023] Paper
- High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning [CVPR 2023] Paper
- StyleLipSync: Style-based Personalized Lip-sync Video Generation [arXiv 2023] Paper ProjectPage Code
- GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation [arXiv 2023] Paper ProjectPage
- High-Fidelity and Freely Controllable Talking Head Video Generation [CVPR 2023] Paper ProjectPage
- One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field [CVPR 2023] Paper ProjectPage
- Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert [CVPR 2023] Paper Code
- Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations [arXiv 2023] Paper
- That's What I Said: Fully-Controllable Talking Face Generation [arXiv 2023] Paper ProjectPage
- Emotionally Enhanced Talking Face Generation [arXiv 2023] Paper Code ProjectPage
- A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation [MLSys Workshop 2023] Paper
- TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles [arXiv 2023] Paper
- FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions [ICME 2023] Paper
- DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder [arXiv 2023] Paper ProjectPage
- OPT: One-Shot Pose-Controllable Talking Head Generation [ICASSP 2023] Paper
- DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions [ICASSP 2023] Paper Code ProjectPage
- GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis [ICLR 2023] Paper Code ProjectPage
- OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [CVPR 2023] Paper Code
- Style Transfer for 2D Talking Head Animation [arXiv 2023] Paper
- READ Avatars: Realistic Emotion-controllable Audio Driven Avatars [arXiv 2023] Paper
- On the Audio-visual Synchronization for Lip-to-Speech Synthesis [arXiv 2023] Paper
- DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis [CVPR 2023] Paper
- Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation [arXiv 2023] Paper ProjectPage
- StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles [AAAI 2023] Paper Code
- Audio-Visual Face Reenactment [WACV 2023] Paper ProjectPage Code
- Memories are One-to-Many Mapping Alleviators in Talking Face Generation [arXiv 2022] Paper ProjectPage
- Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers [SIGGRAPH Asia 2022] Paper
- Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors [arXiv 2022] Paper ProjectPage
- Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis [CVPR 2022] Paper ProjectPage
- SPACE: Speech-driven Portrait Animation with Controllable Expression [arXiv 2022] Paper ProjectPage
- Compressing Video Calls using Synthetic Talking Heads [BMVC 2022] Paper ProjectPage
- Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement [arXiv 2022] Paper
- StyleTalker: One-shot Style-based Audio-driven Talking Head Video Generation [arXiv 2022] Paper
- Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [arXiv 2022] Paper
- EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model [SIGGRAPH 2022] Paper
- Talking Head from Speech Audio using a Pre-trained Image Generator [ACM MM 2022] Paper
- Latent Image Animator: Learning to Animate Images via Latent Space Navigation [ICLR 2022] Paper ProjectPage (note: this page has auto-play music) Code
- Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition [arXiv 2022] Paper ProjectPage Code
- Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis [ECCV 2022] Paper ProjectPage Code
- Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation [ECCV 2022] Paper ProjectPage Code
- Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary [ICASSP 2022] Paper ProjectPage Code
- StableFace: Analyzing and Improving Motion Stability for Talking Face Generation [arXiv 2022] Paper ProjectPage
- Emotion-Controllable Generalized Talking Face Generation [IJCAI 2022] Paper
- StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN [arXiv 2022] Paper Code ProjectPage
- DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering [arXiv 2022] Paper
- Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions [arXiv 2022] Paper
- Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels [TMM 2022] Paper
- Depth-Aware Generative Adversarial Network for Talking Head Video Generation [CVPR 2022] Paper ProjectPage Code
- Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning [CVPR 2022] Paper Code ProjectPage
- Expressive Talking Head Generation with Granular Audio-Visual Control [CVPR 2022] Paper
- Talking Face Generation with Multilingual TTS [CVPR 2022 Demo] Paper DemoPage
- SyncTalkFace: Talking Face Generation with Precise Lip-syncing via Audio-Lip Memory [AAAI 2022] Paper
- Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation [SIGGRAPH Asia 2021] Paper Code
- Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis [ACMMM 2021] Paper Code
- AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis [ICCV 2021] Paper Code
- FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning [ICCV 2021] Paper Code
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [ICCV 2021] Paper
- Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation [CVPR 2021] Paper Code ProjectPage
- One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing [CVPR 2021] Paper
- Audio-Driven Emotional Video Portraits [CVPR 2021] Paper Code
- AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person [arXiv 2021] Paper
- Talking Head Generation with Audio and Speech Related Facial Action Units [BMVC 2021] Paper
- Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion [IJCAI 2021] Paper
- Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation [AAAI 2021] Paper
- Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary [arXiv 2021] Paper Code
- Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose [arXiv 2020] Paper Code
- A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild [ACMMM 2020] Paper Code
- Talking Face Generation with Expression-Tailored Generative Adversarial Network [ACMMM 2020] Paper
- Speech Driven Talking Face Generation from a Single Image and an Emotion Condition [arXiv 2020] Paper Code
- A Neural Lip-Sync Framework for Synthesizing Photorealistic Virtual News Anchors [ICPR 2020] Paper
- Everybody's Talkin': Let Me Talk as You Want [arXiv 2020] Paper
- HeadGAN: Video-and-Audio-Driven Talking Head Synthesis [arXiv 2020] Paper
- Talking-head Generation with Rhythmic Head Motion [ECCV 2020] Paper
- Neural Voice Puppetry: Audio-driven Facial Reenactment [ECCV 2020] Paper ProjectPage Code
- Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis [CVPR 2020] Paper
- Robust One Shot Audio to Video Generation [CVPRW 2020] Paper
- MakeItTalk: Speaker-Aware Talking Head Animation [SIGGRAPH Asia 2020] Paper Code
- FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis [AAAI 2020] Paper
- Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose [AAAI 2020] Paper
- Photorealistic Lip Sync with Adversarial Temporal Convolutional Networks [arXiv 2020] Paper
- Speech-Driven Facial Animation Using Polynomial Fusion of Features [arXiv 2020] Paper
- Animating Face using Disentangled Audio Representations [WACV 2020] Paper
- Realistic Speech-Driven Facial Animation with GANs [IJCV 2019] Paper ProjectPage
- Few-Shot Adversarial Learning of Realistic Neural Talking Head Models [ICCV 2019] Paper Code
- Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss [CVPR 2019] Paper Code
- Talking Face Generation by Adversarially Disentangled Audio-Visual Representation [AAAI 2019] Paper Code ProjectPage
- Lip Movements Generation at a Glance [ECCV 2018] Paper
- X2Face: A network for controlling face generation using images, audio, and pose codes [ECCV 2018] Paper Code ProjectPage
- Talking Face Generation by Conditional Recurrent Adversarial Network [IJCAI 2019] Paper Code
- Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks [arXiv 2018] Paper
- High-Resolution Talking Face Generation via Mutual Information Approximation [arXiv 2018] Paper
- Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network [arXiv 2018] Paper
- You said that? [BMVC 2017] Paper
- Continuously Controllable Facial Expression Editing in Talking Face Videos [TAFFC 2023] Paper ProjectPage
- Synthesizing Obama: Learning Lip Sync from Audio [SIGGRAPH 2017] Paper ProjectPage
- Photorealistic Adaptation and Interpolation of Facial Expressions Using HMMs and AAMs for Audio-Visual Speech Synthesis [ICIP 2017] Paper
- HMM-Based Photo-Realistic Talking Face Synthesis Using Facial Expression Parameter Mapping with Deep Neural Networks [Journal of Computer and Communications 2017] Paper
- ObamaNet: Photo-realistic lip-sync from text [arXiv 2017] Paper
- A deep bidirectional LSTM approach for video-realistic talking head [Multimedia Tools Appl 2015] Paper
- Photo-Realistic Expressive Text to Talking Head Synthesis [Interspeech 2013] Paper
- Photo-Real Talking Head with Deep Bidirectional LSTM [ICASSP 2015] Paper
- Expressive Speech-Driven Facial Animation [TOG 2005] Paper
- MimicTalk: Mimicking a personalized and expressive 3D talking face in few minutes [NeurIPS 2024] Paper Code ProjectPage
- ScanTalk: 3D Talking Heads from Unregistered Scans [ECCV 2024] Paper Code
- Audio-Driven Emotional 3D Talking-Head Generation [arXiv 2024] Paper
- Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads [arXiv 2024] Paper
- 3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy [arXiv 2024] Paper
- ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE [arXiv 2024] Paper Code
- KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding [ECCV 2024] Paper Code
- EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face with Mesh Attention [arXiv 2024] Paper
- DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation [arXiv 2024] Paper
- JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model [arXiv 2024] Paper
- GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer [arXiv 2024] Paper
- UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model [arXiv 2024] Paper
- EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head [arXiv 2024] Paper
- EmoFace: Audio-driven Emotional 3D Face Animation [arXiv 2024] Paper Code
- MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset [Interspeech 2024] Paper ProjectPage
- 3D Gaussian Blendshapes for Head Avatar Animation [SIGGRAPH 2024] Paper
- CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation [arXiv 2024] Paper
- GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting [arXiv 2024] Paper
- Learn2Talk: 3D Talking Face Learns from 2D Talking Face [arXiv 2024] Paper ProjectPage
- Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication [arXiv 2024] Paper
- AnimateMe: 4D Facial Expressions via Diffusion Models [arXiv 2024] Paper
- EmoVOCA: Speech-Driven Emotional 3D Talking Heads [arXiv 2024] Paper
- FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models [CVPR 2024] Paper Code ProjectPage
- AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation [arXiv 2024] Paper
- DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer [arXiv 2024] Paper Code
- Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance [arXiv 2024] Paper ProjectPage
- EMOTE: Emotional Speech-Driven Animation with Content-Emotion Disentanglement [SIGGRAPH Asia 2023] Paper ProjectPage
- PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features [arXiv 2023] Paper
- 3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing [arXiv 2023] Paper Code ProjectPage
- Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications [arXiv 2023] Paper
- DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser [arXiv 2023] Paper
- DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models [arXiv 2023] Paper ProjectPage Code
- Imitator: Personalized Speech-driven 3D Facial Animation [ICCV 2023] Paper ProjectPage Code
- Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation [ICCV 2023] Paper
- Semi-supervised Speech-driven 3D Facial Animation via Cross-modal Encoding [ICCV 2023] Paper
- Audio-Driven 3D Facial Animation from In-the-Wild Videos [arXiv 2023] Paper ProjectPage
- EmoTalk: Speech-driven emotional disentanglement for 3D face animation [ICCV 2023] Paper ProjectPage
- FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning [arXiv 2023] Paper Code ProjectPage
- Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertices Attention [arXiv 2023] Paper
- Learning Audio-Driven Viseme Dynamics for 3D Face Animation [arXiv 2023] Paper ProjectPage
- CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior [CVPR 2023] Paper ProjectPage
- Expressive Speech-driven Facial Animation with controllable emotions [arXiv 2023] Paper
- Imitator: Personalized Speech-driven 3D Facial Animation [arXiv 2022] Paper ProjectPage
- PV3D: A 3D Generative Model for Portrait Video Generation [arXiv 2022] Paper ProjectPage
- Neural Emotion Director: Speech-preserving semantic control of facial expressions in “in-the-wild” videos [CVPR 2022] Paper Code
- FaceFormer: Speech-Driven 3D Facial Animation with Transformers [CVPR 2022] Paper Code ProjectPage
- LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization [CVPR 2021] Paper
- MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement [ICCV 2021] Paper
- AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis [ICCV 2021] Paper Code
- 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head [arXiv 2021] Paper
- Modality Dropout for Improved Performance-driven Talking Faces [ICMI 2020] Paper
- Audio- and Gaze-driven Facial Animation of Codec Avatars [arXiv 2020] Paper
- Capture, Learning, and Synthesis of 3D Speaking Styles [CVPR 2019] Paper
- VisemeNet: Audio-Driven Animator-Centric Speech Animation [TOG 2018] Paper
- Speech-Driven Expressive Talking Lips with Conditional Sequential Generative Adversarial Networks [TAC 2018] Paper
- End-to-end Learning for 3D Facial Animation from Speech [ICMI 2018] Paper
- Visual Speech Emotion Conversion using Deep Learning for 3D Talking Head [MMAC 2018]
- A Deep Learning Approach for Generalized Speech Animation [SIGGRAPH 2017] Paper
- Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion [TOG 2017] Paper
- Speech-driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach [CVPR 2017]
- Expressive Speech Driven Talking Avatar Synthesis with DBLSTM using Limited Amount of Emotional Bimodal Data [Interspeech 2016] Paper
- Real-Time Speech-Driven Face Animation With Expressions Using Neural Networks [TONN 2012] Paper
- Facial Expression Synthesis Based on Emotion Dimensions for Affective Talking Avatar [SIST 2010] Paper
- Responsive Listening Head Generation: A Benchmark Dataset and Baseline [ECCV 2022] Paper ProjectPage
- TalkingHead-1KH Link
- MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV 2020] ProjectPage
- VoxCeleb Link
- LRW Link
- LRS2 Link
- GRID Link
- CREMA-D Link
- MMFace4D Link
- DPCD Link Paper
- A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing [arXiv 2024] Paper
- From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications [arXiv 2023] Paper
- Deep Learning for Visual Speech Analysis: A Survey [arXiv 2022] Paper
- What comprises a good talking-head video generation?: A Survey and Benchmark [arXiv 2020] Paper
- Avatars4All: https://github.com/eyaler/avatars4all