-
Hunyuan3D-1.0
😻 270 • Text-to-3D and Image-to-3D Generation
-
ROICtrl: Boosting Instance Control for Visual Generation
Paper • 2411.17949 • Published • 87 -
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models
Paper • 2502.02492 • Published • 66 -
Phantom: Subject-consistent video generation via cross-modal alignment
Paper • 2502.11079 • Published • 59
Collections including paper arxiv:2502.02492
-
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Paper • 2401.09985 • Published • 18 -
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper • 2401.09962 • Published • 9 -
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
Paper • 2401.10404 • Published • 10 -
ActAnywhere: Subject-Aware Video Background Generation
Paper • 2401.10822 • Published • 13
-
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 45 -
AToM: Amortized Text-to-Mesh using 2D Diffusion
Paper • 2402.00867 • Published • 11 -
Neural Network Diffusion
Paper • 2402.13144 • Published • 100 -
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Paper • 2402.19479 • Published • 35
-
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
Paper • 2401.15977 • Published • 39 -
Lumiere: A Space-Time Diffusion Model for Video Generation
Paper • 2401.12945 • Published • 86 -
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Paper • 2307.04725 • Published • 65 -
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Paper • 2402.01566 • Published • 27
-
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Paper • 2312.08578 • Published • 20 -
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Paper • 2312.08583 • Published • 11 -
Vision-Language Models as a Source of Rewards
Paper • 2312.09187 • Published • 12 -
StemGen: A music generation model that listens
Paper • 2312.08723 • Published • 48
-
Understanding Diffusion Models: A Unified Perspective
Paper • 2208.11970 • Published -
Tutorial on Diffusion Models for Imaging and Vision
Paper • 2403.18103 • Published • 2 -
Denoising Diffusion Probabilistic Models
Paper • 2006.11239 • Published • 9 -
Denoising Diffusion Implicit Models
Paper • 2010.02502 • Published • 4