Cinematic Music Descriptor v2: All Checkpoints
This repository consolidates all training checkpoints for the Cinematic Music Descriptor v2 pipeline (Modules 1, 2, 3).
Repository Layout
module1/
  module1_mlm_final.pt - Phase 2a: MLM pre-training
  module1_finetune_final.pt - Phase 2b: Supervised fine-tuning
  module1_regression_final.pt - Phase 2c: Regression heads
  module1_e2e_final.pt - Phase 6: After E2E joint training
module2/
  module2_pretrain_final.pt - Phase 3a: Masked scene pre-training
  module2_finetune_final.pt - Phase 3b: Supervised fine-tuning
  module2_e2e_best.pt - Phase 5: E2E best checkpoint
  module2_e2e_final.pt - Phase 6: Final E2E checkpoint
module3/
  module3_m3_final.pt - Phase 4: M3 standalone training
  module3_e2e_best.pt - Phase 5: E2E best checkpoint (recommended)
  module3_e2e_final.pt - Phase 6: Final E2E checkpoint
Architecture Summary
| Module | Role | Backbone |
|---|---|---|
| Module 1 | Scene-level encoding & classification | RoBERTa-base + task heads |
| Module 2 | Cross-scene narrative context | Transformer encoder (4L × 8H) |
| Module 3 | Music descriptor prediction | Gated fusion (M1 + M2) + heads |
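The table names gated fusion of the Module 1 and Module 2 representations but does not spell out the formula. As a rough illustration only, the sketch below uses one common form of gating, where a sigmoid gate computed from the concatenated inputs mixes the two vectors element-wise. The function name `gated_fusion`, the weight layout, and the gating formula itself are assumptions and may differ from what Module 3 actually implements.

```python
import math


def sigmoid(z):
    """Logistic function, used to squash gate pre-activations into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(h1, h2, weights, bias):
    """Mix two equal-length vectors with a learned element-wise gate.

    ASSUMPTION: this is a generic gating scheme, not the confirmed
    Module 3 implementation.

    h1      -- scene-level vector (stand-in for Module 1 output), length d
    h2      -- context vector (stand-in for Module 2 output), length d
    weights -- d rows, each of length 2*d, applied to the concatenation [h1; h2]
    bias    -- length d
    """
    if len(h1) != len(h2):
        raise ValueError("h1 and h2 must have the same length")
    concat = list(h1) + list(h2)
    fused = []
    for i in range(len(h1)):
        # Gate pre-activation for dimension i.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        # g near 1 keeps the scene-level signal, g near 0 keeps the context signal.
        fused.append(g * h1[i] + (1.0 - g) * h2[i])
    return fused
```

With all-zero weights and biases the gate is 0.5 everywhere, so the output is simply the average of the two inputs. Trained weights would shift that balance per dimension.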
Recommended Checkpoint Combination
For best end-to-end performance, load:
module1/module1_e2e_final.pt
module2/module2_e2e_final.pt
module3/module3_e2e_best.pt (or module3/module3_e2e_final.pt)
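A minimal sketch of how the recommended combination might be resolved and loaded from a local copy of this repository. The file names come from the layout above. The helper names `resolve_checkpoints` and `load_checkpoints` are hypothetical, and the internal structure of each `.pt` file (plain state dict or a wrapper dict with extra keys) is not documented here, so inspect the loaded objects before passing them to a model.

```python
from pathlib import Path

# Recommended files, relative to the repository root (names from the layout above).
RECOMMENDED = {
    "module1": "module1/module1_e2e_final.pt",
    "module2": "module2/module2_e2e_final.pt",
    "module3": "module3/module3_e2e_best.pt",
}


def resolve_checkpoints(repo_root, module3_variant="best"):
    """Return a dict of module name -> checkpoint path.

    module3_variant selects between module3_e2e_best.pt and
    module3_e2e_final.pt, the two options listed for Module 3.
    """
    if module3_variant not in ("best", "final"):
        raise ValueError("module3_variant must be 'best' or 'final'")
    root = Path(repo_root)
    paths = {name: root / rel for name, rel in RECOMMENDED.items()}
    paths["module3"] = root / "module3" / f"module3_e2e_{module3_variant}.pt"
    return paths


def load_checkpoints(repo_root, module3_variant="best"):
    """Load the three checkpoints onto the CPU with torch.load.

    ASSUMPTION: the files are ordinary PyTorch checkpoints. Depending on
    the PyTorch version and file contents, torch.load may need an explicit
    weights_only argument.
    """
    import torch  # imported lazily so path resolution works without PyTorch

    paths = resolve_checkpoints(repo_root, module3_variant)
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing checkpoint files: " + ", ".join(missing))
    return {name: torch.load(path, map_location="cpu") for name, path in paths.items()}
```

Calling `load_checkpoints(".")` from the repository root would return one loaded object per module. Pass `module3_variant="final"` to use the final Module 3 checkpoint instead of the best one.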
Source Repositories
Originally spread across:
suyashnpande/cinematic-music-descriptor-v2-module1
suyashnpande/cinematic-music-descriptor-v2-module2
suyashnpande/cinematic-music-descriptor-v2-module3