File size: 4,725 Bytes
12a8e0f | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 | # BeatHeritage V1 Model Documentation
## Overview
BeatHeritage V1 is an enhanced version of the Mapperatorinator V30 model, designed specifically for improved stability and generation quality in osu! beatmap creation. This model represents a significant advancement in AI-driven beatmap generation with enhanced features and optimizations.
## Model Information
- **Model Path:** `hongminh54/BeatHeritage-v1`
- **Base Architecture:** Whisper-based model (219M parameters)
- **Version:** BeatHeritage V1
- **Release Date:** September 2025
## Configuration Files
- **Inference Config:** `configs/inference/beatheritage_v1.yaml`
- **Training Config:** `configs/train/beatheritage_v1.yaml`
- **Diffusion Checkpoint:** `hongminh54/osu-diffusion-v2`
## Key Improvements Over Mapperatorinator V30
### 1. Enhanced Sampling Parameters
- **Temperature:** 0.85 (reduced from 0.9) for better stability
- **Top-p:** 0.92 (increased from 0.9) for improved diversity
- **Top-k:** 50 (newly added) for better control
- **Repetition Penalty:** 1.1 (new) to reduce repetitive patterns
### 2. Quality Control Features
- **Min Distance Threshold:** 20 pixels between objects
- **Max Overlap Ratio:** 0.15 maximum allowed overlap
- **Auto-correction:** Automatic spacing issue fixes
- **Flow Optimization:** Enhanced flow pattern generation
### 3. Advanced Generation Features
- **Context-Aware Generation:** Better understanding of beatmap context
- **Style Preservation:** Maintains consistent mapping style
- **Difficulty Scaling:** Improved difficulty progression
- **Pattern Variety:** More diverse pattern generation
### 4. Training Enhancements
- **Flash Attention:** Enabled for better performance
- **Dataset Size:** 40,000 training samples (expanded from 38,689)
- **Gamemode Support:** All gamemodes (0, 1, 2, 3)
- **Data Augmentation:** Rotation, flip, scale, and noise
- **Regularization:** Weight decay (0.01) and gradient clipping (1.0)
### 5. Performance Optimizations
- **Batch Processing:** Optimized batch size (48 with gradient accumulation)
- **Mixed Precision:** BF16 precision for better stability
- **Caching:** 4096 context cache size
- **Memory Efficiency:** Gradient checkpointing enabled
## Usage
### Web UI
Select "BeatHeritage V1 (Enhanced stability & quality)" from the model dropdown in the web interface.
### CLI
```bash
# When prompted for model selection, choose option 5
Select Model:
1) Mapperatorinator V28
2) Mapperatorinator V29 (Supports gamemodes and descriptors)
3) Mapperatorinator V30 (Best stable model)
4) Mapperatorinator V31 (Slightly more accurate than V29)
5) BeatHeritage V1 (Enhanced stability & quality)
```
### Python API
```python
python inference.py -cn beatheritage_v1 \
audio_path='path/to/audio.mp3' \
output_path='output/' \
gamemode=0 \
difficulty=5.5
```
## Model Features
### Supported Tokens
- **Gamemode tokens:** std, taiko, ctb, mania
- **Difficulty tokens:** 1.0-10.0 star rating
- **Style tokens:** jump aim, stream, tech, flow, clean, complex
- **Mapper ID:** Style-specific generation
- **Year tokens:** 2007-2023
- **Special tokens:** timing, kiai, hitsounds, SV
### Context Types
- Map context
- Timing context
- Map-to-map learning
### Post-processing
- Automatic resnapping to ticks
- Overlap detection and fixing
- Slider path generation
- Coordinate refinement via diffusion
## Best Practices
### For Best Results
1. Use high-quality audio files (MP3 or OGG)
2. Specify appropriate difficulty rating
3. Use descriptors for style guidance
4. Enable super_timing for variable BPM songs
5. Use in-context learning with reference beatmaps
### Common Settings
```yaml
temperature: 0.85
top_p: 0.92
cfg_scale: 7.5
generate_positions: true
position_refinement: true
```
## Limitations
- Maximum context length: 8.192 seconds
- Requires CUDA-compatible GPU for optimal performance
- Best results with songs under 5 minutes
## Troubleshooting
### If generation quality is poor:
1. Lower temperature to 0.7-0.8
2. Increase cfg_scale to 10-15
3. Use more specific descriptors
4. Provide reference beatmap for context
### If generation is too repetitive:
1. Increase repetition_penalty to 1.2-1.5
2. Increase top_p to 0.95
3. Use negative descriptors to avoid patterns
## Future Improvements
- Extended context length support
- Real-time generation capabilities
- Multi-difficulty set generation
- Enhanced gamemode-specific features
## Credits
- Based on Mapperatorinator by OliBomby
- Enhanced by hongminh54
- Using osu-diffusion for coordinate refinement
## License
Same as the original BeatHeritage/Mapperatorinator project
---
For more information, see the main [README.md](../README.md) or visit the project repository.
|