Model Card for Neural History Chat
An updated version of Neural History Chat, using the mighty-history-merge dataset to fine-tune the previous version (v1.0).
Model Details
Run history: W&B sparkline charts for train/epoch, train/global_step, train/learning_rate, train/loss, train/total_flos, train/train_loss, train/train_runtime, train/train_samples_per_second, and train/train_steps_per_second (charts not reproducible in text; final values appear in the run summary).
Run summary:
train/epoch 1.98
train/global_step 92
train/learning_rate 0.0
train/loss 0.7792
train/total_flos 1.756453697101824e+16
train/train_loss 1.30356
train/train_runtime 1176.2194
train/train_samples_per_second 10.068
train/train_steps_per_second 0.078
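As a quick sanity check, the summary numbers above are internally consistent. The snippet below (values copied from the run summary; the implied effective batch size and dataset size are back-of-the-envelope estimates, not reported figures) cross-checks steps, runtime, and throughput:

```python
# Values copied verbatim from the run summary above.
runtime_s = 1176.2194          # train/train_runtime
steps = 92                     # train/global_step
steps_per_second = 0.078       # train/train_steps_per_second
samples_per_second = 10.068    # train/train_samples_per_second
epochs = 1.98                  # train/epoch

# Reported steps-per-second matches steps / runtime (to the rounding shown).
assert round(steps / runtime_s, 3) == steps_per_second

# Implied effective batch size: samples processed per optimizer step.
samples_seen = samples_per_second * runtime_s          # ~11,842 samples
implied_batch = samples_seen / steps                   # ~129 -> likely a configured batch of 128
implied_dataset = samples_seen / epochs                # ~5,980 samples per epoch (estimate)

print(f"implied batch size ≈ {implied_batch:.0f}, dataset size ≈ {implied_dataset:.0f}")
```

These derived figures are estimates from the rounded throughput numbers, so they carry a small margin of error.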
Training Explained
We opted for a shorter training run of roughly 2 epochs for testing and evaluation. More steps/epochs may come in the future, but Colab pricing is fairly steep: merging the PEFT adapter back into the base model currently requires roughly 40 GB of GPU RAM, so renting a Google Colab A100 is necessary and it burns through credits quickly.
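The merge step described above can be sketched with the `peft` library's `merge_and_unload` API. This is a minimal sketch, not the exact script used here: the model and directory names passed to the function are placeholders, and imports are kept inside the function so the sketch reads without the libraries installed.

```python
def merge_adapter(base_id: str, adapter_dir: str, out_dir: str):
    """Merge a LoRA/PEFT adapter into its base model and save a standalone checkpoint.

    All three path/ID arguments are caller-supplied placeholders, not the
    actual Neural History Chat checkpoints.
    """
    # Imports inside the function so this file can be loaded without the deps.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Loading the full base weights is the memory-hungry step (~40 GB GPU RAM
    # in this project's case, hence the Colab A100 requirement).
    base = AutoModelForCausalLM.from_pretrained(base_id)

    # Attach the fine-tuned adapter, fold its deltas into the base weights,
    # and save a plain checkpoint that no longer needs PEFT at load time.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()
    merged.save_pretrained(out_dir)
    return merged
```

One alternative worth noting: loading the base model in a lower precision (e.g. `torch_dtype=torch.float16`) before merging can substantially reduce the peak RAM needed.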