# secret-model-stage-1-8B-32
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.1070
- Centroid Acc: 0.9811
- Centroid Macro F1: 0.9805
- kNN Acc: 0.9811
- kNN Macro F1: 0.9805
- Alignment: 0.4123
- Uniformity: -2.8989
- Combined Score: 0.9805
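The Alignment and Uniformity values above appear to be the contrastive-representation metrics of Wang & Isola (2020). The exact formulation used for this model is not documented, so the following is a minimal sketch under that assumption, for L2-normalized embeddings:

```python
# Hedged re-implementation of alignment/uniformity in the style of
# Wang & Isola (2020); the metric definitions used by this model's
# evaluation code are an assumption, not confirmed by the card.
import torch
import torch.nn.functional as F

def alignment(x: torch.Tensor, y: torch.Tensor, alpha: int = 2) -> float:
    """Mean distance between positive pairs (x[i], y[i]); lower is better."""
    return (x - y).norm(p=2, dim=1).pow(alpha).mean().item()

def uniformity(x: torch.Tensor, t: float = 2.0) -> float:
    """Log mean pairwise Gaussian potential; more negative = more uniform."""
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log().item()

torch.manual_seed(0)
x = F.normalize(torch.randn(64, 16), dim=1)              # anchor embeddings
y = F.normalize(x + 0.1 * torch.randn(64, 16), dim=1)    # positive views
```

Under these definitions a negative uniformity (as reported above) indicates well-spread embeddings on the hypersphere, while alignment near zero indicates tightly matched positive pairs.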
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 100.0
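The optimizer and learning-rate schedule above can be sketched in plain PyTorch. This is a minimal sketch, not the Trainer's exact internals; `total_steps = 3200` is inferred from the final step count in the results table, and the stand-in model is illustrative:

```python
# Sketch of AdamW + cosine schedule with 6% linear warmup, mirroring the
# reported hyperparameters. The HF Trainer's scheduler may differ in detail.
import math
import torch

model = torch.nn.Linear(8, 2)           # stand-in for the real model
total_steps = 3200                      # inferred: 100 epochs ending at step 3200
warmup_steps = int(0.06 * total_steps)  # lr_scheduler_warmup_ratio: 0.06

optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8
)

def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        return step / max(1, warmup_steps)               # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))    # cosine decay to 0

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

The multiplier starts at 0, reaches 1.0 (i.e. the full 1e-3 learning rate) at the end of warmup, and decays to 0 by the final step.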
### Training results
| Training Loss | Epoch | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | kNN Acc | kNN Macro F1 | Alignment | Uniformity | Combined Score |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.3436 | 0.5660 | 0.5370 | 0.7170 | 0.7131 | 0.2797 | -0.7130 | 0.5957 |
| 1.2412 | 3.125 | 100 | 0.7993 | 0.8113 | 0.8149 | 0.7925 | 0.7874 | 0.3830 | -1.9092 | 0.8057 |
| 0.9887 | 6.25 | 200 | 0.6368 | 0.9057 | 0.9043 | 0.9434 | 0.9438 | 0.4639 | -2.3435 | 0.9175 |
| 0.7032 | 9.375 | 300 | 0.5491 | 0.9057 | 0.9103 | 0.9245 | 0.9265 | 0.3843 | -2.1929 | 0.9157 |
| 0.2618 | 12.5 | 400 | 0.1410 | 0.9434 | 0.9438 | 0.9245 | 0.9241 | 0.3929 | -2.5564 | 0.9372 |
| 0.2934 | 15.625 | 500 | 0.2402 | 0.9811 | 0.9805 | 0.9434 | 0.9394 | 0.4081 | -2.5045 | 0.9668 |
| 0.2267 | 18.75 | 600 | 0.3960 | 0.9434 | 0.9417 | 0.9434 | 0.9438 | 0.4676 | -2.6223 | 0.9424 |
| 0.1858 | 21.875 | 700 | 0.1469 | 0.9623 | 0.9612 | 0.9434 | 0.9407 | 0.4225 | -2.8028 | 0.9544 |
| 0.0626 | 25.0 | 800 | 0.2411 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4344 | -2.8140 | 0.9805 |
| 0.0373 | 28.125 | 900 | 0.1800 | 0.9811 | 0.9805 | 1.0 | 1.0 | 0.4696 | -2.8784 | 0.9870 |
| 0.0176 | 31.25 | 1000 | 0.1727 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4318 | -2.8063 | 1.0 |
| 0.111 | 34.375 | 1100 | 0.0621 | 0.9811 | 0.9805 | 0.9811 | 0.9829 | 0.3770 | -2.7065 | 0.9813 |
| 0.0486 | 37.5 | 1200 | 0.1078 | 1.0 | 1.0 | 0.9811 | 0.9805 | 0.4132 | -2.8674 | 0.9935 |
| 0.0054 | 40.625 | 1300 | 0.1198 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4120 | -2.8506 | 1.0 |
| 0.0069 | 43.75 | 1400 | 0.1805 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4114 | -2.7904 | 0.9805 |
| 0.0196 | 46.875 | 1500 | 0.1678 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4262 | -2.9247 | 0.9805 |
| 0.0027 | 50.0 | 1600 | 0.0957 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4106 | -2.8659 | 1.0 |
| 0.0777 | 53.125 | 1700 | 0.0687 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4015 | -2.8900 | 1.0 |
| 0.0011 | 56.25 | 1800 | 0.0804 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4102 | -2.9196 | 1.0 |
| 0.0151 | 59.375 | 1900 | 0.0749 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4151 | -2.9207 | 1.0 |
| 0.0284 | 62.5 | 2000 | 0.0865 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4014 | -2.8595 | 0.9805 |
| 0.001 | 65.625 | 2100 | 0.1106 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4099 | -2.8875 | 0.9805 |
| 0.0009 | 68.75 | 2200 | 0.0807 | 0.9811 | 0.9805 | 1.0 | 1.0 | 0.4144 | -2.9166 | 0.9870 |
| 0.0012 | 71.875 | 2300 | 0.1107 | 0.9811 | 0.9805 | 1.0 | 1.0 | 0.4192 | -2.9153 | 0.9870 |
| 0.0009 | 75.0 | 2400 | 0.0987 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4138 | -2.9017 | 0.9805 |
| 0.0011 | 78.125 | 2500 | 0.1045 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4161 | -2.9174 | 0.9805 |
| 0.0008 | 81.25 | 2600 | 0.0895 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4054 | -2.8906 | 0.9805 |
| 0.0089 | 84.375 | 2700 | 0.0899 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4092 | -2.9021 | 0.9805 |
| 0.0006 | 87.5 | 2800 | 0.0933 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4102 | -2.9016 | 0.9805 |
| 0.0008 | 90.625 | 2900 | 0.1126 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4110 | -2.8889 | 0.9805 |
| 0.0009 | 93.75 | 3000 | 0.1084 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4116 | -2.8958 | 0.9805 |
| 0.0387 | 96.875 | 3100 | 0.1089 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4123 | -2.8985 | 0.9805 |
| 0.0007 | 100.0 | 3200 | 0.1070 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4123 | -2.8989 | 0.9805 |
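The Centroid and kNN accuracy columns suggest the embeddings are evaluated with a nearest-class-mean and a k-nearest-neighbor classifier. The evaluation code for this model is not published, so the following is an illustrative sketch (function names and `k` are assumptions):

```python
# Hedged sketch of centroid and kNN accuracy over frozen embeddings;
# the model's actual evaluation pipeline may differ.
import numpy as np

def centroid_accuracy(train_emb, train_y, test_emb, test_y):
    """Assign each test embedding to the nearest class centroid."""
    classes = np.unique(train_y)
    centroids = np.stack([train_emb[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_emb[:, None, :] - centroids[None, :, :], axis=2)
    preds = classes[dists.argmin(axis=1)]
    return (preds == test_y).mean()

def knn_accuracy(train_emb, train_y, test_emb, test_y, k=5):
    """Majority vote among the k nearest training embeddings."""
    dists = np.linalg.norm(test_emb[:, None, :] - train_emb[None, :, :], axis=2)
    nn_labels = train_y[dists.argsort(axis=1)[:, :k]]
    preds = np.array([np.bincount(row).argmax() for row in nn_labels])
    return (preds == test_y).mean()
```

Both metrics operate purely on the embedding space, so they track how linearly separable and locally consistent the learned representation is, independent of any classification head.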
### Framework versions
- Transformers 4.56.0
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0