secret-model-stage-1-8B-32

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1070
  • Centroid Acc: 0.9811
  • Centroid Macro F1: 0.9805
  • kNN Acc: 0.9811
  • kNN Macro F1: 0.9805
  • Alignment: 0.4123
  • Uniformity: -2.8989
  • Combined Score: 0.9805
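The centroid and kNN metrics suggest an embedding-classification evaluation, but this card does not define them. The following is a minimal sketch under common assumptions, all of which are guesses: cosine similarity for centroid and kNN classification, leave-one-out kNN voting, and the Wang–Isola definitions of alignment and uniformity with t = 2.

```python
import numpy as np

def _normalize(x):
    """L2-normalize each row."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def centroid_accuracy(embeddings, labels):
    """Classify each sample by its nearest class centroid (cosine similarity assumed)."""
    classes = np.unique(labels)
    centroids = _normalize(
        np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    )
    preds = classes[np.argmax(_normalize(embeddings) @ centroids.T, axis=1)]
    return float((preds == labels).mean())

def knn_accuracy(embeddings, labels, k=5):
    """Leave-one-out k-nearest-neighbour majority vote over cosine similarity."""
    emb = _normalize(embeddings)
    sims = emb @ emb.T
    np.fill_diagonal(sims, -np.inf)          # exclude self-matches
    idx = np.argsort(-sims, axis=1)[:, :k]   # k nearest neighbours per sample
    preds = np.array([np.bincount(labels[row]).argmax() for row in idx])
    return float((preds == labels).mean())

def alignment(x, y):
    """Mean squared distance between normalized positive pairs (Wang & Isola)."""
    return float(np.mean(np.sum((_normalize(x) - _normalize(y)) ** 2, axis=1)))

def uniformity(embeddings, t=2.0):
    """log E[exp(-t * ||u - v||^2)] over all distinct normalized pairs (Wang & Isola)."""
    emb = _normalize(embeddings)
    sq = np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(emb), k=1)
    return float(np.log(np.mean(np.exp(-t * sq[iu]))))
```

On well-separated clusters both accuracies approach 1.0, alignment of identical pairs is 0, and uniformity is negative for any spread-out embedding, consistent with the signs of the values reported above.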

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 100.0
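The hyperparameters above map directly onto the Hugging Face `TrainingArguments` API (the card's Framework versions section lists Transformers 4.56.0). A sketch of that configuration follows; `output_dir` is a placeholder, and any field not listed above is left at its default.

```python
from transformers import TrainingArguments

# Reconstruction of the reported hyperparameters; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="secret-model-stage-1-8B-32",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    num_train_epochs=100.0,
)
```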

Training results

| Training Loss | Epoch   | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | kNN Acc | kNN Macro F1 | Alignment | Uniformity | Combined Score |
|:-------------:|:-------:|:----:|:---------------:|:------------:|:-----------------:|:-------:|:------------:|:---------:|:----------:|:--------------:|
| No log        | 0       | 0    | 2.3436          | 0.5660       | 0.5370            | 0.7170  | 0.7131       | 0.2797    | -0.7130    | 0.5957         |
| 1.2412        | 3.125   | 100  | 0.7993          | 0.8113       | 0.8149            | 0.7925  | 0.7874       | 0.3830    | -1.9092    | 0.8057         |
| 0.9887        | 6.25    | 200  | 0.6368          | 0.9057       | 0.9043            | 0.9434  | 0.9438       | 0.4639    | -2.3435    | 0.9175         |
| 0.7032        | 9.375   | 300  | 0.5491          | 0.9057       | 0.9103            | 0.9245  | 0.9265       | 0.3843    | -2.1929    | 0.9157         |
| 0.2618        | 12.5    | 400  | 0.1410          | 0.9434       | 0.9438            | 0.9245  | 0.9241       | 0.3929    | -2.5564    | 0.9372         |
| 0.2934        | 15.625  | 500  | 0.2402          | 0.9811       | 0.9805            | 0.9434  | 0.9394       | 0.4081    | -2.5045    | 0.9668         |
| 0.2267        | 18.75   | 600  | 0.3960          | 0.9434       | 0.9417            | 0.9434  | 0.9438       | 0.4676    | -2.6223    | 0.9424         |
| 0.1858        | 21.875  | 700  | 0.1469          | 0.9623       | 0.9612            | 0.9434  | 0.9407       | 0.4225    | -2.8028    | 0.9544         |
| 0.0626        | 25.0    | 800  | 0.2411          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4344    | -2.8140    | 0.9805         |
| 0.0373        | 28.125  | 900  | 0.1800          | 0.9811       | 0.9805            | 1.0     | 1.0          | 0.4696    | -2.8784    | 0.9870         |
| 0.0176        | 31.25   | 1000 | 0.1727          | 1.0          | 1.0               | 1.0     | 1.0          | 0.4318    | -2.8063    | 1.0            |
| 0.111         | 34.375  | 1100 | 0.0621          | 0.9811       | 0.9805            | 0.9811  | 0.9829       | 0.3770    | -2.7065    | 0.9813         |
| 0.0486        | 37.5    | 1200 | 0.1078          | 1.0          | 1.0               | 0.9811  | 0.9805       | 0.4132    | -2.8674    | 0.9935         |
| 0.0054        | 40.625  | 1300 | 0.1198          | 1.0          | 1.0               | 1.0     | 1.0          | 0.4120    | -2.8506    | 1.0            |
| 0.0069        | 43.75   | 1400 | 0.1805          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4114    | -2.7904    | 0.9805         |
| 0.0196        | 46.875  | 1500 | 0.1678          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4262    | -2.9247    | 0.9805         |
| 0.0027        | 50.0    | 1600 | 0.0957          | 1.0          | 1.0               | 1.0     | 1.0          | 0.4106    | -2.8659    | 1.0            |
| 0.0777        | 53.125  | 1700 | 0.0687          | 1.0          | 1.0               | 1.0     | 1.0          | 0.4015    | -2.8900    | 1.0            |
| 0.0011        | 56.25   | 1800 | 0.0804          | 1.0          | 1.0               | 1.0     | 1.0          | 0.4102    | -2.9196    | 1.0            |
| 0.0151        | 59.375  | 1900 | 0.0749          | 1.0          | 1.0               | 1.0     | 1.0          | 0.4151    | -2.9207    | 1.0            |
| 0.0284        | 62.5    | 2000 | 0.0865          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4014    | -2.8595    | 0.9805         |
| 0.001         | 65.625  | 2100 | 0.1106          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4099    | -2.8875    | 0.9805         |
| 0.0009        | 68.75   | 2200 | 0.0807          | 0.9811       | 0.9805            | 1.0     | 1.0          | 0.4144    | -2.9166    | 0.9870         |
| 0.0012        | 71.875  | 2300 | 0.1107          | 0.9811       | 0.9805            | 1.0     | 1.0          | 0.4192    | -2.9153    | 0.9870         |
| 0.0009        | 75.0    | 2400 | 0.0987          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4138    | -2.9017    | 0.9805         |
| 0.0011        | 78.125  | 2500 | 0.1045          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4161    | -2.9174    | 0.9805         |
| 0.0008        | 81.25   | 2600 | 0.0895          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4054    | -2.8906    | 0.9805         |
| 0.0089        | 84.375  | 2700 | 0.0899          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4092    | -2.9021    | 0.9805         |
| 0.0006        | 87.5    | 2800 | 0.0933          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4102    | -2.9016    | 0.9805         |
| 0.0008        | 90.625  | 2900 | 0.1126          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4110    | -2.8889    | 0.9805         |
| 0.0009        | 93.75   | 3000 | 0.1084          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4116    | -2.8958    | 0.9805         |
| 0.0387        | 96.875  | 3100 | 0.1089          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4123    | -2.8985    | 0.9805         |
| 0.0007        | 100.0   | 3200 | 0.1070          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4123    | -2.8989    | 0.9805         |

Framework versions

  • Transformers 4.56.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.0
Model details

  • Format: Safetensors
  • Model size: 131k params
  • Tensor type: F32
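The reported size (131k params, F32 tensors) can be sanity-checked after loading the checkpoint. A minimal sketch assuming the weights have been loaded into a PyTorch model object; `param_summary` is a helper introduced here, not part of this repository:

```python
import torch.nn as nn

def param_summary(model: nn.Module):
    """Return the total parameter count and the set of parameter dtypes."""
    total = sum(p.numel() for p in model.parameters())
    dtypes = {str(p.dtype) for p in model.parameters()}
    return total, dtypes
```

For this model the expected result would be roughly 131k parameters with a single dtype, `torch.float32`.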