# secret-model-stage-1-8B-32
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.1070
- Centroid Acc: 0.9811
- Centroid Macro F1: 0.9805
- kNN Acc: 0.9811
- kNN Macro F1: 0.9805
- Alignment: 0.4123
- Uniformity: -2.8989
- Combined Score: 0.9805
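The Alignment and Uniformity values above appear to be the contrastive-representation metrics of Wang & Isola (2020). The exact formulation used for this model is not documented, so the following is a minimal sketch under that assumption, for L2-normalized embeddings:

```python
# Hedged re-implementation of alignment/uniformity in the style of
# Wang & Isola (2020); the metric definitions used by this model's
# evaluation code are an assumption, not confirmed by the card.
import torch
import torch.nn.functional as F

def alignment(x: torch.Tensor, y: torch.Tensor, alpha: int = 2) -> float:
    """Mean distance between positive pairs (x[i], y[i]); lower is better."""
    return (x - y).norm(p=2, dim=1).pow(alpha).mean().item()

def uniformity(x: torch.Tensor, t: float = 2.0) -> float:
    """Log mean pairwise Gaussian potential; more negative = more uniform."""
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log().item()

torch.manual_seed(0)
x = F.normalize(torch.randn(64, 16), dim=1)              # anchor embeddings
y = F.normalize(x + 0.1 * torch.randn(64, 16), dim=1)    # positive views
```

Under these definitions a negative uniformity (as reported above) indicates well-spread embeddings on the hypersphere, while alignment near zero indicates tightly matched positive pairs.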
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 100.0
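The optimizer and learning-rate schedule above can be sketched in plain PyTorch. This is a minimal sketch, not the Trainer's exact internals; `total_steps = 3200` is inferred from the final step count in the results table, and the stand-in model is illustrative:

```python
# Sketch of AdamW + cosine schedule with 6% linear warmup, mirroring the
# reported hyperparameters. The HF Trainer's scheduler may differ in detail.
import math
import torch

model = torch.nn.Linear(8, 2)           # stand-in for the real model
total_steps = 3200                      # inferred: 100 epochs ending at step 3200
warmup_steps = int(0.06 * total_steps)  # lr_scheduler_warmup_ratio: 0.06

optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8
)

def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        return step / max(1, warmup_steps)               # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))    # cosine decay to 0

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

The multiplier starts at 0, reaches 1.0 (i.e. the full 1e-3 learning rate) at the end of warmup, and decays to 0 by the final step.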
### Training results
| Training Loss | Epoch | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | kNN Acc | kNN Macro F1 | Alignment | Uniformity | Combined Score |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.3436 | 0.5660 | 0.5370 | 0.7170 | 0.7131 | 0.2797 | -0.7130 | 0.5957 |
| 1.2412 | 3.125 | 100 | 0.7993 | 0.8113 | 0.8149 | 0.7925 | 0.7874 | 0.3830 | -1.9092 | 0.8057 |
| 0.9887 | 6.25 | 200 | 0.6368 | 0.9057 | 0.9043 | 0.9434 | 0.9438 | 0.4639 | -2.3435 | 0.9175 |
| 0.7032 | 9.375 | 300 | 0.5491 | 0.9057 | 0.9103 | 0.9245 | 0.9265 | 0.3843 | -2.1929 | 0.9157 |
| 0.2618 | 12.5 | 400 | 0.1410 | 0.9434 | 0.9438 | 0.9245 | 0.9241 | 0.3929 | -2.5564 | 0.9372 |
| 0.2934 | 15.625 | 500 | 0.2402 | 0.9811 | 0.9805 | 0.9434 | 0.9394 | 0.4081 | -2.5045 | 0.9668 |
| 0.2267 | 18.75 | 600 | 0.3960 | 0.9434 | 0.9417 | 0.9434 | 0.9438 | 0.4676 | -2.6223 | 0.9424 |
| 0.1858 | 21.875 | 700 | 0.1469 | 0.9623 | 0.9612 | 0.9434 | 0.9407 | 0.4225 | -2.8028 | 0.9544 |
| 0.0626 | 25.0 | 800 | 0.2411 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4344 | -2.8140 | 0.9805 |
| 0.0373 | 28.125 | 900 | 0.1800 | 0.9811 | 0.9805 | 1.0 | 1.0 | 0.4696 | -2.8784 | 0.9870 |
| 0.0176 | 31.25 | 1000 | 0.1727 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4318 | -2.8063 | 1.0 |
| 0.111 | 34.375 | 1100 | 0.0621 | 0.9811 | 0.9805 | 0.9811 | 0.9829 | 0.3770 | -2.7065 | 0.9813 |
| 0.0486 | 37.5 | 1200 | 0.1078 | 1.0 | 1.0 | 0.9811 | 0.9805 | 0.4132 | -2.8674 | 0.9935 |
| 0.0054 | 40.625 | 1300 | 0.1198 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4120 | -2.8506 | 1.0 |
| 0.0069 | 43.75 | 1400 | 0.1805 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4114 | -2.7904 | 0.9805 |
| 0.0196 | 46.875 | 1500 | 0.1678 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4262 | -2.9247 | 0.9805 |
| 0.0027 | 50.0 | 1600 | 0.0957 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4106 | -2.8659 | 1.0 |
| 0.0777 | 53.125 | 1700 | 0.0687 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4015 | -2.8900 | 1.0 |
| 0.0011 | 56.25 | 1800 | 0.0804 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4102 | -2.9196 | 1.0 |
| 0.0151 | 59.375 | 1900 | 0.0749 | 1.0 | 1.0 | 1.0 | 1.0 | 0.4151 | -2.9207 | 1.0 |
| 0.0284 | 62.5 | 2000 | 0.0865 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4014 | -2.8595 | 0.9805 |
| 0.001 | 65.625 | 2100 | 0.1106 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4099 | -2.8875 | 0.9805 |
| 0.0009 | 68.75 | 2200 | 0.0807 | 0.9811 | 0.9805 | 1.0 | 1.0 | 0.4144 | -2.9166 | 0.9870 |
| 0.0012 | 71.875 | 2300 | 0.1107 | 0.9811 | 0.9805 | 1.0 | 1.0 | 0.4192 | -2.9153 | 0.9870 |
| 0.0009 | 75.0 | 2400 | 0.0987 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4138 | -2.9017 | 0.9805 |
| 0.0011 | 78.125 | 2500 | 0.1045 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4161 | -2.9174 | 0.9805 |
| 0.0008 | 81.25 | 2600 | 0.0895 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4054 | -2.8906 | 0.9805 |
| 0.0089 | 84.375 | 2700 | 0.0899 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4092 | -2.9021 | 0.9805 |
| 0.0006 | 87.5 | 2800 | 0.0933 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4102 | -2.9016 | 0.9805 |
| 0.0008 | 90.625 | 2900 | 0.1126 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4110 | -2.8889 | 0.9805 |
| 0.0009 | 93.75 | 3000 | 0.1084 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4116 | -2.8958 | 0.9805 |
| 0.0387 | 96.875 | 3100 | 0.1089 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4123 | -2.8985 | 0.9805 |
| 0.0007 | 100.0 | 3200 | 0.1070 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4123 | -2.8989 | 0.9805 |
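The Centroid and kNN accuracy columns suggest the embeddings are evaluated with a nearest-class-mean and a k-nearest-neighbor classifier. The evaluation code for this model is not published, so the following is an illustrative sketch (function names and `k` are assumptions):

```python
# Hedged sketch of centroid and kNN accuracy over frozen embeddings;
# the model's actual evaluation pipeline may differ.
import numpy as np

def centroid_accuracy(train_emb, train_y, test_emb, test_y):
    """Assign each test embedding to the nearest class centroid."""
    classes = np.unique(train_y)
    centroids = np.stack([train_emb[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_emb[:, None, :] - centroids[None, :, :], axis=2)
    preds = classes[dists.argmin(axis=1)]
    return (preds == test_y).mean()

def knn_accuracy(train_emb, train_y, test_emb, test_y, k=5):
    """Majority vote among the k nearest training embeddings."""
    dists = np.linalg.norm(test_emb[:, None, :] - train_emb[None, :, :], axis=2)
    nn_labels = train_y[dists.argsort(axis=1)[:, :k]]
    preds = np.array([np.bincount(row).argmax() for row in nn_labels])
    return (preds == test_y).mean()
```

Both metrics operate purely on the embedding space, so they track how linearly separable and locally consistent the learned representation is, independent of any classification head.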
### Framework versions
- Transformers 4.56.0
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0