Rakushaking/Qwen4b-SFT-d9-merged-after-dpo-d2
Text Generation
Transformers
Safetensors
u-10bei/dpo-dataset-qwen-cot
English
qwen3
dpo
unsloth
qwen
alignment
structured-data
chain-of-thought
conversational
text-generation-inference
License: apache-2.0
main
Commit History
Update README.md
7f942e4
verified
Rakushaking committed on Feb 8
Update README.md
50768e8
verified
Rakushaking committed on Feb 8
Upload README.md with huggingface_hub
aba9187
verified
Rakushaking committed on Feb 7
(Trained with Unsloth)
e4518d2
verified
Rakushaking committed on Feb 7
(Trained with Unsloth)
bba6ad9
verified
Rakushaking committed on Feb 7
(Trained with Unsloth)
a313e8d
verified
Rakushaking committed on Feb 7
(Trained with Unsloth)
7306171
verified
Rakushaking committed on Feb 7
Unsloth Model Card
7987c8a
verified
Rakushaking committed on Feb 7
initial commit
11ac227
verified
Rakushaking committed on Feb 7