## Model design and training
Monad is a 56M-parameter decoder with a standard Qwen/Llama-like design, apart from its extremely compact size and an architecture deliberately opinionated toward depth (64 layers).
<p align="center">
<img width="80%" src="figures/monad_structure.png">
</p>
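
To get an intuition for how a 64-layer decoder can stay around 56M parameters, here is a back-of-the-envelope parameter count for a Llama-style block. The dimensions below are illustrative assumptions only (they are not Monad's published config), picked to show that a deep-and-narrow layout lands in the right ballpark:

```python
# Rough parameter count for a deep, narrow Llama-style decoder.
# All dimensions here are hypothetical, NOT Monad's actual config.

def decoder_params(vocab_size: int, d_model: int, n_layers: int,
                   ffn_dim: int, tied_embeddings: bool = True) -> int:
    """Estimate parameters of a pre-norm decoder with SwiGLU FFNs.

    Simplifications: full multi-head attention (no GQA) and RMSNorm
    (a single weight vector per norm).
    """
    embedding = vocab_size * d_model
    attention = 4 * d_model * d_model            # Wq, Wk, Wv, Wo
    ffn = 3 * d_model * ffn_dim                  # gate, up, down (SwiGLU)
    norms = 2 * d_model                          # two RMSNorms per block
    per_layer = attention + ffn + norms
    total = embedding + n_layers * per_layer + d_model  # + final norm
    if not tied_embeddings:
        total += vocab_size * d_model            # separate LM head
    return total

# Illustrative deep-and-narrow config: 64 layers, small width.
total = decoder_params(vocab_size=16_384, d_model=224, n_layers=64, ffn_dim=896)
print(f"{total / 1e6:.1f}M parameters")
```

With these (hypothetical) dimensions most of the budget sits in the 64 transformer blocks rather than the embeddings, which is the point of an architecture opinionated for depth.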
Monad was trained on 16 H100 GPUs on Jean Zay (compute plan n°A0191016886). Full pre-training took a little under 6 hours.