Update README.md
README.md
@@ -175,6 +175,10 @@ We use the [Transformer Reinforcement Learning](https://huggingface.co/docs/trl/
 
 #### Training Hyperparameters
 
+
+<details><summary>#### Training Hyperparameters</summary>
+<p>
+
 ```python
 LORA_CONFIG = {
     "r": 64,
@@ -215,6 +219,9 @@ DPO_CONFIG = {
 }
 ```
 
+</p>
+</details>
+
 #### Speeds, Sizes, Times
 
 Below are some useful parameters showing the results of the latest training logs.
@@ -296,13 +303,10 @@ python=3.11
 flash_attn>=2.5.8
 datasets
 numpy
-tabulate
-openpyxl
 trl
 peft
 bitsandbytes
 huggingface_hub
-tensorboard
 ```
 
 ## Citation