This model is a base model trained on a mix of educational data. It demonstrates reasonable storytelling and factual knowledge for its size, but may hallucinate and is not yet fine-tuned for instruction following.
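A minimal inference sketch, assuming the checkpoint is published as a standard Hugging Face causal language model (the repo id below is a placeholder, not the model's actual id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id — replace with the model's actual Hugging Face id.
model_id = "your-username/your-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# As a base model, it continues the prompt rather than following instructions.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,     # sampling settings here are illustrative
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that loading the model downloads the weights on first use, and outputs vary between runs when sampling is enabled.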
## Historical Context
This model (151M parameters) reached a scale comparable to OpenAI's original GPT-1 (117M parameters) in under 3 hours on a single consumer GPU, showcasing how dramatically training efficiency has improved in recent years.