TinyStories-1M Indonesian Fine-tune (Experimental)
An experimental model fine-tuned from TinyStories-1M on an Indonesian-language dataset, for exploration and learning purposes.
⚠️ Note: This is an experimental model for testing purposes only. Performance is far from optimal.
Model Details
Model Description
This model is a fine-tuned version of TinyStories-1M trained on an Indonesian-language dataset. It is intended for experimentation and learning, not for production use.
Model Sources
- Base Model: TinyStories-1M
- Training Dataset: Lyon28/Corpus-Indonesia
Performance Metrics
⚠️ Warning: This model is still in early stages and performance is not optimal.
Training Loss & Perplexity
Training loss remained high across checkpoints, and the corresponding perplexity values show that the model has not converged well:
| Rank | Train Loss | Perplexity |
|---|---|---|
| 1 | 5.092371 | 162.78 |
| 2 | 5.710950 | 302.16 |
| 3 | 9.836301 | 18,700.41 |
| 4 | 11.639643 | 113,509.67 |
| 5 | 11.639969 | 113,546.63 |
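The perplexity column is consistent with perplexity being e raised to the cross-entropy training loss (in nats). A quick sanity check of the table values:

```python
import math

# Train losses from the checkpoint table above.
train_losses = [5.092371, 5.710950, 9.836301, 11.639643, 11.639969]

# Perplexity of a language model is exp(cross-entropy loss in nats).
for loss in train_losses:
    print(f"loss={loss:.6f} -> perplexity={math.exp(loss):,.2f}")
# First line prints: loss=5.092371 -> perplexity=162.78
```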
Training Details
- Training Time: >3 hours
- Hardware: T4 GPU
- Training Regime: Full fine-tuning
Uses
Direct Use
This model can be used for (a minimal usage sketch follows this list):
- Experimentation with Indonesian language modeling
- Learning about model fine-tuning
- Research and development
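A minimal loading sketch with the transformers library. The repo id below is a placeholder, not this model's actual path; replace it with the correct Hugging Face identifier:

```python
from transformers import pipeline

# Hypothetical repo id -- replace with this model's actual Hugging Face path.
MODEL_ID = "your-username/tinystories-1m-indonesian"

# TinyStories-1M is a very small causal LM, so CPU inference is fine.
generator = pipeline("text-generation", model=MODEL_ID)

# Indonesian prompt, roughly: "One day there was a rabbit".
output = generator(
    "Pada suatu hari ada seekor kelinci",
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
)
print(output[0]["generated_text"])
```

Given the high perplexity reported above, expect largely incoherent output; the sketch is for experimentation only.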
Out-of-Scope Use
NOT recommended for:
- Production applications
- Critical tasks requiring high accuracy
- Professional text generation
This model is still experimental with very high perplexity, indicating poor prediction quality.
Bias, Risks, and Limitations
- High Perplexity: Model shows very high perplexity (>100k on some checkpoints), indicating highly uncertain predictions
- Training Loss: High loss indicates the model has not learned optimally
- Experimental Status: This model was created for experimentation, not for serious applications
- Data Bias: Model may inherit biases from the Lyon28/Corpus-Indonesia dataset
Recommendations
- Use only for learning and experimentation purposes
- Not recommended for production use
- Requires further training with hyperparameter tuning for better results
- Consider increasing epochs, adjusting the learning rate, or using a larger dataset (see the configuration sketch after this list)
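A sketch of what a retraining configuration along those lines might look like with the transformers Trainer. All hyperparameter values here are illustrative guesses, not tested settings, and the dataset preparation step is assumed to happen elsewhere:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumes the base model is hosted at roneneldan/TinyStories-1M on the Hub.
BASE_MODEL = "roneneldan/TinyStories-1M"

model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Illustrative hyperparameters: more epochs and a warmed-up learning
# rate, per the recommendations above. Tune these for your hardware.
args = TrainingArguments(
    output_dir="./tinystories-id-v2",
    num_train_epochs=5,              # more passes over the corpus
    learning_rate=3e-4,              # lower this if loss diverges
    warmup_steps=500,
    per_device_train_batch_size=16,
    weight_decay=0.01,
    logging_steps=100,
    save_strategy="epoch",
)

# train_dataset is assumed to be a tokenized split of
# Lyon28/Corpus-Indonesia prepared separately.
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```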
Evaluation
Results
The model shows suboptimal performance:
- Best checkpoint: train loss 5.092371, perplexity ≈ 162.78
- Worst checkpoint: train loss 11.639969, perplexity ≈ 113,546.63
All metrics indicate that the model requires:
- Further training
- Hyperparameter tuning
- Possibly a better architecture or a more suitable dataset (a sketch for re-measuring perplexity on held-out text follows this list)
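One way to re-measure perplexity on held-out Indonesian text after further training. The repo id and the sample sentence are placeholders; in practice, average the loss over a real validation split:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual model path.
MODEL_ID = "your-username/tinystories-1m-indonesian"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model.eval()

# Placeholder held-out sentence ("The little rabbit ran into the forest.").
text = "Kelinci kecil itu berlari ke dalam hutan."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels == input_ids, the model returns the mean
    # cross-entropy loss over the sequence.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity = {math.exp(loss.item()):,.2f}")
```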