---
tags:
- text-generation
- causal-lm
- instruction-tuning
- chat
- rag
- code-generation
- summarization
- extraction
- synthetic-data
- generated_from_trainer
license: other
pipeline_tag: text-generation
library_name: transformers
language:
- en
base_model:
- allenai/OLMo-2-0425-1B-Instruct
- allenai/OLMo-3-7B-Instruct
- allenai/OLMo-3.1-32B-Instruct
---

# Bolt Instruct Models

Bolt Instruct is a family of **instruction-tuned language models designed for high-quality generation, reasoning, and enterprise workflows**.

These models are **fine-tuned from Allen Institute for AI OLMo instruct models** and optimized for:

- General conversational AI
- Structured and controllable generation
- Retrieval-Augmented Generation (RAG)
- Enterprise document understanding
- Code generation and transformation

---

# Model Overview

Bolt Instruct models provide **strong instruction-following capabilities** across diverse tasks with robust long-context support.

Key design goals:

- Strong instruction adherence
- High-quality structured outputs (JSON, extraction)
- RAG-grounded responses
- Long-context support (65k tokens for 7B and 32B)
- Balanced chat, reasoning, and coding performance

---

# Model Variants

| Model | Base Model | Positioning |
|------|------------|------------|
| bolt-instruct-1b | allenai/OLMo-2-0425-1B-Instruct | Lightweight / low-latency |
| bolt-instruct-7b | allenai/OLMo-3-7B-Instruct | Balanced |
| bolt-instruct-32b | allenai/OLMo-3.1-32B-Instruct | Highest quality |

---

# Model Details

- **Type:** Causal LM (instruction-tuned)
- **Max context:** 65,536 tokens (7B and 32B), 4,096 tokens (1B) 
- **Training context:** 32k (7B), 16k (32B), 4k (1B)

### Capabilities

- Chat / multi-turn dialogue  
- Instruction following  
- Structured output (JSON)  
- Summarization & transformation  
- Extraction  
- RAG generation  
- Code generation  

---

# Training

- **Method:** Supervised Fine-Tuning (SFT)
- **Dataset size:** ~125k conversations
- **Eval set:** ~10k examples
- **Data mix:** public + synthetic + internal tasks

### Training Approach

- 1B → full fine-tune  
- 7B / 32B → QLoRA (4-bit)

### Hardware

- 1× A100 80GB GPU

---

# Intended Use

- Chat assistants  
- Enterprise copilots  
- RAG pipelines  
- Document processing  
- Structured extraction  
- Code assistance  

---

# Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "aisquared/bolt-instruct-7b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```

---

# Evaluation

To evaluate these models, we ran a subset of tasks using the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness). Below are the metrics for each model.

## Language Model Evaluation Harness

### Evaluation results for aisquared/bolt-instruct-1b:

|                          Tasks                           |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|----------------------------------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge                                             |      1|none            |     0|acc        |↑  |0.3490|±  |0.0139|
|                                                          |       |none            |     0|acc_norm   |↑  |0.3823|±  |0.0142|
|arc_easy                                                  |      1|none            |     0|acc        |↑  |0.6098|±  |0.0100|
|                                                          |       |none            |     0|acc_norm   |↑  |0.5560|±  |0.0102|
|bbh                                                       |      3|get-answer      |      |exact_match|↑  |0.3081|±  |0.0052|
| - bbh_cot_fewshot_boolean_expressions                    |      4|get-answer      |     3|exact_match|↑  |0.5840|±  |0.0312|
| - bbh_cot_fewshot_causal_judgement                       |      4|get-answer      |     3|exact_match|↑  |0.5508|±  |0.0365|
| - bbh_cot_fewshot_date_understanding                     |      4|get-answer      |     3|exact_match|↑  |0.2600|±  |0.0278|
| - bbh_cot_fewshot_disambiguation_qa                      |      4|get-answer      |     3|exact_match|↑  |0.3640|±  |0.0305|
| - bbh_cot_fewshot_dyck_languages                         |      4|get-answer      |     3|exact_match|↑  |0.0040|±  |0.0040|
| - bbh_cot_fewshot_formal_fallacies                       |      4|get-answer      |     3|exact_match|↑  |0.5040|±  |0.0317|
| - bbh_cot_fewshot_geometric_shapes                       |      4|get-answer      |     3|exact_match|↑  |0.0920|±  |0.0183|
| - bbh_cot_fewshot_hyperbaton                             |      4|get-answer      |     3|exact_match|↑  |0.5240|±  |0.0316|
| - bbh_cot_fewshot_logical_deduction_five_objects         |      4|get-answer      |     3|exact_match|↑  |0.1720|±  |0.0239|
| - bbh_cot_fewshot_logical_deduction_seven_objects        |      4|get-answer      |     3|exact_match|↑  |0.1080|±  |0.0197|
| - bbh_cot_fewshot_logical_deduction_three_objects        |      4|get-answer      |     3|exact_match|↑  |0.3520|±  |0.0303|
| - bbh_cot_fewshot_movie_recommendation                   |      4|get-answer      |     3|exact_match|↑  |0.5040|±  |0.0317|
| - bbh_cot_fewshot_multistep_arithmetic_two               |      4|get-answer      |     3|exact_match|↑  |0.0600|±  |0.0151|
| - bbh_cot_fewshot_navigate                               |      4|get-answer      |     3|exact_match|↑  |0.5560|±  |0.0315|
| - bbh_cot_fewshot_object_counting                        |      4|get-answer      |     3|exact_match|↑  |0.4360|±  |0.0314|
| - bbh_cot_fewshot_penguins_in_a_table                    |      4|get-answer      |     3|exact_match|↑  |0.2123|±  |0.0340|
| - bbh_cot_fewshot_reasoning_about_colored_objects        |      4|get-answer      |     3|exact_match|↑  |0.2440|±  |0.0272|
| - bbh_cot_fewshot_ruin_names                             |      4|get-answer      |     3|exact_match|↑  |0.2440|±  |0.0272|
| - bbh_cot_fewshot_salient_translation_error_detection    |      4|get-answer      |     3|exact_match|↑  |0.1920|±  |0.0250|
| - bbh_cot_fewshot_snarks                                 |      4|get-answer      |     3|exact_match|↑  |0.3989|±  |0.0368|
| - bbh_cot_fewshot_sports_understanding                   |      4|get-answer      |     3|exact_match|↑  |0.6560|±  |0.0301|
| - bbh_cot_fewshot_temporal_sequences                     |      4|get-answer      |     3|exact_match|↑  |0.2760|±  |0.0283|
| - bbh_cot_fewshot_tracking_shuffled_objects_five_objects |      4|get-answer      |     3|exact_match|↑  |0.1920|±  |0.0250|
| - bbh_cot_fewshot_tracking_shuffled_objects_seven_objects|      4|get-answer      |     3|exact_match|↑  |0.0360|±  |0.0118|
| - bbh_cot_fewshot_tracking_shuffled_objects_three_objects|      4|get-answer      |     3|exact_match|↑  |0.2840|±  |0.0286|
| - bbh_cot_fewshot_web_of_lies                            |      4|get-answer      |     3|exact_match|↑  |0.5240|±  |0.0316|
| - bbh_cot_fewshot_word_sorting                           |      4|get-answer      |     3|exact_match|↑  |0.0360|±  |0.0118|
|gsm8k                                                     |      3|flexible-extract|     5|exact_match|↑  |0.5072|±  |0.0138|
|                                                          |       |strict-match    |     5|exact_match|↑  |0.4943|±  |0.0138|
|hellaswag                                                 |      1|none            |     0|acc        |↑  |0.4729|±  |0.0050|
|                                                          |       |none            |     0|acc_norm   |↑  |0.6181|±  |0.0048|
|mmlu_pro                                                  |      2|custom-extract  |      |exact_match|↑  |0.1435|±  |0.0032|
| - biology                                                |      3|custom-extract  |     5|exact_match|↑  |0.2050|±  |0.0151|
| - business                                               |      3|custom-extract  |     5|exact_match|↑  |0.1369|±  |0.0122|
| - chemistry                                              |      3|custom-extract  |     5|exact_match|↑  |0.0848|±  |0.0083|
| - computer_science                                       |      3|custom-extract  |     5|exact_match|↑  |0.1415|±  |0.0172|
| - economics                                              |      3|custom-extract  |     5|exact_match|↑  |0.1943|±  |0.0136|
| - engineering                                            |      3|custom-extract  |     5|exact_match|↑  |0.0929|±  |0.0093|
| - health                                                 |      3|custom-extract  |     5|exact_match|↑  |0.1528|±  |0.0126|
| - history                                                |      3|custom-extract  |     5|exact_match|↑  |0.1549|±  |0.0186|
| - law                                                    |      3|custom-extract  |     5|exact_match|↑  |0.1081|±  |0.0094|
| - math                                                   |      3|custom-extract  |     5|exact_match|↑  |0.1414|±  |0.0095|
| - other                                                  |      3|custom-extract  |     5|exact_match|↑  |0.1916|±  |0.0130|
| - philosophy                                             |      3|custom-extract  |     5|exact_match|↑  |0.1383|±  |0.0155|
| - physics                                                |      3|custom-extract  |     5|exact_match|↑  |0.1186|±  |0.0090|
| - psychology                                             |      3|custom-extract  |     5|exact_match|↑  |0.2130|±  |0.0145|
|truthfulqa_mc2                                            |      3|none            |     0|acc        |↑  |0.4734|±  |0.0153|
|winogrande                                                |      1|none            |     0|acc        |↑  |0.6156|±  |0.0137|


### Evaluation results for aisquared/bolt-instruct-7b:

|                          Tasks                           |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|----------------------------------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge                                             |      1|none            |     0|acc        |↑  |0.4778|±  |0.0146|
|                                                          |       |none            |     0|acc_norm   |↑  |0.4957|±  |0.0146|
|arc_easy                                                  |      1|none            |     0|acc        |↑  |0.7534|±  |0.0088|
|                                                          |       |none            |     0|acc_norm   |↑  |0.7311|±  |0.0091|
|bbh                                                       |      3|get-answer      |      |exact_match|↑  |0.3038|±  |0.0047|
| - bbh_cot_fewshot_boolean_expressions                    |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_causal_judgement                       |      4|get-answer      |     3|exact_match|↑  |0.5668|±  |0.0363|
| - bbh_cot_fewshot_date_understanding                     |      4|get-answer      |     3|exact_match|↑  |0.4480|±  |0.0315|
| - bbh_cot_fewshot_disambiguation_qa                      |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_dyck_languages                         |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_formal_fallacies                       |      4|get-answer      |     3|exact_match|↑  |0.2240|±  |0.0264|
| - bbh_cot_fewshot_geometric_shapes                       |      4|get-answer      |     3|exact_match|↑  |0.2960|±  |0.0289|
| - bbh_cot_fewshot_hyperbaton                             |      4|get-answer      |     3|exact_match|↑  |0.5200|±  |0.0317|
| - bbh_cot_fewshot_logical_deduction_five_objects         |      4|get-answer      |     3|exact_match|↑  |0.0200|±  |0.0089|
| - bbh_cot_fewshot_logical_deduction_seven_objects        |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_logical_deduction_three_objects        |      4|get-answer      |     3|exact_match|↑  |0.6720|±  |0.0298|
| - bbh_cot_fewshot_movie_recommendation                   |      4|get-answer      |     3|exact_match|↑  |0.1200|±  |0.0206|
| - bbh_cot_fewshot_multistep_arithmetic_two               |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_navigate                               |      4|get-answer      |     3|exact_match|↑  |0.5560|±  |0.0315|
| - bbh_cot_fewshot_object_counting                        |      4|get-answer      |     3|exact_match|↑  |0.1520|±  |0.0228|
| - bbh_cot_fewshot_penguins_in_a_table                    |      4|get-answer      |     3|exact_match|↑  |0.4110|±  |0.0409|
| - bbh_cot_fewshot_reasoning_about_colored_objects        |      4|get-answer      |     3|exact_match|↑  |0.1880|±  |0.0248|
| - bbh_cot_fewshot_ruin_names                             |      4|get-answer      |     3|exact_match|↑  |0.4800|±  |0.0317|
| - bbh_cot_fewshot_salient_translation_error_detection    |      4|get-answer      |     3|exact_match|↑  |0.4760|±  |0.0316|
| - bbh_cot_fewshot_snarks                                 |      4|get-answer      |     3|exact_match|↑  |0.2921|±  |0.0342|
| - bbh_cot_fewshot_sports_understanding                   |      4|get-answer      |     3|exact_match|↑  |0.6760|±  |0.0297|
| - bbh_cot_fewshot_temporal_sequences                     |      4|get-answer      |     3|exact_match|↑  |0.5880|±  |0.0312|
| - bbh_cot_fewshot_tracking_shuffled_objects_five_objects |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_tracking_shuffled_objects_seven_objects|      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_tracking_shuffled_objects_three_objects|      4|get-answer      |     3|exact_match|↑  |0.8280|±  |0.0239|
| - bbh_cot_fewshot_web_of_lies                            |      4|get-answer      |     3|exact_match|↑  |0.6560|±  |0.0301|
| - bbh_cot_fewshot_word_sorting                           |      4|get-answer      |     3|exact_match|↑  |0.1400|±  |0.0220|
|gsm8k                                                     |      3|flexible-extract|     5|exact_match|↑  |0.7998|±  |0.0110|
|                                                          |       |strict-match    |     5|exact_match|↑  |0.7392|±  |0.0121|
|hellaswag                                                 |      1|none            |     0|acc        |↑  |0.4882|±  |0.0050|
|                                                          |       |none            |     0|acc_norm   |↑  |0.6165|±  |0.0049|
|mmlu_pro                                                  |      2|custom-extract  |      |exact_match|↑  |0.4978|±  |0.0044|
| - biology                                                |      3|custom-extract  |     5|exact_match|↑  |0.6848|±  |0.0174|
| - business                                               |      3|custom-extract  |     5|exact_match|↑  |0.5729|±  |0.0176|
| - chemistry                                              |      3|custom-extract  |     5|exact_match|↑  |0.5380|±  |0.0148|
| - computer_science                                       |      3|custom-extract  |     5|exact_match|↑  |0.5878|±  |0.0243|
| - economics                                              |      3|custom-extract  |     5|exact_match|↑  |0.5592|±  |0.0171|
| - engineering                                            |      3|custom-extract  |     5|exact_match|↑  |0.2405|±  |0.0137|
| - health                                                 |      3|custom-extract  |     5|exact_match|↑  |0.4670|±  |0.0175|
| - history                                                |      3|custom-extract  |     5|exact_match|↑  |0.3727|±  |0.0248|
| - law                                                    |      3|custom-extract  |     5|exact_match|↑  |0.2525|±  |0.0131|
| - math                                                   |      3|custom-extract  |     5|exact_match|↑  |0.7158|±  |0.0123|
| - other                                                  |      3|custom-extract  |     5|exact_match|↑  |0.4351|±  |0.0163|
| - philosophy                                             |      3|custom-extract  |     5|exact_match|↑  |0.4128|±  |0.0221|
| - physics                                                |      3|custom-extract  |     5|exact_match|↑  |0.5142|±  |0.0139|
| - psychology                                             |      3|custom-extract  |     5|exact_match|↑  |0.5602|±  |0.0176|
|truthfulqa_mc2                                            |      3|none            |     0|acc        |↑  |0.5666|±  |0.0162|
|winogrande                                                |      1|none            |     0|acc        |↑  |0.6385|±  |0.0135|


### Evaluation results for aisquared/bolt-instruct-32b:

|                          Tasks                           |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|----------------------------------------------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|arc_challenge                                             |      1|none            |     0|acc        |↑  |0.5776|±  |0.0144|
|                                                          |       |none            |     0|acc_norm   |↑  |0.6007|±  |0.0143|
|arc_easy                                                  |      1|none            |     0|acc        |↑  |0.8333|±  |0.0076|
|                                                          |       |none            |     0|acc_norm   |↑  |0.8228|±  |0.0078|
|bbh                                                       |      3|get-answer      |      |exact_match|↑  |0.3087|±  |0.0048|
| - bbh_cot_fewshot_boolean_expressions                    |      4|get-answer      |     3|exact_match|↑  |0.5760|±  |0.0313|
| - bbh_cot_fewshot_causal_judgement                       |      4|get-answer      |     3|exact_match|↑  |0.5882|±  |0.0361|
| - bbh_cot_fewshot_date_understanding                     |      4|get-answer      |     3|exact_match|↑  |0.6640|±  |0.0299|
| - bbh_cot_fewshot_disambiguation_qa                      |      4|get-answer      |     3|exact_match|↑  |0.1920|±  |0.0250|
| - bbh_cot_fewshot_dyck_languages                         |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_formal_fallacies                       |      4|get-answer      |     3|exact_match|↑  |0.0480|±  |0.0135|
| - bbh_cot_fewshot_geometric_shapes                       |      4|get-answer      |     3|exact_match|↑  |0.2760|±  |0.0283|
| - bbh_cot_fewshot_hyperbaton                             |      4|get-answer      |     3|exact_match|↑  |0.3200|±  |0.0296|
| - bbh_cot_fewshot_logical_deduction_five_objects         |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_logical_deduction_seven_objects        |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_logical_deduction_three_objects        |      4|get-answer      |     3|exact_match|↑  |0.5400|±  |0.0316|
| - bbh_cot_fewshot_movie_recommendation                   |      4|get-answer      |     3|exact_match|↑  |0.6000|±  |0.0310|
| - bbh_cot_fewshot_multistep_arithmetic_two               |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_navigate                               |      4|get-answer      |     3|exact_match|↑  |0.0160|±  |0.0080|
| - bbh_cot_fewshot_object_counting                        |      4|get-answer      |     3|exact_match|↑  |0.5120|±  |0.0317|
| - bbh_cot_fewshot_penguins_in_a_table                    |      4|get-answer      |     3|exact_match|↑  |0.2945|±  |0.0379|
| - bbh_cot_fewshot_reasoning_about_colored_objects        |      4|get-answer      |     3|exact_match|↑  |0.2280|±  |0.0266|
| - bbh_cot_fewshot_ruin_names                             |      4|get-answer      |     3|exact_match|↑  |0.5120|±  |0.0317|
| - bbh_cot_fewshot_salient_translation_error_detection    |      4|get-answer      |     3|exact_match|↑  |0.5440|±  |0.0316|
| - bbh_cot_fewshot_snarks                                 |      4|get-answer      |     3|exact_match|↑  |0.7079|±  |0.0342|
| - bbh_cot_fewshot_sports_understanding                   |      4|get-answer      |     3|exact_match|↑  |0.4880|±  |0.0317|
| - bbh_cot_fewshot_temporal_sequences                     |      4|get-answer      |     3|exact_match|↑  |0.3120|±  |0.0294|
| - bbh_cot_fewshot_tracking_shuffled_objects_five_objects |      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_tracking_shuffled_objects_seven_objects|      4|get-answer      |     3|exact_match|↑  |0.0000|±  |0.0000|
| - bbh_cot_fewshot_tracking_shuffled_objects_three_objects|      4|get-answer      |     3|exact_match|↑  |0.6280|±  |0.0306|
| - bbh_cot_fewshot_web_of_lies                            |      4|get-answer      |     3|exact_match|↑  |0.4400|±  |0.0315|
| - bbh_cot_fewshot_word_sorting                           |      4|get-answer      |     3|exact_match|↑  |0.0280|±  |0.0105|
|gsm8k                                                     |      3|flexible-extract|     5|exact_match|↑  |0.8795|±  |0.0090|
|                                                          |       |strict-match    |     5|exact_match|↑  |0.7801|±  |0.0114|
|hellaswag                                                 |      1|none            |     0|acc        |↑  |0.5407|±  |0.0050|
|                                                          |       |none            |     0|acc_norm   |↑  |0.6763|±  |0.0047|
|mmlu_pro                                                  |      2|custom-extract  |      |exact_match|↑  |0.6340|±  |0.0042|
| - biology                                                |      3|custom-extract  |     5|exact_match|↑  |0.8117|±  |0.0146|
| - business                                               |      3|custom-extract  |     5|exact_match|↑  |0.6907|±  |0.0165|
| - chemistry                                              |      3|custom-extract  |     5|exact_match|↑  |0.6431|±  |0.0142|
| - computer_science                                       |      3|custom-extract  |     5|exact_match|↑  |0.6951|±  |0.0228|
| - economics                                              |      3|custom-extract  |     5|exact_match|↑  |0.7405|±  |0.0151|
| - engineering                                            |      3|custom-extract  |     5|exact_match|↑  |0.3447|±  |0.0153|
| - health                                                 |      3|custom-extract  |     5|exact_match|↑  |0.6540|±  |0.0166|
| - history                                                |      3|custom-extract  |     5|exact_match|↑  |0.5512|±  |0.0255|
| - law                                                    |      3|custom-extract  |     5|exact_match|↑  |0.3860|±  |0.0147|
| - math                                                   |      3|custom-extract  |     5|exact_match|↑  |0.7979|±  |0.0109|
| - other                                                  |      3|custom-extract  |     5|exact_match|↑  |0.6028|±  |0.0161|
| - philosophy                                             |      3|custom-extract  |     5|exact_match|↑  |0.5912|±  |0.0220|
| - physics                                                |      3|custom-extract  |     5|exact_match|↑  |0.6551|±  |0.0132|
| - psychology                                             |      3|custom-extract  |     5|exact_match|↑  |0.7243|±  |0.0158|
|truthfulqa_mc2                                            |      3|none            |     0|acc        |↑  |0.6906|±  |0.0153|
|winogrande                                                |      1|none            |     0|acc        |↑  |0.6630|±  |0.0133|

---

# Limitations

- May hallucinate without grounding  
- Performance varies by model size  
- Not suitable for high-risk domains without oversight  

---

# License

Bolt Instruct is released under the [AI Squared Community License](https://docs.squared.ai/terms-of-use).