---
license: mit
language:
- en
tags:
- mechanistic-interpretability
- sparse-autoencoder
- cross-layer-transcoder
- qwen2.5-vl
- vision-language-model
- circuit-tracer
library_name: safetensors
pipeline_tag: feature-extraction
---

# Qwen2.5-VL-7B CLTs (circuit-tracer format)

This repository contains Cross-Layer Transcoders (CLTs) trained on Qwen2.5-VL-7B-Instruct, packaged in a circuit-tracer-compatible format.

## Usage with circuit-tracer

```python
from circuit_tracer import ReplacementModel

model = ReplacementModel.from_pretrained(
    model_name="Qwen/Qwen2.5-VL-7B-Instruct",
    transcoder_set="KokosDev/qwen2p5vl-7b-clt",
)
```

Or use the convenience shortcut:

```python
import torch

from circuit_tracer.vlm import VLModelWrapper

model = VLModelWrapper.from_pretrained(
    'Qwen/Qwen2.5-VL-7B-Instruct',
    transcoder_set='qwen',  # Shortcut for this repo
    dtype=torch.bfloat16
)
```

## Model Details

- **Architecture**: Cross-Layer Transcoders (CLTs)
- **Base Model**: Qwen/Qwen2.5-VL-7B-Instruct
- **Hidden Dimension**: 3584
- **Feature Dimension**: 8192
- **Layers**: 27 (layers 0-26)
- **Sparsity**: ~12% L0
- **Training Steps**: 5000
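
To make the numbers above concrete, the following sketch derives a few quantities from them. It assumes that "~12% L0" means roughly 12% of a layer's features are active per token; the card does not define the term more precisely, so treat the result as an estimate.

```python
# Values copied from the Model Details list.
HIDDEN_DIM = 3584
FEATURE_DIM = 8192
NUM_LAYERS = 27

# Assumption: "~12% L0" is the fraction of features active per token.
L0_FRACTION = 0.12

active_per_layer = round(L0_FRACTION * FEATURE_DIM)  # expected active features
expansion = FEATURE_DIM / HIDDEN_DIM                 # dictionary size vs. hidden size
total_features = FEATURE_DIM * NUM_LAYERS            # features across all layers

print(f"active features per layer: ~{active_per_layer}")
print(f"expansion factor: {expansion:.2f}x")
print(f"total features: {total_features}")
```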

## Format

- `layer_*.safetensors`: Transcoder weights for each layer
- `config.yaml`: Configuration for circuit-tracer
- Uses safetensors format for fast loading
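
To check what a downloaded `layer_*.safetensors` file contains without loading any weights, one option is to read only its header. The sketch below uses just the Python standard library and relies on the safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names `read_safetensors_header` and `summarize_layers` are illustrative and are not part of circuit-tracer or this repository.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} without loading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the JSON header length (unsigned 64-bit, little-endian).
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map, not a tensor entry.
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }


def summarize_layers(directory):
    """Map each layer_*.safetensors file name in `directory` to its tensor summary."""
    return {
        p.name: read_safetensors_header(p)
        for p in sorted(Path(directory).glob("layer_*.safetensors"))
    }
```

Calling `summarize_layers(".")` in a local copy of the repository would list the tensor names, dtypes, and shapes stored for each layer.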

## Training Details

- **Optimizer**: AdamW
- **Learning Rate**: 3e-4
- **Scheduler**: Cosine
- **Target L0**: 0.12
- **Validation Loss**: 10.3 - 19.1
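
For reference, a cosine schedule with the learning rate above can be sketched as follows. The peak rate (3e-4) and step count (5000) come from this card; the absence of warmup and the final rate of 0 are assumptions, since the card does not state them, so this may not match the actual training run.

```python
import math

PEAK_LR = 3e-4      # from the card
TOTAL_STEPS = 5000  # from the card
MIN_LR = 0.0        # assumption: the card does not state a floor


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, min_lr=MIN_LR):
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps (no warmup)."""
    step = min(max(step, 0), total_steps)  # clamp to the schedule range
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for s in (0, 1250, 2500, 3750, 5000):
    print(s, f"{cosine_lr(s):.2e}")
```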

## Citation

If you use these transcoders in your research, please cite:

```bibtex
@misc{qwen2p5vl7b-clt,
  author = {KokosDev},
  title = {Qwen2.5-VL-7B Cross-Layer Transcoders},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/KokosDev/qwen2p5vl-7b-clt}
}
```