---
license: apache-2.0
base_model:
- Qwen/Qwen3-VL-2B-Instruct
language:
- en
new_version: goodman2001/colqwen3-v0.1
pipeline_tag: visual-document-retrieval
---

# ColQwen3: Visual Retriever based on Qwen3-VL-2B-Instruct with ColBERT strategy

### source code: [Mungeryang/colqwen3](https://github.com/Mungeryang/colqwen3)

ColQwen3 is a visual retriever that combines a Vision Language Model (VLM) architecture with a dedicated training strategy to efficiently index documents from their visual features.
It is a [Qwen3-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)-style multi-vector representations of text and images.
The approach was introduced in the paper [ColPali: Efficient Document Retrieval with Vision Language Models](https://arxiv.org/abs/2407.01449) and first released in [this repository](https://github.com/ManuelFay/colpali).
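
As a rough illustration of how ColBERT-style multi-vector representations are compared, the sketch below implements late-interaction (MaxSim) scoring in plain Python. It is not taken from the ColQwen3 source code: the function names and the tiny 2-dimensional toy vectors are assumptions made for the example, and real embeddings come from the model.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score between one query and one document page.

    For every query vector, keep its best (maximum) dot product over all
    document vectors, then sum these maxima over the query vectors.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy data (hypothetical): one vector per query token / page patch.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5], [0.5, 0.5]]

scores = {
    "page_a": maxsim_score(query, page_a),
    "page_b": maxsim_score(query, page_b),
}
best = max(scores, key=scores.get)  # page with the highest score
print(scores, best)
```

Each query vector is matched against its most similar page vector, so a page scores highly when every part of the query finds a good match somewhere on it.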

This is the untrained base version, published so that the projection layer is initialized deterministically.
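
To see why a fixed starting point matters, here is a minimal sketch in plain Python. It assumes a hypothetical `init_projection` helper, made-up dimensions, and an illustrative standard deviation of 0.02; none of these come from the ColQwen3 code. In practice, the published base checkpoint plays the role of the fixed seed, so every training run starts from the same projection weights.

```python
import random
from typing import List


def init_projection(in_dim: int, out_dim: int, seed: int) -> List[List[float]]:
    """Return an out_dim x in_dim weight matrix drawn from a seeded generator.

    A private random.Random instance is used so the result depends only on
    the seed, not on any global random state.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]


first = init_projection(4, 2, seed=0)
second = init_projection(4, 2, seed=0)  # same seed: identical weights
other = init_projection(4, 2, seed=1)   # different seed: different weights

print(first == second)
print(first == other)
```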


<p align="center"><img width=800 src="https://github.com/illuin-tech/colpali/blob/main/assets/colpali_architecture.webp?raw=true"/></p>


## Usage

> [!WARNING]
> This version should not be used directly: it is only the base checkpoint, intended for deterministic LoRA initialization.


## Contact

- Mungeryang: mungerygm@gmail.com / yangguimiao@iie.ac.cn


## Acknowledgments

❤️❤️❤️

> [!NOTE]
> Thanks to the **ColPali team** and the **Qwen team** for their excellent open-source work!
> I accomplished this work by **standing on the shoulders of giants~**

<p align="center">
    <img src="https://cdn.mos.cms.futurecdn.net/pqHroHNqYyQoJvEPrYkbcj-1200-80.jpg" width="80%"/>
</p>


## Citation

If you use any datasets or models from this organization in your research, please cite the original ColPali paper as follows:

```bibtex
@misc{faysse2024colpaliefficientdocumentretrieval,
  title={ColPali: Efficient Document Retrieval with Vision Language Models}, 
  author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
  year={2024},
  eprint={2407.01449},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2407.01449}, 
}
```