LinWeizheDragon/ColBERT-v2

Passage Retrieval


LinWeizheDragon committed on Feb 27, 2024

Commit 2aee0f7 · verified · 1 parent: 7ac7d85

Update README.md

Files changed (1)

README.md +29 -0


@@ -55,6 +55,35 @@ This model can be used combined with language models to create a retrieval-augme
 For details of training, indexing, and performing retrieval, please refer to [here](https://github.com/LinWeizheDragon/FLMR).
+1. Install the [FLMR package](https://github.com/LinWeizheDragon/FLMR).
+2. A simple example use of this model:
+```python
+from flmr import FLMRConfig, FLMRModelForRetrieval, FLMRQueryEncoderTokenizer, FLMRContextEncoderTokenizer
+checkpoint_path = "LinWeizheDragon/ColBERT-v2"
+query_tokenizer = FLMRQueryEncoderTokenizer.from_pretrained(checkpoint_path, subfolder="query_tokenizer")
+context_tokenizer = FLMRContextEncoderTokenizer.from_pretrained(checkpoint_path, subfolder="context_tokenizer")
+model = FLMRModelForRetrieval.from_pretrained(checkpoint_path,
+                                query_tokenizer=query_tokenizer,
+                                context_tokenizer=context_tokenizer,
+                                )
+Q_encoding = query_tokenizer(["What is the capital of France?", "What is the capital of China?"])
+D_encoding = context_tokenizer(["Paris is the capital of France.", "Beijing is the capital of China.",
+                            "Paris is the capital of France.", "Beijing is the capital of China."])
+inputs = dict(
+    query_input_ids=Q_encoding['input_ids'],
+    query_attention_mask=Q_encoding['attention_mask'],
+    context_input_ids=D_encoding['input_ids'],
+    context_attention_mask=D_encoding['attention_mask'],
+    use_in_batch_negatives=True,
+)
+res = model.forward(**inputs)
+```
 ## Training Details