Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 83 column 3
/home/h3c/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
2025-01-07 15:32:03.162136: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-01-07 15:32:03.173274: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-01-07 15:32:03.176658: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-07 15:32:03.185350: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-01-07 15:32:03.787266: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/h3c/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
/home/h3c/.local/lib/python3.10/site-packages/flash_attn/ops/triton/layer_norm.py:959: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
def forward(
/home/h3c/.local/lib/python3.10/site-packages/flash_attn/ops/triton/layer_norm.py:1018: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(ctx, dout, *args):
Traceback (most recent call last):
File "/home/h3c/Desktop/a.py", line 27, in <module>
text_embeddings = model.encode_text(sentences, truncate_dim=truncate_dim)
File "/home/h3c/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/h3c/.cache/huggingface/modules/transformers_modules/jinaai/jina-clip-implementation/51f02de9f2cf8afcd3bac4ce996859ba96f9f8e9/modeling_clip.py", line 565, in encode_text
self.tokenizer = self.get_tokenizer()
File "/home/h3c/.cache/huggingface/modules/transformers_modules/jinaai/jina-clip-implementation/51f02de9f2cf8afcd3bac4ce996859ba96f9f8e9/modeling_clip.py", line 333, in get_tokenizer
self.tokenizer = AutoTokenizer.from_pretrained(
File "/home/h3c/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 814, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/h3c/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2029, in from_pretrained
return cls._from_pretrained(
File "/home/h3c/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2261, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/h3c/.local/lib/python3.10/site-packages/transformers/models/xlm_roberta/tokenization_xlm_roberta_fast.py", line 155, in __init__
super().__init__(
File "/home/h3c/.local/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 111, in __init__
fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 83 column 3
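
The traceback ends inside `TokenizerFast.from_file`, so the failure happens while the `tokenizers` library parses the model's tokenizer file, before any text is encoded. One commonly reported cause of this kind of "did not match any variant of untagged enum" message is an installed `tokenizers` release that is older than the one that wrote the file. The log above does not confirm that this is the cause here, so treat it as an assumption to check. A minimal sketch for collecting the relevant package versions to include in a report (the helper name `installed_versions` and the package list are illustrative, not part of any library):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("tokenizers", "transformers", "torch", "huggingface_hub")):
    """Return a dict mapping each distribution name to its installed version.

    Packages that are not installed map to None. The default package list is
    an assumption based on the libraries that appear in the traceback.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print one "name: version" line per package for pasting into a bug report.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

If `tokenizers` turns out to be considerably older than `transformers`, upgrading both together (for example with `pip install -U tokenizers transformers`) is a reasonable first thing to try, though it may not resolve every instance of this error.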