License and commercial use

This model redistributes DINO Materials under the DINOv3 License Agreement. Commercial use is permitted provided you comply with that agreement and with applicable export and trade control laws. Full terms: LICENSE.md, TERMS_OF_USE.md.

DINOv3 ViT-L/16

Original model by Meta AI: facebookresearch/dinov3

Vision backbone for dense visual features (ViT-L, patch 16). Built with DINOv3.

Model Card

This repository hosts DINOv3 ViT-L/16 pretrained on LVD-1689M: a Vision Transformer (ViT-L, patch size 16) distilled from the DINOv3 ViT-7B teacher. It produces dense visual features suitable for classification, retrieval, segmentation, and other vision tasks without fine-tuning.

Model Details

This model takes an image as input and returns a class token, patch tokens, and register tokens. For a 224×224 image, the patch grid is 14×14 (224 / 16 = 14), giving 1 class token + 4 register tokens + 196 patch tokens = 201 tokens. Larger inputs are supported provided both dimensions are multiples of 16; otherwise the image is cropped to the nearest smaller multiple of 16 in each dimension.
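The token arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not part of the DINOv3 codebase: the helper names (`crop_to_patch_multiple`, `token_count`, `split_tokens`) are hypothetical, and the token order assumed in `split_tokens` (class token first, then registers, then patches) is an assumption that should be verified against the implementation you load.

```python
PATCH_SIZE = 16     # patch size stated in this model card
NUM_REGISTERS = 4   # register tokens stated in this model card


def crop_to_patch_multiple(height, width, patch=PATCH_SIZE):
    """Round each dimension down to the nearest multiple of the patch size."""
    return (height // patch) * patch, (width // patch) * patch


def token_count(height, width, patch=PATCH_SIZE, registers=NUM_REGISTERS):
    """Total tokens: 1 class token + register tokens + one token per patch."""
    h, w = crop_to_patch_multiple(height, width, patch)
    return 1 + registers + (h // patch) * (w // patch)


def split_tokens(tokens, registers=NUM_REGISTERS):
    """Split a token sequence into (class, registers, patches).

    ASSUMPTION: tokens are ordered [class, registers..., patches...].
    """
    return tokens[0], tokens[1:1 + registers], tokens[1 + registers:]


print(token_count(224, 224))             # 201, as stated in the card
print(crop_to_patch_multiple(230, 500))  # (224, 496)
```

For example, a 230×500 input would be cropped to 224×496, which gives a 14×31 patch grid.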

Model Description

Original model: Meta AI (DINOv3)
Model type: Vision Transformer (ViT-L/16)
License: DINOv3 License

Model Sources

Repository: https://github.com/facebookresearch/dinov3
Paper: https://arxiv.org/abs/2508.10104

Uses

This model is a vision backbone providing multi-purpose features for downstream tasks.

Direct Use

The model can be used without fine-tuning, with downstream classifiers as simple as linear layers, to obtain competitive results:

on image classification, using k-NN classifiers on the class token
on image classification, with logistic regression classifiers applied on the class token
on image classification, with a linear layer applied on the class token and the average of the patch tokens
on image retrieval using nearest neighbors
on geometric and semantic 3D keypoint correspondences
on depth estimation and semantic segmentation, using linear layers
on unsupervised object discovery
on video segmentation tracking
on video classification, using a small 4-layer attentive probe
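As an illustration of how lightweight these downstream heads can be, the sketch below shows a k-NN classifier over class-token features and the "class token plus average patch token" feature used for linear probing. It is a toy example under stated assumptions: the function names are hypothetical, the 2-dimensional vectors stand in for real 1024-dimensional features, and cosine similarity with majority voting is one common choice rather than the evaluation protocol from the paper.

```python
import math
from collections import Counter


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def knn_predict(query, bank_features, bank_labels, k=3):
    """Majority vote among the k bank features most similar to the query."""
    scored = sorted(
        zip((cosine(query, f) for f in bank_features), bank_labels),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top_labels = [label for _, label in scored[:k]]
    return Counter(top_labels).most_common(1)[0][0]


def cls_plus_mean_patch(cls_token, patch_tokens):
    """Concatenate the class token with the average of the patch tokens."""
    dim = len(cls_token)
    mean_patch = [
        sum(p[i] for p in patch_tokens) / len(patch_tokens) for i in range(dim)
    ]
    return list(cls_token) + mean_patch


# Toy feature bank: stand-ins for class tokens extracted from labeled images.
bank = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["cat", "cat", "dog", "dog"]
print(knn_predict([0.8, 0.2], bank, labels, k=3))  # cat
```

The same similarity ranking, without the vote, gives nearest-neighbor image retrieval.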

Downstream Use

Fine-tuning can yield additional gains but is optional; the frozen features are typically strong out of the box.

Bias, Risks, and Limitations

Compared to DINOv2 and SEERv2, DINOv3 delivers relatively consistent performance across income categories on geographical fairness and diversity, but performance still drops notably in the low-income bucket relative to the highest-income bucket.

DINOv3 also achieves relatively good scores across different regions, improving over its predecessor DINOv2. However, a performance gap between Europe and Africa remains.

Evaluation

Representative results for DINOv3 ViT-L/16 (LVD-1689M) from the paper:

Model	IN-ReaL	IN-R	Obj.Net	Ox.-H	ADE20k	NYU↓	DAVIS	NAVI	SPair
DINOv3 ViT-L/16	90.2	88.1	74.8	63.1	54.9	0.352	79.9	62.3	61.3

For NYU, lower is better (↓); for all other columns, higher is better. See the paper for evaluation protocols and full benchmarks.

Technical Specifications

Architecture: ViT-L (300M parameters), patch size 16, embedding dimension 1024, 4 register tokens, 16 heads, MLP FFN, RoPE
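A quick sanity check of these numbers: the per-head dimension follows directly from the embedding dimension and head count, and a rough weight count for the transformer blocks lands near the stated 300M. The sketch below is illustrative only. The class name `ViTLSketch` is hypothetical, and the depth of 24 blocks and MLP expansion ratio of 4 are assumptions based on the standard ViT-L layout, since this card does not state them. Biases, norms, and the patch embedding are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTLSketch:
    embed_dim: int = 1024   # from this model card
    num_heads: int = 16     # from this model card
    patch_size: int = 16    # from this model card
    num_registers: int = 4  # from this model card
    depth: int = 24         # ASSUMPTION: standard ViT-L depth, not stated here
    mlp_ratio: int = 4      # ASSUMPTION: standard MLP expansion ratio

    @property
    def head_dim(self):
        """Channels handled by each attention head."""
        return self.embed_dim // self.num_heads

    def approx_block_params(self):
        """Rough weight count of all transformer blocks (matrices only)."""
        d = self.embed_dim
        attention = 4 * d * d            # Q, K, V, and output projections
        mlp = 2 * self.mlp_ratio * d * d  # up- and down-projection
        return self.depth * (attention + mlp)


cfg = ViTLSketch()
print(cfg.head_dim)               # 64
print(cfg.approx_block_params())  # 301989888, roughly 300M
```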

More Information

Citation

BibTeX

@misc{simeoni2025dinov3,
  title={{DINOv3}},
  author={Sim{\'e}oni, Oriane and Vo, Huy V. and Seitzer, Maximilian and Baldassarre, Federico and Oquab, Maxime and Jose, Cijo and Khalidov, Vasil and Szafraniec, Marc and Yi, Seungeun and Ramamonjisoa, Micha{\"e}l and Massa, Francisco and Haziza, Daniel and Wehrstedt, Luca and Wang, Jianyuan and Darcet, Timoth{\'e}e and Moutakanni, Th{\'e}o and Sentana, Leonel and Roberts, Claire and Vedaldi, Andrea and Tolan, Jamie and Brandt, John and Couprie, Camille and Mairal, Julien and J{\'e}gou, Herv{\'e} and Labatut, Patrick and Bojanowski, Piotr},
  year={2025},
  eprint={2508.10104},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.10104},
}

DINOv3 by Meta. Use subject to the DINOv3 License.
