Active filters: vllm
FlorianJc/MegaBeam-Mistral-7B-300k-vllm-fp8 • Text Generation • 7B • Updated • 4
RedHatAI/gemma-2-9b-it-FP8 • Text Generation • 9B • Updated • 1.26k • 5
mistralai/Mathstral-7B-v0.1 • 7B • Updated • 1.23k • 242
mistralai/Mamba-Codestral-7B-v0.1 • 7B • Updated • 27.6k • 614
RedHatAI/Qwen2-57B-A14B-Instruct-FP8 • Text Generation • 57B • Updated • 732 • 1
nm-testing/Meta-Llama-3-8B-Instruct-FP8-K-V • Text Generation • 8B • Updated • 5
RedHatAI/DeepSeek-Coder-V2-Lite-Instruct-FP8 • Text Generation • 16B • Updated • 79.3k • 11
RedHatAI/DeepSeek-Coder-V2-Lite-Base-FP8 • Text Generation • 16B • Updated • 31
mistralai/Mistral-Nemo-Base-2407 • 12B • Updated • 56.3k • 342
mgoin/Mistral-Nemo-Instruct-2407-FP8-Dynamic • Text Generation • 12B • Updated • 124
mgoin/Mistral-Nemo-Instruct-2407-FP8-KV • Text Generation • 12B • Updated • 1
RedHatAI/Mistral-Nemo-Instruct-2407-FP8 • Text Generation • 12B • Updated • 1.47k • 18
FlorianJc/Mistral-Nemo-Instruct-2407-vllm-fp8 • Text Generation • 12B • Updated • 93 • 8
RedHatAI/DeepSeek-Coder-V2-Base-FP8 • Text Generation • 236B • Updated • 15
RedHatAI/DeepSeek-Coder-V2-Instruct-FP8 • Text Generation • 236B • Updated • 376 • 7
mgoin/Minitron-4B-Base-FP8 • Text Generation • 4B • Updated • 5 • 3
mgoin/Minitron-8B-Base-FP8 • Text Generation • 8B • Updated • 2 • 3
mgoin/nemotron-3-8b-chat-4k-sft-hf • Text Generation • 9B • Updated • 11
RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8 • Text Generation • 8B • Updated • 341k • 44
RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic • Text Generation • 8B • Updated • 38.6k • 9
RedHatAI/Meta-Llama-3.1-70B-Instruct-FP8-dynamic • Text Generation • 71B • Updated • 2.64k • 7
RedHatAI/Meta-Llama-3.1-70B-Instruct-FP8 • Text Generation • 71B • Updated • 11.7k • 51
RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8 • Text Generation • 406B • Updated • 1.93k • 31
RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8-dynamic • Text Generation • 406B • Updated • 7.09k • 15
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a16 • Text Generation • 3B • Updated • 519 • 12
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 • Text Generation • 8B • Updated • 7.92k • 20
mistralai/Mistral-Large-Instruct-2407 • Updated • 7.02k • 859
mgoin/Nemotron-4-340B-Base-hf • Text Generation • 341B • Updated • 9 • 1
mgoin/Nemotron-4-340B-Base-hf-FP8 • Text Generation • 341B • Updated • 33 • 2
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a16 • Text Generation • 19B • Updated • 63 • 5