Inference Providers
Active filters: vllm
Each entry lists the model ID, pipeline task (where tagged), parameter count, download count, and like count.
RedHatAI/Meta-Llama-3.1-70B-Instruct-FP8-dynamic · Text Generation · 71B · 3.17k downloads · 7 likes
RedHatAI/Meta-Llama-3.1-70B-Instruct-FP8 · Text Generation · 71B · 12.1k downloads · 51 likes
RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8 · Text Generation · 406B · 1.98k downloads · 31 likes
RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8-dynamic · Text Generation · 406B · 7.29k downloads · 15 likes
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a16 · Text Generation · 3B · 636 downloads · 12 likes
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 · Text Generation · 8B · 8.73k downloads · 20 likes
mistralai/Mistral-Large-Instruct-2407 · 7.15k downloads · 859 likes
mgoin/Nemotron-4-340B-Base-hf · Text Generation · 341B · 9 downloads · 1 like
mgoin/Nemotron-4-340B-Base-hf-FP8 · Text Generation · 341B · 32 downloads · 2 likes
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a16 · Text Generation · 19B · 75 downloads · 5 likes
mgoin/Nemotron-4-340B-Instruct-hf · Text Generation · 341B · 18 downloads · 4 likes
mgoin/Nemotron-4-340B-Instruct-hf-FP8 · Text Generation · 341B · 75 downloads · 3 likes
FlorianJc/ghost-8b-beta-vllm-fp8 · Text Generation · 8B · 4 downloads
FlorianJc/Meta-Llama-3.1-8B-Instruct-vllm-fp8 · Text Generation · 8B · 8 downloads
RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16 · Text Generation · 8B · 36.7k downloads · 30 likes
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a8 · Text Generation · 71B · 4k downloads · 21 likes
RedHatAI/Meta-Llama-3.1-8B-FP8 · Text Generation · 8B · 225k downloads · 10 likes
RedHatAI/Meta-Llama-3.1-70B-FP8 · Text Generation · 71B · 104 downloads · 2 likes
RedHatAI/Meta-Llama-3.1-8B-quantized.w8a16 · Text Generation · 3B · 26 downloads · 1 like
RedHatAI/Meta-Llama-3.1-8B-quantized.w8a8 · Text Generation · 8B · 1.14k downloads · 5 likes
RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w4a16 · Text Generation · 71B · 6.64k downloads · 32 likes
RedHatAI/starcoder2-15b-FP8 · Text Generation · 16B · 28 downloads
RedHatAI/starcoder2-7b-FP8 · Text Generation · 7B · 7 downloads
RedHatAI/starcoder2-3b-FP8 · Text Generation · 3B · 19 downloads
RedHatAI/Meta-Llama-3.1-405B-FP8 · Text Generation · 410B · 13 downloads
bprice9/Palmyra-Medical-70B-FP8 · Text Generation · 71B · 17 downloads · 1 like
RedHatAI/gemma-2-2b-it-FP8 · 3B · 238 downloads · 1 like
RedHatAI/Meta-Llama-3.1-405B-Instruct-quantized.w4a16 · Text Generation · 58B · 115 downloads · 12 likes
RedHatAI/gemma-2-9b-it-quantized.w8a16 · Text Generation · 4B · 63 downloads · 1 like
RedHatAI/gemma-2-2b-it-quantized.w8a16 · Text Generation · 2B · 11 downloads · 1 like
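The suffixes in these model IDs encode the quantization scheme: `FP8` marks 8-bit floating-point checkpoints (with `-dynamic` variants computing activation scales at runtime), while the `wNaM` pattern used by the RedHatAI/Neural Magic uploads denotes N-bit integer weights with M-bit activations (e.g. `w4a16` is 4-bit weights, 16-bit activations). As a minimal sketch of decoding that convention, the helper below (`quant_scheme` is a hypothetical name, and the suffix rules are assumptions about uploader naming habits, not a Hub-enforced standard):

```python
import re

def quant_scheme(model_id: str) -> str:
    """Guess the quantization scheme from a model ID's suffix.

    Assumed naming conventions (common but not enforced by the Hub):
      - wNaM           -> N-bit integer weights, M-bit activations
      - FP8-dynamic    -> FP8 weights, activation scales computed at runtime
      - FP8            -> FP8 weights and activations (static scales)
    """
    name = model_id.rsplit("/", 1)[-1]  # drop the org/user prefix
    m = re.search(r"w(\d+)a(\d+)", name)
    if m:
        return f"INT{m.group(1)} weights, {m.group(2)}-bit activations"
    if re.search(r"fp8-dynamic", name, re.IGNORECASE):
        return "FP8 weights, dynamic activation scales"
    if re.search(r"fp8", name, re.IGNORECASE):
        return "FP8 weights and activations"
    return "unquantized or unknown"

print(quant_scheme("RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w4a16"))
# -> INT4 weights, 16-bit activations
```

Checking `wNaM` before the FP8 patterns matters, since a name could in principle carry both markers and the integer scheme is the more specific one.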