Create .eval_results/gpqa.yaml
- dataset:
    id: Idavidrein/gpqa
    task_id: diamond
  value: 12.5
  date: '2026-04-06'
  source:
    url: https://huggingface.co/OrionLLM/GRM-2.5-Air/
    name: Official GRM-2.5 Benchmark
    user: DedeProGames