Papers-LLMEval
Latxa: An Open Language Model and Evaluation Suite for Basque
Paper
• 2403.20266
• Published • 4
TrustLLM: Trustworthiness in Large Language Models
Paper
• 2401.05561
• Published • 69
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
• 2405.01535
• Published • 124
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Paper
• 2405.08707
• Published • 34
tinyBenchmarks: evaluating LLMs with fewer examples
Paper
• 2402.14992
• Published • 17
meta-llama/Llama-3.3-70B-Instruct-evals
Viewer
• Updated • 41.3k • 161
• 44
RUC-NLPIR/OmniEval-HallucinationEvaluator
Text Generation
• Updated • 1
KRLabsOrg/lettucedect-base-modernbert-en-v1
Token Classification
• 0.1B • Updated • 5.3k
• 17