[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA
benchmark
evaluation
dataset
openai
hallucination
huggingface
huggingface-transformers
gpt-3
openai-api
hallucinations
gpt-4
large-language-models
llm
chatgpt
chatglm
qwen
hallucination-evaluation
hallucination-detection
Updated Sep 24, 2024 - JavaScript