A modern way to evaluate LLM output
Score what really matters.
Upload a CSV with reference and generated text. Score each row across the five HumanELY metrics on a 5-point Likert scale. Export publication-ready results in a click.
Press 1–5 to rate
Dark mode
CSV export
Start a new evaluation
Upload a CSV with reference (ground-truth) and LLM-generated text.
The 5 HumanELY metrics
Likert 1–5
Each metric has sub-items scored on a 5-point scale. Higher is always better.
Relevance
How well the response addresses the query in content, reasoning, and helpfulness.
Coverage
How completely the response covers the key topics and content from the reference.
Coherence
Fluency, grammar, and organization of the generated content.
Harm
Bias, toxicity, privacy, and hallucinations in the generated response.
Comparison
How the generated response compares to human-written text.
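As a rough illustration of how ratings for the five metrics above could be represented and checked, here is a minimal Python sketch. The metric names come from the list above; everything else is an assumption made for illustration: the helper names `metric_score` and `score_row`, the idea of storing each metric's sub-item ratings as a list of integers, and the choice to average sub-items into a single metric score. The tool itself may aggregate differently.

```python
from statistics import mean

# Metric names as listed on this page.
METRICS = ("Relevance", "Coverage", "Coherence", "Harm", "Comparison")


def metric_score(sub_item_ratings: list[int]) -> float:
    """Average one metric's sub-item ratings (averaging is an assumption)."""
    if not sub_item_ratings:
        raise ValueError("a metric needs at least one sub-item rating")
    for rating in sub_item_ratings:
        # Each rating must be a whole number on the 1-5 Likert scale.
        if isinstance(rating, bool) or not isinstance(rating, int):
            raise ValueError(f"rating must be an integer, got {rating!r}")
        if not 1 <= rating <= 5:
            raise ValueError(f"rating must be between 1 and 5, got {rating}")
    return float(mean(sub_item_ratings))


def score_row(ratings: dict[str, list[int]]) -> dict[str, float]:
    """Return one averaged score per metric for a single CSV row."""
    missing = [name for name in METRICS if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    return {name: metric_score(ratings[name]) for name in METRICS}
```

A call such as `score_row({"Relevance": [4, 5, 3], "Coverage": [4], "Coherence": [5, 5], "Harm": [5], "Comparison": [3, 4]})` would give one number per metric, and any rating outside 1 to 5 raises a `ValueError` rather than being silently accepted.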
Upload
CSV with reference + generated text
Score
5 metrics × Likert 1–5
Export
Download results CSV
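The upload, score, and export steps above can be sketched offline in a few lines of Python using only the standard `csv` module. This is an illustrative sketch, not the tool's actual implementation: the column names `reference` and `generated`, the function name `export_results`, and the one-column-per-metric output layout are all assumptions.

```python
import csv
import io

# Metric names as listed on this page.
METRICS = ["Relevance", "Coverage", "Coherence", "Harm", "Comparison"]


def export_results(input_csv: str, scores: list[dict[str, int]]) -> str:
    """Combine uploaded rows with their 1-5 ratings and return results CSV text.

    Assumes the input has 'reference' and 'generated' columns (hypothetical
    names) and that `scores` holds one rating dict per input row, in order.
    """
    rows = list(csv.DictReader(io.StringIO(input_csv)))
    if len(rows) != len(scores):
        raise ValueError("need exactly one set of scores per CSV row")

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["reference", "generated", *METRICS],
        lineterminator="\n",
    )
    writer.writeheader()
    for row, score in zip(rows, scores):
        record = {"reference": row["reference"], "generated": row["generated"]}
        for metric in METRICS:
            rating = score[metric]
            if not 1 <= rating <= 5:
                raise ValueError(f"{metric} rating must be 1-5, got {rating}")
            record[metric] = rating
        writer.writerow(record)
    return out.getvalue()
```

Working on strings through `io.StringIO` keeps the sketch free of file paths; the same reader and writer calls work on files opened with `newline=""`.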