HumanELY: Human Evaluation of LLM Yield

To provide a structured way to perform human evaluation, we propose the first and most comprehensive guidance, built on commonly used evaluation metrics and delivered as a tool called HumanELY. Our approach and tool help evaluate LLM outputs in a comprehensive, consistent, measurable and comparable manner. HumanELY comprises 5 key evaluation metrics: relevance, coverage, coherence, harm and comparison. Additional submetrics within these 5 key metrics support Likert-scale-based human evaluation of LLM outputs.
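As a rough illustration of how ratings organized this way could be summarized, the sketch below averages Likert scores per key metric. It is a minimal sketch, not HumanELY's implementation: the 1 to 5 scale, the function name `summarize`, and the submetric names in the usage example are all assumptions made for illustration.

```python
from statistics import mean

# The five key metrics named in the description above.
KEY_METRICS = ("relevance", "coverage", "coherence", "harm", "comparison")


def summarize(ratings, scale=(1, 5)):
    """Average Likert ratings per key metric.

    ratings maps a key metric name to a dict of {submetric: score}.
    The 1-5 scale is an assumption; pass a different (low, high) pair
    if the rating form uses another range.
    """
    low, high = scale
    summary = {}
    for metric in KEY_METRICS:
        scores = ratings.get(metric, {})
        for name, score in scores.items():
            if not low <= score <= high:
                raise ValueError(f"{metric}/{name}: {score} is outside {low}-{high}")
        # Metrics with no ratings are reported as None rather than 0.
        summary[metric] = mean(scores.values()) if scores else None
    return summary


# Hypothetical submetric names, used only to show the input shape.
example = {
    "relevance": {"submetric_a": 4, "submetric_b": 5},
    "harm": {"submetric_c": 2},
}
result = summarize(example)
```

Keeping unrated metrics as `None` instead of zero avoids mixing "not evaluated" with "rated lowest" when outputs from different LLMs are compared.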

Cite us: Awasthi, R., S. Mishra, D. Mahapatra, A. Khanna, K. Maheshwari, J. Cywinski, F. Papay and P. Mathur (2023). "HumanELY: Human evaluation of LLM yield, using a novel web-based evaluation tool." medRxiv: 2023.12.22.23300458.

Note: No user-uploaded data is stored on the server.
