LExT: Towards Evaluating Trustworthiness of Natural Language Explanations

https://doi.org/10.48550/arXiv.2504.06227

Authors

Krithi Shailya, Shreya Rajpal, Gokul S Krishnan, Balaraman Ravindran

Published In

arXiv

Tags

We propose a framework for quantifying the trustworthiness of natural language explanations. The framework balances plausibility and faithfulness to derive a Language Explanation Trustworthiness Score (LExT). We apply it in healthcare settings and compare general-purpose and domain-specific models.
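
To make the idea of "balancing" two component scores concrete, here is a minimal sketch. It assumes that plausibility and faithfulness have each already been normalized to [0, 1] and that they are combined with a harmonic mean, which penalizes an explanation that is strong on one axis but weak on the other. The summary above does not state the combination rule, so the function name `balanced_score` and the harmonic-mean choice are illustrative assumptions, not a confirmed description of how LExT is computed.

```python
def balanced_score(plausibility: float, faithfulness: float) -> float:
    """Combine two scores in [0, 1] with a harmonic mean.

    Hypothetical helper: the harmonic mean is an assumed combination
    rule chosen for illustration, not taken from the listing above.
    """
    for name, value in (("plausibility", plausibility),
                        ("faithfulness", faithfulness)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    total = plausibility + faithfulness
    if total == 0.0:
        # Both components are zero; avoid dividing by zero.
        return 0.0
    return 2.0 * plausibility * faithfulness / total


if __name__ == "__main__":
    # A plausible-sounding but unfaithful explanation scores well below
    # the arithmetic mean of its two components.
    print(balanced_score(0.9, 0.3))
```

Under this assumption, a model cannot reach a high combined score by producing explanations that merely sound convincing: a low faithfulness value pulls the result down regardless of how high plausibility is.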