Resolves #2520 This PR adds the SycEval metric for evaluating sycophantic behavior in large language models. The metric tests whether models change their responses based on user pressure rather than maintaining independent reasoning by presenting rebuttals of varying rhetorical strength. It is based on this paper https://arxiv.org/pdf/2502.08177 as linked in the issue.
Key Features:
Implementation:
SycEval
class with sync/async scoring methodsfrom opik.evaluation.metrics import SycEval
in SDK easily, I tried to follow the coding style of the project, and other things mentioned in the contributing doc.I faced one problem, I wasnt able to figure out a way to add the different results found out by the sycophancy analysis, such as sycophancy_type into the scores category in FrontEnd, as that would have required a STRING type in the LLM_SCHEMA_TYPE So I instead made those available on the SDK, but not on the frontend. Please suggest something to tackle this problem. Guide me to make the necessary improvements in PR.
https://github.com/user-attachments/assets/0c1a6e53-ce00-471c-b701-6d8c6b7daa4f
/claim #2520
Edit: added working video I forgot to add
Yash Kumar
@yashkumar2603
Comet
@comet-ml