Details

Added a new evaluation metric, “structured output compliance”, which uses LLM-as-a-judge. It is available in the frontend UI (Online Evaluation tab) and in the Python SDK.
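To illustrate what a compliance score of this kind might measure, here is a minimal rule-based sketch. It is not the code from this PR and not the SDK's API: the function name `structured_output_compliance`, the `required_keys` parameter, and the `value`/`reason` result shape are all hypothetical. It also assumes the expected structure is a JSON object, whereas the metric described above relies on an LLM judge rather than a parser.

```python
import json
from typing import Any, Optional


def structured_output_compliance(
    output: str, required_keys: Optional[list[str]] = None
) -> dict[str, Any]:
    """Hypothetical check: 1.0 if `output` is a JSON object containing
    every key in `required_keys`, otherwise 0.0 with a short reason."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError as exc:
        return {"value": 0.0, "reason": f"Invalid JSON: {exc.msg}"}

    # Only a top-level JSON object counts as compliant in this sketch.
    if not isinstance(parsed, dict):
        return {"value": 0.0, "reason": "Top-level value is not a JSON object"}

    missing = [key for key in (required_keys or []) if key not in parsed]
    if missing:
        return {"value": 0.0, "reason": f"Missing keys: {missing}"}

    return {"value": 1.0, "reason": "Valid JSON object with all required keys"}


if __name__ == "__main__":
    print(structured_output_compliance('{"name": "Ada", "age": 36}', ["name", "age"]))
    print(structured_output_compliance("Sure! Here is the JSON you asked for."))
```

A deterministic check like this can only cover formats that are easy to parse; an LLM judge trades that determinism for the ability to assess looser or mixed formats.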

Resolves #2528 /claim #2528

Video:

https://github.com/user-attachments/assets/47ffd3e9-6642-4678-9e72-87765c747bac

Claim

Total prize pool: $50
Total paid: $0
Status: Pending
Submitted: June 23, 2025
Last updated: June 23, 2025

Contributors

Vikas_pal8923

@Vikaspal8923

100%

Sponsors

Comet

@comet-ml

$50