PR
Writing-Zero
Prime Intellect
$1,200

Exclusives

Open to everyone

Description

Paper: https://arxiv.org/pdf/2506.00103

Difficulty: Very Hard

Notes: A full solution should include (1) a pipeline to generate pairwise training samples from existing LLMs plus rubrics or from public sources, (2) an environment to train the GenRM, and (3) an environment to train the main model using the GenRM. This requires some creative decision-making, so discuss your proposal with Will before getting too deep into it. Sufficient compute for training experiments will be provided, conditional on implementation progress.
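
To make the first deliverable concrete, here is a minimal sketch of turning rubric-scored responses into pairwise samples. It is not the paper's pipeline. The names `PairwiseSample`, `build_pairs`, `score_fn`, and `min_margin` are hypothetical, and `score_fn` stands in for whatever LLM-plus-rubric grader or public preference source is actually chosen.

```python
import random
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List


@dataclass(frozen=True)
class PairwiseSample:
    """One preference example (hypothetical schema, not from the paper)."""
    prompt: str
    response_a: str
    response_b: str
    preferred: str  # "a" or "b"


def build_pairs(
    prompt: str,
    responses: List[str],
    score_fn: Callable[[str, str], float],
    min_margin: float = 1.0,
    seed: int = 0,
) -> List[PairwiseSample]:
    """Score every response, then emit one sample per sufficiently separated pair.

    score_fn(prompt, response) is an assumed stand-in for a rubric grader.
    Pairs whose score gap is below min_margin are dropped as too ambiguous.
    """
    rng = random.Random(seed)
    scored = [(r, score_fn(prompt, r)) for r in responses]
    samples = []
    for (r1, s1), (r2, s2) in combinations(scored, 2):
        if abs(s1 - s2) < min_margin:
            continue
        better, worse = (r1, r2) if s1 > s2 else (r2, r1)
        # Randomize which slot holds the better response so that a model
        # trained on these samples cannot learn a position shortcut.
        if rng.random() < 0.5:
            samples.append(PairwiseSample(prompt, better, worse, "a"))
        else:
            samples.append(PairwiseSample(prompt, worse, better, "b"))
    return samples


def toy_score(prompt: str, response: str) -> float:
    """Toy grader for demonstration only: counts words in the response."""
    return float(len(response.split()))


demo = build_pairs(
    "Write an opening line.",
    ["short", "a bit longer text here", "mid length"],
    toy_score,
    min_margin=2.0,
)
```

With the toy grader, only pairs whose word counts differ by at least 2 are kept. A real pipeline would replace `toy_score` with the chosen grader and would likely need deduplication and a held-out split as well.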

Contributor chat

HT
Nov 10
SA
Is this task still open? I would love to work on it and propose a plan of execution.
06:11