Algora

Bounties

Create new bounties by commenting /bounty $1000 on GitHub issues.

$1,200 Writing-Zero (7 months ago)
  Paper: https://arxiv.org/pdf/2506.00103
  Difficulty: Very Hard
  Notes: A full solution should include a pipeline to generate pairwise training samples from existing LLMs plus rubrics or from public sources, an environment to train the GenRM, and an environment to train the main model using the GenRM. Requires some creative decision-making; discuss your proposal with Will before getting too deep into it. Sufficient compute for training experiments will be provided, conditional on implementation progress.

$1,200 MLE-Bench (7 months ago)
  Organization: OpenAI
  Paper: https://arxiv.org/abs/2410.07095
  Code: https://github.com/openai/mle-bench
  Owner: https://x.com/creet_z
  Difficulty: Very Hard

$1,200 PaperBench (7 months ago)
  Organization: OpenAI
  Paper: https://arxiv.org/abs/2504.01848
  Code: https://github.com/openai/preparedness/tree/main/project/paperbench
  Owner: https://x.com/_manan2005
  Difficulty: Very Hard

$800 ART-E / ART (7 months ago)
  Paper: https://openpipe.ai/blog/art-e-mail-agent
  Code: https://github.com/OpenPipe/ART/tree/main/examples
  Owner: https://x.com/dhruvrnaik
  Difficulty: Easy-Hard
  Notes: Bounty: $200-800. Easy = ART-E agent task, Hard = 2-way portability between ART <-> verifiers tasks.

$800 Kimi-K2 Tool Sim (7 months ago)
  Paper: https://arxiv.org/abs/2507.20534
  Owner: https://x.com/kalocide
  Difficulty: Hard

$800 WebArena (7 months ago)
  Code: https://github.com/web-arena-x/webarena/tree/main
  Owner: https://x.com/fido20222
  Difficulty: Hard

$800 SWE-bench Verified (7 months ago)
  Owner: https://x.com/NoisyTails
  Difficulty: Hard

$800 SWE-Swiss (7 months ago)
  Organization: ByteDance Seed
  Paper: https://pebble-potato-fc6.notion.site/SWE-Swiss-A-Multi-Task-Fine-Tuning-and-RL-Recipe-for-High-Performance-Issue-Resolution-21e174dedd4880ea829ed4c861c44f88
  Code: https://github.com/zhenyuhe00/SWE-Swiss
  Dataset(s): https://huggingface.co/SWE-Swiss
  Train/Eval?: Train
  Owner: https://x.com/SrgntSaltNPepa
  Difficulty: Medium-Hard
  Notes: Bounty: $400-800. $400 = existing RL dataset, $800 = full auto-gen pipeline.

$800 LangGraph (7 months ago)
  Difficulty: Hard+
  Notes: Part of this will be a scoping problem: determining which LangGraph features can be supported with linear generation (i.e., a forward sequence of turns without forking).

$800 BixBench (7 months ago)
  Organization: Future House
  Paper: https://arxiv.org/abs/2503.00096
  Code: https://github.com/Future-House/BixBench
  Difficulty: Hard
