Hacker News

Comment by ben_w | Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers

ben_w · 2026-07-02 Thu 07:19 UTC

We could call this "reinforcement learning from human feedback" (RLHF) :)

https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...