Hacker News

Comment by vincnetas | original | Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers
vincnetas · 2026-07-02 Thu 05:39 UTC
We could call this "generative adversarial network" (GAN) :)

https://en.wikipedia.org/wiki/Generative_adversarial_network

wwind123 · 2026-07-02 Thu 06:37 UTC
This kind of approach would generally still need human guidance; otherwise these models might get stuck in weird niche corners of the problem space that are not relevant to any real-world project.
ben_w · 2026-07-02 Thu 07:19 UTC
We could call this "reinforcement learning from human feedback" (RLHF) :)

https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...