Hacker News

Comment by jascha_eng | Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers
jascha_eng · 2026-07-02 Thu 06:42 UTC
I mean, these were all solved before, I assume, so it's 100% not the same human, ofc. But models are expected to be good at a variety of codebases, while a human can specialize in one and learn it. I think it's fair to compare to an individual who is used to working on a product.

I'm more interested in how fable would do