Hacker News
Comment by hypfer
Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers
hypfer · 2026-07-02 Thu 06:44 UTC
Similarly, it explains to me why people found Claude so amazing, while I just thought "eh."
Tool expectations