HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top)
(hwebench.com)
3 points | by fesens | 40 minutes ago | 2 comments
fesens
40 minutes ago
Current benchmarks have ceilings, usually 100%. This benchmark aims to be long-lasting, to correlate highly with the ability to solve real-world problems and follow complex instructions, and to be unbounded (meaning scores can always go higher).
fabiofachini92
36 minutes ago
Amazing!