Show HN: Is grep enough? A transparent benchmark for agentic code navigation

(entelligentsia.github.io)

3 points | by bonigv 2 hours ago

2 comments

6thbit 2 hours ago
This is nicely put together. It makes sense that LSPs help more as complexity grows, because they make navigation across symbols easier.
I hope someone with a large budget can reproduce these with the latest Opus/GPT.
My gut feeling is that higher-reasoning models tend to use grep more effectively. But intuitively, LSP should still win there.
- bonigv 2 hours ago
  You are absolutely right about what we feel intuitively: LSPs should beat the shit out of the competition. But surprisingly, they did not. Across 10 different LSP servers and 5 different levels of prompt complexity, they did not. Mind you, I painstakingly warmed up the LSP servers that needed warming. Some liked it cold, and they fared equally unimpressively. The pattern I saw was that the LLM (sonnet w.6 with cc) was very clever at using whatever it had to get to a verifiable answer. It could do it with just bash, for sure. But as the prompt complexity grew, the cost also rose.
  Tree-sitter is sitting in a sweet spot here. A brainy LLM can find the shortest path, with high quality, using Tree-sitter and a few bash calls.