It’s not a solution, but it’s why humans developed the obvious approach of “build one thing, then everyone can see that one thing and agree on what needs to happen next” (i.e. the space of P solutions is reduced by creating one thing, and then the next set of choices is reduced by the original choice).
This might be obvious to everyone, but for me it’s a nice way to view it (sort of restating the non-waterfall (agile?) approach to specification discovery).
I.e. waterfall design without coding is too underspecified, hence the "agile waterfall" of using code iteratively to find an exact specification.
Doesn't this whole argument fall apart if we consider iteration over time? Sure, the initial implementation might be uncoordinated, but once the subagents have implemented it, what stops the main agent from reviewing the code and sorting out any inconsistencies, ultimately arriving at a solution faster than it could if it wrote it by itself?
I'd wager that a "main agent" is really just a bunch of subagents in a sequential trench coat.
In the end, in both cases, it's a back and forth with an LLM, and every request has its own lifecycle. So it's unfortunately at least a networked systems problem. I think your point works with an infinite context window and one-shotting the whole repo every time... Maybe quantum LLM models will enable that.
The thing that TFA doesn't seem to go into is that these mathematical results apply to human agents in exactly the same way as they do to AI agents, and nevertheless we have massive codebases like Linux. If people can figure out how to do it, then there's no math that can help you prove that AIs can't.
I doubt it is possible to mathematically prove much inside a black box of billions of interconnected weights. But at least in the narrow case of the strawberry problem, it seems likely that LLM inference could reliably recognize that sort of problem as the type that would benefit from a letter-counting tool call as part of the response.
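To make the "letter counting tool call" idea concrete, here is a minimal sketch of what such a tool could look like on the host side. The function name `count_letter` and the shape of its return value are assumptions for illustration, not any particular vendor's tool-calling API.

```python
def count_letter(word: str, letter: str) -> dict:
    """Hypothetical tool: count a single letter in a word, case-insensitively."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    target = letter.lower()
    # 1-based positions, matching how the quoted model outputs number the letters.
    positions = [i for i, ch in enumerate(word.lower(), start=1) if ch == target]
    return {"word": word, "letter": letter, "count": len(positions), "positions": positions}


result = count_letter("strawberry", "r")
print(result["count"], result["positions"])  # 3 [3, 8, 9]
```

The model would only need to recognize the question type and emit the call; the counting itself happens in ordinary code, where characters are visible.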
LLMs don't see words. They see tokens, which is why previously they had a hard time counting the r's.
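A toy illustration of the token point, as a rough sketch. The subword split and the integer IDs below are made up for the example; a real tokenizer may split "strawberry" differently.

```python
# Hypothetical subword split, chosen only for illustration.
tokens = ["str", "aw", "berry"]

# Made-up integer IDs standing in for what the model actually receives.
vocab = {"str": 101, "aw": 102, "berry": 103}
token_ids = [vocab[t] for t in tokens]

# The ID sequence itself carries no explicit information about letters.
print(token_ids)  # [101, 102, 103]

# The r's only become countable once the tokens are spelled out as characters.
per_token = {t: t.count("r") for t in tokens}
print(per_token)                # {'str': 1, 'aw': 0, 'berry': 2}
print(sum(per_token.values()))  # 3
```

This is also roughly what the spelled-out reasoning traces do: they expand the word into individual letters first, then count.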
You can certainly prove that mathematically, and if you give that proof to an LLM, it will give you the correct answer. Which, btw, is a prompting technique to improve the accuracy of an LLM's results.
Although the more recent LLMs have built-in approaches to answer this question.
Gemma4 for example:
-----
The user is asking for the number of 'r's in the word "strawberry".
* S
* T
* *R* (1)
* A
* W
* B
* E
* *R* (2)
* *R* (3)
* Y
There are 3 'r's.
-----
GPT-OSS-120b example.
-----
The user asks: "How many r's in strawberry." This is a straightforward counting problem: count the letter 'r' in the word "strawberry". The word "strawberry" contains letters: s t r a w b e r r y. Count r's: there are three 'r's? Let's check: s(1) t(2) r(3) a(4) w(5) b(6) e(7) r(8) r(9) y(10). Actually the word is "strawberry". Let's write out: s, t, r, a, w, b, e, r, r, y. So there are three r's: at positions 3, 8, 9. Yes, three r's. So answer: 3.
But we may also note that "r" appears three times.
Thus answer: 3.
We can provide a short answer.
Thus final: There are three r's in "strawberry".
-----
Good architecture, actor models, and collaboration patterns do not emerge magically from “more agents”.
Maybe what’s missing is the architect’s role.