I don’t think the idea of token prediction has the same cognitive limit; I think it’s a function of the language itself.
I think we’re discovering that our language has become a massive, global, flexing and morphing neural net itself. It encodes context, history, meaning. We’re slowly discovering that over the course of a few thousand years, humans have encoded into their language and symbolic representations the very data those symbols describe.
The fact that science is coming out so easily in these LLMs doesn’t surprise me, but it’s beautiful nonetheless. Our scientific discourse is consistently self-referential, and both accepts and rejects itself with additional language. Papers that are cited heavily, and so appear more often during token inference, form a ‘core’ of that token-space.
Again, none of this is surprising. We wrote dictionaries wherein all the words point to each other. An encyclopedia is a dictionary with ordering.
An LLM is an encyclopedia with statistical relationships.
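To make the “statistical relationships” point concrete: a toy bigram model is the smallest possible version of token prediction. This is a sketch for illustration, nowhere near how a real LLM works, and the corpus and function names here are made up for the example:

```python
from collections import Counter, defaultdict

# Tiny stand-in corpus: the "language" we learn statistics from.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which token follows which -- the simplest statistical
# relationship a language can encode about itself.
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent follower of `token` in the corpus."""
    if token not in followers:
        return None
    return followers[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat": it follows "the" most often here
```

Scale the corpus up to everything we’ve ever written and replace the bigram table with a trained network, and you get the “encyclopedia with statistical relationships” above.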
IMO.