The original was posted on /r/singularity by /u/HeroicLife on 2024-01-18 19:13:46+00:00.
Token prediction is the optimization function of an LLM: the objective by which it gets better during training. That objective is independent of the model's internal algorithm, the process by which it actually comes up with answers. LLMs don't just spit out the next token; they run deep neural networks whose inner workings we're still deciphering. These networks navigate a myriad of linguistic and contextual subtleties, going far beyond basic token prediction. Think of token prediction as a facade that masks elaborate cognitive machinery.
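To make the distinction concrete, here's a minimal PyTorch-style sketch (not from the original post; the toy dimensions and the bare `nn.Sequential` stand-in are illustrative assumptions). The training objective is nothing more than cross-entropy on the next token; everything interesting about how the output is produced lives inside the model's forward pass, which the loss never inspects.

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer language model. Positional encodings and
# the causal attention mask are omitted for brevity; the point is only
# that the objective below is agnostic to whatever sits inside `model`.
vocab_size, d_model = 1000, 128
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
        num_layers=2,
    ),
    nn.Linear(d_model, vocab_size),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def training_step(tokens: torch.Tensor) -> float:
    """One optimization step on a (batch, seq_len) tensor of token ids."""
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict the next token
    logits = model(inputs)           # internal algorithm: opaque to the loss
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                  # the objective only scores next-token
    optimizer.step()                 # probabilities, nothing else
    return loss.item()

batch = torch.randint(0, vocab_size, (4, 64))
print(training_step(batch))
```

Swap any architecture in for `model` and this loop is unchanged: the optimization function says nothing about how the network arrives at its next-token distribution.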
Consider evolution: its core optimization function, maximizing gene propagation, didn't restrict its outcomes to mere DNA replication. Instead, it spawned the entire spectrum of biological diversity and human intellect.
Similarly, an LLM's optimization function, token prediction, is just a means to an end; it doesn't confine the system's potential complexity. Moreover, within such systems, secondary optimization functions can emerge and eclipse the primary ones. For instance, human cultural evolution now overshadows genetic evolution as the primary driver of our species' development.
We don't really understand what actually limits the capability of today's LLMs. (If we did, we would already be building AGI models.) The training algorithm may be the limiting factor, but so might data quality, quantity, or medium; a shortage of computational resources; or some paradigm we have yet to discover. The systems may even be sentient but lack the persistent memory or other structures needed to express it.