This is an automated archive.
The original was posted on /r/singularity by /u/Mirrorslash on 2024-01-10 13:51:11+00:00.
With the GPT store coming closer to release I've seen a lot of talk here and on Twitter about how GPTs are lacklustre, how people are not excited about a flood of AI wrappers, and how people question why we would need an army of specialised GPT instances when the next model, or any model close to AGI, could do all these things anyway. Many people already seem bored with GPT-4 and even think it's getting old.
I've also seen numerous AGI predictions, and AI predictions in general, expecting an algorithmic breakthrough of some kind rather soon that enables AGI, which would render useless all the specialised models that require GPT-4-level compute.
I personally don't think you can expect an algorithmic breakthrough anytime soon. We had a major breakthrough in 2017 with 'Attention Is All You Need', which enabled LLMs to work the way they do in the first place. That result was arguably over 20 years in the making; people have been pushing machine learning towards something like GPT since the 1990s. I wouldn't bet on a further breakthrough in the next couple of years. That would be pure speculation, not a prediction based on current events.
Yes, there are infinitely more resources going into AI right now, speeding things up in ways that could push a breakthrough. But the next breakthrough could be much, much harder to achieve, and I believe most of those resources are going not into algorithmic research but into research and development on the current transformer architecture.
Current LLMs are still brand new and most of their potential hasn't been utilised. It is much more economically viable to maximise the current tech's potential than to search in the dark, and it is also a much faster route to significant improvements in AI and in all our lives.
I think we can get to AGI without any major breakthrough, but instead with incremental improvements to current LLMs, and I believe OpenAI and other companies are trying to do exactly this while only a few of their top researchers look for the next breakthrough. Ilya Sutskever and many others in the field have hinted multiple times at the unused potential of current models. During a panel discussion, Ilya hinted (I believe in all seriousness) that GPT-4 could be capable of coming up with novel scientific discoveries. I also think this is more or less what Bill Gates was talking about when he said the technology is plateauing. It's not that we won't see insane improvements in AI; it's that we will probably stick with the underlying technology for a while. But maybe that guy is just getting old, who knows.
After the GPT store was announced at OpenAI's DevDay in November, a very plausible theory for AGI emerged: the theory of swarm intelligence. It got pushed by many people in the AI field. Thanks to Dave Shapiro and Wes Roth, who gave me great insight at the time over on YouTube. Dave called it a tool-based approach to AGI back then. But weirdly enough, I don't think most people see it as the most promising path to AGI.
The idea is that the GPT store will become a platform for anyone to create autonomous agents able to perform most economically viable tasks. Sam Altman already hinted at DevDay that GPTs will eventually get autonomy. After millions of useful GPTs have been created, a capable model like GPT-4.5 or GPT-5 could instruct any of these specialised models in loops: first creating a concise execution plan over multiple inference cycles based on the user's prompt, then coming up with a list of needed expertise and a communication structure, and finally composing an answer by prompting many GPTs and feeding their output back and forth between them with review adjustments (see the sketch below). The potential of autonomous agents working together has already been somewhat proven by papers like 'Communicative Agents for Software Development', where a developer-company hierarchy is mimicked to create a communication structure between GPTs that significantly improves what the model can do with simple prompts. Other smaller experiments by developers are showing promise as well; if you look up autonomous GPT agents on YouTube you can find examples.
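To make that concrete, here is a minimal sketch of the loop I have in mind. Everything in it is hypothetical: `call_gpt()` stands in for whatever API would invoke a specialised GPT from the store, and the 'planner' and 'reviewer' names are illustrative roles, not real endpoints.

```python
# A minimal sketch of the orchestration loop described above.
# call_gpt() is a hypothetical stand-in for invoking a specialised GPT.

def call_gpt(name: str, prompt: str) -> str:
    # Stand-in: a real version would hit a chat-completion endpoint
    # with the chosen GPT's instructions. Here it just echoes.
    return f"[{name}] {prompt[:60]}"

def orchestrate(user_prompt: str) -> str:
    # 1. A planner model drafts a concise execution plan over several passes.
    plan = call_gpt("planner", f"Break this task into steps: {user_prompt}")
    # 2. It lists the expertise needed, one specialised GPT per step.
    experts = call_gpt("planner", f"Name one expert GPT per step:\n{plan}")
    # 3. Each expert runs its step; output feeds forward to the next one.
    context = user_prompt
    for expert in experts.splitlines():
        step_output = call_gpt(expert.strip(), context)
        # 4. A reviewer adjusts each result before it enters the next step.
        context = call_gpt("reviewer", f"Review and correct:\n{step_output}")
    # 5. Finally, the planner composes the reviewed results into an answer.
    return call_gpt("planner", f"Compose a final answer from:\n{context}")

print(orchestrate("Write and test a small web scraper."))
```

The structure mirrors the 'Communicative Agents' idea: the value comes from the communication structure between models, not from any single model being smarter.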
I also think the swarm intelligence approach is a no-brainer in many ways. How do humans come up with their best stuff? Together, in bulk, as a swarm. Why would we want to create one insanely powerful model if multiple smaller ones can do the trick just fine and let us adjust how many GPTs / models we use to control inference cost? Emad Mostaque (founder of Stability AI) has also spoken multiple times about the major benefit of artificial intelligence being that it is intelligence you can scale, as much as you have compute available. This is the biggest strength of AI in my opinion, and it also speaks to swarm intelligence being the most plausible way to achieve AGI.
The ‘mixture of experts’ approach, which GPT-4 supposedly uses and which Mistral showcased with their latest model Mixtral 8x7B, is also an indicator that things are heading this way. From what I’ve gathered, this approach already utilises multiple expert sub-networks inside one model, routing each input to only a few of them, so their combined knowledge is available in a single model's outputs. This reduces inference cost and makes retraining easier.
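For intuition, here is a toy sketch of that routing idea: top-k gating over stand-in 'experts'. The random matrices here are placeholders; in a real MoE model like Mixtral the experts are learned feed-forward blocks inside every transformer layer.

```python
import numpy as np

# Toy top-k mixture-of-experts routing. Only shows the gating idea:
# each token activates a few experts, not the whole model.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d_model, n_experts))  # learned in practice

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(token_vec):
    scores = token_vec @ gate_w            # gating logits per expert
    top = np.argsort(scores)[-top_k:]      # pick the top-k experts
    weights = softmax(scores[top])         # normalise over chosen experts
    # Only top_k experts actually run for this token.
    return sum(w * (token_vec @ experts[i]) for w, i in zip(weights, top))

print(moe_forward(rng.normal(size=d_model)).shape)  # (16,)
```

The key design choice is that compute per token scales with the number of active experts, not the total number, which is exactly why this reduces inference cost.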
There are other factors speaking for the current architecture as well, like the fact that data quality alone can produce insane performance boosts. The founder of Mistral AI recently spoke about this in an AI Explained video, saying that with high-quality data, models could potentially be reduced 1000x in size. For scale: if GPT-4 is on the order of a trillion parameters (rumoured, not confirmed), a 1000x reduction lands around a billion parameters, small enough to run locally. A GPT-4 shrunk like that could be a bigger deal than GPT-5 or whatever OpenAI's next model is. We have heard things about synthetic data from OpenAI developers and others, and I think it's quite clear that most LLM companies are now focused on creating high-quality data sets using highly capable models and human supervision. This also helps with copyright violations, but that's a different topic.
I believe reducing size is arguably more important than increasing model capabilities. If we reduce the size of GPT-4 by 1000x and develop a self-reviewing strategy for inference loops, we could have a model hundreds of times better than GPT-4 at the same cost. Models prompting themselves over and over, reviewing their output and applying a correction vector to it for the next pass, is also basically what the entire Q* thing is about: a way for current models to navigate towards a goal systematically. But it requires a lot of inference, which is costly, and that inference of a large model could be made cheaper if smaller specialised models do some of the busy work (a sketch of such a loop follows below).
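Here is a minimal sketch of what such a self-reviewing loop could look like. This is my illustration of the idea, not anything OpenAI has published about Q*; `llm()` is a hypothetical stand-in for any chat-completion call.

```python
# A self-reviewing inference loop: draft, critique your own draft,
# revise, repeat. Trades extra inference for output quality.

def llm(prompt: str) -> str:
    # Stand-in: replace with a real model call. Returning "DONE" here
    # just lets the sketch terminate immediately when run as-is.
    return "DONE"

def self_review(task: str, max_rounds: int = 3) -> str:
    draft = llm(f"Task: {task}\nWrite a first answer.")
    for _ in range(max_rounds):
        critique = llm(f"Task: {task}\nAnswer: {draft}\n"
                       "List concrete flaws, or reply DONE if there are none.")
        if critique.strip() == "DONE":
            break  # the reviewer is satisfied, stop spending inference
        draft = llm(f"Task: {task}\nAnswer: {draft}\nFlaws: {critique}\n"
                    "Rewrite the answer, fixing every listed flaw.")
    return draft
```

In the swarm picture, the critique step is exactly the busy work that a smaller, cheaper specialised model could take over from the big one.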
Right now GPTs are quite primitive. They don't take in a lot of data, they still hallucinate, and sometimes custom instructions have unforeseen and unwanted results. But in the end, the GPT store will offer amazing value if adopted by thousands if not millions of developers who slowly automate away everything they do in their daily lives. It only takes one person to automate a job effectively, and everyone else can then just pay for it.
It looks like just today OpenAI started rolling out better memory retrieval for ChatGPT as a whole, which allows it to gather user data and apply it to all its outputs if so desired. With improved memory retrieval, GPTs are on track to become very useful very soon.
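OpenAI hasn't published how this memory works, but one common way to build such a feature is embedding-based retrieval: store facts about the user as vectors, fetch the most similar ones per message, and prepend them to the prompt. A toy sketch, where `embed()` is a stand-in for a real embedding API:

```python
import numpy as np

# Toy embedding-based memory. A real system would call an embedding API;
# this hash-seeded stand-in is only consistent within one run.

def embed(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

memories: list[tuple[str, np.ndarray]] = []

def remember(fact: str) -> None:
    memories.append((fact, embed(fact)))

def recall(query: str, k: int = 3) -> list[str]:
    q = embed(query)
    def cosine(v: np.ndarray) -> float:
        return float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)
    ranked = sorted(memories, key=lambda m: cosine(m[1]), reverse=True)
    return [fact for fact, _ in ranked[:k]]

def prompt_with_memory(user_msg: str) -> str:
    # Retrieved facts are prepended so every output can make use of them.
    context = "\n".join(recall(user_msg))
    return f"Known about the user:\n{context}\n\nUser: {user_msg}"
```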
I believe OpenAI has already proven this whole concept in the lab, and it won't be long (this year or next) until we have definitive proof that this approach is good enough to get us to AGI. But it will take millions of people participating, through the GPT store, through providing data and through other means, to create a body of knowledge good enough to make AGI what OpenAI wants it to be: “highly autonomous systems that outperform humans at most economically valuable work.”
I assume this will take several years, in which we will see incremental improvements to the big models, great cost reductions across the board, open-source models as capable as GPT-4 thanks to high-quality training data, almost flawless memory retrieval, and adoption of the GPT store for business and job automation. After all of that has happened, OpenAI can flip the switch and provide us with a GPT that can utilise all available models, sending out swarm agents to complete a fleet of tasks and solve complex issues at great cost. Many people will consider that model AGI, but there'll be enough people saying it ain't good enough and isn't real generalisation. I think this likely happens between 2028 and 2032, depending on global politics.
Written by me.
TL;DR (written by ChatGPT):
The upcoming GPT store has sparked discussions on the usefulness of specialised GPT instances and the potential for an algorithmic breakthrough leading to Artificial General Intelligence (AGI). However, the post argues that expecting a near-term breakthrough is speculative, given the recent major advancements like the 'Attention Is All You Need' paper. It highlights that current Large Language Models (LLMs) like GPT-4 have untapped potential, and incremental improvements to these models might be a more viable path to AGI than searching for a new breakthrough. The post discusses the concept of swarm intelligence, where multiple specialised GPT models work in coordination, as a plausible approach to A...
Content cut off. Read original on https://www.reddit.com/r/singularity/comments/1938q95/why_the_gpt_store_is_on_the_path_to_agi_and_what/