This is an automated archive.
The original was posted on /r/singularity by /u/Xtianus21 on 2024-01-19 03:35:01+00:00.
Here is part 2 of my Turing on Conscious Convergence series. Again, I am putting paint onto a canvas, so this is very theoretical.
Knowing that memory is a very important part of how an ASI would work, I decided to expand on how memory could potentially work in an agentic AI system.
In part 1, the Stream of Thoughts was used to constantly forward-stream running thoughts and inputs. However, it has another very strong use case. Previously it acted upon the W* model of the world view, a mechanism for constantly updating its understanding through a continuous, if not MEMS-driven, feedback loop.
Then it hit me: why not bring in other mechanisms that could feed off of such a system? The biggest challenge of them all: memory. In this, I am only interested in long-term and medium-term memory. Medium-term being something more quickly accessible than long-term, with a weight of importance determining what is ultimately kept in long-term memory.
And, like any well-architected application, I began to think: is there a real-world way the brain could use a partitioning system, and how might that work? I landed on something practical that has been right in front of us the whole time: situations. Situations are a great memory-partitioning system that is natural to our existence.
Think about it. We as humans do not have to hold every aspect of every situation in our minds all at once at all times. Every part of your day is a situation. And for the most part, those situations are planned and known. Yes, there are surprises and unaccounted-for situations, but for the most part you know how your day is going to go. In this sense, planning is in large part very related to the "situation."
The architecture organizes the world view through an A* system that finds the best situational model to derive from.
The situational models would be world-view and situational-awareness micro models that could be updated and/or generated very quickly.
Their main purpose is to provide situational information to the world-view model: I am in this environment, it looks like this, the scene is like this, I am getting audio like this, and so on. The situation could be: I am going on vacation with the family and it is out of the country. Each part of that situation is in some part repetitive, i.e., you've gone on vacation internationally before, you have gotten a cab to the hotel, you've checked in, and so on...
Yes, they are different situations, but there are things in them that are the same and repetitive.
The micro-model nodes would multiply over time and hold weights and properties tuned to a variety of situations, all within the same scope or frame of reference.
The A* system would decide which model to choose at inference time.
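To make the selection step concrete, here is a minimal sketch of how an inference-time chooser over situational micro-models could look. Everything here (the `MicroModel` class, the feature dicts, the freshness weight) is a hypothetical representation I am assuming for illustration, not something specified in the post.

```python
from dataclasses import dataclass, field

@dataclass
class MicroModel:
    # Hypothetical representation of a situational micro-model node
    name: str
    features: dict[str, float] = field(default_factory=dict)  # weights tuned to a frame of reference
    freshness: float = 1.0  # decays as the model "greys out"

def situational_fit(model: MicroModel, situation: dict[str, float]) -> float:
    # Overlap between incoming situational signals and the model's weights
    return sum(model.features.get(k, 0.0) * v for k, v in situation.items())

def select_model(models: list[MicroModel], situation: dict[str, float]) -> MicroModel:
    # Best-first choice: pick the model with the highest freshness-weighted fit
    return max(models, key=lambda m: situational_fit(m, situation) * m.freshness)

travel = MicroModel("international-vacation", {"airport": 0.9, "hotel": 0.8})
commute = MicroModel("daily-commute", {"car": 0.9, "office": 0.7})
best = select_model([travel, commute], {"airport": 1.0, "hotel": 0.5})
```

Note this is a greedy score-based pick rather than a true A* path search; a full A* formulation would additionally need a state graph and an admissible heuristic over candidate situations.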
A model would grey out (or die) over time, leaving room for either the most up-to-date and accurate model, or for newer, fresher models that are needed for new situations or recently obtained information.
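The grey-out idea can be sketched as a periodic decay-and-prune pass over the model pool. The decay factor and survival floor below are illustrative assumptions, not values from the post.

```python
def decay_and_prune(pool: dict[str, float], decay: float = 0.95, floor: float = 0.1) -> dict[str, float]:
    # pool maps model name -> freshness weight; each pass greys every model out
    # a little, and any model that falls below the survival floor is dropped
    return {name: w * decay for name, w in pool.items() if w * decay >= floor}

pool = {"vacation-2023": 0.10, "daily-commute": 0.90}
pool = decay_and_prune(pool)  # the stale vacation model falls below the floor
```

A real system would presumably also *boost* freshness on each successful use, so frequently selected models resist the decay.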
The memory engine here is a system to ingest memories and route them into situational engines it can identify as being in the same reference or scope, keeping them partitioned as best it can without creating too many new situational model groupings. A balance, if you will.
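One way to sketch that balance: match each incoming memory against existing groupings by tag overlap, and only spawn a new grouping when nothing is similar enough. The tag-set representation and the Jaccard threshold are my assumptions for illustration.

```python
def assign_situation(memory_tags: set[str], groupings: dict[str, set[str]], threshold: float = 0.5) -> str:
    # Compare the memory's tags against each existing grouping (Jaccard overlap)
    best_name, best_score = None, 0.0
    for name, tags in groupings.items():
        score = len(memory_tags & tags) / len(memory_tags | tags)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        groupings[best_name] |= memory_tags  # fold into the existing partition
        return best_name
    # nothing similar enough: create a new situational grouping
    new_name = f"situation-{len(groupings)}"
    groupings[new_name] = set(memory_tags)
    return new_name

groupings = {"travel": {"airport", "hotel", "passport"}}
first = assign_situation({"airport", "hotel", "taxi"}, groupings)  # folds into "travel"
second = assign_situation({"gym", "weights"}, groupings)           # spawns a new grouping
```

Raising the threshold creates more, finer-grained groupings; lowering it merges aggressively — that knob is exactly the "balance" in question.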
This memory feedback loop would feed information to the continuous Rapid Situational Model Trainer - Micro World Model Generator & Refresher system, which would then create, update, reweight, or delete a model in the grouping. The black section serves as the surprise or unknown-situation section that needs a direction of action but has not yet found a home in its own situational world-view grouping.
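The four lifecycle actions can be sketched as a small dispatcher over two signals. The thresholds and the two inputs (match fit and model freshness) are illustrative assumptions I am layering on top of the post's description.

```python
from enum import Enum, auto

class Action(Enum):
    CREATE = auto()    # surprise / unknown situation: route to the "black section"
    UPDATE = auto()    # partial match: retrain or refresh the model
    REWEIGHT = auto()  # strong match: just strengthen existing weights
    DELETE = auto()    # model has greyed out: retire it from the grouping

def choose_action(fit: float, freshness: float) -> Action:
    # fit: how well the incoming memory matches its best grouping (0..1)
    # freshness: that grouping's current weight (0..1); thresholds are illustrative
    if fit < 0.2:
        return Action.CREATE
    if freshness < 0.1:
        return Action.DELETE
    if fit > 0.8:
        return Action.REWEIGHT
    return Action.UPDATE
```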
The other major part here is how the memory engine is fed and refreshed. The W* World Situational Stimuli system would take multi-modal inputs covering everything from video, audio, and communications to all different types of sensory input.
This system ultimately feeds into the W* LLM looped by the STRoT system acting upon all of the things mentioned in part 1.
Does this solve long-term memory? Short-term memory is handled well by cache systems and the inference context window (token count), so I don't think it needs this type of mechanism per se.
Let me know in the comments if any of you think this is a viable solution brainstorm.
Here is a TL;DR recap:
In this second installment of the "Turing On Conscious Convergence" series, we delve deeper into the theoretical construct of memory within an Artificial Super Intelligence (ASI) system. Building on the foundational concept of a 'Stream of Thoughts' introduced in Part 1, we explore the dual utility of this stream, not only as a forward-moving thought and input method but also as a potent tool for memory management.
Memory Management in AI:
The introduction of a 'Memory Engine' is pivotal. Here, the focus is on long-term and medium-term memory, distinguished by accessibility and significance. Medium-term memory is readily accessible, serving as a prelude to what is retained in the more permanent long-term memory. This bifurcation is akin to the brain's own partitioning system, elegantly mirrored in the AI through a situational-based framework.
Situational Framework:
Human cognition naturally partitions memories into 'situations,' a concept that translates seamlessly into AI architecture. We navigate daily life through a series of situations — planned, anticipated, and occasionally unexpected. This natural partitioning inspires the AI's memory architecture, where each 'situation' represents a potential partition, a contextual framework for organizing experiences.
Architectural Dynamics:
The architecture employs an A* system, carefully selecting the most appropriate situational model that aligns with the current world view. These situational models, characterized by their rapid update and generation capabilities, serve a singular purpose: to inform the world view model with situational context — visual, auditory, and beyond.
Micro Model Nodes and A* System:
Micro model nodes, within this architecture, accrue over time, carrying weights and properties attuned to a spectrum of situations yet within the same frame of reference. The A* system, at the heart of the decision-making process, determines the optimal model for inference at any given moment.
Memory Engine and Feedback Loop:
A memory feedback loop feeds into the 'Continuous Rapid Situational Model Trainer - Micro World Model Generator & Refresher,' which is responsible for the creation, updating, reweighting, or deletion of models. This dynamic system allows for the 'fading out' of outdated models, making way for the most current or new models needed for fresh situations.
World Situational Stimuli and Multi-Modal Inputs:
The W* World situational Stimuli system captures multi-modal inputs ranging from video to audio and other sensory data. This rich input feeds into the W* LLM, which, in turn, is looped by the STRoT system, integrating all elements discussed in Part 1.
Considerations for Short-Term Memory:
The system posits that short-term memory management is effectively handled by existing cache systems and contextual inference tokens, suggesting that the proposed memory architecture is specifically designed to enhance long-term situational recall and adaptability.
Conclusion:
This exploration offers a visionary blueprint for an AI's memory architecture, deeply rooted in the concept of situational awareness and adaptability. The proposed system is a harmonious blend of theoretical constructs and practical mechanisms, aiming to capture the essence of human memory processing within the realms of artificial intelligence. It's a contemplative leap towards understanding and designing an ASI's memory — a brainstorm that invites further discussion on its viability and potential realization.