Many modern theories in cognitive science posit that the brain is a kind of “prediction machine”: it predicts the incoming stream of sensory information from the top down while also processing it from the bottom up. This is sometimes summed up in the aphorism “perception is controlled hallucination”.
That sounds extremely interesting, I gotta look into that when I have more time
Some reading material:
https://en.wikipedia.org/wiki/Predictive_coding
https://en.wikipedia.org/wiki/Bayesian_approaches_to_brain_function
https://plato.stanford.edu/entries/embodied-cognition/
https://global.oup.com/academic/product/surfing-uncertainty-9780190933210?cc=us&lang=en&
So human thought is … text prediction?
In a sense… yes! Though of course the prediction is thought to span many modalities and time-scales, not just text. A crucial piece of the picture is the Bayesian aspect, which involves estimating one’s uncertainty over those predictions. Further info: https://en.wikipedia.org/wiki/Predictive_coding
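To make the Bayesian part concrete, here’s a toy sketch of the precision-weighted prediction-error update at the heart of predictive-coding models. It’s just scalar Kalman-style arithmetic, and all the names and numbers are illustrative, not from any particular model:

```python
# Toy precision-weighted prediction-error update (the core arithmetic of
# predictive-coding accounts). All names/values here are illustrative.

def update_belief(prior_mean, prior_var, observation, obs_var):
    """Combine a top-down prediction (prior) with a bottom-up observation,
    weighting each by its precision (inverse variance)."""
    prediction_error = observation - prior_mean
    # Kalman-style gain: how far the error moves the belief.
    gain = prior_var / (prior_var + obs_var)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_var = (1 - gain) * prior_var  # uncertainty shrinks after updating
    return posterior_mean, posterior_var

# A confident prior barely moves; an uncertain prior follows the data.
print(update_belief(0.0, 0.1, 1.0, 1.0))   # small update toward the observation
print(update_belief(0.0, 10.0, 1.0, 1.0))  # large update toward the observation
```

The point is that “estimating one’s uncertainty” is what sets the gain: the same prediction error moves a confident belief a little and an uncertain belief a lot.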
It’s also important to note the recent trend towards so-called embodied and “4E” cognition, which emphasizes that being situated in a body, in an environment, with control over one’s actions, is essential to explaining the nature of mental phenomena.
But yeah, it’s very exciting how in recent years we’ve begun to tap into the power of these kinds of self-supervised learning objectives for practical applications like Word2Vec and Large Language/Multimodal Models.
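“Self-supervised” just means the training labels come from the data itself, with no human annotation: you hide part of the input and predict it. A tiny counting-based next-word model (standing in for Word2Vec/LLM-style objectives; the corpus and names are made up) shows the idea:

```python
# Minimal illustration of a self-supervised objective: next-token prediction.
# The "labels" are just the data shifted by one position -- no annotation.
# A counting bigram model stands in for Word2Vec/LLMs; everything is a toy.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Training signal: each word is asked to predict its successor.
bigrams = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    bigrams[current][nxt] += 1

def predict_next(word):
    """Return a probability distribution over likely next words."""
    counts = bigrams[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(predict_next("the"))  # e.g. "cat" is more likely than "mat" here
print(predict_next("cat"))
```

An LLM replaces the count table with a neural network and the words with tokens across a huge corpus, but the objective is the same shape: predict the next piece of the stream.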
We can have robots with bodies that talk and form relationships with people now. Not deep intimate relationships, but simple things like maintaining conversations with people. You wouldn’t need much more software on top of the LLM to make a really functional person.
I have to disagree about that last sentence. Augmenting LLMs to have any remotely person-like attributes is far from trivial.
The current thought in the field about this centers around so-called “Objective Driven AI”:
https://openreview.net/pdf?id=BZ5a1r-kVsf
https://arxiv.org/abs/2308.10135
in which strategies are proposed to decouple the AI’s internal “world model” from its language capabilities, to facilitate hierarchical planning and mitigate hallucination.
The latter half of this talk by Yann LeCun addresses this topic too: https://www.youtube.com/watch?v=pd0JmT6rYcI
It’s very much an emerging and open-ended field with more questions than answers.
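For a feel of what “decoupling the world model from language” means in practice, here’s a hedged sketch of the objective-driven loop: a world model predicts future states, and a planner picks actions by minimizing a cost, with no language anywhere. The 1-D dynamics, the cost, and the random-shooting planner are all illustrative stand-ins, not anything from the linked papers:

```python
# Hedged sketch of an objective-driven control loop: a (stand-in) world
# model predicts future states and a simple random-shooting planner picks
# the action sequence minimizing a cost. Everything here is a toy.
import random

def world_model(state, action):
    # Stand-in for a learned dynamics model: 1-D position nudged by action.
    return state + action

def cost(state, goal=3.0):
    # Objective: distance to a goal position.
    return abs(goal - state)

def plan(state, horizon=5, n_candidates=200, rng=random.Random(0)):
    """Sample random action sequences, roll each out through the world
    model, and keep the sequence whose final state has the lowest cost."""
    best_seq, best_cost = None, float("inf")
    for _ in range(n_candidates):
        seq = [rng.uniform(-2, 2) for _ in range(horizon)]
        s = state
        for a in seq:
            s = world_model(s, a)
        if cost(s) < best_cost:
            best_seq, best_cost = seq, cost(s)
    return best_seq, best_cost

actions, final_cost = plan(0.0)
print(final_cost)  # small: the planner found actions approaching the goal
```

Nothing in that loop is linguistic; a language interface would sit on top, reporting on or setting objectives, which is roughly the separation those proposals argue for.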
Pterty mcuh, as lnog as the frist and lsat ltteres are in the crrecot palecs.