I think current deep learning models are best understood as something more like nervous system organs or brain lobes than fully realized self-entities.
Think of vision models as really complex eyes that do a lot of processing in the eye itself rather than in the brain, but can't do much else -- akin to a mantis shrimp's eyes.
Think of language models as something like Wernicke's area of the brain: capable of using language, but not actually capable of self-reflection or planning without the rest of the brain.
We are probably a decade or two away, at the earliest, from building a sapient entity capable of self-reflection and cognition. Top researchers are only now starting to map out what sort of architecture that might require, and those maps are still very high-level, with many roadblocks left to solve.
Yann LeCun has sketched one very general idea in his proposal "A Path Towards Autonomous Machine Intelligence."
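For what it's worth, the rough shape of that proposal fits in a few lines of pseudocode. The sketch below is only my own paraphrase of the block diagram in the 2022 paper -- the module names, signatures, and the greedy one-step "planning" loop are assumptions for illustration, not code LeCun published.

```python
# A minimal sketch of the module layout described in LeCun's "A Path Towards
# Autonomous Machine Intelligence" (2022). Names and signatures are my own
# paraphrase of the paper's block diagram, not an actual implementation.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class AgentModules:
    perception: Callable[[Any], Any]        # observation -> estimate of world state
    world_model: Callable[[Any, Any], Any]  # (state, action) -> predicted next state
    cost: Callable[[Any], float]            # state -> how "bad" it is (intrinsic cost + critic)
    actor: Callable[[Any], Any]             # state -> proposed action (unused in this toy loop)
    configurator: Callable[[Any], None]     # sets goals / reconfigures the other modules

def act_one_step(agent: AgentModules, observation: Any, candidate_actions: list) -> Any:
    """One perceive-imagine-act cycle: encode the observation, roll each candidate
    action through the world model, and pick the cheapest predicted outcome."""
    state = agent.perception(observation)
    return min(candidate_actions,
               key=lambda a: agent.cost(agent.world_model(state, a)))

# Toy usage: one-dimensional state, objective "stay near zero".
toy = AgentModules(
    perception=lambda obs: float(obs),
    world_model=lambda s, a: s + a,
    cost=lambda s: abs(s),
    actor=lambda s: 0.0,
    configurator=lambda goal: None,
)
print(act_one_step(toy, observation=3.0, candidate_actions=[-1.0, +1.0]))  # prints -1.0
```

The point of the sketch isn't the toy math; it's that each box (perception, world model, cost, actor, configurator) is a separate learned module, and today's models mostly correspond to just one of those boxes at a time.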







