a path towards AGI?
while we don't yet understand intelligence itself, concept models & step-reasoning might allow us to simulate it
transformers can’t obtain a conceptual understanding of their environment. therefore, they cannot achieve AGI; let me explain …
language, for a human, is one way of describing our perception of our environment
language, for a transformer, is its environment
during the sandinista revolution in nicaragua, deaf children were brought together into schools for the deaf for the first time, which led to the spontaneous development of an entirely new sign language.
semantic language is a tool we use to map meaning to our environment, but the language itself is not really all that important - it’s the higher-order conceptual meaning that is valuable.
in a transformer, language itself becomes the environment via tokenization. tokens have complex statistical relationships, and the model can use them to assign very accurate probabilities to the next token; but a transformer is not judging an output to be correct based on any conceptual understanding of the input.
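to make that concrete, here's a minimal sketch of what a transformer actually produces: a probability distribution over the next token, derived from token statistics rather than from any notion of what the sentence means. gpt2 via hugging face's transformers library is just my illustrative choice here, nothing the argument depends on.

```python
# minimal sketch: a stock transformer assigns probabilities to the next token
# purely from statistical relationships between tokens - there is no concept layer.
# gpt2 + hugging face transformers is an illustrative choice on my part.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# once tokenized, the prompt *is* the model's entire environment
inputs = tokenizer("the cat sat on the", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: [batch, seq_len, vocab_size]

# a distribution over the next token: likelihoods over token ids,
# not judgments about what the sentence means
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)
for p, tid in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(tid)!r}: {p.item():.3f}")
```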
models like meta’s new large concept model (lcm) propose a new frontier: transforming language into higher-order semantic “concepts”. this has the potential to resolve a fundamental limitation of transformers, allowing models to reason with a full understanding of the conceptual representations of their input.
meta uses their new SONAR embedding space for concept modelling. there is certainly more work to be done in this area to develop better embedding spaces. more fundamentally, though, lcms themselves will need further development before they can properly model concepts zero-shot.
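to sketch what “concepts as the unit of modelling” looks like in practice, here’s a toy example that encodes whole sentences into a fixed embedding space and compares them there. i’m using sentence-transformers as a stand-in encoder - this is not SONAR or meta’s actual lcm pipeline, just an illustration of the idea.

```python
# toy sketch of the concept idea: whole sentences become fixed-size vectors,
# and those vectors - not tokens - are what the model would reason over.
# sentence-transformers is a stand-in encoder, not meta's SONAR space.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

story = [
    "the children had no shared language.",
    "they were brought together in a single school.",
    "within a few years, a new sign language emerged.",
]

# each sentence maps to one "concept" vector (384 dims for this encoder)
concepts = encoder.encode(story, normalize_embeddings=True)

# similarity lives in the concept space, independent of the exact wording
paraphrase = encoder.encode(
    ["a brand-new signed language developed"], normalize_embeddings=True
)
print(concepts @ paraphrase.T)  # cosine similarity against each sentence
```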
this frontier represents an opportunity to model human reasoning fairly accurately, by applying step-reasoning to a conceptual understanding of the environment, rather than as a reward mechanism for its future output.
NOTE: the step-reasoning approach is the one taken by o1 & o3, distinct from the various RL fine-tuning (rlft) approaches used in post-training.
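here’s what that could look like as code - a purely hypothetical sketch where a model proposes the next concept vector one reasoning step at a time, rather than emitting tokens. ConceptPredictor, its GRU core & the dimensions are all invented for illustration; this is not how o1/o3 or meta’s lcm actually work.

```python
# hypothetical sketch of conceptual step-reasoning: each step proposes the
# next concept vector given the concepts so far. everything here is invented
# for illustration - not o1/o3's or meta's actual architecture.
import torch
import torch.nn as nn

CONCEPT_DIM = 384  # matching a sentence-embedding space like the one above

class ConceptPredictor(nn.Module):
    """given the concept history, propose the next concept (toy version)."""
    def __init__(self, dim: int = CONCEPT_DIM):
        super().__init__()
        self.step = nn.GRU(dim, dim, batch_first=True)

    def forward(self, concept_history: torch.Tensor) -> torch.Tensor:
        _, hidden = self.step(concept_history)  # summarize the steps so far
        return hidden[-1]                       # propose the next concept

predictor = ConceptPredictor()

# start from one "premise" concept and reason forward a few steps
history = torch.randn(1, 1, CONCEPT_DIM)  # stand-in for a real encoded sentence
for step in range(3):
    next_concept = predictor(history)  # one reasoning step, in concept space
    history = torch.cat([history, next_concept.unsqueeze(1)], dim=1)
    print(f"step {step}: history now holds {history.shape[1]} concepts")
```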
conceptual step-reasoning would represent something close to parity with human reasoning.
i’m fairly convinced this is the vein of research sam altman was referring to in his most recent blog post, when he claimed to “know how to build AGI”.
however, we still aren’t even sure what human intelligence is.
this was highlighted recently by francois chollet, creator of keras & the arc-agi benchmark.
i believe the term “AGI” is still mostly a pedantic one. we continue unraveling the mysteries of intelligence, & while we have gotten great at modelling it, that model is built entirely on observation rather than on understanding.
take, for instance, the work done only last year on quantum tunnelling in neurons, or on decoding the contents of neuronal transmissions. research like this could fundamentally transform our understanding of neuron activation, sequencing, and architecture more broadly.
i believe that quantum mechanics is the final frontier for AGI, as it may hold the key to truly understanding consciousness - and subsequently, to building machines that could properly model it.
i could go on about consciousness, cognition & how we really haven’t even come close to understanding intelligence yet - but i won’t, because i think you get my point.
i hypothesize that in the next 1-3 years we will get some sort of generalized business or consumer intelligence, which will obviously be revolutionary for other reasons.
but those of us transfixed by intelligence will continue to dig deeper!