Feature Another day, another headline. Last week, Inflection AI, a year-old startup, attracted $1.3 billion from investors including Microsoft and Nvidia, valuing the company at $4 billion.
Outlandish valuations such as these vie with warnings of existential risks, mass job losses and killer drone death threats in media hype around AI. But bubbling under the headlines is a debate about who gets to own the intellectual landscape, with 60 years of scientific research arguably swept under the carpet. At stake is when AI will equal humans with something called Artificial General Intelligence (AGI).
Enter Yale School of Management economics professor Jason Abaluck, who in May took to Twitter to proclaim: “If you don’t agree that AGI is coming soon, you need to explain why your views are more informed than expert AI researchers.”
Also known as strong AI, the concept of AGI has been around since the 1980s as a means of distinguishing between a system that can produce results, and one which can do so by thinking.
The recent spike in interest in the topic stems from OpenAI’s GPT-4, a large language model which relies on crunching huge volumes of text, turning associations between words into vectors that can be resolved into viable outputs in many forms, including poetry and computer code.
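To make that mechanism concrete, here is a deliberately simplified Python sketch of the statistical principle at work: count which words tend to follow which, then generate text by sampling likely continuations. It is a toy bigram model for illustration only – GPT-4 and its peers learn vector representations with transformer networks at vastly greater scale, and nothing here reflects OpenAI’s actual implementation.

# Toy illustration: words linked by statistical association, with text
# produced by repeatedly picking a plausible next word. Not how GPT-4
# really works - just the underlying idea at its crudest.
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count how often each word follows another - a crude stand-in for the
# learned associations an LLM encodes in its parameters.
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def generate(start, length=6):
    """Generate text by sampling a likely continuation, word by word."""
    word, output = start, [start]
    for _ in range(length):
        counts = next_word_counts.get(word)
        if not counts:  # no known continuation, stop
            break
        candidates, weights = zip(*counts.items())
        word = random.choices(candidates, weights=weights)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the cat sat on the rug"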
Following a string of impressive results – including passing the Uniform Bar Exam – and bold claims about its economic benefits – a £31 billion ($39.3 billion) increase in UK productivity, according to KPMG – proponents are getting bolder.
OpenAI CEO Sam Altman last month declared to an audience in India: “I grew up implicitly thinking that intelligence was this, like, really special human thing and kind of somewhat magical. And I now think that it’s sort of a fundamental property of matter…”
Microsoft, which put $10 billion into OpenAI in January, has been conducting its own experiments on GPT-4. A team led by Sebastien Bubeck, senior principal research manager in the software giant’s machine learning foundations group, concluded [PDF] that its “skills clearly demonstrate that GPT-4 can manipulate complex concepts, which is a core aspect of reasoning.”
But scientists have been thinking about thinking a lot longer than Altman and Bubeck. In 1960, American psychologists George Miller and Jerome Bruner founded the Harvard Center for Cognitive Studies, providing as good a starting point as any for the birth of the discipline, although certain strands go back to the 1940s. Those who have inherited this scientific legacy are critical of the grandiose claims made by economists and computer scientists about large language models and generative AI.
Dr Andrea Martin, Max Planck Research group leader for language and computation in neural systems, said AGI was a “red herring.”
“My problem is with the notion of general intelligence in and of itself. It’s mainly predictive: one test largely predictive of how you score on another test. These behaviors or measures may be correlated with some essentialist traits [but] we have very little evidence for that,” she told The Register.
Martin is also dismissive of using the Turing Test – proposed by Alan Turing, who played a founding role in computer science, AI and cognitive science – as a bar for AI to demonstrate human-like thinking or intelligence.
The test sets out to assess if a machine can fool people into thinking that it is a human through a natural language question-and-answer session. If a human evaluator cannot reliably tell the unseen machine from an unseen human, via a text interface, then the machine has passed.
Both ChatGPT and Google’s AI have passed the test, but to use this as evidence of thinking computers is “just a terrible misreading of Turing,” Martin said.
“His intention there was always an engineering or computer science concept rather than a concept in cognitive science or psychology.”
New York University psychology and neural science emeritus professor Gary Marcus has also criticized the test as a means of assessing machine intelligence or cognition.
Another problem with the LLM approach is that it only captures aspects of language that are statistically driven, rather than trying to understand the structure of language or its capacity to capture knowledge. “That’s essentially an engineering goal. And I don’t want to say that doesn’t belong in science, but I just think it’s definitionally a different goal,” Martin said.
Claiming that LLMs are intelligent or can reason also runs into the challenge of transparency over the methods used to develop them. Despite its name, OpenAI hasn’t been open about how it has used training data or human feedback to develop some of its models.
“The models are getting a lot of feedback about what the parameter weights are for pleasing responses that get marked as good. In the ’90s and Noughties, that would not have been allowed at cognitive science conferences,” Martin said.
Arguing that human-like performance in LLMs is not enough to establish that they are thinking like humans, Martin said: “The idea that correlation is sufficient, that it gives you some kind of meaningful causal structure, is not true.”
Nonetheless, large language models can be valuable, even if their value is overstated by their proponents, she said.
“The disadvantage is that they can gloss over a lot of important findings… in the philosophy of cognitive science, we can’t give that up and we can’t get away from it.”
Not everyone in cognitive science agrees, though. Tali Sharot, professor of cognitive neuroscience at University College London, has a different perspective. “The use of language of course is very impressive: coming up with arguments, and skills like coding,” she said.
“There’s kind of a misunderstanding between intelligence and being human. Intelligence is the ability to learn, right – acquire knowledge and skills.
“So these language models are certainly able to learn and acquire knowledge and acquire skills. For example, if coding is a skill, then it is able to acquire skills – that does not mean it’s human, in any sense.”
One key difference is that AIs don’t have agency and LLMs are not thinking about the world in the same way people do. “They’re reflecting back – maybe we are doing the same, but I don’t think that’s true. The way that I see it, they are not thinking at all,” Sharot said.
Total recall
Caswell Barry, a professor in UCL’s Cell and Developmental Biology department, works on uncovering the neural basis of memory. He says OpenAI made a big bet on an approach to AI that many in the field did not think would be fruitful.
While word embeddings and language models were well understood in the field, OpenAI reckoned that by getting more data and “essentially sucking in everything humanity’s ever written that you can find on the internet, then something interesting might happen,” he said.
“In retrospect, everyone is saying it kind of makes sense, but actually knew that it was a huge bet, and it totally sidestepped a lot of the big players in the machine learning world, like DeepMind. They were not pursuing that direction of research; the view was we should look at inspiration from the brain and that was the way we would get to AGI,” said Barry, whose work is partly funded by health research charity Wellcome, DeepMind, and Nvidia.
While OpenAI might have surprised the industry and academia with the success of its approach, sooner or later it could run out of road without necessarily getting closer to AGI, he argued.
“OpenAI literally sucked in a large proportion of the readily accessible digital texts on the internet, you can’t just like get 10 times more, because you’ve got to get it from somewhere. There are ways of finessing and getting smarter about how you use it, but actually, fundamentally, it’s still missing some abilities. There’re no solid indications that it can generate abstract concepts and manipulate them.”
Meanwhile, if the objective is to get to AGI, that concept is still poorly understood and difficult to pin down, with a fraught history colored by eugenics and cultural bias, he said.
In its paper [PDF], after claiming it had created an “early (yet still incomplete) version of an artificial general intelligence (AGI) system,” Microsoft talks more about the definition of AGI.
“We use AGI to refer to systems that demonstrate broad capabilities of intelligence, including reasoning, planning, and the ability to learn from experience, and with these capabilities at or above human-level,” the paper says.
Abductive reasoning
Cognitive science and neuroscience experts are not the only ones begging to differ. Grady Booch, a software engineer famed for developing the Unified Modeling Language, has backed the doubters by declaring on Twitter that AGI will not happen in our lifetime, or any time soon after, because of the lack of a “proper architecture for the semantics of causality, abductive reasoning, common sense reasoning, theory of mind and of self, or subjective experience.”
The mushrooming industry around LLMs may have bigger fish to fry right now. OpenAI has been hit with a class-action suit for scraping copyrighted data, while there are challenges to the ethics of the training data, with one study showing LLMs harbor numerous racial and societal biases.
If LLMs can provide valid answers to questions and code that works, perhaps that’s enough to justify the bold claims made by their makers – simply as an exercise in engineering.
But for Dr Martin, the approach is insufficient and misses the possibility of learning from other fields.
“That goes back to whether you’re interested in science or not. Science is about coming up with explanations, ontologies and description of phenomena in the world that then have a mechanistic or causal structure aspect to them. Engineering is fundamentally not about that. But, to quote [physicist] Max Planck, insight must come before application. Understanding how something works, in and of itself, can lead us to better applications.”
In a rush to find applications for much-hyped LLM technologies, it might be best not to ignore decades of cognitive science. ®