
The Data Daily

The Limitations in Language are the Limitations of AI

The question around sentience in AI continues to oscillate between confusion and clarity. The discussion was reignited after Blake Lemoine, an engineer at Google who was working with the company’s latest chatbot, became convinced that the bot was sentient. Lemoine published a transcript of his conversation with LaMDA (Language Model for Dialogue Applications), arguing that ‘there is no scientific definition of sentience’. Lemoine said that after listening to LaMDA, he believed that the bot ‘spoke from the heart’ and was comparable to a seven- or eight-year-old in terms of its emotional intelligence.

Lemoine’s claims were met with skepticism by most. But the conclusion was not as obvious as it seemed: for a senior software engineer at Google to become fully convinced that an AI programme was conscious suggested there was more to the system than a simple chatbot. LaMDA was, after all, an impressive large language model that could predict how a conversation was most likely to unfold and keep it going in a convincingly human-like manner.

The crux of the debate remains the parameters on which a machine can be judged for its likeness to a human. Language naturally became one of the first aspects to be examined. If LLMs have become so powerful and seemingly intelligent owing to their grasp of language, what has held them back from becoming truly intelligent?

Meta’s chief AI scientist Yann LeCun discussed the subject in an essay he authored in Noema. LeCun explained that a machine trained solely on language could never hope to possess human intelligence. The notion that human knowledge is purely linguistic is a narrow, traditional view that arose in the 19th and 20th centuries. According to LeCun, this archaic view still persists in several so-called intellectual circles. It was this perspective that pushed much of the early work in symbolic AI. To these theorists, if a machine could give the appropriate answer to a question at the right time, it was intelligent.

This idea formed the basis of the Turing test. However, the AI community has come to recognise that the Turing test measures how well a machine can imitate human intelligence rather than whether it is truly intelligent.

Echoing Noam Chomsky, he argued that language is not a ‘clear and unambiguous vehicle for clear communication’. Interpersonal communication for humans, LeCun said, is not restricted to language; there is in fact no ‘perfect vehicle’ of communication for us. More often than not, we communicate through our practical experiences of the world, even through social rituals and customary practices.

Meanwhile, the one thing that has so far been treated as important in training machines is contextual understanding. Anything we say depends on the context of what we are talking about, and LLMs are trained on the background of every word, the words that typically surround it, to determine the context for each.
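To make that idea concrete, the sketch below is a deliberately simplified, hypothetical illustration, a toy word-count model rather than LaMDA’s neural network or any production training code. It shows only the underlying principle: a model that has merely seen which words tend to follow which other words can still extend a prompt in a statistically plausible way. The corpus and function name are invented for the example.

import random
from collections import defaultdict, Counter

# A tiny toy corpus standing in for the web-scale text a real LLM is trained on.
corpus = (
    "the model reads text and predicts the next word . "
    "the model learns which words tend to follow other words . "
    "a conversation continues when the model predicts a plausible next word ."
).split()

# For every word, count which words have been seen following it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def continue_text(prompt, n_words=8):
    """Extend a prompt by repeatedly sampling a likely next word from the counts."""
    words = prompt.split()
    for _ in range(n_words):
        candidates = following.get(words[-1])
        if not candidates:
            break  # no statistics for this word, so the toy model stops
        # Pick a continuation in proportion to how often it followed the last word.
        words.append(random.choices(list(candidates), weights=list(candidates.values()))[0])
    return " ".join(words)

print(continue_text("the model"))

Real LLMs replace the count table with a neural network conditioned on far longer stretches of context, which is what makes their imitation so convincing, but the principle of predicting likely continuations from patterns in text is the same.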

This method of training has limited LLMs in a way. LeCun admits that it has produced ‘shallow’ models despite the breadth of their understanding of words. He compares them to students who can repeat the definition of a word without understanding what it means.

Language is a bridge in that it quickens the process of communication: a large chunk of information can be conveyed briefly, and that is the power of language. However, decoding language is not cheap, nor does language encompass every piece of knowledge.

LeCun noted that the perceived limitations of AI were in fact the disadvantages that came with language itself. In a tweet LeCun asked, “The broader question being: How much of human knowledge is captured in all the text ever written? To which my answer is: not much. Similarly: How much of non-human primate knowledge is captured in all primate communications? Answer: almost none. Most knowledge is non verbal.”

But LeCun’s views aren’t necessarily set in stone. Noted NYU professor and author Gary Marcus, who often locks horns with LeCun on Twitter, had a quick reaction to the essay. While he agreed with the view that AI systems that learn purely from linguistic input are “doomed to fail,” he said it was strange to downplay the importance of language itself. “I think the essay itself is weak and the claim that ‘not much’ knowledge is captured in text is absurd. Sure, text has to be grounded, but humans would not be where we are in science and technology without written word,” he argued.

Calling the essay flawed, Marcus stated that there were reasons why LeCun’s observations fell flat. He said that LeCun underestimated just how much can be learned from linguistic input once a basic understanding is in place. Citing the example of Helen Keller, Marcus said that language has an innateness to it that cannot be replaced, and he contended that nobody considers all knowledge to be linguistic.
