Teaching computers to comprehend language has been a goal of researchers for many decades. Now, the Chinese tech giant Alibaba and Microsoft claim to have taken a first step towards that lofty aim.
The two companies’ systems were tested using the Stanford Question Answering Dataset (SQuAD), a collection of more than 100,000 questions based on some 500 Wikipedia articles, which has become the battleground for AI research groups vying to be the first to beat standard human performance.
The systems were fed a number of paragraphs from articles on a variety of topics, and then prompted to answer a number of questions based on the available information.
The current human score is 82.3, while the systems built by Alibaba and Microsoft racked up 82.44 and 82.65 points respectively, beating their biological rivals by a hair’s breadth.
“It is our great honour to witness the milestone where machines surpass humans in reading comprehension. That means objective questions such as ‘what causes rain’ can now be answered with high accuracy by machines,” said Luo Si, Chief Scientist of Natural Language Processing at Alibaba.
Some news outlets have seized on the news, claiming that artificial intelligence can now read better than humans, and even that it will reduce the need for human input in an unprecedented way.
However, as many commentators and the companies themselves have noted, this is not ‘real’ comprehension. In other words, the algorithms have no clue what they’re ‘reading’.
Apparent comprehension arises from the systems’ ability to identify patterns and match terms contained within the articles.
Furthermore, the systems were only fed cleanly formatted materials from Wikipedia that were guaranteed to contain answers. ‘Polluting’ the text with gibberish, or asking the systems to infer meaning from several sentences, causes the process to break down, which means real comprehension is still a long way off.
Before that happens, though, researchers hope to implement similarly designed systems in museums, customer service establishments, and online systems designed to provide answers to medical inquiries.
Meanwhile, research teams around the world are training AI systems to solve SAT-style math problems and basic science questions.