Researchers from the University of Maryland, working together with specially designed computer systems, have devised 1,213 questions that stump artificial intelligence algorithms. They hope the questions will help them better understand how computers ‘think’ and move us closer to artificial systems capable of truly parsing human meaning in text.
The paper detailing their exploits was published in the journal Transactions of the Association for Computational Linguistics.
“Most question-answering computer systems don’t explain why they answer the way they do, but our work helps us see what computers actually understand,” said Jordan Boyd-Graber from UMD, the senior author of the paper. “In addition, we have produced a dataset to test on computers that will reveal if a computer language system is actually reading and doing the same sorts of processing that humans are able to do.”
According to the authors, the problem with most current approaches to improving question-answering programmes is that human question writers are typically unaware of which specific elements throw the computer off, while computer-generated questions are often either too formulaic or nonsensical.
The new interface, however, allows users to see how a computer ‘thinks’ when presented with a question by displaying its guesses, ranked in order, on the screen, and highlighting the words that triggered it to make the respective guess.
With the new interface, human users can query computers with countless different questions and manipulate them, without changing their meaning, so as to find weaknesses that can be further exploited or used to improve machine learning algorithms.
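The article does not describe the system's internals, but the two signals it mentions, ranked guesses and the words that triggered them, can be illustrated with a deliberately minimal sketch. The toy "guesser" below (a word-overlap scorer over a made-up three-entry knowledge base, not the authors' actual model) shows how exposing triggers lets a human writer rephrase a question without changing its meaning and watch the model's evidence collapse:

```python
# Hedged toy illustration, NOT the UMD system: a word-overlap "guesser"
# over a tiny invented knowledge base. It returns guesses ranked by
# score along with the question words that triggered each guess.

KNOWLEDGE = {
    "Mark Twain": "author of the adventures of huckleberry finn and tom sawyer",
    "Charles Dickens": "author of oliver twist and a tale of two cities",
    "Jane Austen": "author of pride and prejudice and emma",
}

def guess(question):
    """Rank candidate answers by how many question words overlap their entry."""
    q_words = {w.strip("?,.'\"") for w in question.lower().split()}
    ranked = []
    for answer, text in KNOWLEDGE.items():
        triggers = sorted(q_words & set(text.split()))  # the 'highlighted' words
        ranked.append((answer, len(triggers), triggers))
    ranked.sort(key=lambda r: r[1], reverse=True)
    return ranked

original = "Which author wrote The Adventures of Huckleberry Finn?"
paraphrase = "Which novelist penned Huck Finn's story?"

for q in (original, paraphrase):
    top, score, triggers = guess(q)[0]
    print(f"{q!r} -> {top} (score {score}, triggered by {triggers})")
```

The original question matches six of Mark Twain's entry words, so it scores highly; the paraphrase means the same thing to a human but shares no surface words with the entry, so every score drops to zero. Seeing exactly which words carried the guess is what lets a writer craft such meaning-preserving edits deliberately.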
The questions generated using the human-computer collaboration revealed six different elements of language that consistently stump computers. These elements were then sorted into two categories: the first encompasses paraphrasing, distracting language and unexpected context, while the second comprises logic- and calculation-based reasoning skills.
“Humans are able to generalise more and to see deeper connections,” Boyd-Graber said. “They don’t have the limitless memory of computers, but they still have an advantage in being able to see the forest for the trees. Cataloguing the problems computers have helps us understand the issues we need to address, so that we can actually get computers to begin to see the forest through the trees and answer questions in the way humans do.”