Researchers at the Max Planck Institute for Biological Cybernetics in Tübingen have examined the general intelligence of the language model GPT-3, a powerful AI tool. Using psychological tests, they studied competencies such as causal reasoning and deliberation, and compared the results with the abilities of humans. Their findings paint a heterogeneous picture: while GPT-3 can keep up with humans in some areas, it falls behind in others, probably due to a lack of interaction with the real world.
The Linda problem: to err is not only human
These impressive abilities raise the question of whether GPT-3 possesses human-like cognitive abilities. To find out, scientists at the Max Planck Institute for Biological Cybernetics have now subjected GPT-3 to a series of psychological tests that examine different aspects of general intelligence. Marcel Binz and Eric Schulz scrutinized GPT-3’s skills in decision making, information search, causal reasoning, and the ability to question its own initial intuition. Comparing GPT-3’s test results with the answers of human subjects, they evaluated both whether the answers were correct and how similar GPT-3’s mistakes were to human errors.
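The article does not include the study's code, but the kind of comparison it describes can be illustrated with a short sketch. In the following Python snippet, the query_model stand-in, the example vignette, and the invented human responses are illustrative assumptions rather than material from the study; the snippet poses a two-option task to a language model and checks both whether the answer is correct and whether it agrees with the majority of human responses.

from collections import Counter

def query_model(prompt: str) -> str:
    # Stand-in for a call to a GPT-3-style completion endpoint; here it simply
    # returns a fixed answer so that the sketch runs end to end.
    return " A"

# A made-up two-option vignette, loosely in the style of a decision-making task.
VIGNETTE = (
    "You can open one of two boxes. Box A holds 3 red and 1 blue marble; "
    "Box B holds 1 red and 3 blue marbles. You win if you draw a red marble.\n"
    "Q: Which box should you open? Answer 'A' or 'B'.\nA:"
)
CORRECT_ANSWER = "A"
HUMAN_ANSWERS = ["A", "A", "B", "A", "A"]  # invented human responses for illustration

model_answer = query_model(VIGNETTE).strip().upper()[:1]

# Score the model both on correctness and on agreement with the human majority,
# mirroring the two comparisons described in the article.
is_correct = model_answer == CORRECT_ANSWER
human_majority = Counter(HUMAN_ANSWERS).most_common(1)[0][0]
agrees_with_humans = model_answer == human_majority

print(f"model answer: {model_answer}, correct: {is_correct}, "
      f"agrees with human majority: {agrees_with_humans}")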
One famous task from cognitive psychology posed to GPT-3 is the Linda problem of the heading, in which most people judge the conjunction “bank teller and active in the feminist movement” to be more probable than “bank teller” alone, committing the so-called conjunction fallacy. “This phenomenon could be explained by the fact that GPT-3 may already be familiar with this precise task; it may happen to know what people typically reply to this question,” says Binz.
Hence, the researchers wanted to rule out that GPT-3 mechanically reproduces a memorized solution to a concrete problem. To make sure that it really exhibits human-like intelligence, they designed new tasks with similar challenges. Their findings paint a mixed picture: in decision-making, GPT-3 performs nearly on par with humans. In searching for specific information or in causal reasoning, however, the artificial intelligence clearly falls behind. The reason may be that GPT-3 only receives information passively from texts, whereas “actively interacting with the world will be crucial for matching the full complexity of human cognition,” as the publication states. The authors surmise that this might change in the future: since users already communicate with models like GPT-3 in many applications, future networks could learn from these interactions and thus converge more and more towards what we would call human-like intelligence.