The test, known as the Wechsler Preschool and Primary Scale of Intelligence exam, measures performance in the following categories: information, vocabulary, word reasoning, comprehension, and similarities. While ConceptNet4 did well on vocabulary and similarities, it did poorly on word reasoning and comprehension. The AI had difficulty threading several concepts together, producing rather bizarre answers like “epileptic fit” to simple questions like “why do we shake hands?” The machine’s results were equivalent to those of an average 4-year-old, and below average for a 5- to 7-year-old.
In order for the AI to improve, it will likely need additional natural language processing capabilities. The AI used for the exam actually dates back to 2012 and wasn’t leveraging many of the most up-to-date techniques. There has been tremendous progress in AI over the past three years, and a much more interesting experiment would have had the AI leverage current advancements while taking the exam. If that had occurred, maybe the results would have more closely mirrored those of the AI that scored as well as the average 11th-grade student on geometry SAT questions.