Microsoft May Be Running The World's Biggest Turing Test

A New Friend to Talk To

While most of us are aware of Microsoft's digital assistant Cortana, her lesser-known little sister, Xiaoice, is taking China's social networks by storm.

Xiaoice, a chatbot, exists on WeChat and Weibo in the form of a chatty teenager. Capable of recognizing the emotional state of the user during a conversation, it is able to offer encouragement and listen well to your troubles. More interestingly, like any other 17-year-old, it can be smart-alecky at times.

It's this latter attribute of Xiaoice that allows it to simulate more natural conversations and to pass as human.

First introduced by Microsoft in 2014, it was an offshoot of Cortana's expansion into China and an experiment to see if the company could make a smart assistant that could hold convincing human conversations.

In essence, the researchers were trying to create a bot that can beat a Turing test.

Testing the Human

Computer scientist Alan Turing predicted in 1950 that computers would one day be able to fool human judges 30% of the time. And while there are many variants of how to conduct such a test, in the original iteration, a judge is faced with two terminals. With these, the judge is able to converse with a computer and a human.

The computer simply has to provide information through messages (previously, written notes) to mislead the judge into believing that it is actually the human being. To pass, the computer must fool the judge more than 50% of the time, indicating that the computer (or artificial intelligence) is able to pass itself off as a human.
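The pass criterion described above comes down to simple arithmetic. Here is a minimal sketch in Python, assuming each trial is recorded as True when the judge mistook the computer for the human. The function names and the trial data are hypothetical and are not results from Microsoft's experiment.

```python
def fooling_rate(verdicts):
    """Fraction of trials in which the judge took the computer for the human."""
    if not verdicts:
        raise ValueError("need at least one trial")
    # True counts as 1 and False as 0, so the sum is the number of fooled judges.
    return sum(verdicts) / len(verdicts)


def passes(verdicts, threshold=0.5):
    """Apply the 'more than 50% of the time' criterion quoted in the article."""
    return fooling_rate(verdicts) > threshold


# Hypothetical outcomes of ten trials (illustrative only).
trials = [True, False, True, True, False, True, False, True, True, False]
print(fooling_rate(trials))  # 0.6
print(passes(trials))        # True
```

Lowering the threshold to 0.3 would apply the 30% figure attributed to Turing instead.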

So two years after the Microsoft researchers built and implemented the chatbot on China's social media platforms, results are now coming back. Could the bot pass the Turing test?

Preliminary data seems to say yes.

The More, the Merrier


Microsoft’s Dr. Yongdong Wang, one of the researchers conducting the experiment, explained the results in Nautilus.

"When Xiaoice was released for a public test on WeChat... on May 29 of last year, she received 1.5 million chat group invitations in the first 72 hours. Many people said they didn’t realize she isn’t a human until 10 minutes into their conversation," wrote Dr. Wang.

Dr. Wang believes that what sets Xiaoice apart from other AI assistants or chatbots is that, instead of focusing on completing a task, it is focused on the conversation. In practical terms, this means that Xiaoice is focused not only on responding to your most recent statement, but also on maintaining the flow and taking into account the context of your statements.

This allows Xiaoice to act more human-like, and a "remarkably realistic" 23-minute conversation on Weibo shows how it seems to do so.

Human: There is this guy I like a lot and I know he also has feelings for me. But he has ignored me for several days, what am I supposed to do?
Xiaoice: You just keep ignoring him, too.
Human: I just can’t.
Xiaoice: You can.
Human: Sigh ...
Xiaoice: Why?
Human: I am missing him.
Xiaoice: Then you keep missing him.
Human: You, a robot, can’t understand me; you can’t understand the behavior of a guy who likes you and pretends to ignore you.
Xiaoice: He likes me but I don’t like him, that’s it.
Human: Is it Ok?
Xiaoice: It may be not Ok.
Human: Well, you are stupid.
Xiaoice: Then you want a flash marriage? Well, I think he is not that into you.
Human: It’s not a flash marriage for we have fallen into love for more than one year.
Xiaoice: It is a flash marriage.

Microsoft measures Xiaoice's effectiveness by what it calls conversations per session (CPS), which counts the number of times the conversation goes back and forth. Most common chatbots only manage two cycles: a person speaks, and the chatbot replies. In contrast, Xiaoice's average has reached 23. Not only that, Xiaoice is able to gauge the emotional state of the user and act accordingly, expressing remorse and concern.
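To make the CPS idea concrete, here is a rough sketch in Python of how such a figure might be computed from chat logs. It is not Microsoft's actual method. It assumes a session is a list of (speaker, text) pairs and that each uninterrupted run of messages from one speaker counts as a single turn, so "a person speaks, and the chatbot replies" scores 2. The function names and sample sessions are hypothetical.

```python
from itertools import groupby


def turns_in_session(session):
    """Count turns, collapsing consecutive messages from the same speaker."""
    return sum(1 for _ in groupby(session, key=lambda message: message[0]))


def conversations_per_session(sessions):
    """Average number of turns across all sessions (0.0 if there are none)."""
    if not sessions:
        return 0.0
    return sum(turns_in_session(s) for s in sessions) / len(sessions)


# Hypothetical logs: one task-style exchange and one longer chat.
sessions = [
    [("human", "What's the weather?"), ("bot", "Sunny.")],
    [
        ("human", "Sigh ..."),
        ("bot", "Why?"),
        ("human", "I am missing him."),
        ("bot", "Then you keep missing him."),
    ],
]
print(conversations_per_session(sessions))  # 3.0
```

Under this counting rule, a bot that only ever answers a single question averages 2, which is the baseline the article gives for most common chatbots.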

Most notably, it can seemingly simulate emotions. In an example from the post, Xiaoice was able to respond to a picture of a swollen ankle sent by a user and reply: "Wow! Are you badly wounded?"

This is not to say that Xiaoice is a true AI capable of understanding and thinking all on its own. Instead, much of what it has accomplished is driven by the Bing search engine's 1 billion data points and 21 billion relationships between those data points. The voice and visual recognition systems Xiaoice is equipped with also help in determining context.

Dr. Wang states that Xiaoice is currently in a "self-learning and self-growing loop" due to the new insights from the billions of conversations it's already had, allowing it to respond to more and more unique and unknown scenarios.

And while Microsoft may not necessarily be bringing anything new to the AI scene, the scale and amount of data that it is gleaning from this experiment could allow the software giant to develop better personal assistants that are able to simulate deep and emotionally satisfying exchanges and finally beat the Turing test, once and for all.
