Translation was traditionally considered a job in which the magic human touch would always ultimately trump a machine. That may no longer be the case, as a Microsoft AI translator just nailed one of the hardest challenges: translating Chinese into English with accuracy comparable to that of a bilingual person.
Chinese is so difficult a language that it takes years for a non-native speaker to just about manage the 3,000 characters needed to read a newspaper. Previous attempts at automatic translation have amused the world, with gems such as "hand grenade" to indicate a fire extinguisher or a mysterious "whatever" dish on a restaurant menu.
"For alphabetic languages, there's what they call a virtuous loop between the writing, speaking and listening — those three categories constitute one composite skill," linguist David Moser told the Los Angeles Times. "But the problem with Chinese [...] is it breaks that loop. Speaking does not necessarily help your reading. Reading doesn't necessarily help your writing." These are three different skills that, when learning Chinese, have to be mastered in parallel.
After years of working on what seemed a nearly impossible feat, Microsoft engineers finally achieved so-called "human parity" in translating a sample of sentences from Chinese news articles into English.
The team used a sample of 2,000 sentences from online newspapers that had been previously translated by a professional. Not only did they compare the machine's job with that of the human translator, but they also hired a team of independent bilingual consultants to keep an eye on the process.
“Hitting human parity in a machine translation task is a dream that all of us have had,” Xuedong Huang, a technical fellow in charge of Microsoft’s speech, natural language and machine translation efforts, told the company's blog. “We just didn’t realize we’d be able to hit it so soon.”
Teaching a system to translate a language is particularly complex because two different translations of the same word may sound equally right. People choose different words depending on context, mood and who they are communicating with.
“Machine translation is much more complex than a pure pattern recognition task,” Ming Zhou, assistant managing director of Microsoft Research Asia, told the Microsoft blog. “People can use different words to express the exact same thing, but you cannot necessarily say which one is better.”
The next challenge, he said, will be to test the new AI translator on real-time news articles.