Language unites and separates us. It allows us to express our ideas and shared cultural values, but it can also fence us in when we encounter an unfamiliar tongue. In an increasingly connected world, companies are racing to tear down these language barriers with sophisticated new translation technologies that enable them (and us) to communicate with a global audience.
When a business expands into a foreign market, one of its top priorities is providing accurate and culturally specific translations of any company- or product-related information, marketing materials and customer support data. This process of localization has become much more challenging due to the vast amount of communication facilitated by the Internet. Today, localization is a $35 billion annual industry.
While many assume that English is the lingua franca of the Web, it’s not as prevalent as it once was. In fact, the percentage of online spending power that could be reached through English-only communication fell from 48 percent in 2009 to 36 percent in 2012, according to localization consulting firm Common Sense Advisory. To reach 80 percent of the world’s total online population, you need to communicate in at least 12 languages, and to reach 98 percent you need to translate across 48 languages.
“It’s not really possible to get mass adoption without localization,” said Ben Sargent, content globalization strategist at Common Sense Advisory.
“When you first enter a market, the early adopters for any new product — whether it’s a personal care product or a tech product — tend to be internationally focused and English friendly, so you might think you’re doing well. But that’s a thin layer of people. As demand grows for the product, that next layer is less comfortable with English, and once you get about 20 percent into the market, the English tolerance drops off.”
Having high-quality translation is a requirement for succeeding in foreign markets, and many tech firms are betting on machine translation — computer systems that automatically scan and translate written or spoken words — to handle the enormous volume of language transmitted in the modern world. The potential commercial value is enormous.
“Businesses are increasingly using platforms such as Facebook and Twitter to communicate with customers and better understand consumer sentiment,” explained Abdessamad Echihabi, vice president of research and product development at SDL, one of the world’s largest language technology firms.
“If businesses can enable customers who speak different languages to communicate online with each other in near real-time, the social media opportunity is substantially magnified. Real-time translation of online conversations and comments will enable businesses to interact with billions, rather than millions, of people.”
Microsoft’s new machine translator for Skype can translate speech from one language into another in close to real time, while Google Translate converts text and can be integrated into mobile apps, Web interfaces and browsers.
eBay is also making a major effort to provide automated translation on its product listings, but its approach is unique: rather than measuring machine-translated text against its human-written original, the company is optimizing toward user behavior and experience.
“We know the item, we can give you the key aspects of the description and even related products from the database for comparison, so the machine is able to translate accurately using these information sources,” said Hassan Sawaf, senior director of machine translation at eBay.
“The next thing is using behavioral data to learn how to translate best. If I can see a user did a query, clicked on a product, read the description and purchased it, I can make the assumption that the translation I showed this user is good because he bought the item. We can teach the machines to translate better based on that feedback — the system is learning.”
This behavior-based model could also extend beyond the transactional level.
“When we learned over time what choice of words was better, we used that methodology outside of commerce,” added Sawaf. “And even then, we saw the quality of translation was improved, so it has vast uses outside of commerce as well.”
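The feedback loop Sawaf describes can be illustrated with a minimal sketch. It is not eBay's system: the class name, the smoothing priors and the two German candidate translations below are all hypothetical, chosen only to show how purchase behavior could be turned into a score for competing translations of the same listing.

```python
from collections import defaultdict


class TranslationFeedback:
    """Tracks how often each candidate translation is followed by a purchase."""

    def __init__(self, prior_shown=2.0, prior_bought=1.0):
        # Hypothetical smoothing priors: a candidate with no data starts at 0.5
        # instead of an undefined or extreme score.
        self.prior_shown = prior_shown
        self.prior_bought = prior_bought
        self.shown = defaultdict(int)
        self.bought = defaultdict(int)

    def record(self, candidate, purchased):
        """Log one impression of a translation and whether the user bought."""
        self.shown[candidate] += 1
        if purchased:
            self.bought[candidate] += 1

    def score(self, candidate):
        """Smoothed purchase rate for a candidate translation."""
        return (self.bought[candidate] + self.prior_bought) / (
            self.shown[candidate] + self.prior_shown
        )

    def best(self, candidates):
        """Pick the candidate with the highest smoothed purchase rate."""
        return max(candidates, key=self.score)


# Hypothetical data: two German renderings of "phone case".
feedback = TranslationFeedback()
for purchased in [True, True, False, True]:
    feedback.record("Handytasche", purchased)
for purchased in [False, False, True, False]:
    feedback.record("Telefonkasten", purchased)

best = feedback.best(["Handytasche", "Telefonkasten"])
print(best)
```

A purchase is only a rough proxy for translation quality, since price, photos and shipping also influence the decision, so a real system would presumably need far more data and controls than this sketch suggests.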
According to Sargent, apart from Microsoft and Google’s services, the third major pillar of modern machine translation is a free, open-source engine called Moses, which provides statistical, phrase-based translation.
Philipp Koehn, chair of machine translation at the University of Edinburgh and one of the original creators of Moses, believes there are significant advantages to automated translation.
“Machine translation is two to three times faster than human translation, and that’s really economically important. I think whenever the language is formalized and structured, machine translation is a great option,” he said.
Although a human translator can typically work through 2,000 to 3,000 words per day, the big data environment means that enterprises may need to translate millions of words on a daily basis.
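To put that gap in perspective, here is a back-of-the-envelope calculation. The daily volume of 5 million words is a hypothetical figure chosen for illustration; only the 2,000 to 3,000 words-per-day range comes from the estimate above.

```python
import math


def translators_needed(words_per_day, words_per_translator_per_day):
    """Number of full-time human translators needed to keep up with a daily volume."""
    return math.ceil(words_per_day / words_per_translator_per_day)


daily_volume = 5_000_000  # hypothetical enterprise volume, in words per day

fewest = translators_needed(daily_volume, 3000)  # fastest translators
most = translators_needed(daily_volume, 2000)    # slowest translators

print(f"Between {fewest} and {most} translators working every day")
```

Under those assumptions, the workload would occupy roughly 1,700 to 2,500 translators every single day, before accounting for review, multiple target languages or days off.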
“Humans cannot be replaced by machines, but the amount of data we have is growing faster than the population of translators. So machine translation will have to fill in that gap,” explained Sawaf.
Machine translation may be the solution to the volume problem, but in terms of accuracy, it often falls behind human-led efforts. Koehn believes several elements have to come together for accurate machine translation.
“There are three big ingredients: linguistics, machine learning and data collection,” he noted. “Given the vast amount of data, how do you collect it and keep the translation fast? You want a model that can identify good English versus bad English, but the model has to know trillions of words.
“The variables are huge, and at some point it just becomes computationally burdensome. The challenge now is not so much the core insights, but coming up with algorithms that are fast enough to actually do all that processing.”
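The idea of a model that tells good English from bad can be shown with a toy example. The sketch below is a bigram language model with add-one smoothing, trained on three made-up sentences rather than trillions of words. It is an illustration of the general technique only, not the model used in Moses or any production system, and the function names and corpus are assumptions.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count single words (as contexts) and word pairs in a small corpus."""
    contexts = Counter()
    bigrams = Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a sentence; higher means more fluent under the model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        # Add-one smoothing keeps unseen word pairs from scoring zero.
        total += math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
    return total


# Hypothetical three-sentence training corpus.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a mat",
]
model = train_bigram_model(corpus)

fluent = log_prob("the cat sat on the mat", model)
scrambled = log_prob("mat the on sat cat the", model)
print(fluent > scrambled)
```

Both test sentences contain the same words, so the model prefers the first purely because of word order. Scaling the same counting idea to the vocabulary and corpus sizes Koehn mentions is where the computational burden comes from.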
The obstacles are daunting. While automated language systems can typically grasp the type of clean, structured language found in, say, a technical manual or product review, more advanced wording presents a higher order of difficulty. For a machine translation to be accurate, the system has to understand broad contexts, slang, euphemism, figures of speech and variations in tone and sentence structure among different countries, regions, industries and even individuals. Mastering such nuances would, in fact, be a leap toward the development of A.I.
“Machines understanding natural language is one of the core challenges in artificial intelligence,” said Koehn. “You might not get machine translation perfectly right until you achieve artificial intelligence, but for now we want to get it good enough to be usable.”
One promising advance lies in Dragon Assistant, a new tool designed for Intel’s RealSense technology. Dragon Assistant uses natural language understanding to turn normal, everyday speech into actions on a user’s Ultrabook, laptop, tablet, 2 in 1 or All-in-One computer. It’s also able to learn a user’s unique speech patterns from repeated use.
The next step in breaking down language barriers would be a device that not only understands natural human speech but can also translate a back-and-forth conversation into multiple languages as a user interacts with a non-English speaker.
While Star Trek’s universal translator is still a long way off, great strides are being made in the implementation of semantic-based machine translation, as well as the use of neural networks to amplify machine learning capabilities.
So, are we headed toward a future in which you can speak with anyone in the world without losing the meaning or the subtlety of your native voice? That scenario is closer than you might think.
“There are already apps that can translate menus and street signs when you’re in a foreign country,” said Koehn. “It’s not going to be perfect, but we’ll get there, and it’ll be much more common for people to use translation.”