Beyond Words - Language Blog

Evaluating Machine Translation: The Present and Future of Multilingual Search

A recent study conducted by researchers at The University of Granada’s School of Translation and Interpretation attempts to analyze and evaluate the results of machine translations done with popular online tools such as Google Translate, Promt, and WorldLingo. The study was published in this month’s issue of Translation Journal, and it raised interesting questions for me about the possible uses for online machine translation.

Looking at the findings, it should come as no surprise that all of the machine translation tools produced poor results in terms of the number of errors, or that after the translations passed through a round of human editing, the number of errors was drastically reduced. What is interesting, though, is that certain online tools performed better than others, and specific language combinations produced varying results. The graph below shows results from German into Spanish (the researchers used EvalTrans software). The best translation machine is the one showing the lowest word error rate (WER). Check out the study for more charts and an explanation of the sentence error rate (SER).
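For readers curious what a metric like WER actually measures: it is the minimum number of word-level edits (substitutions, insertions, deletions) needed to turn the machine output into a reference translation, divided by the length of the reference. The sketch below is a minimal illustration of that idea, not the EvalTrans implementation used in the study:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: minimum word-level edits (substitutions,
    insertions, deletions) turning hypothesis into reference,
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard Levenshtein dynamic-programming table over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("sit" for "sat") and one deleted word ("the")
# against a six-word reference gives a WER of 2/6, about 33%.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A lower WER means the machine output needed less post-editing to match the human reference, which is exactly how the study ranks the online tools.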

Doctors Lola García-Santiago and María-Dolores Olvera-Lobo do a thorough job of laying out the methodology that they followed, and of acknowledging the difficulties inherent to such studies. They write that,

Evaluation of machine translation is an unresolved research problem that has been addressed by numerous studies in recent years. The most extensively used assessment tools are classified into two major groups: automatic objective methods, and subjective methods (Tomás, Mas & Casacuberta, 2003). The objective evaluation methods compare a set of correct translations of reference against the set of translations produced by the translation software under evaluation. The units of measurement most often used work at the lexical level, comparing strings of text.

For García-Santiago and Olvera-Lobo, a possible solution to the challenge of multilingual information retrieval lies in the potential of cross-language Question-Answering Systems (QA Systems). Their model takes into account only the scope of this particular study, but it led me to think in terms of potential multilingual data retrieval for the internet as a whole. Considering the amount of information being created and stored online, it simply isn’t possible for every document that a searcher might find useful (web page, journal article, etc.) to be translated into the language that every searcher speaks. Rather than focus on whole documents, García-Santiago and Olvera-Lobo’s cross-language QA System is very much in tune with the current trend in online technology that aims to improve search capability by creating a natural language data retrieval system — one in which questions are entered as search queries and a search engine retrieves not whole documents, but discrete sections from documents that are found to contain answers to the question. As the researchers put it,

Cross-language question-answering systems can be the solution, as they pursue the search for a minimal fragment of text, not a complete document.

This would enable the evolution of the Semantic Web, towards which we are already headed, to deal properly with multilingual content. With the rise of internet use and website creation in China and throughout the developing world, the web has the potential to become a universal source of research and knowledge; however, that can’t happen until we deal with the basic question of language. For professionals in the language industry, the question is amplified a bit: who will be the ones opening up the web to cross-linguistic data retrieval, humans or machines? I suspect that the simple answer is both.

As more research is conducted, and as the developers behind the powerful online translation tools continue to improve their machines (which, in Google’s case, they are already doing with the help of everyone who uses the tool and suggests better translations or participates in their Translator Toolkit), we will likely see a hybrid effort with developers and professional translators working together to break language barriers online.

In the meantime, while these advances are being made behind the tech and academic scenes, Language Service Providers are trying new and interesting ways to combine human and machine translation efforts to improve business. Several large-scale translation agencies offer free machine translation services through their websites, or by developing tools for web publishers. In most cases, these tools are provided to increase the number of visitors to their corporate sites and to raise brand visibility. Since the quality of machine translation remains so low, all professional translation jobs must still be handled by human translators, and the free tools are just a way to get people in the door, so to speak.

Other translation companies have developed tools that provide machine translation for web publishers to place on their sites, making it easy for a visitor to switch between English, Japanese, German, or even Welsh. The downside, of course, is low quality. Translation tools that offer only automated translation are good first steps for web publishers who can’t afford to pay professionals to do it right, but the multilingual visitor will get no more than the gist of the content. For our own part, ALTA’s developers are working on a side project called Virtual Language that combines the usability of the major automated translation services with a web analytics feature that shows how many times a site was translated, and which languages were chosen. In addition to site translation data, Virtual Language will offer suggestions on the best translation machines for certain language combinations, as determined by our own team of professional translators using the EvalTrans software in much the same way that the researchers from The University of Granada conducted their study. This insight from professional translators will help developers, users, and publishers with quantitative data on machine translation quality, and the web analytics provide hard data displayed in charts like the following:
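The analytics side of a tool like this comes down to counting translation events by page and target language. The sketch below is purely illustrative — the field names and event format are hypothetical, not Virtual Language’s actual schema:

```python
from collections import Counter

# Hypothetical event log: one record per on-site translation request.
# (Field names are illustrative, not an actual analytics schema.)
events = [
    {"page": "/about", "target_lang": "es"},
    {"page": "/about", "target_lang": "es"},
    {"page": "/blog/post-1", "target_lang": "ja"},
    {"page": "/about", "target_lang": "de"},
]

# How many times the site was translated, and into which languages.
total_translations = len(events)
translations_per_language = Counter(e["target_lang"] for e in events)

print(total_translations)            # 4
print(translations_per_language)     # Counter({'es': 2, 'ja': 1, 'de': 1})
```

Aggregates like these are what would feed the charts mentioned above: a publisher can see at a glance which languages their visitors actually request.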

By providing this analytics platform to bloggers and business sites, we hope to encourage publishers to seek professional translation. The data can be used to justify the investment in having professionals translate online content. We think of it as harnessing the power of machines to help provide work for humans, and as most translators would probably agree, we’d like to see a future that keeps the balance that way. As technological advances continue to improve the way we conduct research, do business, and get media online, multilingual search will undoubtedly become an integral aspect of the work we do.