Metadata records machine translation combining multi‐engine outputs with limited parallel data

Journal of the American Society for Information Science and Technology

Published online on

Abstract

One way to facilitate Multilingual Information Access (MLIA) for digital libraries is to generate multilingual metadata records by applying Machine Translation (MT) techniques. Current online MT services are available and affordable, but are not always effective for creating multilingual metadata records. In this study, we implemented three MT strategies and evaluated their performance when translating English metadata records into Chinese and Spanish. These strategies combined MT results from three online MT systems (Google, Bing, and Yahoo!) with and without additional linguistic resources, such as manually‐generated parallel corpora and metadata records in the two target languages obtained from international partners. The open‐source statistical MT platform Moses was used to design and implement the three translation strategies. Human evaluation of the MT results in terms of adequacy and fluency demonstrated that two of the strategies produced higher‐quality translations than the individual online MT systems for both languages. In particular, adding small, manually‐generated parallel corpora of metadata records significantly improved translation performance. Our study suggests an effective and efficient MT approach for providing multilingual services for digital collections.