How Not to Translate a Videogame

by
Ryan Davis
13 Mar 2019

At the Brisbane Azure Users Group's March 2019 Meetup, I gave a presentation titled 'How Not to Translate a Videogame'. The talk explored applying automated real-time text detection, recognition and machine translation techniques to a relatively obscure Japanese visual novel, '12Riven', the final game in a series of four and the only one without an official or fan translation available.

In the session, we first looked at a basic translator that uses a combination of LINQPad, Azure OCR and Azure Text Translate to perform text detection, character recognition and translation respectively. Because the results varied but tended towards the incomprehensible, we then looked at using Azure Custom Translator to train our own model on ~35K ja/en pairings taken from the translation of an earlier Infinity series game. The custom model provided a noticeable lift in translation quality, bringing it to a level that a charitable assessor might consider broadly understandable (though still very far from the quality a human translator would provide).
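To make the translation step concrete, here is a minimal sketch in Python of how recognised lines might be sent to the Translator v3 REST endpoint, with an optional custom model selected by its category ID. This is not the code from the talk (which was written in LINQPad); the helper names `build_translate_request` and `parse_translate_response` are hypothetical, and the key, region and category values are placeholders. The sketch only builds the request and parses a response, so it makes no network call.

```python
import json
from urllib.parse import urlencode

# Global Translator v3 endpoint (assumed; a regional endpoint may apply).
TRANSLATE_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(lines, key, region, category=None):
    """Return (url, headers, body) for a ja -> en translation request.

    `lines` is a list of strings, e.g. text recognised by the OCR step.
    `category` is the (placeholder) ID of a Custom Translator model;
    when omitted, the general-purpose model is used.
    """
    params = {"api-version": "3.0", "from": "ja", "to": "en"}
    if category:
        params["category"] = category
    url = TRANSLATE_ENDPOINT + "?" + urlencode(params)
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    # The API expects a JSON array of objects, each with a "Text" field.
    body = json.dumps(
        [{"Text": line} for line in lines], ensure_ascii=False
    ).encode("utf-8")
    return url, headers, body


def parse_translate_response(raw):
    """Extract the first translation for each input line from the JSON reply."""
    return [item["translations"][0]["text"] for item in json.loads(raw)]
```

The returned tuple can be passed to `urllib.request.Request(url, data=body, headers=headers, method="POST")` to perform the call. Keeping request construction separate from the network call makes it easy to switch between the general and custom models by changing only the `category` argument.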

In the spirit of 'doing silly things' (like attempting to machine translate a game), the demo ended with the addition of SignalR and a Xamarin.iOS app using ARKit. This allowed the user to point their phone camera at the untranslated game and view the translated content overlaid convincingly in 3D space, powered by ARKit image tracking. Additionally, tapping the phone screen advanced the in-game text by sending key presses back to the running application.

Though (deliberately) working on a fundamentally misguided premise, the talk did give a good and hopefully entertaining overview of several Azure services, as well as some insight into the kind of quality currently achievable through machine translation.

Slides (42): PDF