April 13, 2005

Entish as a computer-aided translation interlanguage

Ever wondered how a universal translator could work? Obviously it's impossible to instantly begin translating a previously unknown language, but translating between known languages is hard enough already. The last time I checked, the EU had 20 official languages and ran the world's largest translation operation (second place was the UN, with 6 official languages). With 20 languages to translate, the EU has to deal with 190 language pairs (20 × 19 / 2), and pays over €500 million/year for translator salaries alone.

One way to simplify the problem would be to translate every language to the same intermediate language, then translate that to the target language. If the intermediate were one of the existing official languages, that would reduce the problem to 19 language pairs in the case of the EU. But which language? For political reasons, nobody would agree to using any of the current official languages (which is why the EU is in this situation in the first place). More importantly, each translation introduces errors, so the more intermediate steps there are, the worse the final translation will be, turning the whole thing into a giant game of telephone.
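For anyone who wants to check the arithmetic, here is a quick sketch in Python. The function names are made up for this illustration, and it counts unordered pairs (English/French and French/English count as one pair), which is the counting that gives 190.

```python
from itertools import combinations


def direct_pairs(n):
    """Unordered language pairs when every language is translated directly to every other."""
    return n * (n - 1) // 2


def pivot_pairs(n, pivot_is_official=True):
    """Pairs needed when everything goes through a single intermediate language.

    If the pivot is one of the n official languages, the other n - 1 each pair
    with it. If the pivot is a separate, invented language (an assumption for
    this sketch), all n official languages pair with it.
    """
    return n - 1 if pivot_is_official else n


# Cross-check the formula by actually enumerating the pairs.
languages = [f"lang{i}" for i in range(20)]  # placeholder names, not real language codes
assert len(list(combinations(languages, 2))) == direct_pairs(20)

print(direct_pairs(20))                          # 190
print(pivot_pairs(20))                           # 19
print(pivot_pairs(20, pivot_is_official=False))  # 20
```

Note that an invented intermediate language needs 20 pairs rather than 19, since none of the official languages gets a free ride as the pivot.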

So, why not create a language meant solely to act as an intermediate, used only by computer translators, and designed to reduce translation errors to a minimum? (That's a rhetorical question, by the way.)

Some translation errors are caused by the addition of information that is mandatory in the target language but unspecified in the original language. But many more errors come from the loss of subtle meanings during the translation. An intermediate language would ideally lose no information at all, carefully recording every nuance of the original utterance. This would lead to a language that would be unacceptably verbose for use by humans, but computers should be able to handle it. Entish happens to be a perfect model for this kind of language, since every word in Entish is an exhaustive description of the attributes of the thing named. Unfortunately, only a few fragments of Entish are known, presumably because it is too boring to listen to Ents speak it.


At April 16, 2005, Anonymous Robert said...

This reminds me of when I was learning about client/server programming. One of the reasons for middleware is so that you don't need to worry about translating data for every possible system your program may be communicating with (as in big-endian/little-endian). You only worry about translating data for the middleware, and let the other half of the translation take care of itself on the other side.
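A minimal sketch of the byte-order case, using Python's standard struct module. The helper names to_wire and from_wire are hypothetical, and a single agreed wire format (network byte order, i.e. big-endian) stands in for the middleware here; real middleware does a good deal more than this.

```python
import struct


def to_wire(value):
    """Encode an unsigned 32-bit integer in the agreed wire format.

    "!I" means network byte order (big-endian), unsigned 32-bit, regardless
    of the byte order of the machine running this code.
    """
    return struct.pack("!I", value)


def from_wire(data):
    """Decode an unsigned 32-bit integer from the agreed wire format."""
    return struct.unpack("!I", data)[0]


# The same number laid out the way two different kinds of machine would store it.
little_endian = struct.pack("<I", 1)  # b'\x01\x00\x00\x00'
big_endian = struct.pack(">I", 1)     # b'\x00\x00\x00\x01'

# Each side only converts to and from the wire format, never directly to
# the other side's native layout.
message = to_wire(1)
print(message == big_endian)  # True
print(from_wire(message))     # 1
```

Each system needs only one pair of converters (native to wire, wire to native), which is the same saving as going from 190 language pairs to 19.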

