Friday 15 September 2017

Translating a Chatbot

You have a trained chatbot built in English and your boss wants it working in German next week. Here is what I would do.
Tell her it is impossible. Building a chatbot in a different language involves starting the whole process from scratch. For linguistic, cultural and company-process reasons, a translated chatbot won't work.
Then I would build it anyway.
Create a language translation service in Bluemix. Note the username and password this service has.
Get the codes of the languages you want to translate between. As of September 2017, from English ('en') you can translate to Arabic ('ar'), Brazilian Portuguese ('pt'), French ('fr'), German ('de'), Italian ('it'), Japanese ('ja'), Korean ('ko') and Spanish ('es').
Take your current ground truth of questions and intents. Put the questions and intents in a spreadsheet and sort the list by intent, so that all the questions about a topic are together.
Now the Python code to translate the questions into German is below. You need to use the username and password you noted earlier.
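If the ground truth lives in a CSV export rather than a spreadsheet, the sorting step can be sketched in a few lines of Python. This is a minimal sketch that assumes a two-column layout of question then intent; the sample rows and the intent names `languages` and `opening_hours` are made up for illustration.

```python
import csv
import io


def read_ground_truth(csv_text):
    """Parse 'question,intent' lines, skipping blank or incomplete rows."""
    reader = csv.reader(io.StringIO(csv_text))
    return [(r[0].strip(), r[1].strip())
            for r in reader
            if len(r) >= 2 and r[0].strip()]


def sort_by_intent(rows):
    """Order rows by intent, then by question, ignoring case."""
    return sorted(rows, key=lambda r: (r[1].lower(), r[0].lower()))


# Hypothetical sample data; a real export would be read from a file.
sample = """Do you speak German?,languages
What time do you open?,opening_hours
Do you speak English?,languages
"""

rows = sort_by_intent(read_ground_truth(sample))
for question, intent in rows:
    print(intent, question, sep=",")
```

Grouping by intent first means a reviewer later sees every phrasing of one topic side by side, which makes inconsistent translations easier to spot.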

import csv
import json

from watson_developer_cloud import LanguageTranslatorV2 as LanguageTranslator

# Credentials of the translation service created in Bluemix.
language_translator = LanguageTranslator(
    username="",
    password="")

with open('myfile.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in reader:
        if not row:
            continue  # skip blank lines in the file
        # The question is the first column; print it followed by a comma.
        print(row[0], end=',')
        translation = language_translator.translate(
            text=row[0],
            source='en',
            target='de')
        # ensure_ascii=False keeps umlauts readable; strip the JSON quotes.
        a = json.dumps(translation, indent=2, ensure_ascii=False)
        print(a.strip('"'))
Here myfile.csv is:
Am I allowed to speak in german,
Do you speak German?,
Do you speak English?,
What languages can I talk to you in?,
And this gives the output:
davids-mbp:translate davidcur$ python translate.py
Am I allowed to speak in german,Bin ich konnte in Deutsch zu sprechen
Do you speak German?,Möchten Sie Deutsch sprechen?
Do you speak English?,Wollen Sie Englisch sprechen?
What languages can I talk to you in?,Welche Sprachen kann ich zu Ihnen sprechen?
You will want to translate certain phrases and words in a non-standard way. If the chatbot talks about a company called 'Exchange', you will want to warn the translator that it should not translate that into the German word for exchange. To do this you load a glossary file, glossary.tmx, into your translator. This is a TMX file that pairs each source term with the translation you want to force.
You then run code to tell your translation service to use this glossary. In Node.js this is:

var watson = require('watson-developer-cloud');
var fs = require('fs');

var language_translator = watson.language_translator({
  version: 'v2',
  url: "https://gateway.watsonplatform.net/language-translator/api",
  username: "",
  password: ""
});

var params = {
  name: 'custom-english-to-german',
  base_model_id: 'en-de',
  forced_glossary: fs.createReadStream('glossary.tmx')
};

language_translator.createModel(params,
  function(err, model) {
    if (err)
      console.log('error:', err);
    else
      console.log(JSON.stringify(model, null, 2));
  }
);

The terms in your glossary are likely to be in your Entities as most of your common proper nouns, industry terms and abbreviations end up in there.
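Since those terms usually already exist as a list, the glossary file can be generated rather than typed by hand. The following is a minimal sketch that writes term pairs in the general TMX 1.4 layout (a `tu` element per term, holding one `tuv`/`seg` pair per language). The header attribute values are placeholders and the helper name `build_glossary` is made up for illustration, so check the result against the translation service's documentation before uploading it.

```python
import xml.etree.ElementTree as ET

# ElementTree serialises this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_glossary(terms, source="en", target="de"):
    """Return TMX text for a list of (source_term, target_term) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    # Placeholder header values; adjust to whatever the service expects.
    ET.SubElement(tmx, "header", {
        "creationtool": "",
        "creationtoolversion": "",
        "segtype": "sentence",
        "o-tmf": "",
        "adminlang": source,
        "srclang": source,
        "datatype": "PlainText",
    })
    body = ET.SubElement(tmx, "body")
    for src_term, tgt_term in terms:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source, src_term), (target, tgt_term)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Keep the company name 'Exchange' untranslated in German.
glossary_xml = build_glossary([("Exchange", "Exchange")])
print(glossary_xml)
```

Writing `glossary_xml` to glossary.tmx with UTF-8 encoding gives a file that can be passed as the forced glossary when creating the custom model.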
Now get someone who speaks both English and the target language fluently. Stick these translated questions in the spreadsheet. Go through each question and humanify the translation. This is a pretty quick process. They will occasionally come across words that will have to be added to the glossary.tmx file and the translations rerun.
At the end of this you have an attempt at a ground truth in another language for a chatbot. There are several reasons why this won't be perfect. Germans do not speak like translated British people. They have different weather, culture and laws. And the differences between Germany and Britain are smaller than between many countries.
Their questions are likely to be different. But as a first cut, this gets a chatbot up to the point where it might recognise a fair chunk of what people are saying. It can at least be used to help you gather new questions or to help you classify new questions you collect from actual Germans.
Manufacturing questions never really works well, as Simon pointed out here. And translation of questions is close to that. But it can be a useful tool to quickly get a chatbot up to the point where it can be bootstrapped into a full system.
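One way to hand the translations to the reviewer is a sheet with the English question, the machine translation, and an empty column for the corrected version. This is a minimal sketch under assumptions: the column names are invented, and `fake_translate` is a hypothetical stand-in so the example runs without credentials. In practice you would pass a function that calls the translation service.

```python
import csv
import io


def build_review_sheet(rows, translate):
    """Return CSV text with one line per (question, intent) pair.

    `translate` is any callable that maps English text to German text.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["intent", "english", "machine_german", "reviewed_german"])
    for question, intent in rows:
        # The last column is left empty for the human reviewer to fill in.
        writer.writerow([intent, question, translate(question), ""])
    return out.getvalue()


def fake_translate(text):
    """Hypothetical stand-in for the real translation call."""
    return "[de] " + text


sheet = build_review_sheet(
    [("Do you speak German?", "languages")],
    fake_translate,
)
print(sheet)
```

Keeping the machine output next to the reviewed text also leaves a record of which words needed fixing, which helps when deciding what to add to the glossary.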

1 comment:

  1. Nice one Dave, but this also works in run time for some domains...saves ongoing maintenance and updates...just plug and play to add another language.
