What does Google Machine Translation (GMT) do when we give it an HTML document to translate?

When we upload an HTML document to GMT, it does the following:

1. It accepts the HTML document.

2. It extracts the translatable content from the HTML page, translates it, and preserves the structure and formatting of the page (all font properties, spacing, paragraph breaks and images are kept in place).

3. It honors class="notranslate" and translate="yes/no", which tell it whether the content of an element should be translated or not.

4. It adds a lang attribute to the <html> tag that gives the source and target languages.
ex. <html lang=hi-x-mtfrom-en>

5. It adds <script> tags to the HTML page.
6. It adds <style> tags to the HTML page.
7. It adds <meta> tags to the HTML page.
ex. <meta http-equiv="X-Translated-By" content="Google"> 

8. It adds a <link> tag to the HTML page that contains the address of the original web page and the language of that page.

ex. <link hreflang=en rel="alternate machine-translated-from">

9. It adds an <iframe> tag inside the <body> tag of the HTML page.

10. It adds a <span> tag for each "sentence" that contains both the source-language sentence and the target-language sentence.

11. It changes every hyperlink (<a> tag).

Before translation

<a href=" hl=en&prev=_t&sl=auto&tl=hi&u=">

After translation

<a href=" hl=en&prev=_t&sl=auto&tl=hi&u=">
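The markers described in items 4, 7 and 8 can be used to check whether a page has already been machine translated. The following is a minimal sketch in Python, using only the standard library. The names `split_mt_lang` and `TranslationMarkerFinder` are made up for this illustration, and the code assumes the markers look exactly like the examples above; GMT may emit them differently.

```python
from html.parser import HTMLParser

# Separator seen in the lang example above (hi-x-mtfrom-en); an assumption
# based on that single example, not a documented format.
MT_MARKER = "-x-mtfrom-"


def split_mt_lang(lang):
    """Split 'hi-x-mtfrom-en' into ('hi', 'en'); return None if no marker."""
    if not lang or MT_MARKER not in lang:
        return None
    target, source = lang.split(MT_MARKER, 1)
    return target, source


class TranslationMarkerFinder(HTMLParser):
    """Collects the translation markers from items 4, 7 and 8."""

    def __init__(self):
        super().__init__()
        self.languages = None          # (target, source) from <html lang=...>
        self.translated_by = None      # content of the X-Translated-By <meta>
        self.original_language = None  # hreflang of the machine-translated-from <link>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.languages = split_mt_lang(attrs.get("lang"))
        elif tag == "meta":
            if (attrs.get("http-equiv") or "").lower() == "x-translated-by":
                self.translated_by = attrs.get("content")
        elif tag == "link":
            if "machine-translated-from" in (attrs.get("rel") or "").split():
                self.original_language = attrs.get("hreflang")


if __name__ == "__main__":
    page = (
        "<html lang=hi-x-mtfrom-en><head>"
        '<meta http-equiv="X-Translated-By" content="Google">'
        '<link hreflang=en rel="alternate machine-translated-from">'
        "</head><body></body></html>"
    )
    finder = TranslationMarkerFinder()
    finder.feed(page)
    print(finder.languages, finder.translated_by, finder.original_language)
```

A page with none of these markers leaves all three attributes as None, which is a reasonable hint (not proof) that it has not passed through GMT.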

Note: Tags such as <script>, <style> and <iframe> are inserted by GMT into the web page to show the "GMT UI" inline in the page. Below is a screenshot of this.
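One way to see which tags were added to a particular page is to count the tags in the original and in the translated HTML and compare the two. The sketch below does this in Python with the standard library. The helper names `count_tags` and `added_tags` are hypothetical, and the sample pages in the usage part are made up for illustration; they are not real GMT output.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how many times each start tag appears in a document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name to number of occurrences."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.counts


def added_tags(original_html, translated_html):
    """Return the tags that occur more often in the translated page.

    Counter subtraction drops zero and negative results, so only tags
    that were added remain.
    """
    return count_tags(translated_html) - count_tags(original_html)


if __name__ == "__main__":
    # Made-up pages, only to show the shape of the result.
    original = "<html><body><p>Hello</p></body></html>"
    translated = (
        "<html lang=hi-x-mtfrom-en><head><script></script><style></style></head>"
        "<body><iframe></iframe><p><span>Hello</span></p></body></html>"
    )
    for tag, extra in sorted(added_tags(original, translated).items()):
        print(tag, extra)
```

This only compares counts, so it shows how many tags of each kind were added but not where in the page they were placed.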
