What's your opinion on machine translation and quality?
Thread poster: Daniela Zambrini
Giles Watson
Giles Watson  Identity Verified
Italy
Local time: 16:55
Italian to English
In memoriam
Second-guessing a machine Jul 8, 2014

Giovanni Guarnieri MITI, MIL wrote:

but when the sentence is unusable, you apply the same thought process as in translation...
in fact, you would develop a different skill, whilst preserving the original one...



If the text is unusable, you need to look at the original, which you might as well have done in the first place without wasting time on gobbledegook generated by a lucky bag of algorithms.


 
Riccardo Schiaffino
Riccardo Schiaffino  Identity Verified
United States
Local time: 09:55
Member (2003)
English to Italian
+ ...
Not that much better than Google Translate - at least for Spanish Jul 8, 2014

Kirti Vashee wrote:

Source

All studios are equipped with a small kitchen, fridge and separate bathroom.
The hotels facilities include an outdoor swimming pool and a beauty parlour.
Enjoy typical French cuisine in the traditional restaurant Aux Trois Cochons.
The hotel is in the heart of the historical centre, near to all major attractions.
The hotel is located at the heart of the Huangpu District, close to Nanpu Bridge.
Apartments are in very good condition, well equipped and furnished to a very good standard.
The rooms are also fully equipped with TV, Telephone, Air conditional, Refrigerator and mini bar.
Spice Market Buffet offers a mix of oriental and western style cuisine.
Contemporary and friendly, our Novotel Cafe will tempt you with its original and varied menu.
The hotel always uses the flower arrangements in the lobby for its promotional activities.

MT of above source sentences:

Todos los estudios están equipados con una pequeña cocina, nevera y un baño independiente.
Las instalaciones del hotel incluyen una piscina exterior y un salón de belleza.
Disfrute de la típica cocina francesa en el restaurante tradicional Aux Trois Cochons.
El hotel está en el corazón del centro histórico, cerca de todas las atracciones principales.
El hotel está situado en el corazón del distrito de Huangpu, cerca de puente Nanpu.
Los apartamentos están en muy buenas condiciones, bien equipados y amueblados a un nivel muy bueno.
Las habitaciones también están completamente equipadas con TV, teléfono, aire acondicionado, nevera y minibar.
Spice Market buffet ofrece una mezcla de cocina de estilo oriental y occidental.
Coetáneo y amable, nuestra cafetería Novotel le tentará con sus originales y variados menús.
El hotel siempre utiliza los arreglos florales en el vestíbulo para sus actividades promocionales.


Google translate of the same:

Todos los estudios están equipados con una pequeña cocina, nevera y baño separado.
Las instalaciones del hotel incluyen una piscina al aire libre y un salón de belleza.
Disfrute de la cocina típica francesa en el restaurante tradicional Aux Trois Cochons.
El hotel está en el corazón del centro histórico, cerca de las principales atracciones.
El hotel está situado en el corazón del distrito de Huangpu, cerca de Puente Nanpu.
Los apartamentos están en muy buenas condiciones, bien equipadas y amuebladas a un muy buen nivel.
Las habitaciones están totalmente equipadas con TV, teléfono, aire condicionado, nevera y mini bar.
Spice Market Buffet ofrece una mezcla de cocina oriental y occidental de estilo.
Contemporáneo y acogedor, nuestro Novotel Café ofrece un menú original y variado.
El hotel siempre utiliza los arreglos florales en el vestíbulo para sus actividades promocionales.


Your MT is better than GT, but only marginally so, really.
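One rough way to put a number on "only marginally" is to compare the two outputs word by word. The following is a minimal sketch using Python's standard `difflib`; `word_similarity` is a hypothetical helper name, and the two sentence pairs are copied from the samples quoted above. Surface similarity only shows how far the outputs diverge from each other; it says nothing about which one is better or how much post-editing either would need.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio over whitespace-separated tokens."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


# (custom MT output, Google Translate output), taken from the quoted samples.
pairs = [
    (
        "El hotel siempre utiliza los arreglos florales en el vestíbulo "
        "para sus actividades promocionales.",
        "El hotel siempre utiliza los arreglos florales en el vestíbulo "
        "para sus actividades promocionales.",
    ),
    (
        "Coetáneo y amable, nuestra cafetería Novotel le tentará con sus "
        "originales y variados menús.",
        "Contemporáneo y acogedor, nuestro Novotel Café ofrece un menú "
        "original y variado.",
    ),
]

scores = [word_similarity(mt, gt) for mt, gt in pairs]

for (mt, _gt), score in zip(pairs, scores):
    # Print the ratio next to the start of the sentence it belongs to.
    print(f"{score:.2f}  {mt[:45]}...")
```

Identical sentences score 1.0, and the score drops as the two engines choose different words or word order.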


 
Kirti Vashee
Kirti Vashee  Identity Verified
United States
Local time: 08:55
Samples Jul 8, 2014

These samples are from a very small set of examples. On a large data set these small differences add up. It is one thing to say that there is little or no difference on a small sample set, and another thing to actually have a very accurate sense for how much more/less effort it would be to post-edit the different output for a large project.

Travel is also one of Google's best domains as there is a lot of web content that can be crawled and used to learn patterns. Google is very compelling on many domains for romance languages like ES, PT and even IT and explains why even Moses experiments can work for these languages.

One key value of custom systems is that it is possible to correct specific error patterns and thus make the PEMT task easier and enhance the efficiency on very large projects.

I am only able to show examples from systems where the clients allow it or from one of our domain engines, but I will try to get some other samples and post them at a later date.


[Edited at 2014-07-08 21:59 GMT]


 
Siegfried Armbruster
Siegfried Armbruster  Identity Verified
Germany
Local time: 16:55
English to German
+ ...
In memoriam
Moses Experiments Jul 8, 2014

Kirti Vashee wrote:
Google is very compelling on many domains for romance languages like ES, PT and even IT and explains why even Moses experiments can work for these languages.


Kirti, what do you mean by "Moses experiments"? As I understand it, even the so-called more advanced systems such as Asia Online or KantanMT are based on Moses.

I do understand that you can only show us samples of engines where your clients agreed to publish them and I really appreciate that you are showing samples. For me the interesting part is that I have now seen samples from various engines, from various "advanced" solutions (all EN -> DE) and they were all of similar quality.

In my specialties (pharma and medical) and in my language pairs (EN-DE, NL-DE), I can translate 6-10k words per day (using my CAT tool), producing a quality that is good enough that my customers have kept coming back for years and are paying good rates.

If I remember correctly, an MT/PEMT system of publishable quality has an output of 8-12k words a day (please correct me if I am wrong).

So, why on earth should I switch to MT/PEMT? Or, coming from a different angle, would it not make sense to teach translators how to use their CAT tools better to help them improve their productivity?

I am not disputing that MT has various useful applications, and I hope that it will soon be good enough to help me increase my productivity, but up to now, I am kind of disappointed about the promises made for years by some MT marketing people and the actual results.


 
Kirti Vashee
Kirti Vashee  Identity Verified
United States
Local time: 08:55
Moses and why bother with MT Jul 9, 2014

Siegfried

Moses is the very basic set of build-it-yourself SMT tools that is widely used by NLP students at universities. While many commercial offerings are very closely related, there are many other tools needed in addition to Moses to build successful MT systems. I have some definite opinions on this, described here: http://kv-emptypages.blogspot.com/2011/12/moses-madness-and-dead-flowers.html You can also read the comments to hear other opinions.

If you are doing 6-10K words per day with DE I think it will be some time before an MT system will offer you real benefits. The highest performance I have seen is in mature romance language systems where even 20K a day is possible. Very tightly focused (in terms of domain) DE systems could provide you with a boost but such a system takes time to develop and only makes sense if there is long-term work potential with it.

But DE is considered a more difficult language to combine with EN.

We have a customer who reported that he achieved 900 words/hour with an EN to HU system, which is even harder than DE; however, they take great care to train the editors and also make sure the MT system reaches a state where the output makes this possible. You can hear him describe this at http://www.asiaonline.net/EN/Resources/Webinars/default.aspx#Webinars16

We have had good results with both DE to Slovenian and DE to Japanese. MT makes sense here as SME translators are harder to find in these kinds of language combinations.


 
Michelle Kusuda
Michelle Kusuda  Identity Verified
United States
Local time: 11:55
English to Spanish
+ ...
Some considerations... Jul 9, 2014

The samples provided were very literally translated. If I heard those sentences I would obviously assume that they were spoken by someone who studied the language.

Secondly, how many words can someone "post-edit/proofread" in one day? Because to me, post-editing/proofreading involves two steps:

1) Checking accuracy against original.
2) Removing any typographical errors, removing unnatural sounding expressions and ensuring the text has a smooth flowing style.

Also, when you would otherwise have to change the font size so that the text takes up the same space despite language expansion, MT would not be able to recognize that and would give you "solamente" in Spanish when you could definitely use "sólo".

In a hundred years, if only top translators, top editors and top programmers are involved in its development, I can see it making a real difference. MT vendors cater to big business and, in order to make a profit, hire not the best and most experienced translators (because expertise costs money) but young, inexperienced translators eager to enter the profession or make a living during these tough economic times. MT has to watch out because "garbage in, garbage out"!!!

Right now, MT is in its infancy even if MT companies refuse to admit it.

[Edited at 2014-07-09 08:17 GMT]


 
neilmac
neilmac
Spain
Local time: 16:55
Spanish to English
+ ...
PS: Mindful MT Jul 9, 2014

Here's the warning from the GT4T website:

"Warning: do not use GT4T as a mindless machine translation tool. Use it to save key stroke, get translation options for phrases, and keep consistency."

I find MT if used this way to be useful. Just another part of the tech I choose to use.

[Edited at 2014-07-09 09:41 GMT]


 
Giovanni Guarnieri MITI, MIL
Giovanni Guarnieri MITI, MIL  Identity Verified
United Kingdom
Local time: 15:55
Member (2004)
English to Italian
some... Jul 9, 2014

Giles Watson wrote:

Giovanni Guarnieri MITI, MIL wrote:

but when the sentence is unusable, you apply the same thought process as in translation...
in fact, you would develop a different skill, whilst preserving the original one...



If the text is unusable, you need to look at the original, which you might as well have done in the first place without wasting time on gobbledegook generated by a lucky bag of algorithms.



some of the text is unusable... some is usable...


 
Giovanni Guarnieri MITI, MIL
Giovanni Guarnieri MITI, MIL  Identity Verified
United Kingdom
Local time: 15:55
Member (2004)
English to Italian
productivity Jul 9, 2014

Kirti Vashee wrote:

We have a customer who reported that he achieved 900 words/hour


I do 900 words/hour on some specific texts... and I'm not a machine...


 
Kirti Vashee
Kirti Vashee  Identity Verified
United States
Local time: 08:55
Clarification Jul 9, 2014

Giovanni Guarnieri MITI, MIL wrote:

Kirti Vashee wrote:

We have a customer who reported that he achieved 900 words/hour


I do 900 words/hour on some specific texts... and I'm not a machine...


The person here was referring to a translator whose normal productivity is 250 words/hour and who was able to raise their output to 900 words/hour. Also, this was for the Life Sciences domain in English to Hungarian, which is a very difficult language for MT.


 
Kirti Vashee
Kirti Vashee  Identity Verified
United States
Local time: 08:55
How do computers "translate"? Jul 9, 2014

This little video very quickly shows you how computers "translate". It is highly simplified, but it very quickly explains how a computer learns.

https://www.youtube.com/watch?v=_ghMKb6iDMM

You can see that training a computer to "translate" is really a data preparation and data analysis task at a corpus level rather than a segment level. You are looking for "good" patterns and trying to avoid "bad" patterns in large corpora. This process is much more complicated for some language combinations than others.

So easy combinations would be Spanish to/from Italian, since they have highly similar linguistic structures.

Difficult combinations would be English to Arabic, Chinese to English, English to Japanese since the two languages are so different in essential structures. Inflection and morphological differences are very hard to capture and going from an SVO to SOV language also requires special efforts.
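To make the "patterns at a corpus level" idea concrete, here is a toy sketch that guesses word translations purely from how often words appear together in aligned sentence pairs. Everything in it is an assumption made for illustration: the three-sentence corpus is invented, `dice_scores` and `best_match` are hypothetical helper names, and the Dice coefficient stands in for the far more elaborate alignment and phrase-extraction models that real SMT toolkits use.

```python
from collections import Counter
from itertools import product

# Toy parallel corpus, invented for illustration (not from any real engine).
corpus = [
    ("the hotel", "el hotel"),
    ("the restaurant", "el restaurante"),
    ("the hotel restaurant", "el restaurante del hotel"),
]


def dice_scores(pairs):
    """Score every (source word, target word) pair by sentence co-occurrence."""
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_freq.update(src_words)  # sentences containing each source word
        tgt_freq.update(tgt_words)  # sentences containing each target word
        joint.update(product(src_words, tgt_words))  # sentences containing both
    # Dice coefficient: 2 * joint count / (source count + target count)
    return {
        (s, t): 2 * n / (src_freq[s] + tgt_freq[t])
        for (s, t), n in joint.items()
    }


def best_match(scores, word):
    """Return the target word with the highest score for a source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)


scores = dice_scores(corpus)
for word in ("the", "hotel", "restaurant"):
    print(word, "->", best_match(scores, word))
```

Even this tiny example shows why data quality matters: the scores come entirely from the sentence pairs, so a misaligned or badly translated pair would feed straight into the guesses.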


 
Kirti Vashee
Kirti Vashee  Identity Verified
United States
Local time: 08:55
Samples from a Travel engine Jul 9, 2014

English to Indonesian Bahasa

Our well established restaurant serves a range of culinary delights.
Cuisine options vary from Western to A La Carte menus.
It is close to Mukden Palace and Liaoning Provincial Museum.
Quest On Sturt is perfect for every type of traveller.
Gili Villas are an exclusive resort of 4 stylish villas.
All the 45 rooms are spread in the four stories of the building.
For world class accommodation and amenities, stay at Doubletree by Hilton Qingdao Chenyang.


MT translation

Restoran terkenal hotel ini menyajikan beraneka kelezatan kuliner.
Pilihan masakan beragam dari barat hingga menu satuan.
Hotel ini berada dekat dengan mukden palace dan liaoning provincial museum.
Quest on sturt adalah tempat sempurna untuk semua jenis pelancong.
Gili villas adalah sanggraloka eksklusif 4 vila bergaya.
Semua 45 kamarnya tersebar di bangunan berlantai empat.
Untuk akomodasi dan fasilitas kelas dunia, menginaplah di doubletree by hilton qingdao chenyang.


and also for English to Thai


Bankside Waldorf Apartments are located in the Central Business District of Auckland.
Seashells Resort perfect for every type of traveller.
Tan Son Nhat International Airport is 1 km away.
Today, Bali is a favorite tourist destination.
This boutique hotel offers 40 luxury villas with modern furnishings.
Cello Hotel Songpa offers comfortable accommodation in a prime location for a reasonable price.
Then relax and be pampered in the 'Red Spring Sauna'.

MT

แบงค์ ไซด์ วอลดอร์ฟ อพาร์ทเมนท์ ตั้ง อยู่ ใน ย่าน ศูนย์กลาง ธุรกิจ ของ ออ ค แลนด์
ซี เชลล์ รีสอร์ท เหมาะ สำหรับ นัก เดินทาง ทุก ประเภท
สนามบิน นานาชาติ เติ่น เซินเญิ้ ตอ ยู่ ห่าง ออก ไป 1 km
วันนี้ บาหลี เป็น จุดหมายปลายทาง ยอด นิยม ของ นัก ท่องเที่ยว
บูติค โฮ เท็ล แห่งนี้ มี วิลล่า หรู พร้อม เฟอร์นิเจอร์ ที่ ทันสมัย 40
โรงแรม เซลโล ซง พา ให้ บริการ ที่ พัก ที่ สะดวกสบาย ใน ทำเล ที่ ตั้ง ชั้นเยี่ยม ด้วย ราคา สมเหตุสมผล
แล้ว ผ่อนคลาย และ รับ การ ปรนนิบัติ ใน ' เรด สปริง ซาวน่า '


 
Phil Hand
Phil Hand  Identity Verified
China
Local time: 23:55
Chinese to English
Special pleading Jul 10, 2014

Kirti Vashee wrote:

...Difficult combinations would be English to Arabic, Chinese to English, English to Japanese since the two languages are so different in essential structures. Inflection and morphological differences are very hard to capture and going from an SVO to SOV language also requires special efforts.

I see this sort of claim all the time, and I have to call it. There's always some reason why the sample we're looking at is so bad: it's a difficult language pair, it's a difficult text type, whatever...

According to your logic there, English to Chinese should be the easiest pair. English and Chinese share many common features: they're both SVO, they have roughly equivalent ways of expressing many sentence level features (adverbial clauses, time, etc.). Chinese has no (or very little) morphology, so the computer doesn't have to worry about that.

But in reality, English to Chinese output is terrible. Usually completely unreadable.

Now, I don't know about romance languages. But I'm often told that they are some of the hardest pairs to work between because you have to be so careful to avoid false friends. I'd like to hear other romance language colleagues chip in to tell us if Google or any other MT system can really achieve decent results in their pairs.


PS. An illustration of why MT isn't advancing: what the hell kind of texts are these? What on earth does "from Western to A La Carte" mean?

[Edited at 2014-07-10 02:50 GMT]


 
Giovanni Guarnieri MITI, MIL
Giovanni Guarnieri MITI, MIL  Identity Verified
United Kingdom
Local time: 15:55
Member (2004)
English to Italian
well... Jul 10, 2014

Kirti Vashee wrote:


The person here was referring to a translator whose normal productivity is 250 words/hour and who was able to raise their output to 900 words/hour. Also, this was for the Life Sciences domain in English to Hungarian, which is a very difficult language for MT.



Working pairs and domains don't matter... they are your working languages and your domains... so you should be able to reach a good output, not 250 words/hr! Get a good translator instead!


 
Giles Watson
Giles Watson  Identity Verified
Italy
Local time: 16:55
Italian to English
In memoriam
A matter of style Jul 10, 2014

Phil Hand wrote:

Now, I don't know about romance languages. But I'm often told that they are some of the hardest pairs to work between because you have to be so careful to avoid false friends. I'd like to hear other romance language colleagues chip in to tell us if Google or any other MT system can really achieve decent results in their pairs.



It's not just a question of a few false friends. Romance languages tend to have stylistic expectations about sentence structure and the organisation of thought that contrast with English.

For example, Italian - but the comment also applies to other Romance languages - likes its sentences to look solid. Forms and notions balance or offset each other and the ideas often tend to be organised in (nested) pairs. English, in contrast, generally seeks to engage the reader's attention by imparting a sensation of movement. Readers expect sentences to flow and triplets are more common.

If you want an analogy, it's a bit like listening to a tango (2/4 time) and trying to transcribe it as a waltz (3/4 time).

You can, of course, calque the organisation of thought in the Italian but the English will plod and the translation will be far less effective than the original.

MT doesn't even address this issue, except by imposing its own tone-deaf rhythms on the target texts. If and when MT begins to hear language with a native ear (or humanity loses its ability to enjoy language's sounds), it will be time for translators to step down and let the 'puters take over.


[Edited at 2014-07-10 13:39 GMT]


 