Broken translation memories showing extremely high % matches for unrelated text?
Thread poster: Adieu
Adieu
Ukrainian to English
Dec 7, 2021

I've got a client who recognizes the issue and even pays out without match discounts, but perseveres in trying to populate a messed-up translation memory. This thing shows 90+% matches between grammatically incorrect vernacular where a mental patient is talking about his issues and pieces of bureaucratic cover letter templates for applications for permits.

There's literally not a word in common. Some aren't even remotely close in length.

Any idea how this is happening?


 
Daryo
United Kingdom
Serbian to English
Maybe ... Dec 8, 2021

Adieu wrote:


I've got a client who recognizes the issue and even pays out without match discounts, but perseveres in trying to populate a messed-up translation memory. This thing shows 90+% matches between grammatically incorrect vernacular where a mental patient is talking about his issues and pieces of bureaucratic cover letter templates for applications for permits.

There's literally not a word in common. Some aren't even remotely close in length.

Any idea how this is happening?


Maybe that proves the bureaucratic jargon makes about as much sense as the verbal effusion of mental patients?

Or the software was tweaked a bit too much in favour of agencies, in an effort to beat the competition?

One of these explanations ought to make sense...


 
Grigori Gazarian
Mexico
Member (2021)
Spanish to Russian
Tags, maybe? Dec 8, 2021

Adieu wrote:

There's literally not a word in common. Some aren't even remotely close in length.

Any idea how this is happening?


Are there many tags? Not sure, but I think memoQ's algorithms also consider similar tags. Just a wild guess.
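
To make the guess concrete, here is a toy sketch in Python of how counting tags as matching tokens could inflate a fuzzy score. It is not memoQ's actual algorithm, which is not described in this thread; the tag pattern, the `toy_match` function and the sample segments are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical inline tag formats: {1}, <b>, </b>. Real CAT tools differ.
TAG = re.compile(r"\{\d+\}|</?\w+>")
TOKEN = re.compile(r"\{\d+\}|</?\w+>|\w+")


def tokens(segment, count_tags):
    """Split a segment into lowercase words plus, optionally, tag placeholders."""
    out = []
    for tok in TOKEN.findall(segment):
        if TAG.fullmatch(tok):
            if count_tags:
                out.append("<TAG>")  # every tag is treated as the same token
        else:
            out.append(tok.lower())
    return out


def toy_match(a, b, count_tags=True):
    """Toy fuzzy score (0-100) from token-level sequence similarity."""
    ratio = SequenceMatcher(None, tokens(a, count_tags), tokens(b, count_tags)).ratio()
    return round(100 * ratio)


# Two made-up segments with no word in common but the same number of tags.
seg_a = "{1}{2}{3}{4} voices {5}{6}{7}{8}"
seg_b = "<b><i><u><s> permit </s></u></i></b>"

print(toy_match(seg_a, seg_b, count_tags=True))   # high: the tags dominate
print(toy_match(seg_a, seg_b, count_tags=False))  # zero: no shared words
```

If a scorer behaved like this, tag-heavy segments would show high percentages regardless of their wording.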


 
Adieu
Ukrainian to English
TOPIC STARTER
Nope Dec 8, 2021

No tags.

Tags are a separate type of cancer.

Grigori Gazarian wrote:

Adieu wrote:

There's literally not a word in common. Some aren't even remotely close in length.

Any idea how this is happening?


Are there many tags? Not sure, but I think memoQ's algorithms also consider similar tags. Just a wild guess.


 

