After having merged and roughly cleaned my reference data, I was eager to start annotating.
After I had narrowed down the topic of my first machine learning project to building a movie recommendation algorithm, I quickly found some movie data from IMDb that I could use. That way, I could basically skip the whole step of learning how to use an API properly and getting the data myself (haha! I thought. Read on).
When it comes to measuring quality, we are surprisingly unsuspicious once a metric comes into play. As soon as someone hands you numbers, or a chart, there is a good chance that you will trust those numbers – especially if they support what you already believe. It is always important to know where those numbers come from, and what exactly they measure. Especially in the field of (neural) machine translation, trusting numbers blindly can have severe consequences.
Back in the day when I was a machine translation specialist, it was part of my job to make sure that the machine translation output we used had a certain quality. I was positioned between the Sales and Production departments of the company, because that certain quality was important for both: as the content usually got post-edited, I had to check whether the post-editors would actually be able to work with the output. And as machine translation and post-editing (MTPE) was a cheaper product than good old translation, the Sales guys wanted to know how far they could lower our rates.
No, this is not yet another article with motivating mantras about you being good enough. You are! Trust me. This is a blog post about quality assurance. Before I became Head of Technology, my position was Machine Translation Specialist. As such, I was confronted with this question on a daily, nay, hourly basis regarding raw or post-edited machine translation output. I often struggled to answer it, and I imagine that I am not the only one. So here are my thoughts – maybe they will help you the next time someone asks you exactly this question.
After identifying the topic of my first ML project, I needed to outline my business problem. Following what I had learned in online courses and YouTube videos, I went through these 5 steps.
Mini Series on the history of machine translation! Find out what it means when someone says that something sounds like Google Translate, how Google Sings Songs and what neural machine translation is in part 2 of the series.
Once the initial motivation has worn off, it is hard to stay focused. In the end, who would blame you if you just stopped? The nice thing about a head start into a new topic is the immediate reward in the form of new knowledge, understanding formerly complicated-looking things, and being able to brag about starting a new thing at the next party. After a while, everything is back to normal. Nobody asks anymore, and you are left alone with your motivation.
Mini Series on the history of machine translation! Find out how machine translation started and how statistical engines work in part 1 of the series.
Since I started working as a machine translation specialist, one of the most complex and interesting questions that impacted my daily work was this one: How can machine translation achieve human quality? This article is not a technical description of the numerous options you have to measure human quality, like BLEU score or other evaluation methods. No, in this post, I want to discuss a much more complicated question: What is human quality? Spoiler Alert: Human quality should be called Schrödinger’s quality instead, because it always has different states that are only distinguishable once they are in the past. I will present three reasons for this behavior.