Summer Placement Diaries
Hitalo’s Summer Placement with Unbabel
My first-year placement took place at Unbabel’s HQ in the charming city of Lisbon, Portugal. I was excited to work with them from the get-go, because they project this image of an innovative and up-and-coming company that will be a major player in the near future. And I think I was not wrong.
Unbabel has not only a commercial but also a research approach to translation. Many of its employees are linguists, researchers, PhD students, and data engineers who bring academic knowledge to the business, and this was a perfect fit for me. I believe that academia and industry should work side by side rather than in isolation from each other, and I saw that combination working really well for Unbabel.
Another reason for choosing this company was the opportunity to work onsite in their Lisbon office. After two years of pandemic and lockdowns, being able to gradually resume social contact with work colleagues was a chance I could not pass up. Unbabel’s Lisbon offices occupy an entire building in the financial center of Lisbon, fittingly called Tower of Unbabel (get it? 😉).
The building has eight floors where all the company’s departments are spread. The company has a start-up vibe that becomes visible as soon as you step into the reception hall. There are rest areas, reading and conference rooms, pool tables, free beer, free lunch on Tuesdays (a good opportunity for those who work from home or on a hybrid scheme to come to the office and socialize), and much more.
Unbabel is a translation company founded in 2013 whose flagship product, the Language Operations platform, blends artificial intelligence and humans in the loop to provide translation services to clients in the Game, Fintech, Retail, Tech, and Travel industries.
Headquartered in Lisbon, and with offices in 5 other cities in the US and Europe, including New York and Berlin, Unbabel has 178 direct employees working in its hubs, and thousands of freelance editors that regularly provide services to the company. Its business model centers around the use of machine learning, Neural Machine Translation and crowdsourcing to facilitate multilingual communication between customers and companies via emails, chats, and FAQs.
Despite having cutting-edge, award-winning systems which are constantly fine-tuned to better serve its clients and maintain its competitive edge and turnaround times, Unbabel is still working out ways to provide better training to its own linguists and language service providers on the use of its annotation platform.
Currently, all the training material available to annotators is published in the form of guidelines on their intranet website. These guidelines are not always read by the intended audience, nor are they much fun to follow. Seeing an opportunity to bridge this gap, Community R&D (the department in charge of annotation quality) proposed a project for the EM TTI intern to work on during the program placement, and that’s where I come in.
My placement activities
My tasks involved understanding the Annotation process at Unbabel and creating a training module for new and existing editors to learn how to use the company’s error typology correctly.
My entire first week was dedicated to onboarding: reading internal materials, guidelines, and other documents, and watching internal training videos to gain an initial understanding of Unbabel’s workflow and routines. I had daily online meetings with my placement supervisor, Marina Sánchez, to bring me up to speed with the knowledge I would need for my task.
In order to gain a deeper perspective into the annotation tool used at the company, I was assigned a batch of jobs that allowed me to have a more in-depth view of the tool’s interface, as well as of the challenges themselves. This was particularly useful as it permitted me to have an insider’s look into the same difficulties and doubts that other annotators might face.
After understanding their procedures, and practicing annotations myself, I drafted an initial training proposal and submitted it to my placement supervisor. She had great expectations for this, so I felt compelled to go the extra mile to do a great job.
During the remaining weeks of the placement, I used all the information gathered to develop the content I had proposed. With the training module ready, we ran a dry run with some volunteers and asked for their feedback. We knew that a project of this magnitude could not be completed in just 4 weeks, so we tried to keep our expectations in check and create something that, though more basic, would still make sense within the company’s structure.
However, we were surprised by the wonderful reception it received. The volunteers (experienced annotators who already know and use the Error Typology) commented that they wished they had had something like it when they first began and that, even though they are more experienced now, they still found the content very pleasant and easy to follow.
My placement supervisor even invited me to present the project during their All-Hands Meeting (an internal meeting where all the departments present their current projects), but this would take place after the end of my placement, so she stepped in for me.
All in all, despite the challenges and the short time we had to put this out for their internal audience, the result was more than satisfying.
In sum, my weeks at Unbabel proved to be quite fruitful and enriching. I particularly appreciated the fact that, even as an intern, I was given access to programs that are in production and to real data extracted from them. It was also important to show how EM TTI students can contribute to the companies that accept them as interns.
Though the placement project was not directly related to my research topic, the insight and experience I’ve gained from it were invaluable. Apart from having a behind-the-scenes look into the systems, methods, and processes of a translation company that is at the forefront of developing and refining tools that use NLP and Machine Learning, I have also used and experimented with their platform and learned from senior employees who are, ultimately, researchers themselves.
If you are reading this and haven’t decided where to do your next placement, I highly recommend Unbabel. They offer different projects every year and they usually take 2 to 3 EM TTI students, so you stand a good chance of being accepted if that is your choice. You might even bump into other EM TTI colleagues while there, as was my case 🙂