Panels of the 9th Annual International Translation Conference
Evaluation of Machine Translation-Google Translate vs. Yandex Translate: From Kyrgyz into English
Yahya Polat
Ala-Too International University
Although the Kyrgyz language is old, rich, and the bearer of the glorious epic Manas, it is not yet well represented in machine translation. So far it has shown comparatively slow progress in the Google and Yandex translation services. This study investigates the accuracy of Kyrgyz-to-English machine translation at the lexical, semantic, and syntactic levels. It uses the Groves and Mundt (2015) error taxonomy model to compare Kyrgyz-to-English translations produced by Google Translate and Yandex Translate. We selected 100 texts from four domains (law, literature, medicine, and mass media), i.e. 25 texts from each domain. The texts were translated by Google Translate and Yandex Translate, as well as by human translators, and then evaluated with respect to lexical, semantic, and grammatical accuracy. The materials fall into four groups: (a) very short noun phrases, with 2 words, (b) short noun phrases, with 2 to 5 words, (c) long phrases, with 10 to 13 words, and (d) sentences, 18 to 23 words in length. We carried out a descriptive-comparative human analysis of the translations, using the Groves and Mundt (2015) model as the criterion for evaluating and scoring the translations made by machine and human translators. The reason for adopting this model is that it allows for detailed analysis and scoring of the translated materials. We also drew on the studies of Saffari, Sajjadi, and Mohammadi (2017) and Ghasemi and Hashemian (2015) as practical models. In summary, Google Translate was more accurate than Yandex Translate at the lexical, semantic, and syntactic levels in translating phrases and sentences from Kyrgyz into English across the four domains under investigation. Error analysis of grammatical items revealed that verb tense, comma, and spelling errors were the most frequent errors generated by the two machine translation systems.
Speech Recognition + Machine Translation = Fully Automatic Conference Interpreting?
Stephan Vogel
Qatar Computing Research Institute-Hamad Bin Khalifa University
Machine translation has become a fact: the amount of material translated fully automatically – mostly web pages, e-commerce customer reviews, and social media postings – is 100 times more than the content translated by translators. Similarly, speech recognition is used in many applications, from call centers, to dictation of medical reports, to personal assistants like Cortana and Siri on the phone. In the paper we will describe a speech translation system that combines our speech recognition and machine translation technology to build a fully automatic conference interpreting system for Arabic to English and English to Arabic. We will highlight the challenges, provide an overview of the underlying technologies, in particular the new developments based on so-called deep learning, and give a live demonstration of the system. The Arabic speech recognition system, or more precisely its acoustic model, is built on more than 1000 hours of transcribed Arabic speech, mostly in MSA (Modern Standard Arabic) and mostly from the broadcast news domain. In contrast, the English speech recognition system is built on recordings and transcriptions of about 150 hours of TED talks. Both systems also use much larger amounts of text data to learn the language models. For machine translation, different technologies are explored. On the one hand we build so-called phrase-based statistical machine translation systems (PBSMT); on the other hand we explore the new developments in deep learning to build neural machine translation systems (NMT). One problem in building such systems is the limitation of the available vocabulary. No matter how much training data is used, there are always words and word forms that have not been seen in the data. One attempt to overcome this problem, especially in the machine translation component, is to use sub-word units as the internal representation.
Another problem – for humans as well as machines – is the fact that a good translation can only be generated when enough context has been seen. On the other hand, simultaneous interpretation requires that output be generated in a continuous fashion without too much delay. In human interpretation we observe an average decalage of only a few seconds. To have a similar decalage in automatic interpretation requires that both speech recognition and machine translation perform stream decoding. The paper will present our solution and provide results on the trade-off between longer decalage and quality of the output. By analyzing transcripts of interpretations of talks at conferences (WISE, WISH, ARC) we can provide a comparison between human and fully automatic interpretation, thereby highlighting the strong and the weak aspects of interpretation done by a computer. In particular, we look at the quality of the automatic interpretation, loss of content, and decalage.
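The sub-word idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which sub-word method the system uses, so the greedy longest-match segmenter below, the function name segment, and the toy vocabulary TOY_VOCAB are all assumptions made purely for illustration. It only shows how a word form that was never seen as a whole can still be represented by known smaller units.

```python
def segment(word, vocab):
    """Greedy longest-match split of `word` into units from `vocab`.

    Hypothetical illustration only; not the method used by the system
    described in the abstract. Characters that no known unit covers
    fall back to single-character units.
    """
    units = []
    i = 0
    while i < len(word):
        # Try the longest candidate first and shrink until a known unit is found.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                units.append(word[i:j])
                i = j
                break
        else:
            # No known unit starts at position i: emit the single character.
            units.append(word[i])
            i += 1
    return units


# Toy vocabulary of sub-word units (an assumption, not real training output).
TOY_VOCAB = {"translat", "interpret", "ion", "ing", "er", "s"}

# "interpreters" is not in the vocabulary as a whole word,
# but it can be composed from units that are.
print(segment("interpreters", TOY_VOCAB))
```

With a vocabulary of units rather than whole words, the representation stays closed even for unseen word forms, at the cost of longer unit sequences.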
Measuring Usability of Light Post-Editing
Sheila Castilho
Dublin City University
The increasing use of machine translation (MT) in recent years has resulted in a strong focus on MT evaluation. It is usually assumed that the quality of current machine translation systems still requires humans to post-edit, but when this happens the end results are of high quality. High quality, in turn, means that machine translated content is acceptable and usable and the end user will be satisfied. While automated machine translation becomes ever more pervasive, little is known about how end users engage with raw machine-translated text.
This article reports on results from experiments to measure the usability of machine translated content by end users, comparing lightly post-edited content against raw machine translation output for German (DE), Simplified Chinese (ZH) and Japanese (JP) target languages, as well as for the English source language. Usability is defined as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use” (ISO 2002). Effectiveness is measured via goal completion, and efficiency is measured via task time, and via task time when only successful tasks are considered. Satisfaction is defined as “user’s perceptions, feelings, and opinions of the product, usually captured through both written and oral questioning” (Rubin and Chisnell 2011), and as the “freedom from discomfort, and positive attitudes towards the use of the product” (ISO 1998).
In order to measure usability, eight tasks were created from Online Help content for a spreadsheet application in collaboration with an industry partner. The tasks were translated from English into German, Simplified Chinese and Japanese by the company’s MT system and lightly post-edited by the company’s translation providers. Post-editing was carried out only when terminology and grammatical errors were found in the output. Fourteen native speakers of German, twenty-one native speakers of Simplified Chinese and twenty-eight native speakers of Japanese were divided into two groups: one group used the lightly post-edited instructions, and the second used the raw machine translated instructions. The English participants, who used the source texts, formed one single group. The participants were asked to follow the instructions and perform the tasks in the spreadsheet user interface. After completing the tasks, the participants were asked to answer a post-task satisfaction questionnaire in order to capture their opinion on how useful the instructions were.
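As a minimal sketch of how the effectiveness and efficiency measures named above could be computed from task logs: the record layout (a completion flag plus a time in seconds), the function name usability_summary, and the sample values are assumptions for illustration, not data or tooling from the study.

```python
from statistics import mean


def usability_summary(records):
    """Summarize task logs given as (completed, seconds) pairs.

    Hypothetical helper: effectiveness is the goal completion rate,
    efficiency is reported as mean task time over all tasks and over
    successful tasks only.
    """
    successful_times = [seconds for done, seconds in records if done]
    return {
        "effectiveness": len(successful_times) / len(records),
        "mean_task_time": mean(seconds for _, seconds in records),
        "mean_successful_task_time": (
            mean(successful_times) if successful_times else None
        ),
    }


# Invented sample: three completed tasks and one abandoned task.
sample = [(True, 60.0), (False, 120.0), (True, 90.0), (True, 30.0)]
print(usability_summary(sample))
```

Reporting task time both ways matters because slow, failed attempts can inflate the overall mean without saying anything about how quickly users succeed.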
A web survey was also implemented in order to gather a general indication of satisfaction with genuine users of the software on a large scale. The survey was displayed on the industry partner’s website for 140 articles (EN, DE, ZH and JP) and gathered information on ‘how useful’ the content is for the end user. The online survey consisted of only one multiple choice question: “Was this information helpful?” (YES/NO).
The main objectives of the experiments were i) to investigate the extent to which light human post-editing of machine translation affects the acceptability of instructional content, and ii) to compare the level of acceptability across German, Simplified Chinese and Japanese. Results show that the implementation of light post-editing directly influences acceptability for German and Simplified Chinese, more so than for Japanese. Moreover, the findings of this research show that different languages have different thresholds for translation quality.
Post-editing Strategies for Machine Translation Output of User Generated Content
Miguel Angel Candel-Mora
Universitat Politecnica de Valencia
With the advent of Web 2.0 and the active participation of users, online consumer-generated reviews have become a clear reference in purchasing decision-making processes. These reviews have already been studied to a large extent from the point of view of marketing, business, tourism and information technology (Schemmann, 2011), in areas such as the influence on decision-making (Ricci & Wietsma, 2006) or the characteristics of the textual genre (Vásquez, 2014), in order to consolidate this genre with certain special features as well as to improve online review platforms.
A common feature of most review platforms is the use of machine translation systems to immediately make a review available to as many users as possible in different languages. The premise that motivates this work is that, in the case of user-generated reviews in the domain of tourism, the message is not only transmitted through linguistic resources: there are other elements or textual artifacts that should be taken into consideration in the post-editing strategy, in addition to relevant grammar and stylistic post-editing guidelines (Babych, 2014; Vilar et al., 2006). In other words, opinions are not only conveyed through language, as there are some genre-specific features, such as intertextuality or reference to other opinions, the profile of the reviewer, or paralinguistic elements, that contribute to the reliability and credibility of consumer reviews.
Several studies have already confirmed that there are no universal guidelines for post-editing (Allen, 2003; TAUS, 2010), and each genre requires specific quality rating scales. Thus, this work highlights the need to pay special attention to textual conventions in any post-editing strategy, in addition to identifying the linguistic error patterns common to most post-editing guidelines. More specifically, the objective of this work is to compare textual characteristics of user reviews originally written in English and in Spanish, using data derived from a corpus-based analysis, in order to design standard guidelines for MT output post-editing tasks.
Evaluation Study of Translation-based Applications Models of Arabic Learning in Smart Devices
Nour El Houda El Karoubi
The focus has recently shifted from education to learning. The teacher's effort in the classroom has become less important than the outcome of the efforts exerted by the learner himself, who has become – along with the learner’s knowledge and skills – the realistic and main criterion of the education process. There is a growing number of programs highlighting the positive part played by the learner, which was once considered negative. Similarly, many foreign language teaching programs focus on peer teaching and participatory teaching approaches, and on other relevant programs that underline the role of learners, both individually and collectively. This research aims at reaching conclusions, suggestions, and recommendations for developing translation-based programs and applications for teaching Arabic. In addition, it seeks to foster the culture of e-learning and improve academic achievement by providing an interactive electronic learning environment with high quality competencies.
It is expected that the following can benefit from this research:
Education institutions teaching Arabic as a second language
Arabic language teachers for non-Arabic speakers
Engineers, translators, and those who develop Arabic language learning applications
Methodology:
The study adopts descriptive and analytical methods, because the analysis of translation-based application models for teaching the Arabic language is mainly based on smart devices. It tackles several issues, including: smart devices’ compatibility and efficiency in teaching the Arabic language, especially to non-Arabic speakers; their reliability and suitability as a self-learning tool; and how professional the applications’ developers are. How extensive is their knowledge of the Arabic language? What educational curricula are used? How can the applications be improved and used as reliable references in education?
Conclusion:
By examining 12 of the most downloaded free smartphone applications, a list of criteria for teaching language skills was devised to evaluate these websites and applications. The websites were then evaluated, and the emerging results were analyzed and explained.
I have noticed that 10 of these applications adopt a "translation without grammar" method, where the sentence is used as an essential element in teaching and practicing the language, making the language learning process easier. I have recorded a set of observations regarding the learning process, which is conducted mainly by translating some vocabulary words and sentences from and into the target language.
Status of Legal Translation in the Digital Age: Algeria as a Case
Imane Benmohamed
University of Algiers II
There is no doubt that digital technology has greatly affected the translation industry through the tremendous development it has brought about on many levels, mainly in the efficiency of the translator and in the translation field itself, both in theory and in practice. However, there is a disparity in impact that clearly varies with translational discipline and geographical scale. Our presentation aims at highlighting the reality of Arabic translation in the digital age from the perspective of legal translation in particular, and, more specifically, in Algeria. It tries to find answers to the following questions: How are modern technologies being used in legal translation in Algeria? Is there really an impasse between Algerian legal translators and technologies? In addition, if there is, what are the reasons for this impasse?
In order to answer these questions in a scholarly way, we decided to carry out a field study on a sample of sworn translators who own legal translation agencies, for they are the most professional group dealing with the translation of official documents in Algeria. Based on factual data rather than speculation and prejudice, the study aims at finding out whether they depend on term banks and electronic corpora (parallel corpora or comparable corpora) to do their work.
The sample involves 20 legal translators from different age groups (20 - 50 years old and above), with varying professional experience (6 - 15 years). We distributed questionnaires with 12 questions, each with a set of answers. Translators had to make one choice only. The analysis of the questionnaires shows that 70% of the participants indicated that they did not use digital means to translate Algerian legal documents; those who did use technologies (30%) mainly used term banks (37%) as their first choice, then online search engines (27%) as second, and electronic corpora (18%) as third. In order to get accurate data for each digital method, we asked the participants how much they use each one of them: term banks, parallel corpora, or comparable corpora. As for term banks, which are databases of terms covering different areas of knowledge, all participants confirmed they knew them, yet only 20% said they used them constantly, 40% said they never used them, and 40% said they used them occasionally. Regarding electronic parallel corpora, which contain source texts and their corresponding target texts, about one third of the participants (30%) admitted they knew nothing about them; another 30% revealed they never used them; 20% stated they used them regularly, and 20% occasionally. It seems that comparable corpora, which contain source texts in a particular language or various languages (not translated texts) and are subject to special criteria in terms of genre, time, style, and content, are the least used among participants in this study. Only 10% used them, while 50% admitted they did not know them, and 40% said they did not use them. According to participants, the main causes of the uncommon usage of technologies in legal translation in Algeria are: translators’ preference for the classical translation methods (50%); difficulties accessing technologies (30%); and lack of good command of technologies (30%).
In light of the above, the following preliminary conclusions can be drawn: Legal translation in Algeria does not depend on digital technologies as much as on classical methods. An actual impasse between specialists and digital applications, which are translation tools, is evident. Electronic term banks are the most widely used technological tools among Algerian sworn translators, followed by parallel corpora, and, finally, comparable corpora, which are still unknown to many. Despite the qualitative leap in digital technology and its impact on the translation industry, it still encounters constraints in some disciplines and countries. It is necessary to devise an immediate plan focusing on the close relationship between good quality translation and the effective provision of tools and methods that help a translator establish good control over his work, as in the case of modern technologies.
Translation Techniques in the Digital Age: Towards Practical Preparation of Translator and Raising the Stakes of Market
Saida Kohil
Annaba University-Translation Laboratory
This research focuses on the educational nature of translation. We have chosen to invest in the realm of digital practice, which produces a translational competence added to all other related linguistic, cultural, deliberative, methodological, and cost-effective competences. Digital competence will effectively build a translator working in applied languages, as it enables translators to use and engage internal and external resources in the process of transmission, saving time and improving the translation quality that we are always eager to achieve. Like their peers worldwide, Arab translation institutes seek to build a translator who meets the requirements of the real-life market, such as tourism, on which we have focused by analyzing the techniques of achieving digital competence in translating business-related texts.
Problematics: How can we formulate digital competence in the translation classroom? What are the possible means to practically implement digitalized measures in the formation of translators in the tourism domain? How can we win the market by using digitalized tools in translation? What are the available alternatives to digitalized technologies in communication established by translation? What are the new horizons and their implications for the formation of translators?
Methodology:
Introduction:
From media to digitization in translation
Translation competences in the formation of translator in tourism domain
Training and employment mechanisms in the industry of digitalized competence at translation classrooms: Analysis: Documenting and reading references of digitization; structure: Editing with digitization tools; differences between practice of digitization and tools of digitization in the formation of translator
Digital competence in the formation of translator in tourism domain: Representations and results
Conclusion:
Objective and Results:
The aim of this study is to train tourism translators to acquire digital competence and interaction in translation and applied languages so as to succeed in the market.
The research seeks to enable well-trained translators to effectively use work techniques in the age of digitization to win the market and get a proper job opportunity, and to train them to translate tourism websites with related applications, such as Upwork-steps.
Mechanisms of Teaching Translation in the Digital Age
Cherifa Belhouts
University of Boumerdes
In the age of modern technologies, it has become extremely rare to find a translator still using pen and paper, either for the source text to be translated or for rendering the translation itself. This change in translation tools has resulted in a conceptual paradigm shift. Once, the paper source text was the teacher’s only education tool, on the one hand, and the paper dictionary was the student’s only help tool, on the other. Nowadays, academic translation departments are going through a transformational phase from a traditional to a more digitalized form of education, which is itself imposed as a necessary and inevitable reality resulting from the rapid development of our world, as well as from the younger generation’s approach to taking advantage of state-of-the-art scientific innovations in the age of speed and globalization.
Based on our experience in teaching translation at the Algerian University, we have seen these changes and realized the importance of keeping up with the rapid pace of development. The concepts changed, and were replaced by others, such as machine translator, machine translation, digital corpus, electronic dictionary, translation memories, search engines, and translation websites.
This research raises the following questions: Will this shift achieve the educational goals of teaching translation in university? What are the challenges and constraints? To what extent can students benefit from this type of education? What is the importance of traditional lessons in modern study? Are there any alternatives?
In order to answer these questions, we conducted an applied study within our department using a sample of university students. The aim of this study is to highlight the transformational changes and challenges that occur in teaching translation in this digital age, as well as the need to utilize modern technologies in translation education and to address the associated negative aspects. The research is outlined as follows:
Teaching translation; professional translation; modern technologies and instructions of translation; teaching translation tools; from traditional education to education in the digital age; definition of sample; definition of corpus; machine translations: Overview, analysis and inference of problems and solutions