Machine Learning in Translation Corpora Processing

Title Machine Learning in Translation Corpora Processing PDF eBook
Author Krzysztof Wolk
Publisher CRC Press
Total Pages 264
Release 2019-02-25
Genre Computers
ISBN 0429590776

This book reviews ways to improve statistical machine speech translation between Polish and English. Research has been conducted mostly on dictionary-based, rule-based, and syntax-based machine translation techniques. Most popular methodologies and tools are not well suited to the Polish language and therefore require adaptation, and both parallel and monolingual language resources are lacking. The main objective of this volume is to develop an automatic and robust Polish-to-English translation system to meet specific translation requirements and to develop bilingual textual resources by mining comparable corpora.

Machine Learning in Translation

Title Machine Learning in Translation PDF eBook
Author Peng Wang
Publisher Taylor & Francis
Total Pages 219
Release 2023-04-12
Genre Language Arts & Disciplines
ISBN 100083865X

Machine Learning in Translation introduces machine learning (ML) theories and technologies that are most relevant to translation processes, approaching the topic from a human perspective and emphasizing that ML and ML-driven technologies are tools for humans. Providing an exploration of the common ground between human and machine learning and of the nature of translation that leverages this new dimension, this book helps linguists, translators, and localizers better find their added value in an ML-driven translation environment. Part One explores how humans and machines approach the problem of translation in their own particular ways, in terms of word embeddings, chunking of larger meaning units, and prediction in translation based upon the broader context. Part Two introduces key tasks, including machine translation, translation quality assessment and quality estimation, and other Natural Language Processing (NLP) tasks in translation. Part Three focuses on the role of data in both human and machine learning processes. It proposes that a translator’s unique value lies in the capability to create, manage, and leverage language data in different ML tasks in the translation process. It outlines new knowledge and skills that need to be incorporated into traditional translation education in the machine learning era. The book concludes with a discussion of human-centered machine learning in translation, stressing the need to empower translators with ML knowledge, through communication with ML users, developers, and programmers, and with opportunities for continuous learning. This accessible guide is designed for current and future users of ML technologies in localization workflows, including students on courses in translation and localization, language technology, and related areas. It supports the professional development of translation practitioners, so that they can fully utilize ML technologies and design their own human-centered ML-driven translation workflows and NLP tasks.

Using Comparable Corpora for Under-Resourced Areas of Machine Translation

Title Using Comparable Corpora for Under-Resourced Areas of Machine Translation PDF eBook
Author Inguna Skadiņa
Publisher Springer
Total Pages 323
Release 2019-02-06
Genre Computers
ISBN 3319990047

This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.

Handbook of Natural Language Processing and Machine Translation

Title Handbook of Natural Language Processing and Machine Translation PDF eBook
Author Joseph Olive
Publisher Springer Science & Business Media
Total Pages 956
Release 2011-03-02
Genre Computers
ISBN 1441977139

This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the Global Autonomous Language Exploitation (GALE) program of the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation. The most fundamental contrast between GALE and its predecessor programs was its holistic integration of previously separate or sequential processes. In earlier language research programs, each of the individual processes was performed separately and sequentially: speech recognition, language recognition, transcription, translation, and content summarization. The GALE program employed a distinctly new approach by executing these processes simultaneously. Speech and language recognition algorithms now aid translation and transcription processes and vice versa. This combination of previously distinct processes has produced significant research and performance breakthroughs and has fundamentally changed the natural language processing and machine translation fields. This comprehensive handbook provides an exhaustive exploration of these latest technologies in natural language, speech and signal processing, and machine translation, providing researchers, practitioners and students with an authoritative reference on the topic.

Learning Machine Translation

Title Learning Machine Translation PDF eBook
Author Cyril Goutte
Publisher MIT Press
Total Pages 329
Release 2009
Genre Computers
ISBN 0262072971

How machine learning can improve machine translation: enabling technologies and new statistical techniques.

Machine Translation

Title Machine Translation PDF eBook
Author Thierry Poibeau
Publisher MIT Press
Total Pages 298
Release 2017-09-15
Genre Computers
ISBN 0262534215

A concise, nontechnical overview of the development of machine translation, including the different approaches, evaluation issues, and major players in the industry. The dream of a universal translation device goes back many decades, long before Douglas Adams's fictional Babel fish provided this service in The Hitchhiker's Guide to the Galaxy. Since the advent of computers, research has focused on the design of digital machine translation tools—computer programs capable of automatically translating a text from a source language to a target language. This has become one of the most fundamental tasks of artificial intelligence. This volume in the MIT Press Essential Knowledge series offers a concise, nontechnical overview of the development of machine translation, including the different approaches, evaluation issues, and market potential. The main approaches are presented from a largely historical perspective and in an intuitive manner, allowing the reader to understand the main principles without knowing the mathematical details. The book begins by discussing problems that must be solved during the development of a machine translation system and offering a brief overview of the evolution of the field. It then takes up the history of machine translation in more detail, describing its pre-digital beginnings, rule-based approaches, the 1966 ALPAC (Automatic Language Processing Advisory Committee) report and its consequences, the advent of parallel corpora, the example-based paradigm, the statistical paradigm, the segment-based approach, the introduction of more linguistic knowledge into the systems, and the latest approaches based on deep learning. Finally, it considers evaluation challenges and the commercial status of the field, including activities by such major players as Google and Systran.

Building and Using Comparable Corpora for Multilingual Natural Language Processing

Title Building and Using Comparable Corpora for Multilingual Natural Language Processing PDF eBook
Author Serge Sharoff
Publisher Springer Nature
Total Pages 138
Release 2023-08-23
Genre Computers
ISBN 3031313844

This book provides a comprehensive overview of methods to build comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic, followed by a comparison to parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models, such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, it explains how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.