Recent Advances in Multiword Units in Machine Translation and Translation Technology


Book Description

The investigation of phraseology through corpus-based and computational approaches holds significant relevance for various professionals, including translators, interpreters, terminologists, lexicographers, language instructors, and learners. Computational Phraseology, and in particular the computational analysis of multiword expressions (also known as multiword units), has gained prominence in recent years and is essential for a number of Natural Language Processing and Translation Technology applications. The failure to detect these units automatically could result in incorrect and problematic automatic translations and could hinder the performance of applications such as text summarisation and web search. Against this background, the volume offers 13 articles carefully selected and organised into two parts: ‘Computational treatment of multiword units’ and ‘Corpus-based and linguistic studies in phraseology’. The contributions not only highlight the latest advancements in computational and corpus-based phraseology but also reiterate its vital role in all areas of language technologies, including basic and applied research.




Multiword Units in Machine Translation and Translation Technology


Book Description

The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing but is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention but there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully. This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.




Computational and Corpus-Based Phraseology


Book Description

This book constitutes the refereed proceedings of the 4th International Conference on Computational and Corpus-Based Phraseology, Europhras 2022, held in Malaga, Spain, in September 2022. The 16 full papers presented in this book were carefully reviewed and selected from 59 submissions. The papers in this volume cover a number of topics including general corpus-based approaches to phraseology, phraseology in translation and cross-linguistic studies, phraseology in language teaching and learning, phraseology in specialized languages, phraseology in lexicography, cognitive approaches to phraseology, the computational treatment of multiword expressions, and the development, annotation, and exploitation of corpora for phraseological studies.




Corpora in Translation and Contrastive Research in the Digital Age


Book Description

Corpus-based contrastive and translation research are areas that keep evolving in the digital age, as the range of new corpus resources and tools expands, opening up to different approaches and application contexts. The current book contains a selection of papers which focus on corpora and translation research in the digital age, outlining some recent advances and explorations. After an introductory chapter which outlines language technologies applied to translation and interpreting with a view to identifying challenges and research opportunities, the first part of the book is devoted to current advances in the creation of new parallel corpora for under-researched areas, the development of tools to manage parallel corpora or as an alternative to parallel corpora, and new methodologies to improve existing translation memory systems. The contributions in the second part of the book address a number of cutting-edge linguistic issues in the area of contrastive discourse studies and translation analysis on the basis of comparable and parallel corpora in several languages such as English, German, Swedish, French, Italian, Spanish, Portuguese and Turkish, thus showcasing the linguistic diversity covered by these recent investigations. Given the multiplicity of topics, methodologies and languages studied in the different chapters, the book will be of interest to a wide audience working in the fields of translation studies, contrastive linguistics and the automatic processing of language.




Idiom Treatment Experiments in Machine Translation


Book Description

In 1975, Searle stated that one should speak idiomatically unless there is some good reason not to do so. Fillmore, Kay, and O’Connor in 1988 defined an idiomatic expression or construction as something that a language user could fail to know while knowing everything else in the language. Our language is rich in conversational phrases, idioms, metaphors, and general expressions used with a metaphorical meaning. These idiomatic expressions pose a particular challenge for Machine Translation (MT), because for the most part they cannot be translated literally, only according to their meaning. The present book shows how idiomatic expressions can be recognized and correctly translated with the help of a bilingual idiom dictionary (English-German), a monolingual (German) corpus, and morphosyntactic rules. The work focuses on the field of Example-based Machine Translation (EBMT). A theory of idiomatic expressions with their syntactic and semantic properties is provided, followed by the practical part of the book, which describes how the hybrid EBMT system METIS-II is able to correctly process idiomatic expressions. A comparison of METIS-II with three commercial systems shows that idioms are not impossible to translate, as was predicted in 1952: “The only way for a machine to treat idioms is—not to have idioms!” This book furnishes plenty of examples of idiomatic phrases and provides the foundation for how MT systems can process and translate idioms by means of simple linguistic resources.




The Pragmatics of Multiword Terms


Book Description

This book explores the pragmatics of specialized language with a focus on multiword terms, complex phrases characterized by sequences of nouns or adjectives whose meaning is clarified in the unspecified but implicit links between them, with implications for their use and translation. The volume adopts an innovative approach rooted in Frame-Based Terminology which allows for the analysis of multiword term formation (compound terms in specialized language, such as horizontal-axis wind turbine) from an integrated semantic and pragmatic perspective. The book features data from a corpus on wind power in English, Spanish, and French comprising such specialized texts as research articles, books, reports, and PhD theses to consider term extraction and the identification of terminological correspondences. Cabezas-García highlights the ways in which pragmatic analysis is an integral part of understanding multiword terms, due to the necessary inference of information implicit within them, with applications for future research on pragmatics and specialized language more broadly. This book will be of interest to students and researchers in pragmatics, semantics, corpus linguistics, and terminology.




Lexical Collocation Analysis


Book Description

This book re-examines the notion of word associations, more precisely collocations. It attempts to come to a potentially more generally applicable definition of collocation and how to best extract, identify and measure collocations. The book highlights the role played by (i) automatic linguistic annotation (part-of-speech tagging, syntactic parsing, etc.), (ii) using semantic criteria to facilitate the identification of collocations, (iii) multi-word structures, instead of the widespread assumption of bipartite collocational structures, for capturing the intricacies of the phenomenon of syntagmatic attraction, (iv) considering collocation and valency as near neighbours in the lexis-grammar continuum and (v) the mathematical properties of statistical association measures in the automatic extraction of collocations from corpora. This book is an ideal guide to the use of statistics in collocation analysis and lexicography, as well as a practical text for developing skills in the application of computational lexicography. Lexical Collocation Analysis: Advances and Applications begins with a proposal for integrating both collocational and valency phenomena within the overarching theoretical framework of construction grammar. Next, the book makes the case for integrating advances in syntactic parsing and in collocational analysis. Chapter 3 offers an innovative look at complementing corpus data and dictionaries in the identification of specific types of collocations consisting of restricted predicate-argument combinations. This strategy complements corpus collocational data with network analysis techniques applied to dictionary entries. Chapter 4 explains the potential of collocational graphs and networks both as a visualization tool and as an analytical technique. Chapter 5 introduces MERGE (Multi-word Expressions from the Recursive Grouping of Elements), a data-driven approach to the identification and extraction of multi-word expressions from corpora. Finally, the book concludes with an analysis and evaluation of factors influencing the performance of collocation extraction methods in parsed corpora.




The Routledge Handbook of Translation and Technology


Book Description

The Routledge Handbook of Translation and Technology provides a comprehensive and accessible overview of the dynamically evolving relationship between translation and technology. Divided into five parts, with an editor's introduction, this volume presents the perspectives of users of translation technologies, and of researchers concerned with issues arising from the increasing interdependency between translation and technology. The chapters in this Handbook tackle the advent of technologization at both a technical and a philosophical level, based on industry practice and academic research. Containing over 30 authoritative, cutting-edge chapters, this is an essential reference and resource for those studying and researching translation and technology. The volume will also be valuable for translators, computational linguists and developers of translation tools.




Mobile Speech and Advanced Natural Language Solutions


Book Description

"Mobile Speech and Advanced Natural Language Solutions" presents a discussion of the most recent advances in intelligent human-computer interaction, including fascinating new study findings on talk-in-interaction, which is the province of conversation analysis, a subfield in sociology/sociolinguistics, a new and emerging area in natural language understanding. Editors Amy Neustein and Judith A. Markowitz have recruited a talented group of contributors to introduce the next generation of natural language technologies for practical speech processing applications that serve the consumer’s need for well-functioning natural language-driven personal assistants and other mobile devices, while also addressing the need of businesses for better functioning IVR-driven call centers that yield a more satisfying experience for the caller. This anthology is aimed at two distinct audiences: one consisting of speech engineers and system developers, the other of linguists and cognitive scientists. The text builds on the experience and knowledge of each of these audiences by exposing them to the work of the other.




HCTL Open International Journal of Technology Innovations and Research (IJTIR)


Book Description

HCTL Open International Journal of Technology Innovations and Research (IJTIR) [ISSN (Online): 2321-1814] is an international, open-access, peer-reviewed online journal devoted to various disciplines of science and technology. HCTL Open IJTIR is a bi-monthly journal published by HCTL Open Publications Solutions, India and Hybrid Computing Technology Labs, India. More information is available at http://ijtir.hctl.org/