Machine Translation and the Lexicon


Book Description

This volume constitutes the proceedings of the Third International Workshop of the European Association for Machine Translation, held in Heidelberg, Germany, in April 1993. The EAMT Workshops traditionally aim to bring together researchers, developers, users, and others interested in machine or computer-assisted translation research, development, and use. The volume presents thoroughly revised versions of the 15 best workshop contributions together with an introductory survey by the volume editor. The presentations center primarily on questions of acquiring, sharing, and managing lexical data, but also address aspects of lexical description.




Machine Translation


Book Description

This book describes a novel, cross-linguistic approach to machine translation that solves certain classes of syntactic and lexical divergences by means of a lexical conceptual structure that can be composed and decomposed in language-specific ways. This approach allows the translator to operate uniformly across many languages, while still accounting for knowledge that is specific to each language.







The KBMT Project


Book Description

Machine translation of natural languages is one of the most complex and comprehensive applications of computational linguistics and artificial intelligence. This is especially true of knowledge-based machine translation (KBMT) systems, which require many knowledge resources and processing modules to carry out the necessary levels of analysis, representation and generation of meaning and form. The number of real-world problems, tasks, and solutions involved in developing any realistic-size knowledge-based machine translation system is enormous. It is thus difficult for researchers in the field to learn what a system "really does". This book fills that need with a detailed case study of a KBMT system implemented at the Center for Machine Translation at Carnegie Mellon University. The research consists in part of the creation of a system for translation between English and Japanese. The corpora used in the project were manuals for installing and maintaining IBM personal computers (the project was sponsored by IBM through its Tokyo Research Laboratory). Individual chapters describe the interlingua texts used in knowledge-based machine translation, the grammar formalism embodied in the system, the grammars and lexicons and their roles in the translation process, the process of source language analysis, an augmentation module that interactively and automatically resolves ambiguities remaining after source language analysis, and the generator, which produces target language sentences. Detailed appendices illustrate the process from analysis through generation. This book is intended for developers, researchers and advanced students in natural language processing and computational linguistics, including all those who have an interest in machine translation and machine-aided translation.







Conceptual Basis of the Lexicon in Machine Translation


Book Description

Abstract: "This report describes the organization and content of lexical information required for the task of machine translation. In particular, the lexical-conceptual basis for UNITRAN, an implemented machine translation system, will be described. UNITRAN uses an underlying form called lexical conceptual structure to perform two difficult, but crucial, tasks: lexical selection (i.e., choosing the appropriate target-language terms for a given source-language sentence) and syntactic realization (i.e., mapping an underlying lexical representation to a corresponding syntactic structure)."




Machine Translation: From Research to Real Users


Book Description

This book constitutes the refereed proceedings of the 5th Conference of the Association for Machine Translation in the Americas, AMTA 2002, held in Tiburon, CA, USA, in October 2002. The 18 revised full technical papers, 3 user studies, and 9 system descriptions presented were carefully reviewed and selected for inclusion in the book. Among the issues addressed are hybrid translation environments, resource-limited MT, statistical word-level alignment, word formation rules, rule learning, web-based MT, translation divergences, example-based MT, data-driven MT, classification, contextual translation, the lexicon building process, commercial MT systems, speech-to-speech translation, and language checking systems.




Translation, Brains and the Computer


Book Description

This book is about machine translation (MT) and the classic problems associated with this language technology. It examines the causes of these problems and, for linguistic, rule-based systems, attributes the cause to language’s ambiguity and complexity and their interplay in logic-driven processes. For non-linguistic, data-driven systems, the book attributes translation shortcomings to the very lack of linguistics. It then proposes a demonstrable way to relieve these drawbacks in the shape of a working translation model (Logos Model) that has taken its inspiration from key assumptions about psycholinguistic and neurolinguistic function. The book suggests that this brain-based mechanism is effective precisely because it bridges both linguistically driven and data-driven methodologies. It shows how simulation of this cerebral mechanism has freed this one MT model from the all-important, classic problem of complexity when coping with the ambiguities of language. Logos Model accomplishes this by a data-driven process that does not sacrifice linguistic knowledge, but that, like the brain, integrates linguistics within a data-driven process. As a consequence, the book suggests that the brain-like mechanism embedded in this model has the potential to contribute to further advances in machine translation in all its technological instantiations.




Using Comparable Corpora for Under-Resourced Areas of Machine Translation


Book Description

This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.