Hybrid Approaches to Machine Translation


Book Description

This volume provides an overview of the field of Hybrid Machine Translation (MT) and presents some of the latest research conducted by linguists and practitioners from a range of disciplines. Nowadays, the most important developments in MT are achieved by combining data-driven and rule-based techniques. These combinations typically involve hybridization of different traditional paradigms, such as the introduction of linguistic knowledge into statistical approaches to MT, the incorporation of data-driven components into rule-based approaches, or statistical and rule-based pre- and post-processing for both types of MT architectures. The book is of interest primarily to MT specialists, but also to readers in the wider fields of Computational Linguistics, Machine Learning and Data Mining, and to translators and managers of translation companies and departments who are interested in recent developments concerning automated translation tools.




Computational Linguistics and Intelligent Text Processing


Book Description

The two-volume set LNCS 9623 + 9624 constitutes revised selected papers from the CICLing 2016 conference which took place in Konya, Turkey, in April 2016. The total of 89 papers presented in the two volumes was carefully reviewed and selected from 298 submissions. The book also contains 4 invited papers and a memorial paper on Adam Kilgarriff’s Legacy to Computational Linguistics. The papers are organized in the following topical sections: Part I: In memoriam of Adam Kilgarriff; general formalisms; embeddings, language modeling, and sequence labeling; lexical resources and terminology extraction; morphology and part-of-speech tagging; syntax and chunking; named entity recognition; word sense disambiguation and anaphora resolution; semantics, discourse, and dialog. Part II: machine translation and multilingualism; sentiment analysis, opinion mining, subjectivity, and social media; text classification and categorization; information extraction; and applications.




Studies on Multilingual Lexicography


Book Description

Given recent technological advances and their influence on the design and development of dictionaries and lexicographic resources, it seems important to bring together a series of publications that address this new situation, dealing in particular with multilingual and electronic lexicography in an increasingly digital, multilingual and multicultural society. This is the main objective of this volume, which is structured around two central themes. The first section discusses the concept of multilingual lexicography in light of the influence that the Internet and digital technologies have exercised, and continue to exercise, on the conception and design of dictionaries and new lexicographic tools, as well as on the emergence of new types of users and forms of consultation. The role of the dictionary must necessarily be related to social development and change. The second thematic section presents different dictionaries and resources that take a multilingual and electronic approach to the lexicographical treatment and consultation of linguistic data.




Computational Phraseology


Book Description

Whether you wish to deliver on a promise, take a walk down memory lane or even on the wild side, phraseological units (also often referred to as phrasemes or multiword expressions) are present in most communicative situations and in all the world’s languages. Phraseology, the study of phraseological units, has therefore become a rare unifying theme across linguistic theories. In recent years, an increasing number of studies have been concerned with the computational treatment of multiword expressions: these pertain, among other things, to their automatic identification, extraction or translation, and to the role they play in various Natural Language Processing applications. Computational Phraseology is a comparatively new field where better understanding and more advances are urgently needed. This book aims to address this pressing need by bringing together contributions focusing on different perspectives of this promising interdisciplinary field.







Shallow Discourse Parsing for German


Book Description

The last few decades have seen impressive improvements in several areas of Natural Language Processing. Nevertheless, getting a computer to make sense of the discourse of utterances in a text remains challenging. Several theories aim to describe and analyze the coherent structure of a well-written text, but with varying degrees of applicability and feasibility for practical use. This book is about shallow discourse parsing, following the paradigm of the Penn Discourse TreeBank, a corpus containing over 1 million words annotated for discourse relations. When it comes to discourse processing, any language other than English must be considered a low-resource language. This book addresses discourse parsing for German. The limited availability of annotated data for German means that the potential of modern, deep-learning-based methods relying on such data is also limited. This book explores to what extent machine-learning and more recent deep-learning-based methods can be combined with traditional, linguistic feature engineering to improve performance on the discourse parsing task. The end-to-end shallow discourse parser for German developed for the purpose of this book is open-source and available online. Work has also been carried out on several connective lexicons in different languages. Strategies for creating or further developing such lexicons for a given language are discussed, along with suggestions on how to further increase their usefulness for shallow discourse parsing. The book will be of interest to all whose work involves Natural Language Processing, particularly in languages other than English.




Dependency Structures from Syntax to Discourse


Book Description

Based on large corpora of journalistic English, this title examines dependency relations and related properties at both syntactic and discourse levels, seeking to unravel the language patterns of real-life usage. With a focus on rank-frequency distribution, the author investigates the distribution of linguistic properties/units from the perspectives of properties, motifs and sequencings. At the syntactic level, the book analyses the following three dimensions: various combinations of a complete dependency structure, valency and dependency distance. At the discourse level, it proves that the elements can also form dependency relations by exploring (1) the rank-frequency distribution of Rhetorical Structure Theory relations, their motifs, discourse valency and discourse dependency distance; (2) whether there is top-down organisation or an inverted pyramid structure at all three discourse levels; and (3) whether discourse dependency distances and valencies are lawfully distributed, following the same distribution patterns as those at the syntactic level. This book will be of great value for scholars and students of quantitative linguistics and computational linguistics, and its practical insights will also benefit professionals in language teaching and journalistic writing.




Using Comparable Corpora for Under-Resourced Areas of Machine Translation


Book Description

This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.




Creativity Policy, Partnerships and Practice in Education


Book Description

This book examines the gaps in creativity education across the education lifespan and the resulting implications for creative education and economic policy. Building on cutting-edge international research, the editors and contributors explore innovations in interdisciplinary creativities, including STEM agendas and definitions, science and creativity and organisational creativity amongst other subjects. Central to the volume is the idea that good creative educational practice and policy advancement need to reimagine individual contribution and possibilities, whilst resisting standardisation: such practice is inherently risky, not risk-averse. Prioritising creative partnerships, zones of contact, practice encounters and creative ecologies signals new modes of participatory engagement. Unfortunately, while primary schools continue to construct environments conducive to this kind of ‘slow education’, secondary schools and education policy persistently do not. This book argues, from diverse viewpoints and methodological perspectives, that 21st-century creativity education must find a way to advance in a more integrated and less siloed manner in order to respond to pedagogical innovation, economic imperatives and creative possibilities, and adequately prepare students for creative practice, workplaces and publics. This innovative volume will appeal to students and scholars of creative practice as well as policy makers and practitioners.




Statistical Machine Translation


Book Description

The dream of automatic language translation is now closer thanks to recent advances in the techniques that underpin statistical machine translation. This class-tested textbook from an active researcher in the field provides a clear and careful introduction to the latest methods and explains how to build machine translation systems for any two languages. It introduces the subject's building blocks from linguistics and probability, then covers the major models for machine translation: word-based, phrase-based, and tree-based, as well as machine translation evaluation, language modeling, discriminative training and advanced methods to integrate linguistic annotation. The book also reports the latest research, presents the major outstanding challenges, and enables novices as well as experienced researchers to make novel contributions to this exciting area. It is ideal for students at undergraduate and graduate level, or for anyone interested in the latest developments in machine translation.