Finite-State Computational Morphology


Book Description

This handbook provides a comprehensive account of current research on the finite-state morphology of Georgian and enables the reader to enter quickly into Georgian morphosyntax and its computational processing. It combines linguistic analysis with application of finite-state technology to processing of the language. The book opens with the author’s synoptic overview of the main lines of research, covers the properties of the word and its components, then moves up to the description of Georgian morphosyntax and the morphological analyzer and generator of Georgian.

The book comprises three chapters and accompanying appendices. The aim of the first chapter is to describe the morphosyntactic structure of Georgian, focusing on differences between Old and Modern Georgian. The second chapter focuses on the application of finite-state technology to the processing of Georgian and on the compilation of a tokenizer, a morphological analyzer and a generator for Georgian. The third chapter discusses the testing and evaluation of the analyzer’s output and the compilation of the Georgian Language Corpus (GLC), which is now accessible online and freely available to the research community.

Since the development of the analyzer, the field of computational linguistics has advanced in several ways, but the majority of new approaches to language processing have not been tested on Georgian. So, the organization of the book makes it easier to handle new developments from both a theoretical and practical viewpoint.

The book includes a detailed index and references as well as the full list of morphosyntactic tags. It will be of interest and practical use to a wide range of linguists and advanced students interested in Georgian morphosyntax generally as well as to researchers working in the field of computational linguistics and focusing on how languages with complicated morphosyntax can be handled through finite-state approaches.




Finite State Morphology


Book Description

The finite-state paradigm of computer science has provided a basis for natural-language applications that are efficient, elegant, and robust. This volume is a practical guide to finite-state theory and the affiliated programming languages lexc and xfst. Readers will learn how to write tokenizers, spelling checkers, and especially morphological analyzer/generators for words in English, French, Finnish, Hungarian, and other languages. Included are graded introductions, examples, and exercises suitable for individual study as well as formal courses. These take advantage of widely tested lexc and xfst applications that are just becoming available for noncommercial use via the Internet.




Computational Morphology


Book Description

Previous work on morphology has largely tended either to avoid precise computational details or to ignore linguistic generality. Computational Morphology is the first book to present an integrated set of techniques for the rigorous description of morphological phenomena in English and similar languages. By taking account of all facets of morphological analysis, it provides a linguistically general and computationally practical dictionary system for use within an English parsing program. The authors cover morphographemics (variations in spelling as words are built from their component morphemes), morphotactics (the ways that different classes of morphemes can combine, and the types of words that result), and lexical redundancy (patterns of similarity and regularity among the lexical entries for words). They propose a precise rule-notation for each of these areas of linguistic description and present the algorithms for using these rules computationally to manipulate dictionary information. These mechanisms have been implemented in practical and publicly available software, which is described in detail, and appendixes contain a large number of computer-tested sets of rules and lexical entries for English. Graeme D. Ritchie is a Senior Lecturer in the Department of Artificial Intelligence at the University of Edinburgh, where Alan W. Black is currently a research student. Graham J. Russell is a Research Fellow at ISSCO (Institut Dalle Molle pour les études sémantiques et cognitives) in Geneva, and Stephen G. Pulman is a Lecturer in the University of Cambridge Computer Laboratory and Director of SRI International's Cambridge Computer Science Research Centre.




Arabic Computational Morphology


Book Description

This is the first comprehensive overview of computational approaches to Arabic morphology. The subtitle aims to reflect that widely different computational approaches to the Arabic morphological system have been proposed. The book provides a showcase of the most advanced language technologies applied to one of the most vexing problems in linguistics. It covers knowledge-based and empirical-based approaches.




Computational Approaches to Morphology and Syntax


Book Description

The book will appeal to scholars and advanced students of morphology, syntax, computational linguistics and natural language processing (NLP). It provides a critical and practical guide to computational techniques for handling morphological and syntactic phenomena, showing how these techniques have been used and modified in practice. The authors discuss the nature and uses of syntactic parsers and examine the problems and opportunities of parsing algorithms for finite-state, context-free and various context-sensitive grammars. They relate approaches for describing syntax and morphology to formal mechanisms and algorithms, and present well-motivated approaches for augmenting grammars with weights or probabilities.




Morphology and Computation


Book Description

This book provides the first broad yet thorough coverage of issues in morphological theory. It includes a wide array of techniques and systems in computational morphology (including discussion of their limitations), and describes some unusual applications. Sproat motivates the study of computational morphology by arguing that a computational natural language system, such as a parser or a generator, must incorporate a model of morphology. He discusses a range of applications for programs with knowledge of morphology, some of which are not generally found in the literature. Sproat then provides an overview of some of the basic descriptive facts about morphology and issues in theoretical morphology and (lexical) phonology, as well as psycholinguistic evidence for human processing of morphological structure. He takes up the basic techniques that have been proposed for doing morphological processing and discusses at length various systems (such as DECOMP and KIMMO) that incorporate part or all of those techniques, pointing out the inadequacies of such systems from both a descriptive and a computational point of view. He concludes by touching on interesting peripheral areas such as the analysis of complex nominals in English, and on the main contributions of Rumelhart and McClelland's connectionism to the computational analysis of words.




Finite-state Language Processing


Book Description

Finite-state devices, such as finite-state automata, graphs, and finite-state transducers, have been present since the emergence of computer science and are extensively used in areas as various as program compilation, hardware modeling, and database management. Although finite-state devices have been known for some time in computational linguistics, more powerful formalisms such as context-free grammars or unification grammars have typically been preferred. Recent mathematical and algorithmic results in the field of finite-state technology have had a great impact on the representation of electronic dictionaries and on natural language processing, resulting in a new technology for language emerging out of both industrial and academic research. This book presents a discussion of fundamental finite-state algorithms, and constitutes an approach from the perspective of natural language processing.




The Oxford Handbook of Computational Linguistics


Book Description

This handbook of computational linguistics, written for academics, graduate students and researchers, provides a state-of-the-art reference to one of the most active and productive fields in linguistics.




State of the Art in Computational Morphology


Book Description

From the point of view of computational linguistics, morphological resources are the basis for all higher-level applications. This is especially true for languages with a rich morphology, such as German or Finnish. A morphology component should thus be capable of analyzing single word forms as well as whole corpora. For many practical applications, not only morphological analysis, but also generation is required, i.e., the production of surfaces corresponding to specific categories. Apart from uses in computational linguistics, there are also numerous practical applications that either require morphological analysis and generation or that can greatly benefit from it, for example, in text processing, user interfaces, or information retrieval. These applications have specific requirements for morphological components, including requirements from software engineering, such as programming interfaces or robustness. In 1994, the First Morpholympics took place at the University of Erlangen-Nuremberg, a competition between several systems for the analysis and generation of German word forms. Eight systems participated in the First Morpholympics; the conference proceedings [1] thus give a very good overview of the state of the art in computational morphology for German as of 1994.




Two-level Morphology


Book Description