Natural Language Parsing and Linguistic Theories


Book Description

presupposition fails, we now give a short introduction to Unification Grammar (UG). Since all implementations discussed in this volume use PROLOG (with the exception of Block/Haugeneder), we felt that it would also be useful to explain the difference between unification in PROLOG and in UG. After the introduction to UG we briefly summarize the main arguments for using linguistic theories in natural language processing. We conclude with a short summary of the contributions to this volume.

UNIFICATION GRAMMAR

Feature Structures or Complex Categories

Unification Grammar was developed by Martin Kay (Kay 1979). Kay wanted to give a precise definition (and implementation) of the notion of 'feature'. Linguists use features at nearly all levels of linguistic description. In phonetics, for instance, the phoneme b is usually described with the features 'bilabial', 'voiced' and 'nasal'. In the case of b the first two features get the value +, the third (nasal) gets the value -. Feature-value pairs in phonology are normally represented as a matrix.

bilabial: +
voiced: +
nasal: -
[Feature matrix for b.]

In syntax, features are used, for example, to distinguish different noun classes. The Latin noun 'murus' would be characterized by the following feature-value pairs: gender: masculine, number: singular, case: nominative, pred: murus. Besides a matrix representation, one frequently finds a graph representation for feature-value pairs. The edges of the graph are labelled by features. The leaves denote the value of a feature.
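As an illustration only (it is not taken from the volume), the two representations described above can be sketched in Python. A feature matrix becomes a dictionary of feature-value pairs, and the graph view becomes a list of feature-labelled paths ending in leaf values. The names `paths` and `unify` are hypothetical, and `unify` is a deliberately simplified merge of two structures, assumed here for illustration: it has no variables or shared substructures, so it is neither Kay's formulation nor PROLOG term unification.

```python
# Feature matrix for the phoneme b, as given in the text.
phoneme_b = {"bilabial": "+", "voiced": "+", "nasal": "-"}

# Feature-value pairs for the Latin noun 'murus', as given in the text.
murus = {
    "gender": "masculine",
    "number": "singular",
    "case": "nominative",
    "pred": "murus",
}


def paths(fs, prefix=()):
    """Graph view: yield (path, leaf) pairs, where each path is a tuple of
    feature labels (the edges) and the leaf is the atomic value."""
    for feature, value in fs.items():
        if isinstance(value, dict):
            # A nested structure is an inner node; follow its edges.
            yield from paths(value, prefix + (feature,))
        else:
            yield prefix + (feature,), value


def unify(left, right):
    """Simplified unification (an assumption, see lead-in): merge two
    feature structures, returning None if any feature has clashing values."""
    result = dict(left)
    for feature, value in right.items():
        if feature not in result:
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            merged = unify(result[feature], value)
            if merged is None:
                return None
            result[feature] = merged
        elif result[feature] != value:
            return None  # the same feature carries two different values
    return result


if __name__ == "__main__":
    for path, leaf in paths(phoneme_b):
        print(" -> ".join(path), "=", leaf)
    print(unify({"number": "singular"}, murus))  # compatible: merged structure
    print(unify({"number": "plural"}, murus))    # clash on 'number': None
```

In this sketch a partial description such as `{"number": "singular"}` is compatible with the entry for 'murus', while `{"number": "plural"}` is not, because the feature 'number' would receive two different values.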




Embeddings in Natural Language Processing


Book Description

Embeddings have undoubtedly been one of the most influential research areas in Natural Language Processing (NLP). Encoding information into a low-dimensional vector representation, which is easily integrable in modern machine learning models, has played a central role in the development of NLP. Embedding techniques initially focused on words, but the attention soon started to shift to other forms: from graph structures, such as knowledge bases, to other types of textual content, such as sentences and documents. This book provides a high-level synthesis of the main embedding techniques in NLP, in the broad sense. The book starts by explaining conventional word vector space models and word embeddings (e.g., Word2Vec and GloVe) and then moves to other types of embeddings, such as word sense, sentence and document, and graph embeddings. The book also provides an overview of recent developments in contextualized representations (e.g., ELMo and BERT) and explains their potential in NLP. Throughout the book, the reader can find both essential information for understanding a certain topic from scratch and a broad overview of the most successful techniques developed in the literature.




Natural Language and Speech


Book Description

This volume presents the proceedings of the Symposium on Natural Language and Speech held during the ESPRIT conference of November 1991. The symposium was organized by the newly launched Network of Excellence on Language and Speech, which brings together the foremost European experts and institutions in the two domains. The proceedings contain ten invited papers from leading experts in language and speech research, together with a set of position papers from a panel session on 'Spoken language systems: technological goals and integration issues'. The papers cover a wide spectrum of research topics, ranging from logical aspects of discourse structure to problems of prosody and automatic speech understanding. A recurrent theme is the development of an integrated, cognitively motivated theory of the process by which spoken language is understood. This volume is the second of the ESPRIT Basic Research Series. The ESPRIT Basic Research efforts aim at forging strong links between academic and industrial teams carrying out research, often interdisciplinary, at the forefront of information technology. The quality of content of this series and its broad distribution should have a major impact in making these advances accessible to both academic and industrial researchers.




Linguistic Fundamentals for Natural Language Processing


Book Description

Many NLP tasks have at their core a subtask of extracting the dependencies—who did what to whom—from natural language sentences. This task can be understood as the inverse of the problem solved in different ways by diverse human languages, namely, how to indicate the relationship between different parts of a sentence. Understanding how languages solve the problem can be extremely useful in both feature design and error analysis in the application of machine learning to NLP. Likewise, understanding cross-linguistic variation can be important for the design of MT systems and other multilingual applications. The purpose of this book is to present in a succinct and accessible fashion information about the morphological and syntactic structure of human languages that can be useful in creating more linguistically sophisticated, more language-independent, and thus more successful NLP systems. Table of Contents: Acknowledgments / Introduction/motivation / Morphology: Introduction / Morphophonology / Morphosyntax / Syntax: Introduction / Parts of speech / Heads, arguments, and adjuncts / Argument types and grammatical functions / Mismatches between syntactic position and semantic roles / Resources / Bibliography / Author's Biography / General Index / Index of Languages




Computational Cognitive Modeling and Linguistic Theory


Book Description

This open access book introduces a general framework that allows natural language researchers to enhance existing competence theories with fully specified performance and processing components. Gradually developing increasingly complex and cognitively realistic competence-performance models, it provides running code for these models and shows how to fit them to real-time experimental data. This computational cognitive modeling approach opens up exciting new directions for research in formal semantics, and linguistics more generally, and offers new ways of (re)connecting semantics and the broader field of cognitive science. The approach of this book is novel in more ways than one. Assuming the mental architecture and procedural modalities of Anderson's ACT-R framework, it presents fine-grained computational models of human language processing tasks which make detailed quantitative predictions that can be checked against the results of self-paced reading and other psycho-linguistic experiments. All models are presented as computer programs that readers can run on their own computer and on inputs of their choice, thereby learning to design, program and run their own models. But even for readers who won't do all that, the book will show how such detailed, quantitatively predicting modeling of linguistic processes is possible. A methodological breakthrough and a must for anyone concerned about the future of linguistics! (Hans Kamp) This book constitutes a major step forward in linguistics and psycholinguistics. It constitutes a unique synthesis of several different research traditions: computational models of psycholinguistic processes, and formal models of semantics and discourse processing. The work also introduces a sophisticated Python-based software environment for modeling linguistic processes. This book has the potential to revolutionize not only formal models of linguistics, but also models of language processing more generally. (Shravan Vasishth)




Foundations of Statistical Natural Language Processing


Book Description

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.




The Handbook of Computational Linguistics and Natural Language Processing


Book Description

This comprehensive reference work provides an overview of the concepts, methodologies, and applications in computational linguistics and natural language processing (NLP).

Features contributions by the top researchers in the field, reflecting the work that is driving the discipline forward
Includes an introduction to the major theoretical issues in these fields, as well as the central engineering applications that the work has produced
Presents the major developments in an accessible way, explaining the close connection between scientific understanding of the computational properties of natural language and the creation of effective language technologies
Serves as an invaluable state-of-the-art reference source for computational linguists and software engineers developing NLP applications in industrial research and development labs of software companies




Natural Language Processing and Computational Linguistics


Book Description

Work with Python and powerful open source tools such as Gensim and spaCy to perform modern text analysis, natural language processing, and computational linguistics algorithms.

Key Features
Discover the open source Python text analysis ecosystem, using spaCy, Gensim, scikit-learn, and Keras
Hands-on text analysis with Python, featuring natural language processing and computational linguistics algorithms
Learn deep learning techniques for text analysis

Book Description
Modern text analysis is now very accessible using Python and open source tools, so discover how you can now perform modern text analysis in this era of textual data. This book shows you how to use natural language processing, and computational linguistics algorithms, to make inferences and gain insights about data you have. These algorithms are based on statistical machine learning and artificial intelligence techniques. The tools to work with these algorithms are available to you right now - with Python, and tools like Gensim and spaCy. You'll start by learning about data cleaning, and then how to perform computational linguistics from first concepts. You're then ready to explore the more sophisticated areas of statistical NLP and deep learning using Python, with realistic language and text samples. You'll learn to tag, parse, and model text using the best tools. You'll gain hands-on knowledge of the best frameworks to use, and you'll know when to choose a tool like Gensim for topic models, and when to work with Keras for deep learning. This book balances theory and practical hands-on examples, so you can learn about and conduct your own natural language processing projects and computational linguistics. You'll discover the rich ecosystem of Python tools you have available to conduct NLP - and enter the interesting world of modern text analysis.

What you will learn
Why text analysis is important in our modern age
Understand NLP terminology and get to know the Python tools and datasets
Learn how to pre-process and clean textual data
Convert textual data into vector space representations
Using spaCy to process text
Train your own NLP models for computational linguistics
Use statistical learning and Topic Modeling algorithms for text, using Gensim and scikit-learn
Employ deep learning techniques for text analysis using Keras

Who this book is for
This book is for you if you want to dive in, hands-first, into the interesting world of text analysis and NLP, and you're ready to work with the rich Python ecosystem of tools and datasets waiting for you!




Natural Language Processing and Computational Linguistics


Book Description

Natural language processing (NLP) is a scientific discipline which is found at the interface of computer science, artificial intelligence and cognitive psychology. Providing an overview of international work in this interdisciplinary field, this book gives the reader a panoramic view of both early and current research in NLP. Carefully chosen multilingual examples present the state of the art of a mature field which is in a constant state of evolution. In four chapters, this book presents the fundamental concepts of phonetics and phonology and the two most important applications in the field of speech processing: recognition and synthesis. Also presented are the fundamental concepts of corpus linguistics and the basic concepts of morphology and its NLP applications such as stemming and part of speech tagging. The fundamental notions and the most important syntactic theories are presented, as well as the different approaches to syntactic parsing with reference to cognitive models, algorithms and computer applications.