Foundations of Statistical Natural Language Processing


Book Description

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.




Foundation Models for Natural Language Processing


Book Description

This open access book provides a comprehensive overview of the state of the art in research and applications of Foundation Models and is intended for readers familiar with basic Natural Language Processing (NLP) concepts. In recent years, a revolutionary new paradigm has been developed for training models for NLP. These models are first pre-trained on large collections of text documents to acquire general syntactic knowledge and semantic information. Then, they are fine-tuned for specific tasks, which they can often solve with superhuman accuracy. When the models are large enough, they can be instructed by prompts to solve new tasks without any fine-tuning. Moreover, they can be applied to a wide range of different media and problem domains, ranging from image and video processing to robot control learning. Because they provide a blueprint for solving many tasks in artificial intelligence, they have been called Foundation Models. After a brief introduction to basic NLP models, the main pre-trained language models (BERT, GPT, and the sequence-to-sequence transformer) are described, as well as the concepts of self-attention and context-sensitive embedding. Then, different approaches to improving these models are discussed, such as expanding the pre-training criteria, increasing the length of input texts, or including extra knowledge. An overview of the best-performing models for about twenty application areas is then presented, including question answering, translation, story generation, dialog systems, and generating images from text. For each application area, the strengths and weaknesses of current models are discussed, and an outlook on further developments is given. In addition, links are provided to freely available program code. A concluding chapter summarizes the economic opportunities, mitigation of risks, and potential developments of AI.




Introduction to Natural Language Processing


Book Description

A survey of computational methods for understanding, generating, and manipulating human language, which offers a synthesis of classical representations and algorithms with contemporary machine learning techniques. This textbook provides a technical perspective on natural language processing—methods for building computer software that understands, generates, and manipulates human language. It emphasizes contemporary data-driven approaches, focusing on techniques from supervised and unsupervised machine learning. The first section establishes a foundation in machine learning by building a set of tools that will be used throughout the book and applying them to word-based textual analysis. The second section introduces structured representations of language, including sequences, trees, and graphs. The third section explores different approaches to the representation and analysis of linguistic meaning, ranging from formal logic to neural word embeddings. The final section offers chapter-length treatments of three transformative applications of natural language processing: information extraction, machine translation, and text generation. End-of-chapter exercises include both paper-and-pencil analysis and software implementation. The text synthesizes and distills a broad and diverse research literature, linking contemporary machine learning techniques with the field's linguistic and computational foundations. It is suitable for use in advanced undergraduate and graduate-level courses and as a reference for software engineers and data scientists. Readers should have a background in computer programming and college-level mathematics. After mastering the material presented, students will have the technical skill to build and analyze novel natural language processing systems and to understand the latest research in the field.




Natural Language Processing in Artificial Intelligence


Book Description

This volume focuses on natural language processing, artificial intelligence, and allied areas. Natural language processing enables communication between people and computers and automatic translation to facilitate easy interaction with others around the world. This book discusses theoretical work and advanced applications, approaches, and techniques for computational models of information and how it is presented by language (artificial, human, or natural) in other ways. It looks at intelligent natural language processing and related models of thought, mental states, reasoning, and other cognitive processes. It explores the difficult problems and challenges related to partiality, underspecification, and context-dependency, which are signature features of information in nature and natural languages. Key features: addresses the functional frameworks and workflow that are trending in NLP and AI; looks at the latest technologies and the major challenges, issues, and advances in NLP and AI; explores an intelligent field monitoring and automated system through AI with NLP and its implications for the real world; and discusses data acquisition and presents a real-time case study with illustrations related to data-intensive technologies in AI and NLP.




Practical Natural Language Processing


Book Description

Many books and courses tackle natural language processing (NLP) problems with toy use cases and well-defined datasets. But if you want to build, iterate, and scale NLP systems in a business setting and tailor them for particular industry verticals, this is your guide. Software engineers and data scientists will learn how to navigate the maze of options available at each step of the journey. Through the course of the book, authors Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana will guide you through the process of building real-world NLP solutions embedded in larger product setups. You’ll learn how to adapt your solutions for different industry verticals such as healthcare, social media, and retail. With this book, you’ll: understand the wide spectrum of problem statements, tasks, and solution approaches within NLP; implement and evaluate different NLP applications using machine learning and deep learning methods; fine-tune your NLP solution based on your business problem and industry vertical; evaluate various algorithms and approaches for NLP product tasks, datasets, and stages; produce software solutions following best practices around release, deployment, and DevOps for NLP systems; and understand best practices, opportunities, and the roadmap for NLP from a business and product leader’s perspective.




Natural Language Processing with Transformers, Revised Edition


Book Description

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book, now revised in full color, shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library. Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf, among the creators of Hugging Face Transformers, use a hands-on approach to teach you how transformers work and how to integrate them in your applications. You'll quickly learn a variety of tasks they can help you solve. With this book, you'll: build, debug, and optimize transformer models for core NLP tasks, such as text classification, named entity recognition, and question answering; learn how transformers can be used for cross-lingual transfer learning; apply transformers in real-world scenarios where labeled data is scarce; make transformer models efficient for deployment using techniques such as distillation, pruning, and quantization; and train transformers from scratch and learn how to scale to multiple GPUs and distributed environments.




Mathematical Foundations of Speech and Language Processing


Book Description

Speech and language technologies continue to grow in importance as they are used to create natural and efficient interfaces between people and machines, and to automatically transcribe, extract, analyze, and route information from high-volume streams of spoken and written information. The workshops on Mathematical Foundations of Speech Processing and Natural Language Modeling were held in the fall of 2000 at the University of Minnesota's NSF-sponsored Institute for Mathematics and Its Applications, as part of a year-long "Mathematics in Multimedia" program. Each workshop brought together researchers in the respective technologies on the one hand, and mathematicians and statisticians on the other, for an intensive week of cross-fertilization. There is a long history of benefit from introducing mathematical techniques and ideas to speech and language technologies. Examples include the source-channel paradigm, hidden Markov models, decision trees, exponential models, and formal language theory. It is likely that new mathematical techniques, or novel applications of existing techniques, will once again prove pivotal for moving the field forward. This volume consists of original contributions presented by participants during the two workshops. Topics include language modeling, prosody, acoustic-phonetic modeling, and statistical methodology.




Linguistics for the Age of AI


Book Description

A human-inspired, linguistically sophisticated model of language understanding for intelligent agent systems. One of the original goals of artificial intelligence research was to endow intelligent agents with human-level natural language capabilities. Recent AI research, however, has focused on applying statistical and machine learning approaches to big data rather than attempting to model what people do and how they do it. In this book, Marjorie McShane and Sergei Nirenburg return to the original goal of recreating human-level intelligence in a machine. They present a human-inspired, linguistically sophisticated model of language understanding for intelligent agent systems that emphasizes meaning: the deep, context-sensitive meaning that a person derives from spoken or written language.




Deep Natural Language Processing and AI Applications for Industry 5.0


Book Description

To stay at the top of the market and give consumers the greatest possible comfort, industries are using a range of strategies and technologies. Natural language processing (NLP) is a technology widely penetrating the market, irrespective of industry or domain. It is extensively applied in businesses today, and it is the buzzword in every engineer’s life. NLP can be implemented in all areas where artificial intelligence is applicable, either by simplifying the communication process or by refining and analyzing information. Neural machine translation has come closer to imitating professional translations over the years. When applied in neural machine translation, NLP helps train the neural networks. Industries can use this to translate low-impact content including emails, regulatory texts, etc. Such machine translation tools speed up communication with partners while enriching other business interactions. Deep Natural Language Processing and AI Applications for Industry 5.0 provides innovative research on the latest findings, ideas, and applications in fields of interest that fall under the scope of NLP, including computational linguistics, deep NLP, web analysis, sentiment analysis for business, and industry perspectives. This book covers a wide range of topics such as deep learning, deepfakes, text mining, blockchain technology, and more, making it a crucial text for anyone interested in NLP and artificial intelligence, including academicians, researchers, professionals, industry experts, business analysts, data scientists, data analysts, healthcare system designers, intelligent system designers, practitioners, and students.




Natural Language Processing with Python


Book Description

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication. Packed with examples and exercises, Natural Language Processing with Python will help you: extract information from unstructured text, either to guess the topic or identify "named entities"; analyze linguistic structure in text, including parsing and semantic analysis; access popular linguistic databases, including WordNet and treebanks; and integrate techniques drawn from fields as diverse as linguistics and artificial intelligence. This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages, or if you're simply curious to have a programmer's perspective on how human language works, you'll find Natural Language Processing with Python both fascinating and immensely useful.