Robustness in Automatic Speech Recognition


Book Description

Foreword Looking back the past 30 years. we have seen steady progress made in the area of speech science and technology. I still remember the excitement in the late seventies when Texas Instruments came up with a toy named "Speak-and-Spell" which was based on a VLSI chip containing the state-of-the-art linear prediction synthesizer. This caused a speech technology fever among the electronics industry. Particularly. applications of automatic speech recognition were rigorously attempt ed by many companies. some of which were start-ups founded just for this purpose. Unfortunately. it did not take long before they realized that automatic speech rec ognition technology was not mature enough to satisfy the need of customers. The fever gradually faded away. In the meantime. constant efforts have been made by many researchers and engi neers to improve the automatic speech recognition technology. Hardware capabilities have advanced impressively since that time. In the past few years. we have been witnessing and experiencing the advent of the "Information Revolution." What might be called the second surge of interest to com mercialize speech technology as a natural interface for man-machine communication began in much better shape than the first one. With computers much more powerful and faster. many applications look realistic this time. However. there are still tremendous practical issues to be overcome in order for speech to be truly the most natural interface between humans and machines.




Automatic Speech Analysis and Recognition


Book Description

This book is the result of the second NATO Advanced Study Institute on speech processing held at the Chateau de Bonas, France, from June 29th to July 10th, 1981. This Institute provided a high-level coverage of the fields of speech transmission, recognition and understanding, which constitute important areas where research activity has re cently been associated with actual industrial developments. This book will therefore include both fundamental and applied topics. Ten survey papers by some of the best specialists in the field are included. They give an up-to-date presentation of several important problems in automatic speech processing. As a consequence the book can be considered as a reference manual on some important areas of automatic speech processing. The surveys are indicated by 'a * in the table of contents. This book also contains research papers corresponding to original works, which were presented during the panel sessions of the Institute. For the sake of clarity the book has been divided into five sections : 1. Speech Analysis and Transmission: An emphasis has been laid on the techniques of linear prediction (LPC), and the problems involved in the transmission of speech at various bit rates are addressed in details. 2. Acoustics and Phonetics : One'of the major bottleneck in the development of speech recogni tion systems remains the transcription of the continuous speech wave into some discrete strings or lattices of phonetic symbols. Two survey papers discuss this problem from different points of view and several practical systems are also described.




Automatic Speech Recognition


Book Description

This book provides a comprehensive overview of the recent advancement in the field of automatic speech recognition with a focus on deep learning models including deep neural networks and many of their variants. This is the first automatic speech recognition book dedicated to the deep learning approach. In addition to the rigorous mathematical treatment of the subject, the book also presents insights and theoretical foundation of a series of highly successful deep learning models.




Automatic Speech Recognition on Mobile Devices and over Communication Networks


Book Description

The advances in computing and networking have sparked an enormous interest in deploying automatic speech recognition on mobile devices and over communication networks. This book brings together academic researchers and industrial practitioners to address the issues in this emerging realm and presents the reader with a comprehensive introduction to the subject of speech recognition in devices and networks. It covers network, distributed and embedded speech recognition systems.




Automatic Speech and Speaker Recognition


Book Description

Research in the field of automatic speech and speaker recognition has made a number of significant advances in the last two decades, influenced by advances in signal processing, algorithms, architectures, and hardware. These advances include: the adoption of a statistical pattern recognition paradigm; the use of the hidden Markov modeling framework to characterize both the spectral and the temporal variations in the speech signal; the use of a large set of speech utterance examples from a large population of speakers to train the hidden Markov models of some fundamental speech units; the organization of speech and language knowledge sources into a structural finite state network; and the use of dynamic, programming based heuristic search methods to find the best word sequence in the lexical network corresponding to the spoken utterance. Automatic Speech and Speaker Recognition: Advanced Topics groups together in a single volume a number of important topics on speech and speaker recognition, topics which are of fundamental importance, but not yet covered in detail in existing textbooks. Although no explicit partition is given, the book is divided into five parts: Chapters 1-2 are devoted to technology overviews; Chapters 3-12 discuss acoustic modeling of fundamental speech units and lexical modeling of words and pronunciations; Chapters 13-15 address the issues related to flexibility and robustness; Chapter 16-18 concern the theoretical and practical issues of search; Chapters 19-20 give two examples of algorithm and implementational aspects for recognition system realization. Audience: A reference book for speech researchers and graduate students interested in pursuing potential research on the topic. May also be used as a text for advanced courses on the subject.




Robust Automatic Speech Recognition


Book Description

Robust Automatic Speech Recognition: A Bridge to Practical Applications establishes a solid foundation for automatic speech recognition that is robust against acoustic environmental distortion. It provides a thorough overview of classical and modern noise-and reverberation robust techniques that have been developed over the past thirty years, with an emphasis on practical methods that have been proven to be successful and which are likely to be further developed for future applications.The strengths and weaknesses of robustness-enhancing speech recognition techniques are carefully analyzed. The book covers noise-robust techniques designed for acoustic models which are based on both Gaussian mixture models and deep neural networks. In addition, a guide to selecting the best methods for practical applications is provided.The reader will: Gain a unified, deep and systematic understanding of the state-of-the-art technologies for robust speech recognition Learn the links and relationship between alternative technologies for robust speech recognition Be able to use the technology analysis and categorization detailed in the book to guide future technology development Be able to develop new noise-robust methods in the current era of deep learning for acoustic modeling in speech recognition The first book that provides a comprehensive review on noise and reverberation robust speech recognition methods in the era of deep neural networks Connects robust speech recognition techniques to machine learning paradigms with rigorous mathematical treatment Provides elegant and structural ways to categorize and analyze noise-robust speech recognition techniques Written by leading researchers who have been actively working on the subject matter in both industrial and academic organizations for many years







Intelligent Speech Signal Processing


Book Description

Intelligent Speech Signal Processing investigates the utilization of speech analytics across several systems and real-world activities, including sharing data analytics related information, creating collaboration networks between several participants, and implementing video-conferencing in different application areas. It provides a forum for readers to discover the characteristics of intelligent speech signal processing systems across different domains. Chapters focus on the latest applications of speech data analysis and management tools across different recording systems. The book emphasizes the multi-disciplinary nature of the field, presenting different applications and challenges with extensive studies on the design, implementation, development, and management of intelligent systems, neural networks, and related machine learning techniques for speech signal processing. Highlights different data analytics techniques in speech signal processing, including machine learning, and data mining Illustrates different applications and challenges across the design, implementation, and management of intelligent systems and neural networks techniques for speech signal processing Includes coverage of biomodal speech recognition, voice activity detection, spoken language and speech disorder identification, automatic speech to speech summarization, and convolutional neural networks







Automatic Speech and Speaker Recognition


Book Description

This book discusses large margin and kernel methods for speech and speaker recognition Speech and Speaker Recognition: Large Margin and Kernel Methods is a collation of research in the recent advances in large margin and kernel methods, as applied to the field of speech and speaker recognition. It presents theoretical and practical foundations of these methods, from support vector machines to large margin methods for structured learning. It also provides examples of large margin based acoustic modelling for continuous speech recognizers, where the grounds for practical large margin sequence learning are set. Large margin methods for discriminative language modelling and text independent speaker verification are also addressed in this book. Key Features: Provides an up-to-date snapshot of the current state of research in this field Covers important aspects of extending the binary support vector machine to speech and speaker recognition applications Discusses large margin and kernel method algorithms for sequence prediction required for acoustic modeling Reviews past and present work on discriminative training of language models, and describes different large margin algorithms for the application of part-of-speech tagging Surveys recent work on the use of kernel approaches to text-independent speaker verification, and introduces the main concepts and algorithms Surveys recent work on kernel approaches to learning a similarity matrix from data This book will be of interest to researchers, practitioners, engineers, and scientists in speech processing and machine learning fields.