Speech Recognition: Acoustic, Phonetic and Lexical Knowledge


Book Description

During this reporting period we continued to make progress on the acquisition of acoustic-phonetic and lexical knowledge. (1)We completed development of a continuous digit recognition system. The system was constructed to investigate the use of acoustic-phonetic knowledge in a speech recognition system. The significant achievements of this study include the development of a 'soft-failure' procedure for lexical access and the discovery of a set of acoustic-phonetic features for verification. (2) We completed a study of the constraints that lexical stress imposes on word recognition. We found that lexical stress information alone can, on the average, reduce the number of word candidates from a large dictionary by more than 80 percent. In conjunction with this study, we successfully developed a system that automatically determines the stress pattern of a word from the acoustic signal. (3) We performed an acoustic study on the characteristics of nasal consonants and nasalized vowels. We have also developed recognition algorithms for nasal murmurs and nasalized vowels in continuous speech. (4) We finished the preliminary development of a system that aligns a speech waveform with the corresponding phonetic transcription.




Speech Recognition


Book Description

The purpose of this program is to develop a speech data base facility under which the acoustic characteristics of speech sounds in various contexts can be studied conveniently; investigate the phonological properties of a large lexicon of, say 10,000 words and determine to what extent the phonotactic constraints can be utilized in speech recognition; study the acoustic cues that are used to mark work boundaries; develop a test bed in the form of a large-vocabulary, IWR system to study the interactions of acoustic, phonetic and lexical knowledge; and develop a limited continuous speech recognition system with the goal of recognizing any English word from its spelling in order to assess the interactions of higher-level knowledge sources.




The Integration of Phonetic Knowledge in Speech Technology


Book Description

Continued progress in Speech Technology in the face of ever-increasing demands on the performance levels of applications is a challenge to the whole speech and language science community. Robust recognition and understanding of spontaneous speech in varied environments, good comprehensibility and naturalness of expressive speech synthesis are goals that cannot be achieved without a change of paradigm. This book argues for interdisciplinary communication and cooperation in problem-solving in general, and discusses the interaction between speech and language engineering and phonetics in particular. With a number of reports on innovative speech technology research as well as more theoretical discussions, it addresses the practical, scientific and sometimes the philosophical problems that stand in the way of cross-disciplinary collaboration and illuminates some of the many possible ways forward. Audience: Researchers and professionals in speech technology and computational linguists.




Automatic Speech Analysis and Recognition


Book Description

This book is the result of the second NATO Advanced Study Institute on speech processing held at the Chateau de Bonas, France, from June 29th to July 10th, 1981. This Institute provided a high-level coverage of the fields of speech transmission, recognition and understanding, which constitute important areas where research activity has re cently been associated with actual industrial developments. This book will therefore include both fundamental and applied topics. Ten survey papers by some of the best specialists in the field are included. They give an up-to-date presentation of several important problems in automatic speech processing. As a consequence the book can be considered as a reference manual on some important areas of automatic speech processing. The surveys are indicated by 'a * in the table of contents. This book also contains research papers corresponding to original works, which were presented during the panel sessions of the Institute. For the sake of clarity the book has been divided into five sections : 1. Speech Analysis and Transmission: An emphasis has been laid on the techniques of linear prediction (LPC), and the problems involved in the transmission of speech at various bit rates are addressed in details. 2. Acoustics and Phonetics : One'of the major bottleneck in the development of speech recogni tion systems remains the transcription of the continuous speech wave into some discrete strings or lattices of phonetic symbols. Two survey papers discuss this problem from different points of view and several practical systems are also described.




Speech Understanding Systems


Book Description




Speech Recognition


Book Description

Contents--Models of Phonetic Recognition: The Role of Analysis by Synthesis in Phonetic Recognition; The Influence of Phonetic Context on the Acoustic Pro9perties of Stops; The Role of Syllable Structure in the Acoustic Realizations of Stops; A Semivowel Recognition System; Two-Dimensional Characterization of the Speech Signal and Its Potential Applications to Speech Processing; An Acoustic-Phonetic Approach to Speech Recognition: Application to the Semivowels (Thesis); Recognition of Words from their Spellings: Integration of Multiple Knowledge Sour.




Trends in Speech Recognition


Book Description

Thirty speech experts cover computer recognition of spoken words, phrases, & sentences. Introduces the field, future prospects & reasons for voice input to machines. Gives guidelines for advanced work in sentence understanding.




Acoustic-Phonetic Constraints in Continuous Speech Recognition


Book Description

Many types of acoustic-phonetic constraints can be applied in speech recognition. Shipman and Zue proposed an isolated word recognition model in which sequential constraints are applied at a broad phonetic level to hypothesize word candidates. Detailed acoustic constraints are then applied on a subsequent phone representation to determine the best word from the remaining word candidates. This thesis examines how their model can be extended to continuous speech. We used the recognition of continuously spoken digits as a case study. We first conducted a feasibility study in which words and word boundaries were hypothesized from an ideal broad phonetic representation of a digit string. We found that strong sequential constraints exist in continuous digit strings and used these results to extend the Shipman and Zue isolated word recognition model to continuous speech. The continuous speech model consists of three components: broad phonetic classifier, lexical component, and verifier. These components have been implemented for the digit vocabulary for the purpose of exploring how acoustic-phonetic constraints can be applied to natural speech. The broad phonetic classifier produces a string of broad phonetic labels from a set of parameters describing the speech signal. The lexical component uses knowledge about statistical characteristics of the output produced by the broad phonetic classifier to score each of the word hypothesis. Evaluation of this part of the system suggests that it can prune unlikely word candidates effectively. Nine acoustic features were defined to characterize phones for verifying each of the word candidates. Evaluation of the verifier on the digit vocabulary demonstrates the power of a phone-based representation and of using a few well-motivated acoustic features for describing phones in an acoustic-phonetic approach.




Speech Recognition: Acoustic, Phonetic and Lexical


Book Description

As part of her Master's Thesis, Aull constructed a Lexical Stress Determiner for discrete words. In this document the author proposes as her thesis to make Aull's system more robust, both from a programmer's point of view and from a performance and reliability perspective. The second chapter of this thesis describes lexical stress. It explores what lexical stress is and how it might be important to a speech recognition system. The third chapter describes Aull's system for automatic detection of lexical stress in isolated words, exploring the components of her system developed by others. The fourth chapter explains the modifications that have been made Aull's system and how recognition they change system performance. The last chapter contains conclusions and some possible directions for future development.




Time Map Phonology


Book Description

This book is a revised version of my doctoral thesis which was submitted in April 1993. The main extension is a chapter on evaluation of the system de scribed in Chapter 8 as this is clearly an issue which was not treated in the original version. This required the collection of data, the development of a concept for diagnostic evaluation of linguistic word recognition systems and, of course, the actual evaluation of the system itself. The revisions made primarily concern the presentation of the latest version of the SILPA system described in an additional Subsection 8. 3, the development environment for SILPA in Sec tion 8. 4, the diagnostic evaluation of the system as an additional Chapter 9. Some updates are included in the discussion of phonology and computation in Chapter 2 and finite state techniques in computational phonology in Chapter 3. The thesis was designed primarily as a contribution to the area of compu tational phonology. However, it addresses issues which are relevant within the disciplines of general linguistics, computational linguistics and, in particular, speech technology, in providing a detailed declarative, computationally inter preted linguistic model for application in spoken language processing. Time Map Phonology is a novel, constraint-based approach based on a two-stage temporal interpretation of phonological categories as events.