Integral and Diagnostic Intrusive Prediction of Speech Quality


Book Description

This work deals with instrumental methods for measuring the perceived quality of transmitted speech. These measures simulate the speech perception process employed by human subjects during auditory experiments. The measure standardized by the International Telecommunication Union (ITU), called “Wideband Perceptual Evaluation of Speech Quality (WB-PESQ)”, is not able to quantify all perceived characteristics on a unidimensional quality scale, the Mean Opinion Score (MOS) scale. Recent experimental studies showed that subjects use several perceptual dimensions to judge the quality of speech signals. In order to represent the signal at a higher stage of perception, a new model, called “Diagnostic Instrumental Assessment of Listening quality (DIAL)”, has been developed. It includes a perceptual and a cognitive model, which together simulate the whole quality judgment process. Except for strong discontinuities, DIAL predicts the speech quality of different speech processing and transmission systems very well, and it outperforms WB-PESQ.




Deep Learning Based Speech Quality Prediction


Book Description

This book presents how to apply recent machine learning (deep learning) methods to the task of speech quality prediction. The author shows how recent advances in machine learning can be leveraged for this task and provides an in-depth analysis of the suitability of different deep learning architectures. The author then shows how the resulting model outperforms traditional speech quality models and provides additional information about the cause of a quality impairment through the prediction of the speech quality dimensions of noisiness, coloration, discontinuity, and loudness.




Audiovisual Quality Assessment and Prediction for Videotelephony


Book Description

The work presented in this book focuses on modeling audiovisual quality as perceived by the users of IP-based solutions for video communication like videotelephony. It also extends the current framework for the parametric prediction of audiovisual call quality. The book addresses several aspects related to the quality perception of entire video calls, namely, the quality estimation of the single audio and video modalities in an interactive context, the audiovisual quality integration of these modalities and the temporal pooling of short sample-based quality scores to account for the perceptual quality impact of time-varying degradations.




Quality of Synthetic Speech


Book Description

This book reviews research on perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. Moreover, a test protocol for the efficient identification of those dimensions in a listening test is introduced. Furthermore, several factors influencing these dimensions are examined. In addition, different techniques for the instrumental quality assessment of TTS signals are introduced, reviewed and tested. Finally, the requirements for the integration of an instrumental quality measure into a concatenative TTS system are examined.




Dimension-Based Quality Analysis and Prediction for Videotelephony


Book Description

This book provides an in-depth investigation of the quality-relevant perceptual video space in the domain of videotelephony. The author presents an extensive investigation and quality modeling of the underlying video quality dimensions and the overall quality, and describes a method for the subjective evaluation as well as the instrumental estimation of video quality in videotelephony. The book presents a new subjective test method in the field of video quality assessment, standardized as ITU-T Rec. P.918. Further, it explains the experimental examination of the underlying video quality dimensions and both the subjective and the instrumental quality estimation.




Human Information Processing in Speech Quality Assessment


Book Description

This book provides a new multi-method, process-oriented approach towards speech quality assessment, which allows readers to examine the influence of speech transmission quality on a variety of perceptual and cognitive processes in human listeners. Fundamental concepts and methodologies surrounding the topic of process-oriented quality assessment are introduced and discussed. The book further describes a functional process model of human quality perception, which theoretically integrates results obtained in three experimental studies. This book’s conceptual ideas, empirical findings, and theoretical interpretations should be of particular interest to researchers working in the fields of Quality and Usability Engineering, Audio Engineering, Psychoacoustics, Audiology, and Psychophysiology.




Quality of Experience


Book Description

This pioneering book develops definitions and concepts related to Quality of Experience in the context of multimedia- and telecommunications-related applications, systems and services, and applies these to various fields of communication and media technologies. The editors bring together numerous key protagonists of the new discipline “Quality of Experience” and combine the state-of-the-art knowledge in a single volume.




Neural Correlates of Quality Perception for Complex Speech Signals


Book Description

This book interconnects two essential disciplines to study the perception of speech: Neuroscience and Quality of Experience, which to date have rarely been used together for the purposes of research on speech quality perception. In five key experiments, the book demonstrates the application of standard clinical methods in neurophysiology on the one hand and of methods used in fields of research concerned with speech quality perception on the other. Using this combination, the book shows that speech stimuli with different lengths and different quality impairments are accompanied by physiological reactions related to quality variations, e.g., a positive peak in an event-related potential. Furthermore, it demonstrates that – in most cases – quality impairment intensity has an impact on the intensity of physiological reactions.




Dimension-based Quality Modeling of Transmitted Speech


Book Description

In this book, speech transmission quality is modeled on the basis of perceptual dimensions. The author identifies those dimensions that are relevant for today's public-switched and packet-based telecommunication systems, considering the complete transmission path from the mouth of the speaker to the ear of the listener. Both narrowband (300-3400 Hz) and wideband (50-7000 Hz) speech transmission are taken into account. A new analytical assessment method is presented that allows the dimensions to be rated directly by non-expert listeners. Due to the efficiency of the test method, a relatively large number of stimuli can be assessed in auditory tests. The test method is applied in two auditory experiments, and the book provides evidence that it yields meaningful and reliable results. The resulting dimension scores, together with the respective overall quality ratings, form the basis for a new parametric model for the quality estimation of transmitted speech based on the perceptual dimensions. In a two-step model approach, instrumental dimension models first estimate dimension impairment factors. In a second step, the resulting dimension estimates are combined by a Euclidean integration function to provide an estimate of the total impairment.
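As a rough illustration of the second step, the following minimal sketch combines per-dimension impairment estimates with a plain, unweighted Euclidean norm. The function name, the dimension labels, and the numeric values are hypothetical and chosen only for illustration; the book's actual integration function may use weights or a different scaling.

```python
import math


def total_impairment(dimension_impairments):
    """Combine per-dimension impairment estimates with an unweighted
    Euclidean norm (an assumed form, not necessarily the book's)."""
    return math.sqrt(sum(value ** 2 for value in dimension_impairments.values()))


# Hypothetical dimension impairment factors, for illustration only.
example = {"dimension_a": 3.0, "dimension_b": 4.0, "dimension_c": 0.0}
print(total_impairment(example))  # 5.0
```

With this form, a single strongly impaired dimension dominates the total, while several small impairments add up less than linearly.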




Quality Engineering


Book Description

The concepts of quality and usability are of outstanding importance in information and communication technology as well as in computer science. The author introduces the topic by first explaining the technical terms and the fundamentals of psychophysics and psychometrics. Building on this, the cycle of human-centered system development is presented. The measurement and prediction of quality and usability are illustrated with examples, including speech and multimodal dialogue systems.