Auditory Scene Analysis


Book Description

Auditory Scene Analysis addresses the problem of hearing complex auditory environments, using a series of creative analogies to describe the process required of the human auditory system as it analyzes mixtures of sounds to recover descriptions of individual sounds. In a unified and comprehensive way, Bregman establishes a theoretical framework that integrates his findings with an unusually wide range of previous research in psychoacoustics, speech perception, music theory and composition, and computer modeling.




Computational Auditory Scene Analysis


Book Description

Provides a comprehensive and coherent account of the state of the art in CASA, in terms of the underlying principles, the algorithms and system architectures that are employed, and the potential applications of this exciting new technology.




Auditory Neuroscience


Book Description

An integrated overview of hearing and the interplay of physical, biological, and psychological processes underlying it. Every time we listen—to speech, to music, to footsteps approaching or retreating—our auditory perception is the result of a long chain of diverse and intricate processes that unfold within the source of the sound itself, in the air, in our ears, and, most of all, in our brains. Hearing is an "everyday miracle" that, despite its staggering complexity, seems effortless. This book offers an integrated account of hearing in terms of the neural processes that take place in different parts of the auditory system. Because hearing results from the interplay of so many physical, biological, and psychological processes, the book pulls together the different aspects of hearing—including acoustics, the mathematics of signal processing, the physiology of the ear and central auditory pathways, psychoacoustics, speech, and music—into a coherent whole.




Computational Auditory Scene Analysis


Book Description

The interest of AI in problems related to understanding sounds has a rich history dating back to the ARPA Speech Understanding Project in the 1970s. While a great deal has been learned from this and subsequent speech understanding research, the goal of building systems that can understand general acoustic signals--continuous speech and/or non-speech sounds--from unconstrained environments is still unrealized. Instead, there are now systems that understand "clean" speech well in relatively noiseless laboratory environments, but that break down in more realistic, noisier environments. As seen in the "cocktail-party effect," humans and other mammals have the ability to selectively attend to sound from a particular source, even when it is mixed with other sounds. Computers also need to be able to decide which parts of a mixed acoustic signal are relevant to a particular purpose--which part should be interpreted as speech, and which should be interpreted as a door closing, an air conditioner humming, or another person interrupting. Observations such as these have led a number of researchers to conclude that research on speech understanding and on nonspeech understanding need to be united within a more general framework. Researchers have also begun trying to understand computational auditory frameworks as parts of larger perception systems whose purpose is to give a computer integrated information about the real world. Inspiration for this work ranges from research on how different sensors can be integrated to models of how humans' auditory apparatus works in concert with vision, proprioception, etc. Representing some of the most advanced work on computers understanding speech, this collection of papers covers the work being done to integrate speech and nonspeech understanding in computer systems.




Thinking in Sound


Book Description

The realm of auditory cognition is beginning to affirm itself as a new research orientation. Until now, no volume has existed that covers in a didactic fashion the whole range of subjects in this domain. To rectify this situation, a special tutorial workshop organized by the French Acoustical Society was held at IRCAM, the music research institute founded by Pierre Boulez. Specialists in perceptual organization, memory, attention, music psychology, neuropsychology, and developmental psychology were invited from Europe and North America. The chapters of this book present the materials from their lectures. The book will be useful to advanced students in the cognitive sciences and scientists specializing in many fields as well as in auditory psychology.




Speech Separation by Humans and Machines


Book Description

This book is appropriate for those specializing in speech science, hearing science, neuroscience, or computer science, and for engineers working on applications such as automatic speech recognition, cochlear implants, hands-free telephones, sound recording, and multimedia indexing and retrieval.




Listening


Book Description

Listening combines broad coverage of acoustics, speech and music perception psychophysics, and auditory physiology with a coherent theoretical orientation in a lively and accessible introduction to the perception of music and speech events. Handel treats the production and perception of music and speech in parallel throughout the text, arguing that their production and perception follow identical principles; music and speech share the same formal properties, involve the same cognitive mechanisms, and cannot exist in separate "modules." The way that a sound is produced determines the physical properties of the acoustic wave. These properties in turn lead to the perception of the event. The initial chapters take up physical processes, including a section on characterization of sound and discussion of the way instruments and speech produce musical sound. Handel explains how the environment affects perceived sounds, including reflection, reverberation, diffraction, and the Doppler effect. Subsequent chapters take up psychological processes: partitioning smeared sounds into discrete events, identifying sound sources, the units and phrases of speech and music, and speech and music rhythms. The final chapter provides a detailed treatment of the physiology and neurophysiology of the auditory system. All of the author's explanations are coherent and clear, and this strategy includes discussing particular pieces of research in detail rather than covering many things superficially. Handel analyzes causes as well as describing phenomena and sets out for the reader the difficulties inherent in the research methods he discusses. He defines the physical, musical, and psychological terms used, even the most basic ones, and covers all of the experimental methods and statistical procedures in the text. A Bradford Book.




Voice Leading


Book Description

An accessible scientific explanation for the traditional rules of voice leading, including an account of why listeners find some musical textures more pleasing than others. Voice leading is the musical art of combining sounds over time. In this book, David Huron offers an accessible account of the cognitive and perceptual foundations for this practice. Drawing on decades of scientific research, including his own award-winning work, Huron offers explanations for many practices and phenomena, including the perceptual dominance of the highest voice, chordal-tone doubling, direct octaves, embellishing tones, and the musical feeling of sounds “leading” somewhere. Huron shows how traditional rules of voice leading align almost perfectly with modern scientific accounts of auditory perception. He also reviews pertinent research establishing the role of learning and enculturation in auditory and musical perception. Voice leading has long been taught with reference to Baroque chorale-style part-writing, yet there exist many more musical styles and practices. The traditional emphasis on Baroque part-writing understandably leaves many musicians wondering why they are taught such an archaic and narrow practice in an age of stylistic diversity. Huron explains how and why Baroque voice leading continues to warrant its central pedagogical status. Expanding beyond choral-style writing, Huron shows how established perceptual principles can be used to compose, analyze, and critically understand any kind of acoustical texture from tune-and-accompaniment songs and symphonic orchestration to jazz combo arranging and abstract electroacoustic music. Finally, he offers a psychological explanation for why certain kinds of musical textures are more likely to be experienced by listeners as pleasing.




Machine Audition: Principles, Algorithms and Systems


Book Description

Machine audition is the study of algorithms and systems for the automatic analysis and understanding of sound by machine. It has recently attracted increasing interest within several research communities, such as signal processing, machine learning, auditory modeling, perception and cognition, psychology, pattern recognition, and artificial intelligence. However, the developments made so far are fragmented within these disciplines, lacking connections and incurring potentially overlapping research activities in this subject area. Machine Audition: Principles, Algorithms and Systems contains advances in algorithmic developments, theoretical frameworks, and experimental research findings. This book is useful for professionals who want to better understand how to design algorithms for performing automatic analysis of audio signals, construct a computing system for understanding sound, and build advanced human-computer interactive systems.




Magnetoencephalography


Book Description

Magnetoencephalography (MEG) is an invaluable functional brain imaging technique that provides direct, real-time monitoring of neuronal activity necessary for gaining insight into dynamic cortical networks. Our intentions with this book are to cover the richness and transdisciplinary nature of the MEG field, to make it more accessible to newcomers and experienced researchers, and to stimulate growth in the MEG area. The book presents a comprehensive overview of MEG basics and the latest developments in methodological, empirical and clinical research, directed toward master's and doctoral students, as well as researchers. There are three levels of contributions: 1) tutorials on instrumentation, measurements, modeling, and experimental design; 2) topical reviews providing extensive coverage of relevant research topics; and 3) short contributions on open, challenging issues, future developments and novel applications. The topics range from neuromagnetic measurements, signal processing and source localization techniques to dynamic functional networks underlying perception and cognition in both health and disease. Topical reviews cover, among others: developments in SQUID-based and novel sensors, multi-modal integration (low field MRI and MEG; EEG and fMRI), Bayesian approaches to multi-modal integration, direct neuronal imaging, novel noise reduction methods, source-space functional analysis, decoding of brain states, dynamic brain connectivity, sensory-motor integration, MEG studies on perception and cognition, thalamocortical oscillations, fetal and neonatal MEG, pediatric MEG studies, cognitive development, clinical applications of MEG in epilepsy, pre-surgical mapping, stroke, schizophrenia, stuttering, traumatic brain injury, post-traumatic stress disorder, depression, autism, aging and neurodegeneration, MEG applications in cognitive neuropharmacology and an overview of the major open-source analysis tools.