Speech-to-Speech Translation


Book Description

This book provides readers with retrospective and prospective views of speech-to-speech translation, with detailed explanations of its component technologies: speech recognition, language translation, and speech synthesis. A speech-to-speech translation (S2S) system makes it possible to break language barriers, i.e., to let any pair of people on the globe communicate with each other, which is one of the great dreams of humankind. People, societies, and economies connected by S2S will demonstrate explosive growth without exception. In 1986, Japan initiated basic research on S2S; the idea then spread worldwide and was explored in depth by researchers over three decades. Now we see S2S applications on smartphones and tablets around the world. Computational resources such as processors, memory, and wireless communication accelerate these computation-intensive systems, and the accumulation of digital speech and language data encourages recent approaches based on machine learning. Through field experiments following long research in laboratories, S2S systems have become well developed and are now ready to be utilized in daily life. A unique chapter of this book covers end-to-end evaluation, comparing the system's performance with human competence. The effectiveness of the system can be understood from the score of this evaluation. The book ends with one of the next focuses of S2S: technology for simultaneous interpretation of lectures, broadcast news, and so on.




Verbmobil: Foundations of Speech-to-Speech Translation


Book Description

Verbmobil is the result of eight years of intensive research in a large speech-to-speech translation project, executed by a consortium comprising nineteen academic and four industrial partners. The system that was developed by more than 100 researchers and engineers handles dialogs in three business-oriented domains, with translation between three languages: German, English, and Japanese. Verbmobil deals with spontaneous speech, which includes realistic repair phenomena, and uses deep semantic analysis to recognize a speaker's slips and to translate what he tried to say rather than what he actually said. This book gives the first comprehensive overview of the results of this unique and seminal project in human language technology. Contributions by leading scientists in speech and language technology look at the component technologies that make Verbmobil the most advanced speech-to-speech translation system worldwide and a landmark project in the history of natural language processing.




Automatic Speech Translation


Book Description

Automatic Speech Translation introduces recent results of Japanese research and development in speech translation and speech recognition. Topics covered include: fundamental concepts of speech recognition; speech pattern representation; phoneme-based HMM phoneme recognition; continuous speech recognition; speaker adaptation; speaker-independent speech recognition; utterance analysis, utterance transfer, utterance generation; contextual processing; speech synthesis; and an experimental system of speech translation. This book presents the complicated technological aspects of machine translation and speech recognition, and outlines the future directions of this rapidly developing area of technology.




Speech-to-Speech Translation


Book Description

Speech-to-Speech Translation: a Massively Parallel Memory-Based Approach describes one of the world's first successful speech-to-speech machine translation systems. This system accepts speaker-independent continuous speech, and produces translations as audio output. Subsequent versions of this machine translation system have been implemented on several massively parallel computers, and these systems have attained translation performance in the milliseconds range. The success of this project triggered several massively parallel projects, as well as other massively parallel artificial intelligence projects throughout the world. Dr. Hiroaki Kitano received the distinguished "Computers and Thought Award" from the International Joint Conferences on Artificial Intelligence in 1993 for his work in this area, and that work is reported in this book.




Multilingual Speech Processing


Book Description

Tanja Schultz and Katrin Kirchhoff have compiled a comprehensive overview of speech processing from a multilingual perspective. By taking this all-inclusive approach to speech processing, the editors have included theories, algorithms, and techniques that are required to support spoken input and output in a large variety of languages. Multilingual Speech Processing presents a comprehensive introduction to research problems and solutions, from both a theoretical and a practical perspective, and highlights technology that incorporates the increasing necessity for multilingual applications in our global community. Current challenges of speech processing and the feasibility of sharing data and system components across different languages guide contributors in their discussions of trends, prognoses and open research issues. This includes automatic speech recognition and speech synthesis, but also speech-to-speech translation, dialog systems, automatic language identification, and handling non-native speech. The book is complemented by an overview of multilingual resources, important research trends, and actual speech processing systems that are being deployed in multilingual human-human and human-machine interfaces. Researchers and developers in industry and academia with different backgrounds but a common interest in multilingual speech processing will find an excellent overview of research problems and solutions detailed from theoretical and practical perspectives.
- State-of-the-art research with a global perspective by authors from the USA, Asia, Europe, and South Africa
- The only comprehensive introduction to multilingual speech processing currently available
- Detailed presentation of technological advances integral to security, financial, cellular and commercial applications




Verbmobil: Foundations of Speech-to-Speech Translation


Book Description

In 1992 it seemed very difficult to answer the question whether it would be possible to develop a portable system for the automatic recognition and translation of spontaneous speech. Previous research work on speech processing had focused on read speech only and international projects aimed at automated text translation had just been terminated without achieving their objectives. Within this context, the German Federal Ministry of Education and Research (BMBF) made a careful analysis of all national and international research projects conducted in the field of speech and language technology before deciding to launch an eight-year basic-research lead project in which research groups were to cooperate in an interdisciplinary and international effort covering the disciplines of computer science, computational linguistics, translation science, signal processing, communication science and artificial intelligence. At some point, the project comprised up to 135 work packages with up to 33 research groups working on these packages. The project was controlled by means of a network plan. Every two years the project situation was assessed and the project goals were updated. An international scientific advisory board provided advice for BMBF. A new scientific approach was chosen for this project: coping with the complexity of spontaneous speech with all its pertinent phenomena such as ambiguities, self-corrections, hesitations and disfluencies took precedence over the intended lexicon size. Another important aspect was that prosodic information was exploited at all processing stages.




Incremental Speech Translation


Book Description

Human language capabilities are based on mental procedures that are closely linked to the time domain. Listening, understanding, and reacting, on the one hand, as well as planning, formulating, and speaking, on the other, are performed in a highly overlapping manner, thus allowing inter-human communication to proceed in a smooth and fluent way. Although it happens to be the natural mode of human language interaction, incremental processing is still far from becoming a common feature of today's language technology. Instead, it will certainly remain one of the big challenges for research activities in the years to come. Usually considered difficult to a degree that renders it almost intractable for practical purposes, incremental language processing has recently been attracting a steadily growing interest in the spoken language processing community. Its notorious difficulty can be attributed mainly to two reasons: Due to the inaccessibility of the right context, global optimization criteria are no longer available. This loss must be compensated for by communicating larger search spaces between system components or by introducing appropriate repair mechanisms. In any case, the complexity of the task can easily grow by an order of magnitude or even more. Incrementality is an almost useless feature as long as it remains a local property of individual system components. The advantages of incremental processing can be effective only if all the components of a producer-consumer chain consistently adhere to the same pattern of temporal behavior.




Distant Speech Recognition


Book Description

A complete overview of distant automatic speech recognition.

The performance of conventional Automatic Speech Recognition (ASR) systems degrades dramatically as soon as the microphone is moved away from the mouth of the speaker. This is due to a broad variety of effects such as background noise, overlapping speech from other speakers, and reverberation. While traditional ASR systems underperform for speech captured with far-field sensors, there are a number of novel techniques within the recognition system, as well as techniques developed in other areas of signal processing, that can mitigate the deleterious effects of noise and reverberation and can separate speech from overlapping speakers. Distant Speech Recognition presents a contemporary and comprehensive description of both theoretic abstraction and practical issues inherent in the distant ASR problem.

Key Features:
- Covers the entire topic of distant ASR and offers practical solutions to overcome the problems related to it
- Provides documentation and sample scripts to enable readers to construct state-of-the-art distant speech recognition systems
- Gives relevant background information in acoustics and filter techniques
- Explains the extraction and enhancement of classification relevant speech features
- Describes maximum likelihood as well as discriminative parameter estimation, and maximum likelihood normalization techniques
- Discusses the use of multi-microphone configurations for speaker tracking and channel combination
- Presents several applications of the methods and technologies described in this book
- Accompanying website with open source software and tools to construct state-of-the-art distant speech recognition systems

This reference will be an invaluable resource for researchers, developers, engineers and other professionals, as well as advanced students in speech technology, signal processing, acoustics, statistics and artificial intelligence fields.




Proceedings of the International Conference of Mechatronics and Cyber-MixMechatronics – 2019


Book Description

These proceedings gather contributions presented at the 3rd International Conference of Mechatronics and Cyber-MixMechatronics/ICOMECYME, organized by the National Institute of R&D in Mechatronics and Measurement Technique in Bucharest, Romania, on September 5th–6th, 2019. Reflecting the expansion of mechatronics, they discuss topics in the newer trans-disciplinary fields, such as adaptronics, integronics, and cyber-mixmechatronics. With a rich scientific tradition and attracting specialists from around the globe – including North America, South America, and Asia – ICOMECYME focuses on presenting the latest research. It is mainly directed at academics and advanced students, but also appeals to R&D experts, offering a platform for scientific exchange. These proceedings are a valuable resource for entrepreneurs who want to invest in research and who are open to collaboration.




Handbook of Intelligent Computing and Optimization for Sustainable Development


Book Description

This book provides a comprehensive overview of the latest breakthroughs and recent progress in sustainable intelligent computing technologies, applications, and optimization techniques across various industries. Optimization has received enormous attention along with the rapidly increasing use of communication technology and the development of user-friendly software and artificial intelligence. In almost all human activities, there is a desire to deliver the highest possible results with the least amount of effort. Moreover, optimization is a very well-known area with a vast number of applications, from route finding problems to medical treatment, construction, finance, accounting, engineering, and maintenance schedules in plants. As far as optimization of real-world problems is concerned, understanding the nature of the problem and grouping it in a proper class may help the designer employ proper techniques which can solve the problem efficiently. Many intelligent optimization techniques can find optimal solutions without the use of an objective function and are less prone to local conditions. The 41 chapters comprising the Handbook of Intelligent Computing and Optimization for Sustainable Development, written by subject specialists, represent diverse disciplines such as mathematics and computer science, electrical and electronics engineering, neuroscience and cognitive sciences, medicine, and social sciences, and provide the reader with an integrated understanding of the importance that intelligent computing has in the sustainable development of current societies. The handbook discusses the emerging research exploring the theoretical and practical aspects of successfully implementing new and innovative intelligent techniques in a variety of sectors, including IoT, manufacturing, optimization, and healthcare.
Audience: It is a pivotal reference source for IT specialists, industry professionals, managers, executives, researchers, scientists, and engineers seeking current research in emerging perspectives in the field of artificial intelligence in the areas of Internet of Things, renewable energy, optimization, and smart cities.