Essential Speech and Language Technology for Dutch


Book Description

The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered but, even more importantly, describes the resources (data and tools) for Dutch that have been created and are now available for both academia and industry worldwide. The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summaries) generation, web mining, information extraction, and text-to-speech, to name the most important ones. The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions of almost all the HLT research groups in Flanders and the Netherlands, and hence offers a view of their recent research activities. Targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch. This includes researchers active in larger networks such as CLARIN, META-NET and FLaReNet, and those participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICLing, LREC, CLIN and DIR (both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters are of interest to human language technology policy makers and even to science policy makers in general.




CLARIN in the Low Countries


Book Description

This book describes the results of activities undertaken to construct the CLARIN research infrastructure in the Low Countries, i.e., in the Netherlands and in Flanders (the Dutch-speaking part of Belgium). CLARIN is a European research infrastructure for humanities and social science researchers who work with natural language data. This book introduces the CLARIN infrastructure, describes various aspects of its technical implementation, and introduces data, applications and software services created in the Low Countries for a wide variety of humanities disciplines. These enable researchers to accelerate their research activities and to base their conclusions on a much larger and richer empirical base than was possible before, thus providing a basis for groundbreaking research in which old questions can be investigated in new ways and new questions can be raised and investigated for the first time. Given CLARIN's focus on language data, linguistics, and syntax in particular, is prominently present. However, other humanities disciplines that work with natural language data, such as history, literary studies, religious studies, media studies, political studies, and philosophy, are represented as well. The book is a must-read for humanities scholars and students who want to understand and use the potential that the Digital Humanities offer, as well as for computer scientists and developers of research infrastructures, in particular for researchers working on the CLARIN infrastructure in other countries.




Essential Speech and Language Technology for Dutch


Book Description

This book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It details concrete results (resources and tools for Dutch) achieved that have now become available for both academia and industry worldwide.




European Language Equality


Book Description

This open access book presents a comprehensive collection of the European Language Equality (ELE) project’s results, its strategic agenda and roadmap with key recommendations to the European Union on how to achieve digital language equality in Europe by 2030. The fabric of the EU linguistic landscape comprises 24 official languages and over 60 regional and minority languages. However, language barriers still hamper communication and the free flow of information. Multilingualism is a key cultural cornerstone of Europe, signifying what it means to be and to feel European. Various studies and resolutions have found a striking imbalance in the support of Europe’s languages through technologies, issuing a call to action. Following an introduction, the book is divided into two parts. The first part describes the state of the art of language technology and language-centric AI and the definition and metrics developed to measure digital language equality. It also presents the status quo in 2022/2023, i.e., the current level of technology support for over 30 European languages. The second part describes plans and recommendations on how to bring about digital language equality in Europe by 2030. It includes chapters on the setup and results of the community consultation process, four technical deep dives, an overview of existing strategic documents and an abridged version of the strategic agenda and roadmap. The recommendations have been prepared jointly with the European community in the fields of language technology, natural language processing, and language-centric AI, as well as with representatives of relevant initiatives and associations, language communities and regional and minority language groups. Ensuring appropriate technology support for all European languages will not only create jobs, growth and opportunities in the digital single market. Overcoming language barriers in the digital environment is also essential for an inclusive society and for providing unity in diversity for many years to come.




Crossroads Semantics


Book Description

As language is a multifaceted phenomenon, the study of language, as long as it is geared toward providing a comprehensive picture of it, cannot be restricted to one component or one approach. This applies to the many different components of language as well, including semantics. If we want to fully understand the phenomenon of language meaning, we must not limit our research to lexical semantics, syntax-induced meaning or pragmatics. To construct a consistent account of meaning, we need to extract relevant information from research done in different frameworks and from different theoretical standpoints. This volume brings together a number of computational, psycholinguistic and theoretical studies, which highlight and illustrate how research done in one subfield of linguistics can be relevant to others. The articles highlight the different ways in which one can work with different aspects of language meaning.




Emerging Technologies for Developing Countries


Book Description

This book constitutes the refereed proceedings of the Second International EAI Conference on Emerging Technologies for Developing Countries, AFRICATEK 2018, held in Cotonou, Benin, in May 2018. The 12 revised full papers and 4 short papers were selected from 27 submissions. The papers are organized in thematic tracks: ITS and security, applications and IT services, and gaming and user experience.




Text, Speech, and Dialogue


Book Description

This book constitutes the refereed proceedings of the 19th International Conference on Text, Speech, and Dialogue, TSD 2016, held in Brno, Czech Republic, in September 2016. The 62 papers presented together with 3 abstracts of invited talks were carefully reviewed and selected from 127 submissions. They focus on topics such as corpora and language resources; speech recognition; tagging, classification and parsing of text and speech; speech and spoken language generation; semantic processing of text and speech; integrating applications of text and speech processing; automatic dialogue systems; as well as multimodal techniques and modelling.




CLARIN


Book Description

CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future. The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU). Watch our talk with the editors Darja Fišer and Andreas Witt here: https://youtu.be/ZOoiGbmMbxI




Manual of Clinical Phonetics


Book Description

This comprehensive collection equips readers with a state-of-the-art description of clinical phonetics and a practical guide on how to employ phonetic techniques in disordered speech analysis. Divided into four sections, the manual covers the foundations of phonetics, sociophonetic variation and its clinical application, clinical phonetic transcription, and instrumental approaches to the description of disordered speech. The book offers in-depth analysis of the instrumentation used in articulatory, auditory, perceptual, and acoustic phonetics and provides clear instruction on how to use the equipment for each technique as well as a critical discussion of how these techniques have been used in studies of speech disorders. With fascinating topics such as multilingual sources of phonetic variation, principles of phonetic transcription, speech recognition and synthesis, and statistical analysis of phonetic data, this is the essential companion for students and professionals of phonetics, phonology, language acquisition, clinical linguistics, and communication sciences and disorders.