Developing Linguistic Corpora


Book Description

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.




Best Practices for Spoken Corpora in Linguistic Research


Book Description

A key concern of researchers involved in the creation and sharing of language resources is to attain maximum usability, reliability and longevity of these resources for present and future researchers in the language sciences. The view developed in this volume is that spoken corpora construction and sharing are major research endeavours that should also be laid open to academic debate in a manner that is more visible than is currently the case in corpus linguistics. The present volume brings together multiple research perspectives on the question of what constitutes best practices for the construction of spoken corpora. The book brings into closer contact scholars whose specializations have often remained in relatively different streams of scientific investigation; that is, scholars whose work falls primarily in conversation analysis, pragmatics and discourse analysis, but who are involved in spoken corpus compilation, on the one hand, and scholars who also specialize in linguistics but who have been intensively involved in developing various infrastructures for spoken corpora, on the other hand. This combination of scholars brings into sharper relief the concerns of data providers, data curators and data users in linguistic research. This book is thus unique in that it highlights best practices both from the perspective of assembling, annotating and linguistically analysing spoken corpora and from the perspective of processing, archiving and disseminating spoken language. In doing so, the contributions emphasise not only the considerable promise offered by the rapid technological changes that society continues to experience in this area, but also possible dangers for the unwary.




Working with Portuguese Corpora


Book Description

Although Portuguese is one of the main world languages and researchers have been working on Portuguese electronic text collections for decades (e.g. Kelly, 1970; Biderman, 1978; Bacelar do Nascimento et al., 1984; see Berber Sardinha, 2005), this is the first volume in English that encapsulates the exciting and cutting-edge corpus linguistic work being done with Portuguese language corpora on different continents. The book includes chapters by leading corpus linguists dealing with Portuguese corpora across the world, and their contributions explore various methods and how they are applicable to a wide range of language issues. The book is divided into six sections, each covering a key issue in Corpus Linguistics: lexis and grammar, lexicography, language teaching and terminology, translation, corpus building and sharing, and parsing and annotation. Together these sections present the reader with a broad picture of the field.




Ten Lectures on Corpora and Cognitive Linguistics


Book Description

The volume consists of ten studies that involve the use of corpus data relevant to research within a Cognitive Linguistics framework.




Using Corpora in the Language Classroom


Book Description

Explains and illustrates how teachers can use corpora to create classroom materials and activities to address specific class needs. Using Corpora in the Language Classroom shows teachers how to use corpora and corpus tools to expand student learning. Together with its companion website, this teacher-friendly book demystifies corpus linguistics with clear explanations, instructions and examples. It provides the essential knowledge, tools, and skills teachers need to enable students to discover how language is really used. Clear and concise, this volume provides:
- An overview of corpus linguistics
- Clear explanations of terminology
- Tasks and activities that invite readers to interact with the material
- Principled instructions for creating classroom materials and activities, including how to create corpora to address specific class needs.




Ten Lectures on Corpus Linguistics with R


Book Description

In this book, Stefan Th. Gries provides an overview of how quantitative corpus methods can provide insights into cognitive/usage-based linguistics and selected psycholinguistic questions. Topics include corpus linguistics in general, its most important methodological tools, its statistical nature, and the relation of all these topics to past and current usage-based theorizing. Central notions discussed in detail include frequency, dispersion, context, and others in a variety of applications and case studies; four practice sessions offer short introductions to computing various corpus statistics with the open source programming language and environment R.




Using Corpora in Discourse Analysis


Book Description

How can you carry out discourse analysis using corpus linguistics? What research questions should you ask? Which methods should you use and when? What is a collocational network or a key cluster? Introducing the major techniques, methods and tools for corpus-assisted analysis of discourse, this book answers these questions and more, showing readers how to best use corpora in their analyses of discourse. Using carefully tailored case studies, each chapter is devoted to a central technique, including frequency, concordancing and keywords, going step by step through the process of applying different analytical procedures. Introducing a wide range of different corpora, from holiday brochures to political debates, the book considers the key debates and latest advances in the field. Fully revised and updated, this new edition includes:
- A new chapter on how to conduct research projects in corpus-based discourse analysis
- Completely rewritten chapters on collocation and advanced techniques, using a corpus of jihadist propaganda texts and covering topics such as social media and visual analysis
- Coverage of major tools, including CQPweb, AntConc, Sketch Engine and #LancsBox
- Discussion of newer techniques including the derivation of lockwords and the comparison of multiple data sets for diachronic analysis
With exercises, discussion questions and suggested further readings in each chapter, this book is an excellent guide to using corpus linguistics techniques to carry out discourse analysis.




How to Use Corpora in Language Teaching


Book Description

After decades of being overlooked, corpus evidence is becoming an important component of the teaching and learning of languages. Above all, the profession needs guidance in the practicalities of using corpora, interpreting the results and applying them to the problems and opportunities of the classroom. This book is intensely practical, written mainly by a new generation of language teachers who are acknowledged experts in central aspects of the discipline. It offers advice on what to do in the classroom, how to cope with teachers' queries about language, what corpora to use, including learner corpora and spoken corpora, and how to handle the variability of language; it reports on some current research and explains how the access software is constructed, including an opportunity for the practitioner to write small but useful programs; and it takes a look into the future of corpora in language teaching.




Corpora for Language Learning


Book Description

This volume presents a diverse range of expertise and practical advice on corpus-assisted language learning, bridging the gap between corpus research and actual classroom practice. Grounded in expert discussions and interviews, the book offers an extensive exploration into the intricacies of corpus-based language pedagogy, addressing its challenges, benefits, and potential drawbacks while demonstrating the power of data-driven learning (DDL) tools, including AntConc, WordSmith Tools, and CorpusMate. The book navigates the complexities of integrating DDL into mainstream educational systems, showcasing real-world applications for teaching. The authors bring together cutting-edge, international perspectives on this topic in dialogue with those using such techniques in their classroom practice. Both a rigorous academic resource and a hands-on guide for practitioners, this book is recommended reading for educators, researchers, or anyone wanting to upskill themselves in learning to harness the power of data in language pedagogy in primary, secondary, tertiary, or other professional contexts.




Teaching English with Corpora


Book Description

Teaching English with Corpora is an accessible and practical introduction to the ways in which online and offline corpora can be used in English language teaching (ELT). Featuring 70 chapters written by an international range of researchers and practitioners, this book:
• provides readers with clear, tested examples of corpus-based/driven lesson plans;
• contains activities relevant to English for general purposes and English for specific purposes;
• caters for the needs of English language teachers working with learners at different proficiency levels;
• features flexible teaching suggestions that can be explored as part of a lesson or as a full lesson.
This book is an essential purchase for pre- and in-service English language teachers as well as those studying corpus linguistics in undergraduate/Master’s courses in applied linguistics, ELT and Teaching English to Speakers of Other Languages (TESOL).