Analyzing Linguistic Data


Book Description

Statistical analysis is a useful skill for linguists and psycholinguists, allowing them to understand the quantitative structure of their data. This textbook provides a straightforward introduction to the statistical analysis of language. Designed for linguists with a non-mathematical background, it clearly introduces the basic principles and methods of statistical analysis, using 'R', the leading computational statistics programme. The reader is guided step-by-step through a range of real data sets, allowing them to analyse acoustic data, construct grammatical trees for a variety of languages, quantify register variation in corpus linguistics, and measure experimental data using state-of-the-art models. The visualization of data plays a key role, both in the initial stages of data exploration and later on when the reader is encouraged to criticize various models. Containing over 40 exercises with model answers, this book will be welcomed by all linguists wishing to learn more about working with and presenting quantitative data.




Introducing Electronic Text Analysis


Book Description

Introducing Electronic Text Analysis is a practical and much-needed introduction to corpora – bodies of linguistic data. Written specifically for students studying this topic for the first time, the book begins with a discussion of the underlying principles of electronic text analysis. It then examines how these corpora enhance our understanding of literary and non-literary works. In the first section the author introduces the concepts of concordance and lexical frequency, concepts which are then applied to a range of areas of language study. Key areas examined are the use of on-line corpora to complement traditional stylistic analysis, and the ways in which methods such as concordance and frequency counts can reveal a particular ideology within a text. Presenting an accessible and thorough understanding of the underlying principles of electronic text analysis, the book contains abundant illustrative examples and a glossary with definitions of main concepts. It will also be supported by a companion website with links to on-line corpora so that students can apply their knowledge to further study. The companion website can be found at http://www.routledge.com/textbooks/0415320216




The Open Handbook of Linguistic Data Management


Book Description

A guide to principles and methods for the management, archiving, sharing, and citing of linguistic research data, especially digital data. "Doing language science" depends on collecting, transcribing, annotating, analyzing, storing, and sharing linguistic research data. This volume offers a guide to linguistic data management, engaging with current trends toward the transformation of linguistics into a more data-driven and reproducible scientific endeavor. It offers both principles and methods, presenting the conceptual foundations of linguistic data management and a series of case studies, each of which demonstrates a concrete application of abstract principles in a current practice. In part 1, contributors bring together knowledge from information science, archiving, and data stewardship relevant to linguistic data management. Topics covered include implementation principles, archiving data, finding and using datasets, and the valuation of time and effort involved in data management. Part 2 presents snapshots of practices across various subfields, with each chapter presenting a unique data management project with generalizable guidance for researchers. The Open Handbook of Linguistic Data Management is an essential addition to the toolkit of every linguist, guiding researchers toward making their data FAIR: Findable, Accessible, Interoperable, and Reusable.




Handbook of Language Analysis in Psychology


Book Description

Recent years have seen an explosion of interest in the use of computerized text analysis methods to address basic psychological questions. This comprehensive handbook brings together leading language analysis scholars to present foundational concepts and methods for investigating human thought, feeling, and behavior using language. Contributors work toward integrating psychological science and theory with natural language processing (NLP) and machine learning. Ethical issues in working with natural language data sets are discussed in depth. The volume showcases NLP-driven techniques and applications in areas including interpersonal relationships, personality, morality, deception, social biases, political psychology, psychopathology, and public health.




Linguistic Ethnography


Book Description

This is an engaging interdisciplinary guide to the unique role of language within ethnography. The book provides a philosophical overview of the field alongside practical support for designing and developing your own ethnographic research. It demonstrates how to build and develop arguments and engages with practical issues such as ethics, transcription and impact. There are chapter-long case studies based on real research that will explain key themes and help you create and analyse your own linguistic data. Drawing on their own experience, the authors outline the practical, epistemological and theoretical decisions that researchers must take when planning and carrying out their studies. Other key features include: a clear introduction to discourse analytic traditions; tips on how to produce effective field notes; guidance on how to manage interview and conversational data; advice on writing linguistic ethnographies for different audiences; annotated suggestions for further reading; and a full glossary. This book is a master class in understanding linguistic ethnography. It will be of interest to anyone conducting field research across the social sciences.




Data Visualization and Analysis in Second Language Research


Book Description

This introduction to visualization techniques and statistical models for second language research focuses on three types of data (continuous, binary, and scalar), helping readers to understand regression models fully and to apply them in their work. Garcia offers advanced coverage of Bayesian analysis, simulated data, exercises, implementable script code, and practical guidance on the latest R software packages. The book, also demonstrating the benefits to the L2 field of this type of statistical work, is a resource for graduate students and researchers in second language acquisition, applied linguistics, and corpus linguistics who are interested in quantitative data analysis.




Statistics for Linguistics with R


Book Description

This book is an introduction to statistics for linguists using the open source software R. It is aimed at students and instructors/professors with little or no statistical background and is written in a non-technical and reader-friendly/accessible style. It first introduces in detail the overall logic underlying quantitative studies: exploration, hypothesis formulation and operationalization, and the notion and meaning of significance tests. It then introduces some basics of the software R relevant to statistical data analysis. A chapter on descriptive statistics explains how summary statistics for frequencies, averages, and correlations are generated with R and how they are graphically represented best. A chapter on analytical statistics explains how statistical tests are performed in R on the basis of many different linguistic case studies: For nearly every single example, it is explained what the structure of the test looks like, how hypotheses are formulated, explored, and tested for statistical significance, how the results are graphically represented, and how one would summarize them in a paper/article. A chapter on selected multifactorial methods introduces how more complex research designs can be studied: methods for the study of multifactorial frequency data, correlations, tests for means, and binary response data are discussed and exemplified step-by-step. Also, the exploratory approach of hierarchical cluster analysis is illustrated in detail. The book comes with many exercises, boxes with short think breaks and warnings, recommendations for further study, and answer keys as well as a statistics for linguists newsgroup on the companion website. The volume is aimed at beginners on every level of linguistic education: undergraduate students, graduate students, and instructors/professors and can be used in any research methods and statistics class for linguists. 
It presupposes no quantitative/statistical knowledge whatsoever and, unlike most competing books, begins at step 1 for every method and explains everything explicitly.




The Oxford Handbook of Linguistic Analysis


Book Description

This handbook compares the main analytic frameworks and methods of contemporary linguistics. It offers a unique overview of linguistic theory, revealing the common concerns of competing approaches. By showing their current and potential applications it provides the means by which linguists and others can judge what are the most useful models for the task in hand. Distinguished scholars from all over the world explain the rationale and aims of over thirty explanatory approaches to the description, analysis, and understanding of language. Each chapter considers the main goals of the model; the relation it proposes between lexicon, syntax, semantics, pragmatics, and phonology; the way it defines the interactions between cognition and grammar; what it counts as evidence; and how it explains linguistic change and structure. The Oxford Handbook of Linguistic Analysis offers an indispensable guide for everyone researching any aspect of language including those in linguistics, comparative philology, cognitive science, developmental psychology, computational science, and artificial intelligence. This second edition has been updated to include seven new chapters looking at linguistic units in language acquisition, conversation analysis, neurolinguistics, experimental phonetics, phonological analysis, experimental semantics, and distributional typology.




How Students Write: A Linguistic Analysis


Book Description

Broad generalizations about "people today" are a familiar feature of first-year student writing. How Students Write brings a fresh perspective to this perennial observation, using corpus linguistics techniques. This study analyzes sentence-level patterns in student writing to develop an understanding of how students present evidence, draw connections between ideas, relate to their readers, and, ultimately, learn to construct knowledge in their writing. Drawing on both first-year and upper-level student writing, the book examines the discourse of students at different points in their education. It also distinguishes between argumentative and analytic essays to explore the way school genres and assignments shape students' choices. In focusing on sentence-level features such as hedges ("perhaps") and boosters ("definitely"), this study shows how such rhetorical choices work together to open or close opportunities for thoughtful exchanges of ideas. Attention to these features can help instructors foster civil discourse, design effective assignments, and expose and question norms of higher education.




Linked Data in Linguistics


Book Description

The explosion of information technology has led to substantial growth of web-accessible linguistic data in terms of quantity, diversity and complexity. These resources become even more useful when interlinked with each other to generate network effects. The general trend of providing data online is thus accompanied by newly developing methodologies to interconnect linguistic data and metadata. This includes linguistic data collections, general-purpose knowledge bases (e.g., DBpedia, a machine-readable edition of Wikipedia), and repositories with specific information about languages, linguistic categories and phenomena. The Linked Data paradigm provides a framework for interoperability and access management, and thereby allows information from such a diverse set of resources to be integrated. The contributions assembled in this volume illustrate the breadth of applications of the Linked Data paradigm for representative types of language resources. They cover lexical-semantic resources, annotated corpora, typological databases as well as terminology and metadata repositories. The book includes representative applications from diverse fields, ranging from academic linguistics (e.g., typology and corpus linguistics) through applied linguistics (e.g., lexicography and translation studies) to technical applications (in computational linguistics, Natural Language Processing and information technology). This volume accompanies the Workshop on Linked Data in Linguistics 2012 (LDL-2012) in Frankfurt/M., Germany, organized by the Open Linguistics Working Group (OWLG) of the Open Knowledge Foundation (OKFN). It assembles contributions of the workshop participants and, beyond this, it summarizes initial steps in the formation of a Linked Open Data cloud of linguistic resources, the Linguistic Linked Open Data cloud (LLOD).