Stylistics, Stylometry and Sentiment Analysis in German Studies


Book Description

Can literature be investigated through quantitative methods? Can style, empathy, and prestige be measured? This study attempts to respond to these questions by providing results from a selection of case studies taken from German literature of the 19th through the 21st century, including Goethe’s “late style”, Felix Salten, and the output of contemporary writers such as Florian Meimberg’s “twitterature” and Daniel Glattauer’s e-mail novel. Altogether, this study shows how the interplay among literary theory, stylometry, stylistics, sentiment analysis, empirical studies, and archival research can offer new answers to old questions regarding German literature and provide tools to formulate new questions and novel approaches to research.




Genre Analysis and Corpus Design


Book Description

This work in the field of digital literary stylistics and computational literary studies addresses theoretical questions of literary genre, the design of a corpus of nineteenth-century Spanish-American novels, and the empirical analysis of that corpus in terms of subgenres of the novel. The digital text corpus consists of 256 Argentine, Cuban, and Mexican novels from the period between 1830 and 1910. It was created with the goal of analyzing, by means of computational text categorization methods, the thematic subgenres and literary currents that were represented in numerous novels in the nineteenth century. To categorize the texts, statistical classification and a family resemblance analysis relying on network analysis are used, with the aim of examining how the subgenres, which are understood as communicative, conventional phenomena, can be captured on the stylistic, textual level of the novels that participate in them.




Humanities Data Analysis


Book Description

A practical guide to data-intensive humanities research using the Python programming language.

The use of quantitative methods in the humanities and related social sciences has increased considerably in recent years, allowing researchers to discover patterns in a vast range of source materials. Despite this growth, there are few resources addressed to students and scholars who wish to take advantage of these powerful tools. Humanities Data Analysis offers the first intermediate-level guide to quantitative data analysis for humanities students and scholars using the Python programming language. This practical textbook, which assumes a basic knowledge of Python, teaches readers the necessary skills for conducting humanities research in the rapidly developing digital environment. The book begins with an overview of the place of data science in the humanities, and proceeds to cover data carpentry: the essential techniques for gathering, cleaning, representing, and transforming textual and tabular data. Then, drawing from real-world, publicly available data sets that cover a variety of scholarly domains, the book delves into detailed case studies. Focusing on textual data analysis, the authors explore such diverse topics as network analysis, genre theory, onomastics, literacy, author attribution, mapping, stylometry, topic modeling, and time series analysis. Exercises and resources for further reading are provided at the end of each chapter. An ideal resource for humanities students and scholars aiming to take their Python skills to the next level, Humanities Data Analysis illustrates the benefits that quantitative methods can bring to complex research questions.

Appropriate for advanced undergraduates, graduate students, and scholars with a basic knowledge of Python
Applicable to many humanities disciplines, including history, literature, and sociology
Offers real-world case studies using publicly available data sets
Provides exercises at the end of each chapter for students to test acquired skills
Emphasizes visual storytelling via data visualizations




Authorship Attribution


Book Description

Authorship Attribution surveys the history and present state of the discipline, presenting some comparative results where available. It also provides a theoretical and empirically tested basis for further work. Many modern techniques are described and evaluated, along with some insights on their application for novices and experts alike.




Exploring the Implications of Complexity Thinking for Translation Studies


Book Description

Exploring the Implications of Complexity Thinking for Translation Studies considers the new link between translation studies and complexity thinking. Edited by leading scholars in this emerging field, the collection builds on and expands work done in complexity thinking in translation studies over the past decade. In this volume, the contributors address a variety of implications that this new approach holds for key concepts in Translation Studies such as source vs. target texts, translational units, authorship, and translatorship; for research topics including translation data, machine translation, and communities of practice; and for research methods such as constraints and the emergence of trajectories. The various chapters provide valuable information as to how research methods informed by complexity thinking can be applied in translation studies. Presenting theoretical and methodological contributions as well as case studies, this volume is of interest to advanced students, academics, and researchers in translation and interpreting studies, literary studies, and related areas.




The Grammar of Genres and Styles


Book Description

The book provides new findings about the grammar of genres and styles. It combines new methods with different kinds of empirical material, from social reports to live TV sports commentaries and 16th-century newspapers, in English, French, Latin and Spanish. The study of non-discrete units suggests new ways of seeing the linguistic variation between genres and styles and the ways in which belonging to a genre predetermines linguistic choices.




Machine Learning Methods for Stylometry


Book Description

This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to all text categorization problems (like authorship attribution, psychological traits of the author, detecting fake news, etc.) grounded in stylistic features. Specifically, machine learning models as valuable tools for verifying hypotheses or revealing significant patterns hidden in datasets are presented in detail. Stylometry is a multi-disciplinary field combining linguistics with both statistics and computer science. The content is divided into three parts. The first, which consists of the first three chapters, offers a general introduction to stylometry, its potential applications and limitations. Further, it introduces the ongoing example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part are devoted more to computer science, with a focus on machine learning models. Their main aim is to explain machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning represents an active field of research, information on neural network models and word embeddings applied to stylometry is provided, as well as a general introduction to the deep learning approach to solving stylometric questions. In turn, the third part illustrates the application of the previously discussed approaches in real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling in order to identify whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years. A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and discussions on stylometric models and techniques, examples and datasets are freely available at the author’s GitHub website.







Corpus-based Language Studies


Book Description

Covering the major approaches to the use of corpus data, this work gathers together influential readings from leading names in the discipline, including Biber, Widdowson, Sinclair, Carter and McCarthy.