Authorship Attribution


Book Description

Authorship Attribution surveys the history and present state of the discipline, presenting some comparative results where available. It also provides a theoretical and empirically tested basis for further work. Many modern techniques are described and evaluated, along with some insights on their application, for novices and experts alike.




Machine Learning Methods for Stylometry


Book Description

This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to all text categorization problems (like authorship attribution, psychological traits of the author, detecting fake news, etc.) grounded in stylistic features. Specifically, machine learning models as valuable tools for verifying hypotheses or revealing significant patterns hidden in datasets are presented in detail. Stylometry is a multi-disciplinary field combining linguistics with both statistics and computer science. The content is divided into three parts. The first, which consists of the first three chapters, offers a general introduction to stylometry, its potential applications and limitations. Further, it introduces the ongoing example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part are more devoted to computer science with a focus on machine learning models. Their main aim is to explain machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning represents an active field of research, information on neural network models and word embeddings applied to stylometry is provided, as well as a general introduction to the deep learning approach to solving stylometric questions. In turn, the third part illustrates the application of the previously discussed approaches in real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling in order to identify whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years. A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and discussions on stylometric models and techniques, examples and datasets are freely available at the author’s GitHub site.




Multidisciplinary Information Retrieval


Book Description

This book constitutes the proceedings of the 7th International Information Retrieval Facility Conference, IRFC 2014, held in Copenhagen, Denmark, in November 2014. The 10 papers presented together with one industry paper were carefully reviewed and selected from 13 submissions. The conference aims at bringing young researchers into contact with the industry at an early stage, emphasizing the applicability of IR solutions to real industry cases and the respective challenges.




Issues in Information Science Research: 2013 Edition


Book Description

Issues in Information Science Research / 2013 Edition is a ScholarlyEditions™ book that delivers timely, authoritative, and comprehensive information about Web and Grid Services. The editors have built Issues in Information Science Research: 2013 Edition on the vast information databases of ScholarlyNews.™ You can expect the information about Web and Grid Services in this book to be deeper than what you can access anywhere else, as well as consistently reliable, authoritative, informed, and relevant. The content of Issues in Information Science Research: 2013 Edition has been produced by the world’s leading scientists, engineers, analysts, research institutions, and companies. All of the content is from peer-reviewed sources, and all of it is written, assembled, and edited by the editors at ScholarlyEditions™ and available exclusively from us. You now have a source you can cite with authority, confidence, and credibility. More information is available at http://www.ScholarlyEditions.com/.




Literary Detective Work on the Computer


Book Description

Computational linguistics can be used to uncover mysteries in text which are not always obvious to visual inspection. For example, the computer analysis of writing style can show who might be the true author of a text in cases of disputed authorship or suspected plagiarism. The theoretical background to authorship attribution is presented in a step-by-step manner, and comprehensive reviews of the field are given in two specialist areas: the writings of William Shakespeare and his contemporaries, and the various writing styles seen in religious texts. The final chapter looks at the progress computers have made in the decipherment of lost languages. This book is written for students and researchers of general linguistics, computational and corpus linguistics, and computer forensics. It will inspire future researchers to study these topics for themselves, and gives sufficient details of the methods and resources to get them started.




Information and Software Technologies


Book Description

This book constitutes the refereed proceedings of the 21st International Conference on Information and Software Technologies, ICIST 2015, held in Druskininkai, Lithuania, in October 2015. The 51 papers presented were carefully reviewed and selected from 125 submissions. The papers are organized in topical sections on information systems; business intelligence for information and software systems; software engineering; and information technology applications.




The SAGE Handbook of Social Media Research Methods


Book Description

The SAGE Handbook of Social Media Research Methods spans the entire research process, from data collection to analysis and interpretation. This second edition has been comprehensively updated and expanded, from 39 to 49 chapters. In addition to a new section of chapters focussing on ethics, privacy and the politics of social media data, the new edition provides broader coverage of topics such as data sources; scraping and spidering data; locative data, video data and linked data; platform-specific analysis; analytical tools; and critical social media analysis. Written by leading scholars from across the globe, the chapters provide a mix of theoretical and applied assessments of topics, and include a range of new case studies and data sets that exemplify the methodological approaches. This Handbook is an essential resource for any researcher or postgraduate student embarking on a social media research project.

PART 1: Conceptualising and Designing Social Media Research
PART 2: Collecting Data
PART 3: Qualitative Approaches to Social Media Data
PART 4: Quantitative Approaches to Social Media Data
PART 5: Diverse Approaches to Social Media Data
PART 6: Research & Analytical Tools
PART 7: Social Media Platforms
PART 8: Privacy, Ethics and Inequalities




Databases Theory and Applications


Book Description

This book constitutes the refereed proceedings of the 25th Australasian Database Conference, ADC 2014, held in Brisbane, QLD, Australia, in July 2014. The 15 full papers presented together with 6 short papers and 2 keynotes were carefully reviewed and selected from 38 submissions. A large variety of subjects are covered, including hot topics such as data warehousing; database integration; mobile databases; cloud, distributed, and parallel databases; high dimensional and temporal data; image/video retrieval and databases; database performance and tuning; privacy and security in databases; query processing and optimization; semi-structured data and XML; spatial data processing and management; stream and sensor data management; uncertain and probabilistic databases; web databases; graph databases; web service management; and social media data management.




Applications of Topic Models


Book Description

Describes recent academic and industrial applications of topic models with the goal of launching a young researcher capable of building their own applications of topic models.




Journalism History and Digital Archives


Book Description

This book showcases various ways in which digital archives allow for new approaches to journalism history. The chapters in this book were selected based on three overall objectives: 1) research that highlights specific concerns within journalism history through digital archives; 2) discussions of digital methodologies, as well as specific applications, that are accessible for journalism scholars with no prior experience with such approaches; and 3) the recognition that journalism history and digital archives are connected in other ways than through specific methods, i.e., that the connection raises larger questions of historiography and power. The contributions address cases and developments in Asia, South and North America and Europe, and range from long-range, big-data, machine-learning and topic modelling studies of journalistic characteristics and meta-journalistic discourses to critiques of archival practices and access in relation to gender, social movements and poverty. The chapters in this book were originally published as a special issue of Digital Journalism.