Automatic Text Simplification


Book Description

Thanks to the availability of texts on the Web in recent years, increased knowledge and information have been made available to broader audiences. However, the way in which a text is written—its vocabulary, its syntax—can make it difficult to read and understand for many people, especially those with poor literacy, cognitive or linguistic impairment, or limited knowledge of the language of the text. Texts containing uncommon words or long and complicated sentences are difficult for people to read and understand, as well as difficult for machines to analyze. Automatic text simplification is the process of transforming a text into another text which, ideally conveying the same message, is easier to read and understand for a broader audience. The process usually involves replacing difficult or unknown phrases with simpler equivalents and transforming long and syntactically complex sentences into shorter and less complex ones. Automatic text simplification, a research topic that started some 20 years ago, has now taken on a central role in natural language processing research, not only because of the interesting challenges it poses but also because of its social implications. This book presents past and current research in text simplification, exploring key issues including automatic readability assessment, lexical simplification, and syntactic simplification. It also provides a detailed account of machine learning techniques currently used in simplification, describes full systems designed for specific languages and target audiences, and offers available resources for research and development, together with text simplification evaluation techniques.
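To make the two core operations concrete, here is a minimal, illustrative Python sketch of lexical substitution and naive sentence splitting; the word list and the splitting rule are hypothetical toy examples, not techniques prescribed by the book.

```python
# Minimal illustrative sketch of the two operations described above:
# lexical substitution from a small hand-made dictionary and a naive
# split of long coordinated sentences. The word list and the splitting
# rule are hypothetical examples, not a method from the book.
import re

SIMPLER = {"utilize": "use", "commence": "begin", "terminate": "end"}

def lexical_simplify(sentence: str) -> str:
    """Replace each known difficult word with a simpler equivalent."""
    return " ".join(SIMPLER.get(w.lower(), w) for w in sentence.split())

def split_on_coordination(sentence: str) -> list[str]:
    """Naively split a long sentence at ', and' into shorter clauses."""
    parts = re.split(r",\s+and\s+", sentence)
    return [p.strip().rstrip(".") + "." for p in parts]

if __name__ == "__main__":
    s = "Researchers commence the study, and they utilize large corpora."
    for clause in split_on_coordination(s):
        print(lexical_simplify(clause))
```

Real systems replace the hand-made dictionary with data-driven substitution models and the regular expression with syntactic analysis, but the two-stage structure is the same.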







From Complex Sentences to a Formal Semantic Representation using Syntactic Text Simplification and Open Information Extraction


Book Description

This work presents a discourse-aware Text Simplification approach that splits and rephrases complex English sentences within the semantic context in which they occur. Based on a linguistically grounded transformation stage, complex sentences are transformed into shorter utterances with a simple canonical structure that can be easily analyzed by downstream applications. To avoid breaking down the input into a disjointed sequence of statements that is difficult to interpret, the author incorporates the semantic context between the split propositions in the form of hierarchical structures and semantic relationships, thus generating a novel representation of complex assertions that puts a semantic layer on top of the simplified sentences. In a second step, she leverages the semantic hierarchy of minimal propositions to improve the performance of Open IE frameworks. She shows that such systems benefit in two dimensions. First, the canonical structure of the simplified sentences facilitates the extraction of relational tuples, leading to an improved precision and recall of the extracted relations. Second, the semantic hierarchy can be leveraged to enrich the output of existing Open IE approaches with additional meta-information, resulting in a novel lightweight semantic representation for complex text data in the form of normalized and context-preserving relational tuples.
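As a rough illustration of the kind of representation described, the following Python sketch arranges minimal propositions in a small hierarchy, links a core statement to its context through a labeled semantic relationship, and derives a context-enriched relational tuple; the class names, fields, and toy extraction rule are hypothetical and are not the author's actual framework.

```python
# Illustrative sketch: minimal propositions in a small hierarchy, with a
# semantic relationship linking a core statement to its context, and a
# relational tuple enriched with that context. Names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Proposition:
    text: str
    is_core: bool = True
    # (relation label, linked proposition) pairs, e.g. ("Background", ...)
    context: list[tuple[str, "Proposition"]] = field(default_factory=list)

@dataclass
class RelTuple:
    subject: str
    predicate: str
    obj: str
    meta: list[str] = field(default_factory=list)  # attached context texts

def extract_tuple(prop: Proposition) -> RelTuple:
    """Toy extraction: split a canonical 'X <verb> Y' proposition."""
    subj, pred, *rest = prop.text.split(" ", 2)
    t = RelTuple(subj, pred, rest[0] if rest else "")
    t.meta = [f"{label}: {p.text}" for label, p in prop.context]
    return t

if __name__ == "__main__":
    background = Proposition("The workshop took place in 2019.", is_core=False)
    core = Proposition("Researchers presented results.",
                       context=[("Background", background)])
    print(extract_tuple(core))
```

The point of the layered representation is that the contextual proposition is not lost when the sentence is split: it travels with the extracted tuple as meta-information.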




Data Simplification


Book Description

Data Simplification: Taming Information With Open Source Tools addresses the simple fact that modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools. This book provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or long-term storage in a form that can be readily repurposed or integrated with other data. Drawing upon years of practical experience, and using numerous examples and use cases, Jules Berman discusses the principles, methods, and tools that must be studied and mastered to achieve data simplification; open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data; natural language processing and machine translation as tools to simplify data; and data summarization and visualization and the role they play in making data useful for the end user.
- Discusses data simplification principles, methods, and tools that must be studied and mastered
- Provides open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data
- Explains how to best utilize indexes to search, retrieve, and analyze textual data
- Shows the data scientist how to apply ontologies, classifications, classes, properties, and instances to data using tried and true methods
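As a generic illustration of the indexing idea mentioned among the book's topics, the following Python sketch builds a small inverted index and answers conjunctive word queries over it; it is not code taken from the book.

```python
# A minimal sketch of an inverted index: map each word to the set of
# documents containing it, so lookups avoid scanning every document.
# Generic illustration only, not code from the book.
from collections import defaultdict

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased word to the ids of the documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word.strip(".,;:")].add(doc_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return documents containing every word of the query (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    results = index.get(words[0], set()).copy()
    for w in words[1:]:
        results &= index.get(w, set())
    return results

if __name__ == "__main__":
    docs = {"d1": "Complex data must be simplified.",
            "d2": "Simplified data is easier to analyze."}
    idx = build_index(docs)
    print(search(idx, "simplified data"))  # {'d1', 'd2'}
```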




Why Simple Wins


Book Description

Imagine what you could do with the time you spend writing emails every day. Complexity is killing companies' ability to innovate and adapt, and simplicity is fast becoming the competitive advantage of our time. Why Simple Wins helps leaders and their teams move beyond the feelings of frustration and futility that come with so much unproductive work in today's corporate world to create a corporate culture where valuable, essential, meaningful work is the norm. By learning how to eliminate redundancies, communicate with clarity, and make simplification a habit, individuals and companies can begin to recognize which activities are time-sucks and which create lasting value. Lisa Bodell's simplification method has several unique principles: Simplification is a skill that's available to us all, yet very few leaders use it. Simplification is the right thing to do--for our customers, for our company, and for each other. Operating with simplification as our core business model will make it easier to be respectful of each other's time. Simplification drives culture, and culture in turn drives employee engagement, customer relations, and overall productivity. This book is inspired by Bodell's passion for eliminating barriers to innovation and productivity. In it, she explains why change and innovation are so hard to achieve--and it's not what you might expect. The reality is this: we spend our days drowning in mundane tasks like meetings, emails, and reports. These are often self-created complexities that prevent us from getting to the meaningful work that truly matters. Using simple stories and techniques, Why Simple Wins shows that by using simplicity as an operating principle, we can eliminate the busy work that puts a chokehold on us every day, and instead spend time on the work that we value.




From Knowledge Intensive CAD to Knowledge Intensive Engineering


Book Description

IFIP Working Group 5.2 has organized a series of workshops extending the concept of intelligent CAD to the concept of "knowledge intensive engineering". The concept advocates that intensive life-cycle knowledge regarding products and design processes must be incorporated in the center of the CAD architecture. It focuses on the systematization and sharing of knowledge across the life-cycle stages and organizational boundaries. From Knowledge Intensive CAD to Knowledge Intensive Engineering comprises the Proceedings of the Fourth Workshop on Knowledge Intensive CAD, which was sponsored by the International Federation for Information Processing (IFIP) and held in Parma, Italy in May 2000. This workshop looked at the evolution of knowledge intensive design for the product life cycle moving towards knowledge intensive engineering. The 18 selected papers present an overview of the state-of-the-art in knowledge intensive engineering, discussing theoretical aspects and also practical systems and experiences gained in this area. An invited speaker paper is also included, discussing the role of knowledge in product and process innovation and technology for processing semantic knowledge. Main issues discussed in the book are: Architectures for knowledge intensive CAD; Tools for knowledge intensive CAD; Methodologies for knowledge intensive CAD; Implementation of knowledge intensive CAD; Applications of knowledge intensive CAD; Evolution of knowledge intensive design for the life-cycle; Formal methods. The volume is essential reading for researchers, graduate and postgraduate students, systems developers of advanced computer-aided design and manufacturing systems, and engineers involved in industrial applications.




Virtual Environments ’99


Book Description

This book contains the scientific papers presented at the 5th EUROGRAPHICS Workshop on Virtual Environments '99, which was held in Vienna on May 31st and June 1st. It was organized by the Institute of Computer Graphics of the Vienna University of Technology together with the Austrian Academy of Sciences and EUROGRAPHICS. The workshop brought together scientists from all over the world to present and discuss the latest scientific advances in the field of Virtual Environments. 31 papers were submitted for review and 18 were selected for presentation at the workshop. Most of the top research institutions working in the area submitted papers and presented their latest results. These presentations were complemented by invited lectures from Stephen Feiner and Ron Azuma, two key researchers in the area of Augmented Reality. The book gives a good overview of the state of the art in Augmented Reality and Virtual Environment research. The special focus of the workshop was Augmented Reality, reflecting a noticeably strong trend in the field of Virtual Environments. Augmented Reality tries to enrich real environments with virtual objects rather than replacing the real world with a virtual one. The main challenges include real-time rendering, tracking, registration and occlusion of real and virtual objects, shading and lighting interaction, and interaction techniques in augmented environments. These problems are addressed by the new research results documented in this book. Besides Augmented Reality, the papers collected here also address levels of detail, distributed environments, systems and applications, and interaction techniques.




Quality Estimation for Machine Translation


Book Description

Many applications within natural language processing involve performing text-to-text transformations, i.e., given a text in natural language as input, systems are required to produce a version of this text (e.g., a translation), also in natural language, as output. Automatically evaluating the output of such systems is an important component in developing text-to-text applications. Two approaches have been proposed for this problem: (i) to compare the system outputs against one or more reference outputs using string matching-based evaluation metrics and (ii) to build models based on human feedback to predict the quality of system outputs without reference texts. Despite their popularity, reference-based evaluation metrics are faced with the challenge that multiple good (and bad) quality outputs can be produced by text-to-text approaches for the same input. This variation is very hard to capture, even with multiple reference texts. In addition, reference-based metrics cannot be used in production (e.g., online machine translation systems), where systems are expected to produce outputs for any unseen input. In this book, we focus on the second set of metrics, so-called Quality Estimation (QE) metrics, where the goal is to provide an estimate of how good or reliable the texts produced by an application are without access to gold-standard outputs. QE enables different types of evaluation that can target different types of users and applications. Machine learning techniques are used to build QE models with various types of quality labels and explicit features or learnt representations, which can then predict the quality of unseen system outputs. This book describes the topic of QE for text-to-text applications, covering quality labels, features, algorithms, evaluation, uses, and state-of-the-art approaches. It focuses on machine translation as the application, since this represents most of the QE work done to date. It also briefly describes QE for several other applications, including text simplification, text summarization, grammatical error correction, and natural language generation.
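As a rough illustration of the QE setup described, the following Python sketch computes a few hand-crafted features from source/output pairs, fits a regression model on human quality labels, and predicts a score for an unseen output without any reference; the features, the tiny toy data, and the choice of scikit-learn's Ridge regressor are illustrative assumptions, since real QE systems use far richer features or learnt representations.

```python
# Minimal QE-style sketch: features from source/output pairs, a regressor
# fit on human quality labels, then a prediction for an unseen output with
# no reference translation. Features and toy data are hypothetical.
from sklearn.linear_model import Ridge

def features(source: str, output: str) -> list[float]:
    """Toy features: output length, length ratio, average word length."""
    s, o = source.split(), output.split()
    return [len(o), len(s) / max(len(o), 1), sum(map(len, o)) / max(len(o), 1)]

# Toy training data: (source, system output, human quality score in [0, 1]).
train = [
    ("das haus ist klein", "the house is small", 0.95),
    ("das haus ist klein", "house small", 0.30),
    ("er liest ein buch", "he reads a book", 0.90),
]

X = [features(s, o) for s, o, _ in train]
y = [score for _, _, score in train]
model = Ridge(alpha=1.0).fit(X, y)

# Predict quality for an unseen output, without any gold-standard reference.
print(model.predict([features("sie trinkt kaffee", "she drinks coffee")]))
```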




Computational Linguistics and Intelligent Text Processing


Book Description

The two-volume set LNCS 13451 and 13452 constitutes the revised selected papers from the CICLing 2019 conference, which took place in La Rochelle, France, in April 2019. The total of 95 papers presented in the two volumes was carefully reviewed and selected from 335 submissions. The book also contains 3 invited papers. The papers are organized in the following topical sections: General; Information extraction; Information retrieval; Language modeling; Lexical resources; Machine translation; Morphology, syntax, parsing; Named entity recognition; Semantics and text similarity; Sentiment analysis; Speech processing; Text categorization; Text generation; and Text mining.




Advances in Computers


Book Description

Advances in Computers