Advances in Semantic Authoring and Publishing


Book Description

Dissemination can be seen as a communication process between scientists. Over the course of several publications, they expose and support their findings, while discussing stated claims. Such discourse structures are trapped within the content of the publications, thus making the semantics discoverable only by humans. In addition, the lack of advances in scientific publishing, where electronic publications are still used as simple projections of paper documents, combined with the current growth in the amount of scientific research being published, turns the process of finding relevant literature into a cumbersome task. The work presented in this thesis proposes a solution that takes full advantage of the support provided by electronic publications and of current Semantic Web technologies to expose and crystallise the different discourse structures. The goal is to pave the way towards a Semantic Publishing Ecosystem that will alleviate, at least partly, the information overload problem. Our solution relies on enriching scientific publications with explicit rhetorical and argumentation discourse structures, in addition to explicit linear structures for identification and localization, and bibliographic information. Embedding these structures within the publication documents (as semantic metadata) enables the creation of semantic publications, i.e., foundational artefacts of the Semantic Publishing Ecosystem and linked resources that are part of the current Web of Data.




Probabilistic Semantic Web


Book Description

The management of uncertainty in the Semantic Web is of foremost importance given the nature and origin of the available data. This book presents a probabilistic semantics for knowledge bases, DISPONTE, which is inspired by the distribution semantics of Probabilistic Logic Programming. The book also describes approaches for inference and learning. In particular, it discusses three reasoners and two learning algorithms. BUNDLE and TRILL are able to find explanations for queries and compute their probability with regard to DISPONTE KBs, while TRILLP compactly represents explanations using a Boolean formula and computes the probability of queries. The system EDGE learns the parameters of axioms of DISPONTE KBs. To reduce the computational cost, EDGEMR performs distributed parameter learning. LEAP learns both the structure and parameters of KBs, with LEAPMR using EDGEMR to reduce the computational cost. The algorithms provide effective techniques for dealing with uncertain KBs and have been widely tested on various datasets and compared with state-of-the-art systems.
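The idea of computing a query's probability from its explanations can be illustrated with a minimal sketch. It assumes the usual reading of a distribution semantics: each probabilistic axiom is kept or dropped independently, and the query's probability is the total weight of the worlds that contain at least one explanation. The axiom names, probabilities, and the function `query_probability` are hypothetical and are not taken from BUNDLE, TRILL, or TRILLP, which avoid this brute-force enumeration.

```python
from itertools import product


def query_probability(axiom_probs, explanations):
    """Sum the weights of all worlds that contain at least one explanation.

    axiom_probs: dict mapping an axiom name to its probability (hypothetical names).
    explanations: iterable of frozensets of axiom names; each set is assumed
    to be sufficient to entail the query.
    """
    axioms = list(axiom_probs)
    total = 0.0
    # A world is one keep/drop decision per probabilistic axiom.
    for choices in product([True, False], repeat=len(axioms)):
        world = {a for a, kept in zip(axioms, choices) if kept}
        weight = 1.0
        for a, kept in zip(axioms, choices):
            weight *= axiom_probs[a] if kept else 1.0 - axiom_probs[a]
        # The query holds in this world if some explanation is fully included.
        if any(expl <= world for expl in explanations):
            total += weight
    return total


# Illustrative data only: three axioms and two explanations for one query.
axiom_probs = {"a1": 0.4, "a2": 0.3, "a3": 0.6}
explanations = [frozenset({"a1", "a2"}), frozenset({"a3"})]
p = query_probability(axiom_probs, explanations)
```

Enumeration is exponential in the number of probabilistic axioms, which is one reason a compact Boolean-formula representation of explanations, as described for TRILLP, is attractive.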




Semantic Search for Novel Information


Book Description

In this book, new approaches are presented for detecting and extracting information that is simultaneously relevant and novel from unstructured text documents. A major contribution of these approaches is that the information already provided and the extracted information are modeled semantically. This leads to the following benefits: (a) ambiguities in the language can be resolved; (b) the exact information needs regarding relevance and novelty can be specified; and (c) knowledge graphs can be incorporated. More specifically, this book presents the following scientific contributions: 1. An assessment of the suitability of existing large knowledge graphs (namely, DBpedia, Freebase, OpenCyc, Wikidata, and YAGO) for the task of detecting novel information in text documents. 2. A description of an approach by which emerging entities that are missing in a knowledge graph are detected in a stream of text documents. 3. A suggestion for an approach to extracting novel, relevant, semantically structured statements from text documents. The developed approaches are suitable for the recommendation of emerging entities and novel statements, respectively, for the purpose of knowledge graph population, and for providing assistance to users requiring novel information, such as journalists and technology scouts.




Semantic Sentiment Analysis in Social Streams


Book Description

Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people’s opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians seeking to assess public opinion about their policies or political issues. A wide range of approaches to sentiment analysis on social media have recently been built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment. However, these approaches are semantically weak, that is, they do not account for the semantics of words when detecting their sentiment in text. In order to address this problem, the author investigates the role of word semantics in sentiment analysis of microblogs. Specifically, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, the author proposes several approaches in this book for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from words’ co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources). Experiments are conducted with both types of semantics by assessing their impact on three popular sentiment analysis tasks on Twitter: entity-level sentiment analysis, tweet-level sentiment analysis and context-sensitive sentiment lexicon adaptation. The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter.
The proposed approaches, which consider word semantics for sentiment analysis at both entity and tweet levels, surpass non-semantic approaches in most evaluation scenarios. This book will be of interest to students, researchers and practitioners in the semantic sentiment analysis field.




Integrating Relational Databases with the Semantic Web


Book Description

An early vision in Computer Science was to create intelligent systems capable of reasoning on large amounts of data. Independent results in the areas of Semantic Web and Relational Databases have advanced us towards this vision. Despite independent advances, the interface between Relational Databases and Semantic Web is poorly understood. This dissertation revisits this early vision with respect to current technology and addresses the following question: How and to what extent can Relational Databases be integrated with the Semantic Web? The thesis is that much of the existing Relational Database infrastructure can be reused to support the Semantic Web. Two problems are studied. Can a Relational Database be automatically virtualized as a Semantic Web data source? The first contribution is an automatic direct mapping from a Relational Database schema and data to RDF and OWL. The second contribution is a method capable of evaluating SPARQL queries against the Relational Database by exploiting two existing relational query optimizations. These contributions are embodied in the Ultrawrap system. Experiments show that SPARQL query execution performance on Ultrawrap is comparable to that of SQL queries written directly for the relational data. Such results have not been previously achieved. Can a Relational Database be mapped to existing Semantic Web ontologies and act as a reasoner? A third contribution is a method for Relational Databases to support inheritance and transitivity by compiling the ontology as mappings, implementing the mappings as views, using SQL recursion and optimizing by materializing views. Ultrawrap is extended with this contribution. Empirical analysis reveals that Relational Databases are able to effectively act as reasoners.
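As a rough illustration of how a view with SQL recursion can give a relational database a reasoner-like behaviour for inheritance and transitivity, the sketch below uses SQLite from Python. It is not Ultrawrap's implementation: the table names (`subclass_of`, `instance_of`), the view name (`inferred_type`), and the sample data are all assumptions made for this example.

```python
import sqlite3

# In-memory database with a hypothetical class hierarchy and instance data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subclass_of (child TEXT, parent TEXT);
INSERT INTO subclass_of VALUES
    ('GradStudent', 'Student'),
    ('Student', 'Person');

CREATE TABLE instance_of (individual TEXT, class TEXT);
INSERT INTO instance_of VALUES
    ('alice', 'GradStudent'),
    ('bob', 'Person');

-- The view plays the role of a mapping: it exposes asserted types plus
-- every type implied by the transitive closure of the subclass relation.
CREATE VIEW inferred_type AS
WITH RECURSIVE ancestors(child, ancestor) AS (
    SELECT child, parent FROM subclass_of
    UNION
    SELECT a.child, s.parent
    FROM ancestors a JOIN subclass_of s ON a.ancestor = s.child
)
SELECT individual, class FROM instance_of
UNION
SELECT i.individual, a.ancestor
FROM instance_of i JOIN ancestors a ON i.class = a.child;
""")

# Query the view as if the inferred facts were stored explicitly.
persons = sorted(
    row[0]
    for row in conn.execute(
        "SELECT individual FROM inferred_type WHERE class = 'Person'"
    )
)
```

Here `alice` is returned as a `Person` even though only her `GradStudent` membership is stored. Materializing such a view, the optimization mentioned above, would trade storage for query time.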




Semantic and Fuzzy Modelling for Human Behaviour Recognition in Smart Spaces


Book Description

One of the major limitations of Ambient Intelligence systems today is the lack of semantic models of the activities taking place in the environment, which the system needs in order to recognize the specific activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics, to learn, infer and monitor human behaviours in Smart Spaces. Moreover, some aspects of this problem have a high degree of uncertainty, and therefore the developed models must be equipped with mechanisms to manage this type of information. As an added value, the system should be sufficiently simple and flexible to be managed by non-expert users, and thus facilitate the transfer of research to industry. To do this, we develop graphical models to represent human behaviour in Smart Spaces, making them more usable in the final application. As a result, human behaviour recognition can help assist people with special needs, such as independent elders, and support remote rehabilitation monitoring, industrial process guidelines, and many other cases.




Semantic Web Enabled Software Engineering


Book Description

Over the last decade, ontology has become an important modeling component in software engineering. Semantic Web Enabled Software Engineering presents some critical findings that open a new direction for Software Engineering research by exploiting Semantic Web technologies. Most of these findings come from selected papers from the Semantic Web Enabled Software Engineering (SWESE) series of workshops, which started in 2005. Edited by two leading researchers, this advanced text presents a unifying and contemporary perspective on the field. The book integrates in one volume a unified perspective on the concepts and theories connecting Software Engineering and the Semantic Web. It presents state-of-the-art techniques for using Semantic Web technologies in Software Engineering and introduces techniques for designing ontologies for Software Engineering.




The Semantic Web in Earth and Space Science. Current Status and Future Directions


Book Description

The geosciences are one of the fields leading the way in advancing semantic technologies. This book continues the dialogue and feedback between the geoscience and semantic web communities. Increasing data volumes within the geosciences make it no longer practical to copy data and perform local analysis. Hypotheses are now being tested through online tools that combine and mine pools of data. This evolution in the way research is conducted is commonly referred to as e-Science. As e-Science has flourished, the barriers to free and open access to data have been lowered and the need for semantics has been heightened. As the volume, complexity, and heterogeneity of data resources grow, geoscientists are creating new capabilities that rely on semantic approaches. Geoscience researchers are actively working toward a research environment of software tools and interfaces to data archives and services, with the goals of full-scale semantic integration beginning to take shape. The members of this emerging semantic e-Science community are increasingly in need of semantic-based methodologies, tools and infrastructure. A feedback system between the geo- and computational sciences is forming. Advances in knowledge modeling, logic-based hypothesis checking, semantic data integration, and knowledge discovery are leading to advances in scientific domains, which in turn are validating semantic approaches and pointing to new research directions. We present mature semantic applications within the geosciences and stimulate discussion on emerging challenges and new research directions.




Semantic Service Integration for Smart Grids


Book Description

The research presented covers the semantic-based integration of data services in smart grids, achieved by following the proposed (S2)In-approach, which was developed in accordance with design science guidelines. This approach identifies standards and specifications, which are integrated to form the basis of the (S2)In-architecture. A process model is introduced at the beginning, which serves as a framework for developing the target architecture. The first step of the process defines requirements for smart grid ICT-architectures, which are derived from established studies and divided into two classes: architecture requirements and non-functional requirements (NFR). Based on the architecture requirements, the following specifications have been selected: the IEC CIM, representing a domain-specific data model; the OPC UA, a communication standard with particular regard to information modeling; and WSMO, as an approach to realizing the concept of Semantic Web Services. The next step develops both a semantic information model (integration of CIM and OPC UA) and semantic services (integration of CIM and WSMO). These two components are then combined to obtain the target architecture, which allows precise descriptions of services as well as their combination and semi-automatic execution. Finally, the NFR are considered in order to evaluate the architecture based on simulated, representative use cases.




Federated Query Processing for the Semantic Web


Book Description

In recent years, the amount of RDF data on the Web, exposed via SPARQL endpoints, has increased exponentially. These SPARQL endpoints allow users to direct SPARQL queries to the RDF data. Federated SPARQL query processing allows several of these RDF databases to be queried as if they were a single one, integrating the results from all of them. This is a key concept in the Web of Data and it is also a hot topic in the community. In addition, the W3C SPARQL-WG has standardized it in the new Recommendation SPARQL 1.1. This book provides a formalisation of the W3C proposed recommendation. This formalisation makes it possible to identify existing errors and to correct them before the implementation phase or when the execution of these federated queries starts. The book constitutes a valuable resource for any implementer since it also proposes solutions to the problems identified, as well as a set of SPARQL pattern reordering rules, which reduce the execution time of federated queries significantly. Another strong point of this book is the research methodology followed. It clearly states the problems in the state of the art, then defines the research hypothesis, and then provides a thorough analysis of the semantics of the SPARQL 1.1 specification. Once the theoretical part is concluded, the book moves on to the implementation, clearly describing the implementation decisions before finally evaluating the overall system.