The Semantic Web in Earth and Space Science. Current Status and Future Directions


Book Description

The geosciences are one of the fields leading the way in advancing semantic technologies. This book continues the dialogue and feedback between the geoscience and semantic web communities. Increasing data volumes within the geosciences make it no longer practical to copy data and perform local analysis. Hypotheses are now being tested through online tools that combine and mine pools of data. This evolution in the way research is conducted is commonly referred to as e-Science. As e-Science has flourished, the barriers to free and open access to data have been lowered and the need for semantics has been heightened. As the volume, complexity, and heterogeneity of data resources grow, geoscientists are creating new capabilities that rely on semantic approaches. Geoscience researchers are actively working toward a research environment of software tools and interfaces to data archives and services, and the goal of full-scale semantic integration is beginning to take shape. The members of this emerging semantic e-Science community are increasingly in need of semantic-based methodologies, tools and infrastructure. A feedback system between the geo- and computational sciences is forming. Advances in knowledge modeling, logic-based hypothesis checking, semantic data integration, and knowledge discovery are leading to advances in scientific domains, which in turn are validating semantic approaches and pointing to new research directions. We present mature semantic applications within the geosciences and stimulate discussion on emerging challenges and new research directions.




Integrating Relational Databases with the Semantic Web


Book Description

An early vision in Computer Science was to create intelligent systems capable of reasoning on large amounts of data. Independent results in the areas of Semantic Web and Relational Databases have advanced us towards this vision. Despite independent advances, the interface between Relational Databases and Semantic Web is poorly understood. This dissertation revisits this early vision with respect to current technology and addresses the following question: How and to what extent can Relational Databases be integrated with the Semantic Web? The thesis is that much of the existing Relational Database infrastructure can be reused to support the Semantic Web. Two problems are studied. Can a Relational Database be automatically virtualized as a Semantic Web data source? The first contribution is an automatic direct mapping from a Relational Database schema and data to RDF and OWL. The second contribution is a method capable of evaluating SPARQL queries against the Relational Database by exploiting two existing relational query optimizations. These contributions are embodied in the Ultrawrap system. Experiments show that SPARQL query execution performance on Ultrawrap is comparable to that of SQL queries written directly for the relational data. Such results have not been previously achieved. Can a Relational Database be mapped to existing Semantic Web ontologies and act as a reasoner? A third contribution is a method for Relational Databases to support inheritance and transitivity by compiling the ontology as mappings, implementing the mappings as views, using SQL recursion and optimizing by materializing views. Ultrawrap is extended with this contribution. Empirical analysis reveals that Relational Databases are able to effectively act as reasoners.
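
As a rough illustration of how SQL recursion can answer a transitive query inside a relational database, the sketch below computes all superclasses of a class with a recursive common table expression in SQLite. It is not taken from Ultrawrap: the table name subclass_of, the function transitive_superclasses, and the sample class names are hypothetical, and the view and materialization steps mentioned in the description are left out.

```python
import sqlite3


def transitive_superclasses(edges, start):
    """Return all classes reachable from `start` over subclass-of edges.

    `edges` is a list of (child, parent) pairs. The table name and this
    helper are illustrative assumptions, not part of any real system.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subclass_of (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO subclass_of VALUES (?, ?)", edges)

    # Recursive CTE: start from the direct edges, then repeatedly join
    # the closure with the base table. UNION (not UNION ALL) removes
    # duplicates, so the recursion also terminates on cyclic input.
    rows = conn.execute(
        """
        WITH RECURSIVE closure(child, parent) AS (
            SELECT child, parent FROM subclass_of
            UNION
            SELECT c.child, s.parent
            FROM closure AS c
            JOIN subclass_of AS s ON c.parent = s.child
        )
        SELECT parent FROM closure WHERE child = ? ORDER BY parent
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    sample = [
        ("GradStudent", "Student"),
        ("Student", "Person"),
        ("Person", "Agent"),
    ]
    print(transitive_superclasses(sample, "GradStudent"))
```

In a setting like the one described, a query of this kind would sit behind a view so that it can be materialized when the closure is read often.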




Probabilistic Semantic Web


Book Description

The management of uncertainty in the Semantic Web is of foremost importance given the nature and origin of the available data. This book presents a probabilistic semantics for knowledge bases, DISPONTE, which is inspired by the distribution semantics of Probabilistic Logic Programming. The book also describes approaches for inference and learning. In particular, it discusses three reasoners and two learning algorithms. BUNDLE and TRILL are able to find explanations for queries and compute their probability with regard to DISPONTE KBs, while TRILLP compactly represents explanations using a Boolean formula and computes the probability of queries. The system EDGE learns the parameters of axioms of DISPONTE KBs. To reduce the computational cost, EDGEMR performs distributed parameter learning. LEAP learns both the structure and parameters of KBs, with LEAPMR using EDGEMR to reduce the computational cost. The algorithms provide effective techniques for dealing with uncertain KBs and have been widely tested on various datasets and compared with state-of-the-art systems.
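
To make the idea of computing a query's probability from its explanations concrete, here is a minimal sketch that assumes each probabilistic axiom is included in a world independently with its own probability, and that the query holds in a world exactly when at least one explanation is fully contained in it. It enumerates all worlds by brute force, which is only feasible for a handful of axioms and is not how the reasoners named above work internally. The function query_probability and the axiom labels are hypothetical.

```python
from itertools import product


def query_probability(axiom_probs, explanations):
    """Sum the probabilities of the worlds in which the query holds.

    axiom_probs: dict mapping an axiom label to its probability.
    explanations: iterable of axiom-label collections; the query holds
    in a world if some explanation is a subset of that world.
    Brute-force enumeration, exponential in the number of axioms.
    """
    axioms = sorted(axiom_probs)
    explanation_sets = [set(e) for e in explanations]
    total = 0.0
    for choices in product([True, False], repeat=len(axioms)):
        # A world is the set of axioms selected by this assignment.
        world = {a for a, chosen in zip(axioms, choices) if chosen}
        if any(e <= world for e in explanation_sets):
            p = 1.0
            for a, chosen in zip(axioms, choices):
                p *= axiom_probs[a] if chosen else 1.0 - axiom_probs[a]
            total += p
    return total


if __name__ == "__main__":
    probs = {"a1": 0.4, "a2": 0.5}
    # Two single-axiom explanations: the query holds if either is present.
    print(query_probability(probs, [["a1"], ["a2"]]))
```

With two independent single-axiom explanations, the result equals one minus the probability that both axioms are absent.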




Exploiting Semantic Web Knowledge Graphs in Data Mining


Book Description

Data Mining and Knowledge Discovery in Databases (KDD) is a research field concerned with deriving higher-level insights from data. The tasks performed in this field are knowledge intensive and can benefit from additional knowledge from various sources, so many approaches have been proposed that combine Semantic Web data with the data mining and knowledge discovery process. This book, Exploiting Semantic Web Knowledge Graphs in Data Mining, aims to show that Semantic Web knowledge graphs are useful for generating valuable data mining features that can be used in various data mining tasks. In Part I, Mining Semantic Web Knowledge Graphs, the author evaluates unsupervised feature generation strategies from types and relations in knowledge graphs used in different data mining tasks such as classification, regression, and outlier detection. Part II, Semantic Web Knowledge Graphs Embeddings, addresses the shortcomings of the approaches in Part I by developing an approach that embeds complete Semantic Web knowledge graphs in a low-dimensional feature space, where each entity and relation in the knowledge graph is represented as a numerical vector. Finally, Part III, Applications of Semantic Web Knowledge Graphs, describes a list of applications that exploit Semantic Web knowledge graphs for tasks such as classification and regression, showing that the approaches developed in Part I and Part II can be used in applications in various domains. The book will be of interest to all those working in the field of data mining and KDD.




Emerging Topics in Semantic Technologies


Book Description

This book includes a selection of thoroughly refereed papers accepted at the Satellite Events of the 17th International Semantic Web Conference, ISWC 2018, held in Monterey, CA in October 2018. The key areas addressed by these events include the core Semantic Web technologies such as knowledge graphs and scalable knowledge base systems, ontology design and modelling, semantic deep learning and statistics. Furthermore, several novel applications of semantic technologies to the topics of Internet of Things (IoT), healthcare, social media and social good are discussed. Finally, important topics at the interface of the Semantic Web technologies and their human users are addressed, including visualization and interaction paradigms for Web Data as well as crowdsourcing applications.




Semantic Data Mining


Book Description

Ontologies are now increasingly used to integrate and organize data and knowledge, particularly in data- and knowledge-intensive applications in both research and industry. The book is devoted to semantic data mining, a data mining approach where domain ontologies are used as background knowledge, and where the new challenge is to mine knowledge encoded in domain ontologies and knowledge graphs, rather than only purely empirical data. The introductory chapters of the book provide theoretical foundations of both data mining and ontology representation. Taking a unified perspective, the book then covers several methods for semantic data mining, addressing tasks such as pattern mining, classification and similarity-based approaches. It attempts to provide state-of-the-art answers to specific challenges and peculiarities of data mining with the use of ontologies, in particular: How to deal with incompleteness of knowledge and the so-called Open World Assumption? What is a truly “semantic” similarity measure? The book contains several chapters with examples of applications of semantic data mining. The examples start from a scenario with moderate use of lightweight ontologies for knowledge graph enrichment and end with a full-fledged scenario of an intelligent knowledge discovery assistant using complex domain ontologies for meta-mining, i.e., an ontology-based meta-learning approach to full data mining processes. The book is intended for researchers in the fields of semantic technologies, knowledge engineering, data science, and data mining, and developers of knowledge-based systems and applications.




Semantic Search for Novel Information


Book Description

In this book, new approaches are presented for detecting and extracting simultaneously relevant and novel information from unstructured text documents. A major contribution of these approaches is that the information already provided and the extracted information are modeled semantically. This leads to the following benefits: (a) ambiguities in the language can be resolved; (b) the exact information needs regarding relevance and novelty can be specified; and (c) knowledge graphs can be incorporated. More specifically, this book presents the following scientific contributions: 1. An assessment of the suitability of existing large knowledge graphs (namely, DBpedia, Freebase, OpenCyc, Wikidata, and YAGO) for the task of detecting novel information in text documents. 2. A description of an approach by which emerging entities that are missing in a knowledge graph are detected in a stream of text documents. 3. A suggestion for an approach to extracting novel, relevant, semantically-structured statements from text documents. The developed approaches are suitable for the recommendation of emerging entities and novel statements respectively, for the purpose of knowledge graph population, and for providing assistance to users requiring novel information, such as journalists and technology scouts.




Semantic Sentiment Analysis in Social Streams


Book Description

Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people’s opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians to support their assessment of public opinions about their policies or political issues. A wide range of approaches to sentiment analysis on social media have recently been built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment. However, these approaches are semantically weak, that is, they do not account for the semantics of words when detecting their sentiment in text. In order to address this problem, the author investigates the role of word semantics in sentiment analysis of microblogs. Specifically, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, the author proposes several approaches in this book for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from words’ co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources). Experiments are conducted with both types of semantics by assessing their impact in three popular sentiment analysis tasks on Twitter: entity-level sentiment analysis, tweet-level sentiment analysis and context-sensitive sentiment lexicon adaptation. The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter. The proposed approaches, which consider word semantics for sentiment analysis at both entity and tweet levels, surpass non-semantic approaches in most evaluation scenarios. This book will be of interest to students, researchers and practitioners in the semantic sentiment analysis field.




Query Processing over Graph-structured Data on the Web


Book Description

In recent years, Linked Data initiatives have encouraged the publication of large graph-structured datasets using the Resource Description Framework (RDF). Due to the constant growth of RDF data on the web, data management infrastructures must become more flexible in order to efficiently and effectively exploit the vast amount of knowledge accessible on the web. This book presents flexible query processing strategies over RDF graphs on the web using the SPARQL query language. In this work, we show how query engines can change plans on-the-fly with adaptive techniques to cope with unpredictable conditions and to reduce execution time. Furthermore, this work investigates the application of crowdsourcing in query processing, where engines are able to contact humans to enhance the quality of query answers. The theoretical and empirical results presented in this book indicate that flexible techniques allow for querying RDF data sources efficiently and effectively.




Semantic and Fuzzy Modelling for Human Behaviour Recognition in Smart Spaces


Book Description

One of the major limitations of Ambient Intelligence systems today is the lack of semantic models of the activities taking place in the environment, which the system needs in order to recognize the specific activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics to learn, infer and monitor human behaviours in Smart Spaces. Moreover, some aspects of this problem have a high degree of uncertainty, and therefore the developed models must be equipped with mechanisms to manage this type of information. As an added value, this system should be sufficiently simple and flexible to be managed by non-expert users, and thus facilitate the transfer of research to industry. To do this, we develop graphical models to represent human behaviour in Smart Spaces, in order to make them more usable in the final application. As a result, human behaviour recognition can help assist people with special needs, such as independent elders, and can support remote rehabilitation monitoring, industrial process guidelines, and many other cases.