Inductive Inference for Large Scale Text Classification


Book Description

Text classification is becoming a crucial task for analysts in many areas. In the last few decades, the production of textual documents in digital form has increased exponentially. Such documents range from web pages to scientific papers, and include emails, news and books. Despite the widespread use of digital texts, handling them is inherently difficult: the large amount of data needed to represent them and the subjectivity of classification complicate matters. This book gives a concise view of how to use kernel approaches for inductive inference in large scale text classification; it presents a series of new techniques to enhance, scale and distribute text classification tasks. It is not intended to be a comprehensive survey of the state of the art across the whole field of text classification. Its purpose is less ambitious and more practical: to explain and illustrate some of the important methods used in this field, in particular kernel approaches and techniques.




Data Science Concepts and Techniques with Applications


Book Description

This book comprehensively covers the topic of data science. Data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. This book synthesizes both fundamental and advanced topics of a research area that has now reached maturity. The chapters of this book are organized into three sections. The first section is an introduction to data science. Starting from the basic concepts, it highlights the types of data, their use, their importance, and the issues normally faced in data analytics, followed by a discussion of the wide range of applications of data science and its widely used techniques. The second section is devoted to the tools and techniques of data science. It covers data pre-processing, feature selection, classification and clustering concepts, as well as an introduction to text mining and opinion mining. Finally, the third section focuses on two programming languages commonly used for data science projects: Python and R. Although this book primarily serves as a textbook, it will also appeal to industrial practitioners and researchers due to its focus on applications and references. The book is suitable for both undergraduate and postgraduate students as well as those carrying out research in data science. It can be used as a textbook for undergraduate students in computer science, engineering and mathematics. It is also accessible to undergraduate students from other areas who have an adequate background. The more advanced chapters can be used by postgraduate researchers who want to gain a deeper theoretical understanding.




Applied Text Mining


Book Description

This textbook covers the concepts, theories, and implementations of text mining and natural language processing (NLP). It covers both the theory and the practical implementation, and every concept is explained with simple and easy-to-understand examples. It consists of three parts. Part 1, which consists of three chapters, provides details about the basic concepts and applications of text mining, including, for example, sentiment analysis and opinion mining. It builds a strong foundation for the reader in order to understand the remaining parts. The five chapters of Part 2 cover all the core concepts of text analytics, such as feature engineering, text classification, text clustering, text summarization, topic mapping, and text visualization. Finally, Part 3 has three chapters covering deep-learning-based text mining, which is the dominant method applied to practically all text mining tasks nowadays. Various deep learning approaches to text mining are covered, including models for processing and parsing text, for lexical analysis, and for machine translation. All three parts include substantial amounts of Python code that show the implementation of the described concepts and approaches. The textbook was specifically written to enable the teaching of both basic and advanced concepts from one single book. The implementation of every text mining task is carefully explained, based on Python as the programming language and spaCy and NLTK as natural language processing libraries. The book is suitable for both undergraduate and graduate students in computer science and engineering.




Advances in Neural Networks – ISNN 2016


Book Description

This book constitutes the refereed proceedings of the 13th International Symposium on Neural Networks, ISNN 2016, held in St. Petersburg, Russia in July 2016. The 84 revised full papers presented in this volume were carefully reviewed and selected from 104 submissions. The papers cover many topics of neural network-related research including signal and image processing; dynamical behaviors of recurrent neural networks; intelligent control; clustering, classification, modeling, and forecasting; evolutionary computation; and cognition computation and spiking neural networks.




Risks and Security of Internet and Systems


Book Description

This book constitutes the proceedings of the 15th International Conference on Risks and Security of Internet and Systems, CRiSIS 2020, which took place during November 4-6, 2020. The conference was originally planned to take place in Paris, France, but had to change to an online format due to the COVID-19 pandemic. The 16 full and 7 short papers included in this volume were carefully reviewed and selected from 44 submissions. In addition, the book contains one invited talk in full paper length. The papers were organized in topical sections named: vulnerabilities, attacks and intrusion detection; TLS, openness and security control; access control, risk assessment and security knowledge; risk analysis, neural networks and Web protection; infrastructure security and malware detection.




Innovations in Smart Cities Applications Edition 2


Book Description

This book highlights cutting-edge research presented at the third installment of the International Conference on Smart City Applications (SCA2018), held in Tétouan, Morocco on October 10–11, 2018. It presents original research results, new ideas, and practical lessons learned that touch on all aspects of smart city applications. The respective papers share new and highly original results by leading experts on IoT, Big Data, and Cloud technologies, and address a broad range of key challenges in smart cities, including Smart Education and Intelligent Learning Systems, Smart Healthcare, Smart Building and Home Automation, Smart Environment and Smart Agriculture, Smart Economy and Digital Business, and Information Technologies and Computer Science, among others. In addition, various novel proposals regarding smart cities are discussed. Gathering peer-reviewed chapters written by prominent researchers from around the globe, the book offers an invaluable instructional and research tool for courses on computer and urban sciences; students and practitioners in computer science, information science, technology studies and urban management studies will find it particularly useful. Further, the book is an excellent reference guide for professionals and researchers working in mobility, education, governance, energy, the environment and computer sciences.




Heuristic and Knowledge-Based Security Checks of Source Code Artifacts Using Community Knowledge


Book Description

The goal of this dissertation is to support developers in applying security checks using community knowledge. Artificial intelligence approaches combined with natural language processing techniques are employed to identify security-related information from community websites such as Stack Overflow or GitHub. All security-related information is stored in a security knowledge base. This knowledge base provides code fragments that represent the community's knowledge about vulnerabilities, security patches, and exploits. Comprehensive knowledge is required to carry out security checks on software artifacts, such as data covering known vulnerabilities and their manifestation in the source code as well as possible attack strategies. Approaches that check software libraries and source code fragments are provided for the automated use of the data. Insecure software libraries can be detected using the NVD combined with metadata and library file hash approaches introduced in this dissertation. Vulnerable source code fragments can be identified using community knowledge represented by code fragments extracted from the largest coding community websites: Stack Overflow and GitHub. A state-of-the-art clone detection approach is modified and enriched by several heuristics to enable vulnerability detection and leverage community knowledge while maintaining good performance. Using various case studies, the approaches implemented in Eclipse plugins and a JIRA plugin are adapted to the users' needs and evaluated.




Advances in Intelligent Information Systems


Book Description

Intelligent Information Systems (IIS) can be defined as the next generation of Information Systems (IS), developed as a result of the integration of AI and database (DB) technologies. IIS embody knowledge that allows them to exhibit intelligent behavior and to cooperate with users and other systems in problem solving, discovery, retrieval, and manipulation of data and knowledge. For any IIS to serve its purpose, the information must be available when it is needed. This means that the computing systems used to store data and process the information, and the security controls used to protect it, must be functioning correctly. This book covers some of the above topics and is divided into four sections: Classification, Approximation and Data Security, Knowledge Management, and Application of IIS to medical and music domains.




Advances in Machine Learning I


Book Description

Professor Richard S. Michalski passed away on September 20, 2007. Once we learned about his untimely death we immediately realized that we would no longer have with us a truly exceptional scholar and researcher who for several decades had been influencing the work of numerous scientists all over the world - not only in his area of expertise, notably machine learning, but also in the broadly understood areas of data analysis, data mining, knowledge discovery and many others. In fact, his influence was even much broader due to his creative vision, integrity, scientific excellence and exceptionally wide intellectual horizons which extended to history, political science and arts. Professor Michalski’s death was a particularly deep loss to the whole Polish scientific community and the Polish Academy of Sciences in particular. After graduation, he began his research career at the Institute of Automatic Control, Polish Academy of Science in Warsaw. In 1970 he left his native country and held various prestigious positions at top US universities. His research gained impetus and he soon established himself as a world authority in his areas of interest – notably, he was widely considered a father of machine learning.




Multi-Objective Swarm Intelligent Systems


Book Description

This book covers the latest in multi-objective swarm intelligence and cooperative behavior. It contains innovative and intriguing applications as well as additions to the methodology and theory of genetic programming.