Machine Learning Methods with Noisy, Incomplete or Small Datasets


Book Description

Over the past years, businesses have had to tackle issues caused by numerous forces in the political, technological and societal environment. Changes in the global market and increasing uncertainty require us to focus on disruptive innovations and to investigate this phenomenon from different perspectives. The benefits of innovation include lower costs, improved efficiency, reduced risk, and a better response to customers’ needs through new products, services or processes. On the other hand, new business models expose firms to various risks, such as cyber, operational and regulatory risks. Therefore, we believe that the entrepreneurial behavior and global mindset of decision-makers contribute significantly to the development of innovations, which in turn help close the prevailing gap between developed and developing countries. Thus, this Special Issue contributes to closing the research gap in the literature by providing a platform for a scientific debate on innovation, internationalization and entrepreneurship, which would help improve the resilience of businesses to future disruptions.




Machine Learning Methods with Noisy, Incomplete or Small Datasets


Book Description

In many machine learning applications, the available datasets are incomplete, noisy or affected by artifacts. In supervised scenarios, label information may be of low quality, with problems such as unbalanced training sets and noisy labels. Moreover, in practice, the available data samples are very often too few to derive useful supervised or unsupervised classifiers. All these issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, aiming to disseminate new ideas for solving this challenging problem and to provide clear examples of application in real scenarios.




Machine Learning with Noisy Labels


Book Description

Most modern machine learning models, based on deep learning techniques, depend on carefully curated and cleanly labelled training sets to be reliably trained and deployed. However, the expensive labelling process involved in acquiring such training sets limits the number and size of datasets available to build new models, slowing down progress in the field. By contrast, many poorly curated training sets containing noisy labels are readily available for building new models. The successful use of such noisy-label training sets depends on the development of algorithms and models that are robust to these noisy labels. Machine Learning with Noisy Labels: Definitions, Theory, Techniques and Solutions defines different types of label noise, introduces the theory behind the problem, presents the main techniques that enable the effective use of noisy-label training sets, and explains the most accurate methods developed in the field. This book is an ideal introduction to machine learning with noisy labels, suitable for senior undergraduates, postgraduate students, researchers and practitioners using, and researching into, machine learning methods.
Shows how to design and reproduce regression, classification and segmentation models using large-scale noisy-label training sets.
Gives an understanding of the theory of, and motivation for, noisy-label learning.
Shows how to classify noisy-label learning methods into a set of core techniques.




Machine Learning and Principles and Practice of Knowledge Discovery in Databases


Book Description

This volume constitutes the papers of several workshops held in conjunction with the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2022), which took place in Grenoble, France, during September 19–23, 2022. The 73 revised full papers and 6 short papers presented in this book were carefully reviewed and selected from 143 submissions. ECML PKDD 2022 presents the following workshops:
Workshop on Data Science for Social Good (SoGood 2022)
Workshop on New Frontiers in Mining Complex Patterns (NFMCP 2022)
Workshop on Explainable Knowledge Discovery in Data Mining (XKDD 2022)
Workshop on Uplift Modeling (UMOD 2022)
Workshop on IoT, Edge and Mobile for Embedded Machine Learning (ITEM 2022)
Workshop on Mining Data for Financial Application (MIDAS 2022)
Workshop on Machine Learning for Cybersecurity (MLCS 2022)
Workshop on Machine Learning for Buildings Energy Management (MLBEM 2022)
Workshop on Machine Learning for Pharma and Healthcare Applications (PharML 2022)
Workshop on Data Analysis in Life Science (DALS 2022)
Workshop on IoT Streams for Predictive Maintenance (IoT-PdM 2022)




Futuristic Trends for Sustainable Development and Sustainable Ecosystems


Book Description

A key focus in recent years has been on sustainable development and promoting environmentally conscious practices. In today’s rapidly evolving technological world, it is important to consider how technology can be applied to solve problems across disciplines and fields in these areas. Further study is needed in order to understand how technology can be applied to sustainability and the best practices, considerations, and challenges that follow. Futuristic Trends for Sustainable Development and Sustainable Ecosystems discusses recent advances and innovative research in the area of information and communication technology for sustainable development and covers practices in several artificial intelligence fields such as knowledge representation and reasoning, natural language processing, machine learning, and the semantic web. Covering topics such as blockchain, deep learning, and renewable energy, this reference work is ideal for computer scientists, industry professionals, researchers, academicians, scholars, instructors, and students.




In-Memory Computing Hardware Accelerators for Data-Intensive Applications


Book Description

This book describes the state of the art in technology and research on in-memory computing hardware accelerators for data-intensive applications. The authors discuss how processing-centric computing has become insufficient to meet target requirements and how memory-centric computing may be better suited to the needs of current applications. This shows readers how current and emerging memory technologies are causing a shift in the computing paradigm. The authors provide in-depth discussions of volatile and non-volatile memory technologies, covering their basic memory cell structures, operations, different computational memory designs and the challenges associated with them. Specific case studies and potential applications are provided, along with their current status and commercial availability.







Learning from Imperfect Data: Noisy Labels, Truncation, and Coarsening


Book Description

The datasets used in machine learning and statistics are huge and often imperfect, e.g., they contain corrupted data, examples with wrong labels, or hidden biases. Most existing approaches (i) produce unreliable results when the datasets are corrupted, (ii) are computationally inefficient, or (iii) come without any theoretical/provable performance guarantees. In this thesis, we design learning algorithms that are computationally efficient and at the same time provably reliable, even when used on imperfect datasets. We first focus on supervised learning settings with noisy labels. We present efficient and optimal learners under the semi-random noise models of Massart and Tsybakov (where the true label of each example is flipped with probability at most 50%) and an efficient approximate learner under adversarial label noise (where a small but arbitrary fraction of labels is flipped) under structured feature distributions. Apart from classification, we extend our results to noisy label-ranking. In truncated statistics, the learner does not observe a representative set of samples from the whole population, but only truncated samples, i.e., samples from a potentially small subset of the support of the population distribution. We give the first efficient algorithms for learning Gaussian distributions with unknown truncation sets and initiate the study of non-parametric truncated statistics. Closely related to truncation is data coarsening, where instead of observing the class of an example, the learner receives a set of potential classes, one of which is guaranteed to be the correct class. We initiate the theoretical study of the problem, and present the first efficient learning algorithms for learning from coarse data.




Innovations in Electronics and Communication Engineering


Book Description

This book covers various streams of communication engineering, such as signal processing, VLSI design, embedded systems, wireless communications, and electronics and communications in general. The book is a collection of selected research papers presented at the 9th International Conference on Innovations in Electronics and Communication Engineering at Guru Nanak Institutions, Hyderabad, India. It presents work from researchers, technocrats and experts on the latest technologies in electronics and communication engineering. The authors discuss the latest cutting-edge technology, and the book will serve as a reference for young researchers.




Graph Learning for Brain Imaging


Book Description