Practical Weak Supervision


Book Description

Most data scientists and engineers today rely on quality labeled data to train machine learning models. But building a training set manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Wee Hyong Tok, Amit Bahree, and Senja Filipi show you how to create products using weakly supervised learning models. You'll learn how to build natural language processing and computer vision projects using weakly labeled datasets from Snorkel, a spin-off from the Stanford AI Lab. Because so many companies have pursued ML projects that never go beyond their labs, this book also provides a guide on how to ship the deep learning models you build.
- Get up to speed on the field of weak supervision, including ways to use it as part of the data science process
- Use Snorkel AI for weak supervision and data programming
- Get code examples for using Snorkel to label text and image datasets
- Use a weakly labeled dataset for text and image classification
- Learn practical considerations for using Snorkel with large datasets and using Spark clusters to scale labeling




Machine Learning and Data Science Blueprints for Finance


Book Description

Over the next few decades, machine learning and data science will transform the finance industry. With this practical book, analysts, traders, researchers, and developers will learn how to build machine learning algorithms crucial to the industry. You’ll examine ML concepts and over 20 case studies in supervised, unsupervised, and reinforcement learning, along with natural language processing (NLP). Ideal for professionals working at hedge funds, investment and retail banks, and fintech firms, this book also delves deep into portfolio management, algorithmic trading, derivative pricing, fraud detection, asset price prediction, sentiment analysis, and chatbot development. You’ll explore real-life problems faced by practitioners and learn scientifically sound solutions supported by code and examples. This book covers:
- Supervised learning regression-based models for trading strategies, derivative pricing, and portfolio management
- Supervised learning classification-based models for credit default risk prediction, fraud detection, and trading strategies
- Dimensionality reduction techniques with case studies in portfolio management, trading strategy, and yield curve construction
- Algorithms and clustering techniques for finding similar objects, with case studies in trading strategies and portfolio management
- Reinforcement learning models and techniques used for building trading strategies, derivatives hedging, and portfolio management
- NLP techniques using Python libraries such as NLTK and scikit-learn for transforming text into meaningful representations




Driven by Data


Book Description

Offers a practical guide for improving schools dramatically that will enable all students from all backgrounds to achieve at high levels. Includes assessment forms, an index, and a DVD.




Practical Natural Language Processing


Book Description

Many books and courses tackle natural language processing (NLP) problems with toy use cases and well-defined datasets. But if you want to build, iterate, and scale NLP systems in a business setting and tailor them for particular industry verticals, this is your guide. Software engineers and data scientists will learn how to navigate the maze of options available at each step of the journey. Through the course of the book, authors Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana will guide you through the process of building real-world NLP solutions embedded in larger product setups. You’ll learn how to adapt your solutions for different industry verticals such as healthcare, social media, and retail. With this book, you’ll:
- Understand the wide spectrum of problem statements, tasks, and solution approaches within NLP
- Implement and evaluate different NLP applications using machine learning and deep learning methods
- Fine-tune your NLP solution based on your business problem and industry vertical
- Evaluate various algorithms and approaches for NLP product tasks, datasets, and stages
- Produce software solutions following best practices around release, deployment, and DevOps for NLP systems
- Understand best practices, opportunities, and the roadmap for NLP from a business and product leader’s perspective




Machine Learning from Weak Supervision


Book Description

Fundamental theory and practical algorithms of weakly supervised classification, emphasizing an approach based on empirical risk minimization. Standard machine learning techniques require large amounts of labeled data to work well. When we apply machine learning to problems in the physical world, however, it is extremely difficult to collect such quantities of labeled data. In this book Masashi Sugiyama, Han Bao, Takashi Ishida, Nan Lu, Tomoya Sakai and Gang Niu present theory and algorithms for weakly supervised learning, a paradigm of machine learning from weakly labeled data. Emphasizing an approach based on empirical risk minimization and drawing on state-of-the-art research in weakly supervised learning, the book provides both the fundamentals of the field and the advanced mathematical theories underlying them. It can be used as a reference for practitioners and researchers and in the classroom. The book first mathematically formulates classification problems, defines common notations, and reviews various algorithms for supervised binary and multiclass classification. It then explores problems of binary weakly supervised classification, including positive-unlabeled (PU) classification, positive-negative-unlabeled (PNU) classification, and unlabeled-unlabeled (UU) classification. It then turns to multiclass classification, discussing complementary-label (CL) classification and partial-label (PL) classification. Finally, the book addresses more advanced issues, including a family of correction methods to improve the generalization performance of weakly supervised learning and the problem of class-prior estimation.




Data Mining and Data Warehousing


Book Description

Written in lucid language, this valuable textbook brings together fundamental concepts of data mining and data warehousing in a single volume. Important topics including information theory, decision trees, the Naïve Bayes classifier, distance metrics, partitioning clustering, association mining, data marts and operational data stores are discussed comprehensively. The textbook is written to cater to the needs of undergraduate students of computer science, engineering and information technology for a course on data mining and data warehousing. The text simplifies the understanding of the concepts through exercises and practical examples. Topics such as classification, association mining and cluster analysis are discussed in detail with their practical implementation using the Weka and R data mining tools. Advanced topics including big data analytics, relational data models and NoSQL are discussed in detail. Pedagogical features including unsolved problems and multiple-choice questions are interspersed throughout the book for better understanding.




Information and Communications Security


Book Description

This two-volume set LNCS 12918-12919 constitutes the refereed proceedings of the 23rd International Conference on Information and Communications Security, ICICS 2021, held in Chongqing, China, in September 2021. The 49 revised full papers presented in the book were carefully selected from 182 submissions. The papers in Part I are organized in the following thematic blocks: blockchain and federated learning; malware analysis and detection; IoT security; software security; Internet security; data-driven cybersecurity.




Deep Learning For Physics Research


Book Description

A core principle of physics is knowledge gained from data. Thus, deep learning has instantly entered physics and may become a new paradigm in basic and applied research. This textbook addresses physics students and physicists who want to understand what deep learning actually means and what potential it holds for their own scientific projects. Being familiar with linear algebra and parameter optimization is sufficient to jump-start deep learning. Adopting a pragmatic approach, basic and advanced applications in physics research are described. Also offered are simple hands-on exercises for implementing deep networks, for which Python code and training data can be downloaded.




Fundamentals and Methods of Machine and Deep Learning


Book Description

The book provides a practical approach by explaining the concepts of machine learning and deep learning algorithms, evaluation of methodology advances, and algorithm demonstrations with applications. Over the past two decades, the field of machine learning and its subfield deep learning have played a major role in software applications development. Also, in recent research studies, they are regarded as disruptive technologies that will transform our future life, business, and the global economy. The recent explosion of digital data in a wide variety of domains, including science, engineering, Internet of Things, biomedical, healthcare, and many business sectors, has ushered in the era of big data, which cannot be analysed by classical statistics but only by more modern, robust machine learning and deep learning techniques. Since machine learning learns from data rather than by programming hard-coded decision rules, an attempt is being made to use machine learning to make computers that are able to solve problems like human experts in the field. The goal of this book is to present a practical approach by explaining the concepts of machine learning and deep learning algorithms with applications. Supervised machine learning algorithms, ensemble machine learning algorithms, feature selection, deep learning techniques, and their applications are discussed. Also included in the eighteen chapters is unique information which provides a clear understanding of concepts by using algorithms and case studies illustrated with applications of machine learning and deep learning in different domains, including disease prediction, software defect prediction, online television analysis, medical image processing, etc. Each of the chapters briefly described below provides both a chosen approach and its implementation.
Audience: Researchers and engineers in artificial intelligence, computer scientists, as well as software developers.