Using Additional Information in Streaming Algorithms


Book Description

Streaming problems are algorithmic problems characterized mainly by their massive input streams. Because the length of the input stream generally exceeds the available storage, algorithms for these problems are forced to be space-efficient. The goal of this study is to analyze the impact of additional information (more specifically, a hypothesis of the solution) on the space complexity of algorithms for several streaming problems. To this end, different streaming problems are analyzed and compared. Two problems, “most frequent item” and “number of distinct items”, are studied in depth across many configurations of result accuracies and probabilities. Both lower and upper bounds on the space and time complexity in deterministic and probabilistic settings are analyzed with respect to possible improvements due to additional information. The general solution search problem is compared to the decision problem where a solution hypothesis has to be satisfied.




Data Streams


Book Description

In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time, and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Researchers in Theoretical Computer Science, Databases, IP Networking and Computer Systems are working on the data stream challenges.




Machine Learning for Data Streams


Book Description

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework. Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations. The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.




Algorithms—Advances in Research and Application: 2013 Edition


Book Description

Algorithms—Advances in Research and Application: 2013 Edition is a ScholarlyEditions™ book that delivers timely, authoritative, and comprehensive information about Coloring Algorithm. The editors have built Algorithms—Advances in Research and Application: 2013 Edition on the vast information databases of ScholarlyNews.™ You can expect the information about Coloring Algorithm in this book to be deeper than what you can access anywhere else, as well as consistently reliable, authoritative, informed, and relevant. The content of Algorithms—Advances in Research and Application: 2013 Edition has been produced by the world’s leading scientists, engineers, analysts, research institutions, and companies. All of the content is from peer-reviewed sources, and all of it is written, assembled, and edited by the editors at ScholarlyEditions™ and available exclusively from us. You now have a source you can cite with authority, confidence, and credibility. More information is available at http://www.ScholarlyEditions.com/.




Data Streams


Book Description

This book primarily discusses issues related to the mining aspects of data streams and it is unique in its primary focus on the subject. This volume covers mining aspects of data streams comprehensively: each contributed chapter contains a survey on the topic, the key ideas in the field for that particular topic, and future research directions. The book is intended for a professional audience composed of researchers and practitioners in industry. This book is also appropriate for advanced-level students in computer science.




Modeling and Using Context


Book Description

Here are the refereed proceedings of the 6th International and Interdisciplinary Conference on Modeling and Using Context. The 42 papers deal with the interdisciplinary topic of modeling and using context from various perspectives, including computer science, artificial intelligence, cognitive science, linguistics, organizational science, philosophy, and psychology. In addition, readers discover applications in areas such as medicine and law.




The Creativity Code


Book Description

“A brilliant travel guide to the coming world of AI.” —Jeanette Winterson

What does it mean to be creative? Can creativity be trained? Is it uniquely human, or could AI be considered creative? Mathematical genius and exuberant polymath Marcus du Sautoy plunges us into the world of artificial intelligence and algorithmic learning in this essential guide to the future of creativity. He considers the role of pattern and imitation in the creative process and sets out to investigate the programs and programmers—from DeepMind and the Flow Machine to Botnik and WHIM—who are seeking to rival or surpass human innovation in gaming, music, art, and language. A thrilling tour of the landscape of invention, The Creativity Code explores the new face of creativity and the mysteries of the human code.

“As machines outsmart us in ever more domains, we can at least comfort ourselves that one area will remain sacrosanct and uncomputable: human creativity. Or can we?...In his fascinating exploration of the nature of creativity, Marcus du Sautoy questions many of those assumptions.” —Financial Times

“Fascinating...If all the experiences, hopes, dreams, visions, lusts, loves, and hatreds that shape the human imagination amount to nothing more than a ‘code,’ then sooner or later a machine will crack it. Indeed, du Sautoy assembles an eclectic array of evidence to show how that’s happening even now.” —The Times




Stream Processing with Apache Spark


Book Description

Before you can build analytics tools to gain quick insights, you first need to know how to process data in real time. With this practical guide, developers familiar with Apache Spark will learn how to put this in-memory framework to use for streaming data. You’ll discover how Spark enables you to write streaming jobs in almost the same way you write batch jobs. Authors Gerard Maas and François Garillot help you explore the theoretical underpinnings of Apache Spark. This comprehensive guide features two sections that compare and contrast the streaming APIs Spark now supports: the original Spark Streaming library and the newer Structured Streaming API. Learn fundamental stream processing concepts and examine different streaming architectures. Explore Structured Streaming through practical examples; learn different aspects of stream processing in detail. Create and operate streaming jobs and applications with Spark Streaming; integrate Spark Streaming with other Spark APIs. Learn advanced Spark Streaming techniques, including approximation algorithms and machine learning algorithms. Compare Apache Spark to other stream processing projects, including Apache Storm, Apache Flink, and Apache Kafka Streams.




Bio-inspired Algorithms for Data Streaming and Visualization, Big Data Management, and Fog Computing


Book Description

This book aims to provide insights into recently developed bio-inspired algorithms in the emerging areas of fog computing, sentiment analysis, and data streaming, and to offer a more comprehensive approach to big data management, from pre-processing through analytics to visualization. The subject area of this book is within the realm of computer science, notably algorithms (meta-heuristic and, more particularly, bio-inspired algorithms). Although application domains of these new algorithms may be mentioned, the focus of this book is not the application of algorithms to specific or general domains but an update on recent research trends for bio-inspired algorithms within a specific application domain or emerging area. These areas include data streaming, fog computing, and phases of big data management. One of the reasons for writing this book is that the bio-inspired approach does not receive much attention but shows considerable promise and diversity in how it approaches many issues in big data and streaming. Novel aspects of this book include the application of these algorithms to all phases of data management (not just a particular phase such as data mining or business intelligence, as in many books) and the experimental demonstration, within a chapter, of a selected algorithm’s effectiveness against comparative algorithms. Another novel aspect is a brief overview and evaluation of traditional algorithms, both sequential and parallel, for use in data mining. This overview complements a further chapter on bio-inspired algorithms for data mining to enable readers to make a more suitable choice of algorithm for data mining within a particular context. In all chapters, references for further reading are provided, and in selected chapters, the authors also include ideas for future research.




Knowledge Discovery from Data Streams


Book Description

Since the beginning of the Internet age and the increased use of ubiquitous computing devices, the large volume and continuous flow of distributed data have imposed new constraints on the design of learning algorithms. Exploring how to extract knowledge structures from evolving and time-changing data, Knowledge Discovery from Data Streams presents