125 Problems in Text Algorithms


Book Description

String matching is one of the oldest algorithmic techniques, yet still one of the most pervasive in computer science. The past 20 years have seen technological leaps in applications as diverse as information retrieval and compression. This copiously illustrated collection of puzzles and exercises in key areas of text algorithms and combinatorics on words offers graduate students and researchers a pleasant and direct way to learn and practice with advanced concepts. The problems are drawn from a large range of scientific publications, both classic and new. Building up from the basics, the book goes on to showcase problems in combinatorics on words (including Fibonacci or Thue-Morse words), pattern matching (including Knuth-Morris-Pratt and Boyer-Moore like algorithms), efficient text data structures (including suffix trees and suffix arrays), regularities in words (including periods and runs) and text compression (including Huffman, Lempel-Ziv and Burrows-Wheeler based methods).




Text Algorithms


Book Description

This much-needed book on the design of algorithms and data structures for text processing emphasizes both theoretical foundations and practical applications. It is intended to serve both as a textbook for courses on algorithm design, especially those related to text processing, and as a reference for computer science professionals. The work takes a unique approach, one that goes more deeply into its topic than other more general books. It contains both classical algorithms and recent results of research on the subject. The book is the first text to contain a collection of a wide range of text algorithms, many of them quite new and appearing here for the first time. Other algorithms, while known by reputation, have never been published in the journal literature. Two such important algorithms are those of Karp, Miller and Rosenberg, and that of Weiner. Here they are presented together for the fist time. The core of the book is the material on suffix trees and subword graphs, applications of these data structures, new approaches to time-space optimal string-matching, and text compression. Also covered are basic parallel algorithms for text problems. Applications of all these algorithms are given for problems involving data retrieval systems, treatment of natural languages, investigation of genomes, data compression software, and text processing tools. From the theoretical point of view. the book is a goldmine of paradigms for the development of efficient algorithms, providing the necessary foundation to creating practical software dealing with sequences. A crucial point in the authors' approach is the development of a methodology for presenting text algorithms so they can be fully understood. Throughout, the book emphasizes the efficiency of algorithms, holding that the essence of their usefulness depends on it. This is especially important since the algorithms described here will find application in "Big Science" areas like molecular sequence analysis where the explosive growth of data has caused problems for the current generation of software. Finally, with its development of theoretical background, the book can be considered as a mathematical foundation for the analysis and production of text processing algorithms.




200 Problems on Languages, Automata, and Computation


Book Description

Formal languages and automata have long been fundamental to theoretical computer science, but students often struggle to understand these concepts in the abstract. This book provides a rich source of compelling exercises designed to help students grasp the subject intuitively through practice. The text covers important topics such as finite automata, regular expressions, push-down automata, grammars, and Turing machines via a series of problems of increasing difficultly. Problems are organised by topic, many with multiple follow-ups, and each section begins with a short recap of the basic notions necessary to make progress. Complete solutions are given for all exercises, making the book well suited for self-study as well as for use as a course supplement. Developed over the course of the editors' two decades of experience teaching the acclaimed Automata, Formal Languages, and Computation course at the University of Warsaw, it is an ideal resource for students and instructors alike.




Jewels Of Stringology: Text Algorithms


Book Description

The term “stringology” is a popular nickname for text algorithms, or algorithms on strings. This book deals with the most basic algorithms in the area. Most of them can be viewed as “algorithmic jewels” and deserve reader-friendly presentation. One of the main aims of the book is to present several of the most celebrated algorithms in a simple way by omitting obscuring details and separating algorithmic structure from combinatorial theoretical background. The book reflects the relationships between applications of text-algorithmic techniques and the classification of algorithms according to the measures of complexity considered. The text can be viewed as a parade of algorithms in which the main purpose is to discuss the foundations of the algorithms and their interconnections. One can partition the algorithmic problems discussed into practical and theoretical problems. Certainly, string matching and data compression are in the former class, while most problems related to symmetries and repetitions in texts are in the latter. However, all the problems are interesting from an algorithmic point of view and enable the reader to appreciate the importance of combinatorics on words as a tool in the design of efficient text algorithms.In most textbooks on algorithms and data structures, the presentation of efficient algorithms on words is quite short as compared to issues in graph theory, sorting, searching, and some other areas. At the same time, there are many presentations of interesting algorithms on words accessible only in journals and in a form directed mainly at specialists. This book fills the gap in the book literature on algorithms on words, and brings together the many results presently dispersed in the masses of journal articles. The presentation is reader-friendly; many examples and about two hundred figures illustrate nicely the behaviour of otherwise very complex algorithms.




Dimension Groups and Dynamical Systems


Book Description

This book is the first self-contained exposition of the fascinating link between dynamical systems and dimension groups. The authors explore the rich interplay between topological properties of dynamical systems and the algebraic structures associated with them, with an emphasis on symbolic systems, particularly substitution systems. It is recommended for anybody with an interest in topological and symbolic dynamics, automata theory or combinatorics on words. Intended to serve as an introduction for graduate students and other newcomers to the field as well as a reference for established researchers, the book includes a thorough account of the background notions as well as detailed exposition – with full proofs – of the major results of the subject. A wealth of examples and exercises, with solutions, serve to build intuition, while the many open problems collected at the end provide jumping-off points for future research.




Algorithms from THE BOOK


Book Description

Algorithms are a dominant force in modern culture, and every indication is that they will become more pervasive, not less. The best algorithms are undergirded by beautiful mathematics. This text cuts across discipline boundaries to highlight some of the most famous and successful algorithms. Readers are exposed to the principles behind these examples and guided in assembling complex algorithms from simpler building blocks. Written in clear, instructive language within the constraints of mathematical rigor, Algorithms from THE BOOK includes a large number of classroom-tested exercises at the end of each chapter. The appendices cover background material often omitted from undergraduate courses. Most of the algorithm descriptions are accompanied by Julia code, an ideal language for scientific computing. This code is immediately available for experimentation. Algorithms from THE BOOK is aimed at first-year graduate and advanced undergraduate students. It will also serve as a convenient reference for professionals throughout the mathematical sciences, physical sciences, engineering, and the quantitative sectors of the biological and social sciences.




Understanding Machine Learning


Book Description

Introduces machine learning and its algorithmic paradigms, explaining the principles behind automated learning approaches and the considerations underlying their usage.




Combinatorial Algorithms on Words


Book Description

Combinatorial Algorithms on Words refers to the collection of manipulations of strings of symbols (words) - not necessarily from a finite alphabet - that exploit the combinatorial properties of the logical/physical input arrangement to achieve efficient computational performances. The model of computation may be any of the established serial paradigms (e.g. RAM's, Turing Machines), or one of the emerging parallel models (e.g. PRAM ,WRAM, Systolic Arrays, CCC). This book focuses on some of the accomplishments of recent years in such disparate areas as pattern matching, data compression, free groups, coding theory, parallel and VLSI computation, and symbolic dynamics; these share a common flavor, yet ltave not been examined together in the past. In addition to being theoretically interest ing, these studies have had significant applications. It happens that these works have all too frequently been carried out in isolation, with contributions addressing similar issues scattered throughout a rather diverse body of literature. We felt that it would be advantageous to both current and future researchers to collect this work in a sin gle reference. It should be clear that the book's emphasis is on aspects of combinatorics and com plexity rather than logic, foundations, and decidability. In view of the large body of research and the degree of unity already achieved by studies in the theory of auto mata and formal languages, we have allocated very little space to them.




Algorithms


Book Description

This text, extensively class-tested over a decade at UC Berkeley and UC San Diego, explains the fundamentals of algorithms in a story line that makes the material enjoyable and easy to digest. Emphasis is placed on understanding the crisp mathematical idea behind each algorithm, in a manner that is intuitive and rigorous without being unduly formal. Features include:The use of boxes to strengthen the narrative: pieces that provide historical context, descriptions of how the algorithms are used in practice, and excursions for the mathematically sophisticated. Carefully chosen advanced topics that can be skipped in a standard one-semester course but can be covered in an advanced algorithms course or in a more leisurely two-semester sequence.An accessible treatment of linear programming introduces students to one of the greatest achievements in algorithms. An optional chapter on the quantum algorithm for factoring provides a unique peephole into this exciting topic. In addition to the text DasGupta also offers a Solutions Manual which is available on the Online Learning Center."Algorithms is an outstanding undergraduate text equally informed by the historical roots and contemporary applications of its subject. Like a captivating novel it is a joy to read." Tim Roughgarden Stanford University




Information Theory, Inference and Learning Algorithms


Book Description

Information theory and inference, taught together in this exciting textbook, lie at the heart of many important areas of modern technology - communication, signal processing, data mining, machine learning, pattern recognition, computational neuroscience, bioinformatics and cryptography. The book introduces theory in tandem with applications. Information theory is taught alongside practical communication systems such as arithmetic coding for data compression and sparse-graph codes for error-correction. Inference techniques, including message-passing algorithms, Monte Carlo methods and variational approximations, are developed alongside applications to clustering, convolutional codes, independent component analysis, and neural networks. Uniquely, the book covers state-of-the-art error-correcting codes, including low-density-parity-check codes, turbo codes, and digital fountain codes - the twenty-first-century standards for satellite communications, disk drives, and data broadcast. Richly illustrated, filled with worked examples and over 400 exercises, some with detailed solutions, the book is ideal for self-learning, and for undergraduate or graduate courses. It also provides an unparalleled entry point for professionals in areas as diverse as computational biology, financial engineering and machine learning.