Guide to Reliable Distributed Systems


Book Description

This book describes the key concepts, principles and implementation options for creating high-assurance cloud computing solutions. The guide starts with a broad technical overview and basic introduction to cloud computing, looking at the overall architecture of the cloud, client systems, the modern Internet and cloud computing data centers. It then delves into the core challenges of showing how reliability and fault-tolerance can be abstracted, how the resulting questions can be solved, and how the solutions can be leveraged to create a wide range of practical cloud applications. The author’s style is practical, and the guide should be readily understandable without any special background. Concrete examples are often drawn from real-world settings to illustrate key insights. Appendices show how the most important reliability models can be formalized, describe the API of the Isis2 platform, and offer more than 80 problems at varying levels of difficulty.







Distributed Services with Go


Book Description

You know the basics of Go and are eager to put your knowledge to work. This book is just what you need to apply Go to real-world situations. You'll build a distributed service that's highly available, resilient, and scalable. Along the way you'll master the techniques, tools, and tricks that skilled Go programmers use every day to build quality applications. Level up your Go skills today. Take your Go skills to the next level by learning how to design, develop, and deploy a distributed service. Start from the bare essentials of storage handling, then work your way through networking a client and server, and finally to distributing server instances, deployment, and testing. All this will make coding in your day job or side projects easier, faster, and more fun. Lay out your applications and libraries to be modular and easy to maintain. Build networked, secure clients and servers with gRPC. Monitor your applications with metrics, logs, and traces to make them debuggable and reliable. Test and benchmark your applications to ensure they're correct and fast. Build your own distributed services with service discovery and consensus. Write CLIs to configure your applications. Deploy applications to the cloud with Kubernetes and manage them with your own Kubernetes Operator. Dive into writing Go and join the hundreds of thousands who are using it to build software for the real world. What You Need: Go 1.11 and Kubernetes 1.12.




Designing Distributed Systems


Book Description

Without established design patterns to guide them, developers have had to build distributed systems from scratch, and most of these systems are very unique indeed. Today, the increasing use of containers has paved the way for core distributed system patterns and reusable containerized components. This practical guide presents a collection of repeatable, generic patterns to help make the development of reliable distributed systems far more approachable and efficient. Author Brendan Burns—Director of Engineering at Microsoft Azure—demonstrates how you can adapt existing software design patterns for designing and building reliable distributed applications. Systems engineers and application developers will learn how these long-established patterns provide a common language and framework for dramatically increasing the quality of your system. Understand how patterns and reusable components enable the rapid development of reliable distributed systems Use the side-car, adapter, and ambassador patterns to split your application into a group of containers on a single machine Explore loosely coupled multi-node distributed patterns for replication, scaling, and communication between the components Learn distributed system patterns for large-scale batch data processing covering work-queues, event-based processing, and coordinated workflows




Reliable Distributed Systems


Book Description

Explains fault tolerance in clear terms, with concrete examples drawn from real-world settings Highly practical focus aimed at building "mission-critical" networked applications that remain secure




Building Dependable Distributed Systems


Book Description

A one-volume guide to the most essential techniques for designing and building dependable distributed systems Instead of covering a broad range of research works for each dependability strategy, this useful reference focuses on only a selected few (usually the most seminal works, the most practical approaches, or the first publication of each approach), explaining each in depth, usually with a comprehensive set of examples. Each technique is dissected thoroughly enough so that readers who are not familiar with dependable distributed computing can actually grasp the technique after studying the book. Building Dependable Distributed Systems consists of eight chapters. The first introduces the basic concepts and terminology of dependable distributed computing, and also provides an overview of the primary means of achieving dependability. Checkpointing and logging mechanisms, which are the most commonly used means of achieving limited degree of fault tolerance, are described in the second chapter. Works on recovery-oriented computing, focusing on the practical techniques that reduce the fault detection and recovery times for Internet-based applications, are covered in chapter three. Chapter four outlines the replication techniques for data and service fault tolerance. This chapter also pays particular attention to optimistic replication and the CAP theorem. Chapter five explains a few seminal works on group communication systems. Chapter six introduces the distributed consensus problem and covers a number of Paxos family algorithms in depth. The Byzantine generals problem and its latest solutions, including the seminal Practical Byzantine Fault Tolerance (PBFT) algorithm and a number of its derivatives, are introduced in chapter seven. The final chapter details the latest research results surrounding application-aware Byzantine fault tolerance, which represents an important step forward in the practical use of Byzantine fault tolerance techniques.




Understanding Distributed Systems, Second Edition


Book Description

Learning to build distributed systems is hard, especially if they are large scale. It's not that there is a lack of information out there. You can find academic papers, engineering blogs, and even books on the subject. The problem is that the available information is spread out all over the place, and if you were to put it on a spectrum from theory to practice, you would find a lot of material at the two ends but not much in the middle. That is why I decided to write a book that brings together the core theoretical and practical concepts of distributed systems so that you don't have to spend hours connecting the dots. This book will guide you through the fundamentals of large-scale distributed systems, with just enough details and external references to dive deeper. This is the guide I wished existed when I first started out, based on my experience building large distributed systems that scale to millions of requests per second and billions of devices. If you are a developer working on the backend of web or mobile applications (or would like to be!), this book is for you. When building distributed applications, you need to be familiar with the network stack, data consistency models, scalability and reliability patterns, observability best practices, and much more. Although you can build applications without knowing much of that, you will end up spending hours debugging and re-architecting them, learning hard lessons that you could have acquired in a much faster and less painful way. However, if you have several years of experience designing and building highly available and fault-tolerant applications that scale to millions of users, this book might not be for you. As an expert, you are likely looking for depth rather than breadth, and this book focuses more on the latter since it would be impossible to cover the field otherwise. The second edition is a complete rewrite of the previous edition. Every page of the first edition has been reviewed and where appropriate reworked, with new topics covered for the first time.




Security Engineering


Book Description

Now that there’s software in everything, how can you make anything secure? Understand how to engineer dependable systems with this newly updated classic In Security Engineering: A Guide to Building Dependable Distributed Systems, Third Edition Cambridge University professor Ross Anderson updates his classic textbook and teaches readers how to design, implement, and test systems to withstand both error and attack. This book became a best-seller in 2001 and helped establish the discipline of security engineering. By the second edition in 2008, underground dark markets had let the bad guys specialize and scale up; attacks were increasingly on users rather than on technology. The book repeated its success by showing how security engineers can focus on usability. Now the third edition brings it up to date for 2020. As people now go online from phones more than laptops, most servers are in the cloud, online advertising drives the Internet and social networks have taken over much human interaction, many patterns of crime and abuse are the same, but the methods have evolved. Ross Anderson explores what security engineering means in 2020, including: How the basic elements of cryptography, protocols, and access control translate to the new world of phones, cloud services, social media and the Internet of Things Who the attackers are – from nation states and business competitors through criminal gangs to stalkers and playground bullies What they do – from phishing and carding through SIM swapping and software exploits to DDoS and fake news Security psychology, from privacy through ease-of-use to deception The economics of security and dependability – why companies build vulnerable systems and governments look the other way How dozens of industries went online – well or badly How to manage security and safety engineering in a world of agile development – from reliability engineering to DevSecOps The third edition of Security Engineering ends with a grand challenge: sustainable security. As we build ever more software and connectivity into safety-critical durable goods like cars and medical devices, how do we design systems we can maintain and defend for decades? Or will everything in the world need monthly software upgrades, and become unsafe once they stop?




Guide to Reliable Distributed Systems


Book Description

This book describes the key concepts, principles and implementation options for creating high-assurance cloud computing solutions. The guide starts with a broad technical overview and basic introduction to cloud computing, looking at the overall architecture of the cloud, client systems, the modern Internet and cloud computing data centers. It then delves into the core challenges of showing how reliability and fault-tolerance can be abstracted, how the resulting questions can be solved, and how the solutions can be leveraged to create a wide range of practical cloud applications. The author’s style is practical, and the guide should be readily understandable without any special background. Concrete examples are often drawn from real-world settings to illustrate key insights. Appendices show how the most important reliability models can be formalized, describe the API of the Isis2 platform, and offer more than 80 problems at varying levels of difficulty.




Introduction to Reliable Distributed Programming


Book Description

In modern computing a program is usually distributed among several processes. The fundamental challenge when developing reliable distributed programs is to support the cooperation of processes required to execute a common task, even when some of these processes fail. Guerraoui and Rodrigues present an introductory description of fundamental reliable distributed programming abstractions as well as algorithms to implement these abstractions. The authors follow an incremental approach by first introducing basic abstractions in simple distributed environments, before moving to more sophisticated abstractions and more challenging environments. Each core chapter is devoted to one specific class of abstractions, covering reliable delivery, shared memory, consensus and various forms of agreement. This textbook comes with a companion set of running examples implemented in Java. These can be used by students to get a better understanding of how reliable distributed programming abstractions can be implemented and used in practice. Combined, the chapters deliver a full course on reliable distributed programming. The book can also be used as a complete reference on the basic elements required to build reliable distributed applications.