Proceedings


Guide to Reliable Distributed Systems


Book Description

This book describes the key concepts, principles and implementation options for creating high-assurance cloud computing solutions. The guide starts with a broad technical overview and basic introduction to cloud computing, looking at the overall architecture of the cloud, client systems, the modern Internet and cloud computing data centers. It then delves into the core challenges of showing how reliability and fault-tolerance can be abstracted, how the resulting questions can be solved, and how the solutions can be leveraged to create a wide range of practical cloud applications. The author’s style is practical, and the guide should be readily understandable without any special background. Concrete examples are often drawn from real-world settings to illustrate key insights. Appendices show how the most important reliability models can be formalized, describe the API of the Isis2 platform, and offer more than 80 problems at varying levels of difficulty.







Reliable Distributed Systems


Book Description

Explains fault tolerance in clear terms, with concrete examples drawn from real-world settings.
Highly practical focus aimed at building "mission-critical" networked applications that remain secure.




Dependable Computing


Book Description

Covering dependability from software and hardware perspectives

Dependable Computing: Design and Assessment looks at both the software and hardware aspects of dependability. This book:

Provides an in-depth examination of dependability/fault tolerance topics
Describes dependability taxonomy, and briefly contrasts classical techniques with their modern counterparts or extensions
Walks up the system stack from the hardware logic via operating systems up to software applications with respect to how they are hardened for dependability
Describes the use of measurement-based analysis of computing systems
Illustrates technology through real-life applications
Discusses security attacks and unique dependability requirements for emerging applications, e.g., smart electric power grids and cloud computing

Finally, using critical societal applications such as autonomous vehicles, large-scale clouds, and engineering solutions for healthcare, the book illustrates the emerging challenges faced in making artificial intelligence (AI) and its applications dependable and trustworthy.

This book is suitable for those studying in the fields of computer engineering and computer science. Professionals who are working within the new reality to ensure dependable computing will find helpful information to support their efforts. With the support of practical case studies and use cases from both academia and real-world deployments, the book provides a journey of developments that include the impact of artificial intelligence and machine learning on this ever-growing field. This book offers a single compendium that spans the myriad areas in which dependability has been applied, providing theoretical concepts and applied knowledge with content that will excite a beginner, and rigor that will satisfy an expert. Accompanying the book is an online repository of problem sets and solutions, as well as slides for instructors, that span the chapters of the book.




Fault-Tolerant Real-Time Systems


Book Description

Real-time computer systems are very often subject to dependability requirements because of their application areas. Fly-by-wire airplane control systems, control of power plants, industrial process control systems and others are required to continue their function despite faults. Fault-tolerance and real-time requirements thus constitute a kind of natural combination in process control applications. Systematic fault-tolerance is based on redundancy, which is used to mask failures of individual components. The problem of replica determinism is to ensure that replicated components show consistent behavior in the absence of faults. It might seem trivial that, given an identical sequence of inputs, replicated computer systems will produce consistent outputs. Unfortunately, this is not the case. The problem of replica non-determinism and the presentation of its possible solutions is the subject of Fault-Tolerant Real-Time Systems: The Problem of Replica Determinism.

The field of automotive electronics is an important application area of fault-tolerant real-time systems. Systems like anti-lock braking, engine control, active suspension or vehicle dynamics control have demanding real-time and fault-tolerance requirements. These requirements have to be met even in the presence of very limited resources, since cost is extremely important. Because this application area has such interesting properties, Fault-Tolerant Real-Time Systems gives an introduction to automotive electronics. The requirements of automotive electronics are a topic of discussion in the remainder of the work and are used as a benchmark to evaluate solutions to the problem of replica determinism.
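To make the replica determinism problem concrete, the following is a minimal sketch and is not taken from the book. It assumes three fault-free replicas that each sample the same physical quantity through their own sensor, so their readings differ slightly. The threshold, the readings, and the function names are all hypothetical. The final step, agreeing on one input value (here the median) before deciding, is one commonly described remedy and is shown only as an illustration.

```python
import statistics
from collections import Counter

THRESHOLD = 100.0  # hypothetical decision threshold (assumption)


def replica_decision(reading: float) -> str:
    """Identical, fully deterministic logic run by every replica."""
    return "OPEN_VALVE" if reading >= THRESHOLD else "HOLD"


def majority_vote(outputs):
    """Return the output held by a strict majority of replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


def agreed_decisions(readings):
    """Replicas first agree on a single input (the median), then decide."""
    agreed = statistics.median(readings)
    return [replica_decision(agreed) for _ in readings]


# Hypothetical readings of the same quantity from three healthy sensors.
readings = [99.98, 100.01, 100.02]

# Without input agreement, fault-free replicas diverge near the threshold.
outputs = [replica_decision(r) for r in readings]

# With input agreement, every replica produces the same output.
consistent = agreed_decisions(readings)

if __name__ == "__main__":
    print(outputs)
    print(majority_vote(outputs))
    print(consistent)
```

In this sketch the first replica disagrees with the other two although nothing has failed, so a voter cannot tell it apart from a faulty component. With only two replicas, the same disagreement would leave no majority at all.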




Foundations of Dependable Computing


Book Description

Foundations of Dependable Computing: Models and Frameworks for Dependable Systems presents two comprehensive frameworks for reasoning about system dependability, thereby establishing a context for understanding the roles played by specific approaches presented in this book's two companion volumes. It then explores the range of models and analysis methods necessary to design, validate and analyze dependable systems. A companion to this book (published by Kluwer), subtitled Paradigms for Dependable Applications, presents a variety of specific approaches to achieving dependability at the application level. Driven by the higher level fault models of Models and Frameworks for Dependable Systems, and built on the lower level abstractions implemented in a third companion book subtitled System Implementation, these approaches demonstrate how dependability may be tuned to the requirements of an application, the fault environment, and the characteristics of the target platform. Three classes of paradigms are considered: protocol-based paradigms for distributed applications, algorithm-based paradigms for parallel applications, and approaches to exploiting application semantics in embedded real-time control systems. Another companion book (published by Kluwer) subtitled System Implementation, explores the system infrastructure needed to support the various paradigms of Paradigms for Dependable Applications. Approaches to implementing support mechanisms and to incorporating additional appropriate levels of fault detection and fault tolerance at the processor, network, and operating system level are presented. A primary concern at these levels is balancing cost and performance against coverage and overall dependability. As these chapters demonstrate, low overhead, practical solutions are attainable and not necessarily incompatible with performance considerations. 
The section on innovative compiler support, in particular, demonstrates how the benefits of application specificity may be obtained while reducing hardware cost and run-time overhead.