Design of Dependable Computing Systems


Book Description

This book analyzes the causes of failures in computing systems, their consequences, as weIl as the existing solutions to manage them. The domain is tackled in a progressive and educational manner with two objectives: 1. The mastering of the basics of dependability domain at system level, that is to say independently ofthe technology used (hardware or software) and of the domain of application. 2. The understanding of the fundamental techniques available to prevent, to remove, to tolerate, and to forecast faults in hardware and software technologies. The first objective leads to the presentation of the general problem, the fault models and degradation mechanisms wh ich are at the origin of the failures, and finally the methods and techniques which permit the faults to be prevented, removed or tolerated. This study concerns logical systems in general, independently of the hardware and software technologies put in place. This knowledge is indispensable for two reasons: • A large part of a product' s development is independent of the technological means (expression of requirements, specification and most of the design stage). Very often, the development team does not possess this basic knowledge; hence, the dependability requirements are considered uniquely during the technological implementation. Such an approach is expensive and inefficient. Indeed, the removal of a preliminary design fault can be very difficult (if possible) if this fault is detected during the product's final testing.




Fundamentals of Dependable Computing for Software Engineers


Book Description

Fundamentals of Dependable Computing for Software Engineers presents the essential elements of computer system dependability. The book describes a comprehensive dependability-engineering process and explains the roles of software and software engineers in computer system dependability. Readers will learn: Why dependability matters What it means for a system to be dependable How to build a dependable software system How to assess whether a software system is adequately dependable The author focuses on the actions needed to reduce the rate of failure to an acceptable level, covering material essential for engineers developing systems with extreme consequences of failure, such as safety-critical systems, security-critical systems, and critical infrastructure systems. The text explores the systems engineering aspects of dependability and provides a framework for engineers to reason and make decisions about software and its dependability. It also offers a comprehensive approach to achieve software dependability and includes a bibliography of the most relevant literature. Emphasizing the software engineering elements of dependability, this book helps software and computer engineers in fields requiring ultra-high levels of dependability, such as avionics, medical devices, automotive electronics, weapon systems, and advanced information systems, construct software systems that are dependable and within budget and time constraints.




Dependable Computing Systems


Book Description

A team of recognized experts leads the way to dependable computing systems With computers and networks pervading every aspect of daily life, there is an ever-growing demand for dependability. In this unique resource, researchers and organizations will find the tools needed to identify and engage state-of-the-art approaches used for the specification, design, and assessment of dependable computer systems. The first part of the book addresses models and paradigms of dependable computing, and the second part deals with enabling technologies and applications. Tough issues in creating dependable computing systems are also tackled, including: * Verification techniques * Model-based evaluation * Adjudication and data fusion * Robust communications primitives * Fault tolerance * Middleware * Grid security * Dependability in IBM mainframes * Embedded software * Real-time systems Each chapter of this contributed work has been authored by a recognized expert. This is an excellent textbook for graduate and advanced undergraduate students in electrical engineering, computer engineering, and computer science, as well as a must-have reference that will help engineers, programmers, and technologists develop systems that are secure and reliable.







Dependable Embedded Systems


Book Description

This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems.




Building Dependable Distributed Systems


Book Description

A one-volume guide to the most essential techniques for designing and building dependable distributed systems Instead of covering a broad range of research works for each dependability strategy, this useful reference focuses on only a selected few (usually the most seminal works, the most practical approaches, or the first publication of each approach), explaining each in depth, usually with a comprehensive set of examples. Each technique is dissected thoroughly enough so that readers who are not familiar with dependable distributed computing can actually grasp the technique after studying the book. Building Dependable Distributed Systems consists of eight chapters. The first introduces the basic concepts and terminology of dependable distributed computing, and also provides an overview of the primary means of achieving dependability. Checkpointing and logging mechanisms, which are the most commonly used means of achieving limited degree of fault tolerance, are described in the second chapter. Works on recovery-oriented computing, focusing on the practical techniques that reduce the fault detection and recovery times for Internet-based applications, are covered in chapter three. Chapter four outlines the replication techniques for data and service fault tolerance. This chapter also pays particular attention to optimistic replication and the CAP theorem. Chapter five explains a few seminal works on group communication systems. Chapter six introduces the distributed consensus problem and covers a number of Paxos family algorithms in depth. The Byzantine generals problem and its latest solutions, including the seminal Practical Byzantine Fault Tolerance (PBFT) algorithm and a number of its derivatives, are introduced in chapter seven. The final chapter details the latest research results surrounding application-aware Byzantine fault tolerance, which represents an important step forward in the practical use of Byzantine fault tolerance techniques.







Reliable Computer Systems


Book Description

This classic reference work is a comprehensive guide to the design, evaluation, and use of reliable computer systems. It includes case studies of reliable systems from manufacturers, such as Tandem, Stratus, IBM, and Digital. It covers special systems such as the Galileo Orbiter fault protection system and AT&T telephone switching system processors.




Dependable Computer Systems


Book Description

Dependability analysis is the recent approach to performance evaluation of contemporary systems which tries to cope with new challenges that are brought with their unprecedented complexity, size and diversity. Especially in case of computer systems and networks such evaluation must be based on multidisciplinary approach to theory, technology, and maintenance of systems which operate in real (and very often unfriendly) environments. As opposed to “classic” reliability which focuses mainly on technical aspects of system functioning, dependability studies investigate the systems as multifaceted and sophisticated amalgamations of technical, information and also human resources. This monograph presents selected new developments in such areas of dependability research as mathematical models, evaluation of software, probabilistic assessment, methodologies, tools, and technologies. Intelligent and soft computing methods help to resolve fundamental problems of dependability analysis which are caused by the fact that in contemporary computer systems it is often difficult to find a relation between system elements and system events (the relation between reasons and results) and it is even more difficult to define strict mathematical models with “analytical” relationships between such phenomena.




Theory and Engineering of Dependable Computer Systems and Networks


Book Description

This book contains papers on selected aspects of dependability analysis in computer systems and networks, which were chosen for discussion during the 16th DepCoS-RELCOMEX conference held in Wrocław, Poland, from June 28 to July 2, 2021. Their collection will be a valuable source material for scientists, researchers, practitioners and students who are dealing with design, analysis and engineering of computer systems and networks and must ensure their dependable operation. Being probably the most complex technical systems ever engineered by man (and also—the most dynamically evolving ones), organization of contemporary computer systems cannot be interpreted only as structures built on the basis of (unreliable) technical resources. Their evaluation must take into account a specific blend of interacting people (their needs and behaviours), networks (together with mobile properties, cloud organization, Internet of Everything, etc.) and a large number of users dispersed geographically and constantly producing an unconceivable number of applications. Ever-growing number of research methods being continuously developed for dependability analyses apply the newest techniques of artificial and computational intelligence. Selection of papers in these proceedings illustrates diversity of multi-disciplinary topics which are considered in present-day dependability explorations.