Architectures and Algorithms for Mitigation of Soft Errors in Nanoscale VLSI Circuits


Book Description

ABSTRACT: The occurrence of transient faults like soft errors in computer circuits poses a significant challenge to the reliability of computer systems. Soft errors, which occur when energetic neutrons coming from space or alpha particles arising from packaging materials hit the transistors, may manifest themselves as a bit flip in a memory element or as a transient glitch generated at any internal node of combinational logic, which may subsequently propagate to and be captured in a latch. Although soft errors were earlier a concern only for space applications, aggressive technology scaling trends have extended the problem to modern VLSI systems, even for terrestrial applications. In this dissertation, we explore techniques at all levels of the design flow to reduce the vulnerability of VLSI systems to soft errors without compromising other design metrics like delay, area and power. We propose new models for estimating soft errors for storage structures and combinational logic. While soft errors in caches are estimated using the vulnerability metric, soft errors in logic circuits are estimated using two new metrics called the glitch enabling probability (GEP) and the cumulative probability of observability (CPO). These metrics, based on signal probabilities of nets, accurately model soft errors in radiation-aware synthesis algorithms and help in efficient exploration of the design solution space during optimization. At the physical design level, we use larger netlengths to provide larger RC ladders that effectively filter out transient glitches. To this end, a new heuristic has been developed to selectively assign larger wirelengths to certain critical nets. This reduces the delay and area overhead while improving the immunity to soft errors. Building on this heuristic, we propose two placement algorithms, one using simulated annealing and one using quadratic programming, which significantly reduce the soft error rates of circuits.
At the circuit level, we develop techniques for hardening circuit nodes using a novel radiation jammer technique. The proposed technique is based on the principles of an RC differentiator and is used to isolate the driven cell from the driving cell that is being hit by a radiation strike. Since the blind insertion of radiation blocker cells on all circuit nodes is expensive, candidate nodes are selected for insertion of these cells using a new metric called the probability of radiation blocker circuit insertion (PRI). At the logic level, we investigate a gate sizing algorithm in which we simultaneously optimize the soft error rate (SER) and the crosstalk noise, in addition to the power and performance of circuits, while considering the effect of process variations. The reliability-centric gate sizing technique has been formulated as a mathematical program and is efficiently solved. At the architectural level, we develop solutions for the correction of multi-bit errors in large L2 caches by controlling or mining the redundancy in the memory hierarchy. We also develop methods to increase the amount of redundancy in the memory hierarchy by employing a redundancy-based replacement policy, in which the amount of redundancy is controlled using a user-defined redundancy threshold. The novel architectures and the new reliability-centric synthesis algorithms proposed for the various design abstraction levels have been shown to achieve significant reduction of soft error rates in current nanometer circuits. The design techniques, algorithms and architectures can be integrated into existing design flows. A VLSI system implementation can leverage the architectural solutions for the reliability of the caches, while the custom hardware synthesized for the VLSI system can be protected against radiation strikes by utilizing the circuit-level, logic-level and layout-level optimization algorithms that have been developed.
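
As a rough illustration of what "signal probabilities of nets" refers to in the abstract above, the following sketch propagates the probability that each net is at logic 1 through a tiny gate-level netlist, assuming statistically independent gate inputs. The netlist format, the gate set, and the function name are hypothetical choices made for this example. This is not the GEP or CPO metric itself, whose definitions the abstract does not give; it only shows the kind of quantity such metrics are stated to be based on.

```python
from typing import Dict, List, Tuple

# Hypothetical netlist format: (output_net, gate_type, [input_nets]),
# listed in topological order. Not taken from the dissertation.
Gate = Tuple[str, str, List[str]]


def signal_probabilities(primary: Dict[str, float],
                         gates: List[Gate]) -> Dict[str, float]:
    """Return P(net == 1) for every net, assuming independent gate inputs."""
    p = dict(primary)
    for out, kind, ins in gates:
        vals = [p[n] for n in ins]
        if kind == "NOT":
            p[out] = 1.0 - vals[0]
        elif kind in ("AND", "NAND"):
            # Output of AND is 1 only when every input is 1.
            all_high = 1.0
            for v in vals:
                all_high *= v
            p[out] = all_high if kind == "AND" else 1.0 - all_high
        elif kind in ("OR", "NOR"):
            # Output of OR is 0 only when every input is 0.
            none_high = 1.0
            for v in vals:
                none_high *= 1.0 - v
            p[out] = 1.0 - none_high if kind == "OR" else none_high
        else:
            raise ValueError(f"unsupported gate type: {kind}")
    return p


# Example circuit: y = (a AND b) OR (NOT c), all inputs equally likely 0 or 1.
example_gates: List[Gate] = [
    ("n1", "AND", ["a", "b"]),
    ("n2", "NOT", ["c"]),
    ("y", "OR", ["n1", "n2"]),
]
probs = signal_probabilities({"a": 0.5, "b": 0.5, "c": 0.5}, example_gates)
print(probs["y"])
```

The independence assumption breaks down in circuits with reconvergent fanout, so results from a sketch like this are approximations at best.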




Soft Error Reliability of VLSI Circuits


Book Description

This book is intended for readers who are interested in the design of robust and reliable electronic digital systems. The authors cover emerging trends in the design of today’s reliable electronic systems which are applicable to safety-critical applications, such as automotive or healthcare electronic systems. The emphasis is on modeling approaches and algorithms for analysis and mitigation of soft errors in nano-scale CMOS digital circuits, using techniques that are the cornerstone of Computer Aided Design (CAD) of reliable VLSI circuits. The authors introduce software tools for analysis and mitigation of soft errors in electronic systems, which can be integrated easily with design flows. In addition to discussing soft-error-aware analysis techniques for combinational logic, the authors also describe new soft error mitigation strategies targeting commercial digital circuits. Coverage includes novel Soft Error Rate (SER) analysis techniques, such as process-variation-aware SER estimation and GPU-accelerated SER analysis, in addition to SER reduction methods based on gate sizing and logic restructuring.




Mitigation of Soft Errors in Nanoscale VLSI Circuits


Book Description

Reliability is a key concern in VLSI systems, and transient/intermittent faults, often caused by soft errors, require designers to create special mitigation techniques. This book describes such techniques, spanning all levels of the design flow, to systematically reduce the vulnerability of VLSI systems to soft errors. Readers will be able to address soft error issues early in their design flow, allowing them to weigh the implications of dedicating more resources to soft error detection and prevention against the corresponding impact on delay, power and area.




Analysis and Design of Resilient VLSI Circuits


Book Description

This monograph is motivated by the challenges faced in designing reliable VLSI systems in modern VLSI processes. The reliable operation of integrated circuits (ICs) has become increasingly difficult to achieve in the deep submicron (DSM) era. With continuously decreasing device feature sizes, combined with lower supply voltages and higher operating frequencies, the noise immunity of VLSI circuits is decreasing alarmingly. Thus, VLSI circuits are becoming more vulnerable to noise effects such as crosstalk, power supply variations, and radiation-induced soft errors. Among these noise sources, soft errors (or errors caused by radiation particle strikes) have become an increasingly troublesome issue for memory arrays as well as combinational logic circuits. Also, in the DSM era, process variations are increasing at a significant rate, making it more difficult to design reliable VLSI circuits. Hence, it is important to efficiently design robust VLSI circuits that are resilient to radiation particle strikes and process variations. The work presented in this research monograph presents several analysis and design techniques with the goal of realizing VLSI circuits that are radiation and process variation tolerant.




Nanoscale VLSI


Book Description

This book describes methodologies in the design of VLSI devices, circuits and their applications at nanoscale levels. The book begins with a discussion of the dominant role of power dissipation in highly scaled devices. The 15 chapters of the book are classified under four sections that cover design, modeling, and simulation of electronic, magnetic and compound semiconductors for their applications in VLSI devices, circuits, and systems. This comprehensive volume presents the design methodologies for ultra-low power VLSI design, potential post-CMOS devices, and their applications from the architectural and system perspectives. The book is intended as a reference for graduate students, Ph.D./M.S./M.Tech. scholars, researchers, and practicing engineers working in the frontier areas of nanoscale VLSI design.




Dependable Multicore Architectures at Nanoscale


Book Description

This book provides comprehensive coverage of the dependability challenges in today's advanced computing systems. It is an in-depth discussion of all the technological and design-level techniques that may be used to overcome these issues and analyzes various dependability-assessment methods. The impact of individual application scenarios on the definition of challenges and solutions is considered so that the designer can clearly assess the problems and adjust the solution based on the specifications in question. The book is composed of three sections, beginning with an introduction to current dependability challenges arising in complex computing systems implemented with nanoscale technologies, and to the effect of the application scenario. The second section details all the fault-tolerance techniques that are applicable in the manufacture of reliable advanced computing devices. Different levels, from technology-level fault avoidance to the use of error correcting codes and system-level checkpointing, are introduced and explained as applicable to the different application scenario requirements. Finally, the third section proposes a roadmap of future trends in and perspectives on the dependability and manufacturability of advanced computing systems from the special point of view of industrial stakeholders. Dependable Multicore Architectures at Nanoscale showcases the original ideas and concepts introduced into the field of nanoscale manufacturing and systems reliability over nearly four years of work within COST Action IC1103 MEDIAN, a think-tank with participants from 27 countries. Academic researchers and graduate students working in multi-core computer systems and their manufacture will find this book of interest, as will industrial design and manufacturing engineers working in VLSI companies.




Techniques for Enhancing Reliability in VLSI Circuits


Book Description

Several architectural-level techniques are proposed to detect soft errors with minimal performance overhead, making use of data, information, temporal and spatial redundancy. The techniques are designed in such a way that much of their latency overhead can be hidden by the latency of other functional operations. It is shown that the proposed methodologies can be implemented with negligible or minimal performance overhead hidden by critical path operations in the datapath. In designs with large peak power values, high current spikes cause noise within the power supply, creating timing issues in the circuit which affect its functionality. A path clustering algorithm is proposed which attempts to normalize the current draw in the circuit over the circuit's clock period by delaying the start times of certain paths. By reducing the number of paths starting at a given time instance, the amount of current drawn from the power supply at that instance is reduced. Experimental results indicate a reduction of up to 72% in peak power values when tested on the ISCAS '85 and OpenCores benchmarks. Variations in VLSI designs come from process, supply voltage, and temperature (PVT). These variations cause non-ideal behavior at random internal nodes, which impacts the timing of the design. A variation-aware circuit-level design methodology is presented in this dissertation in which the architecture dynamically stretches the clock when variation effects are observed within the circuit during computations. While previous research efforts are directed towards reducing variation effects, this technique offers an alternative approach that adapts dynamically to variation effects. The design technique is shown to increase timing yield on ITC '99 benchmark circuits by an average of 41% with negligible area overhead.
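
The idea of spreading path start times across the clock period to lower peak current can be sketched as a simple greedy assignment. This is a minimal illustration under strong assumptions, not the path clustering algorithm from the dissertation: the clock period is divided into discrete start slots, each path draws its current only in the slot where it starts, and each path has a latest allowed start slot standing in for its timing slack. The function name and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def stagger_paths(paths: Dict[str, Tuple[float, int]],
                  num_slots: int) -> Tuple[Dict[str, int], List[float]]:
    """Greedily assign each path a start slot to flatten the current profile.

    paths maps a path name to (current drawn at start, latest allowed slot).
    Returns the chosen slot per path and the total current per slot.
    """
    load = [0.0] * num_slots
    start: Dict[str, int] = {}
    # Place the heaviest paths first, since they constrain the peak the most.
    for name, (current, latest) in sorted(paths.items(),
                                          key=lambda kv: -kv[1][0]):
        allowed = range(min(latest, num_slots - 1) + 1)
        slot = min(allowed, key=lambda s: load[s])  # least-loaded legal slot
        load[slot] += current
        start[name] = slot
    return start, load


# Four paths; without staggering, all start in slot 0 and draw 12.0 at once.
example_paths = {
    "p1": (4.0, 0),  # no slack: must start in slot 0
    "p2": (3.0, 2),
    "p3": (3.0, 2),
    "p4": (2.0, 1),
}
start_slots, slot_load = stagger_paths(example_paths, num_slots=3)
print(start_slots, max(slot_load))
```

In this toy example the peak per-slot current drops from 12.0 to 5.0. A real flow would also need to model how delaying a path affects downstream timing, which this sketch ignores.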




Nanoscale Devices


Book Description

The primary aim of this book is to discuss various aspects of nanoscale device design and their applications, including transport mechanisms, modeling, and circuit applications. It provides a platform for modeling and analysis of state-of-the-art devices in the nanoscale regime, reviews issues related to optimizing sub-nanometer device performance, and addresses the simulation aspects and/or fabrication processes of devices. It also includes design problems at the end of each chapter.