Time-Predictable Architectures


Book Description

Building computers that can be used to design embedded real-time systems is the subject of this title. Real-time embedded software requires increasingly higher performances. The authors therefore consider processors that implement advanced mechanisms such as pipelining, out-of-order execution, branch prediction, cache memories, multi-threading, multicorearchitectures, etc. The authors of this book investigate the timepredictability of such schemes.




MULTICORE SYSTEMS ON-CHIP


Book Description

Conventional on-chip communication design mostly use ad-hoc approaches that fail to meet the challenges posed by the next-generation MultiCore Systems on-chip (MCSoC) designs. These major challenges include wiring delay, predictability, diverse interconnection architectures, and power dissipation. A Network-on-Chip (NoC) paradigm is emerging as the solution for the problems of interconnecting dozens of cores into a single system on-chip. However, there are many problems associated with the design of such systems. These problems arise from non-scalable global wire delays, failure to achieve global synchronization, and difficulties associated with non-scalable bus-based functional interconnects. The book consists of three parts, with each part being subdivided into four chapters. The first part deals with design and methodology issues. The architectures used in conventional methods of MCSoCs design and custom multiprocessor architectures are not flexible enough to meet the requirements of different application domains and not scalable enough to meet different computation needs and different complexities of various applications. Several chapters of the first part will emphasize on the design techniques and methodologies. The second part covers the most critical part of MCSoCs design — the interconnections. One approach to addressing the design methodologies is to adopt the so-called reusability feature to boost design productivity. In the past years, the primitive design units evolved from transistors to gates, finite state machines, and processor cores. The network-on-chip paradigm offers this attractive property for the future and will be able to close the productivity gap. The last part of this book delves into MCSoCs validations and optimizations. A more qualitative approach of system validation is based on the use of formal techniques for hardware design. The main advantage of formal methods is the possibility to prove the validity of essential design requirements. As formal languages have a mathematical foundation, it is possible to formally extract and verify these desired properties of the complete abstract state space. Online testing techniques for identifying faults that can lead to system failure are also surveyed. Emphasis is given to analytical redundancy-based techniques that have been developed for fault detection and isolation in the automatic control area.




Memory Controllers for Mixed-Time-Criticality Systems


Book Description

This book discusses the design and performance analysis of SDRAM controllers that cater to both real-time and best-effort applications, i.e. mixed-time-criticality memory controllers. The authors describe the state of the art, and then focus on an architecture template for reconfigurable memory controllers that addresses effectively the quickly evolving set of SDRAM standards, in terms of worst-case timing and power analysis, as well as implementation. A prototype implementation of the controller in SystemC and synthesizable VHDL for an FPGA development board are used as a proof of concept of the architecture template.




A Time-predictable Many-core Processor Design for Critical Real-time Embedded Systems


Book Description

Critical Real-Time Embedded Systems (CRTES) are in charge of controlling fundamental parts of embedded system, e.g. energy harvesting solar panels in satellites, steering and breaking in cars, or flight management systems in airplanes. To do so, CRTES require strong evidence of correct functional and timing behavior. The former guarantees that the system operates correctly in response of its inputs; the latter ensures that its operations are performed within a predefined time budget. CRTES aim at increasing the number and complexity of functions. Examples include the incorporation of \smarter" Advanced Driver Assistance System (ADAS) functionality in modern cars or advanced collision avoidance systems in Unmanned Aerial Vehicles (UAVs). All these new features, implemented in software, lead to an exponential growth in both performance requirements and software development complexity. Furthermore, there is a strong need to integrate multiple functions into the same computing platform to reduce the number of processing units, mass and space requirements, etc. Overall, there is a clear need to increase the computing power of current CRTES in order to support new sophisticated and complex functionality, and integrate multiple systems into a single platform. The use of multi- and many-core processor architectures is increasingly seen in the CRTES industry as the solution to cope with the performance demand and cost constraints of future CRTES. Many-cores supply higher performance by exploiting the parallelism of applications while providing a better performance per watt as cores are maintained simpler with respect to complex single-core processors. Moreover, the parallelization capabilities allow scheduling multiple functions into the same processor, maximizing the hardware utilization. However, the use of multi- and many-cores in CRTES also brings a number of challenges related to provide evidence about the correct operation of the system, especially in the timing domain. Hence, despite the advantages of many-cores and the fact that they are nowadays a reality in the embedded domain (e.g. Kalray MPPA, Freescale NXP P4080, TI Keystone II), their use in CRTES still requires finding efficient ways of providing reliable evidence about the correct operation of the system. This thesis investigates the use of many-core processors in CRTES as a means to satisfy performance demands of future complex applications while providing the necessary timing guarantees. To do so, this thesis contributes to advance the state-of-the-art towards the exploitation of parallel capabilities of many-cores in CRTES contributing in two different computing domains. From the hardware domain, this thesis proposes new many-core designs that enable deriving reliable and tight timing guarantees. From the software domain, we present efficient scheduling and timing analysis techniques to exploit the parallelization capabilities of many-core architectures and to derive tight and trustworthy Worst-Case Execution Time (WCET) estimates of CRTES.




17th International Conference on Information Technology–New Generations (ITNG 2020)


Book Description

This volume presents the 17th International Conference on Information Technology—New Generations (ITNG), and chronicles an annual event on state of the art technologies for digital information and communications. The application of advanced information technology to such domains as astronomy, biology, education, geosciences, security, and healthcare are among the themes explored by the ITNG proceedings. Visionary ideas, theoretical and experimental results, as well as prototypes, designs, and tools that help information flow to end users are of special interest. Specific topics include Machine Learning, Robotics, High Performance Computing, and Innovative Methods of Computing. The conference features keynote speakers; a best student contribution award, poster award, and service award; a technical open panel, and workshops/exhibits from industry, government, and academia.




Multicore Systems On-Chip: Practical Software/Hardware Design


Book Description

System on chips designs have evolved from fairly simple unicore, single memory designs to complex heterogeneous multicore SoC architectures consisting of a large number of IP blocks on the same silicon. To meet high computational demands posed by latest consumer electronic devices, most current systems are based on such paradigm, which represents a real revolution in many aspects in computing. The attraction of multicore processing for power reduction is compelling. By splitting a set of tasks among multiple processor cores, the operating frequency necessary for each core can be reduced, allowing to reduce the voltage on each core. Because dynamic power is proportional to the frequency and to the square of the voltage, we get a big gain, even though we may have more cores running. As more and more cores are integrated into these designs to share the ever increasing processing load, the main challenges lie in efficient memory hierarchy, scalable system interconnect, new programming paradigms, and efficient integration methodology for connecting such heterogeneous cores into a single system capable of leveraging their individual flexibility. Current design methods tend toward mixed HW/SW co-designs targeting multicore systems on-chip for specific applications. To decide on the lowest cost mix of cores, designers must iteratively map the device’s functionality to a particular HW/SW partition and target architectures. In addition, to connect the heterogeneous cores, the architecture requires high performance complex communication architectures and efficient communication protocols, such as hierarchical bus, point-to-point connection, or Network-on-Chip. Software development also becomes far more complex due to the difficulties in breaking a single processing task into multiple parts that can be processed separately and then reassembled later. This reflects the fact that certain processor jobs cannot be easily parallelized to run concurrently on multiple processing cores and that load balancing between processing cores – especially heterogeneous cores – is very difficult.




WCET Optimizations and Architectural Support for Hard Real-time Systems


Book Description

As time predictability is critical to hard real-time systems, it is not only necessary to accurately estimate the worst-case execution time (WCET) of the real-time tasks but also desirable to improve either the WCET of the tasks or time predictability of the system, because the real-time tasks with lower WCETs are easy to schedule and more likely to meat their deadlines. As a real-time system is an integration of software and hardware, the optimization can be achieved through two ways: software optimization and time-predictable architectural support. In terms of software optimization, we fi rst propose a loop-based instruction prefetching approach to further improve the WCET comparing with simple prefetching techniques such as Next-N-Line prefetching which can enhance both the average-case performance and the worst-case performance. Our prefetching approach can exploit the program controlow information to intelligently prefetch instructions that are most likely needed. Second, as inter-thread interferences in shared caches can signi cantly a ect the WCET of real-time tasks running on multicore processors, we study three multicore-aware code positioning methods to reduce the inter-core L2 cache interferences between co-running real-time threads. One strategy focuses on decreasing the longest WCET among the co-running threads, and two other methods aim at achieving fairness in terms of the amount or percentage of WCET reduction among co-running threads. In the aspect of time-predictable architectural support, we introduce the concept of architectural time predictability (ATP) to separate timing uncertainty concerns caused by hardware from software, which greatly facilitates the advancement of time-predictable processor design. We also propose a metric called Architectural Time-predictability Factor (ATF) to measure architectural time predictability quantitatively. Furthermore, while cache memories can generally improve average-case performance, they are harmful to time predictability and thus are not desirable for hard real-time and safety-critical systems. In contrast, Scratch-Pad Memories (SPMs) are time predictable, but they may lead to inferior performance. Guided by ATF, we propose and evaluate a variety of hybrid on-chip memory architectures to combine both caches and SPMs intelligently to achieve good time predictability and high performance. Detailed implementation and experimental results discussion are presented in this dissertation.




Multicore Computing


Book Description

Every area of science and engineering today has to process voluminous data sets. Using exact, or even approximate, algorithms to solve intractable problems in critical areas, such as computational biology, takes time that is exponential in some of the underlying parameters. Parallel computing addresses this issue and has become affordable with the advent of multicore architectures. However, programming multicore machines is much more difficult due to oddities existing in the architectures. Offering insights into different facets of this area, Multicore Computing: Algorithms, Architectures, and Applications focuses on the architectures, algorithms, and applications of multicore computing. It will help readers understand the intricacies of these architectures and prepare them to design efficient multicore algorithms. Contributors at the forefront of the field cover the memory hierarchy for multicore and manycore processors, the caching strategy Flexible Set Balancing, the main features of the latest SPARC architecture specification, the Cilk and Cilk++ programming languages, the numerical software library Parallel Linear Algebra Software for Multicore Architectures (PLASMA), and the exact multipattern string matching algorithm of Aho-Corasick. They also describe the architecture and programming model of the NVIDIA Tesla GPU, discuss scheduling directed acyclic graphs onto multi/manycore processors, and evaluate design trade-offs among Intel and AMD multicore processors, IBM Cell Broadband Engine, and NVIDIA GPUs. In addition, the book explains how to design algorithms for the Cell Broadband Engine and how to use the backprojection algorithm for generating images from synthetic aperture radar data.




Principles of Asynchronous Circuit Design


Book Description

Principles of Asynchronous Circuit Design - A Systems Perspective addresses the need for an introductory text on asynchronous circuit design. Part I is an 8-chapter tutorial which addresses the most important issues for the beginner, including how to think about asynchronous systems. Part II is a 4-chapter introduction to Balsa, a freely-available synthesis system for asynchronous circuits which will enable the reader to get hands-on experience of designing high-level asynchronous systems. Part III offers a number of examples of state-of-the-art asynchronous systems to illustrate what can be built using asynchronous techniques. The examples range from a complete commercial smart card chip to complex microprocessors. The objective in writing this book has been to enable industrial designers with a background in conventional (clocked) design to be able to understand asynchronous design sufficiently to assess what it has to offer and whether it might be advantageous in their next design task.