Implementation of a Fast Inter-prediction Mode Decision in H.264/AVC Video Encoder


Book Description

H.264/MPEG-4 Part 10, or AVC (advanced video coding), is currently one of the most widely used industry standards for video compression. Several H.264 video codec solutions, both software and hardware, are available on the market. This video compression technology is primarily used in applications such as video conferencing, mobile TV, Blu-ray Discs, digital television and internet video streaming. This thesis uses the JM 17.2 reference software [15], which is available to all users and can be downloaded from http://iphome.hhi.de/suehring/tml. The software is mainly used for educational purposes; it also includes the reference software manual, which has information about installation, compilation and usage. In real-time applications such as video streaming and video conferencing, it is important that video encoding and decoding are fast. It is known that most of the complexity lies in the H.264 encoder; specifically, the motion estimation (ME) and mode decision processes introduce high computational complexity and consume a large share of CPU (central processing unit) time. The mode decision process is complex because of variable block size (16x16 to 4x4) motion estimation and half- and quarter-pixel motion compensation. Hence, the objective of this thesis is to reduce the encoding time while maintaining the same quality and compression efficiency. The fast adaptive termination (FAT) algorithm [30] is used in the mode decision and motion estimation process. Based on rate-distortion (RD) cost characteristics, all the inter modes are classified as either skip modes or non-skip modes. In order to select the best mode for any macroblock, the minimum RD cost of these two mode classes is predicted. Further, for skip mode, an early-skip mode detection test is proposed; for non-skip mode, a three-stage scheme is proposed to speed up the mode decision process.
Experimental results demonstrate that the proposed technique maintains robust coding efficiency across different quantization parameters (QP) and various video sequences. It reduces encoding time by 47.6% with only a 0.01% decrease in the structural similarity index (SSIM), negligible degradation in peak signal to noise ratio (PSNR) and an acceptable increase in bit rate.
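
The description above gives only the outline of the approach (skip versus non-skip classification, an early-skip test, and a staged search for non-skip modes). The following Python sketch illustrates that general idea of early termination and is not the FAT algorithm itself: the mode names, the grouping of modes into three stages, the two thresholds and the cost_of callback are all assumptions introduced for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical grouping of non-skip inter modes into three stages.
# The thesis does not specify its stages here; this split is an assumption.
STAGES: List[List[str]] = [
    ["INTER_16x16"],
    ["INTER_16x8", "INTER_8x16"],
    ["INTER_8x8"],
]


def decide_mode(
    cost_of: Callable[[str], float],
    skip_threshold: float,
    non_skip_threshold: float,
) -> Tuple[str, List[str]]:
    """Pick a mode for one macroblock, stopping the search as early as possible.

    cost_of(mode) stands in for the encoder's RD cost evaluation.
    Both thresholds stand in for the predicted minimum RD costs.
    Returns the chosen mode and the list of modes that were evaluated.
    """
    evaluated = ["SKIP"]
    best_mode, best_cost = "SKIP", cost_of("SKIP")

    # Early-skip test: accept SKIP without examining any other mode.
    if best_cost <= skip_threshold:
        return best_mode, evaluated

    # Staged search over non-skip modes; stop once the cost is good enough.
    for stage in STAGES:
        for mode in stage:
            evaluated.append(mode)
            cost = cost_of(mode)
            if cost < best_cost:
                best_mode, best_cost = mode, cost
        if best_cost <= non_skip_threshold:
            break

    return best_mode, evaluated


if __name__ == "__main__":
    # Made-up RD costs for a single macroblock.
    costs = {
        "SKIP": 900.0,
        "INTER_16x16": 480.0,
        "INTER_16x8": 470.0,
        "INTER_8x16": 500.0,
        "INTER_8x8": 440.0,
    }
    print(decide_mode(costs.__getitem__, skip_threshold=120.0, non_skip_threshold=490.0))
```

In this sketch the time saving comes from the modes that are never evaluated; the price is that a cheaper mode in a later stage can be missed, which is the kind of trade-off the reported bit rate increase reflects.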




Rate Distortion Optimization for Interprediction in H.264/AVC Video Coding


Book Description

Part 10 of MPEG-4 describes the Advanced Video Coding (AVC) method widely known as H.264. H.264 is the product of a collaborative effort known as the Joint Video Team (JVT). The final draft of the standard was completed in May 2003, and since then H.264 has become one of the most commonly used formats for compression [1]. H.264, unlike previous standards, describes a myriad of coding options that involve variable block size inter prediction methods, nine different intra prediction modes, multi frame prediction and B frame prediction. Each of these many coding options tends to produce a different number of coded bits and a different reconstruction quality. A video encoder is challenged to minimize coded bitrate and maximize quality. However, choosing the coding mode of a macroblock to achieve this is a difficult problem due to the large number of coding combinations and parameters. Rate Distortion Optimization is an effective technique for choosing the 'best' coding mode for a macroblock. This thesis presents two features of an H.264 encoder, multi frame prediction and B frame prediction. Additionally, a Rate Distortion Optimization scheme is implemented with these features to improve the overall performance of the encoder.
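
Rate Distortion Optimization is commonly formulated as minimizing a Lagrangian cost J = D + lambda * R over the candidate modes, where D is the distortion, R the rate in bits and lambda a multiplier that sets the trade-off. The Python sketch below shows that selection rule only; it is not taken from the thesis, and the class name, function names, mode names and numeric values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModeCandidate:
    name: str
    distortion: float  # e.g. sum of squared differences after reconstruction
    rate_bits: float   # bits needed to code the macroblock in this mode


def rd_cost(candidate: ModeCandidate, lagrange_multiplier: float) -> float:
    """Lagrangian cost J = D + lambda * R."""
    return candidate.distortion + lagrange_multiplier * candidate.rate_bits


def choose_best_mode(
    candidates: Sequence[ModeCandidate], lagrange_multiplier: float
) -> ModeCandidate:
    """Return the candidate with the smallest Lagrangian cost."""
    if not candidates:
        raise ValueError("at least one candidate mode is required")
    return min(candidates, key=lambda c: rd_cost(c, lagrange_multiplier))


if __name__ == "__main__":
    # Made-up measurements for a single macroblock.
    candidates = [
        ModeCandidate("SKIP", distortion=900.0, rate_bits=1.0),
        ModeCandidate("INTER_16x16", distortion=400.0, rate_bits=40.0),
        ModeCandidate("INTER_8x8", distortion=250.0, rate_bits=95.0),
    ]
    for lam in (2.0, 5.0, 20.0):
        print(lam, choose_best_mode(candidates, lam).name)
```

With these made-up numbers, a small multiplier favours the low-distortion mode and a large one favours the cheap mode, which is how a single parameter moves the encoder along the rate versus quality trade-off.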







High Efficiency Video Coding and Other Emerging Standards


Book Description

High Efficiency Video Coding and Other Emerging Standards provides an overview of high efficiency video coding (HEVC) and all its extensions and profiles. There are nearly 300 projects and problems included, and about 400 references related to HEVC alone. Next generation video coding (NGVC) beyond HEVC is also described. Other video coding standards such as AVS2, DAALA, THOR, VP9 (Google), DIRAC, VC1, and AV1 are addressed, and image coding standards such as JPEG, JPEG-LS, JPEG2000, JPEG XR, JPEG XS, JPEG XT and JPEG-Pleno are also listed. Understanding of these standards and their implementation is facilitated by overview papers, standards documents, reference software, software manuals, test sequences, source code, tutorials, keynote speakers, panel discussions, reflector and ftp/web sites, all in the public domain. Access to these categories is also provided.




The H.264 Advanced Video Compression Standard


Book Description

H.264 Advanced Video Coding or MPEG-4 Part 10 is fundamental to a growing range of markets such as high definition broadcasting, internet video sharing, mobile video and digital surveillance. This book reflects the growing importance and implementation of H.264 video technology. Offering a detailed overview of the system, it explains the syntax, tools and features of H.264 and equips readers with practical advice on how to get the most out of the standard. Packed with clear examples and illustrations to explain H.264 technology in an accessible and practical way. Covers basic video coding concepts, video formats and visual quality. Explains how to measure and optimise the performance of H.264 and how to balance bitrate, computation and video quality. Analyses recent work on scalable and multi-view versions of H.264, case studies of H.264 codecs and new technological developments such as the popular High Profile extensions. An invaluable companion for developers, broadcasters, system integrators, academics and students who want to master this burgeoning state-of-the-art technology. "[This book] unravels the mysteries behind the latest H.264 standard and delves deeper into each of the operations in the codec. The reader can implement (simulate, design, evaluate, optimize) the codec with all profiles and levels. The book ends with extensions and directions (such as SVC and MVC) for further research." Professor K. R. Rao, The University of Texas at Arlington, co-inventor of the Discrete Cosine Transform




Complexity Reduction in HEVC Intra Coding and Comparison with H.264/AVC


Book Description

ITU-T (VCEG) and ISO/IEC (MPEG) collaborated and formed the joint collaborative team on video coding (JCT-VC) in April 2010 to develop the next-generation video coding (NGVC) standard. The HEVC standard doubles the coding efficiency of H.264/AVC, requiring approximately 50% less bit rate at nearly the same video quality, at the expense of increased complexity. In this thesis, a technique is proposed to reduce the complexity of HEVC intra coding and thereby shorten encoding time. It involves two steps: first, optimizing the PU (prediction unit) size decision process using texture complexity analysis based on intensity gradients; second, reducing the set of prediction modes by applying a combination of rough mode decision (RMD) and most probable modes (MPM), thereby reducing the number of modes evaluated with rate distortion optimization (RDO). This is followed by the residual quad-tree (RQT), which is used to simplify the entire process. The technique developed in this thesis achieved an average saving of 47.25% in encoder time across several test sequences, with very little loss in performance.




Reducing the Complexity of Inter-prediction Mode Decision for High Efficiency Video Codec


Book Description

The High Efficiency Video Coding (HEVC) standard is the latest joint video project of the International Telecommunication Union (ITU-T) Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations, working together in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC). While HEVC is based on the same architecture as the widely used H.264/AVC (Advanced Video Coding) standard [8], it includes many new coding tools, and almost all the encoder blocks are optimized with respect to their counterparts in the H.264/AVC standard. This allows the new standard to achieve up to 50% bitrate reduction compared to its predecessor with the same visual quality, at the cost of increased complexity [1]. As in H.264/AVC, mode decisions with Motion Estimation (ME) remain among the most time-consuming computations in HEVC. In an inter-prediction mode decision, a full-search algorithm searches every possible block size and refines the results from integer-pel to quarter-pel resolution. Thus, a full-search algorithm guarantees the highest level of compression performance. However, the considerable computational complexity of the mode decision decreases the encoding speed. In this thesis, a fast adaptive termination algorithm [20] is proposed that terminates the inter-prediction mode decision early for HEVC. Based on Rate Distortion (RD) cost, all the inter-prediction modes are classified as skip or non-skip modes, and to select the best mode, the minimum RD cost of these two mode classes is predicted. For skip mode, the mode decision is predicted at an early stage, while for non-skip mode different stages are proposed to speed up the mode decision.
Experimental results based on several video test sequences suggest that implementing the Fast Adaptive Termination algorithm for inter-prediction mode decision decreases encoding time by about 25% to 40%, with negligible degradation in peak signal to noise ratio (PSNR). Metrics such as BD-bitrate (Bjøntegaard Delta bitrate), BD-PSNR (Bjøntegaard Delta Peak Signal to Noise Ratio), SSIM (Structural Similarity) and computational complexity are also used.




The Era of Interactive Media


Book Description

Interactive Media is a new research field and a landmark in multimedia development. The Era of Interactive Media is an edited volume with contributions from world experts working in academia, research institutions and industry. The Era of Interactive Media focuses mainly on Interactive Media and its various applications. This book also covers multimedia analysis and retrieval; multimedia security rights and management; multimedia compression and optimization; multimedia communication and networking; and multimedia systems and applications. The Era of Interactive Media is designed for a professional audience composed of practitioners and researchers working in the field of multimedia. Advanced-level students in computer science and electrical engineering will also find this book useful as a secondary text or reference.




Fast and Adaptive H.264/AVC Video Coding for Network Based Applications


Book Description

As the state-of-the-art video coding standard, H.264/AVC achieves significant coding performance gains compared with its predecessors. Nevertheless, the advance comes at the cost of a huge increase in encoder complexity, which may hinder its application in the real world. In addition, network applications impose some unique requirements on existing video coding algorithms. For instance, the variable bit rate output of the encoder has to be converted into a constant rate bit stream to fit the transmission channel bandwidth. In this dissertation, two issues related to H.264/AVC video coding are addressed, coding complexity and bandwidth adaptation (rate control), and corresponding solutions are provided. To reduce the coding complexity, the original mode decision process in the H.264/AVC reference software is optimized for fast implementation. Moreover, two rate control algorithms are given to address different requirements of rate control: quality fluctuation reduction and accurate basic unit quantization decision. Experiments are performed to test and validate the proposed algorithms. The results show that the proposed algorithms provide efficient solutions to the above problems and facilitate the practical deployment of the H.264/AVC coding standard.