High-Order Models in Semantic Image Segmentation


Book Description

High-Order Models in Semantic Image Segmentation reviews recent developments in optimization-based methods for image segmentation, presenting several geometric and mathematical models that underlie a broad class of recent segmentation techniques. Focusing on the most impactful algorithms in the computer vision community over the last 10 years, the book includes sections on graph-theoretic and continuous relaxation techniques, which can compute globally optimal solutions for many problems. The book provides a practical and accessible introduction to these state-of-the-art segmentation techniques that is ideal for academics, industry researchers, and graduate students in computer vision, machine learning and medical imaging.

- Gives an intuitive and conceptual understanding of this mathematically involved subject by using a large number of graphical illustrations
- Provides the right amount of knowledge to apply sophisticated techniques for a wide range of new applications
- Contains numerous tables that compare different algorithms, facilitating the appropriate choice of algorithm for the intended application
- Presents an array of practical applications in computer vision and medical imaging
- Includes code for many of the algorithms, available on the book's companion website
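To make the "globally optimal solutions" claim concrete: the graph-theoretic techniques mentioned above reduce a binary segmentation energy to a minimum s-t cut, which max-flow algorithms solve exactly. The following toy sketch (not code from the book; all weights are made up for illustration) encodes a two-pixel energy as terminal and pairwise edges and finds the optimal labelling with Edmonds-Karp:

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a capacity map cap[u][v].
    Returns (flow value, set of nodes on the source side of the min cut)."""
    flow = 0
    while True:
        parent = {s: None}                  # BFS tree over the residual graph
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                 # no augmenting path: min cut reached
            return flow, set(parent)
        path, v = [], t                     # recover the s->t path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] for u, v in path)
        for u, v in path:                   # push flow, update residual edges
            cap[u][v] -= b
            cap[v][u] += b
        flow += b

# Two-pixel toy energy: unary terms become terminal edges ('s' = foreground
# source, 't' = background sink); the pairwise smoothness term becomes an
# edge between the pixels.
cap = defaultdict(lambda: defaultdict(int))
cap['s']['A'] = 5; cap['A']['t'] = 1        # pixel A prefers foreground
cap['s']['B'] = 1; cap['B']['t'] = 5        # pixel B prefers background
cap['A']['B'] = 2; cap['B']['A'] = 2        # smoothness penalty

energy, source_side = max_flow(cap, 's', 't')
foreground = sorted(source_side - {'s'})
print(energy, foreground)                   # → 4 ['A']
```

The min-cut value (4) is the minimum of the encoded energy, and the cut's source side gives the optimal labelling: no local minima, by construction.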







Semantic Image Segmentation


Book Description

Semantic image segmentation (SiS) plays a fundamental role in a broad variety of computer vision applications, providing key information for the global understanding of an image's content and context. This monograph summarizes two decades of research in the field of SiS, offering a literature review of solutions from early historical methods, followed by an overview of more recent deep learning methods, including the latest trend of using transformers. The review is complemented by particular cases of weak supervision and complementary machine learning techniques that can improve semantic segmentation, such as curriculum, incremental or self-supervised learning. State-of-the-art SiS models rely on a large number of annotated samples, which are more expensive to obtain than labels for tasks such as image classification. Since unlabeled data is significantly cheaper to obtain, it is not surprising that Unsupervised Domain Adaptation (UDA) has achieved broad success within the semantic segmentation community. Therefore, a second core contribution of this monograph is to summarize five years of a rapidly growing field, Domain Adaptation for Semantic Image Segmentation (DASiS), which embraces the importance of semantic segmentation itself and the critical need of adapting segmentation models to new environments. In addition to a comprehensive survey of DASiS techniques, newer trends such as multi-domain learning, domain generalization, domain incremental learning, test-time adaptation and source-free domain adaptation are also presented.
The monograph concludes by describing the datasets and benchmarks most widely used in SiS and DASiS, and briefly discussing related tasks such as instance and panoptic image segmentation, as well as applications such as medical image segmentation. It should provide researchers across academia and industry with a comprehensive reference guide and help them foster new research directions in the field.




Computer Vision -- ECCV 2014


Book Description

The seven-volume set comprising LNCS volumes 8689-8695 constitutes the refereed proceedings of the 13th European Conference on Computer Vision, ECCV 2014, held in Zurich, Switzerland, in September 2014. The 363 revised papers presented were carefully reviewed and selected from 1444 submissions. The papers are organized in topical sections on tracking and activity recognition; recognition; learning and inference; structure from motion and feature matching; computational photography and low-level vision; vision; segmentation and saliency; context and 3D scenes; motion and 3D scene analysis; and poster sessions.




Advances in Information Retrieval


Book Description

This two-volume set LNCS 11437 and 11438 constitutes the refereed proceedings of the 41st European Conference on IR Research, ECIR 2019, held in Cologne, Germany, in April 2019. The 48 full papers presented together with 2 keynote papers, 44 short papers, 8 demonstration papers, 8 invited CLEF papers, 11 doctoral consortium papers, 4 workshop papers, and 4 tutorials were carefully reviewed and selected from 365 submissions. They were organized in topical sections named: Modeling Relations; Classification and Search; Recommender Systems; Graphs; Query Analytics; Representation; Reproducibility (Systems); Reproducibility (Application); Neural IR; Cross Lingual IR; QA and Conversational Search; Topic Modeling; Metrics; Image IR; Short Papers; Demonstration Papers; CLEF Organizers Lab Track; Doctoral Consortium Papers; Workshops; and Tutorials.




Practical Machine Learning for Computer Vision


Book Description

This practical book shows you how to employ machine learning models to extract information from images. ML engineers and data scientists will learn how to solve a variety of image problems including classification, object detection, autoencoders, image generation, counting, and captioning with proven ML techniques. This book provides a great introduction to end-to-end deep learning: dataset creation, data preprocessing, model design, model training, evaluation, deployment, and interpretability. Google engineers Valliappa Lakshmanan, Martin Görner, and Ryan Gillard show you how to develop accurate and explainable computer vision ML models and put them into large-scale production using robust ML architecture in a flexible and maintainable way. You'll learn how to design, train, evaluate, and predict with models written in TensorFlow or Keras. You'll learn how to:

- Design ML architecture for computer vision tasks
- Select a model (such as ResNet, SqueezeNet, or EfficientNet) appropriate to your task
- Create an end-to-end ML pipeline to train, evaluate, deploy, and explain your model
- Preprocess images for data augmentation and to support learnability
- Incorporate explainability and responsible AI best practices
- Deploy image models as web services or on edge devices
- Monitor and manage ML models




From Interactive to Semantic Image Segmentation


Book Description

This thesis investigates two well-defined problems in image segmentation, viz. interactive and semantic image segmentation. Interactive segmentation involves assisting a user in cutting out objects from an image, whereas semantic segmentation involves partitioning the pixels of an image into object categories. We investigate various models and energy formulations for both these problems in this thesis. In order to improve the performance of interactive systems, low-level texture features are introduced as a replacement for the more commonly used RGB features. To quantify the improvement obtained by using these texture features, two annotated datasets of images are introduced (one consisting of natural images, and the other consisting of camouflaged objects). A significant improvement in performance is observed when using texture features for the case of monochrome images and images containing camouflaged objects. We also explore adding mid-level cues such as shape constraints into interactive segmentation by introducing the idea of geodesic star convexity, which extends the existing notion of a star convexity prior in two important ways: (i) it allows for multiple star centres as opposed to the single star of the original prior, and (ii) it generalises the shape constraint by allowing for geodesic paths as opposed to Euclidean rays. Global minima of our energy function can be obtained subject to these new constraints. We also introduce Geodesic Forests, which exploit the structure of shortest paths in implementing the extended constraints. These extensions to star convexity allow us to use such constraints in a practical segmentation system. This system is evaluated by means of a "robot user" to measure the amount of interaction required in a precise way, and it is shown that having shape constraints reduces user effort significantly compared to existing interactive systems.
We also introduce a new and harder dataset which augments the existing GrabCut dataset with more realistic images and ground truth taken from the PASCAL VOC segmentation challenge. In the latter part of the thesis, we bring in object-category-level information in order to make interactive segmentation tasks easier, and move towards fully automated semantic segmentation. An algorithm to automatically segment humans from cluttered images given their bounding boxes is presented. A top-down segmentation of the human is obtained using classifiers trained to predict segmentation masks from local HOG descriptors. These masks are then combined with bottom-up image information in a local GrabCut-like procedure. This algorithm is later completely automated to segment humans without requiring a bounding box, and is quantitatively compared with other semantic segmentation methods. We also introduce a novel way to acquire large quantities of segmented training data relatively effortlessly using the Kinect. In the final part of this work, we explore various semantic segmentation methods based on learning using bottom-up super-pixelisations. Different methods of combining multiple super-pixelisations are discussed and quantitatively evaluated on two segmentation datasets. We observe that simple combinations of independently trained classifiers on single super-pixelisations perform almost as well as complex methods based on jointly learning across multiple super-pixelisations. We also explore CRF-based formulations for semantic segmentation, and introduce a novel visual-words-based object boundary description in the energy formulation. The object appearance and boundary parameters are trained jointly using structured output learning methods, and the benefit of adding pairwise terms is quantified on two different datasets.
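The key idea behind replacing Euclidean rays with geodesic paths is that path cost is measured through the image: steps across strong intensity edges are expensive, so shortest paths bend around object boundaries. The following is a minimal sketch of that idea, not the thesis's Geodesic Forests implementation; the edge-cost formula and the `gamma` weight are illustrative assumptions:

```python
import heapq

def geodesic_distances(image, seeds, gamma=1.0):
    """Dijkstra over the 4-connected pixel grid.  The cost of a step
    between neighbouring pixels mixes unit spatial distance with a
    gamma-weighted intensity contrast, so strong image edges act as
    barriers to the shortest paths."""
    h, w = len(image), len(image[0])
    dist = {s: 0.0 for s in seeds}          # seed pixels are at distance 0
    heap = [(0.0, s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if d > dist.get((y, x), float('inf')):
            continue                         # stale heap entry, skip
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                step = 1.0 + gamma * abs(image[ny][nx] - image[y][x])
                nd = d + step
                if nd < dist.get((ny, nx), float('inf')):
                    dist[(ny, nx)] = nd
                    heapq.heappush(heap, (nd, (ny, nx)))
    return dist

# 1x4 toy "image" with a strong intensity edge between columns 1 and 2
img = [[0.0, 0.0, 1.0, 1.0]]
d = geodesic_distances(img, seeds=[(0, 0)], gamma=10.0)
print(d[(0, 1)], d[(0, 3)])                  # → 1.0 13.0
```

In Euclidean terms pixel (0, 3) is only three steps from the seed, but geodesically it is far away because the path must cross the edge, which is exactly why geodesic shape priors follow object boundaries better than straight rays.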




Variational and Level Set Methods in Image Segmentation


Book Description

Image segmentation consists of dividing an image domain into disjoint regions according to a characterization of the image within or in-between the regions; segmenting an image thus amounts to dividing its domain into relevant components. The efficient solution of the key problems in image segmentation promises to enable a rich array of useful applications. The current major application areas include robotics, medical image analysis, remote sensing, scene understanding, and image database retrieval. The subject of this book is image segmentation by variational methods, with a focus on formulations which use closed regular plane curves to define the segmentation regions and on a level set implementation of the corresponding active curve evolution algorithms. Each method is developed from an objective functional which embeds constraints on both the image domain partition of the segmentation and the image data within or in-between the partition regions. The necessary conditions to optimize the objective functional are then derived and solved numerically. The book covers, within the active curve and level set formalism, the basic two-region segmentation methods, multiregion extensions, region merging, image modeling, and motion-based segmentation. To treat various important classes of images, the modeling chapters investigate several parametric distributions, such as the Gaussian, Gamma, Weibull, and Wishart, as well as non-parametric models. In motion segmentation, both optical flow and the movement of real three-dimensional objects are studied.
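As one concrete instance of the objective functionals described above (the well-known two-region Chan-Vese model, cited here for illustration rather than as the book's only formulation), the energy of a closed curve $C$ and region constants $c_1, c_2$ for an image $I$ reads:

```latex
E(C, c_1, c_2) = \mu \,\mathrm{length}(C)
  + \lambda_1 \int_{\mathrm{inside}(C)} \bigl(I(x) - c_1\bigr)^2 \, dx
  + \lambda_2 \int_{\mathrm{outside}(C)} \bigl(I(x) - c_2\bigr)^2 \, dx
```

The first term regularizes the partition boundary, the two integrals enforce data fidelity inside and outside the curve ($c_1$ and $c_2$ become the region means at optimality), and the necessary conditions yield the curve evolution that is then implemented with level sets.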




Medical Image Computing and Computer Assisted Intervention – MICCAI 2020


Book Description

The seven-volume set LNCS 12261, 12262, 12263, 12264, 12265, 12266, and 12267 constitutes the refereed proceedings of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020, held in Lima, Peru, in October 2020. The conference was held virtually due to the COVID-19 pandemic. The 542 revised full papers presented were carefully reviewed and selected from 1809 submissions in a double-blind review process. The papers are organized in the following topical sections: Part I: machine learning methodologies; Part II: image reconstruction; prediction and diagnosis; cross-domain methods and reconstruction; domain adaptation; machine learning applications; generative adversarial networks; Part III: CAI applications; image registration; instrumentation and surgical phase detection; navigation and visualization; ultrasound imaging; video image analysis; Part IV: segmentation; shape models and landmark detection; Part V: biological, optical, microscopic imaging; cell segmentation and stain normalization; histopathology image analysis; ophthalmology; Part VI: angiography and vessel analysis; breast imaging; colonoscopy; dermatology; fetal imaging; heart and lung imaging; musculoskeletal imaging; Part VII: brain development and atlases; DWI and tractography; functional brain networks; neuroimaging; positron emission tomography




Bridging the Semantic Gap in Image and Video Analysis


Book Description

This book presents cutting-edge research on various ways to bridge the semantic gap in image and video analysis. The respective chapters address different stages of image processing: the first step is feature extraction, the second is segmentation, the third is object recognition, and the fourth and last is the semantic interpretation of the image. The semantic gap is a challenging area of research, and describes the difference between the low-level features extracted from an image and the high-level semantic meanings that people can derive from it. The result greatly depends on lower-level vision techniques, such as feature selection, segmentation, object recognition, and so on. The use of deep models has freed humans from manually selecting and extracting the set of features; deep learning does this automatically, developing more abstract features at the successive levels. The book offers a valuable resource for researchers, practitioners, students and professors in Computer Engineering, Computer Science and related fields whose work involves images, video analysis, image interpretation and so on.