Probabilistic Models and Inference for Multi-View People Detection in Overlapping Depth Images


Book Description

In this work, the task of wide-area indoor people detection in a network of depth sensors is examined. In particular, we investigate how the redundant and complementary multi-view information, including the temporal context, can be jointly leveraged to improve the detection performance. We recast the problem of multi-view people detection in overlapping depth images as an inverse problem and present a generative probabilistic framework to jointly exploit the temporal multi-view image evidence.




Computational, label, and data efficiency in deep learning for sparse 3D data


Book Description

Deep learning is widely applied to sparse 3D data to perform challenging tasks, e.g., 3D object detection and semantic segmentation. However, the high performance of deep learning comes with high costs, including computational costs and the effort to capture and label data. This work investigates and improves the efficiency of deep learning for sparse 3D data to overcome the obstacles to the further development of this technology.




Light Field Imaging for Deflectometry


Book Description

Optical measurement methods are becoming increasingly important for high-precision production of components and quality assurance. The increasing demand can be met by modern imaging systems with advanced optics, such as light field cameras. This work explores their use in the deflectometric measurement of specular surfaces. It presents improvements in phase unwrapping and calibration techniques, enabling high surface reconstruction accuracies using only a single monocular light field camera.




Model-based Filtering of Interfering Signals in Ultrasonic Time Delay Estimations


Book Description

This work presents model-based algorithmic approaches for interference-invariant time delay estimation that are specifically suited to estimating small time delay differences at a resolution well below the sampling time. The methods are therefore particularly well suited to transit-time ultrasonic flow measurements, where the problem of interfering signals is especially prominent.
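
To illustrate what a time delay resolution below the sampling time means, here is a minimal sketch of a generic baseline: cross-correlation followed by parabolic interpolation of the correlation peak. It is not the model-based method of the book and does not handle interfering signals. The function names, the Gaussian test pulse, and all parameter values are assumptions chosen for illustration.

```python
import math


def cross_correlation(x, y, max_lag):
    """Return {lag: sum_n x[n] * y[n + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(x)):
            j = i + lag
            if 0 <= j < len(y):
                total += x[i] * y[j]
        result[lag] = total
    return result


def estimate_delay(x, y, max_lag):
    """Estimate the delay of y relative to x in (fractional) samples."""
    corr = cross_correlation(x, y, max_lag)
    peak = max(corr, key=corr.get)  # integer lag with the largest correlation
    if peak in (-max_lag, max_lag):
        return float(peak)  # no neighbours on both sides, skip refinement
    c_prev, c_peak, c_next = corr[peak - 1], corr[peak], corr[peak + 1]
    denom = c_prev - 2.0 * c_peak + c_next
    if denom == 0.0:
        return float(peak)
    # Vertex of the parabola through the three points around the peak.
    return peak + 0.5 * (c_prev - c_next) / denom


def pulse(t, center, width=4.0):
    """Hypothetical Gaussian test pulse (an assumption, not a transducer model)."""
    return math.exp(-((t - center) / width) ** 2)


true_delay = 2.3  # in samples, deliberately not an integer
x = [pulse(t, 30.0) for t in range(64)]
y = [pulse(t, 30.0 + true_delay) for t in range(64)]
estimated = estimate_delay(x, y, max_lag=10)
print(f"true delay: {true_delay}, estimated: {estimated:.3f}")
```

The integer peak alone could only resolve the delay to a whole sample; the parabolic fit recovers the fractional part for this clean, interference-free pulse. With overlapping interfering signals the correlation peak becomes biased, which is the situation the book's model-based approaches address.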




Machine Learning for Camera-Based Monitoring of Laser Welding Processes


Book Description

The increasing use of automated laser welding processes places high demands on process monitoring. This work demonstrates methods that use a camera mounted on the focusing optics to perform pre-, in-, and post-process monitoring of welding processes. The implementation uses machine learning methods. All algorithms are designed with integration into industrial processes in mind, which poses several challenges: a small database, limited inference hardware in industrial manufacturing, and user acceptance.




Reconstruction from Spatio-Spectrally Coded Multispectral Light Fields


Book Description

In this work, spatio-spectrally coded multispectral light fields, as captured by a light field camera with a spectrally coded microlens array, are investigated. For the reconstruction of the coded light fields, two methods are developed: one based on the principles of compressed sensing and one based on deep learning. Using novel synthetic as well as real-world datasets, the proposed reconstruction approaches are evaluated in detail.




Artificial Intelligence Applications and Innovations


Book Description

This book constitutes the refereed proceedings of the 12th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2016, and three parallel workshops, held in Thessaloniki, Greece, in September 2016. The workshops are the Third Workshop on New Methods and Tools for Big Data, MT4BD 2016, the 5th Mining Humanistic Data Workshop, MHDW 2016, and the First Workshop on 5G - Putting Intelligence to the Network Edge, 5G-PINE 2016. The 30 revised full papers and 8 short papers presented at the main conference were carefully reviewed and selected from 65 submissions. The 17 revised full papers and 7 short papers presented at the 3 parallel workshops were selected from 33 submissions. The papers cover a broad range of topics such as artificial neural networks, classification, clustering, control systems - robotics, data mining, engineering application of AI, environmental applications of AI, feature reduction, filtering, financial-economics modeling, fuzzy logic, genetic algorithms, hybrid systems, image and video processing, medical AI applications, multi-agent systems, ontology, optimization, pattern recognition, support vector machines, text mining, and Web-social media data AI modeling.




Models for Multi-view Object Class Detection


Book Description

Learning how to detect objects from many classes in a wide variety of viewpoints is a key goal of computer vision. Existing approaches, however, require excessive amounts of training data. Implementors need to collect numerous training images not only to cover changes in the same object's shape due to the viewpoint variation, but also to accommodate the variability in appearance among instances of the same class. We introduce the Potemkin model, which exploits the relationship between 3D objects and their 2D projections for efficient and effective learning. The Potemkin model can be constructed from a few views of an object of the target class. We use the Potemkin model to transform images of objects from one view to several other views, effectively multiplying their value for class detection. This approach can be coupled with any 2D image-based detection system. We show that automatically transformed images dramatically decrease the data requirements for multi-view object class detection. The Potemkin model also allows detection systems to reconstruct the 3D shapes of detected objects automatically from a single 2D image. This reconstruction generates realistic views of 3D models, and also provides accurate 3D information for entire objects. We demonstrate its usefulness in three applications: robot manipulation, object detection using 2.5D data, and generating 3D 'pop-up' models from photos.




Practical Machine Learning for Computer Vision


Book Description

This practical book shows you how to employ machine learning models to extract information from images. ML engineers and data scientists will learn how to solve a variety of image problems including classification, object detection, autoencoders, image generation, counting, and captioning with proven ML techniques. This book provides a great introduction to end-to-end deep learning: dataset creation, data preprocessing, model design, model training, evaluation, deployment, and interpretability. Google engineers Valliappa Lakshmanan, Martin Görner, and Ryan Gillard show you how to develop accurate and explainable computer vision ML models and put them into large-scale production using robust ML architecture in a flexible and maintainable way. You'll learn how to design, train, evaluate, and predict with models written in TensorFlow or Keras. You'll learn how to: design ML architecture for computer vision tasks; select a model (such as ResNet, SqueezeNet, or EfficientNet) appropriate to your task; create an end-to-end ML pipeline to train, evaluate, deploy, and explain your model; preprocess images for data augmentation and to support learnability; incorporate explainability and responsible AI best practices; deploy image models as web services or on edge devices; and monitor and manage ML models.




Representations and Techniques for 3D Object Recognition and Scene Interpretation


Book Description

One of the grand challenges of artificial intelligence is to enable computers to interpret 3D scenes and objects from imagery. This book organizes and introduces major concepts in 3D scene and object representation and inference from still images, with a focus on recent efforts to fuse models of geometry and perspective with statistical machine learning. The book is organized into three sections: (1) Interpretation of Physical Space; (2) Recognition of 3D Objects; and (3) Integrated 3D Scene Interpretation. The first discusses representations of spatial layout and techniques to interpret physical scenes from images. The second section introduces representations for 3D object categories that account for the intrinsically 3D nature of objects and provide robustness to change in viewpoints. The third section discusses strategies to unite inference of scene geometry and object pose and identity into a coherent scene interpretation. Each section broadly surveys important ideas from cognitive science and artificial intelligence research, organizes and discusses key concepts and techniques from recent work in computer vision, and describes a few sample approaches in detail. Newcomers to computer vision will benefit from introductions to basic concepts, such as single-view geometry and image classification, while experts and novices alike may find inspiration from the book's organization and discussion of the most recent ideas in 3D scene understanding and 3D object recognition. Specific topics include: mathematics of perspective geometry; visual elements of the physical scene, structural 3D scene representations; techniques and features for image and region categorization; historical perspective, computational models, and datasets and machine learning techniques for 3D object recognition; inferences of geometrical attributes of objects, such as size and pose; and probabilistic and feature-passing approaches for contextual reasoning about 3D objects and scenes. 
Table of Contents: Background on 3D Scene Models / Single-view Geometry / Modeling the Physical Scene / Categorizing Images and Regions / Examples of 3D Scene Interpretation / Background on 3D Recognition / Modeling 3D Objects / Recognizing and Understanding 3D Objects / Examples of 2D 1/2 Layout Models / Reasoning about Objects and Scenes / Cascades of Classifiers / Conclusion and Future Directions