Video Text Detection


Book Description

This book presents a systematic introduction to the latest developments in video text detection. Opening with a discussion of the underlying theory and a brief history of video text detection, the text proceeds to cover pre-processing and post-processing techniques, character segmentation and recognition, identification of non-English scripts, techniques for multi-modal analysis and performance evaluation. The detection of text from both natural video scenes and artificially inserted captions is examined. Various applications of the technology are also reviewed, from license plate recognition and road navigation assistance, to sports analysis and video advertising systems. Features: explains the fundamental theory in a succinct manner, supplemented with references for further reading; highlights practical techniques to help the reader understand and develop their own video text detection systems and applications; serves as an easy-to-navigate reference, presenting the material in self-contained chapters.




Cognitively Inspired Video Text Processing


Book Description

As technology advances rapidly, text detection and recognition is receiving special attention from researchers. Several real-time applications of video text processing have emerged that require cognitive-based methods. The main applications are (1) retrieving and indexing video based on the semantics of its content, (2) machine translation to assist foreigners, (3) helping blind people walk on the road freely without aid, (4) automatic vehicle driving, (5) license plate tracing to catch vehicles that violate traffic signals, (6) monitoring images posted on social media based on their text and content, (7) identifying locations based on the addresses of streets and shops, (8) tracing players in sports based on jersey or bib numbers or text, and (9) similarly, tracing bib numbers in marathons and other events. For all of these applications, text detection and recognition in video and natural scene images is an integral part of the system.




Automated System for Text Detection in Individual Video Images


Book Description

Text detection in video images is a challenging research problem because of the poor spatial resolution and complex background, which may contain a variety of colors. An automated system for text detection in video images is presented. It makes use of four modules to implement a series of processes to extract text regions from video images. The first module, called the multistage pulse code modulation (MPCM) module, is used to locate potential text regions in color video images. It converts a video image to a coded image, with each pixel encoded by a priority code ranging from 7 down to 0 in accordance with its priority, and further produces a binary thresholded image, which segments potential text regions from the background. The second module, called the text region detection module, applies a sequence of spatial filters to remove noisy regions and eliminate regions that are unlikely to contain text. The third module, called the text box finding module, merges text regions and produces boxes that are likely to contain text. Finally, the fourth module, called the optical character recognition (OCR) module, eliminates the text boxes that produce no OCR output. An extensive set of experiments is conducted and demonstrates that the proposed system is effective in detecting text in a wide variety of video images.
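
The four-module flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the description does not say how priorities are assigned, so the sketch simply assumes a grayscale image and takes the top three bits of each pixel as its 7-to-0 code. The spatial filters are reduced to a connected-component size check, box finding to a merge of horizontally adjacent regions, and the OCR engine to a caller-supplied function. All names and thresholds (`priority_codes`, `min_code`, `min_pixels`, `max_gap`, `ocr`) are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale rows, values 0..255 (assumption)
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def priority_codes(image: Image) -> Image:
    """Stand-in for module 1: map each pixel to a 3-bit code, 7 (brightest) to 0."""
    return [[pixel >> 5 for pixel in row] for row in image]


def threshold(codes: Image, min_code: int = 6) -> Image:
    """Binary image: 1 where the code reaches min_code (hypothetical cutoff)."""
    return [[1 if code >= min_code else 0 for code in row] for row in codes]


def find_regions(binary: Image, min_pixels: int = 3) -> List[Box]:
    """Stand-in for module 2: 4-connected regions, dropping tiny ones as noise."""
    height = len(binary)
    width = len(binary[0]) if binary else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:  # flood fill one region
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(pixels) >= min_pixels:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes


def merge_boxes(boxes: List[Box], max_gap: int = 1) -> List[Box]:
    """Stand-in for module 3: merge boxes that share rows and sit close together."""
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        for i, m in enumerate(merged):
            rows_overlap = box[0] <= m[2] and m[0] <= box[2]
            if rows_overlap and box[1] - m[3] - 1 <= max_gap:
                merged[i] = (min(m[0], box[0]), min(m[1], box[1]),
                             max(m[2], box[2]), max(m[3], box[3]))
                break
        else:
            merged.append(box)
    return merged


def detect_text(image: Image, ocr: Callable[[Image, Box], str]) -> List[Box]:
    """Module 4: keep only the boxes for which the OCR callable returns text."""
    binary = threshold(priority_codes(image))
    boxes = merge_boxes(find_regions(binary))
    return [box for box in boxes if ocr(image, box).strip()]
```

In practice the `ocr` argument would wrap a real OCR engine; any function that takes the image and a box and returns a string fits this sketch.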







Advances in Multimedia Information Processing -- PCM 2010, Part I


Book Description

The 2010 Pacific-Rim Conference on Multimedia (PCM 2010) was held in Shanghai at Fudan University, during September 21–24, 2010. Since its inauguration in 2000, PCM has been held in various places around the Pacific Rim, namely Sydney (PCM 2000), Beijing (PCM 2001), Hsinchu (PCM 2002), Singapore (PCM 2003), Tokyo (PCM 2004), Jeju (PCM 2005), Zhejiang (PCM 2006), Hong Kong (PCM 2007), Tainan (PCM 2008), and Bangkok (PCM 2009). PCM is a major annual international conference organized as a forum for the dissemination of state-of-the-art technological advances and research results in the fields of theoretical, experimental, and applied multimedia analysis and processing. PCM 2010 featured a comprehensive technical program which included 75 oral and 56 poster presentations selected from 261 submissions from Australia, Canada, China, France, Germany, Hong Kong, India, Iran, Italy, Japan, Korea, Myanmar, Norway, Singapore, Taiwan, Thailand, the UK, and the USA. Three distinguished researchers, Prof. Zhi-Hua Zhou from Nanjing University, Dr. Yong Rui from Microsoft, and Dr. Tie-Yan Liu from Microsoft Research Asia delivered three keynote talks to the conference. We are very grateful to the many people who helped to make this conference a success. We would like to especially thank Hong Lu for local organization, Qi Zhang for handling the publication of the proceedings, and Cheng Jin for looking after the conference website and publicity. We thank Fei Wu for organizing the special session on large-scale multimedia search in the social network settings.







Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014


Book Description

This volume contains 87 papers presented at FICTA 2014: Third International Conference on Frontiers in Intelligent Computing: Theory and Applications. The conference was held on 14-15 November 2014 in Bhubaneswar, Odisha, India. This volume contains papers mainly focused on Network and Information Security, Grid Computing and Cloud Computing, Cyber Security and Digital Forensics, Computer Vision, Signal, Image & Video Processing, Software Engineering in Multidisciplinary Domains and Ad-hoc and Wireless Sensor Networks.




Advances in Multimedia Information Processing - PCM 2008


Book Description

This book constitutes the refereed proceedings of the 9th Pacific Rim Conference on Multimedia, PCM 2008, held in Tainan, Taiwan, in December 2008. The 79 revised full papers and 39 revised posters presented were carefully reviewed and selected from 210 submissions. The papers are organized in topical sections on next generation video coding techniques, audio processing and classification, interactive multimedia systems, advances in H.264/AVC, multimedia networking techniques, advanced image processing techniques, video analysis and its applications, image detection and classification, visual and spatial analyses, multimedia human computer interfaces, multimedia security and DRM, advanced image and video processing, multimedia database and retrieval, multimedia management and authoring, multimedia personalization, multimedia for e-learning, multimedia networking techniques, multimedia systems and applications, advanced multimedia techniques, as well as multimedia processing and analyses.




Advanced Internet Based Systems and Applications


Book Description

This book constitutes the thoroughly refereed post-conference proceedings of the Second International Conference on Signal-Image Technology and Internet-Based Systems, SITIS 2006, held in Hammamet, Tunisia, in December 2006. The 33 full papers were carefully reviewed and selected from the best papers presented at the conference and are presented in revised and extended form. Some of the papers focus on the emerging modeling, representation and retrieval techniques that take into account the amount, type and diversity of information accessible in distributed computing environments. Other contributions are devoted to emerging and novel concepts, architectures and methodologies for creating an interconnected world in which information can be exchanged easily, tasks can be processed collaboratively, and communities of users with similar interests can be formed while addressing security threats that are present more than ever before.




Automatic Text Detection and Tracking in Digital Video


Book Description

Text that either appears in a scene or is graphically added to video can provide an important supplemental source of index information, as well as clues for decoding the video's structure and for classification. In this paper we present algorithms for detecting and tracking text components that appear within digital video frames. Our system implements a scale-space feature extractor that feeds an artificial neural processor to extract textual regions and track their movement over time. The extracted regions can then be used as input to an appropriate Optical Character Recognition system, which produces indexable keywords.