Synthesis Lectures on Computer Vision
2 total works
Background subtraction is a widely used concept for detecting moving objects in videos. Over the last two decades, algorithms for background subtraction have developed considerably and have been widely used in important applications such as visual surveillance, sports video analysis, and motion capture. Various statistical approaches have been proposed to model scene backgrounds. The concept of background subtraction has also been extended to detect objects in videos captured from moving cameras. This book reviews the concept and practice of background subtraction. We discuss several traditional statistical background subtraction models, including the widely used parametric Gaussian mixture models and non-parametric models. We also discuss the issue of shadow suppression, which is essential for human motion analysis applications. This book discusses approaches and tradeoffs for background maintenance, and it reviews many of the recent developments in the background subtraction paradigm. Recent advances in developing algorithms for background subtraction from moving cameras are described, including motion-compensation-based approaches and motion-segmentation-based approaches.
In its early years, the field of computer vision was largely motivated by researchers seeking computational models of biological vision and solutions to practical problems in manufacturing, defense, and medicine. For the past two decades or so, there has been an increasing interest in computer vision as an input modality in the context of human-computer interaction. Such vision-based interaction can endow interactive systems with visual capabilities similar to those important to human-human interaction, in order to perceive non-verbal cues and incorporate this information in applications such as interactive gaming, visualization, art installations, intelligent agent interaction, and various kinds of command and control tasks. Enabling this kind of rich, visual, and multimodal interaction requires interactive-time solutions to problems such as detecting and recognizing faces and facial expressions, determining a person's direction of gaze and focus of attention, tracking movement of the body, and recognizing various kinds of gestures. In building technologies for vision-based interaction, there are choices to be made as to the range of possible sensors employed (e.g., single camera, stereo rig, depth camera), the precision and granularity of the desired outputs, the mobility of the solution, usability issues, etc. Practical considerations dictate that there is not a one-size-fits-all solution to the variety of interaction scenarios; however, there are principles and methodological approaches common to a wide range of problems in the domain. While new sensors such as the Microsoft Kinect are having a major influence on the research and practice of vision-based interaction in various settings, they are just a starting point for continued progress in the area.
In this book, we discuss the landscape of history, opportunities, and challenges in this area of vision-based interaction; we review the state of the art and seminal works in detecting and recognizing the human body and its components; we explore both static and dynamic approaches to "looking at people" vision problems; and we place the computer vision work in the context of other modalities and multimodal applications. Readers should gain a thorough understanding of current and future possibilities of computer vision technologies in the context of human-computer interaction.