: Ling Shao, Caifeng Shan, Jiebo Luo, Minoru Etoh
: Multimedia Interaction and Intelligent User Interfaces: Principles, Methods and Applications
: Springer-Verlag
: 9781849965071
: Advances in Computer Vision and Pattern Recognition
: 1
: CHF 87.00
: Application Software
: English
: 302
: Watermark
: PC/MAC/eReader/Tablet
: PDF
Consumer electronics (CE) devices, which provide multimedia entertainment and enable communication, have become ubiquitous in daily life. However, interacting with such equipment currently requires devices such as remote controls and keyboards, which are often inconvenient, ambiguous and non-interactive. An important challenge for the modern CE industry is to design user interfaces for CE products that enable natural, intuitive and fun interaction. As many CE products are supplied with microphones and cameras, the exploitation of both audio and visual information for interactive multimedia is a growing field of research.
Collecting together contributions from an international selection of experts, including leading researchers in industry, this unique text presents the latest advances in applications of multimedia interaction and user interfaces for consumer electronics. Covering issues of both multimedia content analysis and human-machine interaction, the book examines a wide range of techniques from computer vision, machine learning, audio and speech processing, communications, artificial intelligence and media technology.
Topics and features: introduces novel, computationally efficient algorithms to extract semantically meaningful audio-visual events; investigates modality allocation in intelligent multimodal presentation systems, taking into account the cognitive impacts of modality on human information processing; provides an overview of gesture control technologies for CE; presents systems for natural human-computer interaction, virtual content insertion, and human action retrieval; examines techniques for 3D face pose estimation, physical activity recognition, and video summary quality evaluation; discusses the features that characterize the new generation of CE and examines how web services can be integrated with CE products for improved user experience.
This book is an essential resource for researchers and practitioners from both academia and industry working in areas of multimedia analysis, human-computer interaction and interactive user interfaces. Graduate students studying computer vision, pattern recognition and multimedia will also find this a useful reference.
Preface
Contents
Retrieving Human Actions Using Spatio-Temporal Features and Relevance Feedback
Introduction
Action Retrieval Scheme
Action Retrieval Framework
Spatio-Temporal Interest Point Detection
Feature Description
Codebook Formation and Action Video Representation
Similarity Matching Scheme
Action Retrieval on the KTH Dataset
Dataset Processing
Performance Evaluation
Summary for Experiments on the KTH Dataset
Realistic Action Retrieval in Movies
Challenges of This Task
Implementation
Result Demonstration
Discussion
Application
Conclusions
References
Computationally Efficient Clustering of Audio-Visual Meeting Data
Introduction
Background
Challenges in Meeting Analysis
Background on Speaker Diarization
Background on Audio-Visual Synchrony
Human Body Motions in Conversations
Approach
The Augmented MultiParty Interaction (AMI) Corpus
Audio Speaker Diarization
Traditional Offline Speaker Diarization
Feature Extraction
Speech/Nonspeech Detection
Speaker Segmentation and Clustering
Online Speaker Diarization
Unsupervised Bootstrapping of Speaker Models
Speaker Recognition
A Note on Model Order Selection
Summary of the Diarization Performance
Extracting Computationally Efficient Video Features
Estimating Personal Activit