audiovisual signal processing
This paper presents methods for evaluating video monitoring systems that analyze the situation during a meeting, detect and track participants, and record speakers' talks. The complexity of a video monitoring system depends on the conditions and limitations imposed by the specific application, the geometric parameters of the observation area, and the signal quality. The accuracy of such systems is evaluated with multiple object tracking metrics, namely tracking precision and tracking accuracy.
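The text names the multiple object tracking precision and accuracy metrics but does not give their formulas. The sketch below assumes the widely used CLEAR MOT definitions (accuracy as one minus the ratio of misses, false positives and identity switches to ground-truth objects; precision as the mean distance over matched object-hypothesis pairs). The `FrameStats` container and its field names are hypothetical and serve only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameStats:
    """Per-frame tracking counts (hypothetical container, not from the paper)."""
    ground_truth: int             # number of annotated participants in the frame
    misses: int                   # annotated participants with no matched hypothesis
    false_positives: int          # hypotheses with no matched participant
    id_switches: int              # matches whose track identity changed
    match_distances: List[float]  # one distance per matched pair


def mota(frames: List[FrameStats]) -> float:
    """Tracking accuracy, assuming the CLEAR MOT definition."""
    total_gt = sum(f.ground_truth for f in frames)
    if total_gt == 0:
        raise ValueError("no ground-truth objects in the sequence")
    errors = sum(f.misses + f.false_positives + f.id_switches for f in frames)
    return 1.0 - errors / total_gt


def motp(frames: List[FrameStats]) -> float:
    """Tracking precision as mean matched-pair distance (lower is better here)."""
    total_matches = sum(len(f.match_distances) for f in frames)
    if total_matches == 0:
        raise ValueError("no matched pairs in the sequence")
    total_distance = sum(sum(f.match_distances) for f in frames)
    return total_distance / total_matches


if __name__ == "__main__":
    # Made-up counts for two frames, purely to exercise the functions.
    frames = [
        FrameStats(4, 1, 0, 0, [0.1, 0.2, 0.3]),
        FrameStats(4, 0, 1, 1, [0.2, 0.2, 0.2, 0.2]),
    ]
    print("MOTA:", mota(frames))
    print("MOTP:", motp(frames))
```

Whether precision is reported as a distance or as an overlap ratio depends on the matching criterion, so the direction of "better" should be checked against the evaluation protocol actually used.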
Several criteria were used to evaluate the proposed algorithm for speaker detection and recording: (1) the initial delay between the audio and video streams; (2) the length of the recorded AVI file; (3) the number of duplicated frames.
During the experiments, 36 AVI files were recorded in the discussion work mode. Manual checking showed that 89% of the files contain a speaker's speech and 11% are false alarms that contain only noise. Such noise arises when a tester stands up from a chair, because the chair's mechanical parts are loud at that moment. Errors in detecting sitting participants also contribute to the appearance of false files.
The experiments show that, on average, 40% of the frames in an AVI file are duplicated. These comprise frames added to eliminate the initial delay between the audio and video streams and frames added during synchronization.
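The duplicated-frame share reported above can be measured in several ways. The following is a minimal sketch that assumes a duplicated frame is an exact repeat of the frame immediately before it and that frames are available as comparable objects such as raw byte strings. Both assumptions, and the function name `duplicated_fraction`, are illustrative and are not taken from the paper.

```python
from typing import Sequence


def duplicated_fraction(frames: Sequence[bytes]) -> float:
    """Share of frames that exactly repeat the preceding frame.

    Assumption: padding frames inserted for delay compensation or
    synchronization are byte-identical copies of the previous frame.
    """
    if not frames:
        raise ValueError("empty frame sequence")
    duplicates = sum(
        1 for previous, current in zip(frames, frames[1:]) if current == previous
    )
    return duplicates / len(frames)


if __name__ == "__main__":
    # Toy sequence: the second and fifth frames repeat their predecessors.
    toy = [b"a", b"a", b"b", b"c", b"c"]
    print(duplicated_fraction(toy))
```

Compressed video would first have to be decoded, and lossy coding can make repeated frames differ slightly, in which case a similarity threshold would replace the exact comparison.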