Automatic Temporal Alignment of AV Data with Confidence Estimation

In this paper, we propose a new approach to the automatic, audio-based temporal alignment of audio-visual data recorded by different cameras, camcorders, or mobile phones during social events, together with an estimate of the confidence of each alignment. All recordings are temporally aligned, based on ASR-related features, with a common master track recorded by a reference camera, and the corresponding alignment confidence is estimated. The core of the algorithm is a perceptual time-frequency analysis with a precision of 10 ms. On a real-life dataset, the method aligns 99% of cases correctly and outperforms cross-correlation while having lower system requirements.
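
The abstract does not describe the feature extraction or the confidence measure in detail, so the following is only an illustrative sketch of the general idea (locating a clip within a master track on a 10 ms frame grid and attaching a confidence value), not the paper's algorithm. It substitutes plain log frame energy for the perceptual time-frequency features and uses a normalized correlation search, which is closer to the cross-correlation baseline the paper compares against. The peak-margin confidence, the synthetic test signal, and all function names are assumptions made for illustration.

```python
import math
import random

FRAME_SEC = 0.010  # 10 ms frame hop, matching the precision quoted in the abstract


def frame_energies(samples, sample_rate):
    """Log energy per non-overlapping 10 ms frame (stand-in feature, an assumption)."""
    hop = int(round(sample_rate * FRAME_SEC))
    feats = []
    for i in range(len(samples) // hop):
        frame = samples[i * hop:(i + 1) * hop]
        energy = sum(x * x for x in frame) / hop
        feats.append(math.log(energy + 1e-12))
    return feats


def normalized_score(master, clip, offset):
    """Pearson correlation between clip and the master segment starting at offset."""
    seg = master[offset:offset + len(clip)]
    n = len(clip)
    mean_a = sum(seg) / n
    mean_b = sum(clip) / n
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(seg, clip))
    den_a = math.sqrt(sum((a - mean_a) ** 2 for a in seg))
    den_b = math.sqrt(sum((b - mean_b) ** 2 for b in clip))
    if den_a == 0.0 or den_b == 0.0:
        return 0.0
    return num / (den_a * den_b)


def align(master_feats, clip_feats):
    """Return (offset of the clip in the master in seconds, confidence)."""
    max_offset = len(master_feats) - len(clip_feats)
    if max_offset < 0:
        raise ValueError("clip is longer than the master track")
    scores = [normalized_score(master_feats, clip_feats, k)
              for k in range(max_offset + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    # Hypothetical confidence: margin between the best score and the best
    # competing score more than two frames away from the peak.
    rivals = [s for k, s in enumerate(scores) if abs(k - best) > 2]
    runner_up = max(rivals) if rivals else 0.0
    confidence = max(0.0, scores[best] - max(runner_up, 0.0))
    return best * FRAME_SEC, confidence


def synthetic_track(seed, n_frames, sample_rate):
    """Noise whose loudness changes every 10 ms frame (test signal only)."""
    rng = random.Random(seed)
    hop = int(round(sample_rate * FRAME_SEC))
    samples = []
    for _ in range(n_frames):
        amp = rng.uniform(0.1, 1.0)
        samples.extend(amp * rng.uniform(-1.0, 1.0) for _ in range(hop))
    return samples


# Usage: cut a 0.6 s clip starting 0.5 s into a 2 s master track and recover the offset.
sample_rate = 8000
master = synthetic_track(seed=0, n_frames=200, sample_rate=sample_rate)
clip = master[4000:4000 + 4800]
offset_sec, confidence = align(frame_energies(master, sample_rate),
                               frame_energies(clip, sample_rate))
print(f"estimated offset: {offset_sec:.2f} s, confidence: {confidence:.2f}")
```

In this toy setup the clip is an exact, frame-aligned excerpt of the master, so the search is easy; real recordings from different devices differ in noise, gain, and sampling clock, which is the situation the paper's perceptual features and confidence estimate are meant to address.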


Presented at:
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA
Year:
2010
Publisher:
P.O. Box 592, CH-1920 Martigny, Switzerland

 Record created 2010-02-11, last modified 2018-03-17
