In this paper, we present the results of a study on the impact of excitation frequency on the synchronisation of short-term recordings and on confidence estimation for multisource audiovisual data recorded by different personal capturing devices during social events. The core of the algorithm is perceptual time-quefrency analysis with a precision of 10 ms. Results obtained to date on a hand-labelled dataset of more than 14 hours show that excitation frequency has a positive impact on temporal synchronisation (98.19% precision for 5 s recordings) and on confidence estimation (99.08% precision with 100% recall for 5 s recordings). These results surpass the performance of fast cross-correlation while imposing lower system requirements.
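
For orientation, the fast cross-correlation baseline mentioned above can be sketched as follows. This is a minimal illustration of a generic FFT-based offset estimator, not the time-quefrency method studied in this paper. The function name `estimate_offset`, its parameters, and the assumption that both recordings are mono arrays at the same sample rate are hypothetical choices made for the example.

```python
import numpy as np

def estimate_offset(reference, other, sample_rate):
    """Estimate the delay of `other` relative to `reference`, in seconds.

    Generic FFT-based cross-correlation (assumed baseline form); both
    inputs are assumed to be 1-D arrays sampled at `sample_rate` Hz.
    """
    n = len(reference) + len(other) - 1
    size = 1 << (n - 1).bit_length()  # zero-pad to the next power of two
    ref_spec = np.fft.rfft(reference, size)
    oth_spec = np.fft.rfft(other, size)
    # Circular cross-correlation; zero-padding makes it equal to the linear one.
    corr = np.fft.irfft(oth_spec * np.conj(ref_spec), size)
    lag = int(np.argmax(corr))
    if lag >= len(other):  # indices past len(other) - 1 encode negative lags
        lag -= size
    return lag / sample_rate

# Synthetic check: delay a noise signal by 50 samples at 1 kHz.
rng = np.random.default_rng(0)
ref = rng.standard_normal(8000)
delayed = np.concatenate([np.zeros(50), ref[:-50]])
offset = estimate_offset(ref, delayed, 1000)
```

A positive return value means `other` starts later than `reference`. The resolution of this estimator is one sample period, and its cost is dominated by FFTs over the full padded length of both recordings.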