Infoscience

conference paper

Phonological Posterior Hashing for Query by Example Spoken Term Detection

Asaei, Afsaneh • Ram, Dhananjay • Bourlard, Hervé
2018
19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6
19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018)

State-of-the-art query-by-example spoken term detection (QbE-STD) systems in zero-resource conditions represent speech as sequences of class-conditional posterior probabilities estimated by a deep neural network (DNN). The posteriors are often used for pattern matching or dynamic time warping (DTW). Using posterior probabilities as the speech representation offers several advantages in a classification system. One key property of posterior representations is that they admit a highly effective hashing strategy, which enables indexing a large audio archive in divisions to reduce search complexity. Moreover, posterior indexing leads to a compressed representation and enables pronunciation dewarping and partial detection without DTW. We exploit these characteristics of the posterior space in the context of redundant hash addressing for QbE-STD. We evaluate the QbE-STD system on the AMI corpus and demonstrate that it achieves a tremendous speedup and superior accuracy compared to the state-of-the-art pattern matching solution based on DTW. The system has the potential to enable massively large-scale spoken query detection.
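
To make the idea of hashing posteriors for indexing more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each posterior vector is binarized with a fixed threshold to form a hash key, that runs of identical codes are collapsed as a simple stand-in for pronunciation dewarping, and that matches are ranked by voting on the offset between query and archive positions. The function names (`binarize`, `collapse_runs`, `encode`, `build_index`, `search`), the threshold of 0.5, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def binarize(posterior, threshold=0.5):
    """Turn one posterior vector into a tuple of 0/1 bits (assumed hash key)."""
    return tuple(1 if p >= threshold else 0 for p in posterior)


def collapse_runs(codes):
    """Drop consecutive repeats so that duration differences disappear.

    This is an assumed, simplified stand-in for pronunciation dewarping.
    """
    collapsed = []
    for code in codes:
        if not collapsed or collapsed[-1] != code:
            collapsed.append(code)
    return collapsed


def encode(frames, threshold=0.5):
    """Binarize every frame, then collapse runs of identical codes."""
    return collapse_runs([binarize(f, threshold) for f in frames])


def build_index(archive, threshold=0.5):
    """Map each hash key to the (utterance id, position) pairs where it occurs."""
    index = defaultdict(list)
    for utt_id, frames in archive.items():
        for pos, code in enumerate(encode(frames, threshold)):
            index[code].append((utt_id, pos))
    return index


def search(index, query_frames, threshold=0.5):
    """Vote for (utterance id, offset) pairs; consistent offsets collect votes."""
    votes = defaultdict(int)
    for q_pos, code in enumerate(encode(query_frames, threshold)):
        for utt_id, pos in index.get(code, ()):
            votes[(utt_id, pos - q_pos)] += 1
    # Highest vote count first; ties broken by utterance id and offset.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy archive: three-class posteriors, one vector per frame.
archive = {
    "utt1": [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1],
             [0.1, 0.1, 0.8], [0.05, 0.05, 0.9]],
    "utt2": [[0.1, 0.1, 0.8], [0.9, 0.05, 0.05]],
}
query = [[0.1, 0.9, 0.0], [0.1, 0.7, 0.2], [0.0, 0.1, 0.9]]

index = build_index(archive)
results = search(index, query)
print(results)  # [(('utt1', 1), 2), (('utt2', -1), 1)]
```

In this toy run the query collapses to two codes that both agree on offset 1 in `utt1`, so that candidate ranks first, while `utt2` shares only one code and counts as a partial hit. Each query code costs one dictionary lookup, which is the reason a hash index can avoid the frame-by-frame alignment that DTW performs.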

Type
conference paper
DOI
10.21437/Interspeech.2018-1973
Web of Science ID

WOS:000465363900434

Author(s)
Asaei, Afsaneh  
Ram, Dhananjay
Bourlard, Hervé
Date Issued

2018

Publisher

ISCA (International Speech Communication Association)

Publisher place

Baixas

Published in
19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6
ISBN of the book

978-1-5108-7221-9

Series title/Series vol.

Interspeech

Start page

2067

End page

2071

Subjects

posterior probability structures • posterior hashing • pronunciation dewarping • structural similarity measure • query by example • spoken term detection

URL

Related documents

http://publications.idiap.ch/downloads/papers/2018/Asaei_PHONOLOGICALPOSTERIORHASHINGFORQUERYBYEXAMPLESPOKENTERMDETECTION_2018.pdf

Related documents

http://publications.idiap.ch/index.php/publications/showcite/Asaei_Idiap-RR-31-2016
Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Event name
19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018)

Event place

Hyderabad, India

Event date

Sep 02-06, 2018

Available on Infoscience
July 26, 2018
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/147495