Infoscience

conference paper

Supervised and unsupervised Web-based language model domain adaptation

Lecorvé, Gwénolé • Dines, John • Hain, Thomas • Motlicek, Petr
2012
Interspeech 2012
Interspeech

Domain language model (LM) adaptation consists of re-estimating the probabilities of a baseline LM to better match the specifics of a given broad topic of interest. A common strategy is to retrieve adaptation texts from the Web based on a given domain-representative seed text. In this paper, we study how the selection of this seed text influences the adaptation process and the performance of the resulting adapted language models in automatic speech recognition. More precisely, the goal of this study is to analyze how our Web-based adaptation approach differs between the supervised case, in which the seed text is manually generated, and the unsupervised case, in which the seed text is given by an automatic transcript. Experiments were carried out on data from a real-world use case: videos produced for a university YouTube channel. Results show that our approach is quite robust, since unsupervised adaptation provides performance similar to the supervised case in terms of overall perplexity and word error rate.

Type
conference paper
DOI
10.21437/Interspeech.2012-62
Author(s)
Lecorvé, Gwénolé
Dines, John  
Hain, Thomas
Motlicek, Petr
Date Issued
2012
Published in
Interspeech 2012
Start page
182
End page
185
Subjects
ASR • Automatic Speech Recognition • domain adaptation • Language Models • supervision • Web data
URL
Related documents
http://publications.idiap.ch/index.php/publications/showcite/Lecorve_Idiap-RR-22-2012
Written at
EPFL
EPFL units
LIDIAP
Event name
Interspeech
Event place
Portland, Oregon, USA

Available on Infoscience
December 19, 2013
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/98281