000192494 001__ 192494
000192494 005__ 20190316235800.0
000192494 037__ $$aCONF
000192494 245__ $$aSupervised and unsupervised Web-based language model domain adaptation
000192494 269__ $$a2012
000192494 260__ $$c2012
000192494 336__ $$aConference Papers
000192494 520__ $$aDomain language model (LM) adaptation consists of re-estimating the probabilities of a baseline LM to better match the specifics of a given broad topic of interest. A common strategy is to retrieve adaptation texts from the Web based on a given domain-representative seed text. In this paper, we study how the selection of this seed text influences the adaptation process and the performance of the resulting adapted language models in automatic speech recognition. More precisely, the goal of this study is to analyze how our Web-based adaptation approach differs between the supervised case, in which the seed text is manually generated, and the unsupervised case, in which the seed text is given by an automatic transcript. Experiments were carried out on data from a real-world use case, namely videos produced for a university YouTube channel. Results show that our approach is quite robust, since unsupervised adaptation performs similarly to the supervised case in terms of overall perplexity and word error rate.
000192494 6531_ $$aASR
000192494 6531_ $$aAutomatic Speech Recognition
000192494 6531_ $$adomain adaptation
000192494 6531_ $$aLanguage Models
000192494 6531_ $$asupervision
000192494 6531_ $$aWeb data
000192494 700__ $$aLecorvé, Gwénolé
000192494 700__ $$0243992$$g192380$$aDines, John
000192494 700__ $$aHain, Thomas
000192494 700__ $$aMotlicek, Petr
000192494 7112_ $$cPortland, Oregon, USA$$aProceedings of Interspeech
000192494 8564_ $$uhttp://publications.idiap.ch/index.php/publications/showcite/Lecorve_Idiap-RR-22-2012$$zRelated documents
000192494 909C0 $$xU10381$$0252189$$pLIDIAP
000192494 909CO $$qGLOBAL_SET$$pconf$$ooai:infoscience.tind.io:192494$$pSTI
000192494 937__ $$aEPFL-CONF-192494
000192494 970__ $$aLecorve_INTERSPEECH_2012/LIDIAP
000192494 973__ $$aEPFL
000192494 980__ $$aCONF