Domain-specific language model adaptation: a case study

Domain adaptation of a language model (LM) consists of re-estimating the probabilities of a baseline LM so that they better match the peculiarities of a given broad topic of interest. A common strategy is to retrieve adaptation texts from the Web, starting from a seed text that is representative of the domain. In this report, we study this process extensively by analyzing the impact of numerous parameters. The adaptation is carried out on a set of videos dealing with business and management. The results mainly show which Web querying strategies perform best and how significantly the level of supervision of the adaptation process affects overall performance.
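
As an illustration of what re-estimating the probabilities of a baseline LM can look like, the following is a minimal sketch that assumes linear interpolation between a baseline unigram model and a unigram model trained on adaptation text. The report itself does not specify this technique here; the interpolation scheme, the add-one smoothing, the weight `lam`, the function names, and the toy texts are all assumptions made for the example.

```python
from collections import Counter
import math


def unigram_probs(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_base, p_adapt, lam):
    """Linear interpolation: lam * adaptation model + (1 - lam) * baseline."""
    return {w: lam * p_adapt[w] + (1 - lam) * p_base[w] for w in p_base}


def perplexity(probs, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_sum = sum(math.log(probs[w]) for w in tokens)
    return math.exp(-log_sum / len(tokens))


# Toy data, invented for illustration only.
baseline_text = "the cat sat on the mat the dog ran".split()
adaptation_text = "the manager set the budget the market grew".split()
seed_text = "the market budget grew".split()  # stands in for in-domain text

vocab = set(baseline_text) | set(adaptation_text) | set(seed_text)
p_base = unigram_probs(baseline_text, vocab)
p_adapt = unigram_probs(adaptation_text, vocab)
p_mix = interpolate(p_base, p_adapt, lam=0.5)  # lam is a hypothetical weight

ppl_base = perplexity(p_base, seed_text)
ppl_mix = perplexity(p_mix, seed_text)
print(f"baseline perplexity: {ppl_base:.2f}")
print(f"adapted perplexity:  {ppl_mix:.2f}")
```

On this toy data the interpolated model assigns higher probability to the in-domain words and therefore yields a lower perplexity on the seed text than the baseline. In practice the interpolation weight would be tuned on held-out in-domain data, and the models would be higher-order n-gram or neural LMs rather than unigrams.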


