ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • Topic detection for language model adaptation of highly-inflected language by using a fuzzy comparison function
    Sepesy Maučec, Mirjam ; Kačič, Zdravko
    A new framework is proposed for constructing corpus-based, topic-adapted language models for large-vocabulary speech recognition of Slovenian, a highly inflected language. The proposed techniques can be ... applied to other Slavic languages, where words are formed with many different inflectional affixes. This article describes an attempt to overcome two important difficulties of highly inflected languages: a high out-of-vocabulary rate and the problem of topic detection. The first problem is solved by decomposing words into stems and endings, and topic detection is improved by a novel feature-extraction approach based on soft comparison of words. Experiments on the second-largest Slovenian newspaper news corpus, Večer, show an average decrease in perplexity of 17% compared with a general word-based model.
    Source: Eurospeech 2001 Scandinavia : proceedings (Vol. 1, pp. 243-246)
    Type of material - conference contribution
    Year - 2001
    Language - English
    COBISS.SI-ID - 6484502