ALL libraries (COBIB.SI union bibliographic/catalogue database)
Using data-driven subword units in language model of highly inflective Slovenian language
Sepesy Maučec, Mirjam ...

This paper presents the results of a study on modeling the highly inflective Slovenian language. We focus on creating a language model for a large vocabulary speech recognition system. A new ... data-driven method is proposed for the induction of inflectional morphology into language modeling. The research focus is on data sparsity, which results from the complex morphology of the language. The idea of using subword units is examined. An attempt is made to segment words into two subword units: stems and endings. No prior knowledge of the language is used. The subword units should fit into the framework of probabilistic language models. A morphologically correct decomposition of words is not sought; instead, the search is for a decomposition that yields the minimum entropy of the training corpus. This entropy is approximated by using N-gram models. Despite some seemingly over-simplified assumptions, the subword models improve the applicability of the language models for a sparse training corpus.

The experiments were performed using the VEČER newswire text corpus as a training corpus. The test set was taken from the SNABI speech database, because the final models were evaluated in speech recognition experiments on the SNABI speech database. Two different subword-based models are proposed and examined experimentally. The experiments demonstrate that subword-based models, which considerably reduce the OOV rate, improve speech recognition WER when compared with standard word-based models, even though they increase test set perplexity. Subword-based models with improved perplexity, but which reduce the OOV rate much less than the previous ones, do not improve speech recognition results.

Source: International journal of pattern recognition and artificial intelligence. - ISSN 0218-0014 (Vol. 23, iss. 2, Mar. 2009, pp. 287-312)
Material type - article, component part
Year - 2009
Language - English
COBISS.SI-ID - 13118230
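The abstract describes choosing, for each word, a stem and ending split that minimizes corpus entropy without any prior linguistic knowledge. The following is only a minimal sketch of that general idea, not the authors' algorithm: it assumes independent unigram models for stems and endings (the paper approximates entropy with N-gram models), and the function names, the `min_stem` parameter, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter

def candidate_splits(word, min_stem=2):
    """Return every (stem, ending) split; the ending may be empty."""
    start = min(min_stem, len(word))
    return [(word[:i], word[i:]) for i in range(start, len(word) + 1)]

def segment_vocabulary(tokens, min_stem=2):
    """Pick one stem + ending split per word type, using corpus counts only."""
    word_counts = Counter(tokens)
    stem_counts, ending_counts = Counter(), Counter()
    # Count every candidate stem and ending, weighted by word frequency.
    for word, n in word_counts.items():
        for stem, ending in candidate_splits(word, min_stem):
            stem_counts[stem] += n
            ending_counts[ending] += n

    def score(split):
        stem, ending = split
        # Maximizing count(stem) * count(ending) minimizes
        # -log P(stem) - log P(ending) under independent unigram models,
        # because the normalizing totals are the same for every split.
        # Ties go to the shorter stem (an arbitrary, assumed rule).
        return (stem_counts[stem] * ending_counts[ending], -len(stem))

    return {w: max(candidate_splits(w, min_stem), key=score)
            for w in word_counts}
```

On a toy English list such as `walked walks walking talked talks talking jumped jumps jumping`, this sketch splits `walked` into `walk` + `ed`. Integer counts are compared instead of floating-point log probabilities so that ties are resolved deterministically.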
Author
Sepesy Maučec, Mirjam |
Rotovnik, Tomaž, telecommunications |
Kačič, Zdravko |
Brest, Janez
Subjects
automatic speech recognition |
Slovenian language |
modeling |
statistical language modeling |
subword units |
inflective language
| Links to the authors' personal bibliographies | Links to researcher data in the SICRIS system |
|---|---|
| Sepesy Maučec, Mirjam | 18168 |
| Rotovnik, Tomaž, telecommunications | 21304 |
| Kačič, Zdravko | 06821 |
| Brest, Janez | 16118 |
Source: Personal bibliographies and SICRIS