An intelligent extension of the training set for the Persian n-gram language model: an enrichment algorithm

dc.creator: Motavallian, Rezvan
dc.creator: Komeily, Masoud
dc.date: 2023-11-06
dc.date.accessioned: 2024-11-19T15:17:33Z
dc.date.available: 2024-11-19T15:17:33Z
dc.identifier: https://onomazein.letras.uc.cl/index.php/onom/article/view/69745
dc.identifier: 10.7764/onomazein.61.09
dc.identifier.uri: https://revistaschilenas.uchile.cl/handle/2250/246317
dc.description: In this article, we introduce an automatic mechanism that intelligently extends the training set to improve the n-gram language model of Persian. Given the free word-order property of Persian, our enrichment algorithm diversifies n-gram combinations in the baseline training data through dependency reordering, adding permissible sentences and filtering out ungrammatical ones using a hybrid empirical (heuristic) and linguistic approach. Experiments performed on the baseline training set (taken from a standard Persian corpus) and the resulting enriched training set indicate a declining trend in average relative perplexity (between 34% and 73%) for informal/spoken vs. formal/written Persian test data. [en-US]
dc.format: application/pdf
dc.language: eng
dc.publisher: Facultad de Letras de la Pontificia Universidad Católica de Chile [es-ES]
dc.relation: https://onomazein.letras.uc.cl/index.php/onom/article/view/69745/54195
dc.rights: https://creativecommons.org/licenses/by/4.0 [es-ES]
dc.source: Onomázein ; No. 61 (2023): September; 191-211 [en-US]
dc.source: Onomázein ; Núm. 61 (2023): Septiembre; 191-211 [es-ES]
dc.source: 0718-5758
dc.subject: training corpus [en-US]
dc.subject: n-gram language model [en-US]
dc.subject: dependency parsing [en-US]
dc.subject: enrichment algorithm [en-US]
dc.subject: free word-order [en-US]
dc.title: An intelligent extension of the training set for the Persian n-gram language model: an enrichment algorithm [en-US]
dc.title: An intelligent extension of the training set for the Persian n-gram language model: an enrichment algorithm [es-ES]
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion

