Development online models for automatic speech recognition systems with a low data level

Mamyrbayev ОZh*; Oralbekova DO*; Alimhan K; Othman M and Zhumazhanov B

ISSN: 2689-7636

Open Access Research Article, Article ID: AMP-5-149

Abstract

Speech recognition is a rapidly growing field in machine learning. Conventional automatic speech recognition systems were built from independent components (an acoustic model, a language model, and a vocabulary, or lexicon), which were trained and tuned separately. The acoustic model predicts the context-dependent states of phonemes, while the language model and lexicon determine the most probable sequences of spoken phrases. Advances in deep learning have improved many scientific areas, including speech recognition. Today, the most popular speech recognition systems are based on an end-to-end (E2E) structure, which trains the components of a traditional model jointly, without isolating individual elements, and represents the whole system as a single neural network. An E2E system maps acoustic signals directly to a sequence of labels, with no intermediate states and no post-processing of the output, which makes it easy to implement. Within this family, online end-to-end models, which output the word sequence directly from the input audio in real time, are now popular. This article provides a detailed overview of popular online models for E2E systems: RNN-T, Neural Transducer (NT), and Monotonic Chunkwise Attention (MoChA). Online models for Kazakh speech recognition have not yet been developed, and these models have not been studied for low-resource languages such as Kazakh. Systems based on the three models were therefore trained to recognize Kazakh speech. The results showed that all three models work well for recognizing Kazakh speech without the use of external additions.

Published on: Aug 23, 2022
Pages: 107-111
DOI: 10.17352/amp.000049