FEATURES EXTRACTION AND TRAINING STRATEGIES IN CONTINUOUS SPEECH RECOGNITION FOR ROMANIAN LANGUAGE
Corneliu Octavian Dumitru, Inge Gavat
2006
Abstract
This paper describes continuous speech recognition experiments for the Romanian language using hidden Markov models (HMMs). Three questions are addressed: the realization of a new front-end that reconsiders linear prediction; the enhancement of recognition rates by context-dependent modeling; and the evaluation of training strategies that ensure speaker independence of the recognition process through speaker selection for training, without speaker adaptation procedures. The experiments led to a development of the initial system with a promising front-end based on perceptual linear prediction (PLP) coefficients. In recognition performance it ranks second, close to the first-ranked front-end based on mel-frequency cepstral coefficients (MFCC) and far better than the last-ranked one, which is based on simple linear prediction. The implemented algorithm for context-dependent modeling improved recognition rates in all situations. The experiments with speaker selection by gender improved the recognition rate under certain conditions and showed good generalization properties, especially when training with the male speakers database.
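As an illustration of what context-dependent modeling means at the label level, the following minimal sketch expands a monophone sequence into word-internal triphone labels. It is not the algorithm implemented in the paper: the `left-center+right` notation, the `sil` boundary symbol, the function name `to_triphones`, and the example phone sequence are all assumptions made for this example.

```python
from typing import List


def to_triphones(phones: List[str], boundary: str = "sil") -> List[str]:
    """Expand monophones into context-dependent labels (hypothetical helper).

    Each phone is rewritten as left-center+right. The boundary symbol is
    kept context-independent and is never used as a context for neighbours.
    """
    labels = []
    for i, center in enumerate(phones):
        if center == boundary:
            # Silence stays a plain monophone in this sketch.
            labels.append(center)
            continue
        # Look up neighbours, ignoring sequence edges and boundary symbols.
        left = phones[i - 1] if i > 0 and phones[i - 1] != boundary else None
        right = (
            phones[i + 1]
            if i + 1 < len(phones) and phones[i + 1] != boundary
            else None
        )
        label = center
        if left is not None:
            label = f"{left}-{label}"
        if right is not None:
            label = f"{label}+{right}"
        labels.append(label)
    return labels


# Assumed toy phone sequence, surrounded by silence.
print(to_triphones(["sil", "u", "n", "u", "sil"]))
```

Each context-dependent label would then get its own HMM, which is why such systems usually need some form of parameter sharing to cope with contexts that are rare in the training data.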
Paper Citation
in Harvard Style
Dumitru C. O. and Gavat I. (2006). FEATURES EXTRACTION AND TRAINING STRATEGIES IN CONTINUOUS SPEECH RECOGNITION FOR ROMANIAN LANGUAGE. In Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics - Volume 3: ICINCO, ISBN 978-972-8865-61-0, pages 114-121. DOI: 10.5220/0001198901140121
in Bibtex Style
@conference{icinco06,
author={Corneliu Octavian Dumitru and Inge Gavat},
title={FEATURES EXTRACTION AND TRAINING STRATEGIES IN CONTINUOUS SPEECH RECOGNITION FOR ROMANIAN LANGUAGE},
booktitle={Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics - Volume 3: ICINCO},
year={2006},
pages={114-121},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001198901140121},
isbn={978-972-8865-61-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics - Volume 3: ICINCO
TI - FEATURES EXTRACTION AND TRAINING STRATEGIES IN CONTINUOUS SPEECH RECOGNITION FOR ROMANIAN LANGUAGE
SN - 978-972-8865-61-0
AU - Octavian Dumitru C.
AU - Gavat I.
PY - 2006
SP - 114
EP - 121
DO - 10.5220/0001198901140121