APPLYING SPEECH RECOGNITION TECHNOLOGY IN TEACHING SPEAKING SKILL AT THE UNIVERSITY OF LABOUR AND SOCIAL AFFAIRS

Authors

  • Lai Minh Thu

DOI:

https://doi.org/10.59266/houjs.2025.748

Keywords:

English pronunciation, speech recognition, ELSA Speak, educational technology, self-directed learning, English language teaching

Abstract

This study investigates the effectiveness of using Automatic Speech Recognition (ASR) technology, specifically the ELSA Speak application, to enhance English pronunciation among non-English major students at the University of Labour and Social Affairs. A quasi-experimental design was employed with 60 first- and second-year students divided into two groups: an experimental group using ELSA Speak for six weeks and a control group receiving traditional instruction. The results revealed that the experimental group showed significantly greater improvements in segmental accuracy, suprasegmental features, and overall intelligibility. Learner feedback also indicated high satisfaction with the app’s usefulness, ease of use, and timely feedback. These findings highlight the potential of ASR integration as an effective supplementary tool for pronunciation instruction in Vietnamese public university settings and suggest promising directions for technology-enhanced, personalized, and autonomous language learning.

References

[1]. Ahn, T. Y., & Lee, S. M. (2016). User experience of a mobile speaking application with automatic speech recognition for EFL learning. British Journal of Educational Technology, 47(4), 778-786. https://doi.org/10.1111/bjet.12354

[2]. Chapelle, C., & Jamieson, J. (2008). Tips for teaching with CALL: Practical approaches to computer-assisted language learning. Pearson Education.

[3]. Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2 teaching and research. John Benjamins Publishing Company.

[4]. Elimat, A. K., & AbuSeileek, A. F. (2014). Automatic speech recognition technology as an effective means for teaching pronunciation. JALT CALL Journal, 10(1), 21-47. https://doi.org/10.29140/jaltcall.v10n1.166

[5]. Jayalath, J., & Esichaikul, V. (2022). Gamification to enhance motivation and engagement in blended eLearning for technical and vocational education and training. Technology, Knowledge and Learning, 27, 91-118. https://doi.org/10.1007/s10758-020-09466-2

[6]. McCrocklin, S. M. (2016). Pronunciation learner autonomy: The potential of automatic speech recognition. System, 57, 25-42. https://doi.org/10.1016/j.system.2015.12.013

[7]. Nguyen, T. S., Nguyen, T. D. T., Hoang, N. Q. N., & Do, T. K. H. (2025). How AI-powered voice recognition has supported pronunciation competence among EFL university learners. CALL-EJ, 26(3), 64-83. https://doi.org/10.54855/callej.252634

[8]. Nguyen, L. T., & Newton, J. (2020). Pronunciation teaching in tertiary EFL classes: Vietnamese teachers’ beliefs and practices. TESL-EJ, 24(1). https://www.tesl-ej.org/wordpress/issues/volume24/ej93/ej93a1/

[9]. Pham, V. T. T., & Pham, A. T. (2025). English major students’ satisfaction with ELSA Speak in English pronunciation courses. PLOS ONE, 20(1), e0317378. https://doi.org/10.1371/journal.pone.0317378

[10]. Sholekhah, M. F., & Fakhrurriana, R. (2023). The use of ELSA Speak as a Mobile-Assisted Language Learning (MALL) towards EFL students’ pronunciation. JELITA, 2(2), 93-100. https://doi.org/10.37058/jelita.v2i2.7596

[11]. Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.
