Comparative Evaluation of Speech-to-Text Models for Lithuanian Transcription: Effects of Audio Formats and Recording Environments

Dovydas Šablevičius; Asta Slotkienė

doi:10.15388/LMITT.2025.23

Articles

Dovydas Šablevičius

Vilnius University

Asta Slotkienė

Vilnius University

Published 2025-05-12

https://doi.org/10.15388/LMITT.2025.23


Keywords

Speech-to-text model
audio format
transcription accuracy
recording environment
Lithuanian language

How to Cite

Šablevičius, D. and Slotkienė, A. (2025) “Comparative Evaluation of Speech-to-Text Models for Lithuanian Transcription: Effects of Audio Formats and Recording Environments”, Vilnius University Open Series, pp. 197–208. doi:10.15388/LMITT.2025.23.


Abstract

This study evaluates the performance of several speech-to-text (STT) models for Lithuanian transcription, focusing on how audio formats and recording environments affect accuracy. Among the models tested, Google’s Chirp-2 achieved the highest accuracy under optimal conditions. However, its performance declined at higher playback speeds and in environments with significant background noise, which highlights the importance of controlled recording conditions for deploying STT systems effectively in real-world applications.
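
The abstract does not state which accuracy metric was used. As an illustration only, the sketch below computes word error rate (WER), a metric commonly used to score transcriptions: the word-level edit distance between a reference transcript and a model's output, divided by the number of reference words. The function name `word_error_rate`, the lowercasing and whitespace tokenization, and the sample Lithuanian phrases are assumptions made for this example, not details taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace; the study's
    actual normalization (punctuation, diacritics) is not known.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substituted word out of three.
    print(word_error_rate("labas rytas visiems", "labas vakaras visiems"))
```

A lower value means a more accurate transcription. Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference.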


This work is licensed under a Creative Commons Attribution 4.0 International License.

