Quality Evaluation of Large Language Models Generated Unit Tests: Influence of Structured Output
Dovydas Marius Zapkus
Vilnius University
Asta Slotkienė
Vilnius University
Published 2025-05-12
https://doi.org/10.15388/MITT.2025.32

Keywords

large language model
unit test
quality metrics
structured prompt output

How to Cite

Zapkus, D.M. and Slotkienė, A. (2025) “Quality Evaluation of Large Language Models Generated Unit Tests: Influence of Structured Output”, Vilnius University Open Series, pp. 281–288. doi:10.15388/MITT.2025.32.

Abstract

Unit testing is critical in software quality assurance, and large language models (LLMs) offer an approach to automating this process. This paper evaluates the quality of unit tests generated by LLMs using structured output prompts. Six LLMs were applied to generate unit tests for C# focal methods spanning different cyclomatic complexity classes. The experimental results show that requiring the LLMs to follow a strict structured output (the Arrange-Act-Assert pattern) significantly influences the quality of the generated unit tests.
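The paper's prompts are not reproduced here, but a minimal sketch of a C# unit test following the Arrange-Act-Assert structure referenced in the abstract is shown below. The PriceCalculator focal method and the use of the xUnit framework are illustrative assumptions, not taken from the paper.

using System;
using Xunit;

// Hypothetical focal method, standing in for the C# methods studied in the paper.
public static class PriceCalculator
{
    public static decimal ApplyDiscount(decimal price, decimal percent)
    {
        if (percent < 0 || percent > 100)
            throw new ArgumentOutOfRangeException(nameof(percent));
        return price - price * percent / 100m;
    }
}

public class PriceCalculatorTests
{
    [Fact]
    public void ApplyDiscount_TenPercent_ReducesPrice()
    {
        // Arrange: set up the inputs and the expected outcome.
        decimal price = 200m;
        decimal percent = 10m;

        // Act: invoke the focal method under test.
        decimal result = PriceCalculator.ApplyDiscount(price, percent);

        // Assert: verify the observed behaviour.
        Assert.Equal(180m, result);
    }
}

Enforcing this three-part layout in the prompt gives the generated tests a predictable shape, which is the structural constraint whose influence on test quality the paper evaluates.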


This work is licensed under a Creative Commons Attribution 4.0 International License.

