Evaluating ChatGPT-3’s efficacy in solving coding tasks: implications for academic integrity in English language assessments

Saved in:
Bibliographic Details
Published in: Language Testing in Asia vol. 15, no. 1 (Dec 2025), p. 37
Author: Elhambakhsh, Seyedeh Elham
Published: Springer Nature B.V.
Subjects:
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3226525233
003 UK-CbPIL
022 |a 2229-0443 
024 7 |a 10.1186/s40468-024-00333-w  |2 doi 
035 |a 3226525233 
045 2 |b d20251201  |b d20251231 
084 |a 243835  |2 nlm 
100 1 |a Elhambakhsh, Seyedeh Elham  |u Yazd University, Department of Language and Literature, Yazd, Iran (GRID:grid.413021.5) (ISNI:0000 0004 0612 8240) 
245 1 |a Evaluating ChatGPT-3’s efficacy in solving coding tasks: implications for academic integrity in English language assessments 
260 |b Springer Nature B.V.  |c Dec 2025 
513 |a Journal Article 
520 3 |a The purpose of this study was to examine ChatGPT-3's capability to generate code solutions for assessment problems commonly graded by automatic correction tools in the TEFL academic setting, focusing on the Kattis platform. The researcher explored potential implications for academic integrity and the challenges associated with AI-generated solutions. The investigation involved testing ChatGPT on a subset of 124 English language assessment tasks from Kattis, a widely used automatic software grading tool. ChatGPT independently solved 16 of the 124 tasks; performance varied with task complexity, with better accuracy on simpler problems and persistent difficulty on more complex ones. To supplement the quantitative findings, a qualitative follow-up investigation was conducted, including interviews with two EFL assessment instructors. The discussion encompassed methodological considerations, the effectiveness of Kattis in preventing cheating, and the limitations of detecting AI-generated code; qualitative insights revealed both the strengths and the limitations of Kattis in this regard. The study emphasizes the need for continuous adaptation of EFL assessment methodologies to maintain academic integrity in the face of evolving AI capabilities: as students gain access to sophisticated AI-generated solutions, vigilant strategies to uphold originality and critical thinking in academic work become increasingly crucial. The findings have implications for multiple stakeholders, including (1) awareness of AI capabilities in generating code solutions, necessitating vigilant assessment strategies; (2) understanding of the importance of academic integrity and of the limitations of AI in mastering complex assessment tasks; and (3) insights into the interplay between AI, automated assessment systems, and academic integrity, guiding future investigations. This performance illustrates the need for careful assessment design to mitigate the risk of AI-assisted academic dishonesty while maintaining rigorous academic standards. 
610 4 |a Wikipedia OpenAI 
653 |a Teaching 
653 |a Higher education 
653 |a Students 
653 |a Metadata 
653 |a English language 
653 |a Complexity 
653 |a Plagiarism 
653 |a Automation 
653 |a Language assessment 
653 |a Code reuse 
653 |a Feedback 
653 |a Teachers 
653 |a Chatbots 
653 |a Artificial intelligence 
653 |a Human-computer interaction 
653 |a Computer assisted language learning 
653 |a Algorithms 
653 |a TESOL 
653 |a Cheating 
653 |a English as a second language instruction 
653 |a Task performance 
653 |a Risk assessment 
653 |a Efficacy 
653 |a Task complexity 
653 |a Academic work 
653 |a Morality 
653 |a Academic achievement 
653 |a Evaluation 
653 |a Data analysis 
653 |a Dishonesty 
653 |a Academic standards 
653 |a Limitations 
653 |a Complex tasks 
653 |a Academic writing 
653 |a Critical thinking 
653 |a Rating Scales 
653 |a Influence of Technology 
653 |a Intellectual Disciplines 
653 |a Educational Technology 
653 |a Measurement Techniques 
653 |a English (Second Language) 
653 |a Judges 
653 |a Integrity 
653 |a Time 
653 |a Coding 
653 |a Educational Assessment 
653 |a Difficulty Level 
653 |a English 
773 0 |t Language Testing in Asia  |g vol. 15, no. 1 (Dec 2025), p. 37 
786 0 |d ProQuest  |t Education Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3226525233/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3226525233/fulltext/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3226525233/fulltextPDF/embedded/ZKJTFFSVAI7CB62C?source=fedsrch