The feasibility of computerized adaptive testing of the national benchmark test: A simulation study

Bibliographic details
Published in: Journal of Pedagogical Research vol. 8, no. 2 (Jun 2024), p. 95-113
Main author: Musa Adekunle Ayanwale
Other authors: Ndlovu, Mdutshekelwa
Published:
Journal of Pedagogical Research
Subjects:
Online access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3264262810
003 UK-CbPIL
022 |a 2602-3717 
024 7 |a 10.33902/JPR.202425210  |2 doi 
035 |a 3264262810 
045 2 |b d20240601  |b d20240630 
100 1 |a Musa Adekunle Ayanwale 
245 1 |a The feasibility of computerized adaptive testing of the national benchmark test: A simulation study 
260 |b Journal of Pedagogical Research  |c Jun 2024 
513 |a Journal Article 
520 3 |a The COVID-19 pandemic has had a significant impact on high-stakes testing, including the National Benchmark Tests in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing [CAT]. Assessments with CAT are more precise and take less time. Evaluation of CAT programs requires simulation studies. To assess the feasibility of implementing CAT in NBTs, SimulCAT, a simulation tool, was utilized. The SimulCAT simulation involved creating 10,000 examinees with a normal distribution characterized by a mean of 0 and a standard deviation of 1. A pool of 500 test items was employed, and specific parameters were established for the item selection algorithm, CAT administration rules, item exposure control, and termination criteria. The termination criteria required a standard error of less than 0.35 to ensure accurate abilities estimation. The findings from the simulation study demonstrated that fixed-length tests provided higher testing precision without any systematic error, as indicated by measurement statistics like CBIAS, CMAE, and CRMSE. However, fixed-length tests exhibited a higher item exposure rate, which could be mitigated by selecting items with fewer dependencies on specific item parameters (a-parameters). On the other hand, variable-length tests demonstrated increased redundancy. Based on these results, CAT is recommended as an alternative approach for conducting NBTs due to its capability to accurately measure individual abilities and reduce the testing duration. For high-stakes assessments like the NBTs, fixed-length tests are preferred as they offer superior testing precision while minimizing item exposure rates. 
651 4 |a Africa 
653 |a Simulation 
653 |a Item Analysis 
653 |a Computer Assisted Testing 
653 |a Adaptive Testing 
653 |a High Stakes Tests 
653 |a Test Items 
653 |a Algorithms 
700 1 |a Ndlovu, Mdutshekelwa 
773 0 |t Journal of Pedagogical Research  |g vol. 8, no. 2 (Jun 2024), p. 95-113 
786 0 |d ProQuest  |t Education Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3264262810/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3264262810/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch