The Feasibility of Computerized Adaptive Testing of the National Benchmark Test: A Simulation Study

Bibliographic details
Published in: Journal of Pedagogical Research vol. 8, no. 2 (2024), p. 95
Main author: Musa Adekunle Ayanwale
Other authors: Ndlovu, Mdutshekelwa
Published:
Journal of Pedagogical Research
Subjects:
Online access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3075713858
003 UK-CbPIL
035 |a 3075713858 
045 2 |b d20240101  |b d20241231 
084 |a EJ1428037 
100 1 |a Musa Adekunle Ayanwale 
245 1 |a The Feasibility of Computerized Adaptive Testing of the National Benchmark Test: A Simulation Study 
260 |b Journal of Pedagogical Research  |c 2024 
513 |a Report Article 
520 3 |a The COVID-19 pandemic has had a significant impact on high-stakes testing, including the national benchmark tests in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing [CAT]. Assessments with CAT are more precise and take less time. Evaluation of CAT programs requires simulation studies. To assess the feasibility of implementing CAT in NBTs, SimulCAT, a simulation tool, was utilized. The SimulCAT simulation involved creating 10,000 examinees with a normal distribution characterized by a mean of 0 and a standard deviation of 1. A pool of 500 test items was employed, and specific parameters were established for the item selection algorithm, CAT administration rules, item exposure control, and termination criteria. The termination criteria required a standard error of less than 0.35 to ensure accurate abilities estimation. The findings from the simulation study demonstrated that fixed-length tests provided higher testing precision without any systematic error, as indicated by measurement statistics like CBIAS, CMAE, and CRMSE. However, fixed-length tests exhibited a higher item exposure rate, which could be mitigated by selecting items with fewer dependencies on specific item parameters (a-parameters). On the other hand, variable-length tests demonstrated increased redundancy. Based on these results, CAT is recommended as an alternative approach for conducting NBTs due to its capability to accurately measure individual abilities and reduce the testing duration. For high-stakes assessments like the NBTs, fixed-length tests are preferred as they offer superior testing precision while minimizing item exposure rates. 
651 4 |a South Africa 
653 |a Adaptive Testing 
653 |a Benchmarking 
653 |a National Competency Tests 
653 |a Computer Assisted Testing 
653 |a High Stakes Tests 
653 |a Foreign Countries 
653 |a Test Validity 
653 |a Simulation 
653 |a Test Items 
653 |a Computer Software 
653 |a Monte Carlo Methods 
700 1 |a Ndlovu, Mdutshekelwa 
773 0 |t Journal of Pedagogical Research  |g vol. 8, no. 2 (2024), p. 95 
786 0 |d ProQuest  |t ERIC 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3075713858/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://eric.ed.gov/?id=EJ1428037