The feasibility of computerized adaptive testing of the national benchmark test: A simulation study

Bibliographic details
Published in:	Journal of Pedagogical Research vol. 8, no. 2 (Jun 2024), p. 95-113
Main author:	Musa Adekunle Ayanwale
Other authors:	Ndlovu, Mdutshekelwa
Published:	Journal of Pedagogical Research
Subjects:	Africa; Simulation; Item Analysis; Computer Assisted Testing; Adaptive Testing; High Stakes Tests; Test Items; Algorithms
Online access:	Citation/Abstract; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3264262810
003	UK-CbPIL
022			\|a 2602-3717
024	7		\|a 10.33902/JPR.202425210 \|2 doi
035			\|a 3264262810
045	2		\|b d20240601 \|b d20240630
100	1		\|a Musa Adekunle Ayanwale
245	1		\|a The feasibility of computerized adaptive testing of the national benchmark test: A simulation study
260			\|b Journal of Pedagogical Research \|c Jun 2024
513			\|a Journal Article
520	3		\|a The COVID-19 pandemic has had a significant impact on high-stakes testing, including the National Benchmark Tests (NBTs) in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing (CAT). Assessments with CAT are more precise and take less time. Evaluation of CAT programs requires simulation studies. To assess the feasibility of implementing CAT in the NBTs, SimulCAT, a simulation tool, was utilized. The SimulCAT simulation involved creating 10,000 examinees with a normal ability distribution characterized by a mean of 0 and a standard deviation of 1. A pool of 500 test items was employed, and specific parameters were established for the item selection algorithm, CAT administration rules, item exposure control, and termination criteria. The termination criteria required a standard error of less than 0.35 to ensure accurate ability estimation. The findings from the simulation study demonstrated that fixed-length tests provided higher testing precision without any systematic error, as indicated by measurement statistics like CBIAS, CMAE, and CRMSE. However, fixed-length tests exhibited a higher item exposure rate, which could be mitigated by selecting items with fewer dependencies on specific item parameters (a-parameters). On the other hand, variable-length tests demonstrated increased redundancy. Based on these results, CAT is recommended as an alternative approach for conducting NBTs due to its capability to accurately measure individual abilities and reduce the testing duration. For high-stakes assessments like the NBTs, fixed-length tests are preferred as they offer superior testing precision while minimizing item exposure rates.
651		4	\|a Africa
653			\|a Simulation
653			\|a Item Analysis
653			\|a Computer Assisted Testing
653			\|a Adaptive Testing
653			\|a High Stakes Tests
653			\|a Test Items
653			\|a Algorithms
700	1		\|a Ndlovu, Mdutshekelwa
773	0		\|t Journal of Pedagogical Research \|g vol. 8, no. 2 (Jun 2024), p. 95-113
786	0		\|d ProQuest \|t Education Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3264262810/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/3264262810/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
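
Note on the abstract's measurement statistics: the sketch below illustrates how conditional bias, mean absolute error, and root mean squared error (the quantities the abstract abbreviates as CBIAS, CMAE, and CRMSE) could be computed per ability band for 10,000 simulated examinees drawn from a normal distribution with mean 0 and standard deviation 1. It is not SimulCAT and not the authors' procedure. The function names, the ability bands, and the noise model that stands in for CAT ability estimates are all assumptions made for illustration.

```python
import math
import random


def simulate_examinees(n=10_000, mean=0.0, sd=1.0, seed=0):
    """Draw true abilities (theta) from a normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


def conditional_stats(true_thetas, estimates, edges):
    """Return {(lo, hi): (bias, mae, rmse)} for each non-empty theta band."""
    out = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Estimation errors for examinees whose true theta lies in [lo, hi).
        errs = [e - t for t, e in zip(true_thetas, estimates) if lo <= t < hi]
        if not errs:
            continue
        n = len(errs)
        bias = sum(errs) / n
        mae = sum(abs(x) for x in errs) / n
        rmse = math.sqrt(sum(x * x for x in errs) / n)
        out[(lo, hi)] = (bias, mae, rmse)
    return out


thetas = simulate_examinees()

# ASSUMPTION: a real study would obtain estimates by administering an
# adaptive test. Here each estimate is the true theta plus Gaussian noise
# with SD 0.35, a hypothetical stand-in loosely echoing the abstract's
# standard-error threshold.
noise_rng = random.Random(1)
estimates = [t + noise_rng.gauss(0.0, 0.35) for t in thetas]

stats = conditional_stats(thetas, estimates, [-3, -2, -1, 0, 1, 2, 3])
for (lo, hi), (bias, mae, rmse) in stats.items():
    print(f"theta in [{lo}, {hi}): bias={bias:+.3f} mae={mae:.3f} rmse={rmse:.3f}")
```

In each band, a bias near zero indicates no systematic error, while MAE and RMSE summarize precision; for any set of errors, RMSE is at least as large as MAE, which is at least as large as the absolute bias.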