Standard setting for dental knowledge tests: reproducibility of the modified Angoff and Ebel method across judges

Bibliographic Details
Published in:	BMC Medical Education vol. 25 (2025), p. 1-14
Main Author:	Ting Khee Ho
Other Authors:	Noor Lide Abu Kassim, Lucy O’Malley, Reza Vahid Roudsari
Published:	Springer Nature B.V.
Subjects:	National Board of Medical Examiners; Malaysia; Standards; Medical education; Workshops; Dental care; Candidates; Feedback; Validity; Educational objectives; Hypotheses; Methods; Dentistry; Reproducibility; Medical examiners; Minimum Competency Testing; Cutting Scores; Standard Setting; Item Response Theory; Competence; Test Format; Inferences; Standard Setting (Scoring); Meetings; Individual Testing; Minimum Competencies; Licensing Examinations (Professions); Test Items; Examiners; Ethics; Student Evaluation; Alternative Assessment; Feedback (Response); Test Interpretation; Educational Assessment; Dental Evaluation; Summative Evaluation; Outcomes of Education
Online Access:	Citation/Abstract; Full Text; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3268438111
003	UK-CbPIL
022			\|a 1472-6920
024	7		\|a 10.1186/s12909-025-07822-3 \|2 doi
035			\|a 3268438111
045	2		\|b d20250101 \|b d20251231
084			\|a 58506 \|2 nlm
100	1		\|a Ting Khee Ho
245	1		\|a Standard setting for dental knowledge tests: reproducibility of the modified Angoff and Ebel method across judges
260			\|b Springer Nature B.V. \|c 2025
513			\|a Journal Article
520	3		\|a Introduction: Criterion-referenced standard setting methods establish passing scores based on predefined competency levels. The credibility of these scores must be supported by validity evidence. This study evaluated the reproducibility of modified Angoff and Ebel standards across different test formats and panels in dental assessments. Inter-rater reliability for each method was also assessed. Methods: Twelve judges, selected via purposive sampling, were divided into two equal groups representing various specialisms. Each panel applied modified Angoff and Ebel methods to set standards for one-best answer (OBA) and short answer question (SAQ) items. Method replicability across panels was assessed using the Mann–Whitney U-test to compare passing scores between Groups A and B. The Wilcoxon signed-rank test compared passing scores between modified Angoff and Ebel within groups. Inter-rater reliability was estimated using the intraclass correlation coefficient for modified Angoff and Fleiss’ kappa for Ebel. Statistical analysis was conducted using IBM SPSS, with significance set at p < 0.05. Results: The median (IQR) years of teaching experience were 14.0 (17.0) for Group A judges and 21.5 (18.0) for Group B judges. In Group A, median (IQR) passing scores using modified Angoff were 49.75 (3.31) for OBA and 51.75 (6.13) for SAQ, with no statistically significant differences (p > 0.05) from Ebel: OBA 47.38 (2.02), SAQ 49.50 (5.38). In Group B, modified Angoff passing scores were significantly higher than Ebel (p < 0.05): modified Angoff OBA 66.12 (3.31), SAQ 58.00 (7.50); Ebel OBA 55.92 (2.73), SAQ 49.50 (8.25). Passing scores were consistent across panels for SAQ but not for OBA. Inter-rater agreement, measured by intraclass correlation coefficients (ICC) and Fleiss’ kappa, was higher in Group A across both methods. Conclusion: Reproducibility of modified Angoff and Ebel standards across panels was mixed. Passing scores were consistent across judges for SAQ but varied for OBA in both methods. Group A showed consistency between modified Angoff and Ebel standards, whereas Group B had differing passing scores between the two standards. These findings should be carefully considered when establishing defensible and reliable passing standards for dental knowledge assessments.
610		4	\|a National Board of Medical Examiners
651		4	\|a Malaysia
653			\|a Standards
653			\|a Medical education
653			\|a Workshops
653			\|a Dental care
653			\|a Candidates
653			\|a Feedback
653			\|a Validity
653			\|a Educational objectives
653			\|a Hypotheses
653			\|a Methods
653			\|a Dentistry
653			\|a Reproducibility
653			\|a Medical examiners
653			\|a Minimum Competency Testing
653			\|a Cutting Scores
653			\|a Standard Setting
653			\|a Item Response Theory
653			\|a Competence
653			\|a Test Format
653			\|a Inferences
653			\|a Standard Setting (Scoring)
653			\|a Meetings
653			\|a Individual Testing
653			\|a Minimum Competencies
653			\|a Licensing Examinations (Professions)
653			\|a Test Items
653			\|a Examiners
653			\|a Ethics
653			\|a Student Evaluation
653			\|a Alternative Assessment
653			\|a Feedback (Response)
653			\|a Test Interpretation
653			\|a Educational Assessment
653			\|a Dental Evaluation
653			\|a Summative Evaluation
653			\|a Outcomes of Education
700	1		\|a Noor Lide Abu Kassim
700	1		\|a Lucy O’Malley
700	1		\|a Reza Vahid Roudsari
773	0		\|t BMC Medical Education \|g vol. 25 (2025), p. 1-14
786	0		\|d ProQuest \|t Healthcare Administration Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3268438111/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
856	4	0	\|3 Full Text \|u https://www.proquest.com/docview/3268438111/fulltext/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/3268438111/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch