Effect Of Item Flaws On Difficulty And Discriminatory Indices Of Multiple Choice Questions In A Four-Year Medical Curriculum In Saudi Arabia

Fahad AlDhahri, Aamir Omair, Mohi Eldin Magzoub



This study reviewed the multiple choice questions (MCQs) written at the College of Medicine, King Saud bin Abdulaziz University for Health Sciences (KSAU-HS), over the four years of its curriculum. It assessed the effect of item flaws on the difficulty levels and discriminatory indices of the MCQs.


All the MCQs used in all blocks during the four years for the second batch of medical students at KSAU-HS were reviewed to identify the types of flaws and the number of functioning distractors. The difficulty levels and discriminatory indices were obtained from the Assessment Unit and compared between items with and without flaws using the independent samples t-test. The presence of flaws was compared between groups using the chi-square test.
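As an illustration of the two item statistics compared in this study, the sketch below computes a difficulty level (the proportion of examinees answering an item correctly) and a discriminatory index using the common upper-lower group method. The 27% group fraction, the function names, and the example scores are assumptions made for illustration only; the abstract does not state how the Assessment Unit calculated these values.

```python
from typing import List


def difficulty_index(item_scores: List[int]) -> float:
    """Proportion of examinees who answered the item correctly (1 = correct, 0 = wrong)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(
    item_scores: List[int],
    total_scores: List[float],
    fraction: float = 0.27,  # assumed group size; a common textbook convention
) -> float:
    """Proportion correct in the top-scoring group minus that in the bottom-scoring group."""
    n = len(item_scores)
    k = max(1, round(n * fraction))  # examinees per group
    # Rank examinees by total exam score, lowest first.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: ten examinees, one item.
total_scores = [95, 90, 85, 80, 75, 60, 55, 50, 45, 40]
item_scores = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

difficulty = difficulty_index(item_scores)
discrimination = discrimination_index(item_scores, total_scores)
print(f"difficulty={difficulty:.2f}, discrimination={discrimination:.2f}")
```

With these made-up scores, the three highest scorers all answer correctly and the three lowest all answer incorrectly, so the item discriminates maximally while being answered correctly by 70% of examinees.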


The 1412 MCQs reviewed consisted of 938 (66%) recall and 474 (34%) reasoning questions, with a difficulty level of 0.69±0.24 and a discriminatory index of 0.21±0.22. All three distractors were functioning in 535 (38%) MCQs. There was one non-functioning distractor in 449 (32%) MCQs and two or more non-functioning distractors in 428 (30%) MCQs. Flaws were found in 287 (20%) MCQs; negative statements accounted for more than half of these, 152 (53%). The difficulty level was 0.70±0.24 for questions with no flaw and 0.67±0.23 for questions with a flaw (p=0.06). A satisfactory discriminatory index of >0.3 was found in 42% of MCQs with flaws compared with 32% of MCQs without any identifiable flaw (p=0.02).


The overall difficulty and discriminatory indices were satisfactory for the reviewed MCQs. Item flaws were present in 20% of the MCQs, with negative statements being the most common flaw. There was no significant difference in difficulty level between questions with and without flaws, but questions with flaws had a better discriminatory index than those without flaws.

