Psychometricians are obsessed with item discrimination (producing a desired spread of student scores with the fewest items) and test reliability (getting the same average test score from repeated tests). Teachers and students need to know what has been mastered and what has yet to be learned. These two goals are not fully compatible.
In fact, mastery produces a score near 100%, and material yet to be learned a score near 0%, but psychometricians want an average test score near 50% to maximize their favorite calculations. Traditional multiple-choice (TMC) generally produces a convenient average classroom test score of 75% (25 points from guessing on four-option items, and 50 points from a mix of mastery and discriminating items).
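The 75% figure above can be reproduced with a little arithmetic: every item carries a guessing floor of 25% on four-option items, and the rest of the score comes from what students actually know. A minimal sketch (the function name and the two-thirds mastery level are illustrative assumptions, not from the original):

```python
def expected_tmc_score(p_known, n_options=4):
    """Expected TMC score when known items are marked right and
    unknown items are answered by guessing among all options."""
    guess = 1.0 / n_options
    return p_known + (1.0 - p_known) * guess

# A class that has mastered two thirds of the items averages 75%:
print(expected_tmc_score(2 / 3))  # 0.75 (within floating-point rounding)
```

Under this model the convenient classroom average of 75% corresponds to students knowing about two thirds of the material, with guessing supplying the rest.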
A TMC test ranks students by their performance on the test and their luck on test day. It does not ask them what they actually trust they know: what is of value, what is the basis for further learning and instruction (the information needed for effective formative assessment).
Pearson announced a modification to TMC in 2004 (the distractor-rationale taxonomy). In 2010 Pearson reported on a study using ordered multiple-choice (OMC) that still forces students to mark an answer to every item rather than use the test to report what they actually trust they know or can do (the basis for further learning and instruction).
The first report introduced OMC. The second demonstrated that it can actually be done. OMC ranks item distractors by the level of understanding each represents.
Other themes and counts of distractors can also be used. This method of writing distractors makes sense for any multiple-choice test. The big difference is in scoring the distractors.
An OMC test is administered with the weight for each option determined before the test is given. This requires priming (field testing) to discover items that perform as experts expect. With acceptable items in hand, the test is scored 1, 2, 3, and 4 for the four options, representing four levels of understanding (Minimal, Moderate, Significant, and Correct).
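The scoring rule above is simple enough to sketch: each marked option contributes its predetermined understanding-level weight, and the weights are summed. A minimal sketch, with illustrative names (the dictionary keys and function name are assumptions, not from the reports):

```python
# Option weights fixed before the test is administered, one per
# understanding level: Minimal, Moderate, Significant, Correct.
OMC_WEIGHTS = {"minimal": 1, "moderate": 2, "significant": 3, "correct": 4}

def score_omc(responses):
    """Sum the understanding-level weights over a student's marked options."""
    return sum(OMC_WEIGHTS[level] for level in responses)

# A student marking correct, significant, and moderate on three items
# earns 9 of a possible 12 points:
print(score_omc(["correct", "significant", "moderate"]))  # 9
```

The contrast with TMC is that a wrong mark still carries information: two students with the same number of right answers can earn different OMC scores depending on which distractors they chose.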
TMC involves subjective item selection by a teacher or test expert with right/wrong scoring. This ranks students. OMC involves both subjective item and subjective distractor selection with partial credit model scoring. OMC is a refinement of TMC.
OMC student rankings include an insight into student understanding. How practical OMC is, and how it can be applied in the classroom, is left for further study. I predict it will appear in standardized tests within a few years, after online testing provides the data needed to demonstrate its usefulness.
The OMC answer options are sensitive to how well a test matches student preparation. This fitness, the expected average test score when students do not know the right answer and guess after discarding all the options they know are wrong, is calculated by PUP520 for each test. It can range from the test design value (25% for a four-option test) to above 80% on a test that closely matches student preparation.
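The fitness idea can be illustrated with a short calculation: on each item the chance of guessing right is one over the number of options the student has not eliminated. A minimal sketch of that expectation, assuming uniform guessing among the surviving options (this is an illustration of the idea, not PUP520's actual procedure):

```python
def guessing_fitness(known_wrong_counts, n_options=4):
    """Expected test score when each item is answered by guessing
    uniformly among the options the student has not eliminated.
    known_wrong_counts[i] = distractors the student can discard on item i."""
    return sum(1.0 / (n_options - k) for k in known_wrong_counts) / len(known_wrong_counts)

# No distractors eliminated anywhere: the design value of 25%.
print(guessing_fitness([0, 0, 0, 0]))  # 0.25

# Preparation that lets students discard most distractors pushes
# the expected guessing score far above the design value:
print(guessing_fitness([3, 2, 3, 2]))  # 0.75
```

This shows why fitness rises with preparation: the better a test matches what students were taught, the more distractors they can rule out, and the more their "guesses" score.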
[Every test fits one small group of students better, and another small group worse, than it fits the class as a whole. This is just one part of luck on test day. There is no way to know which students are favored or disfavored using forced-choice testing. Judgment Multiple-Choice (JMC) permits each student to control quality independently of quantity.]
Another factor to consider when using OMC is that the number of answer options could be reduced to three (Minimal, Moderate, and Correct) to increase the portion of distractors that work as expected. Knowledge Factor uses only three answer options plus omit (JMC) in its patented instruction/assessment system that guarantees mastery.
My suggestion is to add one more option to OMC: omit. Then student judgment would also be measured along with that of the psychometricians and teachers. Judgment ordered multiple-choice (JOMC) would then be a refined, honest, accurate, and fair test.
We would know what students value as the basis for further learning and instruction by letting them tell us. This makes more sense than guessing what a student knows when half of the right marks may be just luck on test day.
Please encourage Nebraska to allow students to report what they trust they know and what they trust they have yet to learn. Blog. Petition. We need to foster innovation wherever it may take hold.