The relationship between negative stem and taxonomy of multiple-choice questions in residency pre-board and board exams

Introduction
Student assessment is one of the most important parts of any education program. Assessment, or evaluation, is a systematic process of collecting, analyzing, and interpreting information to determine the extent to which educational objectives are being met.1 The assessment process should give an accurate picture of each student's academic progress over different time periods and identify problems and shortcomings in education. An unsatisfactory assessment result may reflect low student activity, failures in planning and teaching, or unsuitable assessment methods.1-4
Multiple-choice tests can assess a large number of students in a short period of time. They are among the best types of objective test in terms of uniformity of questions, a lower chance of blind guessing than in true-false questions, and easy-to-score answer sheets. A multiple-choice question consists of a stem followed by four or five options, only one of which is the best answer. The incorrect options serve as plausible distractors. If multiple-choice questions are designed accurately and in line with testing principles, they can separate students with strong and weak abilities in the cognitive domain.5,6 To validate multiple-choice questions, researchers developed Millman's principles on the structure of the stem and options, and experts agree that these principles lead to successful academic testing. According to Millman's principles, if the stem is negative, the negative wording should be highlighted by underlining, italicizing, and bolding.5-7 Studies have shown that students spend more time answering negative stem questions and make more mistakes than with positive stem questions.8,9 Nevertheless, negative stem questions remain in use and make up 15%-36% of multiple-choice questions in medical tests.10,11
Multiple-choice tests are the most common objective tests used for cognitive domain evaluation in the medical field, and they are frequently used as pre-board and board exams for clinical specialties in Iran. After the questions have been written, the test administered, and the students' scores determined, a quantitative and qualitative analysis is needed to ensure the quality of the questions. Among the qualitative indicators of multiple-choice tests, the taxonomy level of the questions and adherence to structural principles are examined most often.12
This study aimed to examine the relationship between negative stem questions and multiple-choice question taxonomy in the pre-board tests of Tabriz University of Medical Sciences and the national board tests in internal medicine, general surgery, pediatrics, and obstetrics and gynecology residency in 2010-2011.

Materials and Methods
In this cross-sectional study, a total of 2400 questions from 8 tests (board and pre-board) in 4 fields were studied. The relationship between negative stems in multiple-choice questions and question taxonomy was examined in the pre-board and board exams of the pediatrics, internal medicine, surgery, and obstetrics and gynecology residency programs in 2010-2011. The taxonomy level of each question (level I, II, or III) was determined by three experts in the relevant field and in medical education.13 Each question in every exam was reviewed by the project executive for negative stems. A stem was considered negative if the question body contained a negative word or concept such as not right, wrong, except, unless, but, least, not likely, or forbidden. The collected data were analyzed with SPSS 18. A chi-square test was used to test for significant differences between positive and negative stems. A P value <0.05 was considered significant.
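The negative stem criterion described above can be illustrated with a short keyword screen. This is only a sketch: in the study the review was done by a person, not by software, and the function name `is_negative_stem` and the keyword-matching approach are assumptions made here for illustration. A keyword screen of this kind would over-flag stems (for example, any stem containing "but"), so it could at most pre-sort questions for manual review.

```python
import re

# Negative cues listed in the Methods section. Treating them as plain
# keywords is an assumption of this sketch; the study used human review.
NEGATIVE_CUES = [
    "not right", "wrong", "except", "unless",
    "but", "least", "not likely", "forbidden",
]

# Whole-word, case-insensitive match on any of the cues.
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(cue) for cue in NEGATIVE_CUES) + r")\b",
    re.IGNORECASE,
)


def is_negative_stem(stem: str) -> bool:
    """Return True if the stem contains at least one listed negative cue."""
    return _CUE_PATTERN.search(stem) is not None


if __name__ == "__main__":
    # Hypothetical stems, not taken from the exams in the study.
    examples = [
        "All of the following are true EXCEPT:",
        "Which drug is the first-line treatment?",
    ]
    for stem in examples:
        print(is_negative_stem(stem), "-", stem)
```

Whole-word matching is used so that a cue such as "but" does not fire inside longer words such as "butter".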

Results
A total of 2400 questions from 8 tests (board and pre-board) in 4 fields (internal medicine, pediatrics, general surgery, and obstetrics and gynecology) were studied. The taxonomic distribution of the tests is shown in Table 1. Negative stems were distinguished by underlining the negative words or concepts in all questions. The number and percentage of questions with negative stems in the Tabriz University of Medical Sciences pre-board exams and the board written exams in internal medicine, general surgery, pediatrics, and obstetrics and gynecology for the residency period 2010-2011 are compared in Table 2. As seen in Table 2, the highest percentages of questions with negative stems were in the general surgery and pediatrics exams.
Of the total number of test questions, 23.1% in the pre-board exams and 16.6% in the board exams had negative stems. The difference between the two types of test was statistically significant (P = 0.0001). The relationship between question taxonomy and negative stems in the Tabriz University of Medical Sciences pre-board exams and the board written exams in internal medicine, general surgery, pediatrics, and obstetrics and gynecology for the residency period 2010-2011 is shown in Table 3. As shown in Table 3, there is an association between negative stems and question taxonomy: 63.9% of negative stem questions and 31.1% of positive stem questions were written at taxonomy level I (P = 0.0001).
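For readers who want to see how a comparison of this kind is computed, the following is a minimal sketch of a Pearson chi-square test of independence on a contingency table. The study itself used SPSS 18. The function names are assumptions of this sketch, no continuity correction is applied, and the counts in the usage example are hypothetical, because the underlying tables are not reproduced in the text. The closed-form P values cover only 1 degree of freedom (a 2 x 2 table such as stem type by exam type) and 2 degrees of freedom (a 2 x 3 table such as stem type by taxonomy level).

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi_square_p_value(statistic, df):
    """Upper-tail P value, using closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("this sketch supports only df = 1 or df = 2")


if __name__ == "__main__":
    # HYPOTHETICAL counts, not the study data:
    # rows = exam type, columns = (negative stem, positive stem).
    counts = [[30, 70], [20, 80]]
    statistic, df = chi_square_independence(counts)
    print(statistic, df, chi_square_p_value(statistic, df))
```

For larger tables, a statistics package would be the more practical choice, since the closed forms used here do not generalize beyond 2 degrees of freedom.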

Discussion
One of the most common problems with multiple-choice questions is a structural form that allows examinees to guess the correct answer when they are not knowledgeable enough. Failing to follow testing principles when preparing questions can produce structural forms that make questions easier or harder to answer and distort examinees' performance.14 One of the difficulties in preparing multiple-choice questions is that it is sometimes hard to write distractors, that is, wrong options that appear plausible. In such cases, if correct options are easier to write, a negative stem question can be used in which all options except one are correct.1
Based on the results of this study, the percentage of negative stem questions in the studied tests ranged from 11.6% to 35%. Although no standard percentage has been set for negative stem questions, some studies report a figure of 15%-36%.10,11 A study by Harasym et al showed that the use of negative words in question stems made questions easier and increased scores.15 However, Tamir reported that negative stem questions with a high taxonomy level made questions more difficult.16 Other studies have noted that negative or positive wording in the question stem has no effect on the difficulty of questions or on the assessment of learning at higher levels.17,18 Harasym et al emphasized restricting the use of negative stem questions and recommended replacing single-response negatively-worded questions with multiple-response positively-worded questions.10
The use of negative verbs in the stem makes questions more difficult and confuses the examinee, because this type of question forces the examinee to mentally convert the question from negative to positive and then look for the correct answer.10,19,20 A study by Harasym et al showed that the validity and reliability of multiple-response positively-worded questions are higher than those of single-response negatively-worded questions and that the former assess students' academic performance better. It also indicated that negative words in the question stem should be limited to questions in which it is important for the examinee to learn what must not be done. Most learners, however, need to learn what to do, and positive words should be used in the stems of such questions. Moreover, choosing the correct option in a negative stem question does not show that the examinee knows the correct course of action, which is the learning goal.10
Based on the results of this study, 31.1% of positive stem questions and 63.9% of negative stem questions were written at taxonomy level I (P = 0.0001). Although writing a question at taxonomy level I does not necessarily make it easy, it encourages residents to learn the content by memorization, whereas the main goal in clinical fields is interpreting and analyzing data and solving problems. The predominance of memory-based (taxonomy level I) questions among the negative stem questions in this study appears not only to weaken test validity but also to push students toward superficial learning and memorization. Although words with negative concepts are considered structurally acceptable under the ministry's instructions, the results of this study indicate that negative stem questions were significantly more likely to be written at taxonomy level I.

Study limitations
The relationship between negative stem questions and the quantitative results of the questions could not be examined because the difficulty index and discrimination index were unavailable.

Conclusion
Negative stem questions were significantly more likely to be low-level cognitive questions. It is recommended that negative words in question stems be limited to questions in which it is important for the examinee to learn what must not be done, and that negative stem questions be considered structurally undesirable.

Ethical approval
Not applicable.

Competing interests
None.

Acknowledgements
This paper is the result of a research project approved by the Medical Education Research Center, Tabriz University of Medical Sciences. The researchers would like to express their deepest gratitude to Ms. Zakieh Ebadi, who kindly cooperated in entering data into statistical software.

Table 1. Taxonomic distribution of the written pre-board tests of Tabriz University of Medical Sciences and the board exams in internal medicine, general surgery, pediatrics, and obstetrics and gynecology for the residency period 2010-2011

Table 2. Comparison of questions with negative stems in the Tabriz University of Medical Sciences pre-board exams and the board written exams in internal medicine, general surgery, pediatrics, and obstetrics and gynecology for the residency period 2010-2011

Table 3. The relationship between the number and percentage of questions with negative stems and the distribution of question taxonomy in internal medicine, general surgery, pediatrics, and obstetrics and gynecology for the residency period 2010-2011