Fahad Al-Eidan ( Department of Pharmacology, King Saud Bin Abdul Aziz University of Health Sciences, Riyadh, Saudi Arabia. )
Lubna Ansari Baig ( APPNA Institute of Public Health, Jinnah Sindh Medical University, Karachi Pakistan. )
Mohi-Eldin Magzoub ( Department of Medical Education, King Saud Bin Abdul Aziz University of Health Sciences, Riyadh, Saudi Arabia. )
Aamir Omair ( Department of Medical Education, King Saud Bin Abdul Aziz University of Health Sciences, Riyadh, Saudi Arabia. )
Objectives: To assess the reliability and validity of a course evaluation tool, using a Haematology course as an example.
Methods: The cross-sectional study was conducted at King Saud Bin Abdul Aziz University of Health Sciences, Riyadh, Saudi Arabia, in 2012, while data analysis was completed in 2013. The 27-item block evaluation instrument was developed by a multidisciplinary faculty after a comprehensive literature review. Validity of the questionnaire was confirmed using principal component analysis with varimax rotation and Kaiser normalisation. Identified factors were combined to get the internal consistency reliability of each factor. Student's t-test was used to compare mean ratings between male and female students for the faculty and block evaluation.
Results: Of the 116 subjects in the study, 80(69%) were males and 36(31%) were females. Reliability of the questionnaire was Cronbach's alpha 0.91. Factor analysis yielded a logically coherent seven-factor solution that explained 75% of the variation in the data. The factors were group dynamics in problem-based learning (alpha 0.92), block administration (alpha 0.89), quality of objective structured clinical examination (alpha 0.86), block coordination (alpha 0.81), structure of problem-based learning (alpha 0.84), quality of written exam (alpha 0.91), and difficulty of exams (alpha 0.41). Female students' opinion on depth of analysis and critical thinking was significantly higher than that of the males (p=0.03).
Conclusion: The faculty evaluation tool used was found to be reliable, but its validity, as assessed through factor analysis, has to be interpreted with caution as the number of respondents was less than the minimum required for factor analysis.
Keywords: Haematology course, Reliability and validity, Saudi Arabia. (JPMA 66: 453; 2016)
Evaluation is an approach used to measure the quality and effectiveness of a programme, and it is an essential part of the medical education process.1,2 Medical teaching requires evaluation as part of its quality assurance and improvement procedures, which provide evidence of whether teaching standards are improving and how well course objectives are being achieved.3 Evaluation should be multi-dimensional, involving subjective and objective data to gather comprehensive qualitative and quantitative information on teaching processes and learning outcomes.3,4 Hence, there is a need to develop valid and reliable instruments for course evaluations.
Student evaluations of teaching (SET), generally using Likert-type scales, are the most commonly used methods of evaluating teaching in higher education.5 Despite questions regarding students' competency in evaluating faculty, it is generally agreed that only students are in a position to provide faculty and course evaluation.6-8 Using this approach to evaluate the quality of a course as a whole can be misleading, as student ratings might be biased by the initial interest of students,9 instructor reputation,10 and instructor enthusiasm.11,12
The knowledge and ability of the supervisors in conducting the course has been identified as an important factor affecting the scores of the students in that course.13,14 Harris et al. identified curricular design, administrative skills of the supervisors, and learning resources and environment as important factors for the success of a curriculum.15 Srinivasan et al. in 2011 identified, through a systematic review, six core competencies for medical educators: medical knowledge, learner centeredness, interpersonal and communication skills, professionalism and role modelling, practice-based reflection and improvement, and system-based learning.16
A systematic review by Beckman et al. on the reliability and validity of instruments used for clinical teaching found that the majority of the instruments used internal structure for validation.17 The most frequently used domains included evaluation of clinical teaching and interpersonal skills of the tutor, with the least used being motivation, delegation, punctuality and availability.17 The Boerboom et al. study from the veterinary school in the Netherlands obtained a five-factor solution for the Maastricht Clinical Teaching Questionnaire (MCTQ). All five factors had reliability ranging from 0.87 to 0.96 and included general learning climate (GLC), modelling, coaching, articulation and exploration.18 Kirschling et al.'s validation of a teaching effectiveness tool for nurses also yielded a five-factor solution, with some similarities to the Boerboom et al. study: knowledge and expertise, facilitative teaching, communication style, use of own experience, and feedback.18,19 Broomfield and Bligh validated the Course Experience Questionnaire (CEQ) for undergraduate medical education and found good reliability for its six factors, five of which were validated by Steele et al.20,21
The College of Medicine, King Saud bin Abdul Aziz University of Health Sciences (KSAU-HS), is constantly trying to improve the quality of its courses and relies heavily on course evaluation instruments. No study so far in the Kingdom of Saudi Arabia has assessed the reliability and validity of course evaluation instruments. The current study was planned to assess the reliability and validity of a course evaluation tool, using the Haematology course as an example.
Subjects and Methods
The cross-sectional study was conducted at KSAU-HS, Riyadh, Saudi Arabia, in 2012, while data analysis was completed in 2013. The College of Medicine in KSAU-HS has a four-year problem-based learning (PBL) programme adapted from the University of Sydney, Australia. The programme is delivered as organ-system-based courses (referred to as blocks) in the first two years (called stages 1 and 2), and then as discipline-based courses/blocks (called stage 3), inclusive of clerkships in Medicine, Surgery, Paediatrics, Gynaecology and Obstetrics, and Family Medicine, in the latter two years. The block is managed through a preceptor who is called the coordinator. The responsibilities of the block coordinator include organisation and smooth functioning of the block, assigning tutors and attachments for the students, developing assessment tools with other faculty members in the block, and dealing with students' learning issues during the block. This study was done with the haematology block that is offered in the first year over six weeks. The first five weeks of the block are for teaching and the last week is the exam week. The teaching is organised around the problem used for PBL sessions in that week.
After approval from the institutional review committee, the 27-item block evaluation instrument was developed for the study to assess various components of the block, including organisation (5 items), duration (1 item), quality of problems (9 items), performance of the block coordinator (3 items), quality of written exams (3 items), and quality of the objective structured clinical examination (OSCE) (6 items). The questionnaire was developed after an in-depth review and validation by faculty, in the light of block evaluation instruments available in the literature.17-21 Typically, block evaluation is conducted at the end of each week through a structured questionnaire on a five-point Likert scale (5 = excellent, 4 = very good, 3 = good, 2 = fair, 1 = poor). The results of all the weeks are then aggregated to get a comprehensive view of the entire block.
The study comprised male and female first-year students of the medical school who went through the haematology block. The internal consistency reliability of the instrument was assessed through Cronbach's alpha. To assess the validity of the questionnaire, factor analysis was done using principal component analysis with varimax rotation and Kaiser normalisation. Loadings below 0.3 were suppressed and the eigenvalue threshold was set at 1. The items loading on to the respective factors were combined to get the internal consistency reliability of each factor. To compare differences between males and females, Student's t-test was used to compare mean ratings for the faculty and block evaluation. Levene's test for equality of variances was done to ensure that the appropriate statistical test was applied. Levene's test was significant only for quality of block and, hence, the t-test that does not assume equal variances was used for that comparison.
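For readers who wish to replicate the internal consistency computation described above, a minimal sketch is given below. It assumes responses are arranged as a respondents-by-items matrix; the data here are simulated for illustration only and are not the study's data.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Internal consistency reliability for a (respondents x items) matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated data: 116 respondents rating 5 items on a 1-5 Likert scale,
# with correlated items (a shared 'base' rating plus small item noise).
rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(116, 1))
noise = rng.integers(-1, 2, size=(116, 5))
scores = np.clip(base + noise, 1, 5).astype(float)

print(round(cronbach_alpha(scores), 2))  # high alpha: items track each other
```

Items that move together yield an alpha near 1, while uncorrelated items drive it towards 0; this is why the two-item "difficulty of exams" factor reported below can plausibly show a much lower alpha than the others.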
Of the 116 subjects in the study, 80(69%) were males and 36(31%) were females. The reliability of the questionnaire was Cronbach's alpha 0.91. Factor analysis yielded a seven-factor solution that explained 75% of the variation in the data. The seven factors, which were logically coherent and matched the items on the faculty evaluation tool, were: group dynamics in PBL (alpha 0.92), block administration (alpha 0.89), quality of OSCE (alpha 0.86), block coordination (alpha 0.81), structure of PBL (alpha 0.84), quality of written exam (alpha 0.91), and difficulty of exams (alpha 0.41). There were some double loadings which were logically connected, such as sequence of activities loading on both block administration and block coordination. There was one isolated loading, of block duration with difficulty of exam, which we could not explain; it was therefore not included with any factor for calculating reliability (Table-1).
There was a significant difference between male and female students in their perception of the knowledge gained in the first week, "Always Tired" (p=0.03), and in the third week, "A Swollen Knee" (p=0.007) (Table-2).
There were no significant differences between males and females in their perception of the relevance and characteristics (problem was stimulating) of the problems used for PBL sessions (p>0.05). There was no statistically significant difference in the opinion of male and female students about the quality of problems used for PBL sessions (p>0.05).
Male students gave significantly higher ratings to the block book's clarity (p<0.001) and content (p=0.007) (Table-3), and thought that the block coordinator was more helpful (p<0.001) (Table-4), compared to female students. Male students also gave a significantly higher rating (3.8±1.1) to the quality of items in the written exam compared to female students (3.3±1.3) (p=0.04). There were no significant differences between male and female students for the quality and difficulty of the examination (Table-5).
The key finding of our study was that the faculty evaluation tool was a reliable and valid instrument. Also, there were no major differences between male and female students, although the instructors and block coordinators were different for the two groups.
Most studies in the literature that have validated teaching instruments found five- to seven-factor solutions with 1-43 items on the questionnaires.17-19 The seven-factor solution yielded by the present study had strong loadings from the 27 items on the questionnaire, which is empirical evidence of its construct validity. Six of the seven factors had high reliability, ranging from 0.81 to 0.92; the exception, "difficulty of exams", could be due to the fact that it comprised only two variables that did not load on to any other factor. The method and process of adducing evidence of reliability that we used is similar to that of the studies that have validated their teacher evaluation instruments.17-19 The items on the questionnaires used by two other studies20,21 assessed students' perception of learning from the course, whereas our instrument assessed students' perception of block organisation and management.
There were generally no statistically significant differences between the perceptions of males and females on any of the factors evaluating the course of study/block, which is not in concordance with the Dundee Ready Educational Environment Measure (DREEM) inventory that, when used in the Kingdom, found a positive inclination of females towards the learning environment.22 The same inventory, when used in Sweden, showed no difference between the perceptions of males and females.23 Although these studies assessed the learning environment, we stress that they are, at this time, the closest comparison for the instrument that we used for block evaluation. The pattern of differences in the opinions/perceptions of male and female students is similar to what is found internationally, even though the classes for male and female students are held separately in Saudi Arabia, they are taught by different instructors more than 90% of the time, and even the buildings are separate. Although our data are from one cohort of students in one block/course of study, we can be confident in saying that both male and female students had similar opinions on the quality of the organisation and management of the Haematology Block.
In terms of limitations, factor analysis requires a minimum of five responses for each variable; for the 27-item instrument this means 135 responses, so with 116 respondents the current study was short by 19 responses. Hence, the results have to be interpreted with caution as they may be unstable. Besides, the results of the study are preliminary as they are from one block of study, though the instrument can be used for other blocks as well. As a continuation of this study, we intend to use the data from other blocks to do a confirmatory factor analysis to adduce evidence of construct validity for this instrument.
The faculty evaluation tool was found to be reliable, but its construct validity has to be interpreted with caution as there were about 14% fewer respondents (116 of the 135 required) than needed. No major difference was found between male and female students in their perceptions of the quality of block organisation, except that male students thought that the block book was better, the block coordinator was more helpful, and the items on the written exam were of higher quality.
1. Morrison J. ABC of learning and teaching in medicine: Evaluation. BMJ. 2003; 326:385-7.
2. Cohen L, Manion L. Research methods in education. 4th ed. London: Routledge, 1994.
3. McOwen KS, Bellini LM, Morrison G, Shea JA. The development and implementation of a health-system-wide evaluation system for education activities: build it and they will come. Acad Med. 2009; 84:1352-9.
4. Snell L, Tallett S, Haist S, Hays R, Norcini J, Prince K, et al. A review of the evaluation of clinical teaching: new perspectives and challenges. Med Educ. 2000; 34:862-70.
5. McKeachie W. Student ratings: the validity of use. Am Psychol. 1997; 52:1218-25.
6. Coffey M, Gibbs G. The evaluation of the student evaluation of educational quality (SEEQ) questionnaire in UK higher education. Assess Eval Higher Educ. 2001; 26:89-93.
7. Cohen PA, McKeachie WJ. The role of colleagues in the evaluation of teaching. In: Peterson KD, ed. Teacher Evaluation. 2nd ed. California: Corwin Press; 2000, pp 147-54.
8. Kember D, Leung D, Kwan K. Does the use of student feedback questionnaires improve the overall quality of teaching? Assess Eval Higher Educ. 2002; 27:411-25.
9. Prave RS, Baril GL. Instructor ratings: Controlling for bias from initial student interest. J Educ Bus. 1993; 68:362-6.
10. Griffin BW. Instructor reputation and student ratings of instruction. Contemp Educ Psychol. 2001;26:534-52.
11. Naftulin DH, Ware JE, Donnelly FA. The Doctor Fox lecture: A paradigm of educational seduction. J Med Educ. 1973; 48:630-5.
12. Marsh HW, Ware JE. Effects of expressiveness, content coverage, and incentive on multidimensional student rating scales: New interpretations of the Dr. Fox effect. J Educ Psychol. 1982; 74:126-34.
13. Gathright MM, Thrush C, Jarvis R, Hicks E, Cargile C, Clardy J, et al. Identifying areas for curricular program improvement based on perceptions of skills, competencies, and performance. Acad Psychiatry. 2009; 33:37-42.
14. Steinert Y. Mapping the teacher's role: The value of defining core competencies for teaching. Med Teach. 2009; 31:371-2.
15. Harris DL, Krause KC, Parish DC, Smith MU. Academic competencies for medical faculty. Fam Med. 2007; 39:343-50.
16. Srinivasan M, Li ST, Meyers FJ, Pratt DD, Collins JB, Braddock C, et al. "Teaching as a Competency": competencies for medical educators. Acad Med. 2011; 86:1211-20.
17. Beckman TJ, Ghosh AK, Cook DA, Erwin PJ, Mandrekar JN. How reliable are assessments of clinical teaching? A review of the published instruments. J Gen Intern Med. 2004; 19:971-7.
18. Boerboom TB, Dolmans DH, Jaarsma AD, Muijtjens AM, Van Beukelen P, Scherpbier AJ. Exploring the validity and reliability of a questionnaire for evaluating veterinary clinical teachers' supervisory skills during clinical rotations. Med Teach. 2011; 33:e84-91.
19. Kirschling JM, Fields J, Imle M, Mowery M, Tanner CA, Perrin N, et al. Evaluating teaching effectiveness. J Nurs Educ. 1995; 34:401-10.
20. Broomfield D, Bligh J. An evaluation of the 'short form' course experience questionnaire with medical students. Med Educ. 1998; 32:367-9.
21. Steele G, West S, Simeon D. Using a modified course experience questionnaire (CEQ) to evaluate the innovative teaching of medical communication skills. Educ Health (Abingdon). 2003; 16:133-44.
22. Mojaddidi MA, Khoshhal KI, Habib F, Shalaby S, El-Bab ME, Al-Zalabani AH. Reassessment of the undergraduate educational environment in College of Medicine, Taibah University, Almadinah Almunawwarah, Saudi Arabia. Med Teach. 2013; 35:S39-46.
23. Edgren G, Haffling AC, Jakobsson U, McAleer S, Danielsen N. Comparing the educational environment (as measured by DREEM) at two different stages of curriculum reform. Med Teach. 2010; 32:e233-8.