Using Contingency Table Approaches in Differential Item Functioning Analysis: A Comparison
Volume 4, Issue 4, July 2015, Pages: 139-148
Received: May 16, 2015; Accepted: Jun. 1, 2015; Published: Jun. 23, 2015
Jose Quito Pedrajita, Educational Research and Evaluation Department, Division of Educational Leadership and Professional Services, University of the Philippines College of Education, Diliman, Quezon City, Philippines
This study demonstrates differential item functioning (DIF) analysis using the test scores of 200 junior high school students on a Chemistry Achievement Test, a measure tested for its psychometric properties. Of the 200 examinees, 100 came from a public school and 100 from a private school; 100 were male and 100 were female; and, based on their English II grades, 95 were of low ability and 105 were of high ability. Four contingency table approaches (Chi-Square, Distractor Response Analysis, Logistic Regression, and the Mantel-Haenszel Statistic) were applied to identify test items indicating bias between examinees matched on school type, gender, and English ability. The results of the four approaches were then compared. The findings revealed items showing school type-, gender-, and English ability-based DIF. Logistic Regression and the Mantel-Haenszel Statistic showed a high degree of correspondence in identifying potentially biased test items.
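As a rough illustration of one of the contingency table approaches named above, the sketch below computes the Mantel-Haenszel common odds ratio and its continuity-corrected chi-square for a single item from 2x2 tables (group by correct/incorrect), one table per matched score level. The function name `mantel_haenszel`, the tuple layout, and the counts are assumptions made for illustration and are not taken from the study. The delta value uses the commonly cited rescaling of -2.35 times the natural log of the odds ratio, which the abstract itself does not mention.

```python
import math


def mantel_haenszel(tables):
    """Mantel-Haenszel DIF statistics for one item.

    tables: iterable of (a, b, c, d) counts, one tuple per matched score level:
        a = reference group, correct    b = reference group, incorrect
        c = focal group, correct        d = focal group, incorrect
    Returns (alpha, chi2, delta). Assumes at least one informative level,
    i.e. the sums used as denominators below are non-zero.
    """
    num = den = 0.0                   # pieces of the common odds ratio
    sum_a = sum_e = sum_var = 0.0     # pieces of the chi-square statistic
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:                     # level too small to contribute
            continue
        n_ref, n_foc = a + b, c + d   # group sizes at this level
        m1, m0 = a + c, b + d         # correct / incorrect totals at this level
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_e += n_ref * m1 / n       # expected value of a under no DIF
        sum_var += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    alpha = num / den                 # > 1 favors the reference group
    chi2 = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_var  # 1 degree of freedom
    delta = -2.35 * math.log(alpha)   # negative when the item favors the reference group
    return alpha, chi2, delta


if __name__ == "__main__":
    # Hypothetical counts for three score levels (not the study's data).
    example = [(12, 8, 9, 11), (15, 5, 11, 9), (18, 2, 16, 4)]
    alpha, chi2, delta = mantel_haenszel(example)
    print(f"alpha={alpha:.3f} chi2={chi2:.3f} delta={delta:.3f}")
```

An odds ratio near 1 (delta near 0) suggests that examinees of comparable ability in the two groups have similar odds of answering the item correctly. Larger departures flag the item for closer review rather than establish bias on their own.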
Jose Quito Pedrajita, Using Contingency Table Approaches in Differential Item Functioning Analysis: A Comparison, Education Journal, Vol. 4, No. 4, 2015, pp. 139-148.