Statistical Models for Count Data
Science Journal of Applied Mathematics and Statistics
Volume 4, Issue 6, December 2016, Pages: 256-262
Received: Sep. 13, 2016; Accepted: Sep. 23, 2016; Published: Oct. 15, 2016
Authors
Alexander Kasyoki Muoka, Department of Basic and Applied Sciences, Jomo Kenyatta University of Agriculture and Technology-Westlands campus, Nairobi, Kenya
Oscar Owino Ngesa, Mathematics and Informatics Department, Taita Taveta University College, Voi, Kenya
Anthony Gichuhi Waititu, Department of Basic and Applied Sciences, Jomo Kenyatta University of Agriculture and Technology-Westlands campus, Nairobi, Kenya
Abstract
Statistical analyses involving count data may take several forms depending on the context of use: for example, simple counts such as the number of plants in a particular field, or categorical data in which counts represent the number of items falling in each of several categories. The most widely adopted model for analyzing count data is the Poisson model. Other models that can be considered for count data are the negative binomial and hurdle models. It is important that these models are systematically considered and compared before one is chosen over the others. In real-world situations, count data sets may contain zero counts that carry substantive meaning. In this work, statistical simulation was used to compare the performance of these count data models. Count data sets with different proportions of zeros were simulated, and the Akaike Information Criterion (AIC) was used to compare how well the count data models fit the simulated data sets. The results of the study indicate that the negative binomial model fits over-dispersed data better when the proportion of zeros is below 0.3, and that the hurdle model performs better when the proportion of zeros is 0.3 or above.
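The kind of comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names, the sample size, the Poisson rate of 3.0 and the seed are all assumptions chosen for illustration, the models are intercept-only, and the negative binomial model is left out because fitting it needs a numerical optimizer. The sketch simulates counts with a chosen proportion of zeros, fits a Poisson model and a hurdle Poisson model by maximum likelihood, and compares them with AIC = 2k - 2 ln L, where k is the number of estimated parameters.

```python
import math
import random


def draw_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_hurdle(rng, n, zero_prop, lam):
    """Simulate n counts: zero with probability zero_prop, else zero-truncated Poisson(lam)."""
    data = []
    for _ in range(n):
        if rng.random() < zero_prop:
            data.append(0)
        else:
            y = 0
            while y == 0:  # rejection step yields a zero-truncated draw
                y = draw_poisson(rng, lam)
            data.append(y)
    return data


def poisson_loglik(data, lam):
    """Poisson log-likelihood of the data at rate lam (lam must be positive)."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in data)


def aic_poisson(data):
    """AIC of an intercept-only Poisson model (one parameter)."""
    lam_hat = sum(data) / len(data)  # the Poisson MLE is the sample mean
    return 2 * 1 - 2 * poisson_loglik(data, lam_hat)


def aic_hurdle_poisson(data):
    """AIC of an intercept-only hurdle Poisson model (two parameters).

    Assumes the data contain both zeros and positive counts, and that the
    mean of the positive counts exceeds 1.
    """
    positives = [y for y in data if y > 0]
    n_pos = len(positives)
    n_zero = len(data) - n_pos
    pi0 = n_zero / len(data)  # MLE of the zero probability
    m = sum(positives) / n_pos
    # The truncated-Poisson MLE solves lam / (1 - exp(-lam)) = m; use bisection.
    lo, hi = 1e-9, m
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid / -math.expm1(-mid) < m:
            lo = mid
        else:
            hi = mid
    lam_hat = (lo + hi) / 2
    loglik = n_zero * math.log(pi0) + n_pos * math.log(1 - pi0)
    loglik += poisson_loglik(positives, lam_hat)
    loglik -= n_pos * math.log(-math.expm1(-lam_hat))  # truncation correction
    return 2 * 2 - 2 * loglik


rng = random.Random(2016)  # arbitrary seed, for reproducibility only
for zero_prop in (0.1, 0.3, 0.5):
    data = simulate_hurdle(rng, 1000, zero_prop, lam=3.0)
    print(zero_prop, round(aic_poisson(data), 1), round(aic_hurdle_poisson(data), 1))
```

The model with the lower AIC is preferred. Because the data here are generated from the hurdle mechanism itself, the sketch only illustrates the mechanics of the comparison and does not reproduce the study's results.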
Keywords
Count, Modeling, Simulation, AIC, Compare
Alexander Kasyoki Muoka, Oscar Owino Ngesa, Anthony Gichuhi Waititu, Statistical Models for Count Data, Science Journal of Applied Mathematics and Statistics. Vol. 4, No. 6, 2016, pp. 256-262. doi: 10.11648/j.sjams.20160406.12