Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data
American Journal of Theoretical and Applied Statistics
Volume 4, Issue 3, May 2015, Pages: 78-84
Received: Mar. 10, 2015; Accepted: Mar. 24, 2015; Published: Mar. 30, 2015
Authors
Kamal Darwish, Yildiz Technical University, Department of Statistics, Istanbul, Turkey
Ali Hakan Buyuklu, Yildiz Technical University, Department of Statistics, Istanbul, Turkey
Abstract
Datasets in which the number of predictors p exceeds the sample size n have become common in recent years. Such datasets make it difficult to build a good linear prediction model. The problem becomes harder still when the data contain a fraction of outliers or other contamination. Methods are therefore needed that are sparse and robust at the same time. In this paper, we follow the MM-estimation approach and propose L1-penalized MM-estimation (MM-Lasso). The proposed estimator combines the sparse LTS estimator with penalized M-estimators to obtain a sparse model with a high breakdown value and good prediction performance. We implemented MM-Lasso in the C programming language. A simulation study demonstrates the favorable prediction performance of MM-Lasso.
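The idea of refining a robust initial fit with an L1-penalized M-step can be pictured with a minimal Python sketch. It is not the authors' C implementation, which is not reproduced on this page. The sketch assumes that an initial coefficient vector `beta0` and a residual scale `scale` are already available (for example from a sparse LTS fit), and then alternates Tukey bisquare reweighting with one lasso coordinate-descent sweep. The function names, the tuning constant 4.685, the omission of an intercept, and the scaling of `lam` are illustrative assumptions rather than details taken from the paper.

```python
def tukey_weight(u, c=4.685):
    """IRLS weight psi(u)/u for Tukey's bisquare loss; zero beyond the cutoff c."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def soft_threshold(z, g):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def penalized_m_step(X, y, beta0, scale, lam, c=4.685, n_iter=100, tol=1e-10):
    """Hypothetical helper: iteratively reweighted lasso without an intercept.

    Each pass approximately minimises
        0.5 * sum_i w_i * (y_i - x_i' beta)^2 + lam * sum_j |beta_j|
    where the weights w_i are recomputed from the scaled residuals.
    X is a list of rows, y a list of responses.
    """
    n, p = len(X), len(X[0])
    beta = list(beta0)
    for _ in range(n_iter):
        old = list(beta)
        # Reweighting: observations with large scaled residuals get weight 0.
        resid = [y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)]
        w = [tukey_weight(r / scale, c) for r in resid]
        # One coordinate-descent sweep on the weighted lasso problem.
        for j in range(p):
            d = sum(w[i] * X[i][j] ** 2 for i in range(n))
            if d == 0.0:
                beta[j] = 0.0
                continue
            z = 0.0
            for i in range(n):
                partial = y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                z += w[i] * X[i][j] * partial
            beta[j] = soft_threshold(z, lam) / d
        # Stop once the coefficients no longer change.
        if max(abs(a - b) for a, b in zip(beta, old)) < tol:
            break
    return beta
```

In this sketch an observation whose scaled residual exceeds the cutoff contributes nothing to the fit, which is how outliers are kept from influencing the coefficients, while the soft-thresholding step sets small coefficients exactly to zero. The quality of the result depends on the initial estimate and scale being robust themselves.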
Keywords
MM Estimate, Sparse Model, LTS Estimate, Robust Regression
To cite this article
Kamal Darwish, Ali Hakan Buyuklu, Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data, American Journal of Theoretical and Applied Statistics. Vol. 4, No. 3, 2015, pp. 78-84. doi: 10.11648/j.ajtas.20150403.12