Linear Regression with Regularization: A Comparative Study

Silvio Cabral Patricio, Fernando K. Inaba, Matheus Cornejo

Publication: Conference contribution without publisher/journal › Poster › Research › peer-reviewed

Abstract

Variable selection methods such as backward elimination, forward selection, and stepwise selection are commonly used to obtain models with lower prediction error and greater interpretability. However, because selection is a discrete process (each variable is either retained or discarded), the resulting model may have high variance and therefore fail to reduce the prediction error relative to the full model. An alternative approach to improving interpretability and prediction error is regularized regression, which shrinks the coefficients toward zero. Ridge, lasso, and elastic net are among the most widely used regularization methods. This work presents a comparative study of regression models fitted with ridge, lasso, and elastic net regularization, together with a model obtained by stepwise variable selection and the full model. Synthetic data sets with different sample sizes and regressors were generated for the comparison, and a real data set was also used.
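The kind of comparison described in the abstract can be illustrated with a small sketch. It is not the authors' code: the coordinate-descent solver, the penalty parametrization (`lam`, `l1_ratio`), the synthetic data design, and the penalty values below are all assumptions chosen for illustration, and the stepwise-selection model is omitted. The solver minimizes (1/2n)·||y − Xb||² + lam·(l1_ratio·||b||₁ + 0.5·(1 − l1_ratio)·||b||₂²), so one routine covers the full model, ridge, lasso, and elastic net.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(X, y, lam, l1_ratio, n_iter=200):
    """Cyclic coordinate descent for the elastic net objective.

    l1_ratio=1 gives the lasso, l1_ratio=0 gives ridge, lam=0 gives
    ordinary least squares. No intercept is fitted (assumption: the
    data are generated without one).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X*beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes coordinate j
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * l1_ratio) / (col_sq[j] + lam * (1.0 - l1_ratio))
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def mse(X, y, beta):
    """Mean squared prediction error of the linear model beta on (X, y)."""
    errors = (yi - sum(x * b for x, b in zip(row, beta)) for row, yi in zip(X, y))
    return sum(e * e for e in errors) / len(y)


def make_data(n, true_beta, rng, noise_sd=1.0):
    """Hypothetical synthetic design: independent N(0, 1) regressors plus Gaussian noise."""
    X = [[rng.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, noise_sd) for row in X]
    return X, y


rng = random.Random(0)
true_beta = [3.0, -2.0, 1.5] + [0.0] * 7  # assumed: 3 relevant, 7 irrelevant regressors
X_train, y_train = make_data(60, true_beta, rng)
X_test, y_test = make_data(200, true_beta, rng)

# (lam, l1_ratio) pairs are arbitrary illustrative choices, not tuned values
settings = {
    "full": (0.0, 0.0),
    "ridge": (0.5, 0.0),
    "lasso": (0.5, 1.0),
    "elastic net": (0.5, 0.5),
}
fits = {
    name: elastic_net_cd(X_train, y_train, lam, l1_ratio)
    for name, (lam, l1_ratio) in settings.items()
}

for name, beta in fits.items():
    nonzero = sum(1 for b in beta if b != 0.0)
    print(f"{name:12s} nonzero coefficients: {nonzero:2d}  test MSE: {mse(X_test, y_test, beta):.3f}")
```

In practice the penalty strength would be chosen by cross-validation rather than fixed in advance, and an intercept would be fitted or the data centered first. The sketch only shows the qualitative difference the abstract points to: ridge shrinks all coefficients without removing any, while the L1 term in lasso and elastic net can set some coefficients exactly to zero.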
Original language: English
Publication date: 2016
Status: Published - 2016
Published externally: Yes
Event: XII Semana de Estatística - Vitoria, Brazil
Duration: 11 Oct 2016 → …

Conference

Conference: XII Semana de Estatística
Country/Territory: Brazil
City: Vitoria
Period: 11/10/2016 → …
