Sparse Logistic Regression: Comparison of Regularization and Bayesian implementations

Mattia Zanon, Giuliano Zambonin, Gian Antonio Susto, Seán McLoone

Research output: Contribution to journal › Article › peer-review


Abstract

In knowledge-based systems, besides obtaining good output prediction accuracy, it is crucial to understand the subset of input variables that have the most influence on the output, with the goal of gaining deeper insight into the underlying process. These requirements call for logistic model estimation techniques that provide a sparse solution, i.e., where coefficients associated with non-important variables are set to zero. In this work we compare the performance of two methods: the first is based on the well-known Least Absolute Shrinkage and Selection Operator (LASSO), which involves regularization with an L1 norm; the second is the Relevance Vector Machine (RVM), which is based on a Bayesian implementation of the linear logistic model. The two methods are extensively compared in this paper, on real and simulated datasets. Results show that, in general, the two approaches are comparable in terms of prediction performance. RVM outperforms the LASSO both in terms of structure recovery (estimation of the correct non-zero model coefficients) and prediction accuracy as the dimensionality of the data increases. However, LASSO shows performance comparable to RVM when the dimensionality of the data is much higher than the number of samples, i.e., p >> n.
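As an illustrative sketch (not code from the paper), the LASSO-style approach described above can be reproduced with scikit-learn's L1-penalised logistic regression on synthetic data in which only a few input variables truly influence the output; the dataset, coefficients, and regularization strength below are assumptions chosen for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 200, 20                     # n samples, p input variables
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]        # only 3 variables are truly relevant
probs = 1.0 / (1.0 + np.exp(-(X @ beta)))
y = (rng.random(n) < probs).astype(int)

# The L1 penalty drives coefficients of non-important variables to exactly zero
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)

selected = np.flatnonzero(np.abs(model.coef_[0]) > 1e-6)
print("non-zero coefficients at indices:", selected)
```

The set of surviving indices illustrates what the abstract calls structure recovery: with a suitable penalty strength `C`, coefficients of irrelevant variables are shrunk exactly to zero, leaving a sparse, interpretable model.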
Original language: English
Article number: 137
Number of pages: 24
Journal: Algorithms
Volume: 13
Issue number: 6
DOIs
Publication status: Published - 08 Jun 2020

