Supervised Covariance Selection For Linear Discriminant Analysis
Hideitsu Hino, Nima Reyhani

Linear discriminant analysis relies on the sample covariance matrix, whose estimation is a major issue in many high-dimensional statistical problems. Covariance matrix estimation has been studied recently, and a number of solutions have been proposed for this problem. The naïve Bayes approach assumes that the covariates are independent, and it has been shown theoretically that this assumption yields high classification accuracy in high-dimensional settings. Here, we study the performance of other covariance estimators when extreme sparseness is not assumed, which comes at some computational cost compared to naïve Bayes. Our study shows that in some cases we can gain by using a covariance matrix with controlled sparseness. We then cast the covariance selection problem into the framework of empirical risk minimization and propose supervised covariance learning, which uses the label information in covariance matrix selection. The empirical results show that controlled sparseness and label information improve classification accuracy compared to naïve Bayes.
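
To make the idea of label-driven covariance selection concrete, the following is a minimal sketch and not the authors' algorithm. It assumes two classes with equal priors, uses hard thresholding of the pooled sample covariance as a stand-in for a sparseness-controlled estimator, and picks the threshold that minimizes the misclassification rate on held-out labelled data. All function names, the ridge term, and the threshold grid are hypothetical choices made for illustration. An infinite threshold keeps only the diagonal, which corresponds to the independence assumption of naïve Bayes.

```python
import numpy as np

def pooled_covariance(X, y):
    """Pooled within-class sample covariance for labels in {0, 1}."""
    n, p = X.shape
    S = np.zeros((p, p))
    for c in (0, 1):
        Xc = X[y == c]
        D = Xc - Xc.mean(axis=0)
        S += D.T @ D
    return S / (n - 2)

def threshold_covariance(S, t):
    """Zero off-diagonal entries with |s_ij| <= t; the diagonal is always kept."""
    T = np.where(np.abs(S) > t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

def lda_predict(X, mu0, mu1, Sigma, ridge=1e-3):
    """Two-class LDA rule with equal priors and a given covariance estimate.

    Thresholding can break positive definiteness; the small ridge is a crude
    safeguard assumed here for illustration only.
    """
    p = Sigma.shape[0]
    w = np.linalg.solve(Sigma + ridge * np.eye(p), mu1 - mu0)
    return ((X - 0.5 * (mu0 + mu1)) @ w > 0).astype(int)

def select_threshold(X_tr, y_tr, X_val, y_val, grid):
    """Pick the threshold with the lowest validation error (empirical risk)."""
    S = pooled_covariance(X_tr, y_tr)
    mu0 = X_tr[y_tr == 0].mean(axis=0)
    mu1 = X_tr[y_tr == 1].mean(axis=0)
    errors = []
    for t in grid:
        pred = lda_predict(X_val, mu0, mu1, threshold_covariance(S, t))
        errors.append(float(np.mean(pred != y_val)))
    best = int(np.argmin(errors))
    return grid[best], errors
```

Calling select_threshold with a grid such as [0.0, 0.05, 0.1, np.inf] compares the full sample covariance, two intermediate sparsity levels, and the diagonal (naïve Bayes style) estimate under the same classification criterion.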