Methods for Finding Informative Correlated Predictors in High Dimensional Linear Regression Models

Niharika Gauraha

Abstract: We discuss an important research problem: the selection of covariates that are both highly informative and correlated with each other. Most variable selection methods exclude highly correlated covariates from the regression equation, which can lead investigators to overlook significant features. In high dimensional settings especially, independence among predictors is unlikely to hold. The need to find important predictors (which may also be strongly correlated) for better model interpretation arises in many applications, such as bioinformatics and genomics. We consider mainly the following two approaches to correlated variable selection:

  • Two-stage procedures that first cluster or group correlated predictors and then fit the model group-wise, for example the Cluster Lasso methods and Stability Feature Selection using Cluster Representative Lasso.
  • Simultaneous clustering and model fitting, which combines two different penalties. For example, the Elastic Net combines the Lasso (L1) penalty and the Ridge (L2) penalty.
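
As an illustration of the second approach, one common way to write the Elastic Net estimator is given below. This is a sketch under assumed notation that does not appear in the abstract: y is the response vector, X the n-by-p design matrix, beta the coefficient vector, and lambda_1, lambda_2 >= 0 the tuning parameters. The scaling of the loss term varies between references.

```latex
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p}
  \left\{ \frac{1}{2n} \lVert y - X\beta \rVert_2^2
  + \lambda_1 \lVert \beta \rVert_1
  + \lambda_2 \lVert \beta \rVert_2^2 \right\}
```

Setting lambda_2 = 0 recovers the Lasso, and setting lambda_1 = 0 recovers Ridge regression. With lambda_2 > 0 the objective is strictly convex, which tends to give strongly correlated predictors similar coefficients instead of arbitrarily keeping only one of them.
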
Time of Seminar: 2017-10-10, 13:00 to 14:00