The Cluster Elastic Net for High-Dimensional Regression with Unknown Variable Grouping

Summary: [This abstract is based on the authors' abstract.] In the high-dimensional regression setting, the elastic net produces a parsimonious model by shrinking all coefficients toward the origin. In some settings this behavior is undesirable: if a subset of features is highly correlated and also associated with the response, we might wish to perform less shrinkage on the coefficients for that subset. This article proposes the cluster elastic net, which selectively shrinks the coefficients for such variables toward each other rather than toward the origin. Instead of assuming that the clusters are known a priori, the cluster elastic net infers clusters of features from the data, on the basis of both the correlation among the variables and their association with the response. These clusters are then used to perform regression more accurately. The authors demonstrate the theoretical advantages of the proposed approach and explore its performance in a simulation study and in an application to HIV drug resistance data. Supplementary materials are available online.
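
The difference between shrinking toward the origin and shrinking toward each other can be illustrated with a small sketch. It is not the authors' algorithm: it uses only two correlated features that are assumed to belong to one known cluster, it omits the sparsity-inducing L1 term, and the penalty form (lam/2) * ||x1*b1 - x2*b2||^2 is an assumption made here for illustration. All function names and the toy data are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit_norm(col):
    # Scale a feature column to unit Euclidean norm so penalties are comparable.
    norm = math.sqrt(dot(col, col))
    return [a / norm for a in col]

def solve_2x2(a11, a12, a22, r1, r2):
    # Solve the symmetric system [[a11, a12], [a12, a22]] b = [r1, r2].
    det = a11 * a22 - a12 * a12
    return ((a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det)

def fit_pair(x1, x2, y, lam, penalty):
    # Minimize 0.5 * ||y - x1*b1 - x2*b2||^2 plus one of three penalties.
    g11, g12, g22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    r1, r2 = dot(x1, y), dot(x2, y)
    if penalty == "none":
        # Ordinary least squares.
        return solve_2x2(g11, g12, g22, r1, r2)
    if penalty == "origin":
        # (lam/2) * (b1^2 + b2^2): pulls both coefficients toward zero.
        return solve_2x2(g11 + lam, g12, g22 + lam, r1, r2)
    if penalty == "together":
        # (lam/2) * ||x1*b1 - x2*b2||^2: pulls the two fitted
        # contributions toward each other (illustrative assumption).
        return solve_2x2((1 + lam) * g11, (1 - lam) * g12,
                         (1 + lam) * g22, r1, r2)
    raise ValueError("unknown penalty: " + penalty)

# Toy data: two features with correlation 0.9, both related to the response.
x1 = unit_norm([-2.0, -1.0, 0.0, 1.0, 2.0])
x2 = unit_norm([-2.0, -1.0, 0.0, 2.0, 1.0])
y = [-3.5, -2.5, 0.0, 2.5, 3.5]

b_ols = fit_pair(x1, x2, y, 0.0, "none")
b_origin = fit_pair(x1, x2, y, 1.0, "origin")
b_together = fit_pair(x1, x2, y, 1.0, "together")

for name, b in [("none", b_ols), ("origin", b_origin), ("together", b_together)]:
    print(name, "gap:", abs(b[0] - b[1]), "sum:", b[0] + b[1])
```

On this toy data, the "together" penalty narrows the gap between the two coefficients while leaving their sum closer to the unpenalized fit than the "origin" penalty does. The method described in the abstract additionally infers the clusters from the data, which this sketch does not attempt.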

  • Topics: Data Quality
  • Keywords: Cluster analysis, Clustering, Shrinkage, Linear regression, Pharmaceutical industry, Correlated data
  • Authors: Witten, Daniela M.; Shojaie, Ali; Zhang, Fan
  • Journal: Technometrics