A Parallel EM Algorithm for Model-Based Clustering Applied to the Exploration of Large Spatio-Temporal Data

Summary: [This abstract is based on the authors' abstract.] The authors develop a parallel expectation–maximization (EM) algorithm for multivariate Gaussian mixture models and use it to perform model-based clustering of a large climate dataset. Three variants of the EM algorithm are reformulated for parallel execution, and a new, faster variant is presented. All are implemented using the single program, multiple data (SPMD) programming model, which can draw on the combined memory of large distributed computer architectures to process larger datasets. Displaying the estimated mixture model rather than the data allows multivariate relationships to be explored in a way that scales to data of arbitrary size. The authors study the performance of their methodology on simulated data and apply it to a high-resolution climate dataset produced by the Community Atmosphere Model (CAM5). This article has supplementary material online.
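
To make the idea of a data-parallel EM iteration concrete, the sketch below shows one textbook EM step for a Gaussian mixture in which each data partition computes its own sufficient statistics and the partitions are then summed. It is an illustration under stated assumptions, not the authors' implementation and not one of the variants described in the article: the function names `local_stats` and `em_step` are hypothetical, the partitions are plain in-memory NumPy arrays, and the sum over partitions stands in for the collective reduction an SPMD program would perform across processes.

```python
import numpy as np

def local_stats(X, weights, means, covs):
    """E-step on one partition; returns that partition's sufficient statistics."""
    n, d = X.shape
    k = len(weights)
    log_r = np.empty((n, k))
    for j in range(k):
        diff = X - means[j]
        L = np.linalg.cholesky(covs[j])
        z = np.linalg.solve(L, diff.T)              # whitened residuals, shape (d, n)
        maha = np.sum(z * z, axis=0)                # squared Mahalanobis distances
        log_det = 2.0 * np.sum(np.log(np.diag(L)))
        log_r[:, j] = np.log(weights[j]) - 0.5 * (d * np.log(2 * np.pi) + log_det + maha)
    # Normalise responsibilities with the log-sum-exp trick for stability.
    m = log_r.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_r - m).sum(axis=1, keepdims=True))
    r = np.exp(log_r - log_norm)
    s0 = r.sum(axis=0)                              # (k,)      soft counts
    s1 = r.T @ X                                    # (k, d)    weighted sums
    s2 = np.einsum("nk,ni,nj->kij", r, X, X)        # (k, d, d) weighted outer products
    return s0, s1, s2, log_norm.sum()

def em_step(chunks, weights, means, covs):
    """One EM iteration over a list of partitions (hypothetical helper)."""
    parts = [local_stats(X, weights, means, covs) for X in chunks]
    # Assumption: this sum plays the role of a collective reduction across processes.
    s0, s1, s2, loglik = (sum(p[i] for p in parts) for i in range(4))
    d = s1.shape[1]
    new_weights = s0 / s0.sum()
    new_means = s1 / s0[:, None]
    new_covs = s2 / s0[:, None, None] - np.einsum("ki,kj->kij", new_means, new_means)
    new_covs = new_covs + 1e-6 * np.eye(d)          # small ridge keeps covariances positive definite
    return new_weights, new_means, new_covs, loglik

# Usage on simulated data: two well-separated clusters split into four partitions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(8.0, 1.0, (200, 2))])
rng.shuffle(X)                                      # shuffle rows in place
chunks = np.array_split(X, 4)
weights = np.array([0.5, 0.5])
means = np.array([[1.0, 1.0], [6.0, 6.0]])
covs = np.stack([np.eye(2), np.eye(2)])
for _ in range(10):
    weights, means, covs, loglik = em_step(chunks, weights, means, covs)
```

Because only the summed statistics are needed for the M-step, no partition ever has to hold the full dataset, which is the property that lets this style of computation use the combined memory of many machines.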

  • Topics: Statistics
  • Keywords: Time series, Gaussian curve, Space-time modeling, Multivariate time series, Clustering, Multivariate analysis
  • Authors: Chen, Wei-Chen; Ostrouchov, George; Pugmire, David; Prabhat; Wehner, Michael
  • Journal: Technometrics