Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree

Summary: We propose a new classification method, the Optimal Bit String Tree (OBSTree), to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its optimal chromosome together form the splitting variable. A new stochastic search scheme, combining weighted sampling, simulated annealing, and a trimming procedure, optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase inhibitors show that OBSTree identifies QSAR rules accurately and effectively and finds different classes of active compounds. Details of the algorithm, SAS code, and the simulated and real datasets are available online as supplementary materials.
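
The idea of a descriptor set paired with a chromosome can be illustrated with a minimal sketch. This is only one reading of the summary, not the authors' algorithm: it assumes binary descriptors, treats the chromosome as one presence/absence bit per selected descriptor, and replaces the stochastic search (weighted sampling, simulated annealing, trimming) with plain exhaustive enumeration over a fixed descriptor set. The function names, the toy data, and the score (actives minus inactives among matching compounds) are all hypothetical choices made for illustration.

```python
from itertools import product

def matches(compound, descriptor_set, chromosome):
    """True if the compound shows the chromosome's presence/absence pattern
    on every descriptor in the set (assumed interpretation)."""
    return all(compound[j] == bit for j, bit in zip(descriptor_set, chromosome))

def split(compounds, descriptor_set, chromosome):
    """Return indices of compounds that match the pattern and those that do not."""
    hit = [i for i, c in enumerate(compounds)
           if matches(c, descriptor_set, chromosome)]
    hit_set = set(hit)
    miss = [i for i in range(len(compounds)) if i not in hit_set]
    return hit, miss

def best_chromosome(compounds, active, descriptor_set):
    """Exhaustively try every bit string for a fixed descriptor set.
    The score is a hypothetical stand-in: actives minus inactives among matches."""
    def score(chromosome):
        hit, _ = split(compounds, descriptor_set, chromosome)
        return sum(1 if active[i] else -1 for i in hit)
    return max(product((0, 1), repeat=len(descriptor_set)), key=score)

# Toy data (invented): rows are compounds, columns are binary descriptors.
compounds = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]
active = [True, True, False, False, True]
descriptor_set = (0, 1)  # split on descriptors 0 and 1 jointly

chromosome = best_chromosome(compounds, active, descriptor_set)
hit, miss = split(compounds, descriptor_set, chromosome)
```

In this toy case the selected pattern is "descriptor 0 present, descriptor 1 absent", which isolates the three active compounds. Enumeration costs 2^k evaluations for k descriptors, which is presumably why a stochastic search is used when descriptor sets are drawn from a high-dimensional space.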

  • Topics: Data Quality
  • Keywords: Throughput, Screening, Biomedical, Chemical, Pharmaceutical industry, Classification, Tree diagrams, Attribute data, Drug discovery, Prediction, QSAR, Simulated annealing
  • Author: Zhang, Ke; Hughes-Oliver, Jacqueline M.; Young, S. Stanley
  • Journal: Technometrics