Development of nearest neighbor classifiers identifying dermal sensitizers based on a local lymph node assay database.
Statistics: Harnessing the Power of Information. 2007 JSM Proceedings. Papers Presented at the Joint Statistical Meetings Salt Lake City, Utah, July 29 - August 2, 2007, and other ASA-sponsored conferences. Alexandria, VA: American Statistical Association, 2007 Aug; :CD-ROM
K-nearest-neighbor classifiers were developed to predict the skin sensitization potential of a new chemical from a murine local lymph node assay database of 178 organic chemicals. Two filters were compared for preselecting molecular descriptors: the Fisher's Discriminant Ratio filter selected a subset of descriptors that proved more discriminatory than the subset selected by the t-test filter. A step-forward search was then implemented to screen out extra descriptors and simplify the classifiers, guided by leave-one-out accuracy. Euclidean and Mahalanobis distance metrics were also examined, and the results showed that the Mahalanobis distance was appropriate for this study. The 3-nearest-neighbor classifier built on the 13 descriptors selected by these methods showed an especially balanced performance on this unbalanced dataset, with a sensitivity of 92% and a specificity of 81%.
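The pipeline described in the abstract (descriptor filtering, a Mahalanobis-distance k-nearest-neighbor vote, and leave-one-out evaluation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the two-class form of Fisher's Discriminant Ratio used here (squared mean difference over summed class variances), the covariance estimated once from all samples, and the synthetic data are all assumptions made for illustration, and the step-forward search is omitted.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-descriptor Fisher's Discriminant Ratio for two classes (labels 0/1).

    Assumed form: (mean0 - mean1)^2 / (var0 + var1).
    """
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0, ddof=1) + X1.var(axis=0, ddof=1)
    return num / den

def loo_knn_predict(X, y, k=3):
    """Leave-one-out k-NN predictions using squared Mahalanobis distance.

    Simplification: the covariance is estimated once from all samples,
    including the held-out one.
    """
    n = X.shape[0]
    # The pseudo-inverse guards against a singular covariance matrix.
    VI = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    preds = np.empty(n, dtype=int)
    for i in range(n):
        diff = X - X[i]
        d2 = np.einsum("ij,jk,ik->i", diff, VI, diff)
        d2[i] = np.inf  # exclude the held-out sample itself
        nearest = np.argsort(d2)[:k]
        preds[i] = int(y[nearest].sum() * 2 > k)  # majority vote
    return preds

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) with class 1 as the positive class."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    return tp / np.sum(y_true == 1), tn / np.sum(y_true == 0)

# Synthetic, unbalanced toy data (hypothetical, not the assay database).
rng = np.random.default_rng(0)
y = np.array([1] * 40 + [0] * 20)
X = rng.normal(size=(60, 6))
X[y == 1, :2] += 3.0  # only the first two descriptors are informative

# Keep the two descriptors with the highest ratio, then evaluate 3-NN.
top = np.argsort(fisher_ratio(X, y))[::-1][:2]
preds = loo_knn_predict(X[:, top], y, k=3)
sens, spec = sensitivity_specificity(y, preds)
print("selected descriptors:", sorted(top.tolist()))
print("sensitivity:", sens, "specificity:", spec)
```

Reporting sensitivity and specificity separately, rather than overall accuracy alone, is what makes it possible to judge whether a classifier is balanced when one class outnumbers the other, as in the dataset described above.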
Sampling; Sampling-methods; Simulation-methods; Statistical-analysis; Mathematical-models