Applicability Domain
The applicability domain (AD) of a quantitative structure-activity relationship (QSAR) model is the physico-chemical, structural or biological space, knowledge or information on which the training set of the model has been developed, and within which the model can be applied to make predictions for new compounds.
The purpose of the AD is to state whether the model's assumptions are met for a given compound. In general, this is the case when the model interpolates within the training data rather than extrapolates beyond it. Although there is no single generally accepted algorithm for determining the AD, a systematic approach for defining interpolation regions has been described[1]. It involves the removal of outliers and a probability density distribution method using kernel-weighted sampling. A rigorous benchmarking study of several AD algorithms identified the standard deviation of predictions across an ensemble of models as the most reliable approach[2].
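The ensemble standard deviation idea can be illustrated with a minimal sketch. It is not the procedure used in the benchmarking study: the bootstrap-trained linear models, the synthetic descriptors and the function name `ensemble_prediction_std` are all assumptions chosen for illustration. The sketch only shows that disagreement among ensemble members tends to grow for queries far from the training data.

```python
import numpy as np

def ensemble_prediction_std(X_train, y_train, X_query, n_models=50, seed=0):
    """Standard deviation of predictions across bootstrap-trained linear models.

    Hypothetical helper: a larger value means the ensemble members disagree
    more, which is read here as lower confidence in the prediction.
    """
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    # Add an intercept column to both design matrices.
    A_train = np.hstack([np.ones((n, 1)), X_train])
    A_query = np.hstack([np.ones((X_query.shape[0], 1)), X_query])
    preds = np.empty((n_models, X_query.shape[0]))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the training set
        coef, *_ = np.linalg.lstsq(A_train[idx], y_train[idx], rcond=None)
        preds[m] = A_query @ coef
    return preds.std(axis=0)

# Synthetic "descriptors" and "activity" (assumed data, illustration only).
rng = np.random.default_rng(42)
X_train = rng.normal(size=(100, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=100)

# One query near the centre of the training data, one far outside it.
X_query = np.array([[0.0, 0.0, 0.0], [8.0, 8.0, 8.0]])
std_near, std_far = ensemble_prediction_std(X_train, y_train, X_query)
print("std near training data:", std_near)
print("std far from training data:", std_far)
```

In practice a cut-off on this standard deviation would have to be calibrated on validation data; no particular threshold is implied here.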
To investigate the AD of a training set of chemicals, one can analyse the properties of the multivariate descriptor space of the training compounds directly, or more indirectly via distance (or similarity) metrics. When using distance metrics, care should be taken to work in an orthogonal vector space built from significant descriptors. This can be achieved by feature selection followed by principal component analysis.
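A distance-based check in an orthogonalised descriptor space might look like the following sketch. Several choices here are assumptions rather than a prescribed method: the descriptors are standardised, principal components are obtained by SVD, distance is measured from the training centroid in variance-scaled component scores, and the cut-off is the 95th percentile of training distances. The names `fit_pca_domain` and `in_domain` are hypothetical.

```python
import numpy as np

def fit_pca_domain(X_train, n_components=2, percentile=95.0):
    """Fit a simple PCA-based domain from training descriptors (illustrative)."""
    mean = X_train.mean(axis=0)
    scale = X_train.std(axis=0)
    scale[scale == 0] = 1.0  # guard against constant descriptors
    Z = (X_train - mean) / scale
    # Rows of Vt are the principal axes of the standardised descriptors.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    # Standard deviation of the training scores along each retained axis.
    score_std = s[:n_components] / np.sqrt(X_train.shape[0] - 1)
    scores = Z @ components.T / score_std
    # Distance of each training compound from the centroid (the origin).
    dists = np.linalg.norm(scores, axis=1)
    return {
        "mean": mean,
        "scale": scale,
        "components": components,
        "score_std": score_std,
        "threshold": np.percentile(dists, percentile),  # assumed cut-off
    }

def in_domain(model, X_query):
    """Return a boolean array: True where a query lies within the threshold."""
    Z = (X_query - model["mean"]) / model["scale"]
    scores = Z @ model["components"].T / model["score_std"]
    return np.linalg.norm(scores, axis=1) <= model["threshold"]

# Synthetic correlated descriptors driven by two latent factors (assumed data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
W = rng.normal(size=(2, 5))
X_train = latent @ W + rng.normal(scale=0.05, size=(200, 5))

model = fit_pca_domain(X_train, n_components=2)
X_query = np.vstack([
    X_train.mean(axis=0),        # at the training centroid
    np.array([6.0, 6.0]) @ W,    # far out along the latent factors
])
inside = in_domain(model, X_query)
print(inside)
```

Measuring distance only in the retained components ignores deviations orthogonal to them, so a fuller treatment would also monitor the residual distance to the component subspace.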
Notes
- [1] Jaworska J, Nikolova-Jeliazkova N, Aldenberg T. QSAR applicability domain estimation by projection of the training set descriptor space: a review. Altern Lab Anim. 2005;33(5):445-459.
- [2] Tetko IV, Sushko I, Pandey AK, Zhu H, Tropsha A, Papa E, Oberg T, Todeschini R, Fourches D, Varnek A. Critical assessment of QSAR models of environmental toxicity against Tetrahymena pyriformis: focusing on applicability domain and overfitting by variable selection. J Chem Inf Model. 2008;48(9):1733-1746.