These data are from a cytotoxicity assay conducted by the Scripps Research Institute Molecular Screening Center. A total of 500 compounds were assessed for toxicity using the Jurkat human T-cell line; 50 of these compounds (10%) were active (toxic). Visit http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=364 for more details.
A data frame with 500 rows and 173 columns. The first column contains the compound IDs. The second contains the outcome of the assay (a binary variable indicating active/inactive). The remaining 171 columns are chemical descriptors from two descriptor sets, both computed using the PowerMV software; see Liu et al. (2005) for more information. The first set, of 24 continuous descriptors, is a modification of the Burden number descriptors (Burden, 1989). The second set contains 147 binary descriptors indicating the presence/absence of "pharmacophore" features, described in more detail in Liu et al. (2005).
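The column layout described above can be checked with a little arithmetic. The following Python sketch is illustrative only and does not load the data set; it assumes the Burden number block comes before the pharmacophore block, as the order of the description suggests. The names LAYOUT and column_ranges are hypothetical and are not part of any package.

```python
# Column blocks in the order the description lists them: (name, width).
# Assumption: the Burden block precedes the pharmacophore block in the data.
LAYOUT = [
    ("compound id", 1),
    ("assay outcome", 1),
    ("Burden number descriptors", 24),
    ("pharmacophore descriptors", 147),
]


def column_ranges(layout):
    """Return {name: (first, last)} as 1-based, inclusive column positions."""
    ranges = {}
    start = 1
    for name, width in layout:
        ranges[name] = (start, start + width - 1)
        start += width
    return ranges


ranges = column_ranges(LAYOUT)
total_columns = sum(width for _, width in LAYOUT)

# Share of active (toxic) compounds stated in the description: 50 of 500.
active_fraction = 50 / 500

for name, (first, last) in ranges.items():
    print(f"{name}: columns {first}-{last}")
print("total columns:", total_columns)
print("active fraction:", active_fraction)
```

Under that ordering assumption, the Burden number descriptors occupy columns 3 to 26 and the pharmacophore descriptors occupy columns 27 to 173, which accounts for all 173 columns.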
Burden, F. R. (1989). Molecular identification number for substructure searches. Journal of Chemical Information and Computer Sciences, 29(3), 225-227.
Liu, K., Feng, J., & Young, S. S. (2005). PowerMV: A software environment for molecular viewing, descriptor generation, data analysis and hit evaluation. Journal of Chemical Information and Modeling, 45(2), 515-522.