This data set contains information about compounds used in drug discovery. Specifically, it consists of 5631 compounds on which an in-house solubility screen (measuring the ability of a compound to dissolve in a water/solvent mixture) was performed.
Based on this screen, compounds were categorized as either insoluble (n=3493) or soluble (n=2138). Then, for each compound, 72 continuous, noisy structural descriptors were computed.
A data frame with 5631 observations on 73 variables. Some rows have missing data.
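
As a quick consistency check on the figures quoted above, the two class counts should sum to the stated number of compounds, and their ratio shows how imbalanced the classes are. The following is a minimal sketch that uses only the numbers from the description; it does not load the data set itself, and the variable names are illustrative.

```python
# Class counts as stated in the data set description.
n_insoluble = 3493
n_soluble = 2138

# The two classes together should account for every compound.
n_total = n_insoluble + n_soluble

# Share of each class; the larger share is the accuracy a trivial
# "always predict the majority class" baseline would achieve.
frac_insoluble = n_insoluble / n_total
frac_soluble = n_soluble / n_total

print(f"total compounds: {n_total}")
print(f"insoluble: {frac_insoluble:.1%}, soluble: {frac_soluble:.1%}")
```

The classes are moderately imbalanced, so a classifier evaluated on this data should be compared against the majority-class share rather than against 50%.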