We are often interested in the association between the expression levels of a set of genes in a cell or tissue and a measured physical trait, or phenotype. We may have prior knowledge of which genes likely cause a certain phenotype, and therefore want to predict the phenotype of a sample from its gene expression. Alternatively, we may want to determine whether a set of genes is a good predictor of a known phenotype (1). In either case, regression is a useful tool for quantifying this association (2): linear regression for a continuous phenotype, or logistic regression for a binary phenotype. Here we implement our own algorithms for linear and logistic regression, optimized for sparse data (or other large-scale data) as input. We calculate beta coefficient estimates together with their associated F statistics (linear regression), z-scores (logistic regression), and significance values. Our algorithm is based on iteratively reweighted least squares (IRWLS) (3).
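To make the IRWLS idea concrete, the following is a minimal sketch of logistic regression fitted by iteratively reweighted least squares, returning coefficient estimates and Wald z-scores. It is an illustration under stated assumptions, not the package's implementation: it uses dense Python lists rather than sparse matrices, and the function names (`solve`, `weights_and_response`, `gram`, `irls_logistic`) and the defaults `max_iter` and `tol` are hypothetical choices made for this example.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weights_and_response(X, y, beta):
    """Return IRLS weights w and working response z at the current beta."""
    w, z = [], []
    for row, yi in zip(X, y):
        eta = sum(xj * bj for xj, bj in zip(row, beta))
        mu = 1.0 / (1.0 + math.exp(-eta))
        wi = max(mu * (1.0 - mu), 1e-12)  # guard against zero weights
        w.append(wi)
        z.append(eta + (yi - mu) / wi)
    return w, z


def gram(X, w):
    """Return the weighted cross-product matrix X' W X."""
    p = len(X[0])
    return [[sum(wi * r[j] * r[k] for wi, r in zip(w, X)) for k in range(p)]
            for j in range(p)]


def irls_logistic(X, y, max_iter=25, tol=1e-8):
    """Fit logistic regression by IRLS; return (beta, z_scores)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        w, z = weights_and_response(X, y, beta)
        # Weighted least-squares step: solve (X' W X) beta = X' W z.
        rhs = [sum(wi * r[j] * zi for wi, r, zi in zip(w, X, z))
               for j in range(p)]
        new_beta = solve(gram(X, w), rhs)
        change = max(abs(nb - b) for nb, b in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    # Wald z-scores: beta_j / sqrt([(X' W X)^-1]_jj) at the fitted beta.
    w, _ = weights_and_response(X, y, beta)
    xtwx = gram(X, w)
    z_scores = []
    for j in range(p):
        unit = [1.0 if k == j else 0.0 for k in range(p)]
        var_j = solve(xtwx, unit)[j]
        z_scores.append(beta[j] / math.sqrt(var_j))
    return beta, z_scores
```

Each row of `X` is one sample, with a leading 1.0 for the intercept followed by expression values, and `y` holds the 0/1 phenotype. Every iteration recomputes the weights from the current fit and solves one weighted least-squares problem, which is what lets an implementation reuse sparse matrix products when the input is sparse.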