Description Usage Arguments Details Value
View source: R/gene_statistics.R
This function performs a linear regression on the log-transformed values of
the variance vs mean, i.e., log(variance)~log(mean)
. Genes whose
(log(mean), log(variance)) points stay above the fitted line can be
considered high variance genes. These values will have a negative
residual (.resid
column in dataframe residuals
).
1 | mean_variance_fit(gene_stat_df)
|
gene_stat_df |
A dataframe of at least three columns: |
To keep only the high variance genes, you can filter by residual:
mean_variance_fit(foo)$residuals %>% dplyr::filter(.resid < 0)
.
A list of three dataframes:
Information about the regression coeficients. Each row is a regression coefficient.
Information on the goodness of fit.
Information about each fitted point. The column
.resid
contains the difference between the fitted line and the data
point, i.e., \hat{y}-y. Please note this is not the most customary
definition of residual:
broom issue 802.
But, given this definition, the high variance genes are indeed those that
show a negative residual.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.