This function fits a two-part semi-parametric model to metabolomics and proteomics data. A kernel-smoothed method is used to estimate the regression coefficients, and a likelihood ratio test is constructed for differential abundance analysis.
sumExp |
An object of 'SummarizedExperiment' class. |
VOI |
Variable of interest. The default, NULL, is valid only when there is a single covariate; otherwise it must be one of the column names in colData. |
... |
Additional arguments passed to |
Differential abundance analysis compares metabolomic or proteomic profiles between experimental groups. It uses a two-part model: a logistic regression model for the proportion of zeros and a semi-parametric model for the non-zero values. Let Y_ig be the random variable representing the abundance of feature g in subject i. The two-part model has the following form:
log(pi_ig/(1-pi_ig))=gamma_0g + gamma_g*X_i
log(Y_ig)=beta_g*X_i+ epsilon_ig
where pi_ig=Pr(Y_ig=0) is the probability of the point mass at zero, and X_i=(X_i1, X_i2,..., X_iQ)^T is a Q-vector of covariates that specifies the treatment conditions applied to subject i. The corresponding Q-vector of model parameters gamma_g=(gamma_1g, gamma_2g,...,gamma_Qg)^T quantifies the covariate effects on the fraction of zero values for feature g, and gamma_0g is the intercept. beta_g=(beta_1g, beta_2g,..., beta_Qg)^T is a Q-vector of model parameters quantifying the covariate effects on the non-zero values of the feature. The epsilon_ig are independent error terms with a common but completely unspecified density function f_g.
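To make the two-part structure concrete, the following is a minimal simulation sketch for a single feature, not code from the package. All coefficient values are hypothetical, there is a single binary covariate (Q = 1), and because the model leaves the error density f_g unspecified, a t-distributed error is used purely as one arbitrary choice.

```r
# Minimal simulation of the two-part model for one feature g (illustrative only).
set.seed(1)
n <- 200
x <- rep(c(0, 1), each = n / 2)      # single binary covariate (Q = 1)
gamma0 <- -1                         # hypothetical intercept of the logistic part
gamma1 <- 0.8                        # hypothetical effect on the zero proportion
beta1 <- 0.5                         # hypothetical effect on log non-zero abundance

# Binary part: pi_i = Pr(Y_i = 0), with logit(pi_i) = gamma0 + gamma1 * x_i
pi_zero <- plogis(gamma0 + gamma1 * x)
is_zero <- rbinom(n, size = 1, prob = pi_zero) == 1

# Non-zero part: log(Y_i) = beta1 * x_i + epsilon_i.
# The error density is unspecified in the model; t errors are an assumption here.
eps <- rt(n, df = 5)
y <- ifelse(is_zero, 0, exp(beta1 * x + eps))

# Empirical zero proportion per group, next to the model probabilities
tapply(y == 0, x, mean)
unique(pi_zero)
```

The simulated vector y contains exact zeros (the point mass) and strictly positive values, which is the data shape the two parts of the model are meant to describe.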
Hypothesis testing on the effect of the qth covariate on the gth feature is performed by assessing gamma_qg and beta_qg. Consider the null hypothesis H_0: gamma_qg = 0 and beta_qg = 0 against the alternative hypothesis H_1: at least one of the two parameters is non-zero. The p-value is calculated based on a chi-square distribution with 2 degrees of freedom. To adjust for multiple comparisons across features, the false discovery rate (FDR) q-value is calculated using the qvalue function in R/Bioconductor.
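As a rough sketch of the last two steps, the snippet below converts a vector of test statistics into p-values using a chi-square distribution with 2 degrees of freedom and then adjusts them across features. The statistics and feature names are made up, and the Benjamini-Hochberg adjustment from base R's p.adjust is used here only as a stand-in for the qvalue function, so the adjusted values will generally differ from the q-values the function reports.

```r
# Hypothetical two-part test statistics, one per feature (made-up values)
lrt_stat <- c(featA = 0.4, featB = 6.1, featC = 15.3, featD = 2.2)

# Upper-tail chi-square probability with 2 degrees of freedom
pv_2part <- pchisq(lrt_stat, df = 2, lower.tail = FALSE)

# Multiple-testing adjustment across features.
# Assumption: Benjamini-Hochberg is a base-R substitute for qvalue, not the same method.
fdr_2part <- p.adjust(pv_2part, method = "BH")

round(cbind(lrt_stat, pv_2part, fdr_2part), 4)
```

For 2 degrees of freedom the upper-tail probability has the closed form exp(-stat / 2), which gives a quick sanity check on the computed p-values.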
A list containing the following components:
gamma |
a vector of point estimators for gamma_g in the logistic model (binary part) |
beta |
a vector of point estimators for beta_g in the semi-parametric model (non-zero part) |
pv_gamma |
a vector of one-part p-values for gamma_g |
pv_beta |
a vector of one-part p-values for beta_g |
qv_gamma |
a vector of one-part q-values for gamma_g |
qv_beta |
a vector of one-part q-values for beta_g |
pv_2part |
a vector of two-part p-values for overall test |
qv_2part |
a vector of two-part q-values for overall test |
feat.names |
a vector of feature names |
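One possible way to work with the returned list is to collect its components into a data frame and filter by the two-part q-value. The sketch below uses a mock list with invented values and the component names listed above; it assumes each component holds one entry per feature, which may not hold when several covariates are modelled.

```r
# Mock object with the same component names as the returned list (values are invented)
results <- list(
  gamma = c(0.2, -1.1, 0.9),
  beta = c(0.05, 0.8, -0.6),
  pv_2part = c(0.60, 0.001, 0.02),
  qv_2part = c(0.60, 0.003, 0.03),
  feat.names = c("feat1", "feat2", "feat3")
)

# One row per feature (assumes every component has one entry per feature)
tab <- data.frame(
  feature = results$feat.names,
  gamma = results$gamma,
  beta = results$beta,
  pv_2part = results$pv_2part,
  qv_2part = results$qv_2part
)

# Order by two-part q-value and keep features below a chosen FDR threshold
tab <- tab[order(tab$qv_2part), ]
sig <- tab[tab$qv_2part < 0.05, ]
sig
```

The 0.05 threshold is an arbitrary choice for illustration; the one-part columns (pv_gamma, pv_beta, qv_gamma, qv_beta) could be added to the table in the same way.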
Yuntong Li <yuntong.li@uky.edu>, Chi Wang <chi.wang@uky.edu>, Li Chen <lichenuky@uky.edu>
##--------- load data ------------
library(SDA)
data(exampleSumExp)
results <- SDA(exampleSumExp)
##------ two part q-values -------
results$qv_2part