test_diff | R Documentation |
The 'test_diff()' function is used to test coefficients of a 'proDAFit'
object. It provides a Wald test to test individual
coefficients and a likelihood ratio F-test to compare the
original model with a reduced model. The result_names
method provides a quick overview of which coefficients are
available for testing.
test_diff(
fit,
contrast,
reduced_model = ~1,
alternative = c("two.sided", "greater", "less"),
pval_adjust_method = "BH",
sort_by = NULL,
decreasing = FALSE,
n_max = Inf,
verbose = FALSE
)
## S4 method for signature 'proDAFit'
result_names(fit)
fit |
an object of class 'proDAFit'. Usually, this is
produced by calling |
contrast |
an expression or a string specifying which
contrast is tested. It can be a single coefficient (to see
the available options use |
reduced_model |
If you don't want to test an individual
coefficient, you can specify a reduced model and compare
it with the original model using an F-test. This is useful
to find out how a set of parameters affects the goodness of
the fit. If neither a |
alternative |
a string that decides how the
hypothesis test is done. This parameter is only relevant for
the Wald-test specified using the 'contrast' argument.
Default: |
pval_adjust_method |
a string that indicates the method
that is used to adjust the p-values for multiple testing.
It must match the options in |
sort_by |
a string that specifies the column that is used
to sort the resulting data.frame. Default: |
decreasing |
a boolean to indicate if the order is reversed.
Default: |
n_max |
the maximum number of rows returned by the method.
Default: |
verbose |
boolean that signals if the method prints informative
messages. Default: |
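The interplay of pval_adjust_method, sort_by, decreasing and n_max can be sketched in base R. This is only an illustration on made-up p-values, not the package's implementation: the data frame, its column names, and the assumption that the adjustment behaves like stats::p.adjust() are all hypothetical.

```r
# Hypothetical raw p-values for five proteins (made-up numbers)
res <- data.frame(
  name = paste0("protein_", 1:5),
  pval = c(0.001, 0.04, 0.03, 0.5, 0.2)
)

# Assumption: the multiple testing adjustment behaves like stats::p.adjust()
res$adj_pval <- p.adjust(res$pval, method = "BH")

# Mimic sort_by = "pval", decreasing = FALSE, n_max = 3
res_sorted <- res[order(res$pval, decreasing = FALSE), ]
top3 <- head(res_sorted, n = 3)
print(top3)

# Method names accepted by p.adjust()
print(p.adjust.methods)
```

Sorting by the raw p-value and truncating afterwards does not change the adjusted p-values, because the adjustment is computed on the full set of tests first.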
To test whether a coefficient is different from zero with a Wald
test, use the contrast argument. To test whether two
models differ with an F-test, use the reduced_model
argument. Depending on the test that is conducted, the function
returns slightly different data.frames.
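The distinction between the two kinds of test can be illustrated with ordinary linear models in base R. This sketch uses lm() on simulated data and does not involve the probabilistic dropout model fitted by the package; all variable names and numbers are made up for illustration.

```r
set.seed(1)

# Made-up data: 3 groups with 4 samples each plus a continuous covariate
d <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 4)),
  age   = rnorm(12, mean = 45, sd = 5)
)
d$y <- 20 + 2 * (d$group == "B") + rnorm(12, sd = 0.5)

full    <- lm(y ~ group + age, data = d)
reduced <- lm(y ~ group, data = d)

# Wald-type test of a single coefficient: estimate divided by its standard error
coefs  <- summary(full)$coefficients
wald_t <- coefs["groupB", "Estimate"] / coefs["groupB", "Std. Error"]
print(wald_t)

# F-test comparing the full model with the reduced model (drops 'age')
f_test <- anova(reduced, full)
print(f_test)
```

When the reduced model drops exactly one coefficient, the F statistic of the model comparison equals the squared t statistic of that coefficient, which is why the two tests agree in that special case.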
The function is designed to follow the principles of the
base R test functions (e.g. t.test and wilcox.test) and the functions designed
for collecting the results of high-throughput testing
(e.g. limma::topTable and DESeq2::results).
The 'result_names()' function returns a character vector.
The 'test_diff()' function returns a data.frame
with one row per protein, containing
the key parameters of the statistical test. The content of the
'data.frame' depends on the kind of test (Wald or F-test).
The Wald test, which can be considered equivalent to a t-test, returns a 'data.frame' with the following columns:
name |
the name of the protein, extracted from the rowname of the input matrix
pval |
the p-value of the statistical test
adj_pval |
the multiple testing adjusted p-value
diff |
the difference that particular coefficient makes. In differential expression analysis this value is also called log fold change, which is equivalent to the difference on the log scale.
t_statistic |
the diff divided by the standard error se
se |
the standard error associated with the diff
df |
the degrees of freedom, which describe the amount
of available information for estimating the se. They
are the sum of the number of samples the protein was observed
in, the amount of information contained in the missing values,
and the degrees of freedom of the variance prior.
avg_abundance |
the estimate of the average abundance of the protein across all samples.
n_approx |
the approximated information available for estimating the protein features, expressed as a multiple of the information contained in one observed value.
n_obs |
the number of samples a protein was observed in
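How a difference, its standard error and the degrees of freedom combine into a p-value under the three settings of the alternative argument can be sketched as follows. The helper wald_pval() is hypothetical and assumes a plain t-distribution reference; whether the package computes its p-values in exactly this way is an assumption.

```r
# Hypothetical helper (not part of the package): p-value of a t-like Wald test
wald_pval <- function(diff, se, df,
                      alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  # The test statistic is the difference divided by its standard error
  t_statistic <- diff / se
  switch(alternative,
    two.sided = 2 * pt(abs(t_statistic), df = df, lower.tail = FALSE),
    greater   = pt(t_statistic, df = df, lower.tail = FALSE),
    less      = pt(t_statistic, df = df, lower.tail = TRUE)
  )
}

# Made-up values for a single protein
p_two     <- wald_pval(diff = 1.2, se = 0.4, df = 8)
p_greater <- wald_pval(diff = 1.2, se = 0.4, df = 8, alternative = "greater")
p_less    <- wald_pval(diff = 1.2, se = 0.4, df = 8, alternative = "less")
print(c(two.sided = p_two, greater = p_greater, less = p_less))
```

For a positive difference, the two-sided p-value is twice the "greater" p-value, and the "greater" and "less" p-values add up to one.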
The F-test returns a 'data.frame' with the following columns:
name |
the name of the protein, extracted from the rowname of the input matrix
pval |
the p-value of the statistical test
adj_pval |
the multiple testing adjusted p-value
f_statistic |
the ratio of the difference of the normalized deviances from the original model and the reduced model, divided by the standard deviation.
df1 |
the difference of the number of coefficients in the original model and the number of coefficients in the reduced model
df2 |
the degrees of freedom, which describe the amount
of available information for estimating the se. They
are the sum of the number of samples the protein was observed
in, the amount of information contained in the missing values,
and the degrees of freedom of the variance prior.
avg_abundance |
the estimate of the average abundance of the protein across all samples.
n_approx |
the information available for estimating the protein features, expressed as a multiple of the information contained in one observed value.
n_obs |
the number of samples a protein was observed in
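As a rough illustration of how an F statistic and its two degrees of freedom translate into a p-value, the upper tail of the F distribution can be evaluated in base R. The helper f_pval() and its parameter names are hypothetical, and the numbers are made up; this is a sketch of the general technique rather than the package's code.

```r
# Hypothetical helper (not part of the package): upper-tail F-test p-value
f_pval <- function(f_statistic, df1, df2) {
  pf(f_statistic, df1 = df1, df2 = df2, lower.tail = FALSE)
}

# Made-up example: two coefficients dropped, ten residual degrees of freedom
p_small_f <- f_pval(1.5, df1 = 2, df2 = 10)
p_large_f <- f_pval(4.5, df1 = 2, df2 = 10)
print(c(p_small_f, p_large_f))
```

A larger F statistic yields a smaller p-value, and with a single dropped coefficient the result matches a two-sided t-test on the square root of the statistic.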
The contrast argument is inspired by limma::makeContrasts.
# "t-test"
syn_data <- generate_synthetic_data(n_proteins = 10)
fit <- proDA(syn_data$Y, design = syn_data$groups)
result_names(fit)
test_diff(fit, Condition_1 - Condition_2)
suppressPackageStartupMessages(library(SummarizedExperiment))
se <- generate_synthetic_data(n_proteins = 10,
n_conditions = 3,
return_summarized_experiment = TRUE)
colData(se)$age <- rnorm(9, mean=45, sd=5)
colData(se)
fit <- proDA(se, design = ~ group + age)
result_names(fit)
test_diff(fit, "groupCondition_2",
n_max = 3, sort_by = "pval")
# F-test
test_diff(fit, reduced_model = ~ group)