View source: R/empiricalBayesLM.R
modifiedBisquareWeights | R Documentation |
Calculation of bisquare weights and the intermediate weight factors similar to those used in the calculation of biweight midcovariance and midcorrelation. The weights are designed such that outliers get smaller weights; the weights become zero for data points more than 9 median absolute deviations from the median.
modifiedBisquareWeights(
  x,
  removedCovariates = NULL,
  pearsonFallback = TRUE,
  maxPOutliers = 0.05,
  outlierReferenceWeight = 0.1,
  groupsForMinWeightRestriction = NULL,
  minWeightInGroups = 0,
  maxPropUnderMinWeight = 1,
  defaultWeight = 1,
  getFactors = FALSE)
x |
A matrix of numeric observations with variables (features) in columns and observations (samples) in rows. Weights will be calculated separately for each column. |
removedCovariates |
Optional matrix or data frame of variables that are to be regressed out of each column
of |
pearsonFallback |
Logical: for columns of |
maxPOutliers |
Optional numeric scalar between 0 and 1. Specifies the maximum proportion of outliers in each column,
i.e., data with weights equal to
|
outlierReferenceWeight |
A number between 0 and 1 specifying what is to be considered an outlier when calculating the proportion of outliers. |
groupsForMinWeightRestriction |
An optional vector with length equal to the number of samples (rows) in |
minWeightInGroups |
A threshold weight, see |
maxPropUnderMinWeight |
A proportion (number between 0 and 1). See |
defaultWeight |
Value used for weights that would be undefined or not finite, for example, when a
column in |
getFactors |
Logical: should the intermediate weight factors be returned as well? |
Weights are calculated independently for each column of x. Denoting a column of x as y, the weights are calculated as (1-u^2)^2, where u is defined as min(1, |y-m|/(9 MMAD)). Here m is the median of the column y and MMAD is the modified median absolute deviation. We call the expression |y-m|/(9 MMAD) the weight factors. Note that outliers are observations with high (>1) weight factors but low (zero) weights.
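The formula above can be illustrated with a minimal sketch for a single column. This is not the package's implementation: the helper name bisquareSketch is hypothetical, and it substitutes the plain unscaled MAD for MMAD, which matches the description only when no modification of the MAD is needed.

```r
# Hypothetical helper: bisquare weights for one numeric vector y,
# using the plain (unscaled) MAD in place of MMAD.
bisquareSketch <- function(y) {
  m <- median(y, na.rm = TRUE)
  # constant = 1 gives the unscaled median absolute deviation
  madY <- mad(y, center = m, constant = 1, na.rm = TRUE)
  factors <- abs(y - m) / (9 * madY)  # weight factors; may exceed 1
  u <- pmin(1, factors)               # cap at 1 so far outliers get weight 0
  weights <- (1 - u^2)^2
  list(weights = weights, factors = factors)
}

y <- c(1, 2, 3, 4, 5, 100)
res <- bisquareSketch(y)
print(round(res$weights, 3))
print(round(res$factors, 3))
```

In this toy vector the value 100 lies more than 9 MADs from the median, so its weight factor exceeds 1 and its weight is zero, while the remaining points keep weights close to 1.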
The calculation of MMAD starts with calculating the (unscaled) median absolute deviation of the column y. If the median absolute deviation is zero and pearsonFallback is TRUE, it is replaced by the standard deviation (multiplied by qnorm(0.75) to make it asymptotically consistent with MAD). The following two conditions are then checked: (1) the proportion of weights below outlierReferenceWeight is at most maxPOutliers, and (2) if groupsForMinWeightRestriction has non-zero length, then for each individual level in groupsForMinWeightRestriction, the proportion of samples with weights less than minWeightInGroups is at most maxPropUnderMinWeight. (If groupsForMinWeightRestriction is zero-length, the second condition is considered trivially satisfied.) If both conditions are met, MMAD equals the median absolute deviation, MAD. If either condition is not met, MMAD equals the lowest number for which both conditions are met.
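To make the search for MMAD concrete, here is a rough sketch that handles condition (1) only. It is an illustration under stated assumptions, not the function's actual code: the helper mmadSketch is hypothetical, it ignores groupsForMinWeightRestriction, removedCovariates and pearsonFallback, and it assumes the MAD is non-zero. It relies on the fact that a weight (1-u^2)^2 falls below a reference weight r exactly when the weight factor exceeds sqrt(1 - sqrt(r)).

```r
# Hypothetical helper: smallest MMAD >= MAD such that the proportion of
# weights below outlierReferenceWeight is at most maxPOutliers.
# Condition (2) on groups is not handled; MAD is assumed to be non-zero.
mmadSketch <- function(y, maxPOutliers = 0.05, outlierReferenceWeight = 0.1) {
  d <- abs(y - median(y))
  madY <- median(d)  # unscaled MAD
  # weight < outlierReferenceWeight  <=>  weight factor > cRef
  cRef <- sqrt(1 - sqrt(outlierReferenceWeight))
  # With MMAD = MAD, points farther than cut0 from the median are outliers
  cut0 <- 9 * cRef * madY
  if (mean(d > cut0) <= maxPOutliers) return(madY)
  # Otherwise move the cutoff outward through the observed distances
  # until few enough points lie beyond it
  larger <- sort(d[d > cut0])
  ok <- vapply(larger, function(t) mean(d > t) <= maxPOutliers, logical(1))
  larger[which(ok)[1]] / (9 * cRef)
}

y <- c(1:22, 200, 300, 400)
madPlain <- median(abs(y - median(y)))
mmad <- mmadSketch(y)
u <- pmin(1, abs(y - median(y)) / (9 * mmad))
w <- (1 - u^2)^2
print(c(MAD = madPlain, MMAD = mmad))
print(round(w[23:25], 3))
```

With the plain MAD, three of the 25 points would fall below the reference weight, which exceeds the allowed proportion of 0.05. The sketch therefore inflates the MAD until only the most extreme point remains below the reference weight; the point that sits exactly at the boundary receives a weight equal to the reference weight, up to floating-point rounding.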
When the input getFactors is TRUE, a list with two components:
weights |
A matrix of the same dimensions and |
factors |
A matrix of the same form as |
When the input getFactors is FALSE, the function returns the matrix of weights.
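A short usage sketch of the two return forms follows. It assumes the WGCNA package, which provides this function, is installed, and it skips the call otherwise; the simulated matrix and the planted outlier are purely illustrative.

```r
# Illustrative data: 20 samples (rows), 3 variables (columns)
set.seed(1)
x <- matrix(rnorm(60), nrow = 20, ncol = 3)
x[1, 1] <- 50  # plant an obvious outlier in column 1

res <- NULL
w <- NULL
if (requireNamespace("WGCNA", quietly = TRUE)) {
  # getFactors = TRUE: list with components 'weights' and 'factors'
  res <- WGCNA::modifiedBisquareWeights(x, getFactors = TRUE)
  str(res)
  # getFactors = FALSE (the default): just the matrix of weights
  w <- WGCNA::modifiedBisquareWeights(x)
  str(w)
}
```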
Peter Langfelder
A full description of the weight calculation can be found, e.g., in the Methods section of
Wang N, Langfelder P, et al (2022) Mapping brain gene coexpression in daytime transcriptomes unveils diurnal molecular networks and deciphers perturbation gene signatures. Neuron. 2022 Oct 19;110(20):3318-3338.e9. PMID: 36265442; PMCID: PMC9665885. doi:10.1016/j.neuron.2022.09.028
Other references include, in reverse chronological order,
Peter Langfelder, Steve Horvath (2012) Fast R Functions for Robust Correlations and Hierarchical Clustering. Journal of Statistical Software, 46(11), 1-17. https://www.jstatsoft.org/v46/i11/
"Introduction to Robust Estimation and Hypothesis Testing", Rand Wilcox, Academic Press, 1997.
"Data Analysis and Regression: A Second Course in Statistics", Mosteller and Tukey, Addison-Wesley, 1977, pp. 203-209.
bicovWeights for a simpler, less flexible calculation.