goodSamplesMS | R Documentation |
This function checks data for missing entries and returns a list of samples that pass two criteria on maximum number of missing values: the fraction of missing values must be below a given threshold and the total number of missing genes must be below a given threshold.
goodSamplesMS(multiExpr,
multiWeights = NULL,
useSamples = NULL,
useGenes = NULL,
minFraction = 1/2,
minNSamples = ..minNSamples,
minNGenes = ..minNGenes,
minRelativeWeight = 0.1,
verbose = 1, indent = 0)
multiExpr |
expression data in the multi-set format (see |
multiWeights |
optional observation weights in the same format (and dimensions) as |
useSamples |
optional specifications of which samples to use for the check. Should be a logical
vector; samples whose entries are |
useGenes |
optional specifications of genes for which to perform the check. Should be a logical
vector; genes whose entries are |
minFraction |
minimum fraction of non-missing genes for a sample to be considered good. |
minNSamples |
minimum number of good samples for the data set to be considered fit for analysis. If the actual number of good samples falls below this threshold, an error will be issued. |
minNGenes |
minimum number of non-missing genes for a sample to be considered good. |
minRelativeWeight |
observations whose relative weight is below this threshold will be considered missing. Here relative weight is weight divided by the maximum weight in the column (gene). |
verbose |
integer level of verbosity. Zero means silent, higher values make the output progressively more and more verbose. |
indent |
indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces. |
The constants ..minNSamples and ..minNGenes are both set to the value 4.
If weights are given, entries whose relative weight (i.e., weight divided by the maximum weight in the column, or gene) is below minRelativeWeight will be considered missing.
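The relative-weight rule can be sketched in base R as follows. This is only an illustration of the description above, not the code WGCNA runs internally; the toy matrices and the variable names (expr, weights, relWeights, isMissing) are assumptions made for the example.

```r
# Toy data: 4 samples (rows) x 3 genes (columns); all values are made up.
expr <- matrix(c(1.2, 0.8, NA,  2.1,
                 0.5, 0.7, 0.9, 1.1,
                 3.0, 2.5, 2.8, 2.9), nrow = 4)
weights <- matrix(c(1.0, 0.05, 0.8, 0.5,
                    2.0, 1.0,  0.1, 2.0,
                    0.3, 0.3,  0.3, 0.3), nrow = 4)
minRelativeWeight <- 0.1

# Relative weight = weight divided by the maximum weight in the same column (gene).
colMax <- apply(weights, 2, max, na.rm = TRUE)
relWeights <- sweep(weights, 2, colMax, "/")

# An entry counts as missing if it is NA or its relative weight is below the threshold.
isMissing <- is.na(expr) | relWeights < minRelativeWeight

# Number of entries treated as missing in each sample.
rowSums(isMissing)
```

Because the maximum is taken per column, a gene whose weights are uniformly small (the third column here) loses no entries: only weights that are small relative to the other samples for the same gene are masked.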
For most data sets, the criterion on the fraction of missing values will be much more stringent than the criterion on the absolute number of non-missing genes.
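To see why the fraction criterion usually dominates, the two per-sample checks can be written out in base R. The helper passesCriteria below is hypothetical and not part of WGCNA; it assumes a sample passes when its share of non-missing genes is at least minFraction and its count of non-missing genes is at least minNGenes, and the package's exact comparison operators may differ.

```r
# Hypothetical helper (not a WGCNA function): flags samples (rows) that pass
# both criteria, given a logical matrix marking missing entries.
passesCriteria <- function(isMissing, minFraction = 1/2, minNGenes = 4) {
  nGenes <- ncol(isMissing)
  nPresent <- nGenes - rowSums(isMissing)
  # Assumed comparisons: ">=" on both the fraction and the absolute count.
  (nPresent / nGenes >= minFraction) & (nPresent >= minNGenes)
}

# 3 samples x 10 genes: sample 1 is complete, sample 2 lacks 4 genes,
# sample 3 lacks 7 genes.
isMissing <- matrix(FALSE, nrow = 3, ncol = 10)
isMissing[2, 1:4] <- TRUE
isMissing[3, 1:7] <- TRUE

passesCriteria(isMissing)
```

With the defaults, a set of 10 genes already needs 5 non-missing genes per sample to satisfy the fraction criterion, against 4 for the absolute one. For a set with thousands of genes the gap is far larger, so the absolute threshold only matters for very small gene sets.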
A list with one component per input set. Each component is a logical vector with one entry per sample in the corresponding set, indicating whether the sample passed the missing value criteria.
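A possible usage pattern is sketched below. The simulated data, the helper makeSet and the object names (set1, set2, good, cleaned) are assumptions made for illustration; the multi-set object follows the WGCNA convention of a list with one element per set, each holding a data component with samples in rows and genes in columns. The call itself is guarded so the snippet also runs where WGCNA is not installed.

```r
set.seed(1)

# Hypothetical helper: simulate one expression set (samples in rows, genes in columns).
makeSet <- function(nSamples, nGenes = 20) {
  matrix(rnorm(nSamples * nGenes), nrow = nSamples,
         dimnames = list(paste0("S", seq_len(nSamples)),
                         paste0("G", seq_len(nGenes))))
}

set1 <- makeSet(6)
set2 <- makeSet(8)
set1[2, 1:15] <- NA  # sample 2 of set 1 is missing 15 of its 20 genes

# Multi-set format: one list element per set, each with a `data` component.
multiExpr <- list(list(data = set1), list(data = set2))

if (requireNamespace("WGCNA", quietly = TRUE)) {
  good <- WGCNA::goodSamplesMS(multiExpr, verbose = 0)

  # Each component of `good` is a logical vector over the samples of one set;
  # use it to keep only the samples that passed.
  cleaned <- lapply(seq_along(multiExpr), function(i) {
    list(data = multiExpr[[i]]$data[good[[i]], , drop = FALSE])
  })
}
```

Subsetting with drop = FALSE keeps each data component a matrix even if only one sample survives in a set.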
Peter Langfelder and Steve Horvath
goodGenes, goodSamples, goodSamplesGenes for cleaning individual sets separately;
goodGenesMS, goodSamplesGenesMS for additional cleaning of multiple data sets together.