setSamFilter | R Documentation |
Search for samples that do not meet the specified criteria and label them as "invalid".
setSamFilter(
object,
id = NA_character_,
missing = 1,
het = c(0, 1),
mac = 0,
maf = 0,
ad_ref = c(0, Inf),
ad_alt = c(0, Inf),
dp = c(0, Inf),
mean_ref = c(0, Inf),
mean_alt = c(0, Inf),
sd_ref = Inf,
sd_alt = Inf,
...
)
## S4 method for signature 'GbsrGenotypeData'
setSamFilter(
object,
id,
missing,
het,
mac,
maf,
ad_ref,
ad_alt,
dp,
mean_ref,
mean_alt,
sd_ref,
sd_alt
)
object |
A GbsrGenotypeData object. |
id |
A vector of strings matching the sample IDs, which can
be retrieved by |
missing |
A numeric value [0-1] to specify the maximum missing genotype call rate per sample. |
het |
A vector of two numeric values [0-1] to specify the minimum and maximum heterozygous genotype call rate per sample. |
mac |
An integer value to specify the minimum minor allele count per sample. |
maf |
A numeric value to specify the minimum minor allele frequency per sample. |
ad_ref |
A numeric vector of length two specifying the lower and upper limits of reference read counts per sample. |
ad_alt |
A numeric vector of length two specifying the lower and upper limits of alternative read counts per sample. |
dp |
A numeric vector of length two specifying the lower and upper limits of total read counts per sample. |
mean_ref |
A numeric vector of length two specifying the lower and upper limits of the mean of reference read counts per sample. |
mean_alt |
A numeric vector of length two specifying the lower and upper limits of the mean of alternative read counts per sample. |
sd_ref |
A numeric value specifying the upper limit of the standard deviation of reference read counts per sample. |
sd_alt |
A numeric value specifying the upper limit of the standard deviation of alternative read counts per sample. |
... |
Unused. |
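As a rough illustration of what the missing and het thresholds mean, the following base R sketch computes per-sample rates on a toy genotype matrix. It does not call GBScleanR. The matrix geno, the 0/1/2 genotype coding, the inclusive boundaries, and the choice to compute the heterozygous rate over non-missing calls only are assumptions made for this illustration and may differ from what setSamFilter() does internally.

```r
# Toy genotype matrix: rows are samples, columns are markers.
# Coding (assumed for this sketch): 0 = reference homozygote,
# 1 = heterozygote, 2 = alternative homozygote, NA = missing call.
geno <- matrix(c(0, 1, 2, NA,
                 1, 1, NA, NA,
                 0, 0, 2, 2),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("s1", "s2", "s3"), NULL))

# Per-sample missing rate: fraction of markers without a genotype call.
missing_rate <- rowMeans(is.na(geno))

# Per-sample heterozygous rate among non-missing calls (an assumption).
het_rate <- rowSums(geno == 1, na.rm = TRUE) / rowSums(!is.na(geno))

# Hypothetical thresholds, analogous to missing = 0.3 and het = c(0, 0.5).
max_missing <- 0.3
het_range <- c(0, 0.5)

# A sample is kept only if it passes every check (boundaries inclusive here).
valid <- missing_rate <= max_missing &
  het_rate >= het_range[1] &
  het_rate <= het_range[2]

print(data.frame(missing_rate, het_rate, valid))
```

In this toy data, s2 fails both checks (half of its calls are missing and all remaining calls are heterozygous), while s1 and s3 pass.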
For mean_ref, mean_alt, sd_ref, and sd_alt, this function calculates the mean
and standard deviation of the read counts obtained at the SNP markers of each
sample. If the mean read count of a sample is smaller than the specified lower
limit or larger than the upper limit, this function labels the sample as "invalid".
In the case of sd_ref and sd_alt, the standard deviation of the read counts
of each sample is checked, and samples with a standard deviation larger than
the specified upper limit are labeled as "invalid".
To check valid and invalid samples, run validSam().
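The mean and standard deviation checks described above can be sketched in base R on a toy read count matrix. This is only an illustration of the idea, not the package's implementation: the names ref_counts, mean_ref_limits, and sd_ref_max are hypothetical, and whether setSamFilter() includes zero-count markers in the mean or treats the limits as inclusive is assumed here rather than taken from the documentation.

```r
# Toy reference read counts: rows are samples, columns are SNP markers.
ref_counts <- matrix(c(10, 12, 11,  9,
                        1,  0,  2,  1,
                        5, 40,  0, 35),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("s1", "s2", "s3"), NULL))

# Per-sample mean and standard deviation across all markers
# (zero counts are included, which is an assumption of this sketch).
mean_obs <- rowMeans(ref_counts)
sd_obs <- apply(ref_counts, 1, sd)

# Hypothetical thresholds, analogous to mean_ref = c(2, Inf) and sd_ref = 10.
mean_ref_limits <- c(2, Inf)
sd_ref_max <- 10

# A sample stays valid only if its mean lies within the limits
# and its standard deviation does not exceed the upper limit.
valid <- mean_obs >= mean_ref_limits[1] &
  mean_obs <= mean_ref_limits[2] &
  sd_obs <= sd_ref_max

print(data.frame(mean_obs, sd_obs, valid))
```

Here s2 would be labeled invalid because its mean read count falls below the lower limit, and s3 because its read counts vary too much across markers.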
A GbsrGenotypeData object with filters on samples.
# Load data in the GDS file and instantiate a [GbsrGenotypeData] object.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)
# Summarize the information needed for filtering.
gds <- countGenotype(gds)
gds <- countRead(gds)
gds <- setSamFilter(gds,
                    id = getSamID(gds)[1:10],
                    missing = 0.2,
                    dp = c(5, Inf))
# Close the connection to the GDS file.
closeGDS(gds)