View source: R/filter_individuals.R
filter_individuals | R Documentation |
Remove individuals with bad QC based on:
missingness (genotyping rate)
heterozygosity
coverage (total, median, iqr)
Filter targets: Individuals
Statistics: Missingness, heterozygosity and coverage
Used internally in radiator and might be of interest for users who want to blacklist individuals.
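The per-individual statistics listed above can be illustrated on toy data. The sketch below uses base R only. The matrices `geno` and `depth`, their coding (alternate-allele counts where 1 is a heterozygote, and read depth per genotype) and the column names are assumptions made for illustration; they are not radiator's internal representation or code.

```r
# Toy data (hypothetical): rows = individuals, columns = markers.
# geno: alternate-allele counts (0, 1, 2); 1 = heterozygote; NA = missing.
geno <- matrix(
  c( 0,  1, NA,  2,
    NA, NA, NA,  1,
     0,  0,  1,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ind1", "ind2", "ind3"), paste0("m", 1:4))
)

# depth: read depth per genotype (0 where the genotype is missing).
depth <- matrix(
  c(10, 12,  0, 20,
     0,  0,  0,  4,
    30, 28, 32, 30),
  nrow = 3, byrow = TRUE,
  dimnames = dimnames(geno)
)

# One row per individual, one column per statistic.
id.stats <- data.frame(
  INDIVIDUALS     = rownames(geno),
  MISSING_PROP    = rowMeans(is.na(geno)),              # proportion of missing genotypes
  HETEROZYGOSITY  = rowMeans(geno == 1, na.rm = TRUE),  # among non-missing genotypes
  COVERAGE_TOTAL  = rowSums(depth),
  COVERAGE_MEDIAN = apply(depth, 1, median),
  COVERAGE_IQR    = apply(depth, 1, IQR)
)
print(id.stats)
```

In this toy table, ind2 stands out with high missingness and low coverage, which is the kind of individual the filter is meant to blacklist.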
filter_individuals(
data,
interactive.filter = TRUE,
filter.individuals.missing = NULL,
filter.individuals.heterozygosity = NULL,
filter.individuals.coverage.total = NULL,
filter.individuals.coverage.median = NULL,
filter.individuals.coverage.iqr = NULL,
parallel.core = parallel::detectCores() - 1,
verbose = TRUE,
...
)
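The default for parallel.core leaves one core free. Note that `parallel::detectCores()` can return `NA` when the number of cores cannot be determined, and the subtraction yields 0 on a single-core machine, so a script that sets the argument explicitly might guard against both cases. A minimal sketch; the helper name `safe_cores()` is hypothetical and not part of radiator:

```r
# Hypothetical helper: number of cores to use, never less than 1.
safe_cores <- function(detected = parallel::detectCores()) {
  n <- detected - 1L                 # leave one core free, as the default does
  if (is.na(n) || n < 1L) 1L else as.integer(n)
}

safe_cores()  # value depends on the machine
```

The result could then be passed as `parallel.core = safe_cores()`.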
data |
(2 options) A Genomic Data Structure (GDS) file or object generated by radiator. How to get GDS?
Look into:
|
interactive.filter |
(optional, logical) Do you want the filtering session to
be interactive? Figures of distribution are shown before asking for filtering
thresholds.
Default: interactive.filter = TRUE. |
filter.individuals.missing |
(optional, double) A proportion above which the individuals are
blacklisted and removed from the dataset.
Default: filter.individuals.missing = NULL. |
filter.individuals.heterozygosity |
(optional, string of doubles) A proportion below and
above which the individuals are blacklisted and removed from the dataset.
Default: filter.individuals.heterozygosity = NULL. |
filter.individuals.coverage.total |
(optional, string of doubles)
Target the total coverage per sample.
Values below and
above which the individuals are blacklisted and removed from the dataset.
Default: filter.individuals.coverage.total = NULL. |
filter.individuals.coverage.median |
(optional, string of integers)
Target the median coverage per sample.
Integers, below and above which individuals are blacklisted (removed from the dataset).
Default: filter.individuals.coverage.median = NULL. |
filter.individuals.coverage.iqr |
(optional, string of integers)
Target the IQR (Interquartile Range) of coverage per sample.
Integers, below and above which individuals are blacklisted (removed from the dataset).
Default: filter.individuals.coverage.iqr = NULL. |
parallel.core |
(optional) The number of cores used for parallel
execution during import.
Default: parallel.core = parallel::detectCores() - 1. |
verbose |
(optional, logical) When |
... |
(optional) Advanced mode that allows passing further arguments for fine-tuning the function. Also used for legacy arguments (see details or special section). |
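To make the threshold arguments concrete: a single value acts as an upper bound (missingness), while a pair of values acts as lower and upper bounds (heterozygosity, coverage). The sketch below applies both kinds of threshold to a hypothetical table of per-individual statistics. The helper `outside_range()`, the column names and the data are illustrative assumptions, not radiator's implementation.

```r
# Hypothetical per-individual statistics.
id.stats <- data.frame(
  INDIVIDUALS    = c("ind1", "ind2", "ind3", "ind4"),
  MISSING_PROP   = c(0.10, 0.75, 0.05, 0.20),
  HETEROZYGOSITY = c(0.025, 0.021, 0.045, 0.028),
  COVERAGE_TOTAL = c(1200000, 950000, 3000000, 600000),
  stringsAsFactors = FALSE
)

# TRUE where x falls below the lower bound or above the upper bound.
outside_range <- function(x, range) {
  stopifnot(length(range) == 2, range[1] <= range[2])
  x < range[1] | x > range[2]
}

# Single value: upper bound on the proportion of missing genotypes.
fail.missing <- id.stats$MISSING_PROP > 0.5
# Pairs of values: lower and upper bounds.
fail.het <- outside_range(id.stats$HETEROZYGOSITY, c(0.02, 0.03))
fail.cov <- outside_range(id.stats$COVERAGE_TOTAL, c(900000, 5000000))

# An individual is blacklisted if it fails any of the filters.
blacklist.id <- id.stats$INDIVIDUALS[fail.missing | fail.het | fail.cov]
filtered <- id.stats[!id.stats$INDIVIDUALS %in% blacklist.id, ]
```

Here ind2 fails on missingness, ind3 on heterozygosity and ind4 on total coverage, so only ind1 is kept.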
A list with the filtered input and blacklist of individuals.
Thierry Gosselin thierrygosselin@icloud.com
filter_rad, tidy_genomic_data, read_vcf, tidy_vcf.
## Not run:
require(SeqArray)
# blacklisting outlier individuals:
id.qc <- radiator::filter_individuals(
    data = "my.radiator.gds.rad",
    filter.individuals.missing = "outliers",
    filter.individuals.heterozygosity = "outliers",
    filter.individuals.coverage.total = "outliers")

# using values to blacklist individuals:
id.qc <- radiator::filter_individuals(
    data = "my.radiator.gds.rad",
    filter.individuals.missing = 0.5,
    filter.individuals.heterozygosity = c(0.02, 0.03),
    filter.individuals.coverage.total = c(900000, 5000000))
## End(Not run)