snpMatrixScour: SNP filtering based on Minor Allele Frequency, Hardy-Weinberg...

View source: R/Scour.R

Description

snpMatrixScour filters SNPs out of a SnpMatrix object based on three criteria: Minor Allele Frequency (MAF), deviation from Hardy-Weinberg Equilibrium (HWE), and SNP call rate.

Usage

snpMatrixScour(snpX, genes.length = NULL, genes.info = NULL,
  min.maf = 0.01, min.eq = 0.01, call.rate = 0.9)

Arguments

snpX

SnpMatrix object from which SNPs are to be removed

genes.length

(optional) numeric vector giving the length (in columns/SNPs) of each gene. Each declared gene is considered contiguous with the one before and after it. genes.length can be named (names will be kept).

genes.info

(optional) a data.frame with four columns named Genenames, SNPnames, Position and Chromosome. Each row describes a SNP and missing values are not allowed.

min.maf

a single value between 0 and 0.5 that gives the threshold for the MAF (Minor Allele Frequency) of a SNP. SNPs with MAF < min.maf are removed. Default is 0.01.

min.eq

a single value between 0 and 1 that gives the minimum acceptable p-value for the χ² test of deviation from HWE. SNPs with a p-value below min.eq are removed. Default is 0.01.

call.rate

a single value between 0 and 1 that gives the minimum acceptable call rate for a SNP. Default is 0.9. Low SNP call rates can make imputation harder (residual missing values).

Details

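This function removes SNPs that do not meet all of the following criteria:

MAF of at least min.maf

p-value of the HWE test not below min.eq

call rate not below call.rate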

If genes.length or genes.info is provided by the user, an updated version is returned by snpMatrixScour. The returned objects can be used directly as inputs of the GGI function.
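The three criteria can be illustrated with a minimal base R sketch. This is not the package's implementation: snp_qc_sketch is a hypothetical helper, the genotype matrix is assumed to be coded 0/1/2 (copies of one allele, NA for a missing call), and the HWE test shown is a plain one-degree-of-freedom χ² test that may differ from the statistic snpMatrixScour actually uses.

```r
# Hypothetical helper, NOT part of GeneGeneInteR: per-SNP quality metrics
# for a numeric genotype matrix (rows = individuals, columns = SNPs),
# coded 0/1/2 with NA for missing calls.
snp_qc_sketch <- function(geno, min.maf = 0.01, min.eq = 0.01, call.rate = 0.9) {
  n.called <- colSums(!is.na(geno))              # non-missing calls per SNP
  cr <- n.called / nrow(geno)                    # call rate
  p <- colSums(geno, na.rm = TRUE) / (2 * n.called)  # frequency of the counted allele
  maf <- pmin(p, 1 - p)                          # minor allele frequency

  # Observed genotype counts
  n0 <- colSums(geno == 0, na.rm = TRUE)
  n1 <- colSums(geno == 1, na.rm = TRUE)
  n2 <- colSums(geno == 2, na.rm = TRUE)

  # Expected genotype counts under Hardy-Weinberg Equilibrium
  e0 <- n.called * (1 - p)^2
  e1 <- n.called * 2 * p * (1 - p)
  e2 <- n.called * p^2

  # Chi-square goodness-of-fit test with 1 degree of freedom
  chi2 <- (n0 - e0)^2 / e0 + (n1 - e1)^2 / e1 + (n2 - e2)^2 / e2
  hwe.p <- pchisq(chi2, df = 1, lower.tail = FALSE)

  # A SNP is kept only if it passes all three thresholds
  keep <- cr >= call.rate & maf >= min.maf & hwe.p >= min.eq
  keep[is.na(keep)] <- FALSE                     # undefined metrics count as failures

  list(maf = maf, hwe.p = hwe.p, call.rate = cr, keep = keep)
}

# Toy data: 20 individuals, 4 SNPs
geno <- cbind(
  ok      = c(rep(0, 5), rep(1, 10), rep(2, 5)),   # passes every criterion
  rare    = rep(0, 20),                            # monomorphic, MAF = 0
  hwe     = c(rep(0, 10), rep(2, 10)),             # no heterozygotes at all
  missing = c(rep(NA, 5), rep(1, 10), rep(2, 5))   # call rate 0.75
)

qc <- snp_qc_sketch(geno)
filtered <- geno[, qc$keep, drop = FALSE]
```

With the default thresholds, only the column named ok survives in this toy matrix: the other three columns each fail one of the criteria.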

Value

A list with two objects:

snpX

the SnpMatrix object from which non-conforming SNPs have been removed.

genes.info

the object that contains the updated gene lengths information. Can be a numeric vector (possibly named) or a data frame. If neither genes.length nor genes.info is provided as input to snpMatrixScour, the genes.info object is NULL.
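To make the idea of updated gene lengths concrete, here is a small base R sketch under stated assumptions. It is not code from GeneGeneInteR: the gene names, the keep vector and the choice to drop genes left with no SNP are all hypothetical, chosen only to show how per-gene column counts shrink when SNPs are removed from contiguous gene blocks.

```r
# Hypothetical illustration, NOT GeneGeneInteR code.
# Three contiguous genes covering 9 SNP columns in total.
genes.length <- c(geneA = 3, geneB = 2, geneC = 4)

# Assumed outcome of a filtering step: TRUE = SNP kept, FALSE = SNP removed.
keep <- c(TRUE, FALSE, TRUE,        # geneA: 2 of 3 SNPs kept
          FALSE, FALSE,             # geneB: no SNP kept
          TRUE, TRUE, TRUE, FALSE)  # geneC: 3 of 4 SNPs kept

# Map each SNP column to its gene, relying on the contiguity assumption.
gene.of.snp <- factor(rep(names(genes.length), times = genes.length),
                      levels = names(genes.length))

# Count the surviving SNPs per gene, then drop genes that lost all their SNPs.
kept.per.gene <- vapply(split(keep, gene.of.snp), sum, integer(1))
updated <- kept.per.gene[kept.per.gene > 0]
```

In this sketch, updated holds 2 SNPs for geneA and 3 for geneC, while geneB disappears because none of its SNPs was kept.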

See Also

GGI

Examples

## Not run: 
ped <- system.file("extdata/example.ped", package="GeneGeneInteR")
info <- system.file("extdata/example.info", package="GeneGeneInteR")
posi <- system.file("extdata/example.txt", package="GeneGeneInteR")
data.imported <- importFile(file=ped, snps=info, pos=posi, pos.sep="\t")

## End(Not run)
### Equivalent loading of the imported data
load(system.file("extdata/dataImported.Rdata", package="GeneGeneInteR"))

## In this example, SNPs with a MAF lower than 0.2, a p-value for HWE testing lower than 0.05, or
# a proportion of missing values higher than 0.2 are removed
data.scour1 <- snpMatrixScour(data.imported$snpX, genes.info = data.imported$genes.info,
                               min.maf = 0.2, min.eq=0.05, call.rate = 0.8)
## Two genes have been completely removed from the resulting dataset.

GeneGeneInteR documentation built on Nov. 8, 2020, 6:28 p.m.