preprocess_SNPs: preprocess_SNPs

View source: R/preprocess_SNPs.R

Description

This function takes raw SNP data and the associated phenotype response and returns a SNP dataset and phenotype response variable that can be used in the preselection function.

Usage

preprocess_SNPs(SNPs, Y, MAF = 0.01, number_cores, na.rm)

Arguments

SNPs

SNP data in which each column is a SNP and each entry takes one of the values A, C, T, or G.

Y

The phenotype response of interest. Should be a numeric vector.

MAF

The minor allele frequency threshold below which SNPs are dropped. The default is 0.01, meaning that if the minor allele occurs less than 1 percent of the time in a given SNP, that SNP is dropped from the dataset.

number_cores

The number of cores to parallelize over.

na.rm

If there are NAs in the vector Y, set na.rm = TRUE; the Y values that are NA will be removed, along with the corresponding rows of the SNP matrix.
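To make the effect of the MAF and na.rm arguments concrete, here is a minimal base R sketch of the two filtering steps described above. It is not the package's implementation: the toy data, the helper name `minor_allele_freq`, the threshold of 0.25, the order of the two steps, and the choice to treat a single-allele column as having frequency 0 are all assumptions made for illustration.

```r
# Toy data (hypothetical): 6 individuals, 3 SNPs, one missing phenotype value.
y <- c(1.2, NA, 3.4, 2.2, 5.1, 4.0)
snps <- data.frame(
  snp1 = c("A", "T", "A", "A", "A", "T"),
  snp2 = c("C", "C", "C", "C", "C", "C"),
  snp3 = c("G", "G", "T", "G", "T", "G"),
  stringsAsFactors = FALSE
)

# Step 1 (the na.rm = TRUE behavior): drop NA phenotypes and the matching SNP rows.
keep <- !is.na(y)
y_clean <- y[keep]
snps_clean <- snps[keep, , drop = FALSE]

# Step 2: frequency of the least common allele in one SNP column.
# Assumption: a column with only one allele is given a frequency of 0.
minor_allele_freq <- function(x) {
  counts <- table(x)
  if (length(counts) < 2) return(0)
  min(counts) / sum(counts)
}

maf <- vapply(snps_clean, minor_allele_freq, numeric(1))

# Drop SNPs whose minor allele frequency is below the threshold.
threshold <- 0.25  # deliberately large so the toy data shows a drop
dropped <- which(maf < threshold)               # column indices of dropped SNPs
kept <- snps_clean[, maf >= threshold, drop = FALSE]
```

In this toy data, `snp1` and `snp2` fall below the threshold and only `snp3` is kept.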

Value

SNPs

A new SNP matrix, formatted so that minor alleles are coded as 0 and major alleles are coded as 1. Columns with a minor allele frequency below the specified value are dropped. The function also aggregates over replications, so the SNPs and the vector Y are aggregated according to the replications in the SNP matrix.

Y

The new aggregated response vector Y. If there were no replications, this vector is identical to the one supplied.

SNPs_Dropped

The SNPs that were dropped because their minor allele frequency was below the specified value, given as column index numbers. If no SNPs were dropped, this is the character string "None".
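The coding and aggregation described for the returned values can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: the helper `code_snp` is hypothetical, "replications" is assumed to mean rows with identical SNP profiles, and the mean is assumed as the summary used to aggregate Y.

```r
# Toy allele data (hypothetical); rows 1 and 2 share the same SNP profile.
snps <- data.frame(
  snp1 = c("A", "A", "T", "A"),
  snp2 = c("C", "C", "C", "G"),
  stringsAsFactors = FALSE
)
y <- c(2, 4, 10, 6)

# Code the major (most frequent) allele as 1 and any other allele as 0.
# Ties are broken by which.max(), which picks the first allele in sorted order.
code_snp <- function(x) {
  counts <- table(x)
  major <- names(counts)[which.max(counts)]
  as.integer(x == major)
}
coded <- vapply(snps, code_snp, integer(nrow(snps)))

# Aggregate replications: build one key per row, keep one row per key,
# and average Y within each key (assumption: mean is the summary used).
key <- apply(coded, 1, paste, collapse = "")
coded_agg <- coded[!duplicated(key), , drop = FALSE]
y_agg <- as.numeric(tapply(y, factor(key, levels = unique(key)), mean))
```

Using `factor(key, levels = unique(key))` keeps the aggregated responses in the same order as the retained rows of the coded matrix.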

Examples

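library(GWAS.BAYES)

# Load the example dataset
data("vignette_lm_dat")
Y <- vignette_lm_dat$Phenotype   # phenotype response
SNPs <- vignette_lm_dat[, -1]    # all columns except the first

preprocess_SNPs(SNPs = SNPs, Y = Y, MAF = 0.01, number_cores = 1, na.rm = FALSE)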

willja16/GWAS.BAYES documentation built on Sept. 24, 2020, 12:48 a.m.