calcMAF: Takes a recoded (0,1,2) data.table and returns a two column...

Description

Takes a recoded (0,1,2) data.table and returns a two-column data.table giving, for each SNP, the fraction of missing values and the minor allele frequency. FracMissing is (number of NA values)/(number of lines). MAF is (number of minor alleles + 0.5 times the number of het alleles)/(number of non-NA alleles).
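
The two formulas above can be illustrated with a minimal base-R sketch. This is not the package's implementation of calcMAF: the helper name sketchMAF, the toy matrix, and the choice to count 2 as two copies of one allele and then take the smaller of the two allele frequencies as the MAF are all assumptions made for illustration.

```r
# Hypothetical helper (NOT ionomicsUtils::calcMAF): base-R sketch of the
# FracMissing and MAF formulas, for SNPs in rows and lines in columns.
sketchMAF <- function(geno) {
  geno <- as.matrix(geno)
  nLines   <- ncol(geno)                      # total number of lines
  nMissing <- rowSums(is.na(geno))            # NA calls per SNP
  nCalled  <- nLines - nMissing               # non-NA calls per SNP
  nHet     <- rowSums(geno == 1, na.rm = TRUE)
  nHom2    <- rowSums(geno == 2, na.rm = TRUE)
  # Assumption: frequency of the allele coded by 2, hets counted as 0.5.
  # A SNP with no non-NA calls gives NaN here.
  freq2 <- (nHom2 + 0.5 * nHet) / nCalled
  data.frame(
    FracMissing = nMissing / nLines,
    MAF         = pmin(freq2, 1 - freq2)      # smaller of the two allele frequencies
  )
}

# Toy genotype matrix: 3 SNPs (rows) by 5 lines (columns)
geno <- rbind(
  snp1 = c(0, 0, 1, 2, NA),
  snp2 = c(2, 2, 2, 1, 0),
  snp3 = c(0, 0, 0, 0, 0)
)
res <- sketchMAF(geno)
print(res)
```

For snp1, one of five calls is missing, so FracMissing is 0.2, and MAF is (1 + 0.5 * 1)/4 = 0.375. A monomorphic row such as snp3 gets a MAF of 0.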

Usage

calcMAF(genoTable)

Arguments

genoTable

Data table where SNPs are rows and lines are columns, no metadata columns. Coded as 0,1,2.

Value

Returns a data.table with two columns and rows in the same order as the input. One column contains the MAF and one contains the FracMissing for each row.

Author(s)

Greg Ziegler

Examples

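# alleleTable (per-SNP information) and genoTable (0/1/2 calls) must already exist
alleleTable <- cbind(alleleTable, calcMAF(genoTable))
# flag SNPs to discard: more than 20% missing values or MAF below 0.1
alleleTable[, discard := (FracMissing > 0.2 | MAF < 0.1)]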

# drop the rows of GenotypeData that are TRUE in alleleTable$discard
# memory-efficient approach: copy the kept rows one column at a time,
# freeing source columns as they are copied
# see http://stackoverflow.com/questions/10790204/how-to-delete-a-row-by-reference-in-data-table?lq=1
keepIdxs <- which(alleleTable$discard==FALSE)
cols <- names(GenotypeData)
firstCol <- cols[1]
snp.subset <- data.table(col1 = GenotypeData[[firstCol]][keepIdxs]) 
setnames(snp.subset,"col1",firstCol)
for (col in cols[-1]) {  # cols[-1] is safe even when there is only one column
  snp.subset[, (col) := GenotypeData[[col]][keepIdxs]]
  GenotypeData[, (col) := NULL]  # delete the source column to free memory
}
rm(GenotypeData)
snpInfo <- alleleTable[discard==FALSE]

gziegler/ionomicsUtils documentation built on June 20, 2019, 8:04 p.m.