Takes a recoded (0, 1, 2) data.table and returns a two-column data.table giving, for each SNP, the fraction of missing calls and the minor allele frequency. FracMissing is (number of NA values) / (number of lines). MAF is (number of minor-allele homozygous calls + 0.5 × number of heterozygous calls) / (number of non-NA calls).
calcMAF(genoTable)
genoTable: Data table where SNPs are rows and lines are columns, with no metadata columns. Genotypes are coded as 0, 1, 2.
Returns a data.table with two columns and rows in the same order as the input. One column contains the MAF and one contains the FracMissing for each row.
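The definitions of FracMissing and MAF given in the description can be illustrated with a minimal sketch. This is not the package source: `calcMAFSketch` and the toy table are hypothetical names introduced here, and the sketch assumes that the minor allele is whichever allele is rarer within a row, which the description does not state explicitly.

```r
library(data.table)

# Hypothetical re-implementation for illustration only; not the package source.
calcMAFSketch <- function(genoTable) {
  geno <- as.matrix(genoTable)            # SNPs in rows, lines in columns
  nCalled <- rowSums(!is.na(geno))        # non-missing calls per SNP
  fracMissing <- rowMeans(is.na(geno))    # NA calls / number of lines
  # Frequency of the allele coded 2: (n2 + 0.5 * n1) / nCalled
  freq2 <- rowSums(geno, na.rm = TRUE) / (2 * nCalled)
  # Assumption: the minor allele is the rarer of the two alleles in each row
  maf <- pmin(freq2, 1 - freq2)
  data.table(FracMissing = fracMissing, MAF = maf)
}

# Toy genotype table: 3 SNPs (rows) by 4 lines (columns)
toy <- data.table(
  line1 = c(0, 2, NA),
  line2 = c(0, 2, 1),
  line3 = c(1, 2, 2),
  line4 = c(2, NA, 2)
)
res <- calcMAFSketch(toy)
print(res)
```

Rows with no non-missing calls would yield NaN for MAF in this sketch; how the real function treats that case is not documented here.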
Greg Ziegler
alleleTable <- cbind(alleleTable,calcMAF(genoTable))
#decide who to delete
alleleTable[,discard := (FracMissing>0.2 | MAF<0.1)]
#get rid of the rows in snp that are TRUE in alleleTable$discard
#memory efficient approach to removing row
# see http://stackoverflow.com/questions/10790204/how-to-delete-a-row-by-reference-in-data-table?lq=1
keepIdxs <- which(alleleTable$discard==FALSE)
cols <- names(GenotypeData)
firstCol <- cols[1]
snp.subset <- data.table(col1 = GenotypeData[[firstCol]][keepIdxs])
setnames(snp.subset,"col1",firstCol)
for (col in cols[-1]) { # cols[-1] is safe even if there is only one column
  snp.subset[, (col) := GenotypeData[[col]][keepIdxs]]
  GenotypeData[, (col) := NULL] # delete the copied column to free memory as we go
}
rm(GenotypeData)
snpInfo <- alleleTable[discard==FALSE]