View source: R/filter_unlink.R
filter_unlink | R Documentation |
Parses a data table of genotypes/allele frequencies and returns a list of loci that are "unlinked", in the sense they occur on different contigs.
filter_unlink(
dat,
chromCol = "CHROM",
locusCol = "LOCUS",
posCol = "POS",
method = "random"
)
dat |
Data table: The sequencing read information. Must contain the columns named by chromCol, locusCol, and posCol.
chromCol |
Character: The chromosome (or contig) information column. Default = "CHROM".
locusCol |
Character: The locus information column. Default = "LOCUS".
posCol |
Character: The locus position column. Default = "POS".
method |
Character: How should the filtering be performed? Default = "random". The examples use "random" (sample one SNP at random per contig) and "first" (take the first SNP per contig).
Note: this function is specifically designed for RADseq data, where contigs comprise small (100s of bp) genomic regions assembled from restriction digest fragments. It should not be used on contigs from a whole-genome assembly. Because this filter only guarantees that the retained loci sit on different contigs, it is important to follow up filtering with formal tests of linkage disequilibrium.
Returns a character vector of locus names in dat[[locusCol]] that are not on the same contig in dat[[chromCol]].
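library(genomalicious)  # provides filter_unlink() and data_Genos
library(magrittr)       # provides the %>% pipe
data(data_Genos)
# Number of unique SNPs (LOCUS) per contig (CHROM)
data_Genos[, length(unique(LOCUS)), by=CHROM]$V1 %>% table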
# Randomly sample 1 SNP per locus
snp.rand.1st <- filter_unlink(data_Genos, method='random')
snp.rand.2nd <- filter_unlink(data_Genos, method='random')
# Number of SNPs different between random sets
setdiff(snp.rand.1st, snp.rand.2nd) %>% length
# Sample first SNP per locus
snp.first.1st <- filter_unlink(data_Genos, method='first')
snp.first.2nd <- filter_unlink(data_Genos, method='first')
# Number of SNPs different between the two 'first' sets
setdiff(snp.first.1st, snp.first.2nd) %>% length
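
The idea behind the two methods used above can be illustrated with a minimal data.table sketch. This is not the package's implementation: `pick_one_per_contig()` and the `toy` table are hypothetical names introduced here, and the sketch assumes that "first" means the SNP with the lowest position on each contig.

```r
library(data.table)

# Hypothetical helper (not part of the package): keep one locus per contig.
pick_one_per_contig <- function(dat, chromCol = "CHROM", locusCol = "LOCUS",
                                posCol = "POS", method = "random") {
  method <- match.arg(method, c("random", "first"))
  # Reduce to unique contig/locus/position rows (genotype tables repeat
  # each locus once per sample). This subset is a copy, so dat is untouched.
  loci <- unique(dat[, c(chromCol, locusCol, posCol), with = FALSE])
  setnames(loci, c("chrom", "locus", "pos"))
  setorder(loci, chrom, pos)
  if (method == "first") {
    # Assumption: "first" = lowest position on the contig
    picked <- loci[, .(locus = locus[1]), by = chrom]
  } else {
    # Draw one row index at random within each contig
    picked <- loci[, .(locus = locus[sample.int(.N, 1)]), by = chrom]
  }
  as.character(picked$locus)
}

# Toy data: three contigs carrying two, one, and three SNPs
toy <- data.table(
  CHROM = c("c1", "c1", "c2", "c3", "c3", "c3"),
  POS   = c(10, 45, 7, 3, 20, 61),
  LOCUS = c("c1_10", "c1_45", "c2_7", "c3_3", "c3_20", "c3_61")
)

first_set  <- pick_one_per_contig(toy, method = "first")
random_set <- pick_one_per_contig(toy, method = "random")
print(first_set)   # "c1_10" "c2_7" "c3_3"
print(random_set)  # one locus per contig; varies between runs
```

In this sketch, repeated calls with method = "first" return the same set, whereas repeated calls with method = "random" can differ unless a seed is set with set.seed().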