distance_thin: Filter a vcf file based on distance between SNPs on a given...


distance_thin    R Documentation

Filter a vcf file based on distance between SNPs on a given scaffold

Description

This function takes a vcfR object as input and returns a vcfR object filtered to retain only SNPs that are more than a specified distance apart on each scaffold. On each scaffold, the function automatically retains the first SNP, then keeps the next SNP that is more than the specified distance away, and repeats this until it reaches the end of the scaffold/chromosome. The function scales well with an increasing number of SNPs, but poorly with an increasing number of scaffolds/chromosomes. For this reason, it includes a built-in progress bar for monitoring potentially long-running executions with many scaffolds. This type of filtering is often used to reduce linkage among input SNPs, especially when preparing input for downstream programs such as STRUCTURE, which require unlinked SNPs.
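The thinning procedure described above can be illustrated with a minimal base R sketch. This is not the package's actual implementation: `thin_positions` is a hypothetical helper that works on plain position vectors rather than a vcfR object, and it assumes that each SNP is compared against the most recently retained SNP and that "greater than" is a strict comparison.

```r
# Hypothetical helper (not part of SNPfiltR): greedy distance thinning
# of SNP positions on a single scaffold.
thin_positions <- function(pos, min.distance) {
  pos <- sort(pos)
  if (length(pos) == 0) return(pos)
  keep <- logical(length(pos))
  keep[1] <- TRUE            # the first SNP on the scaffold is always retained
  last_kept <- pos[1]
  for (i in seq_along(pos)[-1]) {
    # Assumption: distance is measured from the last retained SNP,
    # and a SNP is kept only if it is strictly farther than min.distance.
    if (pos[i] - last_kept > min.distance) {
      keep[i] <- TRUE
      last_kept <- pos[i]
    }
  }
  pos[keep]
}

# Toy data (made up for illustration): scaffold names and SNP positions.
chrom <- c("s1", "s1", "s1", "s1", "s2", "s2")
pos   <- c(100, 600, 1200, 2500, 50, 900)

# Thin each scaffold independently.
thinned <- lapply(split(pos, chrom), thin_positions, min.distance = 1000)
```

In this toy example, scaffold `s1` keeps positions 100, 1200 and 2500, while `s2` keeps only position 50. Because each scaffold is handled separately, the number of scaffolds determines how many times the thinning step runs.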

Usage

distance_thin(vcfR, min.distance = NULL)

Arguments

vcfR

a vcfR object

min.distance

a numeric value representing the smallest distance (in base-pairs) allowed between SNPs after distance thinning

Value

A vcfR object identical to the input, except that SNPs separated by less than the specified distance have been removed

Examples

distance_thin(vcfR = SNPfiltR::vcfR.example, min.distance = 1000)

SNPfiltR documentation built on March 31, 2023, 8:57 p.m.