thinSNP | R Documentation
This function groups SNPs by chromosome, sorts them by physical position, and then iteratively selects SNPs such that no two selected SNPs within the same chromosome are closer than a specified minimum distance.
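The procedure described above can be sketched in base R as follows. This is a minimal illustration, not the package's actual source: the function name `thin_snp_sketch` and the `toy` data frame are hypothetical, and the sketch assumes that the first SNP on each chromosome is always kept and that a SNP exactly `min_distance` away from the last kept SNP is retained.

```r
# Hypothetical re-implementation for illustration only; not the package source.
thin_snp_sketch <- function(df, chrom_col_name, pos_col_name, min_distance) {
  # Sort rows by chromosome, then by physical position.
  df <- df[order(df[[chrom_col_name]], df[[pos_col_name]]), , drop = FALSE]
  keep <- logical(nrow(df))

  # Walk each chromosome independently, in position order.
  for (idx in split(seq_len(nrow(df)), df[[chrom_col_name]])) {
    last_pos <- -Inf  # assumption: the first SNP on a chromosome is always kept
    for (i in idx) {
      pos <- df[[pos_col_name]][i]
      # Keep a SNP only if it is at least min_distance past the last kept one.
      if (pos - last_pos >= min_distance) {
        keep[i] <- TRUE
        last_pos <- pos
      }
    }
  }
  df[keep, , drop = FALSE]
}

# Toy data (made up for this sketch).
toy <- data.frame(
  Chrom = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  Pos   = c(10, 50, 120, 5, 30)
)
thinned_toy <- thin_snp_sketch(toy, "Chrom", "Pos", min_distance = 100)
print(thinned_toy)
```

On the toy data, positions 50 (chr1) and 30 (chr2) are dropped because each lies within 100 units of the previously kept SNP on the same chromosome.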
thinSNP(df, chrom_col_name, pos_col_name, min_distance)
df
The input data frame.
chrom_col_name
A string specifying the name of the chromosome column.
pos_col_name
A string specifying the name of the physical position column.
min_distance
A numeric value for the minimum distance between selected SNPs.
The unit of this distance should match the unit of the
A thinned dataframe with the same columns as the input.
# Create sample SNP data
set.seed(123)
n_snps <- 20
snp_data <- data.frame(
  MarkerID = paste0("SNP", 1:n_snps),
  # Chromosome labels are drawn at random, independently of the position blocks below
  Chrom = sample(c("chr1", "chr2"), n_snps, replace = TRUE),
  ChromPosPhysical = c(
    sort(sample(1:1000, 5)),        # 5 positions in 1..1000
    sort(sample(1:1000, 5)) + 500,  # 5 positions in 501..1500
    sort(sample(1:2000, 10))        # 10 positions in 1..2000
  ),
  Allele = sample(c("A/T", "G/C"), n_snps, replace = TRUE)
)
# Ensure it's sorted by Chrom and ChromPosPhysical for clarity in example
snp_data <- snp_data[order(snp_data$Chrom, snp_data$ChromPosPhysical), ]
rownames(snp_data) <- NULL
print("Original SNP data:")
print(snp_data)
# Thin the SNPs, keeping a minimum distance of 100 units (e.g., bp)
thinned_snps <- thinSNP(
  df = snp_data,
  chrom_col_name = "Chrom",
  pos_col_name = "ChromPosPhysical",
  min_distance = 100
)
print("Thinned SNP data (min_distance = 100):")
print(thinned_snps)
# Thin with a larger distance
thinned_snps_large_dist <- thinSNP(
  df = snp_data,
  chrom_col_name = "Chrom",
  pos_col_name = "ChromPosPhysical",
  min_distance = 500
)
print("Thinned SNP data (min_distance = 500):")
print(thinned_snps_large_dist)
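One way to sanity-check a thinned result is to compute the smallest gap between neighbouring positions on each chromosome and compare it with `min_distance`. The helper `min_gap_by_chrom` and the `toy_thinned` data frame below are hypothetical and written only for this sketch; they are not part of the package.

```r
# Hypothetical helper: smallest gap between neighbouring positions per chromosome.
min_gap_by_chrom <- function(df, chrom_col_name, pos_col_name) {
  vapply(
    split(df[[pos_col_name]], df[[chrom_col_name]]),
    # A chromosome with fewer than two SNPs has no gap to violate.
    function(pos) if (length(pos) < 2) Inf else min(diff(sort(pos))),
    numeric(1)
  )
}

# Made-up data standing in for a thinned result.
toy_thinned <- data.frame(
  Chrom = c("chr1", "chr1", "chr2", "chr2", "chr2"),
  Pos   = c(10, 120, 5, 400, 950)
)
gaps <- min_gap_by_chrom(toy_thinned, "Chrom", "Pos")
print(gaps)
print(all(gaps >= 100))  # TRUE when every chromosome respects the minimum distance
```

The same helper could be applied to `thinned_snps` with `"Chrom"` and `"ChromPosPhysical"` to confirm that every per-chromosome gap is at least the `min_distance` that was requested.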