dfilter: Filtering datasets for subpopulations with low sample sizes

Description

Removes from a genotype dataset every subpopulation whose sample size falls below a user-specified minimum.

Usage

dfilter(data, minsample)

Arguments

data

Matrix containing genotype data with individuals as rows and loci as columns. Genotypes should be coded as 0 (homozygous for one allele), 1 (heterozygous), or 2 (homozygous for the other allele). Row names must be subpopulation names and column names should be marker names.

minsample

An integer representing the smallest number of individuals a subpopulation must contain to be included in analysis.

Value

filtered_data

The original dataset minus the subpopulations that fail to meet the sample size threshold.
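
The filtering described above can be sketched in base R. This is a hypothetical re-implementation for illustration only, not the ohtadstats source; the name dfilter_sketch and the assumption that the threshold is inclusive (a subpopulation with exactly minsample individuals is kept) are mine.

```r
# Hypothetical sketch of the filtering logic; not the package's own code.
dfilter_sketch <- function(data, minsample) {
  # Count how many individuals (rows) carry each subpopulation name.
  counts <- table(rownames(data))
  # Subpopulations that meet the threshold (assumed inclusive).
  keep <- names(counts)[counts >= minsample]
  # Keep only rows from those subpopulations; drop = FALSE preserves the matrix.
  data[rownames(data) %in% keep, , drop = FALSE]
}

# Small illustration: 'A' has 6 individuals, 'B' has 4.
geno <- matrix(sample(0:2, 40, replace = TRUE), nrow = 10)
rownames(geno) <- c(rep("A", 6), rep("B", 4))
small <- dfilter_sketch(geno, 5)
dim(small)  # 6 rows, 4 columns: only 'A' remains
```

Using drop = FALSE means the result stays a matrix even if a single row or column survives.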

Examples

# Simulate 100 individuals x 4 loci with genotypes coded 0, 1, or 2
test <- matrix(sample(0:2, 400, replace = TRUE), nrow = 100)
rownames(test) <- c(rep(c('A','B','C'),each=25), rep(c('D','E'), each=5), rep('F', 15))
dim(test)

#The 'D' and 'E' subpopulations have only five members each and should be removed
filtered_test <- dfilter(test,12)

dim(filtered_test)	# New dataset is reduced by 10 rows (five for 'D' and five for 'E')
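
Before filtering, it can help to see which subpopulations would fall under the threshold. The snippet below is a small base R sketch that rebuilds a dataset with the same row names as the example and tallies subpopulation sizes with table(); the variable names sizes and too_small are illustrative and not part of the package.

```r
# Rebuild a dataset with the same subpopulation layout as the example.
test <- matrix(sample(0:2, 400, replace = TRUE), nrow = 100)
rownames(test) <- c(rep(c("A", "B", "C"), each = 25),
                    rep(c("D", "E"), each = 5),
                    rep("F", 15))

# Number of individuals per subpopulation.
sizes <- table(rownames(test))
sizes

# Subpopulations with fewer than 12 individuals.
too_small <- names(sizes)[sizes < 12]
too_small  # "D" "E"
```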

ohtadstats documentation built on Nov. 15, 2019, 5:11 p.m.