region_sampler1: Sample sequence IDs from within a region


View source: R/sampleSelection1.R

Description

Sample sequence IDs from within a region

Usage

region_sampler1(
  md,
  n = 50,
  inclusion_rules = list(),
  dedup = TRUE,
  time_stratify = TRUE
)

Arguments

md

Data frame with GISAID metadata

n

Sample size within the region

inclusion_rules

Required list of rules, each of the form c( <metadata column>, <regular expression> ). A sequence is included in the sample where the regular expression matches the value in that metadata column.
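
As an illustration of the rule format, the sketch below applies a list of rules to a small made-up metadata table using base R's grepl(). The helper matches_any_rule, the seqName column and the toy data are hypothetical, and combining multiple rules with OR is an assumption; the package's internal matching logic may differ.

```r
# Toy metadata; the column names and values are illustrative only.
md <- data.frame(
  seqName = c("s1", "s2", "s3", "s4"),
  CityOrCounty = c("KingCounty", "Washington", "KingCounty", "Seattle"),
  stringsAsFactors = FALSE
)

# Each rule is c(<metadata column>, <regular expression>).
inclusion_rules <- list(c("CityOrCounty", "^KingCounty$"))

# Hypothetical helper: TRUE for rows matched by at least one rule.
# Combining rules with OR is an assumption, not documented behaviour.
matches_any_rule <- function(md, rules) {
  hits <- rep(FALSE, nrow(md))
  for (rule in rules) {
    hits <- hits | grepl(rule[2], md[[rule[1]]])
  }
  hits
}

# Sequence names whose CityOrCounty value is exactly "KingCounty".
included <- md$seqName[matches_any_rule(md, inclusion_rules)]
print(included)
```

Anchoring the pattern with ^ and $ keeps the match exact, so a value such as "KingCountyEast" would not be picked up.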

dedup

If TRUE, identical sequences will not be included

time_stratify

If TRUE, collects a time-stratified sample from within the region instead of a simple random sample
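
To show the general idea of time stratification, the sketch below splits the sampling period into equal-width intervals and draws about the same number of sequences from each non-empty interval. This is one common way to stratify by time, not necessarily what region_sampler1 does; the function time_stratified_sample, the sampleDate and seqName columns, and the choice of five strata are all assumptions made for illustration.

```r
set.seed(1)

# Toy metadata: 50 sequences in the first 10 days, 10 sequences about 50 days later.
md <- data.frame(
  seqName = paste0("s", 1:60),
  sampleDate = as.Date("2020-03-01") + c(rep(0:9, each = 5), 50:59),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sample roughly n sequences, spread evenly over time strata.
time_stratified_sample <- function(md, n, n_strata = 5) {
  # Assign each sequence to one of n_strata equal-width time intervals.
  stratum <- cut(as.numeric(md$sampleDate), breaks = n_strata, labels = FALSE)
  # Row indices per non-empty stratum.
  groups <- split(seq_len(nrow(md)), stratum)
  per_stratum <- ceiling(n / length(groups))
  picked <- unlist(lapply(groups, function(idx) {
    # sample.int avoids the surprise of sample() on a length-one vector.
    idx[sample.int(length(idx), min(per_stratum, length(idx)))]
  }), use.names = FALSE)
  md$seqName[picked]
}

stratified <- time_stratified_sample(md, n = 10)
print(stratified)
```

In this toy data a simple random sample of 10 would tend to be dominated by the densely sampled early period, whereas the stratified draw takes half of its sequences from the sparse later period.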

Examples

## Not run: 
# Sample from King County, Washington. Sequences labelled Washington are excluded
# from the exog sample, since we aren't sure whether they are from King County.
# md, D and s are assumed to be defined already.
ipatt = '^KingCounty$'
epatt = '.*Washington.*'
regiontips = region_sampler1( md, n = 10, inclusion_rules = list( c('CityOrCounty', ipatt) ) )
exogtips = exog_sampler1( md, 20, D, s, exclusion_rules = list( c('CityOrCounty', epatt) ) )

## End(Not run)

emvolz-phylodynamics/sarscov2Rutils documentation built on Nov. 17, 2020, 9:22 a.m.