sample.df: Select Random Sample from Data Frame

Description

Returns a random sample of a given data frame, either simple or stratified, depending on inputs.

Usage

sample.df(df, size, strata = NULL, sample.only = NULL)

Arguments

df

Data frame from which to select the random sample.

size

Size of the sample to select. If less than 1, it is treated as a proportion of the observations. If greater than 1, it is treated as an absolute number of observations.

strata

String, or vector of strings, containing names of fields in df on which to stratify. If NULL, selects a simple random sample.

sample.only

Flag controlling whether to return the entire data frame with the sample marked, or the sample only. If TRUE, returns the sample only. If NULL, returns the entire data frame with the sample marked.
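
The meaning of size and sample.only can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name sketch_sample, the marker column in.sample, the use of round() for proportional sizes, and treating a size of exactly 1 as a count of one are all assumptions made for illustration.

```r
# Hypothetical helper, not part of miscR: mimics the documented meaning of
# `size` and `sample.only` for a simple random sample.
sketch_sample <- function(df, size, sample.only = NULL) {
  n <- nrow(df)
  # Assumption: sizes below 1 are proportions, rounded to a whole count.
  k <- if (size < 1) round(n * size) else size
  picked <- sample(seq_len(n), k)
  if (isTRUE(sample.only)) {
    return(df[picked, , drop = FALSE])
  }
  # Assumption: the sample is flagged in a logical column named `in.sample`.
  df$in.sample <- seq_len(n) %in% picked
  df
}

set.seed(1)
toy <- data.frame(id = 1:50, hours = rpois(50, 40))

# Whole data frame returned, with 10% of rows flagged
marked <- sketch_sample(toy, 0.10)

# Only the 21 sampled rows returned
subset_only <- sketch_sample(toy, 21, sample.only = TRUE)
```

In this sketch the default (NULL) keeps every row and adds the flag column, while sample.only = TRUE drops unsampled rows and adds no column. The actual column name and rounding rule used by sample.df may differ.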

Details

If strata are specified and an absolute size is given that is larger than the number of observations in a given stratum cell, no records are selected from that cell.
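
The small-cell behavior described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by sample.df: the helper name sketch_stratified and the toy data are hypothetical, and applying a proportional size separately within each cell (with round()) is an assumption.

```r
# Hypothetical helper, not part of miscR: stratified sampling in base R where
# a cell smaller than the requested absolute size contributes no rows.
sketch_stratified <- function(df, size, strata) {
  # One group per observed combination of the strata columns
  cells <- split(df, df[strata], drop = TRUE)
  picked <- lapply(cells, function(cell) {
    n <- nrow(cell)
    # Assumption: a proportional size is applied within each cell.
    k <- if (size < 1) round(n * size) else size
    if (k > n) return(cell[0, , drop = FALSE])  # cell too small: skip it
    cell[sample(seq_len(n), k), , drop = FALSE]
  })
  do.call(rbind, picked)
}

toy <- data.frame(
  id = 1:12,
  marital.status = rep(c("married", "single", "widowed"), times = c(6, 4, 2))
)

set.seed(1)
out <- sketch_stratified(toy, 3, "marital.status")
table(out$marital.status)  # "widowed" has only 2 rows, so it is absent
```

Requesting 3 records per cell yields rows from the "married" and "single" cells only. A caller who needs every stratum represented would have to pass a size no larger than the smallest cell, or use a proportional size.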

Value

Data frame containing selected observations.

Examples

# Return whole dataframe with 10% of observations randomly marked
sample.df(workhrs, 0.10)

# Return randomly selected 10% of the dataframe
sample.df(workhrs, 0.10, sample.only = TRUE)

# Return 21 randomly selected observations
sample.df(workhrs, 21, sample.only = TRUE)

# Return randomly selected 30% of the dataframe, distributed proportionally across marital.status
sample.df(workhrs, 0.30, strata = "marital.status", sample.only = TRUE)

ddavid-evdy/miscR documentation built on May 15, 2019, 1:49 a.m.