take.sample: Take sample(s) from population data frame

take.sample R Documentation

Description

This function generates either a single sample (an object of class 'sample') or the set of all possible samples without replacement (an object of class 'samples.population') from a given population.

Usage

take.sample(dat,y.name,n=0,m=0,type="srs",aux.name=NULL,take.all=FALSE)

Arguments

dat

Data frame containing the population to be sampled.

y.name

Name of the response variable (must be a column name of 'dat').

n

Primary sample unit sample size. If it is a vector, its elements are taken to be the sample sizes within each stratum. For example, to take a stratified sample with 5 primary sample units from stratum 1, 3 from stratum 2 and 6 from stratum 3, use 'n=c(5,3,6)'. Has no effect in the case of cluster sampling.

m

Secondary sample unit sample size. In cluster sampling, this is the number of clusters to be sampled. Has no effect in the case of stratified sampling.

type

Type of sampling. Valid types are:

  • "srs" for simple random sampling,

  • "strat" for stratified simple random sampling,

  • "clust" for cluster sampling.

aux.name

Name of the auxiliary variable, if this is to be included in the sample (must be a column name of 'dat'). Not used in cluster sampling.

take.all

If FALSE, a single sample is generated; if TRUE, all possible samples without replacement are generated.
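As a rough guide to how large a take.all=TRUE result can become, the number of distinct samples without replacement can be counted in base R. The sketch below is illustrative only and is not the package's implementation; the population and stratum sizes are hypothetical.

```r
# Hypothetical sizes, for illustration only
N <- 6   # primary units in the population
n <- 4   # primary units per sample

# Number of distinct simple random samples without replacement
n.samples <- choose(N, n)

# combn() enumerates them: each column holds the unit indices of one sample
all.idx <- combn(N, n)

# Stratified case: one sample is a choice of n[h] units in every stratum,
# so the per-stratum counts multiply
N.strat <- c(5, 4)   # hypothetical stratum sizes
n.strat <- c(2, 2)   # sample size within each stratum
n.strat.samples <- prod(choose(N.strat, n.strat))
```

The count grows quickly with N and n, so take.all=TRUE is best suited to small populations.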

Details

Units are sampled without replacement. In the case of cluster sampling, secondary units are sampled with equal probability; in the case of simple random sampling, primary units are sampled with equal probability, and in the case of stratified simple random sampling, primary units are sampled with equal probability within strata.
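The three schemes can be sketched in base R as follows. This is a minimal illustration of equal-probability sampling without replacement, not the code used by take.sample; the data frame 'pop' and its columns 'id', 'group' and 'y' are hypothetical.

```r
set.seed(1)  # reproducible illustration

# Hypothetical population: 12 primary units in 3 groups (strata or clusters)
pop <- data.frame(
  id    = 1:12,
  group = rep(c("a", "b", "c"), times = c(5, 3, 4)),
  y     = c(10, 12, 9, 14, 11, 20, 22, 19, 5, 7, 6, 8),
  stringsAsFactors = FALSE
)

# Simple random sampling: 4 primary units, equal probability, no replacement
srs.rows <- sample(nrow(pop), size = 4)

# Stratified SRS: n[h] primary units drawn independently within each stratum
n <- c(a = 2, b = 1, c = 2)
strat.rows <- unlist(lapply(names(n), function(h) {
  rows <- which(pop$group == h)
  # sample.int() on positions avoids the sample() pitfall for length-1 input
  rows[sample.int(length(rows), size = n[[h]])]
}))

# Cluster sampling: m whole clusters (secondary units), equal probability
m <- 2
clusters <- sample(unique(pop$group), size = m)
clust.rows <- which(pop$group %in% clusters)

# Per sampled cluster: total of y and cluster size
clust.y <- tapply(pop$y[clust.rows], pop$group[clust.rows], sum)
clust.size <- table(pop$group[clust.rows])
```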

Value

The function returns an object of class 'sample' when take.all==FALSE and of class 'samples.population' when take.all==TRUE. The object has the following components:

  • M: Number of secondary units in population.

  • m: Number of secondary units sampled.

  • N: Number of primary units in population; if stratified, this is a vector of length M.

  • n: Number of primary units in sample; if stratified, this is a vector of length M.

  • y.value: For cluster sampling, this contains a vector with the sum of the y values in the sampled subunits. For simple random and stratified random sampling, it is NULL.

  • x.value: For cluster sampling, this contains a vector with the sizes (Nj) of the sampled subunits. For simple random and stratified random sampling, it is NULL.

  • subunit1 ... subunitm: Objects of class 'subsample' or 'subsamples.population' containing the primary unit sampled data. In the case of cluster sampling, this is NULL. Each subunit contains the following components:

    • N: Number of primary units in the subunit.

    • n: Number of primary units sampled in the subunit.

    • mu.x: Population mean of the auxiliary variable in the subunit. If there is no auxiliary variable, it is NULL.

    • x.value: Sampled auxiliary variable (x) values in the subunit. This is a matrix with one row per sample. If there is no auxiliary variable, it is NULL.

    • y.value: Sampled response (y) values in the subunit. This is a matrix with one row per sample.

    • units: Indices of the primary units in the sample. This is a character vector, with one element per sample. Each element is of the form "(i1,i2,...,in)", where i1 is the index of the first unit, i2 the index of the second unit, and so on.
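The layout of 'y.value' and 'units' within a subunit can be mimicked in base R. The sketch below assumes a hypothetical subunit of five primary units and builds a matrix with one row per sample, together with labels of the form "(i1,i2,...,in)". The order in which take.sample lists the samples is not stated here and may differ.

```r
# Hypothetical response values for a subunit with N = 5 primary units
y <- c(10, 12, 9, 14, 11)
n <- 3

# Unit indices of every possible sample, one row per sample
idx <- t(combn(length(y), n))

# Sampled y values in the same layout (one row per sample)
y.value <- matrix(y[as.vector(idx)], nrow = nrow(idx))

# Character labels of the form "(i1,i2,...,in)", one per sample
units <- apply(idx, 1, function(i) {
  paste0("(", paste(i, collapse = ","), ")")
})
```

Here 'y.value' has choose(5, 3) = 10 rows, and the first label is "(1,2,3)".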

See Also

define.subunit

Examples

  data(ackroyd) # get ackroyd data
  
  # simple random sample of sales, of size 4
  samp<-take.sample(ackroyd,y.name="sales",n=4)
  summary(samp) # summarize sample data
  
  # All simple random samples of sales, of size 4
  all.samp<-take.sample(ackroyd,y.name="sales",n=4,take.all=TRUE) 
  summary(all.samp) # summarize sample data
  
  # All simple random samples of sales, of size 4, with auxiliary variable "mplyees"
  all.samp<-take.sample(ackroyd,y.name="sales",n=4,aux.name="mplyees",take.all=TRUE) 
  summary(all.samp) # summarize sample data
  
  # All stratified random samples of sales, of size 4, with auxiliary variable "mplyees"
  # First need to define strata:
  strat.ackroyd<-define.subunit(ackroyd,aux.name="nature",type="strat")
  # Then take samples:
  all.strat.samp<-take.sample(strat.ackroyd,type="strat",y.name="sales",n=c(2,2),aux.name="mplyees",take.all=TRUE)
  summary(all.strat.samp) # summarize sample data


david-borchers/sampling documentation built on Sept. 17, 2022, 7:54 a.m.