rep_sample_n: Perform repeated sampling


View source: R/rep_sample_n.R

Description

Repeatedly draw samples of size n from a data frame. Useful for constructing sampling distributions.
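To make the idea concrete, here is a minimal base-R sketch of repeated sampling. It is not infer's implementation; the population data frame and the variable names are hypothetical and chosen only for illustration.

```r
# Base-R sketch of repeated sampling (illustrative only, not infer's code)
set.seed(1)
population <- data.frame(x = 1:100)  # hypothetical population of 100 rows
size <- 5                            # rows per sample
reps <- 3                            # number of samples

# Draw `reps` independent samples and label each with its replicate number
samples <- do.call(rbind, lapply(seq_len(reps), function(r) {
  rows <- sample(nrow(population), size = size, replace = FALSE)
  data.frame(replicate = r, x = population$x[rows])
}))

nrow(samples)  # reps * size = 15
```

Each sample is drawn independently of the others, so the same population row can appear in more than one replicate even when sampling within a replicate is without replacement.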

Usage

rep_sample_n(tbl, size, replace = FALSE, reps = 1, prob = NULL)

Arguments

tbl

Data frame of population from which to sample.

size

Sample size of each sample.

replace

Should sampling be with replacement?

reps

Number of samples of size n = size to take.

prob

A vector of probability weights for selecting the rows of tbl. The default, NULL, gives every row the same chance of being drawn.
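The following sketch shows how replace and prob might be combined. The toy data frame and the weights are made up for illustration, and the sketch assumes prob takes one weight per row of tbl.

```r
library(infer)

# Hypothetical population of 4 rows with made-up sampling weights
toy <- data.frame(colour = c("red", "green", "blue", "black"))
weights <- c(0.7, 0.1, 0.1, 0.1)  # assumed: one weight per row of toy

set.seed(2019)
# size exceeds nrow(toy), which is only possible because replace = TRUE
boot <- rep_sample_n(toy, size = 6, replace = TRUE, reps = 2, prob = weights)
boot
```

With these weights, "red" should appear far more often than the other colours in each replicate.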

Value

A tibble with reps * size rows, corresponding to reps samples of size n = size from tbl. A replicate column identifies the sample each row belongs to.
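A quick way to check the shape of the result is sketched below, using a hypothetical 20-row data frame rather than real data.

```r
library(infer)

# Hypothetical toy population, for illustration only
toy <- data.frame(id = 1:20)

out <- rep_sample_n(toy, size = 5, reps = 3)

nrow(out)             # reps * size = 15
table(out$replicate)  # number of rows drawn in each replicate
```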

Examples

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))

# A virtual population of N = 10,010, of which 3091 are hurricanes
population <- dplyr::storms %>%
  select(status)

# Take samples of size n = 50 storms without replacement; do this 1000 times
samples <- population %>%
  rep_sample_n(size = 50, reps = 1000)
samples

# Compute p_hats for all 1000 samples = proportion hurricanes
p_hats <- samples %>%
  group_by(replicate) %>%
  summarize(prop_hurricane = mean(status == "hurricane"))
p_hats

# Plot sampling distribution
ggplot(p_hats, aes(x = prop_hurricane)) +
  geom_density() +
  # geom_density() draws a density estimate, so the y-axis is density, not a count
  labs(x = "p_hat", y = "Density",
       title = "Sampling distribution of p_hat from 1000 samples of size 50")

andrewpbray/infer documentation built on Aug. 29, 2019, 5:57 a.m.