generate_data: Generate simulated data

View source: R/generate_data.R

generate_data R Documentation

Generate simulated data

Description

Generate data for a regular monitoring design. The counts follow a negative binomial distribution with a given size parameter and a true mean mu that depends on a year, a period and a site effect. All effects are independent of each other and follow, on the log-scale, a normal distribution with zero mean and a given standard deviation.
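
The model described above can be sketched in base R. This is a simplified illustration, not the package's implementation: it uses only independent normal effects and ignores the trend, random walk and periodic components controlled by the other arguments. The design size, the standard deviations and the lower-case column names are assumptions chosen for the example.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical small design; names and values are illustrative assumptions.
n_year <- 5
n_period <- 3
n_site <- 4
intercept <- 2   # global mean on the log-scale
size <- 2        # size parameter of the negative binomial distribution

# Independent zero-mean normal effects on the log-scale.
year_effect <- rnorm(n_year, mean = 0, sd = 0.1)
period_effect <- rnorm(n_period, mean = 0, sd = 0.5)
site_effect <- rnorm(n_site, mean = 0, sd = 1)

# One row per combination of year, period and site.
design <- expand.grid(
  year = seq_len(n_year),
  period = seq_len(n_period),
  site = seq_len(n_site)
)

# True mean on the original scale.
design$mu <- exp(
  intercept + year_effect[design$year] + period_effect[design$period] +
    site_effect[design$site]
)

# Counts drawn from a negative binomial with mean mu and the given size.
design$count <- rnbinom(nrow(design), size = size, mu = design$mu)

head(design)
```

The `mu` parametrisation of `rnbinom()` is used here because the description defines the distribution through its mean rather than through a success probability.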

Usage

generate_data(
  intercept = 2,
  n_year = 24,
  n_period = 6,
  n_site = 20,
  year_factor = FALSE,
  period_factor = FALSE,
  site_factor = FALSE,
  trend = 0.01,
  sd_rw_year = 0.1,
  amplitude_period = 1,
  mean_phase_period = 0,
  sd_phase_period = 0.2,
  sd_site = 1,
  sd_rw_site = 0.02,
  sd_noise = 0.01,
  size = 2,
  n_run = 10,
  as_list = FALSE,
  details = FALSE
)
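
A call that overrides a few of these defaults might look as follows. This is a minimal sketch that assumes the multimput package is installed; the argument values and the seed are arbitrary, and the structure of the result is printed rather than assumed.

```r
# Runs only when the multimput package is installed.
if (requireNamespace("multimput", quietly = TRUE)) {
  set.seed(20230917)  # arbitrary seed for reproducibility
  dataset <- multimput::generate_data(
    n_year = 10,
    n_period = 4,
    n_site = 15,
    n_run = 1
  )
  # Inspect the variables that were generated.
  str(dataset)
}
```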

Arguments

intercept

The global mean on the log-scale.

n_year

The number of years.

n_period

The number of periods.

n_site

The number of sites.

year_factor

Convert year to a factor. Defaults to FALSE.

period_factor

Convert period to a factor. Defaults to FALSE.

site_factor

Convert site to a factor. Defaults to FALSE.

trend

The long-term linear trend on the log-scale.

sd_rw_year

The standard deviation of the year effects on the log-scale.

amplitude_period

The amplitude of the periodic effect on the log-scale.

mean_phase_period

The mean of the phase of the periodic effect among years. Defaults to 0.

sd_phase_period

The standard deviation of the phase of the periodic effect among years.

sd_site

The standard deviation of the site effects on the log-scale.

sd_rw_site

The standard deviation of the random walk along year per site on the log-scale.

sd_noise

The standard deviation of the noise effects on the log-scale.

size

The size parameter of the negative binomial distribution.

n_run

The number of runs with the same mu.

as_list

Return the dataset as a list rather than a data.frame. Defaults to FALSE.

details

Add variables containing the year, period and site effects. Defaults to FALSE.
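
Two of the ideas in the argument list, the random walk along year (sd_rw_site) and several runs sharing the same mu (n_run), can be illustrated in base R. The sketch assumes that the random walk is a cumulative sum of independent zero-mean normal increments and looks at a single hypothetical site with a log-scale intercept of 2. The package's internal code may differ.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_year <- 24
n_run <- 10
sd_rw_site <- 0.02
size <- 2

# Assumed random walk: cumulative sum of zero-mean normal increments.
rw <- cumsum(rnorm(n_year, mean = 0, sd = sd_rw_site))

# True mean per year for one hypothetical site.
mu <- exp(2 + rw)

# n_run independent sets of counts, all drawn from the same mu.
# Rows are years, columns are runs.
counts <- replicate(n_run, rnbinom(n_year, size = size, mu = mu))

dim(counts)
```

Because every run shares the same mu, differences between the columns of `counts` reflect only the negative binomial sampling variation.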

Value

A data.frame with five variables. Year, Month and Site are factors identifying the location and time of monitoring. Mu is the true mean of the negative binomial distribution on the original scale. Count contains the simulated counts.


inbo/multimput documentation built on Sept. 17, 2023, 4:35 a.m.