uni.simudistrib: Random sample simulation

View source: R/02_data_preparation.R

uni.simudistrib R Documentation

Random sample simulation

Description

The uni.simudistrib function automatically generates five Cleveland dotplots per input variable, each showing a random sample drawn from the chosen distribution (normal, log-normal, or Poisson) with parameters derived from the variables in the input data.frame or matrix (see Details).
This function helps assess whether a variable's extreme values are actual outliers or whether they lie within the range of values plausible for a random sample drawn from a normal, log-normal, or Poisson distribution. Ultimately, it may help determine whether the original variable can be approximated by one of these distributions, with or without a transformation.

Usage

uni.simudistrib(simu.var, distribution)

Arguments

simu.var

A data.frame or a matrix. For layout and readability reasons, simu.var should not include too many variables (p < 12 is advised) if the plot is to be displayed on a page or in a window of limited dimensions. In HTML documents (e.g. with RMarkdown), the number of input variables should not be a problem.

distribution

Either "normal", "log-normal" or "poisson" (case sensitive).

Details

The uni.simudistrib function extracts key parameters from each input variable (sample size, mean, and standard deviation) and generates random samples based on these parameters. For instance, if simu.var contains i variables X1, X2, ..., Xi and if distribution = "normal", the function will return a panel of i × 5 plots:

  • The 1st row will contain five dotplots for five random samples with n = length(X1) and drawn from a Normal distribution with the same mean and standard deviation as X1.

  • The 2nd row will contain five dotplots for five random samples with n = length(X2) and drawn from a Normal distribution with the same mean and standard deviation as X2.

  • Etc.
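The rows described above could be reproduced roughly as follows in base R. This is only a minimal sketch of the idea for the normal case, not the package's actual implementation: the helper name simulate_normal_panel and its n.sim argument are hypothetical, and the input is assumed to contain no missing values.

```r
# Hypothetical helper (not part of the package): for each variable, draw
# n.sim normal samples matching its sample size, mean and standard deviation.
simulate_normal_panel <- function(simu.var, n.sim = 5) {
  simu.var <- as.data.frame(simu.var)
  lapply(simu.var, function(x) {
    replicate(n.sim,
              rnorm(n = length(x), mean = mean(x), sd = sd(x)),
              simplify = FALSE)
  })
}

set.seed(42)  # only to make the sketch reproducible
sims <- simulate_normal_panel(iris[, 1:4])

# Draw one row of five Cleveland dotplots per variable.
op <- par(mfrow = c(length(sims), 5), mar = c(2, 2, 2, 1))
for (v in names(sims)) {
  for (k in seq_along(sims[[v]])) {
    dotchart(sims[[v]][[k]], main = paste(v, k), pch = 19, cex = 0.5)
  }
}
par(op)  # restore the previous graphical parameters
```

Comparing these simulated rows with a dotplot of the original variable shows whether its extreme values look unusual for a normal sample of the same size.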

Warning: the function may fail for log-normal and Poisson distributions if input variables contain negative values, because a log-normal variable is strictly positive and a Poisson variable is a non-negative integer. Additionally, if distribution = "poisson", the simulated samples shown in the plots will contain only integer values, as the Poisson distribution is a discrete probability distribution.
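The short base R sketch below illustrates both points of this warning with standard functions only. How uni.simudistrib derives its log-normal and Poisson parameters internally is not stated here, so the use of log() and of a mean as the Poisson rate are assumptions made for illustration.

```r
# Poisson draws are stored as integers, so the dotplots show discrete values.
set.seed(1)
pois_sample <- rpois(n = 10, lambda = 3.5)
is.integer(pois_sample)

# Negative inputs are problematic for both distributions (assumed mechanisms):
x_neg <- c(-2, 1, 3)

# 1. The logarithm of a negative value is undefined (NaN, with a warning).
bad_log <- suppressWarnings(log(x_neg))

# 2. A negative mean is not a valid Poisson rate, so rpois() yields NA.
bad_pois <- suppressWarnings(rpois(n = 3, lambda = mean(c(-5, -1, -3))))

bad_log
bad_pois
```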

Value

A panel of p × 5 plots, where p is the number of variables in simu.var.

Examples

uni.simudistrib(simu.var = iris[,1:4], distribution = "normal")

mrelnoob/jk.dusz.tarping documentation built on July 31, 2023, 9:19 a.m.