sim_df: Simulate an existing dataframe
In faux: Simulation for Factorial Designs

sim_df

R Documentation

Description

Produces a data table with the same distributions and correlations as an existing data table. Only numeric columns are returned, and all numeric variables are simulated from a continuous normal distribution (for now).
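
To make the description concrete, here is a minimal conceptual sketch of what "same distributions and correlations" means. It is not faux's internal implementation. It assumes the MASS package is available and uses MASS::mvrnorm() as a stand-in to draw new rows from a multivariate normal distribution with the observed means and covariance. The object names (num, mu, sigma, sim) are illustrative only.

```r
# Conceptual sketch only: NOT how sim_df() is implemented internally.
library(MASS)  # assumed available; provides mvrnorm()

set.seed(8675309)  # arbitrary seed so the draw is reproducible

num   <- iris[sapply(iris, is.numeric)]  # keep the numeric columns only
mu    <- colMeans(num)                   # observed means
sigma <- cov(num)                        # observed covariance (SDs and correlations)

# Draw 100 new rows from a multivariate normal with those parameters
sim <- as.data.frame(mvrnorm(n = 100, mu = mu, Sigma = sigma))

# Differences between simulated and original statistics
round(colMeans(sim) - mu, 2)
round(cor(sim) - cor(num), 2)
```

Because the rows are sampled, the simulated means and correlations should be close to the originals rather than identical to them.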

Usage

sim_df(
  data,
  n = 100,
  within = c(),
  between = c(),
  id = "id",
  dv = "value",
  empirical = FALSE,
  long = faux_options("long"),
  seed = NULL,
  missing = FALSE,
  sep = faux_options("sep")
)

Arguments

`data`	the existing tbl
`n`	the number of samples to return per group
`within`	a list of the within-subject factor columns (if long format)
`between`	a list of the between-subject factor columns
`id`	the names of the column(s) for grouping observations
`dv`	the name of the DV (value) column
`empirical`	whether the returned data should have exactly these parameters (TRUE) or be sampled from a population with these parameters (FALSE)
`long`	whether to return the data table in long format
`seed`	DEPRECATED; call set.seed() before running this function instead
`missing`	whether to simulate missing data
`sep`	separator for factor levels
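
The following sketch illustrates the `empirical` argument and the set.seed() replacement for the deprecated `seed` argument. It assumes the faux package is installed and uses only the arguments documented above. The object names (num_cols, sampled, exact, again) and the seed value are illustrative.

```r
library(faux)

# Names of the numeric columns that sim_df() simulates
num_cols <- names(iris)[sapply(iris, is.numeric)]

# Default (empirical = FALSE): rows are sampled from a population
# with the observed parameters, so sample statistics vary from draw to draw
set.seed(1)  # replaces the deprecated `seed` argument
sampled <- sim_df(iris, n = 50)

# empirical = TRUE: the returned data are constrained to have
# the observed parameters exactly
exact <- sim_df(iris, n = 50, empirical = TRUE)

# Compare the column means with those of the original data
colMeans(iris[, num_cols]) - colMeans(sampled[, num_cols])
colMeans(iris[, num_cols]) - colMeans(exact[, num_cols])

# Resetting the same seed before the call repeats the earlier draw
set.seed(1)
again <- sim_df(iris, n = 50)
all.equal(sampled, again)
```

Use empirical = TRUE when the simulated sample itself must reproduce the original statistics, and leave it FALSE when sampling variability is wanted.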

Details

See vignette("sim_df", package = "faux") for details.

Value

a tbl

Examples

iris100 <- sim_df(iris, 100)
iris_species <- sim_df(iris, 100, between = "Species")

# set the names of the within factors (and the separator character)
# if you want to return a long version
longdf <- sim_df(iris, 
                 between = "Species", 
                 within = c("type", "dim"),
                 sep = ".",
                 long = TRUE)
                 
# or if you are simulating data from a table in long format
widedf <- sim_df(longdf, 
                 between = "Species", 
                 within = c("type", "dim"),
                 sep = ".")

faux documentation built on April 3, 2025, 7:44 p.m.