knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

newdata

Lifecycle: stable R-CMD-check codecov License: MIT CRAN status

Introduction

newdata is an R package to generate new data frames by varying some variables while holding the others constant.

By default, all specified variables vary across their range while all other variables are held constant at a reference value. The user can specify the length of each sequence, require that only observed values and combinations are used and add new variables. Types, classes, factor levels and time zones are always preserved.

Consider the following observed 'old' data frame.

library(newdata)

newdata::old_data

Reference Value

By default all variables are set to a reference value.

xnew_data(old_data)

By default, the reference value depends on the class of the variable.

Sequences

Specifying a variable causes it to vary sequentially across its range.

xnew_data(old_data, int)
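The idea of varying one variable while holding the others constant can be sketched in base R. This is only a conceptual illustration, not the package's implementation: the `toy` data frame is hypothetical, and the choices of "every integer from minimum to maximum" for the sequence and "median" for the reference value are assumptions made for the sketch.

```r
# Hypothetical toy data; not the package's old_data.
toy <- data.frame(
  int = c(1L, 4L, 6L),
  dbl = c(0.5, 2.0, 8.0)
)

# Vary int across its observed range (assumed: every integer from min to max)
# and hold dbl constant at its median (assumed reference value).
sketch <- data.frame(
  int = seq(min(toy$int), max(toy$int)),
  dbl = median(toy$dbl)
)

sketch
```

Each row differs only in `int`, so any change in a prediction made from `sketch` can be attributed to that variable alone.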

By default, the sequence depends on the class of the variable.

These defaults can be overridden by setting package options.

When programming, it is strongly recommended that the user explicitly specify the length of each sequence individually.

xnew_data(old_data, lgl, xnew_seq(int, length_out = 3))
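What a fixed sequence length means for a continuous variable can be illustrated with base R's `seq()`. The values below are hypothetical and the sketch makes no claim about how the package treats other classes; it only shows a sequence of a requested length spanning the observed range.

```r
# Hypothetical observed values; not taken from old_data.
dbl <- c(0.5, 2.0, 8.0)

# Three evenly spaced values from the observed minimum to the observed maximum.
dbl_seq <- seq(min(dbl), max(dbl), length.out = 3)

dbl_seq
```

Fixing the length explicitly keeps the number of rows in the new data predictable, which matters when the result feeds into other code.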

A third alternative is to specify the length of all the sequences in the data set using the .length_out argument, but this can result in less common character strings or later factor or ordered levels being dropped.

xnew_data(old_data, dbl, int, .length_out = 2)

Observed Values

The user can also indicate whether only observed values should be used in the sequence.

xnew_data(old_data, xnew_seq(int, length_out = 3, obs_only = TRUE))

The xobs_only() function can be used to filter out unobserved values after the sequence has been generated.

xnew_data(old_data, xobs_only(xnew_seq(int, length_out = 3)))

When two or more variables are specified, all combinations are used.

xnew_data(old_data, int, fct)

Wrap the variables in xobs_only() to get only the observed combinations.

xnew_data(old_data, xobs_only(int, fct))
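The difference between all combinations and observed combinations can be sketched in base R. The `toy` data frame is hypothetical, and `expand.grid()` and `merge()` stand in for the concept only; this is not how the package is implemented.

```r
# Hypothetical toy data; the combination (int = 2, fct = "a") is never observed.
toy <- data.frame(
  int = c(1L, 1L, 2L),
  fct = factor(c("a", "b", "b"))
)

# All combinations of the distinct values: 2 x 2 = 4 rows.
all_combos <- expand.grid(
  int = sort(unique(toy$int)),
  fct = levels(toy$fct)
)

# Keep only the combinations that occur in the data: 3 rows.
observed <- unique(toy[c("int", "fct")])
obs_combos <- merge(all_combos, observed)

obs_combos
```

Restricting to observed combinations avoids generating rows for parts of the variable space where there are no data.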

Modifying Variables

Adding a new variable or modifying an existing one is simple.

xnew_data(old_data, lgl = median(lgl, na.rm = TRUE), extra = c(TRUE, FALSE))

Casting Variables

To cast new values to the same class as the original variable, wrap them in xcast().

xnew_data(old_data, xcast(lgl = 1, int = 7, dbl = 10L, fct = "a rarity", hms = "00:00:02"))
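For comparison, the manual base R equivalent of such casts might look like the following. The `toy` data frame is hypothetical and the sketch only shows why matching the original class takes some care, especially for factors, where the original levels need to be carried over.

```r
# Hypothetical toy data; not the package's old_data.
toy <- data.frame(
  lgl = c(TRUE, FALSE),
  int = c(1L, 2L),
  dbl = c(0.5, 1.5),
  fct = factor(c("a", "b"))
)

# Cast each new value to the class of the matching original column.
lgl_value <- as.logical(1)   # numeric to logical
int_value <- as.integer(7)   # double to integer
dbl_value <- as.double(10L)  # integer to double
fct_value <- factor("b", levels = levels(toy$fct))  # keep the original levels
```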

A Simple Wrapper

Although superseded, new_data() is also provided for consistency with existing code. It is a simple wrapper on xnew_data() that allows the user to pass a character vector of variable names and to specify the length of all the sequences.

new_data(old_data, seq = c("int", "fct"), length_out = 5)

Installation

To install the latest release version from CRAN:

install.packages("newdata")

To install the latest development version from GitHub:

# install.packages("pak")
pak::pak("poissonconsulting/newdata")

Alternatively, install the development version from r-universe:

install.packages("newdata", repos = c("https://poissonconsulting.r-universe.dev", "https://cloud.r-project.org"))

Contribution

Please report any issues.

Pull requests are always welcome.

Code of Conduct

Please note that the newdata project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.



poissonconsulting/newdata documentation built on July 4, 2025, 3:29 p.m.