R/data_reg.R

#' Regression toy dataset
#' 
#' A simple dataset containing simulated values for a numeric response variable
#' and four covariates of mixed and partially structured types. The data
#' generation process is based on Section 5 (''Example: synthetic data'') of
#' Serban and Wasserman (2005).
#' 
#' @format List with two elements: \code{covs}, which is a list containing the
#'   covariates, and \code{resp}, which is a numeric vector of length 200
#'   representing the response variable. The response variable is specified as
#'   in Serban and Wasserman (2005). The four covariates in \code{covs} all have
#'   length 200 and are characterized as follows:
#' \itemize{
#' \item Nominal: level 0 for observations with a negative response value,
#' level 1 otherwise;
#' \item Numeric: coefficients of one of the basis functions used in the
#' B-spline expansion of the curves, which are in turn specified as in Serban
#' and Wasserman (2005);
#' \item Functional: curves as specified in Serban and Wasserman (2005), with 50
#' observations coming from each of the four curve shapes;
#' \item Graphs: Erd\"{o}s-R\'{e}nyi graphs whose connection probability is a
#' transformation of the response variable: normally distributed noise with
#' mean 0 and standard deviation 7 is added to the response, and the result is
#' rescaled to lie between 0.2 and 0.8.
#' }
#' 
#' @references 
#' 
#' Serban, N., and Wasserman, L. (2005). CATS: clustering after transformation
#' and smoothing. \emph{Journal of the American Statistical Association},
#' 100(471), 990-999.
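#'
#' @examples
#' # A minimal, illustrative sketch of how the dataset might be inspected.
#' # It assumes the etree package is installed; the names of the elements of
#' # covs are not stated above, so they are queried rather than assumed.
#' data("data_reg", package = "etree")
#' str(data_reg, max.level = 1)   # list with elements covs and resp
#' length(data_reg$resp)          # documented as 200
#' names(data_reg$covs)           # the four covariates
#'
#' # Hypothetical sketch of the transformation described for the graph
#' # covariate. This is NOT the package's generation code: min-max rescaling
#' # is an assumption about what "standardizing between 0.2 and 0.8" means.
#' noisy <- data_reg$resp + rnorm(length(data_reg$resp), mean = 0, sd = 7)
#' prob <- 0.2 + 0.6 * (noisy - min(noisy)) / (max(noisy) - min(noisy))
#' range(prob)                    # endpoints of the rescaled probabilities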
#' 
"data_reg"

etree documentation built on July 16, 2022, 9:05 a.m.