R/Simdata.R

Defines functions Simdata

Documented in Simdata

#' Generate simulation data (a unified framework for generating simulation data)
#'
#' This function quickly generates simulation data.
#' You only need to supply the sample size and dimension of the data
#' you want to generate, along with the covariance parameter rho.
#' Several models are available; see the \code{model} argument.
#'
#' @param n Number of subjects in the dataset to be simulated. This also equals the
#' number of rows in the dataset, because each row is assumed to represent
#' a different independent and identically distributed subject.
#' @param p Number of predictor variables (covariates) in the simulated dataset.
#' These covariates will be the features screened by model-free procedures.
#' @param rho The correlation between adjacent covariates in the simulated matrix X.
#' The within-subject covariance matrix of X is assumed to have the same form as an
#' AR(1) auto-regressive covariance matrix, although this is not meant to imply
#' that the X covariates for each subject are in fact a time series. Instead, it is just
#' used as an example of a parsimonious but nontrivial covariance structure. If
#' rho is set to zero, the X covariates will be independent and the
#' simulation will run faster. This argument has no default and must be supplied.
#' @param R A positive integer, number of outcome categories for multinomial (categorical) outcome Y.
#' @param beta A vector of length p containing the regression coefficients for the
#' chosen model. The default is beta=(1,1,1,1,1,0,...,0)^T.
#' @param error The distribution of the error term: one of "gaussian", "t", or "cauchy".
#' @param lambda This parameter controls the censoring rate in survival data.
#' The censoring time is generated from an exponential distribution with mean 1/lambda.
#' The default is lambda=0.1.
#' @param order The number of interacting variables. The default is 2.
#' @param style Whether the categories in categorical data are balanced ("balanced")
#' or not ("unbalanced").
#' @param type The type of multivariate response model; each type uses a different mean and
#' covariance structure to generate data. Specifically, type="a" follows Model 3.a and
#' type="b" follows Model 3.b of Liu et al. (2020).
#' @param model The model used to generate the simulation data. One of "linear",
#' "nonlinear", "binomial", "poisson", "classification", "Cox", "interaction",
#' "group", "multivariate", or "AFT".
#'
#' @return A list containing the simulated data.
#' @import MASS
#' @importFrom MASS mvrnorm
#' @importFrom stats rnorm
#' @importFrom stats rt
#' @importFrom stats rcauchy
#' @export
#' @author Xuewei Cheng \email{xwcheng@hunnu.edu.cn}
#' @examples
#' n <- 100
#' p <- 200
#' rho <- 0.5
#' data <- Simdata(n, p, rho, error = "gaussian", model = "linear")
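#'
#' # Illustrative sketch only, not the package's internal code: the AR(1)-type
#' # correlation structure described for rho, ASSUMING entry (i, j) of the
#' # matrix equals rho^|i - j|. The names Sigma_demo and X_demo are hypothetical
#' # and used only for this demonstration with 5 covariates.
#' Sigma_demo <- rho^abs(outer(1:5, 1:5, "-"))
#' # Draw n rows from a multivariate normal with that correlation matrix.
#' X_demo <- MASS::mvrnorm(n, mu = rep(0, 5), Sigma = Sigma_demo)
#' # Sample correlations between adjacent columns should be close to rho.
#' round(cor(X_demo), 2)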
#' @references
#'
#' Liu, W., Y. Ke, J. Liu, and R. Li (2020). Model-free feature screening and FDR control with knockoff features. Journal of the American Statistical Association, 1–16.
Simdata <- function(n, p, rho,
                    beta = c(rep(1, 5), rep(0, p - 5)),
                    error = c("gaussian", "t", "cauchy"),
                    R = 3,
                    style = c("balanced", "unbalanced"),
                    lambda = 0.1,
                    order = 2,
                    type = c("a", "b"),
                    model = c(
                      "linear", "nonlinear", "binomial", "poisson", "classification",
                      "Cox", "interaction", "group", "multivariate", "AFT"
                    )) {
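  # Resolve the vector of choices to a single model name; without this, calling
  # Simdata() with the default `model` compares a length-10 vector inside if().
  model <- match.arg(model)
  if (model == "linear") {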
    data <- GendataLM(n, p, rho, beta, error) ## NO.1
  } else if (model == "nonlinear") {
    data <- GendataTM(n, p, rho, beta, error) ## NO.2
  } else if (model == "binomial") {
    data <- GendataLGM(n, p, rho, beta) ## NO.3
  } else if (model == "poisson") {
    data <- GendataPM(n, p, rho, beta) ## NO.4
  } else if (model == "classification") {
    data <- GendataLDA(n, p, R, error, style) ## NO.5
  } else if (model == "Cox") {
    data <- GendataCox(n, p, rho, beta, lambda) ## NO.6
  } else if (model == "interaction") {
    data <- GendataIM(n, p, rho, order) ## NO.7
  } else if (model == "group") {
    data <- GendataGP(n, p, rho, error) ## NO.8
  } else if (model == "multivariate") {
    data <- GendataMRM(n, p, rho, type) ## NO.9
  } else if (model == "AFT") {
    data <- GendataAFT(n, p, rho, beta, lambda, error) ## NO.10
  } else {
    stop("The author has not implemented this model yet.")
  }
  return(data)
}

MFSIS documentation built on June 22, 2024, 9:42 a.m.