View source: R/create_design_filebacked.R
create_design_filebacked | R Documentation |
Creates a design matrix, outcome, and penalty factor to be passed to a model-fitting function.
create_design_filebacked(
data_file,
rds_dir,
obj,
new_file,
feature_id = NULL,
add_outcome,
outcome_id,
outcome_col,
na_outcome_vals = c(-9, NA_integer_),
add_predictor = NULL,
predictor_id = NULL,
unpen = NULL,
logfile = NULL,
overwrite = FALSE,
quiet = FALSE
)
data_file |
A filepath to rds file of processed data (data from |
rds_dir |
The path to the directory in which you want to create the new '.rds' and '.bk' files. |
obj |
The RDS object read in by |
new_file |
User-specified filename (without .bk/.rds extension) for the to-be-created .rds/.bk files. Must be different from any existing .rds/.bk files in the same folder. |
feature_id |
A string specifying the column in the data X (the feature data) with the row IDs (e.g., identifiers for each row/sample/participant, etc.). No duplicates allowed.
- for PLINK data: a string specifying an ID column of the PLINK |
add_outcome |
A data frame or matrix with two columns: an ID column and a column with the outcome value (to be used as 'y' in the final design). IDs must be characters, outcome must be numeric. |
outcome_id |
A string specifying the name of the ID column in 'add_outcome' |
outcome_col |
A string specifying the name of the phenotype column in 'add_outcome' |
na_outcome_vals |
A vector of numeric values used to code NA values in the outcome. Defaults to |
add_predictor |
Optional (for PLINK data only): a matrix or data frame used to add unpenalized covariates/predictors/features from an external file (i.e., not a PLINK file). This matrix must have one column that is an ID column; all other columns aside from the ID column will be used as covariates in the design matrix. Columns must be named. |
predictor_id |
Optional (for PLINK data only): A string specifying the name of the column in 'add_predictor' with sample IDs. Required if 'add_predictor' is supplied. The names will be used to subset and align this external covariate with the supplied PLINK data. |
unpen |
Optional (for delimited file data only): a character vector with the names of columns to mark as unpenalized (i.e., these features will always be included in a model). Note: if you use this option, X must have column names. |
logfile |
Optional: name of the '.log' file to be written – Note: do not append a |
overwrite |
Logical: should existing .rds files be overwritten? Defaults to FALSE. |
quiet |
Logical: should messages printed to the console be silenced? Defaults to FALSE. |
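The requirements on 'add_outcome' (character IDs, numeric outcome, missing values coded as in 'na_outcome_vals') can be checked before calling the function. The following is a minimal base R sketch, not the package's internal logic; the table 'pheno' and its column names "id" and "pheno" are hypothetical and stand in for whatever would be passed as 'outcome_id' and 'outcome_col'.

```r
# Hypothetical phenotype table; the column names "id" and "pheno" are
# illustrative and would be passed as outcome_id and outcome_col.
pheno <- data.frame(
  id = c(101, 102, 103, 104),
  pheno = c("1.5", "-9", "0.2", NA),
  stringsAsFactors = FALSE
)

# IDs must be character and the outcome numeric, so coerce both.
add_outcome <- data.frame(
  id = as.character(pheno$id),
  pheno = as.numeric(pheno$pheno),
  stringsAsFactors = FALSE
)

# IDs should be unique so that each row can be matched unambiguously.
stopifnot(anyDuplicated(add_outcome$id) == 0)

# Values that code a missing outcome (the documented default).
na_outcome_vals <- c(-9, NA_integer_)

# Flag rows whose outcome matches one of the missing-value codes.
# %in% matches NA against NA, so both -9 and NA are caught.
is_missing <- add_outcome$pheno %in% na_outcome_vals
complete_ids <- add_outcome$id[!is_missing]
```

Here 'complete_ids' holds the IDs whose outcome is not coded as missing. How the function itself handles such rows is not described on this page, so this only illustrates the coding convention.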
A filepath to the created .rds file containing all the information for model fitting, including a standardized X and model design information
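Because 'new_file' must differ from any existing .rds/.bk files in 'rds_dir', it can help to check for clashes before calling the function. The following base R sketch assumes the output files are named by appending the extensions to 'new_file'; the directory and the name "my_design" are hypothetical.

```r
# Hypothetical output location and file name (no extension on new_file).
rds_dir <- file.path(tempdir(), "design_demo")
dir.create(rds_dir, showWarnings = FALSE)
new_file <- "my_design"

# Paths the new .rds/.bk pair would occupy, assuming the extensions
# are appended directly to new_file.
candidate <- file.path(rds_dir, paste0(new_file, c(".rds", ".bk")))

# TRUE for any path that is already taken.
already_there <- file.exists(candidate)

if (any(already_there)) {
  message("Choose another new_file or set overwrite = TRUE: ",
          paste(candidate[already_there], collapse = ", "))
}
```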