knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
After creating an as.MLinput() object, the next phase in the peppuR pipeline involves common preprocessing steps such as handling missing data and correlation filtering.
Since we have no missing data, we'll proceed to correlation filtering, which uses Max Kuhn's caret package. In general, we use a correlation-matrix-based approach with the peppuR function univariate_feature_selection().
library(peppuR)
library(MASS)

birthweight_data <- birthwt
birthweight_data$ID <- paste("ID", 1:nrow(birthweight_data), sep = "_")
birthweight_data$low <- as.factor(birthweight_data$low)

# Make categorical columns factors
birthweight_data[, colnames(birthweight_data) %in% c("race", "smoke", "ht", "ui")] <-
  lapply(birthweight_data[, colnames(birthweight_data) %in% c("race", "smoke", "ht", "ui")],
         function(x) as.factor(x))

# Create an organized data object
single_source_peppuRobj <- as.MLinput(
  X = birthweight_data,
  Y = NULL,
  meta_colnames = c("ID", "low"),
  categorical_features = TRUE,
  sample_cname = "ID",
  outcome_cname = "low"
)

single_source_peppuRobj <- univariate_feature_selection(single_source_peppuRobj)
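For intuition about what a correlation-matrix-based filter does, here is a minimal sketch that uses caret's findCorrelation() directly on the numeric columns of the same birthwt data. It is not peppuR's internal implementation. The column selection, the cutoff of 0.75, and the variable names numeric_features, cor_matrix, drop_idx, and filtered_features are assumptions made for illustration only.

```r
library(MASS)   # provides the birthwt data set
library(caret)  # provides findCorrelation()

# Numeric predictors only; this column choice is an illustrative assumption
numeric_features <- birthwt[, c("age", "lwt", "ptl", "ftv", "bwt")]

# Pairwise Pearson correlations between the numeric predictors
cor_matrix <- cor(numeric_features)

# Column indices that caret suggests removing because they take part in
# a pairwise correlation above the cutoff (0.75 is an assumed value)
drop_idx <- findCorrelation(cor_matrix, cutoff = 0.75)

# Guard against an empty index vector: negative indexing with integer(0)
# would drop every column instead of none
if (length(drop_idx) > 0) {
  filtered_features <- numeric_features[, -drop_idx, drop = FALSE]
} else {
  filtered_features <- numeric_features
}

colnames(filtered_features)
```

Features that survive the filter are the ones that would be passed on to later modeling steps. A lower cutoff removes more features, so the threshold is worth tuning for each data set.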