knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/replace-missing-data-",
  out.width = "100%"
)
library(MetaPipe)
MetaPipe can handle missing data in a couple of ways:
replace_missing(raw_data = raw_data,
                excluded_columns = c(2, 3, ..., M),
                # Optional
                out_prefix = "metapipe",
                prop_na = 0.5,
                replace_na = FALSE)
where raw_data is a data frame containing the raw data, as described in Load Raw Data, and excluded_columns is a vector containing the indices of the properties, e.g. c(2, 3, ..., M). The other arguments are optional: out_prefix is the prefix for output files, prop_na is the NA proportion threshold used to drop traits, and replace_na is a logical flag indicating whether NAs should be replaced by half of the minimum value.
By default, the pipeline drops traits whose proportion of NA values exceeds a
threshold (50%); the user can tune this threshold with the parameter prop_na.
Keep in mind that excluding variables from the QTL mapping has side effects,
such as drawing wrong conclusions about the most significant QTLs.
set.seed(123)
example_data <- data.frame(ID = 1:5,
                           P1 = c("one", "two", "three", "four", "five"),
                           T1 = rnorm(5),
                           T2 = rnorm(5),
                           T3 = c(NA, rnorm(4)),            # 20 % NAs
                           T4 = c(NA, 1.2, -0.5, NA, 0.87), # 40 % NAs
                           T5 = NA)                         # 100 % NAs
# Default parameters: NA proportion = 50%
replace_missing(example_data, c(2))
# NA proportion = 30%
replace_missing(example_data, c(2), prop_na = 0.3)
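To make the thresholding idea concrete, here is a minimal base R sketch of dropping columns by NA proportion. It is not MetaPipe's implementation: the helper name drop_na_traits is hypothetical, and the rule "drop when the NA proportion is strictly greater than prop_na" is an assumption based on the word "exceed" above.

```r
# Hypothetical helper (not part of MetaPipe): drop columns whose proportion
# of NA values is strictly greater than `prop_na`.
drop_na_traits <- function(data, excluded_columns = NULL, prop_na = 0.5) {
  # Columns eligible for filtering: everything not explicitly excluded
  trait_idx <- setdiff(seq_along(data), excluded_columns)
  # Proportion of missing values in each eligible column
  na_prop <- colMeans(is.na(data[trait_idx]))
  # Keep the excluded columns plus the traits at or below the threshold
  keep <- sort(c(excluded_columns, trait_idx[na_prop <= prop_na]))
  data[keep]
}

toy <- data.frame(ID = 1:5,
                  T3 = c(NA, 0.1, 0.2, 0.3, 0.4),   # 20 % NAs
                  T4 = c(NA, 1.2, -0.5, NA, 0.87),  # 40 % NAs
                  T5 = NA)                          # 100 % NAs

names(drop_na_traits(toy, prop_na = 0.5))  # "ID" "T3" "T4"
names(drop_na_traits(toy, prop_na = 0.3))  # "ID" "T3"
```

Lowering the threshold from 0.5 to 0.3 removes T4 (40 % NAs) in addition to T5, which shows how prop_na controls how many traits are dropped.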
Alternatively, the user can indicate that NA values should be replaced by half
of the minimum value of each trait. This is achieved by passing the parameter
replace_na = TRUE. Users should be cautious when using this approach, as
replacing data points can have side effects that might lead to wrong
conclusions.
set.seed(123)
example_data <- data.frame(ID = 1:5,
                           P1 = c("one", "two", "three", "four", "five"),
                           T1 = rnorm(5),
                           T2 = rnorm(5),
                           T3 = c(NA, rnorm(4)),            # 20 % NAs
                           T4 = c(NA, 1.2, -0.5, NA, 0.87), # 40 % NAs
                           T5 = NA)                         # 100 % NAs
# Default parameters: NA proportion = 50%
replace_missing(example_data, c(2), replace_na = TRUE)
# NA proportion = 30%
replace_missing(example_data, c(2), prop_na = 0.3, replace_na = TRUE)
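The half-minimum replacement itself can be sketched in a few lines of base R. This is an illustration under assumptions, not MetaPipe's internal code: the helper half_min_impute is hypothetical, and leaving all-NA or non-numeric columns untouched is a choice made here for the sketch.

```r
# Hypothetical helper (not part of MetaPipe): replace NAs in a numeric
# vector with half of its minimum observed value.
half_min_impute <- function(x) {
  # Nothing to impute from if the vector is non-numeric or entirely NA
  if (!is.numeric(x) || all(is.na(x))) {
    return(x)
  }
  x[is.na(x)] <- min(x, na.rm = TRUE) / 2
  x
}

t4 <- c(NA, 1.2, -0.5, NA, 0.87)
half_min_impute(t4)  # -0.25  1.20 -0.50 -0.25  0.87
```

Note that when the minimum is negative, as in this vector, half of the minimum (-0.25) is larger than the minimum itself (-0.5), so the imputed values do not sit below the observed range. This is one reason to treat the replacement with caution.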
The last example shows that prop_na and replace_na are mutually
exclusive, and that replace_na takes precedence over prop_na.
Next, see Assess Normality.
# Clean up the output files generated by the examples
filenames <- c(list.files(".", "metapipe_*"))
output <- lapply(filenames, file.remove)