iGATE Report

library(knitr)
library(xtable)
library(kableExtra)
knitr::opts_chunk$set(echo = FALSE)
#Determine the follow up test used for analysis
follow_up_test <- if(params$test == "w"){"Wilcoxon Rank Test"}else{"Student's t-test"}
data(list = params$results_path)
results <- get(params$results_path)
if(params$validation){
data(list = params$validation_path)
validation <- get(params$validation_path)
data(list = params$validation_counts)
validation_counts <- get(params$validation_counts)
data(list = params$validation_summary)
validation_summary <- get(params$validation_summary)
}

Overview

Using the r params$df_name data set, we conducted a r params$versus good versus r params$versus bad r params$type pairwise comparison analysis on r Sys.Date(). The target variable was the r params$type r params$target. The goal of the analysis was to find process parameters in the r params$df_name data set that significantly influence the values of r params$target. For r params$target, r params$good_outcome values correspond to good values. A list of all the parameters that were analysed can be found in the [Suspected Sources of Variation] section.

Analysis

Outlier removal for the target variable was r if(params$outlier_removal_target){"conducted, giving the following results"}else{"not conducted"}. r if(params$type == "categorical"){"Outlier removal for the target only makes sense if the target is continuous."}

#Determine the filename of boxplot of target
#filepath <- paste0("Boxplot_of_", params$target,".png")
filepath <- paste0(params$image_directory,"/Boxplot_of_", params$target,".png")
knitr::include_graphics(filepath)

r if(params$outlier_removal_target){"Then, the"}else{"First, the"} Tukey-Duckworth test was conducted to obtain a preliminary selection of potentially influential process parameters. Then, the r follow_up_test was used as a follow up test. The count obtained from the counting method and the p-value from the r follow_up_test have been recorded in the [Results] section.

For the influential parameters good and bad control bands have been extracted. These are shown in the [Results] section.

A sanity check was performed for the selected parameters by visually inspecting a r if(params$type == "continuous"){"regression plot of the ssv against the target and checking for a correlation."}else{"density plot and checking for a separation between good and bad categories."} These plots can be found in the Appendix under r if(params$type == "continuous"){"Regression Plots"}else{"Density Plots"}.

The influential parameters and control bands r if(params$validation){"have"}else{"have not yet"} been validated. r if(params$validation){"The results can be found in the Validation section."}

Results

The Tukey-Duckworth test and r follow_up_test produced the following results.

kable(results[,1:3],
      caption = paste0('Results of the counting method and the ', follow_up_test, "."),
      booktabs = TRUE,
      longtable = TRUE)%>%
  kable_styling(latex_options = c("hold_position", "repeat_header"))

\blandscape

The Tukey-Duckworth test and r follow_up_test produced the following results.

#Only print bands for kept SSV.
kable(results[,c(1,4:7)],
      digits = 3,
      caption = 'Control bands for the influential parameters.',
      booktabs = TRUE,
      longtable = TRUE,
      col.names = c("SSV",
                    "lower",
                    "upper",
                    "lower",
                    "upper"))%>%
  add_header_above(c(" " = 1, "Good Band" = 2, "Bad Band" = 2),
                   bold = TRUE) %>%
  kable_styling(latex_options = c("hold_position", "repeat_header")) %>%
  column_spec(1, border_right = TRUE, width = "5cm")

\elandscape

\newpage

Validation

The influential parameters and control bands r if(params$validation){"have"}else{"have not yet"} been validated. r if(params$validation){"In the validation step the previous results are tested against a new data frame. From the new data frame all observations are extracted that fall either into all the good control bands or all the bad control bands of the ssv the user decided to keep. In this section a summary of the validation is presented. The list of all extracted observations and their expected quality can be found in the appendix."}

r if((params$validation) && (params$type == "continuous")){"The table below summarizes the validation results. It shows the maximum and minimum observed value of the target variable for the observations with good expected quality and bad expected quality."} r if((params$validation) && (params$type == "categorical")){"The table below summarizes the validation results. It shows how frequently which category appears in the observations with good expected quality and bad expected quality."}

if(params$validation){
kable(validation_summary,
      row.names = FALSE,
      caption = 'Summary of validation step.',
      booktabs = TRUE,
      longtable = TRUE)%>%
  kable_styling(latex_options = c("hold_position", "repeat_header"))
}

r if(params$validation){"The following table shows how many observations would fall into the good/ bad category IF only that one SSV had been analysed. If a count is very low, then that means that this SSV is excluding many observations from the validation output. In that case it might be worthwhile to re-run the validation procedure without that particular SSV."}

if(params$validation){
  kable(validation_counts,
        caption = 'Number of observations falling into the good/ bad bands per SSV.',
      booktabs = TRUE,
      longtable = TRUE)%>%
  kable_styling(latex_options = c("hold_position", "repeat_header"))
}

Appendix

Suspected Sources of Variation

Outlier removal for each ssv was r if(params$outlier_removal_ssv){"conducted"}else{"not conducted"}. The ssv that were analysed are

# We want a table with three columns listing the ssv analised
ssv <- as.character(params$ssv)
fill_up <- ifelse(length(ssv) %% 3 == 0, 0, 3 - (length(ssv) %% 3))
list_ssv <- c(ssv,rep(NA,fill_up))
ssv_matrix <- matrix(list_ssv, ncol = 3)
options(knitr.kable.NA = '')
kable(ssv_matrix,
      caption = "Analysed ssv.",
      longtable = TRUE,
      booktabs = TRUE)%>%
    kable_styling(latex_options = c("hold_position", "repeat_header"))%>%
  column_spec(column = 1:3, width = "4cm")

r if(params$type == "continuous"){"Regression"}else{"Density"} Plots {#sanityCheck}

The r if(params$type == "continuous"){"regression"}else{"density"} plots that were used to determine the final selection of influential process parameters are shown below.

# # We want three plots per row
length_causes <- length(results$Causes)

# for (i in 1:length_causes) {
#    plot_name <- paste0(str_replace_all(results$Causes[i], "[^[:alnum:]]", ""), "_against_", params$target,".png")
#    cat("![](", plot_name, "){width=33%}")
#    cat('\n')
# }
for (i in 1:length_causes) {
   plot_name <- paste0(params$image_directory,"/",str_replace_all(results$Causes[i], "[^[:alnum:]]", ""), "_against_", params$target,".png")
   cat("![](", plot_name, "){width=33%}")
   cat('\n')
}

r if(params$validation){"Validated Observations"}

r if(params$validation){"The list of all observations extracted in the validation step."}

if(params$validation){
validation <- validation[order(validation$expected_quality),]

kable(validation,
      row.names = FALSE,
      caption = 'The records of the validation data set that fall into all the good control bands resp. bad control bands.',
      booktabs = TRUE,
      longtable = TRUE)%>%
  kable_styling(latex_options = c("hold_position", "repeat_header"))
}


Try the igate package in your browser

Any scripts or data that you put into this service are public.

igate documentation built on Sept. 11, 2019, 1:04 a.m.