View source: R/sample_verification.R
sample_verification | R Documentation |
This function takes in a level-1 data frame and an exclusion list and returns a level-2 data frame with a verification column. The verification column contains either "Y", indicating the row is good for analysis, or messages contained in the exclusion list for why the data rows are excluded. If an exclusion list is not provided, all rows are assumed to be good for use in further analyses and are verified with "Y".
sample_verification(
FILENAME,
data.in,
exclusion.info,
assay,
output.res = FALSE,
INPUT.DIR = NULL,
OUTPUT.DIR = NULL,
verbose = TRUE
)
FILENAME |
(Character) A string used to identify the input level-1 file, "<FILENAME>-<assay>-Level1.tsv". |
data.in |
(Data Frame) A level-1 data frame from the format functions. |
exclusion.info |
(Data Frame) A data frame listing the variables, and the values of those variables, used to exclude rows. See Details for a full explanation. |
assay |
(Character) A string indicating the assay type of the input data. Valid
input is one of the following: "Clint", "fup-UC", "fup-RED", or "Caco-2".
This argument only needs to be specified when importing input data set with |
output.res |
(Logical) When set to |
INPUT.DIR |
(Character) Path to the directory where the input level-1 file exists.
If |
OUTPUT.DIR |
(Character) Path to the directory to save the output file.
If |
verbose |
(Logical) Indicates whether printed statements should be shown. (Default is TRUE.) |
The 'exclusion.info' should be a data frame with the following columns:
Variables | level-1 variable(s) used to filter rows for exclusion |
Values | Value(s) to exclude |
Message | Simple explanation for the exclusion |
When filtering on multiple variable-value pairs, the entries in "Variables" and "Values" should each be given as a single string separated by a vertical bar, "|", with the values listed in the same order as their variables. See the demonstration in Examples, Scenario 1.
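As a minimal sketch of the layout described above, the following builds a small exclusion.info data frame in base R. The compound names, sample names, and messages are hypothetical placeholders, not values from any packaged data set. The first row excludes on a single variable; the second excludes on a variable-value pair joined with "|". The final check is only an illustrative sanity test, not part of the package.

```r
# Hypothetical exclusion criteria; all names and messages are illustrative only.
exclusion_criteria <- data.frame(
  Variables = c("Lab.Sample.Name",
                paste("Compound.Name", "Lab.Sample.Name", sep = "|")),
  Values    = c("Sample_01",
                paste("Chemical A", "Sample_07", sep = "|")),
  Message   = c("Instrument error", "Wrong sample"),
  stringsAsFactors = FALSE
)

# Each row should list as many Values as Variables once split on "|".
n_vars <- lengths(strsplit(exclusion_criteria$Variables, "|", fixed = TRUE))
n_vals <- lengths(strsplit(exclusion_criteria$Values, "|", fixed = TRUE))
stopifnot(all(n_vars == n_vals))
```

Note that strsplit() is called with fixed = TRUE because "|" is otherwise interpreted as a regular-expression alternation operator.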
NOTE: Currently, if a variable of interest for 'verification' assignments contains NAs, that variable cannot be used to assign verification. In that case, either use alternative variable-value pairs in place of the variable with missing values or, though less ideal, make "manual coding" adjustments in the verification column.
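One way to check in advance which variables are affected by this limitation is sketched below. The toy data frame stands in for a level-1 data frame: its column names follow those used in the Examples, but the values are made up, and the helper names (toy_level1, candidates, usable) are hypothetical.

```r
# Toy stand-in for a level-1 data frame; values are made up for illustration.
toy_level1 <- data.frame(
  Compound.Name   = c("Chemical A", "Chemical B", NA),
  Lab.Sample.Name = c("Sample_01", "Sample_02", "Sample_03"),
  stringsAsFactors = FALSE
)

# Variables being considered for use in exclusion.info
candidates <- c("Compound.Name", "Lab.Sample.Name")

# TRUE for each candidate variable that contains at least one NA
has_na <- vapply(toy_level1[candidates], anyNA, logical(1))

# Variables without missing values, which remain usable for exclusion
usable <- names(has_na)[!has_na]
usable
```

In this toy case only Lab.Sample.Name is free of NAs, so it would be the variable to build the exclusion criteria on.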
If the output level-2 data frame is chosen to be exported and an output directory
is not specified, it will be exported to the user's R session temporary directory.
This temporary directory is a per-session directory whose path can be found
with the following code: tempdir(). For more details, see
https://www.collinberke.com/til/posts/2023-10-24-temp-directories/.
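For instance, the temporary directory can be inspected from the current session with base R alone. This is only a sketch: the ".tsv" pattern is an assumption based on the file extension used elsewhere in this documentation, and the listing will be empty if nothing has been exported yet.

```r
# Path of the per-session temporary directory
tmp <- tempdir()
print(tmp)

# List any .tsv files currently in that directory (may be empty)
tsv_files <- list.files(tmp, pattern = "\\.tsv$", full.names = TRUE)
print(tsv_files)
```

Because this directory is removed when the R session ends, any file found there should be copied elsewhere if it needs to be kept.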
As a best practice, INPUT.DIR (when importing a .tsv file) and/or OUTPUT.DIR
should be specified to simplify the process of importing and exporting files.
This practice ensures that the exported files can easily be found and will not
be exported to a temporary directory.
A level-2 data frame with a verification column.
Zhihui (Grace) Zhao
level1 <- invitroTKstats::clint_L1
# Scenario 1: Pass in data.in and exclusion.info data frame from R session
# Create an exclusion criteria data frame
# Use the excluded samples found in invitroTKstats::clint_L2_heldout
# If more than one variable is used to define a set of samples to be excluded,
# enter them as one string, separate the Variables with a vertical bar, "|",
# and do the same for Values.
excluded_level2 <- invitroTKstats::clint_L2_heldout
exclusion_criteria <- data.frame(
Variables = paste("Compound.Name", "Lab.Sample.Name", sep = "|"),
Values = paste(excluded_level2[,"Compound.Name"], excluded_level2[,"Lab.Sample.Name"], sep = "|"),
Message = excluded_level2[,"Verified"]
)
# Run the verification function.
my.level2 <- sample_verification(data.in=level1,
exclusion.info = exclusion_criteria,
output.res = FALSE)
# Scenario 2: Import 'tsv' as input data and do not pass in an exclusion.info data frame
## Not run:
# Write the level-1 file to some folder
# Will need to replace <desired level-1 FOLDER> with desired export folder location.
# The <desired level-1 FOLDER> needs to already exist.
write.table(level1,
file=here::here("<desired level-1 FOLDER>/Smeltz-Clint-Level1.tsv"),
sep="\t",
row.names=FALSE,
quote=FALSE)
# Run the verification function.
# Specify the path to import level-1 data with INPUT.DIR.
# Will need to replace INPUT.DIR = <desired level-1 FOLDER> with chosen output
# folder location from above
# If no exclusion.info data frame is used, will label all samples as verified.
# A level-2 file is also exported to INPUT.DIR when OUTPUT.DIR is not specified.
my.level2 <- sample_verification(FILENAME="Smeltz",
assay="Clint", INPUT.DIR = here::here("<desired level-1 FOLDER>"))
## End(Not run)