merge_level0 | R Documentation |
This function reads multiple Excel files containing mass-spectrometry (MS) data and extracts the chemical sample data from the specified sheets. The argument 'level0.catalog' is a table that provides the necessary information to find the data for each chemical. The primary data of interest are the analyte peak area, the internal standard peak area, and the target concentration for calibration curve (CC) samples. The argument 'data.label' is used to annotate this particular mapping of level-0 files into data ready to be organized into a level-1 file.
merge_level0(
  FILENAME = "MYDATA",
  level0.catalog,
  file.col = "File",
  sheet = NULL,
  sheet.col = "Sheet",
  skip.rows = NULL,
  skip.rows.col = "Skip.Rows",
  num.rows = NULL,
  num.rows.col = NULL,
  date = NULL,
  date.col = "Date",
  compound.col = "Chemical.ID",
  istd.col = "ISTD",
  col.names.loc = NULL,
  col.names.loc.col = "Col.Names.Loc",
  sample.colname = NULL,
  sample.colname.col = "Sample.ColName",
  type.colname = NULL,
  type.colname.col = "Type",
  peak.colname = NULL,
  peak.colname.col = "Peak.ColName",
  istd.peak.colname = NULL,
  istd.peak.colname.col = "ISTD.Peak.ColName",
  conc.colname = NULL,
  conc.colname.col = "Conc.ColName",
  analysis.param.colname = NULL,
  analysis.param.colname.col = "AnalysisParam.ColName",
  additional.colnames = NULL,
  additional.colname.cols = NULL,
  chem.ids,
  chem.lab.id.col = "Chem.Lab.ID",
  chem.name.col = "Compound",
  chem.dtxsid.col = "DTXSID",
  catalog.out = FALSE,
  output.res = FALSE,
  INPUT.DIR = NULL,
  OUTPUT.DIR = NULL,
  verbose = TRUE
)
FILENAME |
(Character) A string used to identify outputs of the function call. (Defaults to "MYDATA".) |
level0.catalog |
A data frame describing which columns of which sheets in which Excel files contain MS data for analysis. See details for full explanation. |
file.col |
(Character) Column name containing level-0 file names to pull data from. |
sheet |
(Character) Name/identifier of the Excel sheet from which level-0 data is to be pulled. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files have the same sheet identifier for level-0 data.) |
sheet.col |
(Character) Catalog column name containing 'sheet' information. (Defaults to "Sheet".) |
skip.rows |
(Numeric) Number of rows to skip when extracting level-0 data from the specified Excel file(s). (Defaults to 'NULL'.) (Note: Single entry only, use only if all files need to skip the same number of rows for extracting level-0 data.) |
skip.rows.col |
(Character) Catalog column name containing 'skip.rows' information. (Defaults to "Skip.Rows".) |
num.rows |
(Numeric) Number of rows to pull when extracting level-0 data from the specified Excel file(s). (Defaults to 'NULL'.) (Note: Single entry only, use only if all files need to pull the same number of rows for extracting level-0 data.) |
num.rows.col |
(Character) Catalog column name containing 'num.rows' information. (Defaults to 'NULL'.) |
date |
(Character) Date of laboratory measurements. Typical format "MMDDYY" ("MM" = 2 digit month, "DD" = 2 digit day, and "YY" = 2 digit year). (Defaults to 'NULL'.) (Note: Single entry only, use only if all files have the same laboratory measurement date.) |
date.col |
(Character) Catalog column name containing 'date' information. (Defaults to "Date") |
compound.col |
(Character) Catalog column name containing 'compound' information. (Defaults to "Chemical.ID") |
istd.col |
(Character) Catalog column name containing 'istd' information, or the MS peak area for the internal standard. (Defaults to "ISTD") |
col.names.loc |
(Numeric) Row location of data column names. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files have column names in the same row location, typically the first row.) |
col.names.loc.col |
(Character) Catalog column name containing 'col.names.loc' information. (Defaults to "Col.Names.Loc") |
sample.colname |
(Character) Column name of level-0 data containing sample information. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files use the same column name for sample names when extracting level-0 data.) |
sample.colname.col |
(Character) Catalog column name containing 'sample.colname' information. (Defaults to "Sample.ColName") |
type.colname |
(Character) Column name of the level-0 data containing the type of sample. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files use the same column name for sample type information when extracting level-0 data.) |
type.colname.col |
(Character) Catalog column name containing 'type.colname' information. (Defaults to "Type".) |
peak.colname |
(Character) Column name of the level-0 data containing the analyte Mass Spectrometry peak area. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files use the same column name for analyte peak area information when extracting level-0 data.) |
peak.colname.col |
(Character) Catalog column name containing 'peak.colname' information. (Defaults to "Peak.ColName") |
istd.peak.colname |
(Character) Column name of the level-0 data containing the internal standard Mass Spectrometry peak area. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files use the same column name for internal standard MS peak area information when extracting level-0 data.) |
istd.peak.colname.col |
(Character) Catalog column name containing 'istd.peak.colname' information. (Defaults to "ISTD.Peak.ColName") |
conc.colname |
(Character) Column name of the level-0 data containing intended concentrations for calibration curves. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files use the same column name for intended concentration information when extracting level-0 data.) |
conc.colname.col |
(Character) Catalog column name containing 'conc.colname' information. (Defaults to "Conc.ColName") |
analysis.param.colname |
(Character) Column name of the level-0 data containing Mass Spectrometry instrument parameters for the analyte. (Defaults to 'NULL'.) (Note: Single entry only, use only if all files use the same column name for analysis parameter information when extracting level-0 data.) |
analysis.param.colname.col |
(Character) Catalog column name containing 'analysis.param.colname' information. (Defaults to "AnalysisParam.ColName") |
additional.colnames |
Additional columns to pull from the level-0 data files during extraction and to include in the compiled level-0 data returned from 'merge_level0'. (Defaults to 'NULL'.) |
additional.colname.cols |
Catalog column name(s) containing 'additional.colnames' information. (Defaults to 'NULL'.) |
chem.ids |
(Data frame) A data frame containing basic chemical identification information for tested chemicals. |
chem.lab.id.col |
(Character) Column in 'chem.ids' containing the compound/chemical identifier used by the laboratory in level-0 measured data. (Defaults to "Chem.Lab.ID") |
chem.name.col |
(Character) 'chem.ids' column name containing the "standard" chemical name to use for annotation of the compiled level-0 returned from 'merge_level0'. (Defaults to "Compound") |
chem.dtxsid.col |
(Character) 'chem.ids' column name containing EPA’s DSSTox Structure ID (http://comptox.epa.gov/dashboard). (Defaults to "DTXSID".) |
catalog.out |
(Logical) When set to |
output.res |
(Logical) When set to |
INPUT.DIR |
(Character) Path to the directory where the Excel files
with level-0 data exist. If not specified, the function looks for the files
in the current working directory. (Defaults to |
OUTPUT.DIR |
(Character) Path to the directory to save the output file.
If |
verbose |
(Logical) Indicates whether printed statements should be shown. (Defaults to TRUE.) |
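As a small illustration of the "MMDDYY" format described for the 'date' argument, the base R sketch below converts between that string form and a Date object. It is not taken from 'merge_level0' itself, and the variable names are hypothetical.

```r
# Hypothetical example value in "MMDDYY" form (June 30, 2021).
# Kept as a character string so the leading zero is not lost.
date_string <- "063021"

# Parse: %m = 2-digit month, %d = 2-digit day, %y = 2-digit year
measured_on <- as.Date(date_string, format = "%m%d%y")

# Convert the Date back into the "MMDDYY" string form
round_trip <- format(measured_on, "%m%d%y")
```

Storing the date as character rather than numeric matters here: a numeric 063021 would be read as 63021 and could no longer be parsed with this format.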
Unless an argument is given as a single value that applies to all files (for example, sheet = "Data"), the argument 'level0.catalog' should be a data frame with the following columns:
File | The Excel filename to be loaded |
Sheet | The name of the sheet to examine within the Excel file |
Skip.Rows | How many rows should be skipped on the sheet to get usable column names |
Date | The date the measurements were made |
Chemical.ID | The laboratory chemical identity |
ISTD | The internal standard used |
Col.Names.Loc | The row locations of the column names |
Sample.ColName | The column name on the sheet that contains sample identity |
Type.ColName | The column name on the sheet that contains the type of sample |
Peak.ColName | The column name on the sheet that contains the analyte MS peak area |
ISTD.Peak.ColName | The column name on the sheet that contains the internal standard MS peak area |
Conc.ColName | The column name on the sheet that contains the intended concentration for calibration curves |
AnalysisParam.ColName | The column name on the sheet that contains the MS instrument parameters for the analyte |
Columns with names ending in ".ColName" indicate the columns to be extracted from the specified Excel file and sheet containing level-0 data.
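The table above can be turned into a one-row catalog by hand. The sketch below uses only base R, takes its column names from the table and its values from the Examples section of this page, and adds a hypothetical helper, missing_catalog_cols(), that is not part of invitroTKstats. Whether 'merge_level0' requires every one of these columns in all cases is an assumption that is not verified here.

```r
# Hand-built catalog with one row per file/sheet/chemical combination.
# Column names follow the table above; values come from the Examples section.
catalog <- data.frame(
  File                  = "Hep_745_949_959_082421_final.xlsx",
  Sheet                 = "Data063021",
  Skip.Rows             = 44,
  Date                  = "063021",
  Chemical.ID           = "745",
  ISTD                  = "MFBET",
  Col.Names.Loc         = 2,
  Sample.ColName        = "Name",
  Type.ColName          = "Type",
  Peak.ColName          = "Area...13",
  ISTD.Peak.ColName     = "Resp....16",
  Conc.ColName          = "Final Conc....11",
  AnalysisParam.ColName = "Exp. Conc....10",
  stringsAsFactors      = FALSE
)

# Hypothetical helper (not part of invitroTKstats): report which of the
# documented column names are absent from a catalog.
missing_catalog_cols <- function(catalog, expected) {
  setdiff(expected, colnames(catalog))
}

expected_cols <- c(
  "File", "Sheet", "Skip.Rows", "Date", "Chemical.ID", "ISTD",
  "Col.Names.Loc", "Sample.ColName", "Type.ColName", "Peak.ColName",
  "ISTD.Peak.ColName", "Conc.ColName", "AnalysisParam.ColName"
)

# Nothing is missing from the full catalog
missing_full <- missing_catalog_cols(catalog, expected_cols)

# A catalog without a Sheet column, as might be paired with a single
# 'sheet' value that applies to every file
no_sheet <- catalog[, setdiff(colnames(catalog), "Sheet")]
missing_no_sheet <- missing_catalog_cols(no_sheet, expected_cols)
```

Because the default for 'type.colname.col' is "Type", a catalog that uses the 'Type.ColName' heading needs type.colname.col = "Type.ColName", as the example call on this page does.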
If the output level-0 file is exported and no output directory
is specified, it is written to the user's R session temporary directory.
This is a per-session directory whose path can be found by calling tempdir().
For more details, see
https://www.collinberke.com/til/posts/2023-10-24-temp-directories/.
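To make the temporary-directory behavior concrete, the following base R sketch writes a small tab-separated file into the session's temporary directory and reads it back. The file name and the toy data are hypothetical; they do not reflect the file names or columns that 'merge_level0' produces.

```r
# Path of this R session's temporary directory
tmp_dir <- tempdir()

# Hypothetical output file name, for illustration only
out_file <- file.path(tmp_dir, "example-level0.tsv")

# Toy data standing in for an exported table
example_data <- data.frame(
  Sample = c("CC1", "CC2"),
  Area   = c(1200, 2400),
  stringsAsFactors = FALSE
)

# Write as tab-separated text, then read it back from the same path
write.table(example_data, file = out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)
read_back <- read.delim(out_file, stringsAsFactors = FALSE)
```

Files written this way disappear when the R session ends, which is why an explicit output directory is usually the safer choice.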
As a best practice, INPUT.DIR (when importing a .tsv file) and/or
OUTPUT.DIR should be specified to simplify the process of importing and
exporting files. This practice ensures that the exported files can easily be
found and will not be exported to a temporary directory.
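One way to act on this advice is to confirm, before calling the function, that every file named in the catalog is present in the chosen input directory. The sketch below is a minimal base R check under that assumption; missing_input_files() is a hypothetical helper and not part of invitroTKstats, and the demonstration uses empty placeholder files rather than real level-0 data.

```r
# Hypothetical helper (not part of invitroTKstats): list catalog files
# that cannot be found in the chosen input directory.
missing_input_files <- function(catalog, input_dir, file_col = "File") {
  files <- unique(catalog[[file_col]])
  files[!file.exists(file.path(input_dir, files))]
}

# Demonstration with a throwaway directory and an empty placeholder file
input_dir <- file.path(tempdir(), "level0-input")
dir.create(input_dir, showWarnings = FALSE)
invisible(file.create(file.path(input_dir, "present.xlsx")))

demo_catalog <- data.frame(
  File = c("present.xlsx", "absent.xlsx"),
  stringsAsFactors = FALSE
)

# Only the file that was never created is reported
not_found <- missing_input_files(demo_catalog, input_dir)
```

An empty result means every catalog entry resolves to an existing file under the directory that would be passed as INPUT.DIR.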
data.frame |
A data.frame in standardized level-0 format |
John Wambaugh
# Create level0.catalog data.frame
# Will need to retrieve "Hep_745_949_959_082421_final.xlsx" file from
# inst/extdata/Kreutz-Clint and save it to desired directory.
# Note XLSX file does not need to be saved to current working directory.
catalog <- create_catalog(file = "Hep_745_949_959_082421_final.xlsx",
                          sheet = "Data063021",
                          skip.rows = 44,
                          num.rows = 30,
                          date = "063021",
                          compound = "745",
                          istd = "MFBET",
                          sample = "Name",
                          type = "Type",
                          peak = "Area...13",
                          istd.peak = "Resp....16",
                          conc = "Final Conc....11",
                          analysis.param = "Exp. Conc....10",
                          col.names.loc = 2)
# Create chem.ids data.frame
chem.ids <- data.frame("Chem.Lab.ID" = "745",
                       "Compound" = "(Heptafluorobutanoyl)pivaloylmethane",
                       "DTXSID" = "DTXSID3066215")
# Create level0 data.frame
# INPUT.DIR here points to the extdata/Kreutz-Clint directory installed with
# the package; replace it with the directory containing the XLSX file if it
# was saved elsewhere.
level0 <- merge_level0(level0.catalog = catalog,
                       INPUT.DIR = system.file("extdata/Kreutz-Clint",
                                               package = "invitroTKstats"),
                       istd.col = "ISTD.Name",
                       type.colname.col = "Type.ColName",
                       num.rows.col = "Number.Data.Rows",
                       chem.ids = chem.ids,
                       catalog.out = FALSE,
                       output.res = FALSE) # do not auto-save the file