View source: R/class_constructor.R
read_from_excel | R Documentation |
Reads data from an Excel file of the following format:
The left side of the sheet contains information about the features (size: features x feature info columns)
The top part contains sample information (size: sample info variables x samples)
The middle contains the actual abundances (size: features x samples)
See the vignette for more information. This function separates the three parts of the file and returns them in a list.
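The three-part layout can be illustrated with a small base R sketch. This is not the package's implementation: the toy values, the variable names, and the "Datafile" label given to the unlabeled corner row are all assumptions made for illustration. It only shows how a corner row and corner column divide a sheet into feature information, sample information and abundances.

```r
# Toy sheet as a character matrix (all values are made up for illustration).
# Rows 1-3 hold sample information, columns 1-3 hold feature information.
sheet <- rbind(
  c(NA,      NA,    "Injection_order", "1",    "2",    "3"),
  c(NA,      NA,    "Group",           "A",    "B",    "QC"),
  c("Mass",  "RT",  "Mode",            "s1.d", "s2.d", "s3.d"),
  c("100.5", "1.2", "neg",             "10",   "20",   "30"),
  c("200.7", "3.4", "neg",             "40",   "50",   "60")
)

corner_row <- 3     # bottom row of the sample information
corner_column <- 3  # last column of the feature information

below <- (corner_row + 1):nrow(sheet)
right <- (corner_column + 1):ncol(sheet)

# Left side: features x feature info columns
feature_info <- sheet[below, 1:corner_column, drop = FALSE]
colnames(feature_info) <- sheet[corner_row, 1:corner_column]

# Top part: sample info variables x samples
sample_info <- sheet[1:corner_row, right, drop = FALSE]
# Assumption: the corner row has no label of its own, so call it "Datafile"
rownames(sample_info) <- c(sheet[1:(corner_row - 1), corner_column], "Datafile")

# Remaining block: features x samples, converted to numeric
abundances <- sheet[below, right, drop = FALSE]
storage.mode(abundances) <- "double"

dim(feature_info)
dim(sample_info)
dim(abundances)
```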
read_from_excel(
file,
sheet = 1,
id_column = NULL,
corner_row = NULL,
corner_column = NULL,
id_prefix = "ID_",
split_by = NULL,
name = NULL,
mz_limits = c(10, 2000),
rt_limits = c(0, 20),
skip_checks = FALSE
)
file | path to the Excel file |
sheet | the sheet number or name |
id_column | character, name of the column that uniquely identifies the samples |
corner_row | integer, the bottom row of sample information, usually contains data file names and feature info column names. If set to NULL, will be detected automatically. |
corner_column | integer or character, the corresponding column number or the column name (letter) in Excel. If set to NULL, will be detected automatically. |
id_prefix | character, prefix for autogenerated sample IDs, see Details |
split_by | character vector, in the case where all the modes are in the same Excel file, the column names of feature data used to separate the modes (usually Mode and Column) |
name | character, in the case where the Excel file only contains one mode, the name of the mode, such as "Hilic_neg" |
mz_limits | numeric vector of two, all m/z values should be between these |
rt_limits | numeric vector of two, all retention time values should be between these |
skip_checks | logical: skip checking data integrity. Not recommended, but sometimes useful when you just want to read the data in as is and fix errors later. NOTE: Sample_ID and QC columns will not be constructed. The data integrity checks need to be passed when constructing MetaboSet objects. |
Only specify one of split_by and name. The feature data returned will contain a column
named "Split", which is used to separate features from different modes. Unless a column named "Feature_ID"
is found in the file, a feature ID will be generated based on the value of "Split", mass and retention time.
The function will try to find columns for mass and retention time by looking at a few common alternatives,
and throw an error if no matching column is found.
Sample information needs to contain a row called "Injection_order",
and the values need to be unique. In addition, a possible sample identifier row needs to be named "Sample_ID",
or to be specified in id_column, and the values need to be unique, with the exception of QC samples:
if there are any "QC" identifiers, they will be replaced with "QC_1", "QC_2" and so on.
If a "Sample_ID" row is not found, it will be created using the id_prefix and injection order.
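The sample identifier rules above can be sketched in base R. This is a minimal illustration under stated assumptions, not the function's actual code: the example vectors are made up, and the exact format of the generated IDs (prefix pasted directly onto the injection order) is a guess based on the description.

```r
# Made-up sample identifiers containing repeated "QC" entries
sample_ids <- c("A1", "QC", "B2", "QC", "QC")

# Replace each "QC" with a numbered identifier so that all values are unique
is_qc <- sample_ids == "QC"
sample_ids[is_qc] <- paste0("QC_", seq_len(sum(is_qc)))
sample_ids

# When no "Sample_ID" row exists, IDs are built from id_prefix and injection order.
# Assumption: the two are simply pasted together.
id_prefix <- "ID_"
injection_order <- c(3, 1, 2)

# Injection order values need to be unique
stopifnot(anyDuplicated(injection_order) == 0)

generated_ids <- paste0(id_prefix, injection_order)
generated_ids
```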
list of three data frames:
exprs: the actual abundances, size features x samples
pheno_data: sample information, size sample info variables x samples
feature_data: information about the features, size features x feature info columns
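A typical call might look like the sketch below. The file path is hypothetical, and the sketch assumes the function is available from the notame package; the guard makes the code a no-op when either the package or the file is missing. The element names of the returned list are those described above.

```r
# Hypothetical path; replace with the location of your own workbook
excel_path <- "path/to/metabolomics_data.xlsx"

# Assumption: read_from_excel() is provided by the notame package
if (requireNamespace("notame", quietly = TRUE) && file.exists(excel_path)) {
  data <- notame::read_from_excel(
    file = excel_path,
    sheet = 1,
    name = "Hilic_neg"  # the file contains a single mode, so split_by is left NULL
  )

  # The three parts of the sheet are returned as list elements
  dim(data$exprs)
  head(data$pheno_data)
  head(data$feature_data)
}
```

If the file contains several modes, split_by would be given instead of name, for example split_by = c("Mode", "Column").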