format_data: Format raw experimental data for subsequent processing


View source: R/format_data.R

Description

This function formats raw data for subsequent processing. It accepts all data files in a given directory, any number of data files specified by name, any number of dataframes supplied as a named list, or a single dataframe supplied as is.

Usage

format_data(input, data_type, skip_lines = 0, metadata_json = NULL)
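
The four accepted forms of input can be sketched as follows. This is only an illustration under stated assumptions: the directory, file name, two-line preamble, and cell values are all hypothetical, and the snippet uses base R's read.csv to show what skipping leading lines means rather than how format_data reads files internally. The format_data calls are left as comments because they require the rfret package and real instrument data.

```r
# Hypothetical raw export: two preamble lines, then the actual CSV table.
raw_dir <- file.path(tempdir(), "raw_fret")
dir.create(raw_dir, showWarnings = FALSE)
csv_path <- file.path(raw_dir, "experiment_1.csv")
writeLines(c(
  "Instrument: hypothetical plate reader",
  "Date: 2021-10-18",
  "Content,fret_channel,acceptor_channel,donor_channel,concentration",
  "titration_1,1200,800,1500,0.5",
  "titration_1,1350,790,1420,1.0"
), csv_path)

# Base R analogue of skip_lines = 2: the preamble is ignored and the
# third line is read as the column header.
df1 <- read.csv(csv_path, skip = 2)

# The four forms that `input` is documented to accept.
inputs <- list(
  directory  = raw_dir,                    # a directory name
  file_names = csv_path,                   # a vector of file names
  dataframe  = df1,                        # a single dataframe
  named_list = list(experiment_1 = df1)    # a named list of dataframes
)

# With rfret installed, each form would be passed as the first argument:
# rfret::format_data(inputs$directory,  data_type = "fret", skip_lines = 2)
# rfret::format_data(inputs$file_names, data_type = "fret", skip_lines = 2)
# rfret::format_data(inputs$dataframe,  data_type = "fret")
# rfret::format_data(inputs$named_list, data_type = "fret")
```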

Arguments

input

A directory name, a vector of file names, a single dataset already loaded in memory as a dataframe, or a list of datasets already loaded in memory as dataframes.

data_type

A character string describing which type of data is supplied for pre-processing. Currently supported options are "fret" and "fp".

skip_lines

The number of lines to skip at the beginning of each CSV file (usually instrument header information that precedes the actual data).

metadata_json

A JSON file describing mappings between internal names and metadata information in the actual data files. It can contain the following entries, independent of the type of data:

content

The name of the column describing sample content. Default value: "Content".

concentration

The name of the column containing the concentration series. Default value: "concentration".

titration

The name describing points of the titration series. Default value: "titration".

buffer_only

The name describing points of the control series with only buffer. Default value: "buffer_only".

A metadata file for FRET datasets can also contain the following entries:

fret_channel

The name of the column containing fluorescence intensities from the FRET channel. Default value: "fret_channel".

acceptor_channel

The name of the column containing fluorescence intensities from the acceptor channel. Default value: "acceptor_channel".

donor_channel

The name of the column containing fluorescence intensities from the donor channel. Default value: "donor_channel".

acceptor_only

The name describing points of the control series with only acceptor. Default value: "acceptor_only".

donor_only

The name describing points of the control series with only donor. Default value: "donor_only".

A metadata file for FP/FA datasets can also contain the following entries:

parallel

The name of the column containing fluorescence intensities from the parallel channel. Default value: "parallel".

perpendicular

The name of the column containing fluorescence intensities from the perpendicular channel. Default value: "perpendicular".

polarization

The name of the column containing fluorescence polarization values, if already calculated by the instrument. Default value: "polarization".

anisotropy

The name of the column containing fluorescence anisotropy values, if already calculated by the instrument. Default value: "anisotropy".

intensity

The name of the column containing total fluorescence intensity values, if already calculated by the instrument. Default value: "intensity".

baseline

The name describing points of the baseline series (fluorescent probe only, without titrant molecule). Default value: "baseline".
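
A metadata file for a FRET dataset might be written as in the sketch below. It rests on two assumptions that the text above does not confirm: that the file is a flat JSON object with one string per entry, and that entries left out fall back to the listed defaults. The instrument column names on the right-hand side (such as "Conc_uM") and the file path are invented for illustration.

```r
# Hypothetical mapping: internal entry name -> column name used by the instrument.
# Assumption: the metadata file is a flat JSON object of string values.
metadata <- c(
  content          = "Sample",
  concentration    = "Conc_uM",
  fret_channel     = "FRET_535_680",
  acceptor_channel = "Acc_620_680",
  donor_channel    = "Don_535_590"
)

# Build the JSON text with base R only: one "key": "value" pair per line.
entries <- sprintf('  "%s": "%s"', names(metadata), metadata)
json_lines <- c("{", paste(entries, collapse = ",\n"), "}")

# Write the file and print it back for inspection.
metadata_path <- file.path(tempdir(), "fret_metadata.json")
writeLines(json_lines, metadata_path)
cat(readLines(metadata_path), sep = "\n")

# With rfret installed, the file would be passed via metadata_json:
# rfret::format_data("path/to/data", data_type = "fret",
#                    metadata_json = metadata_path)
```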

Value

A single dataframe with the combined input data, containing 8 columns: Experiment, Type, Replicate, Observation, fret_channel, acceptor_channel, donor_channel and concentration. Rows with missing values will be dropped.
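
The shape of the returned dataframe can be pictured with the mock below. It is hand-built rather than produced by format_data: every value, including the entries in the Type column, is invented, and base R's complete.cases stands in for whatever mechanism the function actually uses to drop incomplete rows.

```r
# Mock of the documented 8-column layout; all values are invented.
formatted <- data.frame(
  Experiment       = c("exp1", "exp1", "exp1"),
  Type             = c("titration", "titration", "donor_only"),
  Replicate        = c(1L, 1L, 1L),
  Observation      = c(1L, 2L, 1L),
  fret_channel     = c(1200, NA, 300),   # second row has a missing reading
  acceptor_channel = c(800, 790, 50),
  donor_channel    = c(1500, 1420, 1600),
  concentration    = c(0.5, 1.0, 0.5),
  stringsAsFactors = FALSE
)

# Keep only rows without missing values, mirroring the documented behaviour.
cleaned <- formatted[complete.cases(formatted), ]
print(cleaned)
```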


Guilz/rfret documentation built on Oct. 18, 2021, 2:14 p.m.