as.seqData: Convert Data to Appropriate pmartRseq Class
In pmartR/pmartRseq: Analysis and Visualization of 16S Data


Converts a list object or several data.frames of rRNA (16S/ITS/18S), metatranscriptomic, or metagenomic data to an object of the class 'seqData'. Objects of the class 'seqData' are lists with two obligatory components, e_data and f_data. An optional list component, e_meta, is used if analysis or visualization at other levels (e.g. taxonomy) is also desired.

as.seqData(e_data, f_data, e_meta = NULL, edata_cname, fdata_cname, data_type, taxa_cname = NULL, ...)

`e_data`	a p × (n + 1) data.frame of expression data, where p is the number of features observed and n is the number of samples. The additional column holds the feature identifier/name and may appear anywhere in the data.frame. Each row corresponds to the data for one feature.
`f_data`	a data.frame with n rows. Each row corresponds to a sample with one column giving the unique sample identifiers found in e_data column names and other columns providing qualitative and/or quantitative traits of each sample.
`e_meta`	an optional data.frame with p rows. Each row corresponds to a feature with one column giving identifiers (must be named the same as the column in `e_data`) and other columns giving meta information (e.g. mappings of OTU identification to taxonomy).
`edata_cname`	character string specifying the name of the column containing the identifiers in `e_data` and `e_meta` (if applicable).
`fdata_cname`	character string specifying the name of the column containing the sample identifiers in `f_data`.
`data_type`	character string specifying if this is 'rRNA' (for 16S/ITS/18S), 'metagenomic', or 'metatranscriptomic' data.
`taxa_cname`	optional character string specifying the name of the column containing the taxonomy in `e_meta` (if applicable). Defaults to NULL. If `e_meta` is NULL, then specify `taxa_cname` as NULL.
`...`	further arguments
`e_tree`	an optional NEXUS or Newick formatted phylogenetic tree file, imported using ape::read.tree(tree_path). The OTU labels in the tree file should match the OTU identifiers in the preceding data fields.
`e_seq`	an optional FASTA formatted representation of biological sequences imported using Biostrings::readDNAStringSet(fasta_path, ...). Each OTU in the FASTA file maps to at least one sequence in the preceding data fields.
`ec_cname`	optional character string specifying the name of the column containing the EC numbers in `e_meta` (if applicable). Defaults to NULL. If `e_meta` is NULL, then specify `ec_cname` as NULL.
`gene_cname`	optional character string specifying the name of the column containing the gene names in `e_meta` (if applicable). Defaults to NULL. If `e_meta` is NULL, then specify `gene_cname` as NULL.
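
As an illustration of how the arguments above fit together, the following is a minimal sketch using made-up toy data. The column names OTU, SampleID and Taxonomy, and all values, are assumptions chosen for the example and are not part of the package. The consistency checks are plain base R and are not necessarily the checks that as.seqData performs internally. The call itself is attempted only if pmartRseq is installed.

```r
# Toy inputs; every name and value here is hypothetical.
e_data <- data.frame(
  OTU = c("OTU1", "OTU2", "OTU3"),   # feature identifier column
  S1  = c(10, 0, 5),                 # one column per sample
  S2  = c(3, 7, NA),
  stringsAsFactors = FALSE
)
f_data <- data.frame(
  SampleID  = c("S1", "S2"),         # must match the sample columns of e_data
  Treatment = c("control", "treated"),
  stringsAsFactors = FALSE
)
e_meta <- data.frame(
  OTU      = c("OTU1", "OTU2", "OTU3"),  # same column name as in e_data
  Taxonomy = c("k__Bacteria;p__Firmicutes",
               "k__Bacteria;p__Proteobacteria",
               "k__Bacteria;p__Firmicutes"),
  stringsAsFactors = FALSE
)

# Base R sanity checks that mirror the argument descriptions.
sample_cols <- setdiff(names(e_data), "OTU")
ids_match   <- setequal(sample_cols, f_data$SampleID)
meta_match  <- setequal(e_data$OTU, e_meta$OTU)
stopifnot(ids_match, meta_match)

# Attempt the conversion only when the package is available.
if (requireNamespace("pmartRseq", quietly = TRUE)) {
  my_seq <- pmartRseq::as.seqData(
    e_data = e_data, f_data = f_data, e_meta = e_meta,
    edata_cname = "OTU", fdata_cname = "SampleID",
    data_type = "rRNA", taxa_cname = "Taxonomy"
  )
}
```

Running the checks before the conversion makes mismatched identifiers easier to diagnose, since the failing condition is named directly.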

Objects of class 'seqData' contain some attributes that are referenced by downstream functions. These attributes can be changed from their default values by manual specification. The attributes and their default values are as follows:

data_scale	Scale of the data provided in `e_data`. Acceptable values are 'log2', 'log10', 'log', 'count', and 'abundance', which indicate that the data are log base 2 transformed, log base 10 transformed, natural log transformed, raw counts, and raw abundances, respectively. Default value is 'count'.

data_norm	A logical argument, specifying whether the data has been normalized or not. Default value is 'FALSE'.

norm_method	NULL if data_norm is FALSE. If data_norm is TRUE, a character string defining which normalization method was used. Default value is 'NULL'.

location_param	NULL if there are no location parameters from normalization, otherwise a vector detailing the normalization location parameters for each sample.

scale_param	NULL if there are no scale parameters from normalization, otherwise a vector detailing the normalization scale parameters for each sample.

seq_type	Character string describing the type of sequencer (e.g. 'HiSeq'). Default value is 'NULL'.

db	Character string describing which database was used to process the data (e.g. "TIGR"). Default value is 'NULL'.

db_version	Character string describing which version of the database was used. Default value is 'NULL'. If db is NULL, then db_version will default to a NULL value.

Computed values included in the data_info attribute are as follows:

num_edata	The number of unique `edata_cname` entries.

num_na	The number of NA observations in the dataset.

frac_na	The proportion of `e_data` values that are NA.

num_zero	The number of observations that equal 0 in the dataset.

frac_zero	The proportion of `e_data` values that are 0.

num_taxa	The number of unique `taxa_cname` entries.

num_ec	The number of unique `ec_cname` entries.

num_gene	The number of unique `gene_cname` entries.

num_samps	The number of samples that make up the columns of `e_data`.

meta_info	A logical argument, specifying whether `e_meta` is provided.
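
To make the computed values above concrete, the following base R sketch derives several of them from a small made-up `e_data`. It is not the package's implementation: the object name data_info_sketch is hypothetical, and the choice to divide the zero count by the total number of cells (including NA cells) when computing frac_zero is an assumption.

```r
# Toy expression data; names and values are hypothetical.
e_data <- data.frame(
  OTU = c("OTU1", "OTU2", "OTU3"),
  S1  = c(10, 0, 5),
  S2  = c(3, 7, NA)
)
edata_cname <- "OTU"

# Keep only the sample columns as a numeric matrix.
values <- as.matrix(e_data[, setdiff(names(e_data), edata_cname), drop = FALSE])

data_info_sketch <- list(
  num_edata = length(unique(e_data[[edata_cname]])),  # unique feature identifiers
  num_na    = sum(is.na(values)),                     # count of NA cells
  frac_na   = mean(is.na(values)),                    # proportion of NA cells
  num_zero  = sum(values == 0, na.rm = TRUE),         # count of zero cells
  # Assumption: the denominator is all cells, including NA cells.
  frac_zero = sum(values == 0, na.rm = TRUE) / length(values),
  num_samps = ncol(values)                            # number of sample columns
)

str(data_info_sketch)
```

On a real 'seqData' object, the stored values would instead be read from the data_info attribute, for example with attr(my_seq, "data_info"), where my_seq is a hypothetical object returned by as.seqData.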