supermatrix: The Supermatrix function


View source: R/supermatrix.R

Description

Not really a matrix! This function performs multiple pairwise association tests between numerical or categorical variables (phenotypes) and aligned amino-acid sites (categorical), builds a matrix of p-values, and optionally adjusts them with one of several p-value adjustment methods. For numeric vs. categorical data, the Kruskal-Wallis test is used. For categorical vs. categorical data, Fisher's exact test is used. See Details for more information.
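As a rough illustration of the two kinds of test named above, the following sketch runs each one once on made-up data using base R's `kruskal.test` and `fisher.test`. The vectors `trait_num`, `trait_cat` and `site` are hypothetical, and this is not the package's own code.

```r
# Hypothetical data: one row per species (values invented for illustration).
trait_num <- c(2.1, 3.4, 1.9, 5.2, 4.8, 5.5, 2.5, 4.9)          # numeric trait
trait_cat <- factor(c("diurnal", "diurnal", "diurnal", "nocturnal",
                      "nocturnal", "nocturnal", "diurnal", "nocturnal"))
site      <- factor(c("A", "A", "A", "V", "V", "V", "A", "V"))  # amino acid at one site

# Numeric trait vs. categorical site: Kruskal-Wallis test.
p_numeric <- kruskal.test(x = trait_num, g = site)$p.value

# Categorical trait vs. categorical site: Fisher's exact test on the contingency table.
p_categorical <- fisher.test(table(trait_cat, site))$p.value
```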

Usage

supermatrix(x, var, mut, fisher.method = "exact", correction = NULL,
  control = TRUE)

Arguments

x

A data frame containing your data. See details for more information.

var

A range of columns containing trait data (Set 1) in x. Data can be either quantitative (numeric) or qualitative (factor). Example: 1:20, for 20 columns of trait data, located in the first 20 columns.

mut

A range of columns containing genotypic data (Set 2) in x. Example: 21:40, for 20 sites located in columns 21 to 40.

fisher.method

Indicates which method should be used for Fisher's tests. "exact" is the default. For big datasets, other methods might be useful: "sim" simulates p-values using 20000 Monte Carlo iterations, and "hybrid" uses the hybrid method (see ?fisher.test for more details on Fisher's methods).
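The three options plausibly correspond to the following base R `fisher.test` calls. This mapping is an assumption based on the descriptions above, not taken from the package source, and the table is built from invented data.

```r
# Hypothetical 3 x 3 contingency table (trait categories vs. amino acids at one site).
diet <- factor(rep(c("herbivore", "carnivore", "omnivore"), times = 4))
site <- factor(c("A", "V", "L", "A", "V", "A", "A", "L", "L", "V", "V", "L"))
tab  <- table(diet, site)

# "exact": full exact computation (the fisher.test default).
p_exact <- fisher.test(tab)$p.value

# "sim": Monte Carlo p-value; B = 20000 mirrors the iteration count stated above.
set.seed(42)  # simulated p-values vary between runs unless a seed is set
p_sim <- fisher.test(tab, simulate.p.value = TRUE, B = 20000)$p.value

# "hybrid": hybrid approximation, only relevant for tables larger than 2 x 2.
p_hybrid <- fisher.test(tab, hybrid = TRUE)$p.value
```

Simulation trades exactness for speed and memory on large tables, so its result changes slightly with the seed.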

correction

If p-value correction for multiple testing is required, use this parameter to specify which correction method to apply. See ?p.adjust for details of the available correction methods.
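For reference, this is how base R's `p.adjust` behaves on a small vector of invented p-values. The method names are those documented in ?p.adjust; whether supermatrix passes `correction` straight through to `p.adjust` is an assumption.

```r
# Hypothetical raw p-values from five tests.
p_raw <- c(0.001, 0.012, 0.030, 0.200, 0.450)

# Bonferroni: multiply each p-value by the number of tests, capped at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Benjamini-Hochberg: controls the false discovery rate, less conservative.
p_bh <- p.adjust(p_raw, method = "BH")

# All method names accepted by p.adjust.
p.adjust.methods
```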

control

Used as an internal control helper, to check that supermatrix performed the tests according to each variable's type. If TRUE, two columns are included indicating (1) whether the trait value was identified as numeric, and (2) the name of the test used for that column vs. genotypic data.

Details

The supermatrix function was designed to perform multiple association analyses between phenotypic/life history traits and protein sites across multiple animal species. Such data should be arranged in columns, with one row for each species included in the analyses. The recommended format is a single data.frame containing the two sets of data to be compared pairwise. Set 1 should ideally contain phenotypic data (in columns), which can be quantitative (e.g. number of offspring, weight, etc.) or unordered categorical (e.g. diurnal/nocturnal, herbivore/carnivore/omnivore). Set 2 should ideally contain unordered categorical data, such as the amino acid found at a specific site in a protein sequence.

IMPORTANT: Categorical data must be of factor class.

Since supermatrix only creates a data frame with p-values (which may be insufficient in most cases), we encourage using the spw function instead.
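A minimal sketch of the recommended layout follows, with invented species, traits and sites. The column names and values are hypothetical; the point is one row per species, trait columns first, site columns after, and categorical columns stored as factors. The final call is commented out because it requires the package to be installed.

```r
# Hypothetical input: one row per species, Set 1 (traits) then Set 2 (sites).
x <- data.frame(
  weight   = c(2.1, 3.4, 1.9, 5.2, 4.8, 5.5),                 # quantitative trait
  activity = factor(c("diurnal", "diurnal", "nocturnal",
                      "nocturnal", "diurnal", "nocturnal")),  # categorical trait
  site1    = factor(c("A", "A", "V", "V", "A", "V")),         # amino acid at site 1
  site2    = factor(c("L", "I", "L", "I", "L", "I")),         # amino acid at site 2
  row.names = paste0("species", 1:6)
)

# Check column classes before testing: character columns should be converted to factors.
col_classes <- sapply(x, class)
chr_cols <- names(x)[col_classes == "character"]
x[chr_cols] <- lapply(x[chr_cols], factor)

# With the package installed, the call would then look like this
# (traits in columns 1-2, sites in columns 3-4):
# supermatrix(x, var = 1:2, mut = 3:4)
```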

Value

A matrix-like data frame with p-values for each test performed.
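To give a feel for what such a result might look like, the sketch below assembles a small table of p-values by hand with base R. It is not the package's implementation: the helper `pairwise_p`, the data, and the orientation (traits in rows, sites in columns) are all assumptions made for illustration.

```r
# Hypothetical data: two traits (columns 1-2) and two sites (columns 3-4).
x <- data.frame(
  weight   = c(2.1, 3.4, 1.9, 5.2, 4.8, 5.5, 2.5, 4.9),
  activity = factor(c("diurnal", "diurnal", "diurnal", "nocturnal",
                      "nocturnal", "nocturnal", "diurnal", "nocturnal")),
  site1    = factor(c("A", "A", "A", "V", "V", "V", "A", "V")),
  site2    = factor(c("L", "I", "L", "I", "L", "I", "L", "I"))
)
var <- 1:2
mut <- 3:4

# Hypothetical helper: pick the test from the trait's type, return its p-value.
pairwise_p <- function(trait, site) {
  if (is.numeric(trait)) {
    kruskal.test(x = trait, g = site)$p.value
  } else {
    fisher.test(table(trait, site))$p.value
  }
}

# One p-value per trait/site pair: rows are traits, columns are sites.
pvals <- as.data.frame(
  sapply(x[mut], function(site) sapply(x[var], pairwise_p, site = site))
)
pvals
```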


GRealesM/superwise documentation built on May 28, 2019, 12:38 p.m.